Operating machine learning pipelines (basic)
You can create pipelines using the following operators:
- Chain -> creates a sequential chain of components
- Shuffle -> shuffles the order of the components
- Select -> selects one of the given components
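To make the three behaviours concrete, here is a minimal sketch that does not depend on the library and treats a pipeline as a plain list of component names. The functions `chain`, `shuffle`, and `select` are hypothetical stand-ins written for illustration only; they are not the pjml API.

```python
import random


def chain(*components):
    """Keep the components in the order given."""
    return list(components)


def shuffle(*components, rng):
    """Keep every component, but in a random order."""
    items = list(components)
    rng.shuffle(items)
    return items


def select(*components, rng):
    """Keep exactly one component, picked at random."""
    return [rng.choice(components)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(chain("MinMax", "PCA", "DT"))
print(shuffle("MinMax", "PCA", rng=rng))
print(select("SVMC", "DT", rng=rng))
```

Passing the random generator explicitly keeps the sketch deterministic for a given seed, which mirrors the role of `np.random.seed(0)` in the example script.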
Importing the required packages
import numpy as np
from pjml.operator.chain import Chain
from pjml.operator.select import Select
from pjml.operator.shuffle import Shuffle
from pjpy.modeling.supervised.classifier.dt import DT
from pjpy.modeling.supervised.classifier.svmc import SVMC
from pjpy.processing.feature.reductor.pca import PCA
from pjpy.processing.feature.scaler.minmax import MinMax
np.random.seed(0)
Using Chain
Chain is an operator that concatenates other components in a sequence.
Out:
{
"info": {
"_id": "SVMC@pjpy.modeling.supervised.classifier.svmc",
"config": {
"C": 1.0,
"kernel": "rbf",
"degree": 3,
"gamma": "scale",
"coef0": 0.0,
"shrinking": true,
"probability": false,
"tol": 0.001,
"cache_size": 200,
"class_weight": null,
"verbose": false,
"max_iter": -1,
"decision_function_shape": "ovr",
"break_ties": false,
"random_state": null,
"seed": 0
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "DT@pjpy.modeling.supervised.classifier.dt",
"config": {
"seed": 0
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "SVMC@pjpy.modeling.supervised.classifier.svmc",
"config": {
"C": 1.0,
"kernel": "rbf",
"degree": 3,
"gamma": "scale",
"coef0": 0.0,
"shrinking": true,
"probability": false,
"tol": 0.001,
"cache_size": 200,
"class_weight": null,
"verbose": false,
"max_iter": -1,
"decision_function_shape": "ovr",
"break_ties": false,
"random_state": null,
"seed": 0
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "DT@pjpy.modeling.supervised.classifier.dt",
"config": {
"seed": 0
}
},
"enhance": true,
"model": true
}
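As an illustration of what chaining means, the sketch below composes two plain functions so that each step receives the output of the previous one. `ToyChain`, `scale`, and `threshold` are hypothetical names invented for this example; the sketch assumes that a chain applies its steps left to right and does not reproduce how pjml implements `Chain`.

```python
from functools import reduce


class ToyChain:
    """Hypothetical stand-in: feeds each step the result of the previous one."""

    def __init__(self, *steps):
        self.steps = steps

    def __call__(self, data):
        # Fold the steps over the data, left to right.
        return reduce(lambda value, step: step(value), self.steps, data)


def scale(xs):
    """Min-max scale a list of numbers to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def threshold(xs):
    """Label each value 1 if it is at least 0.5, else 0."""
    return [int(x >= 0.5) for x in xs]


pipeline = ToyChain(scale, threshold)
print(pipeline([2, 4, 6, 10]))  # scale runs first, then threshold
```

Swapping the two steps would change the result, which is why the order fixed by a chain matters.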
Using Shuffle
Shuffle is an operator that concatenates components in a sequence, but
the order is not maintained.
Out:
{
"info": {
"_id": "MinMax@pjpy.processing.feature.scaler.minmax",
"config": {
"feature_range": [
0,
1
]
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "PCA@pjpy.processing.feature.reductor.pca",
"config": {
"n": 2
}
},
"enhance": true,
"model": true
}
You can also use the Python operator @
Out:
{
"info": {
"_id": "PCA@pjpy.processing.feature.reductor.pca",
"config": {
"n": 2
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "MinMax@pjpy.processing.feature.scaler.minmax",
"config": {
"feature_range": [
0,
1
]
}
},
"enhance": true,
"model": true
}
Using Select
Select is an operator that works like a bifurcation, where only one of
the components will be selected.
Out:
{
"info": {
"_id": "DT@pjpy.modeling.supervised.classifier.dt",
"config": {
"seed": 0
}
},
"enhance": true,
"model": true
}
You can also use the Python operator +
Out:
{
"info": {
"_id": "SVMC@pjpy.modeling.supervised.classifier.svmc",
"config": {
"C": 1.0,
"kernel": "rbf",
"degree": 3,
"gamma": "scale",
"coef0": 0.0,
"shrinking": true,
"probability": false,
"tol": 0.001,
"cache_size": 200,
"class_weight": null,
"verbose": false,
"max_iter": -1,
"decision_function_shape": "ovr",
"break_ties": false,
"random_state": null,
"seed": 0
}
},
"enhance": true,
"model": true
}
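The `@` and `+` shorthands mentioned above can be supported in Python through the `__matmul__` and `__add__` special methods. The following sketch shows one way this could work. `Node`, `ToyShuffle`, `ToySelect`, and the `sample` method are hypothetical names for illustration and are not taken from the pjml source.

```python
import random


class Node:
    """Hypothetical component wrapper; not the pjml class hierarchy."""

    def __init__(self, name):
        self.name = name

    def __matmul__(self, other):  # a @ b builds a shuffle of a and b
        return ToyShuffle(self, other)

    def __add__(self, other):  # a + b builds a selection between a and b
        return ToySelect(self, other)

    def sample(self, rng):
        """A leaf always yields itself."""
        return [self.name]


class ToyShuffle(Node):
    def __init__(self, *children):
        self.children = children

    def sample(self, rng):
        # Keep every child, in a random order.
        order = list(self.children)
        rng.shuffle(order)
        return [name for child in order for name in child.sample(rng)]


class ToySelect(Node):
    def __init__(self, *children):
        self.children = children

    def sample(self, rng):
        # Keep exactly one child.
        return rng.choice(self.children).sample(rng)


rng = random.Random(0)
print((Node("MinMax") @ Node("PCA")).sample(rng))
print((Node("SVMC") + Node("DT")).sample(rng))
```

In this toy version, repeated use such as `a @ b @ c` nests one binary shuffle inside another, so it is not equivalent to a single three-way shuffle; a fuller implementation would need to flatten such expressions.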
Using them all
By combining these simple operators, you can build many kinds of end-to-end machine learning pipelines.
Out:
{
"info": {
"_id": "PCA@pjpy.processing.feature.reductor.pca",
"config": {
"n": 2
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "MinMax@pjpy.processing.feature.scaler.minmax",
"config": {
"feature_range": [
0,
1
]
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "DT@pjpy.modeling.supervised.classifier.dt",
"config": {
"seed": 0
}
},
"enhance": true,
"model": true
}
You can also use Python operators
Out:
{
"info": {
"_id": "PCA@pjpy.processing.feature.reductor.pca",
"config": {
"n": 2
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "MinMax@pjpy.processing.feature.scaler.minmax",
"config": {
"feature_range": [
0,
1
]
}
},
"enhance": true,
"model": true
}
{
"info": {
"_id": "DT@pjpy.modeling.supervised.classifier.dt",
"config": {
"seed": 0
}
},
"enhance": true,
"model": true
}
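Nesting the operators defines a small space of possible pipelines. The sketch below enumerates that space for an assumed expression, a chain of a shuffle over PCA and MinMax followed by a selection between SVMC and DT, which is consistent with the three-component output shown above but is not confirmed by it. The tuple encoding and the `expand` function are hypothetical and independent of pjml.

```python
from itertools import permutations, product


def expand(expr):
    """Return every concrete pipeline (a tuple of names) that expr can produce."""
    if isinstance(expr, str):  # leaf component
        return [(expr,)]
    kind, *children = expr
    if kind == "select":  # exactly one child is used
        return [p for child in children for p in expand(child)]
    if kind == "chain":  # all children, in the given order
        orders = [children]
    elif kind == "shuffle":  # all children, in any order
        orders = permutations(children)
    else:
        raise ValueError(f"unknown operator: {kind}")
    pipelines = []
    for order in orders:
        # Combine one expansion of each child, keeping this child order.
        for parts in product(*(expand(child) for child in order)):
            pipelines.append(tuple(name for part in parts for name in part))
    return pipelines


# Assumed expression, encoded as nested tuples.
expr = ("chain", ("shuffle", "PCA", "MinMax"), ("select", "SVMC", "DT"))
for pipeline in expand(expr):
    print(" -> ".join(pipeline))
```

Two orderings of the preprocessing steps times two classifier choices give four candidate pipelines; sampling one of them with a fixed seed yields a single concrete pipeline like the one printed by the example.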
Total running time of the script: ( 0 minutes 0.062 seconds)