Running machine learning pipelines

Let’s run a machine learning pipeline on the Iris dataset.

Importing the required packages

import numpy as np

from pjml.data.communication.report import Report
from pjml.data.evaluation.metric import Metric
from pjml.data.flow.file import File
from pjml.operator.pipeline import Pipeline
from pjml.stream.expand.partition import Partition
from pjml.stream.reduce.reduce import Reduce
from pjml.stream.reduce.summ import Summ
from pjml.stream.transform.map import Map
from pjpy.modeling.supervised.classifier.svmc import SVMC
from pjpy.processing.feature.reductor.pca import PCA
from pjpy.processing.feature.scaler.minmax import MinMax

np.random.seed(0)

First, we create the pipeline; the comments note what each component does.

pipe = Pipeline(
    File("../data/iris.arff"),               # load the Iris ARFF dataset
    Partition(),                             # split the data into partitions
    Map(MinMax(), PCA(), SVMC(), Metric()),  # scale, project, classify and evaluate each partition
    Summ(),                                  # summarize the metric values over the partitions
    Reduce(),                                # collapse the partition stream into a single result
    Report("Mean S: $S"),                    # print the summarized score ($S)
)

Now we train and apply the pipeline; dual_transform returns the result on the training data and on the test data.

res_train, res_test = pipe.dual_transform()
print("Train result: ", res_train)
print("test result: ", res_test)

Out:

   [model]  Mean S: array([[0.96]])
Train result:  <pjdata.content.data.Data object at 0x7f37688e6c70>
Test result:  <pjdata.content.data.Data object at 0x7f376877b850>
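
For comparison only, here is a minimal scikit-learn sketch of a similar workflow (MinMax scaling, PCA, and an SVM evaluated by cross-validation on Iris). It is not part of pjml, and the correspondence of Partition/Map/Summ/Reduce to k-fold cross-validation with an averaged score is an assumption made for illustration.

# Hedged scikit-learn analogue of the pjml pipeline above (assumption:
# Partition/Map/Summ/Reduce behave like k-fold cross-validation with an
# averaged score). For comparison only; not part of pjml.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
sk_pipe = make_pipeline(MinMaxScaler(), PCA(), SVC())
scores = cross_val_score(sk_pipe, X, y, cv=10)  # 10 folds assumed, to mirror Partition()
print("Mean accuracy:", scores.mean())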

Total running time of the script: ( 0 minutes 0.146 seconds)
