Machine learning Algorithms in DEEPCRAFT™ Studio

This section introduces the two main types of supervised machine learning algorithms supported in Studio: Regression and Classification.

Regression Algorithms

Regression algorithms are designed to understand the correlations between dependent and independent variables. The dependent variables are those we aim to predict, while the independent variables are those we use for prediction. These algorithms are used to predict continuous numerical outputs. You can evaluate the regression models using metrics like Mean Squared Error (MSE), R2-Score, and Mean Absolute Percentage Error (MAPE), along with graphical plots such as Quantile - Quantile and Histogram of residuals. Studio supports linear regression analysis. Power Generation Prediction is one of the regression starter project supported in Studio.



Classification Algorithms

Classification algorithms are used to predict categorical or discrete values. Classification involves predicting a label or category. These algorithms classify the dataset into one or more labels. A binary classifier deals with two classes or categories, while a multi-class classification algorithm handles more than two classes. You can evaluate the classification models using metrics, like confusion matrix, recall, F1-Score, accuracy. Human activity recognition, keyword spotter, baby cry detection, siren detection are some of the classification starter projects supported in Studio. To know more about the different classification starter projects supported in studio, refer to Starter Projects.



Determining Project Type: Classification or Regression

In a classification project, there are two types of files, .data and .label. The .data file contains the actual data, while the .label file contains the labels that annotate the data in the .data file.

In a regression project, we use a .data file that contains the actual data, and instead of a .label file, the .data file itself is used to label the data. Additionally, regression projects do not require separate labeling, which saves time and makes the project more efficient.

So, when you import data into the project, studio will determine the project type based on the active label track of the sessions:

  • If all sessions have the active label track set to .data, Studio will classify the project as regression.
  • If all sessions have the active label track set to .label, Studio will classify the project as classification
ℹ️

In a project, all sessions must use either .data file or .label file as an active label track. It is not permissible to have some sessions with .data file and others with .label file as the active label track.