# Model Building & Training

At this point, we have collected our data, labelled it, distributed it into the different sets, and assigned class or symbol weights to balance out the classes. The next step is to actually build and train some models. In this section, we will look at how to utilise pre-processing to improve model performance, how to generate models and pick the labelling strategy, and finally how to actually train the generated models.

# Pre-processing

Pre-processing allows you to highlight important features or properties in the data you are working with. This means that with the right pre-processing you can build smaller, faster, and better models. There are many ways to process data, and the right choice is highly dependent on the data you are using. In Imagimob Studio we have support for the most common pre-processing methods, and we are constantly adding new pre-processing methods and layers. All included layers have built-in Python and C functionality, and you can add a layer by clicking + (Add New Layer).

This means that anything used in Studio can be automatically converted to C at the end! With that said, you can implement your own pre-processing layers very easily, but we won't cover that in this tutorial.

# Pre-processing for the Acconeer-based Gestures Model

For this project, we first reshape the 192-feature input to [64, 3], so that the data is split into 3 separate columns. The Reshape function is typically used as a first layer to reshape the preprocessor input into a shape that represents the data better. In this case, the original shape of the data coming out of the sensor is [64, 3], but it arrives flattened into 1 row of data per time stamp. Next, the data is shifted back towards the original axis by subtracting the mean along axis 0 from each element, so that the columns of data are comparable to each other. We then apply Hann smoothing before the discrete Fourier transform (FFT), and after the FFT we compute the Frobenius norm to get the magnitude of the Fourier transform. Finally, we sum over the outer dimension. At this point, the outer dimension represents distance points away from the radar: we have 3 distance points and have computed the Fourier transform for each of them. By summing over them, we process the data so that a gesture performed farther from or closer to the radar looks the same from the model's perspective.
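
If you want to follow this chain outside Studio, here is a minimal NumPy sketch under the assumptions above (the function name, the axis choices, and the use of np.abs for the magnitude are our illustrative reading, not Studio's internal implementation):

```python
import numpy as np

def preprocess_frame_buffer(frame):
    """Rough NumPy equivalent of the pre-processing chain described above."""
    x = frame.reshape(64, 3)                 # Reshape: [192] -> [64, 3]
    x = x - x.mean(axis=0, keepdims=True)    # subtract the mean along axis 0
    x = x * np.hanning(64)[:, None]          # Hann smoothing before the FFT
    spectrum = np.fft.fft(x, axis=0)         # Fourier transform per distance point
    magnitude = np.abs(spectrum)             # magnitude of the complex spectrum
    return magnitude.sum(axis=1)             # sum over the 3 distance points

# Example: one flattened sensor frame -> 64 FFT-magnitude values.
print(preprocess_frame_buffer(np.random.rand(192)).shape)   # (64,)
```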

# Sliding Window

The Sliding Window layer is typically used as the last pre-processing layer, and it has three essential parameters.

The Window shape controls the dimensions of the window. Its first value is the length of the window, and the second value must equal the output size of the previous pre-processing layer. Note that the larger the window, the larger the input layer of the model, and therefore the larger the model itself.

The Stride controls the number of data points that the window shifts before making a new classification, and its value must be a multiple of the output size of the previous pre-processing layer. Together with the sampling frequency, the stride determines the time between two consecutive classifications.

The Buffer Multiplier is the size multiplier for the internal circular buffer, and it must be at least 1.

Setting this value to 1 means that all calculations needed for the pre-processing layers, plus the AI model itself, must complete before a new data point arrives. If it is set to 2, a new full window can be buffered while the output for the previous window is being computed. If the value is set too low, your AI application might crash at runtime. Recommended value: 2.00.
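
To make the Window shape and Stride concrete, here is a minimal NumPy sketch of how a window of a given length and stride cuts a pre-processed stream into model inputs (the names and shapes are illustrative; the Buffer Multiplier only affects runtime buffering and is not modelled here):

```python
import numpy as np

def sliding_windows(stream, window_length, stride):
    """Cut a (time, features) stream into overlapping windows.

    The second dimension of each window equals the output size of the
    previous pre-processing layer, matching the Window shape rule above.
    """
    starts = range(0, len(stream) - window_length + 1, stride)
    return np.stack([stream[s:s + window_length] for s in starts])

stream = np.random.rand(1000, 64)                  # 1000 time points, 64 features
batch = sliding_windows(stream, window_length=75, stride=10)
print(batch.shape)                                 # (93, 75, 64)
```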

Note: As a rule of thumb, your sliding window should be big enough to fit your longest event.

So, if your event takes 1.5 seconds, your starting window size can be calculated as 1.5 times your sampling frequency.
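
For instance, assuming a hypothetical 50 Hz sampling rate, the calculation looks like this:

```python
sampling_frequency_hz = 50                 # hypothetical sensor rate
longest_event_s = 1.5                      # duration of the longest gesture
window_length = int(longest_event_s * sampling_frequency_hz)
print(window_length)                       # 75 -> starting window length in samples
```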

# Visualising the Pre-processing

Visualising the pre-processing helps you know exactly what goes into the model, which in turn helps you improve model performance. It also shows which events, gestures, or classes are easy to distinguish, which, for example, simplifies the process of gesture selection. To do that, you can simply collect all the gestures you wish to evaluate into one file and visualise the pre-processed data before feeding it into the model for training. We won't cover the gesture selection process in this tutorial, though, since we have already defined the gestures.

We’ve talked about the benefits of visualising the processing; now let us get into how to do it. After you have chosen the pre-processing layers you would like to try, you can hit Create Track from Preprocessor. Here we delete the Sliding Window layer, because we want to visualise the processing for each time point.

We use the CombinedGesture data as an example, and the result of applying the pre-processing is shown below. As we can see, the three pre-processed gestures are separable from each other, which means that what we pass to the model is distinct and unique for each gesture.
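
If you want to reproduce this kind of track outside Studio, a rough matplotlib sketch looks like this (with synthetic data standing in for your recording, and the pre-processing inlined from the earlier sketch):

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for a recording of several gestures (192 features/frame).
data = np.random.rand(300, 192)

def preprocess(frame):
    x = frame.reshape(64, 3)
    x = x - x.mean(axis=0, keepdims=True)              # re-centre each column
    x = x * np.hanning(64)[:, None]                    # Hann smoothing
    return np.abs(np.fft.fft(x, axis=0)).sum(axis=1)   # FFT magnitude, summed

processed = np.stack([preprocess(f) for f in data])    # shape (300, 64)

plt.imshow(processed.T, aspect="auto", origin="lower")
plt.xlabel("Frame (time)")
plt.ylabel("FFT bin")
plt.title("Pre-processed track")
plt.show()
```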

# Model Generation

Now that we have decided on the pre-processing, the next thing we need to do is generate models for training. To do that, we can simply click Generate Model List... and the Model Wizard will show up.

As you can see, there are four parameters for us to tune in the Model Wizard.

The Number Of Model Structures defines the number of different model structures to generate, and Number Of Hyper Parameters defines the number of different hyperparameter combinations to try for each model structure. Hence, the product of these two values is the number of models that will be generated.

Epochs defines the number of training iterations (passes over the training data) for one model, and Batch Size is the number of windows sent to the model for each weight update.
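
As a quick illustration of how these four values combine (the numbers below are hypothetical examples, not recommendations):

```python
num_model_structures = 5      # Number Of Model Structures
num_hyper_parameters = 4      # Number Of Hyper Parameters per structure
epochs = 100                  # training passes over the data per model
batch_size = 32               # windows per model update

total_models = num_model_structures * num_hyper_parameters
print(total_models)           # 20 models will be generated and trained
```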

After we have set all the parameters in the Model Wizard, we are ready to click OK to generate the list of models; note that all existing models will be replaced. Finally, we click Start New Training Job to start the model training.
