# Data Labelling and Management

Now that we have collected the data, we can begin the labelling process. Since we chose to perform a continuous gesture throughout each recording, we can use the automatic labelling script to do the labelling for us, vastly speeding up the process. The gesture does not actually need to be continuous for the labelling script to work, since it triggers on a data threshold and a duration. However, when the whole recording is gesture data, we can set the threshold to 0 and let the script label everything.

# Automatic Labelling Tool

The automatic labelling tool is part of the Imagimob ImagUtils repository, which can be found here: ImagUtils. The automatic labelling script itself can also be found here. It is installed in the same way as the Capture Server; see the documentation.

# Using the Labelling Tool

Labelling is performed with the following command:

```
python -m imagutil.scripts.auto_label_recordings \
    --input-dir {directory} \
    --label {label} \
    --data-process-strategy {data_process_strategy} \
    --threshold-strategy {strategy} \
    --threshold-value {value}
```

To automatically label multi-dimensional data, the tool first reduces the data down to one dimension. This reduction is selected with the data-process-strategy parameter. The algorithm that determines the labels from the reduced signal is set with the threshold-strategy parameter. The possible values are described below.

For data_process_strategy, we can choose either "mean", which computes the average of the multi-dimensional sensor data, or "sum", which computes the sum of all data.

The strategy value can be either "percentage", for thresholding based on a percentage of the maximum signal value, or "value", for thresholding directly on the raw signal value; value is the threshold to apply to the data. For example, if the range of your data is [0, 300], then a "value" threshold should be specified in that range. If you select "percentage" as the strategy, then your threshold should be within the range [0.0, 1.0].
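To make the two parameters concrete, here is a minimal sketch of how such thresholding might work. This is an illustration of the idea, not the tool's actual implementation; the function name and signature are hypothetical.

```python
def auto_label(samples, data_process_strategy="mean",
               threshold_strategy="percentage", threshold_value=0.5):
    """Return a per-sample boolean mask: True where the reduced
    signal reaches the threshold (i.e. the sample gets the label)."""
    # Step 1: collapse each multi-dimensional sample to one value.
    if data_process_strategy == "mean":
        signal = [sum(s) / len(s) for s in samples]
    elif data_process_strategy == "sum":
        signal = [sum(s) for s in samples]
    else:
        raise ValueError(f"unknown data-process-strategy: {data_process_strategy}")
    # Step 2: turn the threshold setting into a raw signal value.
    if threshold_strategy == "percentage":
        # threshold_value is in [0.0, 1.0], relative to the max signal value
        threshold = threshold_value * max(signal)
    elif threshold_strategy == "value":
        threshold = threshold_value
    else:
        raise ValueError(f"unknown threshold-strategy: {threshold_strategy}")
    # With a threshold of 0, every (non-negative) sample is labelled,
    # which is how a whole-recording gesture gets labelled end to end.
    return [v >= threshold for v in signal]
```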

This is how I performed my automatic labelling:

# Project Creation and Data Analysis

This step is about ensuring that your data and labels are all correct. Common issues include recording glitches, for example data points transmitted out of order when a bottleneck occurs, or labels that run slightly longer than the data. The easiest way to check is to create a project file and add the data to it, so we can view the data and evaluate its quality. Before doing that, though, we must first ensure that the session files are updated with the label files we created. We do this by performing a batch import.
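The two issues mentioned above can also be caught programmatically. Here is a small sketch of such sanity checks, assuming timestamps per sample and labels as (start, end, name) tuples; the function and data layout are hypothetical, not part of the Imagimob tooling.

```python
def sanity_check(timestamps, labels):
    """timestamps: sample times of one recording, in order of arrival.
    labels: (start, end, name) tuples from the generated label file.
    Returns a list of human-readable issues (empty if all is well)."""
    issues = []
    # Out-of-order samples suggest a transmission bottleneck during capture.
    for earlier, later in zip(timestamps, timestamps[1:]):
        if later < earlier:
            issues.append("data points are out of order")
            break
    # A label must not run past the end of the recorded data.
    for start, end, name in labels:
        if end > timestamps[-1]:
            issues.append(f"label '{name}' ends after the data")
    return issues
```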

# Batch Importing

To perform the batch import, you:

  1. Right click on the data folder or any parent folder
  2. Move to Tools
  3. Select Session Batch Import
  4. In the following screen, select the files you want to include. We deselected the button file because it doesn't contain any useful information.

  5. Hit OK, then OK again to overwrite the session files

# Creating Project File

To create the project file, you:

  1. Right click the directory where you want the file to be located
  2. Add -> New Project
  3. We will name ours AcconeerGestureTutorial

# Adding Data to Project File

Now it's time to add our data to the project file. To do that you:

  1. Navigate to Segments tab
  2. Click Add Segments
  3. Navigate to the data folder and select it. If your data is located in multiple directories, either navigate to the parent folder or repeat this step for each directory.
  4. Then choose which data and label files you want to add. The data and label files for each recording must share the same name, so if you have multiple sensors, ensure all their data is compiled into one file per recording.
  5. In this project we should have this:

  6. Hit OK
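Step 4 above requires all sensor data to be compiled into one file per recording. As an illustration of that compilation, here is a hypothetical helper that merges per-sensor CSV files column-wise, assuming each file has the same number of rows and a timestamp in the first column; it is not part of the Imagimob tooling.

```python
import csv

def merge_sensor_files(sensor_csvs, out_path):
    """Column-wise merge of per-sensor CSV files recorded in lockstep.
    The merged file keeps a single timestamp column (from the first file)
    followed by every sensor's value columns."""
    tables = []
    for path in sensor_csvs:
        with open(path, newline="") as f:
            tables.append(list(csv.reader(f)))
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for rows in zip(*tables):
            # timestamp from the first sensor, then each sensor's values
            merged = [rows[0][0]] + [v for row in rows for v in row[1:]]
            writer.writerow(merged)
```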

Now that the data has been imported, we need to distribute it across the train, validation, and test datasets, which is explained in the next section.

Previous: Data Collection

Next: Class Distribution