After the execution of an experiment, the experimented should have collected a set of information:
-
raw data, as provided by the sensors used in the experiment;
-
pre-processed data, according to the format mentioned in the Eurobench data format;
-
a report document, providing informal description of the test conducted.
We focus here on the structure and organization of the pre-processed data, as they are the input information that will be used to compute automatically the Performance Indicator metrics.
Although each centre may have a well-defined data organization, we are required to define a Eurobench-compatible structure, to make sure the automatic scoring mechanism can handle the data submitted for benchmarking.
The structure proposed only constraints the pre-processed data, as the other ones (raw data, report document) are not used within the benchmarking process.
Generally speaking pre-processed data should follow the following pattern:
subject_X_cond_Y_run_Z_[type].csv
Such contextualized format provides the following information:
-
[type]
is a string related to the type of information stored in the file. This is the root name of the file. All possible types are described in the Eurobench data format. -
X
,Y
,Z
are integer respectively associated to the number of subjects involved, the number of different conditions being tested, and the number of repetitions per condition. -
The
subject
/cond
/run
namespaces are only used when it makes sense:-
An experiment involving a unique subject (or only an humanoid) would have have pre-processed files following the pattern:
cond_Y_run_Z_[type].csv
. -
An experiment with various subjects but unique condition would follow the pattern
subject_X_run_Z_[type].csv
. -
An experiment in which there is a single repetition of a trial would have datafiles following the pattern :
subject_X_cond_Y_[type].csv
. -
At the extreme, an experiment involving a unique subject (or an humanoid), with single condition and no repetition will be described by files with pattern:
[type].csv
.
-
Unless if specified different, all datafile submitted for benchmarking will have a name using this pattern.
Aside the pre-processed data file, we can foresee the following additional required files:
-
files containing information (anthropometric and or other aspects) about the subject involved:
subject_i_info.yaml
. -
files containing information on the robotic device used:
robot_info.yaml
(urdf
format may be used, the protocol should state it). -
files containing the settings of the controlled variables:
condition_i.yaml
.
These files can be repeated using the following patterns:
-
user information: if a single human subject is involved, the data file can be named
subject_info.yaml
. IfN
subject are involved, then we should provideN
files following the pattern:subject_i_info.yaml
, wherei < N
identifies the subject. -
robot device information: we assume only one robotic device is used per experiments, so that the file can be directly named
robot_info.yaml
. Any change of the control setting of the robot should be mentioned in the condition file. -
Condition setting file: we identified three cases:
-
if no parameter is set, no file is required.
-
if the condition setting contains information specific to the user (like an adjustment of a pushing force that may differ per subject), then we should get the :
subject_i_condition_j.yaml
, wherej
should be an integer referring to theY
different condition setting being considered. -
if the conditional settings are common for all subjects, then we should get only
Y
different files, namedcondition_j.yaml
.
-
Such model gives us a contextualized filename approach: based on the filename, we can deduce all the context of a data file: its content, the subject involved, the condition setting used, the repetition id. A contextualized filename is unique in the complete experiment.
Case 1: An experiment is conducted with 4 human subjects and a robotic device. The experiment involves 3 condition settings (for instance, same testbed but (1): without the robotic device, (2) with the robotic device totally passive, and (3) with the robotic device in active mode). These 3 settings are common across subjects. Each setting has been tested and repeated 4 times, except for the second subject that could only perform two repetitions. The files expected are the following:
-
subject_{1,2,3,4}_info.yaml
-
robot_info.urdf
-
condition_{1,2,3}.yaml
-
subject_{1,3,4}_cond_{1,2,3}_run_{1,2,3,4}_[type].csv
-
subject_{1,3,4}_cond_{1,2,3}_run_{1,2}_[type].csv
In the two last groups of files, the [type]
string should refer to one of the pre-processed format described in X.
Case 2: Another experiment involves 2 subjects and a robotic device. The experiment contains 3 different condition settings. One of the configuration parameters has to be adjusted to the subject involved. Each condition is only repeated once.
-
subject_{1,2}_info.yaml
-
robot_info.urdf
-
subject_{1,2}_condition_{1,2,3}.yaml
-
subject_{1,2}_cond_{1,2,3}_[type].csv
Case 3: an experiment compares two different parameter settings of an humanoid. The robot has been asked to repeat three times each configuration.
-
robot_info.urdf
-
condition_{1,2}.yaml
-
cond_{1,2}_run_{1,2,3}_[type].csv
Given the previous filename pattern, the simplest file organization is to place all the data-files into a single folder.
For personal matters the experimenter can prefer organizing the data into folders. We envision two possibilities:
-
The contextualized filename pattern is maintained across folder: this means that by copying all files in folders into a single folder, and keeping their name, we would get all data still organized following the previous filename pattern. This means the folder organization does not affect the naming of the file, and the possibility of understanding the context of a file given its name.
-
The filename is simplified according to the folder organisation. In that case, the contextualized filename could be obtained using the folder hierarchy.
We propose focusing on the first approach for now.