First draft
aremazeilles committed Nov 16, 2020
1 parent b7052c2 commit d65a4a6
Showing 1 changed file with 121 additions and 0 deletions.
121 changes: 121 additions & 0 deletions modules/ROOT/pages/experiment_data.adoc
@@ -0,0 +1,121 @@
= Experimental data
:imagesdir: ../images
:sectnums:
:sectnumlevels: 4
:experimental:
:keywords: AsciiDoc
:source-highlighter: highlight.js
:icons: font

## Introduction

After the execution of an experiment, the experimenter should have collected the following information:

* raw data, as provided by the sensors used in the experiment
* pre-processed data, according to the format mentioned in X
* a report document, providing an informal description of the test conducted.

We focus here on the structure and organization of the pre-processed data, as they are the input that will be used to _automatically_ compute the Performance Indicator metrics.

## Datafile naming

Although each centre may have a well-defined data organization, we are required to define a _Eurobench-compatible_ structure, to make sure the automatic scoring mechanism can handle the data submitted to trigger the benchmarking.

The proposed structure only constrains the **pre-processed data**, as the other items (raw data, report document) are not used within the benchmarking process.

Generally speaking, pre-processed data files should be named according to the following pattern:

```
subject_X_cond_Y_run_Z_[type].csv
```
where:

* `[type]` is a string related to the type of information stored in the file.
This is the root name of the file.
All possible types are described in X.
* `X`, `Y`, `Z` are integers indexing, respectively, the subjects involved, the different conditions being tested, and the repetitions per condition.
* The `subject` / `cond` / `run` namespaces are only used when they make sense:
** An experiment involving a unique subject would have pre-processed files following the pattern `cond_Y_run_Z_[type].csv`.
** An experiment with several subjects but a unique condition would follow the pattern `subject_X_run_Z_[type].csv`.
** An experiment in which there is a single repetition of each trial would have data files following the pattern `subject_X_cond_Y_[type].csv`.
** At the extreme, an experiment involving a unique subject (or a humanoid), with a single condition and no repetition, will be described by files with the pattern `[type].csv`.

Unless stated otherwise, all recorded data files are named using this pattern.
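
The naming pattern above can be checked mechanically. The following is a minimal sketch, not part of the Eurobench tooling: the function name `parse_datafile_name` and the type string `jointAngles` are hypothetical, and the sketch assumes that a `[type]` string starts with a letter and contains only letters, digits and underscores.

```python
import re

# Optional subject/cond/run prefixes, in that fixed order, then the type string.
# Assumption: [type] starts with a letter and only holds letters, digits, "_".
FILENAME_RE = re.compile(
    r"^(?:subject_(?P<subject>\d+)_)?"
    r"(?:cond_(?P<cond>\d+)_)?"
    r"(?:run_(?P<run>\d+)_)?"
    r"(?P<type>[A-Za-z]\w*)\.csv$"
)


def parse_datafile_name(filename):
    """Return the context encoded in a pre-processed data file name, or None."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    context = {"type": match.group("type")}
    for key in ("subject", "cond", "run"):
        value = match.group(key)
        # A namespace absent from the name is reported as None.
        context[key] = int(value) if value is not None else None
    return context


# "jointAngles" is a made-up type name used only for illustration.
print(parse_datafile_name("subject_2_cond_1_run_3_jointAngles.csv"))
print(parse_datafile_name("cond_1_run_3_jointAngles.csv"))
```

Namespaces missing from a name come back as `None`, which matches the reduced patterns listed above; a name that does not end in `.csv` is rejected.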

Besides the pre-processed data files, we foresee the following additional required files:

* files containing information (anthropometric and/or other aspects) about the subject involved: `subject_i_info.yaml`.
* files containing information on the robotic device used: `robot_info.yaml` (the `urdf` format may be used instead; the protocol should state it).
* files containing the settings of the controlled variables: `condition_j.yaml`.

These files can be repeated using the following patterns:

* User information: if a single human subject is involved, the data file can be named `subject_info.yaml`.
If `N` subjects are involved, then `N` files should be provided following the pattern `subject_i_info.yaml`, where `i` (from 1 to `N`) identifies the subject.
* Robotic device information: we assume only one robotic device is used per experiment, so that the file can be directly named `robot_info.yaml`.
Any change to the control settings of the robot should be mentioned in the condition file.
* Condition setting file: we identified three cases:
** If no parameter is set, no file is required.
** If the condition setting contains information specific to the subject (like an adjustment of a pushing force that may differ per subject), then one file per subject and condition is expected, named `subject_i_condition_j.yaml`, where `j` is an integer identifying one of the `Y` condition settings being considered.
** If the condition settings are common to all subjects, then only `Y` different files are expected, named `condition_j.yaml`.

This model gives us a _contextualized_ file name: from the filename alone, we can deduce the full context of a data file: its content, the subject involved, the condition setting used, and the repetition id.
A contextualized filename is unique within the complete experiment.
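
The rules for the additional files can be summarized as a small helper. This is only a sketch under stated assumptions: the function `ancillary_files` and its parameters are hypothetical names, and indices are assumed to start at 1.

```python
def ancillary_files(n_subjects, n_conditions, per_subject_conditions=False,
                    robot_file="robot_info.yaml"):
    """List the subject, robot and condition files expected next to the data.

    Assumption: subject and condition indices start at 1.
    """
    if n_subjects == 1:
        # A single subject does not need an index.
        files = ["subject_info.yaml"]
    else:
        files = [f"subject_{i}_info.yaml" for i in range(1, n_subjects + 1)]

    # Only one robotic device is assumed per experiment.
    files.append(robot_file)

    if per_subject_conditions:
        # Settings adjusted per subject: one file per (subject, condition).
        files += [
            f"subject_{i}_condition_{j}.yaml"
            for i in range(1, n_subjects + 1)
            for j in range(1, n_conditions + 1)
        ]
    else:
        # Settings common to all subjects; n_conditions == 0 yields no file.
        files += [f"condition_{j}.yaml" for j in range(1, n_conditions + 1)]
    return files


for name in ancillary_files(2, 3, per_subject_conditions=True,
                            robot_file="robot_info.urdf"):
    print(name)
```

Passing `n_conditions=0` covers the case where no parameter is set, since no condition file is then listed.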


### Illustration

**Case 1**: An experiment is conducted with 4 human subjects and a robotic device.
The experiment involves 3 condition settings (for instance, the same testbed but (1) without the robotic device, (2) with the robotic device totally passive, and (3) with the robotic device in active mode). These 3 settings are common across subjects.
Each setting has been tested and repeated 4 times, except for the second subject, who could only perform two repetitions.
The files expected are the following:

* `subject_{1,2,3,4}_info.yaml`
* `robot_info.urdf`
* `condition_{1,2,3}.yaml`
* `subject_{1,3,4}_cond_{1,2,3}_run_{1,2,3,4}_[type].csv`
* `subject_2_cond_{1,2,3}_run_{1,2}_[type].csv`

In the last two groups of files, the `[type]` string should refer to one of the pre-processed formats described in X.
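
To make the brace notation of Case 1 concrete, the sketch below expands it into explicit file names for one data type. The function `expected_datafiles` and the type string `jointAngles` are hypothetical and only serve the illustration.

```python
def expected_datafiles(runs_per_subject, n_conditions, type_name):
    """Expand the naming pattern into explicit pre-processed data file names.

    runs_per_subject maps a subject index to the number of runs completed
    for each condition.
    """
    return [
        f"subject_{subject}_cond_{cond}_run_{run}_{type_name}.csv"
        for subject, n_runs in sorted(runs_per_subject.items())
        for cond in range(1, n_conditions + 1)
        for run in range(1, n_runs + 1)
    ]


# Case 1: subjects 1, 3 and 4 completed 4 runs, subject 2 only 2.
files = expected_datafiles({1: 4, 2: 2, 3: 4, 4: 4}, n_conditions=3,
                           type_name="jointAngles")
print(len(files))  # 3 subjects x 3 conditions x 4 runs + 1 x 3 x 2 = 42
```

Such a list could be compared against the content of a submitted folder to spot missing or unexpected files before triggering the benchmarking.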

**Case 2**: Another experiment involves 2 subjects and a robotic device.
The experiment contains 3 different condition settings.
One of the configuration parameters has to be adjusted to the subject involved.
Each condition is run only once.
The files expected are the following:

* `subject_{1,2}_info.yaml`
* `robot_info.urdf`
* `subject_{1,2}_condition_{1,2,3}.yaml`
* `subject_{1,2}_cond_{1,2,3}_[type].csv`

## Data file organization

Given the previous filename pattern, the simplest option is to place all the data files in a single folder.

For convenience, the experimenter may prefer to organize the data into folders.
We envision two possibilities:

* The **contextualized filename pattern is maintained across folders**: if all files in the folders were copied into a single folder, keeping their names, all data would still follow the previous filename pattern.
The folder organization thus affects neither the naming of the files nor the possibility of understanding the context of a file from its name.
* The **filename is simplified according to the folder organization**.
In that case, the contextualized filename can be reconstructed from the folder hierarchy.

### Illustration

The first possibility has already been described in the previous section X: the files may be organized in folders, but files within the folder hierarchy will follow the complete filename pattern previously mentioned.

For the second possibility (where the folder name matters), we can get the following:

**Case 1**:

Using a YAML structure, where a key refers to a folder:


* `subject_{1,2,3,4}_info.yaml`
* `robot_info.urdf`
* `condition_{1,2,3}.yaml`
* `subject_{1,3,4}_cond_{1,2,3}_run_{1,2,3,4}_[type].csv`
* `subject_2_cond_{1,2,3}_run_{1,2}_[type].csv`
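
For the second possibility, the contextualized filename has to be rebuilt from the folder hierarchy. The sketch below assumes one hypothetical layout, in which each folder is named after a `subject_X`, `cond_Y` or `run_Z` namespace and the file keeps only its `[type]` root. This layout is an assumption made for illustration; the folder structure itself is not fixed by the text above, and `jointAngles` is again a made-up type name.

```python
from pathlib import PurePosixPath


def contextualized_name(relative_path):
    """Rebuild the flat, contextualized file name from a folder hierarchy.

    Assumption: folder names are the namespace parts (subject_X, cond_Y,
    run_Z) and the file name is the simplified remainder.
    """
    return "_".join(PurePosixPath(relative_path).parts)


print(contextualized_name("subject_1/cond_2/run_3/jointAngles.csv"))
print(contextualized_name("subject_1/info.yaml"))
```

Under this assumption, flattening the hierarchy yields the same names as the single-folder organization, so a contextualized filename remains unique in the complete experiment.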
