
Commit

Update docs
AlexDelitzas committed Apr 18, 2024
1 parent a851c16 commit 90fdc70
Showing 4 changed files with 40 additions and 8 deletions.
2 changes: 1 addition & 1 deletion search/search_index.json

Large diffs are not rendered by default.

Binary file modified sitemap.xml.gz
Binary file not shown.
2 changes: 1 addition & 1 deletion track_1/index.html
@@ -1007,7 +1007,7 @@ <h2 id="submission-instructions"><strong>Submission Instructions</strong></h2>
<p>Given the open-vocabulary query, the participants are asked to segment the object instances that best fit the query. The expected result is object instance masks and a confidence score for each mask. </p>
<p>We ask the participants to upload their results as a single <code>.zip</code> file, which, when unzipped, must contain the prediction files in its root. There must not be any additional files or folders in the archive except those specified below.</p>
<p>Results must be provided as a text file for each scene. Each text file should contain a line for each instance, containing the relative path to a binary mask of the instance, and the confidence of the prediction. The result text files must be named according to the corresponding scan, as <code>{SCENE_ID}.txt</code> with the corresponding scene ID. Predicted <code>.txt</code> files listing the instances of each scan must live in the root of the unzipped submission. Predicted instance mask files must live in a subdirectory of the unzipped submission. For instance, a submission should look like:</p>
<pre><code>submission_opensun3d
<pre><code>submission_opensun3d_track1
|__ {SCENE_ID_1}.txt
|__ {SCENE_ID_2}.txt
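<p>As a sketch of how such a per-scene result file can be produced, the short Python snippet below writes one <code>{SCENE_ID}.txt</code> from a list of already-saved mask paths and confidences; the helper name and the list-of-tuples interface are our own illustration, not part of the official toolkit.</p>
<pre><code>def write_scene_results(scene_id, instances, out_dir):
    """Write {SCENE_ID}.txt with one line per predicted instance.

    `instances` is a list of (relative_mask_path, confidence) tuples,
    where each relative path points to a binary mask file saved inside
    the submission directory.
    """
    lines = [f"{rel_path} {conf:.4f}" for rel_path, conf in instances]
    with open(f"{out_dir}/{scene_id}.txt", "w") as f:
        f.write("\n".join(lines) + "\n")
</code></pre>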
44 changes: 38 additions & 6 deletions track_2/index.html
@@ -951,7 +951,7 @@ <h3 id="challenge-phases">Challenge Phases</h3>
</li>
</ul>
<h3 id="data-organization-and-format">Data organization and format</h3>
<p>We represent each scene with a visit_id (6-digit number) and each video sequence with a video_id (8-digit number). Each scene has on average three video sequences recorded with a 2020 iPad Pro.</p>
<p>We represent each scene with a visit_id (6-digit number) and each video sequence with a video_id (8-digit number). For each scene, we provide a high-resolution point cloud generated by combining multiple Faro laser scans of the scene. Additionally, each scene is accompanied by on average three video sequences recorded with a 2020 iPad Pro.</p>
<pre><code>PATH/TO/DATA/DIR/{dev or test}/
├── {visit_id}/
| ├── {visit_id}.ply # combined Faro laser scan with 5mm resolution
@@ -983,8 +983,8 @@ <h3 id="data-organization-and-format">Data organization and format</h3>
.
</code></pre>
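<p>As a quick sanity check after downloading, the laser scan can be loaded into a point array, for example with <code>open3d</code>; the use of open3d here is our own choice for illustration and is not required by the challenge toolkit.</p>
<pre><code>import numpy as np
import open3d as o3d

visit_id = "123456"
laser_scan_path = f"PATH/TO/DATA/DIR/dev/{visit_id}/{visit_id}.ply"

pcd = o3d.io.read_point_cloud(laser_scan_path)   # combined Faro laser scan
points = np.asarray(pcd.points)                  # (N, 3) vertex coordinates
print(points.shape)                              # N vertices, in file order
</code></pre>
<p>The vertex order read here is the order of the vertices in the <code>.ply</code> file, which is the order that predictions are expected to follow (see the submission instructions below).</p>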
<h3 id="annotations-format">Annotations format</h3>
<p>Annotations are organized in two separate files and follow this format:</p>
<p><em>descriptions.json</em></p>
<p>We provide GT annotations for the scenes in the development set. The annotations are organized in two separate files and follow this format:</p>
<p><em><a href="https://github.com/OpenSun3D/cvpr24-challenge/blob/main/challenge_track_2/benchmark_data/descriptions_dev.json">descriptions_dev.json</a></em></p>
<pre><code>[
{
&quot;desc_id&quot;: unique id of the description,
@@ -997,7 +997,7 @@ <h3 id="annotations-format">Annotations format</h3>
...
]
</code></pre>
<p><em>annotations.json</em></p>
<p><em><a href="https://github.com/OpenSun3D/cvpr24-challenge/blob/main/challenge_track_2/benchmark_data/annotations_dev.json">annotations_dev.json</a></em></p>
<pre><code>[
{
&quot;annot_id&quot;: unique id of the annotation,
@@ -1007,7 +1007,7 @@ <h3 id="annotations-format">Annotations format</h3>
...
]
</code></pre>
<p>The file <em>descriptions.json</em> contains the language task descriptions and links them to the corresponding functional interactive element instances. The file <em>annotations.json</em> contains the functional interactive element annotations, i.e., the mask indices of a single functional interactive element instance in the original laser scan. </p>
<p>The file <em>descriptions_dev.json</em> contains the language task descriptions and links them to the corresponding functional interactive element instances. The file <em>annotations_dev.json</em> contains the functional interactive element annotations, i.e., the mask indices of a single functional interactive element instance in the original laser scan. </p>
<blockquote>
<p>&#128221; We <em>highlight</em> that a single language task description can correspond to one or multiple functional interactive element instances.</p>
</blockquote>
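<p>A minimal sketch of how the two files can be joined is shown below. The key that links a description to its annotations and the key that stores the mask indices are truncated in the snippets above; the names <code>annot_id</code> (a list of linked annotation ids per description) and <code>indices</code> (vertex indices per annotation) used here are assumptions for illustration and should be checked against the released files.</p>
<pre><code>import json

with open("descriptions_dev.json") as f:
    descriptions = json.load(f)
with open("annotations_dev.json") as f:
    annotations = json.load(f)

# Index annotations by their unique id.
annot_by_id = {a["annot_id"]: a for a in annotations}

for desc in descriptions:
    # A single task description may reference one or several instances.
    for annot_id in desc["annot_id"]:                     # assumed key
        mask_indices = annot_by_id[annot_id]["indices"]   # assumed key
        print(desc["desc_id"], annot_id, len(mask_indices))
</code></pre>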
@@ -1156,7 +1156,39 @@ <h2 id="example-code">Example code</h2>
</code></pre>
<p>where the <code>wide</code> RGB frames are used for coloring, the extraneous points will be cropped from the laser scan, and the output will be stored.</p>
<h2 id="submission-instructions">Submission Instructions</h2>
<p>Coming soon.</p>
<p>Given the open-vocabulary language task description, the participants are asked to segment the functional interactive element instances that an agent needs to interact with to successfully accomplish the task. The expected result is functional interactive element masks and a confidence score for each mask. </p>
<p>We ask the participants to upload their results as a single <code>.zip</code> file, which, when unzipped, must contain the prediction files in its root. There must not be any additional files or folders in the archive except those specified below.</p>
<p>Results must be provided as a text file for each pair of laser scan and language description. Each text file should contain one line per instance, giving the relative path to a binary mask of the instance and the confidence of the prediction. The result text files must be named after the corresponding laser scan (<code>visit_id</code>) and language description (<code>desc_id</code>), as <code>{visit_id}_{desc_id}.txt</code>. Predicted <code>.txt</code> files listing the instances of each scan must live in the root of the unzipped submission. Predicted instance mask files must live in a subdirectory named <code>predicted_masks/</code> of the unzipped submission. For example, a submission should look like the following:</p>
<pre><code>submission_opensun3d_track2
|__ {visit_id_1}_{desc_id_1}.txt
|__ {visit_id_2}_{desc_id_2}.txt
|__ {visit_id_N}_{desc_id_N}.txt
|__ predicted_masks/
|__ {visit_id_1}_{desc_id_1}_000.txt
|__ {visit_id_1}_{desc_id_1}_001.txt
</code></pre>
<p>for all N available pairs of (laser scan, language description).</p>
<p>Each prediction file for a scene should contain a list of instances, where an instance is described by (1) the relative path to the predicted mask file and (2) a float confidence score. If your method does not produce confidence scores, you can use 1.0 as the confidence score for all masks. Each line in the prediction file should correspond to one instance, with the two values above separated by a space; consequently, the filenames in the prediction files must not contain spaces.
Each predicted instance mask file should provide a mask over the vertices of the provided laser scan, i.e. <code>{visit_id}_laser_scan.ply</code>, following the original order of the vertices in this file.
Each instance mask file should contain one line per point, with each line containing an integer value; non-zero values indicate that the corresponding point is part of the instance. For example, consider a scene identified by visit_id <code>123456</code>, with a language description input identified by desc_id <code>5baea371-b33b-4076-92b1-587a709e6c65</code>. In this case, the submission files could look like:</p>
<p><code>123456_5baea371-b33b-4076-92b1-587a709e6c65.txt</code></p>
<pre><code>predicted_masks/123456_5baea371-b33b-4076-92b1-587a709e6c65_000.txt 0.7234
predicted_masks/123456_5baea371-b33b-4076-92b1-587a709e6c65_001.txt 0.9038
</code></pre>
<p>and <code>predicted_masks/123456_5baea371-b33b-4076-92b1-587a709e6c65_000.txt</code> could look like:</p>
<pre><code>0
0
1
1
0
</code></pre>
<blockquote>
<p>&#128221; <strong>IMPORTANT NOTE</strong>: The prediction files must adhere to the vertex ordering of the original laser scan point cloud <code>{visit_id}_laser_scan.ply</code>. If your pipeline alters this vertex ordering (e.g., through cropping the laser scan using the <code>crop_mask</code> data asset), ensure that the model predictions are re-ordered to match the original vertex ordering before generating the prediction files.</p>
</blockquote>
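<p>To make the expected layout concrete, here is a minimal, unofficial sketch that writes one submission entry from predicted per-vertex masks; the function name and the numpy-based mask representation are our own assumptions, and the masks are assumed to already follow the vertex order of <code>{visit_id}_laser_scan.ply</code>.</p>
<pre><code>import os
import numpy as np

def write_submission_entry(root, visit_id, desc_id, masks, scores):
    """Write {visit_id}_{desc_id}.txt plus one mask file per instance.

    `masks` is a list of boolean numpy arrays of length N (the number of
    vertices in the original laser scan, in the original vertex order) and
    `scores` is the matching list of float confidences.
    """
    mask_dir = os.path.join(root, "predicted_masks")
    os.makedirs(mask_dir, exist_ok=True)

    lines = []
    for i, (mask, score) in enumerate(zip(masks, scores)):
        rel_path = f"predicted_masks/{visit_id}_{desc_id}_{i:03d}.txt"
        # One integer per vertex; non-zero marks membership in the instance.
        np.savetxt(os.path.join(root, rel_path), mask.astype(int), fmt="%d")
        lines.append(f"{rel_path} {score:.4f}")

    with open(os.path.join(root, f"{visit_id}_{desc_id}.txt"), "w") as f:
        f.write("\n".join(lines) + "\n")
</code></pre>
<p>Packing the contents of <code>root</code> into a single <code>.zip</code> then yields an archive with the prediction files in its root, as required above.</p>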
<h2 id="evaluation-guidelines">Evaluation Guidelines</h2>
<p>In order to evaluate the results on the scenes of the dev set, we provide <a href="https://github.com/OpenSun3D/cvpr24-challenge/blob/main/challenge_track_2/benchmark_eval/eval_utils/eval_script_inst.py">evaluation functions</a> as well as an example <a href="https://github.com/OpenSun3D/cvpr24-challenge/blob/main/challenge_track_2/benchmark_eval/demo_eval.py">evaluation script</a>. We follow the standard evaluation for 3D instance segmentation, and compute Average Precision (AP) scores. The evaluation script computes the AP scores for each language task description and then averages the scores over all language task descriptions in the set. </p>
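<p>For intuition, the central quantity behind the AP computation is the intersection-over-union (IoU) between a predicted per-vertex mask and a ground-truth mask. The helper below is only an illustrative sketch of that computation, not a substitute for the linked evaluation script.</p>
<pre><code>import numpy as np

def mask_iou(pred_mask, gt_mask):
    """IoU between two boolean per-vertex masks of equal length."""
    pred_mask = pred_mask.astype(bool)
    gt_mask = gt_mask.astype(bool)
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return intersection / union if union > 0 else 0.0
</code></pre>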
<p>You can run the example evaluation script as:</p>

