Training with data generated by Isaac Sim Replicator #381
Comments
@nv-jeff Maybe Jeff could help you; I have never used Isaac Replicator, sorry. |
@TontonTremblay Thanks for your answer. I would really appreciate it if @nv-jeff could also chime in. In the meantime, do you have any comments on the ongoing training (loss, maps, etc.)? |
I think you want a loss of around 0.001, but I'm not sure; I would need to check other issue threads. Your object may also have symmetries. Check the section on generating data with symmetries. |
Ahh, you are using the YCB wood block. Yes, that object has symmetries. You will have to define them. |
I understand, but this object is not generated by NVISII, so where am I supposed to define model_info.json? I just have .pngs and .jsons in my dataset. I don't see any option for it in train.py either. Is this supposed to be in Isaac Sim's data generation config?
|
@TontonTremblay @nv-jeff, my training just completed. Even though the loss value looks okay, I cannot get any inference results with |
Can you share the belief maps? I think it is |
Only thing I could find related to but it throws:
So I have changed
to
I also set the following fields in
How can I get belief map images? Sorry, I could not find any explanation for this in the repo. Thank you for your support. |
I have a TensorBoard view for epoch 176. Would it be useful for you, @TontonTremblay? |
yeah it looks like you have symmetries. see how for some points it does not know which one is which? you need to add them to your annotation. https://github.com/NVlabs/Deep_Object_Pose/blob/master/data_generation/readme.md#handling-objects-with-symmetries read this. |
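As a rough illustration of the symmetry annotation discussed above, here is a minimal sketch that writes a `model_info.json` listing discrete symmetries for a box. It assumes a BOP-style `symmetries_discrete` field holding row-major 4x4 transforms and an object frame centered on the box with axes along its edges; the exact field names and where the file must live should be checked against the linked readme. The helper names `rot180` and `write_model_info` are hypothetical.

```python
import json


def rot180(axis: str) -> list[float]:
    """Row-major 4x4 homogeneous matrix for a 180-degree rotation about one axis."""
    sx, sy, sz = {"x": (1, -1, -1), "y": (-1, 1, -1), "z": (-1, -1, 1)}[axis]
    return [
        sx, 0, 0, 0,
        0, sy, 0, 0,
        0, 0, sz, 0,
        0, 0, 0, 1,
    ]


# Assumption: a box with three distinct side lengths, whose only non-trivial
# rotational symmetries are half-turns about its three principal axes.
# The field name follows the BOP convention and is an assumption here.
model_info = {
    "symmetries_discrete": [rot180("x"), rot180("y"), rot180("z")],
}


def write_model_info(path: str) -> None:
    """Write the symmetry description to a JSON file."""
    with open(path, "w") as f:
        json.dump(model_info, f, indent=2)
```

If two sides of the block have equal length, the object has additional quarter-turn symmetries about the remaining axis that this sketch does not list.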
@TontonTremblay Thanks for your answer. Before I saw it, I had decided to use Ketchup as a reference point to see whether I could get an expected result. I have generated ~3900 images using Blenderproc. What do you think about my belief maps at around epoch 165? |
These look great to my eye.
…On Wed, Aug 21, 2024 at 12:25 Doruk Sönmez ***@***.***> wrote:
Screenshot.from.2024-08-21.22-08-44.png (view on web)
<https://github.com/user-attachments/assets/0fb7a4b5-a2cf-4893-87a3-c8ebfbdd88fc>
|
Can you try on an image with a single ketchup bottle? Also, I think the
order of your cuboid keypoints looks strange. I forget whether you used Isaac Sim
or NVISII.
…On Thu, Aug 22, 2024 at 07:33 Doruk Sönmez ***@***.***> wrote:
I think I'm getting somewhere with this but my inference results are not
good. I was able to run train2/inference.py with small modifications to
the detector.py line 496 (which prevented running with an error):
#belief -= float(torch.min(belief)[0].data.cpu().numpy())
#belief /= float(torch.max(belief)[0].data.cpu().numpy())
belief -= float(torch.min(belief).data.cpu().numpy())
belief /= float(torch.max(belief).data.cpu().numpy())
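The change above works because `torch.min(belief)` called without a `dim` argument returns a single 0-dimensional tensor rather than a (values, indices) pair, so indexing it with `[0]` fails on recent PyTorch versions. The same min-max normalization can be sketched without any framework as follows; the function name `normalize_belief` is hypothetical, and the guard for a constant map is an addition that the original lines do not have.

```python
def normalize_belief(belief: list[list[float]]) -> list[list[float]]:
    """Rescale a 2D belief map to [0, 1] by subtracting the min and dividing by the range."""
    lo = min(min(row) for row in belief)
    hi = max(max(row) for row in belief)
    span = hi - lo
    if span == 0:
        # A constant map carries no peak; return zeros instead of dividing by zero.
        return [[0.0 for _ in row] for row in belief]
    return [[(v - lo) / span for v in row] for row in belief]
```

In the PyTorch version, `belief.min().item()` and `belief.max().item()` are an equivalent way to obtain the two scalars.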
Here are a couple of belief maps from the actual train2/inference.py script,
along with the pose estimation results:
file_12.png (view on web)
<https://github.com/user-attachments/assets/e5ae3d08-3a49-4647-b4b6-6e0cd1a2e0b0>
file_12_belief.png (view on web)
<https://github.com/user-attachments/assets/d77f5e31-dd8f-4c50-8be3-a76462b6a010>
file_25.png (view on web)
<https://github.com/user-attachments/assets/961621ad-d6ac-4908-81af-8b07afbed3ef>
file_25_belief.png (view on web)
<https://github.com/user-attachments/assets/a0a46dbe-640c-4659-989c-884fa2eb9ec4>
In some of the images, there is no result at all:
file_28.png (view on web)
<https://github.com/user-attachments/assets/361ac12d-91c7-4295-8a8a-c76c92389817>
file_28_belief.png (view on web)
<https://github.com/user-attachments/assets/eb73dc97-8022-4523-9d21-64e6dffab7ae>
What might be the issue?
|
OK, I see the problem. The order of the cuboid keypoints in BlenderProc
is different from the one in NVISII. I am out this week. Ping nv-jeff so he can look
into that and fix one or the other. It might mean adding a flag for where the
data comes from.
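To illustrate what reconciling the two keypoint orders could look like, here is a minimal sketch that reorders a list of projected cuboid points with a permutation. The permutation shown is a placeholder only, not the real BlenderProc-to-NVISII mapping, which would have to be worked out by comparing the two generators (for example with the debug overlays); `reorder_cuboid` and `EXAMPLE_ORDER` are hypothetical names.

```python
from typing import Sequence

Point = Sequence[float]


def reorder_cuboid(points: Sequence[Point], order: Sequence[int]) -> list[Point]:
    """Return the keypoints rearranged so that output[i] = points[order[i]]."""
    if sorted(order) != list(range(len(points))):
        raise ValueError("order must be a permutation of the keypoint indices")
    return [points[i] for i in order]


# PLACEHOLDER permutation for eight cuboid corners; the actual correspondence
# between the two conventions is not given in this thread.
EXAMPLE_ORDER = [1, 0, 3, 2, 5, 4, 7, 6]
```

Applying such a remap once, when the annotations are loaded, would keep the training and inference code unaware of which generator produced the data.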
…On Thu, Aug 22, 2024 at 09:52 Doruk Sönmez ***@***.***> wrote:
And this one is without distractors:
000000.png (view on web)
<https://github.com/user-attachments/assets/8dfdb915-cd8b-4444-8a6d-0c6cf5521ed8>
000000_belief.png (view on web)
<https://github.com/user-attachments/assets/b462f4ba-e810-4a49-aed8-09173ed2e53b>
|
@TontonTremblay Okay, many thanks for clarifying. @nv-jeff, could we please get your help with this issue? Thanks in advance. |
Hi, first of all, sorry for the long post. I'm currently experimenting with training DOPE to fully understand the project and its theoretical background. First, I generated synthetic data using Blenderproc, since NVISII is no longer supported by newer Python versions. However, I missed the point that the command below should be run 5 times, as mentioned here: #361 (comment). Data generation was very slow, and I ended up with only ~1600 images.
./run_blenderproc_datagen.py --nb_runs 5 --nb_frames 1000 --path_single_obj /isaac_dope_retrain/Popcorn/google_16k/textured.obj --nb_objects 6 --distractors_folder ../nvisii_data_gen/google_scanned_models --nb_distractors 10 --backgrounds_folder ../dome_hdri_haven/
Then I followed NVIDIA's official documentation on generating a synthetic dataset using Isaac Sim Replicator and generated ~7200 images for the 036_wood_block class, which is from the YCB dataset. I then set aside 202 images for testing and ran the debug tool on this split as follows:
python3 debug.py --data ../isaac_data/test/
Here is the debug result:
My first question is: do these debug results look correct for the object? (It looks like there is no connection to the 8th point.)
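For reference, a hold-out split like the 202-image test set mentioned above could be scripted roughly as follows. This is a sketch under the assumption that every frame is stored as a `.png` with a same-named `.json` annotation in one flat folder; `split_holdout` is a hypothetical helper, not part of the DOPE repository.

```python
import random
import shutil
from pathlib import Path


def split_holdout(src: Path, dst: Path, n_test: int, seed: int = 0) -> list[str]:
    """Move n_test image/annotation pairs from src to dst and return the moved stems."""
    # Only consider frames that have both an image and an annotation file.
    stems = sorted(
        p.stem for p in src.glob("*.png") if (src / f"{p.stem}.json").exists()
    )
    if n_test > len(stems):
        raise ValueError("not enough complete image/annotation pairs")
    chosen = random.Random(seed).sample(stems, n_test)
    dst.mkdir(parents=True, exist_ok=True)
    for stem in chosen:
        for ext in (".png", ".json"):
            shutil.move(str(src / f"{stem}{ext}"), str(dst / f"{stem}{ext}"))
    return sorted(chosen)
```

Fixing the seed keeps the split reproducible, so the same frames end up in the test folder if the dataset is regenerated with identical file names.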
Then I started DOPE training with the train/train.py script, with a batch size of 16 and for 200 epochs:
python3 -m torch.distributed.launch --nproc_per_node=1 train.py --batchsize 16 -e 200 --data ../isaac_data/isaac_data/ --object _36_wood_block
Here is my training log, TensorBoard, and belief maps so far:
My second question is: do these loss values, TensorBoard plots, and belief maps indicate a good training process? (I'm asking because training takes a long time on an RTX 4070 Ti, and if something is wrong, I would like to fix it and restart the training.)
My third question is: I have noticed some differences between NVISII/Blenderproc data and Isaac Sim Replicator data. Would these differences affect my training or inference process? Are there any other steps that should be applied to Isaac data before or after training to get the expected results?
Sample data for cracker:
Wood block data from Isaac Sim Replicator:
Thanks in advance.