Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed instruction::replace() logic - Fix CI. #3574

Merged
merged 16 commits into from
Nov 11, 2024
Merged

Conversation

tcgu-amd
Copy link
Contributor

@tcgu-amd tcgu-amd commented Oct 30, 2024

Fixing issues in PR #3553 that is causing failures in some of the tests.

Description of 3553:

The previous fix with BFS doesn't fully work in more complex cases (e.g. it will fail in the newly added test case check_replace_dag). This fix implements topological sorting to replace instructions in topological order which should work for all cases.

More details:

In a dummy scenario of add2(reduce(x), add1(abs(reduce(x)), sin(reduce(x)))), we will have a dependency tree looking like

reduce _
        \_abs__
         \_sin__\_add1_
          \_____________\_add2

If we call reduce.replace(), BFS will visit the instructions in the following order:

reduce -> abs -> sin -> add2 -> add1

This will causes an error of shape mismatch at add2 because it is called before its input add1.

Topological sorting the instruction tree will yield:

reduce -> sin -> abs -> add1 -> add2

Which is the correct order to process the instructions.

This should be able to extend to more complex cases.

… fully work in more complex cases (e.g. it will fail in the newly added test case check_replace_dag). This fix implements topological sorting to replace instruction in topological order which should work for all cases.
…ly based on Khan's algorithm.

This version avoids sorting the entire graph, and will terminate when no more changes are requried like old versions
added, which preserves the topological ordering of dependencies.
@tcgu-amd tcgu-amd requested a review from causten as a code owner October 30, 2024 00:25
@tcgu-amd tcgu-amd requested review from pfultz2 and causten and removed request for causten October 30, 2024 00:25
@tcgu-amd tcgu-amd changed the title Fix bug #3553 causing failures Fixed instruction::replace() logic - Fix CI. Oct 30, 2024
Copy link

codecov bot commented Oct 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.17%. Comparing base (f5df004) to head (0e4eb3d).
Report is 129 commits behind head on develop.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #3574   +/-   ##
========================================
  Coverage    92.17%   92.17%           
========================================
  Files          513      513           
  Lines        21536    21547   +11     
========================================
+ Hits         19851    19862   +11     
  Misses        1685     1685           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@migraphx-bot
Copy link
Collaborator

Test Batch Rate new
0e4eb3
Rate old
f5df00
Diff Compare
torchvision-resnet50 64 3,257.79 3,256.17 0.05%
torchvision-resnet50_fp16 64 6,981.39 6,985.11 -0.05%
torchvision-densenet121 32 2,434.89 2,435.96 -0.04%
torchvision-densenet121_fp16 32 4,090.78 4,058.51 0.80%
torchvision-inceptionv3 32 1,638.23 1,636.64 0.10%
torchvision-inceptionv3_fp16 32 2,761.72 2,762.67 -0.03%
cadene-inceptionv4 16 777.06 776.18 0.11%
cadene-resnext64x4 16 810.41 811.76 -0.17%
slim-mobilenet 64 7,531.07 7,534.07 -0.04%
slim-nasnetalarge 64 211.38 211.45 -0.03%
slim-resnet50v2 64 3,502.51 3,504.24 -0.05%
bert-mrpc-onnx 8 1,151.97 1,149.67 0.20%
bert-mrpc-tf 1 471.57 463.86 1.66%
pytorch-examples-wlang-gru 1 423.43 420.46 0.70%
pytorch-examples-wlang-lstm 1 382.69 381.56 0.30%
torchvision-resnet50_1 1 802.95 780.85 2.83%
cadene-dpn92_1 1 401.20 405.55 -1.07%
cadene-resnext101_1 1 384.59 383.55 0.27%
onnx-taau-downsample 1 343.27 343.07 0.06%
dlrm-criteoterabyte 1 33.33 33.34 -0.05%
dlrm-criteoterabyte_fp16 1 52.71 52.74 -0.06%
agentmodel 1 9,998.60 8,306.27 20.37% 🔆
unet_fp16 2 58.89 58.82 0.12%
resnet50v1_fp16 1 1,002.99 1,001.66 0.13%
resnet50v1_int8 1 1,011.13 995.76 1.54%
bert_base_cased_fp16 64 1,171.70 1,171.04 0.06%
bert_large_uncased_fp16 32 363.41 363.62 -0.06%
bert_large_fp16 1 200.31 198.87 0.72%
distilgpt2_fp16 16 2,205.45 2,204.83 0.03%
yolov5s 1 531.88 540.84 -1.66%
tinyllama 1 43.45 43.47 -0.05%
vicuna-fastchat 1 176.36 176.64 -0.16%
whisper-tiny-encoder 1 418.01 418.46 -0.11%
whisper-tiny-decoder 1 428.02 433.85 -1.34%

Check results before merge 🔆

@migraphx-bot
Copy link
Collaborator


     ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

     ✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

     ✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

     ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

     ✅ agentmodel: PASSED: MIGraphX meets tolerance

     ✅ unet: PASSED: MIGraphX meets tolerance

     ✅ resnet50v1: PASSED: MIGraphX meets tolerance

     ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ bert_large: PASSED: MIGraphX meets tolerance

     ✅ yolov5s: PASSED: MIGraphX meets tolerance

     ✅ tinyllama: PASSED: MIGraphX meets tolerance

     ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

     ✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

@causten causten merged commit 2f97579 into ROCm:develop Nov 11, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants