Add support for AVX10.2, Add AVX10.2 API surface and template tests #111209

Open
wants to merge 38 commits into
base: main

Contributor

@khushal1996 khushal1996 commented Jan 8, 2025

  • This PR implements the approved AVX10.2 APIs and adds the corresponding template tests and lowering support in the JIT.

  • It enables YMM embedded rounding, which comes with AVX10.2.

Testing overview
We follow a multi-step testing plan to verify the encoding correctness and the semantic correctness.

Testing results will be presented below.

  1. Emitter unit tests
    In codegenxarch.cpp, similar to genAmd64EmitterUnitTestsSse2, we used the JitLateDisasm feature to insert instructions to be encoded as unit tests for the emitter. LateDisasm invokes LLVM to disassemble the code stream, which lets us cross-validate the disassembly from the JIT against the one from LLVM. The goal of this step is to verify that the emit paths generate "correct" code that does not trigger #UD or carry the wrong semantics.

Note that we are using a custom coredistools.dll which uses a recent LLVM that supports AVX10.2 decoding.

  2. SuperPMI
    In this step, we run the SuperPMI pipeline to get the asmdiffs; the inputs are all the MCH files. This step lets us check for assertion failures or internal errors within the JIT. Because the pipeline also invokes coredistools.dll, it verifies the encoding correctness in a larger scope.

To check that the new changes do not affect the existing code paths (for example, in terms of throughput), we ran asmdiffs with the base JIT built from the main branch that the changes are based on, and the diff JIT built with all the changes.

  3. JIT unit tests
    The two steps above mainly verify the encoding correctness of the generated binary code. In this step we ran the existing CoreCLR JIT unit test set in the Intel SDE emulator with AVX10.2 both on and off.

Testing results


Run Emitter tests

Result of emitter tests using the LLVM disassembler (three screenshots; left: JIT-emitted code, right: LLVM disassembler output)

Run superpmi using JITLateDisasm to check for errors
No Decode failures observed in superpmi log.

Running superpmi without JITLateDisasm to check for assert errors
No assertion failures or asm diffs observed

[11:10:47] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\aspnet.run.windows.x64.checked.mch
[11:10:47] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\aspnet.run.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\aspnet.run.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\aspnet.run.windows.x64.checked.mch
[11:11:03] SuperPMI encountered missing data for 33 out of 129205 contexts
[11:11:03] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\benchmarks.run.windows.x64.checked.mch
[11:11:03] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\benchmarks.run.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\benchmarks.run.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\benchmarks.run.windows.x64.checked.mch
[11:11:09] SuperPMI encountered missing data for 13 out of 28757 contexts
[11:11:09] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\benchmarks.run_pgo.windows.x64.checked.mch
[11:11:09] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\benchmarks.run_pgo.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\benchmarks.run_pgo.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\benchmarks.run_pgo.windows.x64.checked.mch
[11:11:22] SuperPMI encountered missing data for 62 out of 105618 contexts
[11:11:22] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\benchmarks.run_tiered.windows.x64.checked.mch
[11:11:22] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\benchmarks.run_tiered.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\benchmarks.run_tiered.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\benchmarks.run_tiered.windows.x64.checked.mch
[11:11:27] SuperPMI encountered missing data for 2 out of 55912 contexts
[11:11:27] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\coreclr_tests.run.windows.x64.checked.mch
[11:11:27] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\coreclr_tests.run.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\coreclr_tests.run.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\coreclr_tests.run.windows.x64.checked.mch
[11:13:16] SuperPMI encountered missing data for 149 out of 582221 contexts
[11:13:16] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries.crossgen2.windows.x64.checked.mch
[11:13:16] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries.crossgen2.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries.crossgen2.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries.crossgen2.windows.x64.checked.mch
[11:13:36] SuperPMI encountered missing data for 3 out of 280377 contexts
[11:13:36] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries.pmi.windows.x64.checked.mch
[11:13:36] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries.pmi.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries.pmi.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries.pmi.windows.x64.checked.mch
[11:14:05] SuperPMI encountered missing data for 68 out of 295086 contexts
[11:14:05] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries_tests.run.windows.x64.Release.mch
[11:14:05] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries_tests.run.windows.x64.Release.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries_tests.run.windows.x64.Release.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries_tests.run.windows.x64.Release.mch
[11:15:22] SuperPMI encountered missing data for 447 out of 751895 contexts
[11:16:23] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch
[11:16:23] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch
[11:17:17] SuperPMI encountered missing data for 208 out of 342818 contexts
[11:17:17] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\realworld.run.windows.x64.checked.mch
[11:17:17] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\realworld.run.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\realworld.run.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\realworld.run.windows.x64.checked.mch
[11:17:23] SuperPMI encountered missing data for 11 out of 24824 contexts
[11:17:23] Running asm diffs of D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\smoke_tests.nativeaot.windows.x64.checked.mch
[11:17:23] Invoking: D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe -a -v ewi -f C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\smoke_tests.nativeaot.windows.x64.checked.mch_fail.mcl -details C:\Users\kmodi\AppData\Local\Temp\2\tmp98kfiph0\smoke_tests.nativeaot.windows.x64.checked.mch_details.csv -jitoption force JitEnableNoWayAssert=1 -jitoption force JitNoForceFallback=1 -jitoption force JitAlignLoops=0 -jit2option force JitEnableNoWayAssert=1 -jit2option force JitNoForceFallback=1 -jit2option force JitAlignLoops=0 -p D:\Base_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll D:\Base_repos\runtime\artifacts\spmi\mch\64146448-11b1-4f94-b1f2-edce91fbcb33.windows.x64\smoke_tests.nativeaot.windows.x64.checked.mch
[11:17:27] SuperPMI encountered missing data for 2 out of 29727 contexts
[11:17:27] Asm diffs summary:
[11:17:27]   Summary Markdown file: D:\Base_repos\runtime\artifacts\spmi\diff_summary.1.md
[11:17:27]   Short Summary Markdown file: D:\Base_repos\runtime\artifacts\spmi\diff_short_summary.1.md
[11:17:27]   No asm diffs
[11:17:27] Finish time: 11:17:27
[11:17:27] Elapsed time: 0:06:39.678574
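
To put the "missing data" lines in the log above into proportion, the per-collection counts can be added up. The following is a minimal sketch; the struct and function names are hypothetical, and the numbers are copied from the "SuperPMI encountered missing data" lines of the log.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One row per MCH collection: {contexts with missing data, total contexts},
// copied from the "SuperPMI encountered missing data" lines in the log.
struct Collection
{
    std::uint64_t missing;
    std::uint64_t total;
};

inline std::vector<Collection> loggedCollections()
{
    return {
        {33, 129205},  // aspnet.run
        {13, 28757},   // benchmarks.run
        {62, 105618},  // benchmarks.run_pgo
        {2, 55912},    // benchmarks.run_tiered
        {149, 582221}, // coreclr_tests.run
        {3, 280377},   // libraries.crossgen2
        {68, 295086},  // libraries.pmi
        {447, 751895}, // libraries_tests.run
        {208, 342818}, // libraries_tests_no_tiered_compilation.run
        {11, 24824},   // realworld.run
        {2, 29727},    // smoke_tests.nativeaot
    };
}

// Sums both columns across all collections.
inline Collection sumCollections(const std::vector<Collection>& rows)
{
    Collection totals{0, 0};
    for (const Collection& row : rows)
    {
        totals.missing += row.missing;
        totals.total += row.total;
    }
    return totals;
}
```

Calling `sumCollections(loggedCollections())` gives the overall number of contexts with missing data and the overall number of contexts replayed across the eleven collections.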

Run JIT subtree with AVX10.2 enabled / disabled

AVX10.2 enabled (screenshot)

AVX10.2 disabled (screenshot)

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI new-api-needs-documentation labels Jan 8, 2025

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jan 8, 2025
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@khushal1996 khushal1996 force-pushed the kcm-avx102-api-public-pr branch from ec7732d to 216999c Compare January 8, 2025 18:55
khushal1996 and others added 20 commits January 8, 2025 11:58
Comment on lines 1439 to 1440
// Reserved for isas Avx10.1 and below
// Needs to be set to 0 for AVX10.2 adn above to indicate YMM embedded rounding
Member

nit:

Suggested change
- // Reserved for isas Avx10.1 and below
- // Needs to be set to 0 for AVX10.2 adn above to indicate YMM embedded rounding
+ // Reserved for isas Avx10.1 and below
+ // Set to 0 on AVX10.2 and above for YMM embedded rounding support

Or something along those lines. The current wording makes it sound like it must always be set to 0, rather than only conditionally set if we're using embedded rounding for YMM sizes.

Member

I believe on Avx10.1 and below it is required to be set to 1 as well, so this should likely indicate that. Simply saying "reserved" doesn't imply a necessary state.

Contributor Author

I have pushed these changes in my latest commit
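
The behavior described in this thread (the bit stays 1 by default and is cleared only for YMM embedded rounding on AVX10.2 and above) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name, its parameters, and the bit position (bit 2 of the second EVEX payload byte) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed position of the bit under discussion within a payload byte.
// This mask is hypothetical and is not taken from the PR.
constexpr std::uint8_t kReservedBitMask = 0x04;

// Returns the payload byte with the bit set to 1 (the required state on
// AVX10.1 and below), or cleared to 0 only when YMM embedded rounding is
// requested and AVX10.2 is available.
inline std::uint8_t encodeReservedBit(std::uint8_t payload, bool hasAvx10v2, bool ymmEmbeddedRounding)
{
    if (hasAvx10v2 && ymmEmbeddedRounding)
    {
        return static_cast<std::uint8_t>(payload & ~kReservedBitMask);
    }
    return static_cast<std::uint8_t>(payload | kReservedBitMask);
}
```

The point of the sketch is the reviewers' distinction: the bit is not unconditionally 0 on AVX10.2, and "reserved" on older ISAs still means a required value of 1.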

- // Scalar Double Scalar Single Packed Double
- return ((b == 0xF2) || (b == 0xF3) || (b == 0x66));
+ // Scalar Double Scalar Single Packed Double No prefix
+ return ((b == 0xF2) || (b == 0xF3) || (b == 0x66) || (b == 0x00));
Member

This change isn't necessary given the assert(b != 0 above

Contributor Author

I have pushed these changes in my latest commit
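
A minimal, self-contained sketch of why the extra `b == 0x00` comparison is redundant: once the function asserts that `b` is non-zero, the zero case can never reach the `return`. The function name here is hypothetical; only the three prefix values and the assert come from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the helper under review.
inline bool isSimdPrefixByte(std::uint8_t b)
{
    // Callers are expected never to pass "no prefix".
    assert(b != 0);

    //      Scalar Double    Scalar Single    Packed Double
    return (b == 0xF2) || (b == 0xF3) || (b == 0x66);
}
```

With the assert in place, adding `(b == 0x00)` to the condition would only matter in builds where asserts are compiled out, which is presumably not the intent.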

@@ -2314,6 +2346,13 @@ emitter::code_t emitter::emitExtractEvexPrefix(instruction ins, code_t& code) co
break;
}

case 0x05:
Member

Does this need to handle leadingBytes being [0x00, 0x04] and [0x06, 0x07]?

The higher assert seems to indicate it can be any of those, but you've only added 0x05

Contributor Author

As of now we don't have those instructions. But yes, I think it is better to handle those values as invalid for now, since no instructions use them yet.

Contributor Author

I have pushed these changes in my latest commit
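
One way to read the resolution of this thread is sketched below: every value the assert allows gets an explicit case, and only 0x05 is treated as valid. This is an illustration under assumptions; the function name and the boolean return are hypothetical, and the real switch in emitExtractEvexPrefix does more than report validity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports whether a leading-byte value in [0x00, 0x07]
// has instructions defined for it. In this sketch only 0x05 does.
inline bool isHandledLeadingByte(std::uint8_t leadingByte)
{
    // Mirrors the range that the "higher assert" in the review refers to.
    assert(leadingByte <= 0x07);

    switch (leadingByte)
    {
        case 0x05:
            return true;

        case 0x00:
        case 0x01:
        case 0x02:
        case 0x03:
        case 0x04:
        case 0x06:
        case 0x07:
            // No instructions use these values yet, so treat them as invalid.
            return false;

        default:
            // Not reachable while the assert above holds.
            return false;
    }
}
```

Listing the unused values explicitly makes it obvious which ones need revisiting when new instructions arrive.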

HARDWARE_INTRINSIC(AVX10v2_V512, ConvertToVectorUInt32WithTruncationSaturation, 64, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_vcvttps2udqs, INS_vcvttpd2udqs}, HW_Category_SimpleSIMD, HW_Flag_BaseTypeFromFirstArg|HW_Flag_EmbBroadcastCompatible|HW_Flag_EmbMaskingCompatible)
HARDWARE_INTRINSIC(AVX10v2_V512, ConvertToVectorUInt64WithTruncationSaturation, 64, 1, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_vcvttps2uqqs, INS_vcvttpd2uqqs}, HW_Category_SimpleSIMD, HW_Flag_BaseTypeFromFirstArg|HW_Flag_EmbBroadcastCompatible|HW_Flag_EmbMaskingCompatible)
HARDWARE_INTRINSIC(AVX10v2_V512, MinMax, 64, 3, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_vminmaxps, INS_vminmaxpd}, HW_Category_IMM, HW_Flag_BaseTypeFromFirstArg|HW_Flag_EmbBroadcastCompatible|HW_Flag_EmbMaskingCompatible)
HARDWARE_INTRINSIC(AVX10v2_V512, MultipleSumAbsoluteDifferences, 64, 3, {INS_vmpsadbw, INS_invalid, INS_invalid, INS_vmpsadbw, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_IMM, HW_Flag_FullRangeIMM|HW_Flag_EmbMaskingCompatible) // TBD where should we put the instruction typ_byte or typ_ushort?
Member

For the TBD, this has to match the tracked simdBaseType that the node has.

The node typically defaults to the base type of the SIMD return as that is generally unambiguous and matches the types used for the inputs. However, if there are conflicts there then we switch to the base type of the first or second argument, depending on which is "unique" and allows disambiguation (in the same way it would for overload resolution).

Contributor Author

My bad. I missed the TBD comment here. So what you are saying is the simdBaseType of the node needs to be matched here.

Contributor Author

I have pushed these changes in my latest commit
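
To illustrate why the slot has to match the node's simdBaseType, here is a minimal sketch of a ten-entry instruction row indexed by base type, using the MinMax row quoted above. The enums and the lookup function are hypothetical, and the slot order (byte, ubyte, short, ushort, int, uint, long, ulong, float, double) is an assumption about the table layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumed slot order of the ten-entry instruction row (hypothetical enum).
enum BaseType : std::size_t
{
    Byte, UByte, Short, UShort, Int, UInt, Long, ULong, Float, Double
};

constexpr std::size_t kBaseTypeCount = 10;

// Abbreviated, hypothetical instruction enum for the example.
enum Instruction
{
    INS_invalid,
    INS_vminmaxps,
    INS_vminmaxpd
};

using InstructionRow = std::array<Instruction, kBaseTypeCount>;

// The MinMax row: only the float and double slots carry an instruction.
constexpr InstructionRow kMinMaxRow = {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid,
                                       INS_invalid, INS_invalid, INS_invalid, INS_vminmaxps, INS_vminmaxpd};

// Picks the instruction for the base type tracked on the node.
inline Instruction lookupInstruction(const InstructionRow& row, BaseType simdBaseType)
{
    return row[simdBaseType];
}
```

If an instruction sits in a slot that the node's base type never selects, the lookup yields `INS_invalid`, which is why the TBD had to be resolved against the tracked simdBaseType.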

@@ -98,7 +98,7 @@ enum instruction : uint32_t
inline bool IsAvx512OrPriorInstruction(instruction ins)
{
#if defined(TARGET_XARCH)
- return (ins >= INS_FIRST_SSE_INSTRUCTION) && (ins <= INS_LAST_AVX512_INSTRUCTION);
+ return (ins >= INS_FIRST_SSE_INSTRUCTION) && (ins <= INS_LAST_AVX10v2_INSTRUCTION);
Member

If we're changing this, we may need to update the IsAvx512OrPriorInstruction name as otherwise the name and the check being done don't match up.

Alternatively, we should be introducing a new helper method that covers the newer range if that is applicable instead (such as if we have paths that care about AVX512 and prior vs AVX10v2 and prior that need to be disambiguated).

Looking at the existing usages, most look to be intending something more like IsSimdInstruction, as they're trying to cover everything that isn't a general-purpose instruction (so SIMD or K instructions).

However, some usages are closer to how emitter::IsAVXOnlyInstruction or emitter::IsSSEOrAVXInstruction are used, where they're trying to check for a general "range" (it's unclear if AVX10.1 and AVX10.2 should also be "applicable" to those range changes or not).

Contributor Author

hmm... I did rename it to IsAvx10OrPriorInstruction but it got changed back to IsAvx512OrPriorInstruction in our internal review. Upon checking, IsSimdInstruction makes more sense.

Contributor Author

I have pushed these changes in my latest commit
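
The naming concern can be made concrete with a small sketch that keeps the two range checks separate. The enum below is hypothetical and heavily abbreviated; only the three range-marker names and IsSimdInstruction / IsAvx512OrPriorInstruction come from the thread, and the layout of the real instruction enum may differ.

```cpp
#include <cassert>

// Hypothetical, abbreviated instruction enum. The real enum is much larger.
enum Instruction
{
    INS_mov, // general-purpose, before the SIMD range

    INS_addps,
    INS_FIRST_SSE_INSTRUCTION = INS_addps,

    INS_vpternlogd,
    INS_LAST_AVX512_INSTRUCTION = INS_vpternlogd,

    INS_vminmaxps,
    INS_LAST_AVX10v2_INSTRUCTION = INS_vminmaxps,

    INS_ret // general-purpose, after the SIMD range
};

// Covers every non-general-purpose instruction up to the newest ISA.
inline bool IsSimdInstruction(Instruction ins)
{
    return (ins >= INS_FIRST_SSE_INSTRUCTION) && (ins <= INS_LAST_AVX10v2_INSTRUCTION);
}

// Keeps the narrower meaning its name promises.
inline bool IsAvx512OrPriorInstruction(Instruction ins)
{
    return (ins >= INS_FIRST_SSE_INSTRUCTION) && (ins <= INS_LAST_AVX512_INSTRUCTION);
}
```

With both helpers available, call sites that mean "any SIMD instruction" and call sites that mean "AVX512 or earlier" no longer share one misleading name.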

@tannergooding
Member

The changes overall LGTM. Just a couple small cleanup asks.

It should generally be good for secondary review, CC. @dotnet/jit-contrib

@BruceForstall BruceForstall added avx10 Related to the AVX10 architecture and removed apx Related to the Intel Advanced Performance Extensions (APX) labels Jan 23, 2025
Member

@BruceForstall BruceForstall left a comment

Looks good to me except for one nit and one thing in lookupInstructionSet that looks like a bug.
