JIT: Emulate missing x86 shift instructions for xplat intrinsics #111108

Merged
8 commits merged into dotnet:main from xplat-intrinsics
Jan 10, 2025

Conversation

saucecontrol
Member

@saucecontrol saucecontrol commented Jan 6, 2025

This extends the byte shift emulation to handle arithmetic shift and adds emulation of qword arithmetic shift for platforms without AVX-512/AVX10.
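For readers unfamiliar with the technique, here is a minimal scalar sketch of one well-known way to build an arithmetic right shift out of a logical one, which is the kind of emulation needed when a native 64-bit arithmetic shift instruction is unavailable. It is an illustration only: the helper name `EmulatedSra64` is hypothetical, and the sequence the JIT actually emits for vector operands in this PR may differ.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the JIT): arithmetic right shift of one
// 64-bit lane using only a logical shift, an XOR, and a subtract.
// Precondition: count is in the range [0, 63].
inline int64_t EmulatedSra64(int64_t value, unsigned count)
{
    const uint64_t bits = static_cast<uint64_t>(value);

    // Logical shift: vacated high bits are filled with zeros.
    const uint64_t shifted = bits >> count;

    // Position of the original sign bit after the shift.
    const uint64_t signBit = (uint64_t{1} << 63) >> count;

    // XOR then subtract sign-extends from the shifted sign bit:
    //   sign set   -> bit is cleared, subtract borrows through the high bits
    //   sign clear -> bit is set, subtract clears it again
    return static_cast<int64_t>((shifted ^ signBit) - signBit);
}
```

For example, `EmulatedSra64(-8, 1)` yields `-4`, matching what a native arithmetic shift would produce. The same XOR/subtract identity applies per element in a vector, provided the subtract is done at the element width.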

SPMI diffs show the updated codegen.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jan 6, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution label (Indicates that the PR has been added by a community member) Jan 6, 2025
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@saucecontrol saucecontrol marked this pull request as ready for review January 7, 2025 03:30
@saucecontrol
Member Author

CC @tannergooding

Comment on lines +12362 to +12367
emitAttr attr3 = attr;
if (hasTupleTypeInfo(ins) && ((insTupleTypeInfo(ins) & INS_TT_MEM128) != 0))
{
    // Shift instructions take xmm for the 3rd operand regardless of instruction size.
    attr3 = EA_16BYTE;
}
Member

Not quite sure this is the "best" fix.

This feels like a case where we could utilize the existing tags for memory operand size for the register as well rather than hardcoding a one-off special case.

Not going to block on it, especially since this is only used for disassembly, but it's something that would be nice to have improved for this and other cases where the last operand is a different register size than the others.

Member Author

Can you elaborate on your preferred fix here? This already uses the tuple type rather than a hard-coded instruction list. I only mention shift explicitly in the comment because it's the only thing I'm aware of that uses a full xmm/m128 operand for what is ultimately an 8-bit scalar, and this should be the only encoding form where that xmm fix is required.

@tannergooding
Member

CC. @dotnet/jit-contrib for secondary review

@tannergooding tannergooding merged commit 574b967 into dotnet:main Jan 10, 2025
109 of 116 checks passed
@saucecontrol saucecontrol deleted the xplat-intrinsics branch January 10, 2025 01:39