JIT: Emulate missing x86 shift instructions for xplat intrinsics #111108
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
emitAttr attr3 = attr;
if (hasTupleTypeInfo(ins) && ((insTupleTypeInfo(ins) & INS_TT_MEM128) != 0))
{
    // Shift instructions take xmm for the 3rd operand regardless of instruction size.
    attr3 = EA_16BYTE;
}
Not quite sure this is the "best" fix.
This feels like a case where we could utilize the existing tags for memory operand size for the register as well rather than hardcoding a one-off special case.
Not going to block on it, especially since this is only used for disassembly, but it's something that would be nice to have improved for this and other cases where the last operand is a different register size than the others.
Can you elaborate on your preferred fix here? This is already using the tuple type rather than hard-coded instruction list -- I just mention shift in the comment explicitly because it's the only thing I'm aware of that uses full xmm/m128 for what is ultimately an 8-bit scalar, and this should be the only encoding form where that xmm fix is required.
CC. @dotnet/jit-contrib for secondary review
This extends the byte shift emulation to handle arithmetic shift and adds emulation of qword arithmetic shift for platforms without AVX-512/AVX10.
SPMI diffs show the updated codegen.
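For readers unfamiliar with this kind of emulation, here is a minimal scalar sketch of two bit identities commonly used when an arithmetic shift is missing. It is an illustration only and is not taken from the PR: the JIT emits SIMD instruction sequences, and the exact sequences it picks may differ. The helper names `sra64_emulated` and `sra8x2_emulated` are hypothetical. The first builds a 64-bit arithmetic right shift from a logical shift plus an xor/subtract sign fix-up; the second shifts two bytes packed in a 16-bit lane with one wider shift, masks off the bits that leak across the byte boundary, then applies the same sign fix-up per byte.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): 64-bit arithmetic right shift
// built from a logical shift and an xor/subtract sign fix-up.
inline int64_t sra64_emulated(int64_t value, unsigned count)
{
    // x86 vector arithmetic shifts fill with the sign bit for oversized
    // counts, which is the same result as shifting by 63.
    if (count > 63)
        count = 63;

    uint64_t logical = static_cast<uint64_t>(value) >> count;
    // Position of the original sign bit after the logical shift.
    uint64_t signBit = (UINT64_C(1) << 63) >> count;
    // If the shifted sign bit is set, xor clears it and the subtraction
    // borrows through the upper bits, filling them with ones.
    // If it is clear, xor sets it and the subtraction removes it again.
    return static_cast<int64_t>((logical ^ signBit) - signBit);
}

// Hypothetical helper (not from the PR): arithmetic right shift of two
// bytes packed in one 16-bit lane, using a single 16-bit logical shift.
inline uint16_t sra8x2_emulated(uint16_t lanes, unsigned count)
{
    if (count > 7)
        count = 7;

    // Bits each byte may keep after a logical shift by 'count'.
    uint8_t  keep = static_cast<uint8_t>(0xFFu >> count);
    uint16_t mask = static_cast<uint16_t>(keep * 0x0101u);
    // The wide shift leaks low bits of the high byte into the low byte;
    // the mask removes them.
    uint16_t logical = static_cast<uint16_t>((lanes >> count) & mask);

    // Per-byte sign fix-up, same identity as in the 64-bit case.
    uint8_t signBit = static_cast<uint8_t>(0x80u >> count);
    uint8_t lo = static_cast<uint8_t>(((logical & 0xFFu) ^ signBit) - signBit);
    uint8_t hi = static_cast<uint8_t>(((logical >> 8) ^ signBit) - signBit);
    return static_cast<uint16_t>((hi << 8) | lo);
}
```

In a vectorized form, the xor and subtract steps map to ordinary per-element integer operations, which is why this style of fix-up is attractive when only logical shifts are available at a given element size.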