Function Pointer Returned is NOT the Actual Method Pointer Invoked by Runtime #111224

Closed
toupswork opened this issue Jan 9, 2025 · 11 comments

@toupswork

toupswork commented Jan 9, 2025

Description

I have found that this happens only when I try to get the method pointer for generic static methods in the BCL. It does not occur if I create my own static class with a static generic method. Through disassembly, I've found that the address given to the CALL instruction is not the same as the address of the function pointer returned. It doesn't matter whether the generic argument is a reference type or a value type.

Reproduction Steps

Example: System.Tuple.Create<T>(T item)

using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Yes, it's index 0 for Create<T>(T)
MethodInfo mi = typeof(Tuple).GetMethods()[0].MakeGenericMethod(typeof(int));
// Ensure that the closed generic instantiation is jitted
RuntimeHelpers.PrepareMethod(mi.MethodHandle, [typeof(int).TypeHandle]);
IntPtr fp = mi.MethodHandle.GetFunctionPointer();
// This fp address is not the same address that the call below targets
_ = Tuple.Create(0); // different address

Expected behavior

The addresses should be the same, just as it is if I had created the same class in my local project.

Actual behavior

Addresses are different.

Regression?

No response

Known Workarounds

No response

Configuration

.NET 8
Windows 11
x64

Other information

No response

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 9, 2025
Contributor

Tagging subscribers to this area: @dotnet/area-system-reflection
See info in area-owners.md if you want to be subscribed.

@huoyaoyuan
Member

huoyaoyuan commented Jan 9, 2025

> when I try to get the method pointer for generic static methods in BCL

There is no such thing as "the" method pointer. A managed method can have several different function pointers, for tiered JIT, R2R, and so on. Tier-0 application code is probably calling the R2R version.

@huoyaoyuan
Member

See also the comments here:

//*******************************************************************************
//
// Returns a callable entry point for a function.
// Multiple entry points could be used for a single function.
// ie. this function is not idempotent
//
// We must ensure that GetMultiCallableAddrOfCode works
// correctly for all of the following cases:
// 1. shared generic method instantiations
// 2. unshared generic method instantiations
// 3. instance methods in shared generic classes
// 4. instance methods in unshared generic classes
// 5. static methods in shared generic classes.
// 6. static methods in unshared generic classes.
//
// For case 1 and 5 the methods are implemented using
// an instantiating stub (i.e. IsInstantiatingStub()
// should be true). These stubs pass on to
// shared-generic-code-which-requires-an-extra-type-context-parameter.
// So whenever we use LDFTN on these we need to give out
// the address of an instantiating stub.
//
// For cases 2, 3, 4 and 6 we can just use the standard technique for LdFtn:
// (for 2 we give out the address of the fake "slot" in InstantiatedMethodDescs)
// (for 3 it doesn't matter if the code is shared between instantiations
// because the instantiation context is picked up from the "this" parameter.)
PCODE MethodDesc::TryGetMultiCallableAddrOfCode(CORINFO_ACCESS_FLAGS accessFlags)

For shared generics, things are more complicated. The function pointer needs to carry the instantiation information "with it". In JITed code, however, the instantiation information is provided at the call site and accepted as an extra parameter by the shared generic function.

@toupswork
Author

@huoyaoyuan
Thank you so much for your prompt and courteous response.

  1. Is there anything I can do to avoid having a stub called for generic method resolution?
  2. If not, is there a way to get the address of the stub?
    a. And is the stub the same for all generics or specific to each one?
  3. When the code that executes in the stub eventually resolves the correct generic function pointer, will it resolve to the same address that MethodHandle.GetFunctionPointer() returns?

I was under the impression that PrepareMethod causes the JIT to provide the jitted address in the MethodDescriptor, thereby obviating the need to invoke the pre-jit stub, or is that a concept that applies only to non-generic methods? I know that behind the scenes there's a generic method dictionary, where one entry exists for reference types as __Canon and separate ones for value types, so maybe the stub is for checking that dictionary to resolve the correct pointer. But I don't believe that exists outside of the C++ implementation of the runtime.

Image

@huoyaoyuan
Member

> 1. Is there anything I can do to avoid having a stub called for generic method resolution?

The stub will just set the instantiation and jump to shared generic code. There won't be much overhead when you are invoking indirectly. It's conceptually equivalent to declaring a helper M_OfString() => M<string>().

> 2. If not, is there a way to get the address of the stub?

The invokable function pointer you get is the stub. You won't be able to get the real function pointer of the shared generic code, because it's not invokable with the standard calling convention.

> 2. a. And is the stub the same for all generics or specific to each one?

By definition, it's specific to each instantiation.

> 3. When the code that executes in the stub eventually resolves the correct generic function pointer, will it resolve to the same address that MethodHandle.GetFunctionPointer() returns?

MethodHandle.GetFunctionPointer() is the stub. It will jump to the implementation of shared generic, which can be acquired as explained above.

> I was under the impression that PrepareMethod causes the JIT to provide the jitted address in the MethodDescriptor, thereby obviating the need to invoke the pre-jit stub

That's now outdated with tiered JIT. A method can have multiple versions of code rather than a single pre-JIT stub. Only the fully optimized tier is treated as final.

> But I don't believe that exists outside of the C++ implementation of the runtime.

To explain the function pointer behavior: what you get as a function pointer matches the signature of the current instantiation, for example Method(string, int). However, the actual shared generic implementation has a signature that takes the instantiation, conceptually Method(TypeHandle, object, int). Thus, what you get is a stub dispatching the former to the latter.
The shared generic body is an implementation detail. For managed code directly invoking the method, the JIT "skips" the stub and directly fills in the parameters for the shared generic code.
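
The stub-versus-shared-body distinction described above can be modeled in plain C#. This is only an analogy under stated assumptions: `SharedBody`, `Stub_OfString`, and `Stub_OfUri` are hypothetical names invented for illustration, and a `Type` argument stands in for the runtime-internal instantiation handle. It is not how the runtime is implemented.

```csharp
using System;

static class SharedGenericModel
{
    // Stands in for the shared generic body: one piece of code serves every
    // reference-type instantiation, so the instantiation has to be passed in
    // as an extra (normally hidden) argument.
    static string SharedBody(Type instantiation, object item, int count)
        => $"{instantiation.Name}:{item}x{count}";

    // Stands in for the instantiating stub of Method<string>: it has the
    // signature a function pointer caller expects, supplies the missing
    // context argument, and forwards to the shared body.
    public static string Stub_OfString(string item, int count)
        => SharedBody(typeof(string), item, count);

    // A second instantiation gets its own stub but reuses the same body.
    public static string Stub_OfUri(Uri item, int count)
        => SharedBody(typeof(Uri), item, count);
}

static class Program
{
    static void Main()
    {
        // The delegate plays the role of the function pointer: it refers to
        // the stub, never to SharedBody directly.
        Func<string, int, string> fp = SharedGenericModel.Stub_OfString;
        Console.WriteLine(fp("a", 2)); // prints "String:ax2"
    }
}
```

In this model a caller holding only `fp` cannot reach `SharedBody` with the same signature, which mirrors why the shared body is not invokable through an ordinary function pointer.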

BTW, the graph you are referring to is also quite outdated. The layout of the method table today looks like this:

private:
    // Low WORD is component size for array and string types (HasComponentSize() returns true).
    // Used for flags otherwise.
    DWORD m_dwFlags;

    // Base size of instance of this class when allocated on the heap
    DWORD m_BaseSize;

    // See WFLAGS2_ENUM for values.
    DWORD m_dwFlags2;

    // <NICE> In the normal cases we shouldn't need a full word for each of these </NICE>
    WORD m_wNumVirtuals;
    WORD m_wNumInterfaces;

#ifdef _DEBUG
    LPCUTF8 debug_m_szClassName;
#endif //_DEBUG

    PTR_MethodTable m_pParentMethodTable;
    PTR_Module m_pModule;
    PTR_MethodTableAuxiliaryData m_pAuxiliaryData;

    // The value of lowest two bits describe what the union contains
    enum LowBits {
        UNION_EECLASS = 0,     // 0 - pointer to EEClass. This MethodTable is the canonical method table.
        UNION_METHODTABLE = 1, // 1 - pointer to canonical MethodTable.
    };
    static const TADDR UNION_MASK = 1;

    union {
        DPTR(EEClass) m_pEEClass;
        TADDR m_pCanonMT;
    };

    __forceinline static LowBits union_getLowBits(TADDR pCanonMT)
    {
        LIMITED_METHOD_DAC_CONTRACT;
        return LowBits(pCanonMT & UNION_MASK);
    }
    __forceinline static TADDR union_getPointer(TADDR pCanonMT)
    {
        LIMITED_METHOD_DAC_CONTRACT;
        return (pCanonMT & ~UNION_MASK);
    }

    // m_pPerInstInfo and m_pInterfaceMap have to be at fixed offsets because of performance sensitive
    // JITed code and JIT helpers. The space used by m_pPerInstInfo is used to represent the array
    // element type handle for array MethodTables.
    union
    {
        PerInstInfo_t m_pPerInstInfo;
        TADDR m_ElementTypeHnd;
    };

public:
    union
    {
        PTR_InterfaceInfo m_pInterfaceMap;
        TADDR m_encodedNullableUnboxData; // Used for Nullable<T> to represent the offset to the value field, and the size of the value field
    };

    // VTable slots go here
    // Optional Members go here
    //    See above for the list of optional members
    // Generic dictionary pointers go here
    // Interface map goes here
    // Generic instantiation+dictionary goes here

@toupswork
Author

Thank you so much for all of this! This has been very helpful. I really appreciate it, and I hope you have a great day.
Take care!

@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Jan 9, 2025
@toupswork
Author

toupswork commented Jan 14, 2025

For anyone reading this, I was able to get the address of the stub!!!
I took the function pointer and found that the stub was offset 0x4000 from the function pointer address!

@huoyaoyuan
Member

huoyaoyuan commented Jan 16, 2025

> the stub was offset 0x4000 from the function pointer address

This is likely very unreliable. They are allocated in different places by the code/stub allocator. You shouldn't make any assumptions about relative addresses between code fragments.

Also, on naming: the function pointer is the stub. You are probably finding the shared generic body.

A potentially workable approach is to read the code pointed to by the function pointer and extract the jump target address from the instructions. Again, this is an implementation detail and can change at any time.
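
As a rough sketch of "extract the jump target from the instructions", the decoder below handles two common x64 jump encodings over a byte buffer that is assumed to have been copied from the function pointer's address. `JumpDecoder`, `Jump`, and `TryDecode` are hypothetical names, and nothing here assumes the runtime actually emits either encoding for its stubs; that layout is an implementation detail and may differ by version and architecture.

```csharp
using System;
using System.Buffers.Binary;

static class JumpDecoder
{
    // Either a direct target, or (for a RIP-relative indirect jump) the
    // address of the 8-byte slot that holds the target.
    public readonly record struct Jump(bool IsIndirect, ulong Address);

    // Decodes the x64 jump at the start of `code`, assuming `code` was
    // copied from memory at `codeAddress`. Returns null for anything else.
    public static Jump? TryDecode(ReadOnlySpan<byte> code, ulong codeAddress)
    {
        // E9 rel32: jump to (address of next instruction + rel32).
        if (code.Length >= 5 && code[0] == 0xE9)
        {
            int rel = BinaryPrimitives.ReadInt32LittleEndian(code.Slice(1, 4));
            return new Jump(false, unchecked(codeAddress + 5 + (ulong)(long)rel));
        }

        // FF 25 disp32: jmp qword ptr [rip + disp32]. The real target is the
        // pointer stored at the computed slot address, so the caller still
        // has to read that slot.
        if (code.Length >= 6 && code[0] == 0xFF && code[1] == 0x25)
        {
            int disp = BinaryPrimitives.ReadInt32LittleEndian(code.Slice(2, 4));
            return new Jump(true, unchecked(codeAddress + 6 + (ulong)(long)disp));
        }

        return null;
    }
}

static class Program
{
    static void Main()
    {
        // jmp rel32 with rel = 0x10, located at 0x1000.
        byte[] direct = { 0xE9, 0x10, 0x00, 0x00, 0x00 };
        if (JumpDecoder.TryDecode(direct, 0x1000) is { } jump)
        {
            Console.WriteLine($"{jump.IsIndirect} 0x{jump.Address:X}"); // prints "False 0x1015"
        }
    }
}
```

Keeping the decoder on a byte span rather than on raw pointers makes it easy to unit test; reading the bytes out of live process memory is a separate, unsafe step that is deliberately left out.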

@toupswork
Author

toupswork commented Jan 18, 2025

> This is likely very unreliable. They are allocated in different places by code/stub allocator. You shouldn't make any assumption about relative addresses between code fragments.

Of course it is unreliable. It's undocumented and buried in assembly code. But I need it for writing unit tests that allow method redirection for static, generic methods. Microsoft has not maintained Microsoft Fakes, and it doesn't support the latest .NET. There are many things that are unmockable, so I need another way to test them. And when it changes, I just update my code. Problem solved.

> Also for the naming: the function pointer is the stub. You are probably finding the shared generic body.

I have read many things on this topic, including the Book of the Runtime. They are often referred to as precode stubs or jump stubs, which is why I use that term. Looking back on this thread, I think it was quite clear that I was referring to this generic precode stub when I talked about how the address was not the same as the address in the disassembled JIT code.
I asked you if there was a way I could access this and you said no. Well, I DID, and now my method redirector works. There are many method hijacker examples out there, but none of them are able to hijack a generic static method. The only alternative is IL weavers.

> A potential working way is to read the code pointed to by the function pointer, and extract the jump target address from the instructions. Again this is implementation detail and can change at any time.

Of course I tried that before posting this. I went to the address and disassembled the code. It did not jump to the same address as the one I got when I took the function pointer address, added 0x4000 to it, read the address stored at that location, and then read the address stored at that second location. This is a shared generic method that is initially resolved at runtime, so the two were not the same at first. After the generic method is compiled, they are the same.

You mentioned PrepareMethod can't be used because of tiered JIT. Well, I can easily get around that by disabling it with an environment variable. However, when I overwrite the function preamble with a JMP stub, the performance monitors that the JIT injects in order to circle back later and recompile the method are never executed, which prevents tiered JIT from overwriting it.

@teo-tsirpanis
Contributor

> But, I need it for writing unit tests to allow method redirection for static, generic methods.

If these methods are in your code, or your own code calls them, you can add an abstraction to support mocking. If some third-party code calls another third-party static method, mocking it is very likely going to be disproportionately hard to implement, compared to the returned value.

What do these methods that you want to mock do?
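
For the first case mentioned above (your own code calls the static method), the abstraction could look like the sketch below. All type names here (`ITupleFactory`, `BclTupleFactory`, `RecordingTupleFactory`, `Consumer`) are hypothetical and chosen only for illustration, with `Tuple.Create` reused from the repro as the static generic dependency.

```csharp
using System;

// Hypothetical seam: the code under test depends on this interface instead
// of calling the static generic method directly.
public interface ITupleFactory
{
    Tuple<T> Create<T>(T item);
}

// Production implementation: forwards to the real static generic method.
public sealed class BclTupleFactory : ITupleFactory
{
    public Tuple<T> Create<T>(T item) => Tuple.Create(item);
}

// Test double: counts calls so a test can verify the interaction.
public sealed class RecordingTupleFactory : ITupleFactory
{
    public int Calls { get; private set; }

    public Tuple<T> Create<T>(T item)
    {
        Calls++;
        return new Tuple<T>(item);
    }
}

// Code under test receives the factory through its constructor.
public sealed class Consumer
{
    private readonly ITupleFactory _factory;

    public Consumer(ITupleFactory factory) => _factory = factory;

    public int Wrap(int value) => _factory.Create(value).Item1;
}

static class Program
{
    static void Main()
    {
        var fake = new RecordingTupleFactory();
        var consumer = new Consumer(fake);
        Console.WriteLine(consumer.Wrap(7)); // prints "7"
        Console.WriteLine(fake.Calls);       // prints "1"
    }
}
```

This only helps where the call site can be changed to go through the interface; it does not intercept calls made by third-party code.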

@toupswork
Author

toupswork commented Jan 20, 2025

Hi, Theodore.

I have had this discussion many times about how to get around things that can't be mocked. Yes there are ways to overcome them, such as wrapping them. But that doesn't help with code coverage.

The bottom line is that:

  • I have dependencies that are static and generic.
  • I need to test the code that calls them
  • Therefore, I need to have the call to them redirected to a mocked method
  • It would be nice if C#'s interceptor feature could be used for this but it can't and it wasn't designed for it
  • It would be nice if Microsoft Fakes worked with the latest version of .NET, but it doesn't
  • It would be nice if Microsoft provided an API or something else to help with this, but there isn't, and testing is seen as more of a tooling problem than a language or runtime problem

Therefore, I am left with no other choice but to hack the runtime to get around this, by injecting a long jump stub into the preamble of the method. I shouldn't have to go to those lengths, but it is what it is.

Generics are more difficult because the JIT puts in a precode stub before the generic method is compiled to native code. And I had no way to know where it is by using reflection. GetFunctionPointer() returns a different address than the stub I am referring to. And PrepareMethod() on RuntimeHelpers does not get rid of it like it does for all other methods.

But now I know where it lives, and I am good-to-go. I was trying to help others who are facing the same issue, and I know there are others because I have found many redirector authors, but none of them could get past this issue with generic methods.

So anyone who is trying to do the same thing and stumbled upon this, now you know that you get the function pointer, add 0x4000 to it, and that gives you a pointer to pointer where you can inject your stub.
