Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fiber safety to __crystal_once & class_[getter|property]?(&) macros #15340

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented Jan 13, 2025

Fixes a couple issues:

  1. __crystal_once isn't fiber safe (concurrency issues); the initializer can be invoked multiple times from multiple fibers (despite using Mutex).

    This issue is fixed by always using the Mutex not only when MT is enabled.

  2. class_getter, class_getter?, class_property and class_property? are neither thread nor fiber safe (parallelism & concurrency issues).

    This issue is fixed by reusing Crystal.once.

NOTE: calls to the aforementioned macros had to be dropped in Fiber and Thread because Mutex depends on them and we need the later to implement said macros (chicken/egg => infinite recursion => stack overflows).

This is a breaking change because the block is now captured, and we can't return from it anymore. This is outlined by the commit that fixes SocketSpecHelper.supports_ipv6? that didn't work as expected anyway (the @@supports_ipv6 class var was never set to true). The block is no longer captured. We might want to introduce a compile time flag to enable the new behavior as it could help with some LLVM inlining behavior (inline the check).

builds on top of #15333
closes #14905

@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented Jan 13, 2025

By duplicating the Crystal.once logic we might be able to avoid the breaking change (we let Crystal inline the blocks instead of passing a proc pointer, which avoids to capture the block). We won't be able to take advantage of the AlwaysInline + NoInline annotations optimization... but that may be acceptable.

@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented Jan 13, 2025

Wonderful. CI decided to blow up 😮‍💨

It might have something to do with musl-libc. It was an issue with musl (another thread classvar). Windows and GNU seem fine. But mingw and openbsd were broken (yet another thread class var). They're now explicitly initialized by Thread.init before we need them.

MacOS is broken for some other reason because CI still uses LLVM 11 😨 and there's an issue with LLVM 14 and below apparently (which may explain why it breaks on some older crystal releases). I can't reproduce when cross compiling to x86_64-apple-darwin on a Linux host and LLVM 18.1.8. It correctly compiles a .o file. This should be fixed: the compiler was missing a pointer cast.

The interpreter is broken because of flag.value = :processing. It's confused by the assignment through a pointer and can't translate the symbol to the enum value:

Error: BUG: missing upcast_distinct from Symbol to Crystal::OnceState (Crystal::SymbolType to Crystal::EnumType)

That was easy to fix, but now it segfaults on the IO.pipe { } interpreter spec for no immediate reason: I don't see where it would use a global and debugging the interpreter is an impossible task. It might be something to do with const or classvars used by the interpreter itself 🤔 The interpreter actually never calls __crystal_once_init and doesn't use crystal once at all.

@ysbaddaden ysbaddaden force-pushed the fix/add-fiber-safety-to-crystal-once branch 2 times, most recently from f4347db to d4fb901 Compare January 14, 2025 18:29
@ysbaddaden ysbaddaden dismissed a stale review January 21, 2025 08:28

Unknown user.
Inaccessible profile.

@ysbaddaden ysbaddaden force-pushed the fix/add-fiber-safety-to-crystal-once branch from d4fb901 to 1925c7f Compare January 21, 2025 08:44
@ysbaddaden ysbaddaden marked this pull request as ready for review January 21, 2025 08:44
We need a Mutex to protect against recursion and to make sure the lazy
initializers only run once, but Mutex depends on the current fiber, and
indirectly the current thread, which themselves may not have been
initialized yet and will lead to an infinite recursion once we protect
the class getter and property helpers.
Reuses the logic for `__crystal_once`.
Protects against recursion and adds thread (parallelism) and fiber
(concurrency) safety to class var initialization.
The initializer block is now captured, and we can't return from a
captured block. This outlines that the previous commit is a BREAKING
CHANGE!

The `class_getter?` version used skip over the intent to cache the
result in `@@supports_ipv6` (it returned from the generated function,
not from the block), so whenever IPv6 was supported every test was
creating yet-another TCPServer (oops).
1. it fails to translate symbol to the enum value
2. it doesn't call `__crystal_once_init`
@ysbaddaden ysbaddaden force-pushed the fix/add-fiber-safety-to-crystal-once branch from 1925c7f to cabbc9c Compare January 21, 2025 09:26
# :nodoc:
# Using @[NoInline] so LLVM optimizes for the hot path (var already
# initialized).
@[NoInline]
def self.once(flag : OnceState*, initializer : Void*) : Nil
def self.once(flag : OnceState*, initializer : Void*, closure_data : Void*) : Nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why pass the function pointer and closure data as separate values instead of a Proc instance?

\{% end %}

\{% name = names[0] %}

\{% if name.is_a?(TypeDeclaration) %}
{{var_prefix}}\{{name.var.id}} : \{{name.type}}?
{% if macro_prefix == "class_" %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: With this specialization I'm wondering if it even makes sense anymore to generate these macros from a template. Perhaps we should just evaluate the first macro pass and have much simpler code.
This would be a task for a follow-up though.

in .initialized?
return
in .uninitialized?
flag.value = OnceState::Processing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: Shouldn't we unlock the mutex at this point? The initialization code might be expensive and delay execution of other initializers. I figure it ought to be possible to execute different initializers concurrently.

By setting the flag value to processing we have reserved exclusive write access for the current fiber, so we might not even need to reacquire it for setting the state to initialized (assuming the i8 assignment is atomic).

Copy link
Contributor Author

@ysbaddaden ysbaddaden Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the mutex reentrancy for thread A to notice the recursion but also to block thread B from accessing the value until thread A has properly initialized it.

Otherwise thread B could lock the mutex, notice that the flag is :processing and fail with a recursion issue (oops) instead of waiting.

To solve this, we'd need a checked mutex (not a reentrant one) per constant and class variable, which might not be a bad idea 🤔

@straight-shoota
Copy link
Member

straight-shoota commented Jan 21, 2025

I'm wondering if we should split this change into a couple of individual PRs, so we retain self-contained and descriptive commits in master.

ab8b994 and 86ff3c4 would be great as independent pre-refactors.

I presume dc635fb could also easily be extracted as a follow-up.
Not sure whether it's feasible to split up the rest of the commits which implement the core functionality. Especially the fact that cabbc9c reverts some of it makes it a bit hard to gauge interim states.

@ysbaddaden
Copy link
Contributor Author

Back to draft: as per the suggestions + latest discussion in the issue (capturing the block might be a good breaking change), I'll split some commits into their own PR, and rework things further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Review
Development

Successfully merging this pull request may close these issues.

Ensuring class_getter runs exactly once under concurrent access
2 participants