status of clang config files in wrappers #469
Comments
Thanks for checking in! See #253 for more discussion around the use of config files. The issues I ran into, which I mentioned in the revert, e0964ce, have been fixed.
So the config files themselves work fine overall, but there seems to be a bit of occurrence that some projects build with Note that while the config files are neat for many things, the If there are no more surprises, the next point release of llvm-mingw will use the config files, and once that's out, I guess we'll get more widespread testing of it, and see if people stumble on other issues with it. |
One apparent side effect of this change is a reduction in the set of target triples that will successfully compile. For example, all the following In the 20241119 release, of that list, only I noticed this because llvm-mingw stopped working with vcpkg. In their mingw toolchain script they define the compiler target like so: Would it be more correct for them to use |
Thanks for the report! Yes, I think that probably would be more correct to do, in general. I think most distributions of mingw-w64 based toolchains use the Alternatively, if there are strong reasons not to do it, we could add more copies of the config files, to pick up other forms of the triples, but I would say that overall, mingw-w64 toolchains do tend to use |
It would be nice to have *-pc-windows-gnu config files, that triple does not cause lld target mismatch warnings when cross-language LTO with rust gnullvm. |
@Andarwinux can you provide more details about the Rust issue? |
Use -Clinker-plugin-lto=yes -Clto to build a rust project such as libdovi as a static library, then use LTO to build a C/C++ project such as svtav1-psy and link it with libdovi
rustc apparently uses pc-windows-gnu as the internal triple for the gnullvm target, whereas LLVM normalizes w64-mingw32 to w64-windows-gnu. |
So to recap here - this issue isn't something that appeared when llvm-mingw switched to using config files, but a preexisting issue that you hope to fix with the config files? Just adding |
For wrapper-based llvm-mingw, I just append -target x86_64-pc-windows-gnu to the wrapper, while config-based llvm-mingw overrides -target via global CFLAGS, but can't build anything at all without a config for pc-windows-gnu. Of course, I can rename the existing w64 config, but it would be more convenient if llvm-mingw also included a pc-windows-gnu config. Overall, I prefer the old wrapper way; at least on Linux it's not as slow as on Windows, and $CCACHE is very convenient.
llvm/llvm-project#117573 |
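For anyone following along, the wrapper-style workaround described above can be sketched roughly like this. It is only an illustration: the function name `build_clang_argv` and the plain `clang` executable name are hypothetical, and the real llvm-mingw wrappers do considerably more than this. The sketch only shows the override being placed after the user's arguments, which is what "append" suggests; how the driver resolves repeated target flags is not modeled.

```python
def build_clang_argv(user_args, target="x86_64-pc-windows-gnu", clang="clang"):
    """Return the argv a hypothetical wrapper would hand to clang.

    The -target override goes after the user's own arguments, mirroring the
    "append -target ... to the wrapper" approach from the comment above.
    """
    return [clang, *user_args, "-target", target]


if __name__ == "__main__":
    # Show the command line the sketch would produce for a simple compile.
    print(build_clang_argv(["-O2", "-c", "foo.c"]))
```

With a config-file based setup there is no such place to hook in per-user arguments, which is why the same override ends up in global CFLAGS instead.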
Can we somehow normalize the triples to the same output string? |
Ah, right, I see. Yes, with the config file based setup, your use case does become more complicated indeed.
It's not so much about performance (I doubt the wrappers cause that much extra overhead anyway), as it is about making various tool invocation cases work better (where e.g. clangd can figure out that the
Oh, neat, thanks for the pointer! (We'd still need to munge the |
I've heard that creating new processes is more expensive on Windows than on Linux. If the wrapper is fast enough, then maybe it would be possible to build llvm as a busybox-style single executable with LLVM_TOOL_LLVM_DRIVER_BUILD without requiring users to enable symlink support, allowing libLLVM to be linked statically to speed up compilation significantly without inflating the total size, and perhaps even reducing it.
It occurs to me that I could add -Xclang -triple -Xclang x86_64-pc-windows-gnu to the w64 config to override internal triple by default without affecting driver behavior, is this a clean approach? Perhaps llvm-mingw could include this by default? |
Are you referring to the performance penalty of the dynamically linked libLLVM/libclang-cpp (LLVM_LINK_LLVM_DYLIB)? I think the cost of that generally is overstated; I recently measured it to be 0.7% of total compilation time on Linux, for the test case of compiling Clang. Perhaps it is more costly on Windows - I haven't measured that - but I doubt it's very significant (more than a couple of percent, tops) anyway compared to the actual work of doing compilation. And yes, process creation generally is more expensive on Windows, so the wrapper executable setup does cost us a little bit there.
I don't think I'd include that by default, that sounds quite odd and specific for your case. In the current Do note that mixing different variants of triples can be prone to other breakage as well. We currently don't build any of the runtimes with the |
The difference is small enough for most small source files, but is noticeable for some huge sources, such as sqlite3.c and Qt6 with UNITY_BUILD enabled. On my machine, clang statically linked to libLLVM takes 14% less time to compile sqlite3.c with -O3 -march=znver5. You can experiment with this using Fuchsia Clang (LLVM_TOOL_LLVM_DRIVER_BUILD static without PGO)
-Xclang is passed directly to cc1, so it doesn't change the behavior of the driver searching for config and resource-dir |
14% for that sounds like a lot - that's not what I'm seeing, so I would think there's another factor playing in here too. I just (re)tested this; on Ubuntu 24.04 x86_64, a plain 1 stage build of Clang using the host compiler (GCC 13) and linker with I guess it's possible that there's a bigger difference if the build is more tuned (PGO, or the linker attempting to sort things in a clever way, etc).
Yes, but that's not what I asked - I wanted to explore what effects it has if that existing |
Building LLVM with GCC seems strange, could you try Fuchsia Clang?
Yes, LTO+PGO+BOLT for static linking are so powerful that they can actually reduce compilation time by 36%.
After some experimentation, I realized that clang just ignores -target or --target in config and infers it from argv[0]. |
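As a rough mental model of "infers it from argv[0]", the target prefix can be thought of as whatever precedes the tool name in the executable's file name. The sketch below is a simplified illustration only, not Clang's actual logic; the `split_argv0` helper, the `KNOWN_TOOLS` list and the example path are all made up for this comment.

```python
import os

# Tool names this sketch recognises; the list is an assumption for illustration.
KNOWN_TOOLS = ("clang++", "clang", "g++", "gcc", "c++", "cc")


def split_argv0(argv0):
    """Split e.g. 'x86_64-w64-mingw32-clang' into (target_prefix, tool).

    Returns (None, tool) when the name carries no target prefix.
    """
    name = os.path.basename(argv0)
    if name.lower().endswith(".exe"):
        name = name[:-4]
    for tool in KNOWN_TOOLS:
        if name == tool:
            return None, tool
        if name.endswith("-" + tool):
            # Everything before the final "-<tool>" is treated as the prefix.
            return name[: -len(tool) - 1], tool
    return None, name


if __name__ == "__main__":
    print(split_argv0("/opt/llvm-mingw/bin/x86_64-w64-mingw32-clang++"))
    print(split_argv0("clang"))
```

Under this model, a target given only inside a config file never comes into play if the name-derived prefix has already decided which target (and hence which config) is used.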
Not sure what's strange with that? That's a fairly reasonable default stage1 on Linux using whatever the host toolchain is.
I can try building with Clang and ThinLTO - it is plausible that there is a notable difference with ThinLTO.
One of the main issues I have with anything profile/instrumentation based, is that it's problematic to apply when Clang is cross compiled. (E.g. all my Windows toolchains are cross built from Linux, and built for architectures which aren't even runnable on github actions yet, like Windows/aarch64. I also build for Linux/aarch64 this way on Linux/x86_64.)
Hmm, that's odd. In an earlier stage of the cfg file support (see e2e9216), I explicitly selected a config file with |
I wonder if it could be reasonable to make LLVM skip that warning for LTO, if the triples only differ in the vendor field (either in general, or for specific OSes where there are known multiple vendor fields used in the wild). In order for LTO to work in this combination, isn't there still a requirement that both Rust and llvm-mingw use pretty much the same version of LLVM? (IIRC Rust uses a patched LLVM - hopefully those patches don't affect IR and LTO interop. And based on https://www.npopov.com/2025/01/05/This-year-in-LLVM-2024.html#rust, the number of patches these days is down to only one.) |
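To illustrate what "differ only in the vendor field" would mean for the two triples in this thread, here is a small sketch. It is not LLVM's implementation: `normalize_triple` only handles the one `mingw32` shorthand mentioned above, and both function names are hypothetical.

```python
def normalize_triple(triple):
    """Very rough normalization: only expands the 'mingw32' shorthand."""
    parts = triple.split("-")
    if len(parts) == 3 and parts[2] == "mingw32":
        # arch-vendor-mingw32 -> arch-vendor-windows-gnu
        parts = [parts[0], parts[1], "windows", "gnu"]
    return "-".join(parts)


def differ_only_in_vendor(a, b):
    """True if the normalized 4-part triples match except for the vendor."""
    pa = normalize_triple(a).split("-")
    pb = normalize_triple(b).split("-")
    if len(pa) != 4 or len(pb) != 4:
        return False
    # Index 0 is the arch, 1 the vendor, 2 and 3 the OS and environment.
    return pa[0] == pb[0] and pa[2:] == pb[2:] and pa[1] != pb[1]


if __name__ == "__main__":
    print(normalize_triple("x86_64-w64-mingw32"))
    print(differ_only_in_vendor("x86_64-w64-mingw32", "x86_64-pc-windows-gnu"))
```

A check along these lines would treat the llvm-mingw and rustc triples as compatible for the purposes of the warning, while still flagging a mismatch in architecture, OS or environment.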
See llvm/llvm-project#122801 for an implementation of this - it should be ready to land after rerunning the CI for it. |
This is off topic for this discussion here, but just for reference - I tried to dig into the actual performance for your testcase, compiling I set up the whole build matrix for doing this on github actions, so that it should be reproducible if someone wants to tweak the test build setup: mstorsjo/llvm-project@gha-clang-perf I'll summarize it in a table (picking the minimum execution time for each benchmarked case):
Overall, the slowdown due to using dylibs seems to be <1%, and in two of the build cases it even seems to be marginally faster. Nowhere near the mentioned 14% in any case. |
I modified this workflow to add a fuchsia-clang benchmark, and got similar results to "Clang hosted, LTO, nodylib". But my local machine is indeed very close to 14% (fuchsia clang 14.094s - llvm-mingw nightly clang 16.296s). It seems that improvements mainly come from LTO and will be affected by hardware. I also modified the workflow to match my local build (building with fuchsia-clang, enabling LLVM_TOOL_LLVM_DRIVER_BUILD and disabling unnecessary components) and now the LTO build takes only 4 minutes longer. https://github.com/Andarwinux/llvm-project/actions/runs/12792083533 |
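For reference, the "very close to 14%" figure follows from the two local timings quoted above if the reduction is taken relative to the slower run. A minimal sketch of that arithmetic (the function name `relative_reduction` is just for illustration):

```python
def relative_reduction(baseline_s, improved_s):
    """Fraction of the baseline time saved by the faster run."""
    return (baseline_s - improved_s) / baseline_s


if __name__ == "__main__":
    nightly = 16.296  # llvm-mingw nightly clang, seconds (from the comment above)
    fuchsia = 14.094  # fuchsia clang, seconds (from the comment above)
    print(f"{relative_reduction(nightly, fuchsia):.1%}")  # prints 13.5%
```

Note that this compares two differently built toolchains, so it measures the combined effect of their build configurations rather than the dylib setting alone.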
Right, that explains the confusion. Yes, LTO and PGO give very large, undisputable speedups, while the dylib configuration slowdown is in the range of <1% (if it even is a slowdown at all). |
I was just doing my usual post-llvm-update update of my cross-compiler wrappers (msys2/MINGW-packages#8762) (which are basically ripped off from yours), which includes looking to see if you made any relevant changes to the wrappers here. I saw a somewhat confusing commit history, where you switched to using cfg files, tweaked them, then reverted all of that due to several issues, and then seemingly reapplied the commits without change. I decided for this go-around to use the version of the wrappers at the revert (which was just a comment update different from what I had before), but I wanted to check in to see how the issues referenced in the revert were addressed here before I consider also switching to cfg files.