Tracking parallel testing outstanding issues #18199

majocha · 2025-01-03T13:12:51Z

Parallel testing is now merged and gets a lot more runtime in the CI. This means some test cases that are not thread safe and were missed before may fail occasionally.

I think it would be good to have a central place to keep track of tests that we suspect are flaky because of concurrent execution. Other annoyances like testhost crashes, both locally and in the Azure pipeline, or any weird behavior are also worth reporting.

I noticed it is often more likely for a test to fail in CI and seem to pass locally. To reproduce locally I just select the whole module containing the test case and do "Run until failure" in VS Test Explorer.

Some issues noticed so far:

net9.0 CLR crashing with Internal CLR error. (0x80131506). This is external and reported.
ProjectAnalysisTests.Test Project25 whole project errors and other test cases using Project25 are flaky. (Because of type provider?)
Test project1 and make sure TcImports gets cleaned up sometimes waits forever for GC.
tests executing ildasm sometimes crash on mac

The text was updated successfully, but these errors were encountered:

T-Gro · 2025-01-06T10:19:06Z

It is not an issue in terms of a crash, but rather an issue in (??) unrealized potential (?):
https://dev.azure.com/dnceng-public/public/_build/results?buildId=906679&view=logs&j=09cc464c-89cd-5758-006d-ba9e3958cfa9&t=098678ff-b2f3-5c88-44d3-4c7be2f47e75&l=17907
FSharp.Suite.Tests on DESKTOP, the only significant factor that determines the overall time for CI jobs to finish, takes about 40 minutes of testing.

Lot of the tests inside are workflows with multiple IO reads/writes as well as .exe invocations - it is common for the tests inside to take seconds, exceptionally even minutes for a single test.

The number of processors is defined as 4 here: https://dev.azure.com/dnceng-public/public/_build/results?buildId=906679&view=logs&j=09cc464c-89cd-5758-006d-ba9e3958cfa9&t=098678ff-b2f3-5c88-44d3-4c7be2f47e75&l=159

Is it fully saturated and we should rather go multi-machine (which is doable, I did some experiments earlier) for this suite?
Or is there something specific about the tests preventing the full benefits to appear?

majocha · 2025-01-06T12:08:21Z

Is it fully saturated and we should rather go multi-machine (which is doable, I did some experiments earlier) for this suite?

Yes, going multi-agent in CI should be very beneficial anyway.

Or is there something specific about the tests preventing the full benefits to appear?

I believe there is something that throttles the net472 target in certain tests. When running the tests locally at some points I see very little CPU utilization for the testDesktop. OTOH testCoreClr does saturate the CPU 100% almost all the time.
I haven't been able to identify what the bottleneck is. I suspected the appdomains that are only used in desktop testing but I'm not sure it's that.

A lot of test cases that do little but compile code run dramatically slower in net472.
For example:

Basic Grammar Element Constants module

On my PC completes net9.0 in around 4 sec, while the net472 takes 8 seconds.

FSharp.Compiler.UnitTests.CodeGen.EmittedIL namespace

net9.0 takes 10s, net472 25s.

T-Gro · 2025-01-06T13:03:59Z

Might be also file-locking of files that are shared across test executions.

But I do not think it makes sense to investigate at individual test level - if individual tests are to be investigated & improved, they might as well be migrated to ComponentTests which would bring the benefit of running them on Linux,macOS and coreCLR on Windows, instead of just net472.

(so far, only the easier ones have been migrated in various tranches)

majocha · 2025-01-06T13:51:23Z

Might be also file-locking of files that are shared across test executions.

But I do not think it makes sense to investigate at individual test level - if individual tests are to be investigated & improved, they might as well be migrated to ComponentTests which would bring the benefit of running them on Linux,macOS and coreCLR on Windows, instead of just net472.

It seems the underlying issue affects also ComponentTests, 42 minutes:
https://dev.azure.com/dnceng-public/public/_build/results?buildId=906679&view=logs&j=09cc464c-89cd-5758-006d-ba9e3958cfa9&t=098678ff-b2f3-5c88-44d3-4c7be2f47e75&l=8988

vs 13 minutes CoreClr:
https://dev.azure.com/dnceng-public/public/_build/results?buildId=906679&view=logs&j=e2e0a842-382a-5483-7008-fc809f5eff8d&t=6a49776c-db36-5de0-b127-a6a0074bf26a&l=15810

majocha · 2025-01-07T14:42:48Z

Might be also file-locking of files that are shared across test executions.

Hmm... it seems the issue is there even when comparing a single test case. I have an idea to just add --times to compile options in tests and see what it shows.

majocha · 2025-01-07T15:06:15Z

Ok, so I just added yield "--times" here:

fsharp/tests/FSharp.Test.Utilities/CompilerAssert.fs

Line 435 in 73b8cbb

yield! options

A rather typical test case that just compiles a snippet of code (Warn If Discarded In List -- 01 - version46) measures like this:
net472:

Phase name	Elapsed	Duration	WS(MB)	GC0	GC1	GC2	Handles	Threads
Import mscorlib+FSharp.Core	1.1545	1.0804	144	16	8	4	907	30
Parse inputs	1.7158	0.5549	149	0	0	0	910	30
Import non-system references	1.8160	0.0977	172	5	1	0	910	30
Typecheck	3.0397	1.2216	182	0	0	0	910	30
Typechecked	3.0489	0.0070	182	0	0	0	910	30
Write Interface File	3.0511	0.0000	182	0	0	0	910	30
Write XML doc signatures	3.0723	0.0000	182	0	0	0	910	30
Write XML docs	3.0793	0.0047	182	0	0	0	910	30
Encode Interface Data	3.2464	0.1649	195	3	1	0	910	30
Optimizations	3.6962	0.4476	199	0	0	0	925	33
Ending Optimizations	3.6985	0.0000	199	0	0	0	925	33
Encoding OptData	3.7415	0.0409	200	0	0	0	925	33
TAST -> IL	4.2212	0.4775	203	1	1	0	925	33
>Write Started	4.3308	0.0554	204	0	0	0	926	33
>Module Generation Preparation	4.3347	0.0015	204	0	0	0	926	33
>Module Generation Pass 1	4.3546	0.0175	204	0	0	0	926	33
>Module Generation Pass 2	4.4658	0.1090	204	0	0	0	926	33
>Module Generation Pass 3	4.4785	0.0103	204	0	0	0	926	33
>Module Generation Pass 4	4.4815	0.0008	204	0	0	0	926	33
>Finalize Module Generation Results	4.4843	0.0004	204	0	0	0	926	33
>Generated Tables and Code	4.4897	0.0029	204	0	0	0	926	33
>Layout Header of Tables	4.4919	0.0000	204	0	0	0	926	33
>Build String/Blob Address Tables	4.4951	0.0010	204	0	0	0	926	33
>Sort Tables	4.4972	0.0000	204	0	0	0	926	33
>Write Header of tablebuf	4.5010	0.0017	204	0	0	0	926	33
>Write Tables to tablebuf	4.5034	0.0000	204	0	0	0	926	33
>Layout Metadata	4.5071	0.0011	204	0	0	0	926	33
>Write Metadata Header	4.5094	0.0000	204	0	0	0	926	33
>Write Metadata Tables	4.5119	0.0003	204	0	0	0	926	33
>Write Metadata Strings	4.5142	0.0003	204	0	0	0	926	33
>Write Metadata User Strings	4.5164	0.0000	204	0	0	0	926	33
>Write Blob Stream	4.5203	0.0019	204	0	0	0	926	33
>Fixup Metadata	4.5230	0.0003	204	0	0	0	926	33
>Generated IL and metadata	4.5463	0.0212	204	0	0	0	928	33
>Layout image	4.5535	0.0050	204	0	0	0	928	33
>Writing Image	4.5598	0.0039	204	0	0	0	927	33
>Signing Image	4.5621	0.0000	204	0	0	0	927	33
Write .NET Binary	4.5643	0.3407	204	0	0	0	927	33

net9.0:

Phase name	Elapsed	Duration	WS(MB)	GC0	GC1	GC2	Handles	Threads
Import mscorlib+FSharp.Core	0.7380	0.6945	176	6	3	2	447	39
Parse inputs	0.9850	0.2363	182	0	0	0	450	39
Import non-system references	1.0249	0.0372	182	0	0	0	450	39
Typecheck	1.7374	0.7093	191	0	0	0	450	39
Typechecked	1.7441	0.0036	191	0	0	0	450	39
Write Interface File	1.7469	0.0001	191	0	0	0	450	39
Write XML doc signatures	1.7494	0.0001	191	0	0	0	450	39
Write XML docs	1.7538	0.0020	191	0	0	0	450	39
Encode Interface Data	1.8847	0.1280	210	0	0	0	450	39
Optimizations	2.0891	0.2019	214	0	0	0	468	45
Ending Optimizations	2.0916	0.0000	214	0	0	0	468	45
Encoding OptData	2.1546	0.0608	212	1	0	0	468	45
TAST -> IL	2.3913	0.2340	215	0	0	0	468	45
>Write Started	2.4299	0.0220	216	0	0	0	470	45
>Module Generation Preparation	2.4341	0.0011	216	0	0	0	470	45
>Module Generation Pass 1	2.4471	0.0102	216	0	0	0	470	45
>Module Generation Pass 2	2.5073	0.0576	216	0	0	0	470	45
>Module Generation Pass 3	2.5166	0.0067	216	0	0	0	470	45
>Module Generation Pass 4	2.5209	0.0017	216	0	0	0	470	45
>Finalize Module Generation Results	2.5241	0.0003	216	0	0	0	470	45
>Generated Tables and Code	2.5295	0.0028	216	0	0	0	470	45
>Layout Header of Tables	2.5321	0.0000	216	0	0	0	470	45
>Build String/Blob Address Tables	2.5352	0.0006	216	0	0	0	470	45
>Sort Tables	2.5383	0.0003	216	0	0	0	470	45
>Write Header of tablebuf	2.5414	0.0007	216	0	0	0	470	45
>Write Tables to tablebuf	2.5437	0.0000	216	0	0	0	470	45
>Layout Metadata	2.5460	0.0000	216	0	0	0	470	45
>Write Metadata Header	2.5484	0.0000	216	0	0	0	470	45
>Write Metadata Tables	2.5508	0.0000	216	0	0	0	470	45
>Write Metadata Strings	2.5532	0.0003	216	0	0	0	470	45
>Write Metadata User Strings	2.5554	0.0000	216	0	0	0	470	45
>Write Blob Stream	2.5579	0.0003	216	0	0	0	470	45
>Fixup Metadata	2.5600	0.0000	216	0	0	0	470	45
>Generated IL and metadata	2.5674	0.0052	216	0	0	0	470	45
>Layout image	2.5786	0.0089	217	0	0	0	470	45
>Writing Image	2.5827	0.0017	217	0	0	0	469	45
>Signing Image	2.5851	0.0000	217	0	0	0	469	45
Write .NET Binary	2.5875	0.1937	217	0	0	0	469	45

majocha added the Needs-Triage label Jan 3, 2025

github-project-automation bot added this to F# Compiler and Tooling Jan 3, 2025

github-project-automation bot moved this to New in F# Compiler and Tooling Jan 3, 2025

github-actions bot added this to the Backlog milestone Jan 3, 2025

abonie removed the Needs-Triage label Jan 6, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tracking parallel testing outstanding issues #18199

Tracking parallel testing outstanding issues #18199

majocha commented Jan 3, 2025 •

edited

Loading

T-Gro commented Jan 6, 2025

majocha commented Jan 6, 2025

T-Gro commented Jan 6, 2025 •

edited

Loading

majocha commented Jan 6, 2025

majocha commented Jan 7, 2025 •

edited

Loading

majocha commented Jan 7, 2025

Tracking parallel testing outstanding issues #18199

Tracking parallel testing outstanding issues #18199

Comments

majocha commented Jan 3, 2025 • edited Loading

T-Gro commented Jan 6, 2025

majocha commented Jan 6, 2025

T-Gro commented Jan 6, 2025 • edited Loading

majocha commented Jan 6, 2025

majocha commented Jan 7, 2025 • edited Loading

majocha commented Jan 7, 2025

majocha commented Jan 3, 2025 •

edited

Loading

T-Gro commented Jan 6, 2025 •

edited

Loading

majocha commented Jan 7, 2025 •

edited

Loading