-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
e2e tests: limit name_len_slow to 3, split e2e tests from others #842
Conversation
Deploying datachain-documentation with Cloudflare Pages
|
877d974
to
445dad7
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #842 +/- ##
=======================================
Coverage 87.42% 87.42%
=======================================
Files 128 128
Lines 11429 11429
Branches 1559 1559
=======================================
Hits 9992 9992
Misses 1042 1042
Partials 395 395
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
[The `test_query_e2e` takes almost ~8mins to run][1] (whole CI job takes 11 mins). The `name_len_slow` script is the main culprit, since it sleeps for 1 sec in each udf function and that mapper is run in a single process parallel mode. ``` 474.21s call tests/test_query_e2e.py::test_query_e2e@tmpfile ``` This commit adds a limit of 3 files to the name_len_slow script, which is enough, since it's only running a single process. (We immediately interrupt the running process after seeing "UDF Processing Started" gets printed). This also split tests into two: one for the e2e tests and one for the rest, so that these things are more obvious in the future. [1]: https://github.com/iterative/datachain/actions/runs/12879531971/job/35907168617#step:8:82
445dad7
to
c019173
Compare
shell: bash | ||
|
||
- name: Run E2E tests | ||
run: nox -s tests-${{ matrix.pyv }} -- -m "e2e" $DISABLE_REMOTES_ARG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Q] Do we get some benefit from splitting this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the tests be distributed amongst different workers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Q] Do we get some benefit from splitting this out?
I hope that this will make these issues obvious by splitting them.
Will the tests be distributed amongst different workers?
Yes, it runs in the same way as the above test run. PTAL at https://github.com/iterative/datachain/actions/runs/12880225335/job/35909694647#step:9:17 where this workflow is actually run (we use pull_request_target
which is running workflow from the default branch).
did this change break the coverage reports? |
…#842) [The `test_query_e2e` takes almost ~8mins to run][1] (whole CI job takes 11 mins). The `name_len_slow` script is the main culprit, since it sleeps for 1 sec in each udf function and that mapper is run in a single process parallel mode. ``` 474.21s call tests/test_query_e2e.py::test_query_e2e@tmpfile ``` This commit adds a limit of 3 files to the name_len_slow script, which is enough, since it's only running a single process. (We immediately interrupt the running process after seeing "UDF Processing Started" gets printed). This also split tests into two: one for the e2e tests and one for the rest, so that these things are more obvious in the future. [1]: https://github.com/iterative/datachain/actions/runs/12879531971/job/35907168617#step:8:82
The
test_query_e2e
takes almost ~8mins to run (whole CI job takes 11 mins). Thename_len_slow
script is the main culprit, since it sleeps for 1 sec in each udf function and that mapper is run in a single process parallel mode.This commit adds a limit of 3 files to the name_len_slow script, which is enough, since it's only running a single process.
(We immediately interrupt the running process after seeing "UDF Processing Started" gets printed).
This also split tests into two: one for the e2e tests and one for the rest, so that these things are more obvious in the future.