e2e tests: limit name_len_slow to 3, split e2e tests from other tests
[`test_query_e2e` takes almost 8 minutes to run][1] (the whole CI job takes
11 minutes). The `name_len_slow` script is the main culprit: it sleeps for
1 second in each UDF call, and its mapper runs with `parallel=1`, i.e. in a
single process.

```
474.21s call     tests/test_query_e2e.py::test_query_e2e@tmpfile
```

This commit adds a limit of 3 files to the `name_len_slow` script, which is
enough, since it only runs a single process.
(We interrupt the running process immediately after "UDF Processing
Started" is printed.)
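The interrupt-on-marker pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code used in `test_query_e2e`: the helper name `interrupt_after_marker`, the `wait_timeout` parameter, and the use of `terminate()` are assumptions; only the marker string comes from the commit message.

```python
import subprocess
import sys

MARKER = "UDF Processing Started"  # message named in the commit text


def interrupt_after_marker(cmd, marker=MARKER, wait_timeout=30.0):
    """Run cmd and stop it once `marker` appears in its output.

    Returns True if the marker was seen, False if the process exited first.
    Hypothetical helper: reading has no timeout of its own, so a child that
    neither prints the marker nor exits would block this loop.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    seen = False
    try:
        for line in proc.stdout:
            if marker in line:
                seen = True
                break
    finally:
        # Assumption: terminate() stands in for whatever signal the real test sends.
        proc.terminate()
        try:
            proc.wait(timeout=wait_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        proc.stdout.close()
    return seen


if __name__ == "__main__":
    # Stand-in child process: prints the marker, then pretends to do slow work.
    child = [
        sys.executable, "-u", "-c",
        "import time; print('UDF Processing Started'); time.sleep(60)",
    ]
    print(interrupt_after_marker(child))
```

Because the process is stopped as soon as the marker appears, the number of input files mostly affects setup time, which is why a small limit is sufficient for this kind of check.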

This also splits the tests into two CI steps: one for the e2e tests and one
for the rest, so that slow e2e runs are more obvious in the future.
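A minimal sketch of how a marker-based split like this could look on the pytest side, assuming the e2e tests carry an `e2e` marker (the `-m "e2e"` and `-m "not e2e"` selections in the workflow suggest this, but how the repository registers or applies the marker is not shown in this commit). The test names and the marker description are hypothetical.

```python
import pytest


def pytest_configure(config):
    # Would live in conftest.py: register the marker so pytest does not warn
    # about an unknown mark. The description string is an assumption.
    config.addinivalue_line("markers", "e2e: slow end-to-end tests")


@pytest.mark.e2e
def test_example_script_e2e():
    # Hypothetical placeholder for a slow end-to-end test.
    assert True


def test_fast_unit():
    # Unmarked test: selected by -m "not e2e".
    assert 1 + 1 == 2
```

With this layout, `pytest -m "e2e"` would select only the marked test and `pytest -m "not e2e"` everything else, which matches the two selections passed through `nox` in the workflow.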

[1]: https://github.com/iterative/datachain/actions/runs/12879531971/job/35907168617#step:8:82
skshetry committed Jan 21, 2025
1 parent ce1b09a commit 877d974
Showing 2 changed files with 6 additions and 2 deletions.
6 changes: 5 additions & 1 deletion .github/workflows/tests.yml
@@ -112,7 +112,11 @@ jobs:
   run: echo 'DISABLE_REMOTES_ARG=--disable-remotes=azure,gs' >> $env:GITHUB_ENV

 - name: Run tests
-  run: nox -s tests-${{ matrix.pyv }} -- $DISABLE_REMOTES_ARG
+  run: nox -s tests-${{ matrix.pyv }} -- -m "not e2e" $DISABLE_REMOTES_ARG
   shell: bash

+- name: Run E2E tests
+  run: nox -s tests-${{ matrix.pyv }} -- -m "e2e" $DISABLE_REMOTES_ARG
+  shell: bash
+
 - name: Upload coverage report
2 changes: 1 addition & 1 deletion tests/scripts/name_len_slow.py
@@ -35,6 +35,6 @@ def name_len(file):
 DataChain.from_storage(
     "gs://dvcx-datalakes/dogs-and-cats/",
     anon=True,
-).filter(C("file.path").glob("*cat*")).settings(parallel=1).map(
+).filter(C("file.path").glob("*cat*")).limit(3).settings(parallel=1).map(
     name_len, params=["file"], output={"name_len": int}
 ).save("name_len")
