Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in v3.8.0 #452

Closed
carltongibson opened this issue Mar 21, 2024 · 9 comments · Fixed by #455
Closed

Regression in v3.8.0 #452

carltongibson opened this issue Mar 21, 2024 · 9 comments · Fixed by #455
Labels

Comments

@carltongibson
Copy link
Member

The recent 3.8.0 release causes an intermittent test failure on Django's test suite.

https://djangoci.com/job/pull-requests-focal/database=spatialite,label=focal-pr,python=python3.11/24499/testReport/junit/asgi.tests/ASGITest/test_disconnect_with_body/

tests % ./runtests.py asgi.tests.ASGITest.test_disconnect_with_body
Testing against Django installed in '/Users/carlton/Projects/Django/django/django' with up to 10 processes
Found 1 test(s).
System check identified no issues (0 silenced).
F
======================================================================
FAIL: test_disconnect_with_body (asgi.tests.ASGITest.test_disconnect_with_body)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/carlton/Projects/Django/asgiref/asgiref/sync.py", line 254, in __call__
    return call_result.result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File ".../asgiref/sync.py", line 331, in main_wrap
    result = await self.awaitable(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../django/tests/asgi/tests.py", line 374, in test_disconnect_with_body
    with self.assertRaises(asyncio.TimeoutError):
AssertionError: TimeoutError not raised

----------------------------------------------------------------------
Ran 1 test in 0.005s

FAILED (failures=1)

This tests disconnect handling for a request with a body. The main task processing the request is not correctly cancelled, on some runs only. (The assertion fails, with response content {'type': 'http.response.start', 'status': 200, 'headers': [(b'Content-Type', b'text/html; charset=utf-8'), (b'Content-Length', b'12')]} delivered, rather than the timeout.)

Bisected to 0503c2c by @ttys0dev for #435.

0503c2c4df06dbe56f04689fe537325f28dbbe8b is the first bad commit
commit 0503c2c4df06dbe56f04689fe537325f28dbbe8b
Author: ttys0dev <[email protected]>
Date:   Fri Jan 26 23:13:55 2024 -0700

    Fix task cancellation propagation to subtasks when using sync middleware (#435)

 asgiref/sync.py    |  59 +++++++++++++++-----
 tests/test_sync.py | 159 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 201 insertions(+), 17 deletions(-)

//cc @felixxm

@ttys0dev
Copy link
Contributor

I think this may be a buggy test:
Sending a http.disconnect AFAIU is supposed to trigger a task cancellation exception not a asyncio.TimeoutError.

@carltongibson
Copy link
Member Author

The task is cancelled. It's the communicator that times out waiting for the response.

@ttys0dev
Copy link
Contributor

It's the communicator that times out waiting for the response.

Hmm, but the communicator only waits if the task is not already done right? And the http.disconnect is already called before await communicator.receive_output() so wouldn't it be expected to not hit the codepath which would raise a asyncio.TimeoutError since there would already be an asyncio.CancelledError?

@carltongibson
Copy link
Member Author

carltongibson commented Mar 21, 2024

ASGIHandler runs two parallel tasks. One to process the request, the other to listen for the http.disconnect event.

In the test here, on receiving the http.disconnect that task completes, and the other task, processing the request should be cancelled.

When the test fails — and it's intermittent so the behaviour here isn't deterministic — the main request processing task isn't cancelled, which can be checked by inspecting the output received by the application communicator.

That's a regression. We need to work out why it's happened, then we can resolve it. In the meantime though, unless we can solve it quickly, I think we need to revert the problem patch and issue a 3.8.1 to avoid a breakage in deployed applications.

@ttys0dev
Copy link
Contributor

That's a regression. We need to work out why it's happened, then we can resolve it. In the meantime though, unless we can solve it quickly, I think we need to revert the problem patch and issue a 3.8.1 to avoid a breakage in deployed applications.

Debugging it now.

@carltongibson
Copy link
Member Author

@ttys0dev Thanks for taking a look. 🎁

@ttys0dev
Copy link
Contributor

When the test fails — and it's intermittent so the behaviour here isn't deterministic — the main request processing task isn't cancelled, which can be checked by inspecting the output received by the application communicator.

Yeah, so it looks like the issue was that we were effectively inhibiting normal parent task asyncio.CancelledError propagation if the parent task was already done by the time we finished cancelling child tasks, #455 should fix that by checking if the task is done and propagating the asyncio.CancelledError if it was.

@carltongibson
Copy link
Member Author

@ttys0dev great thanks. I will give it a run in the morning.

Good hustle!

@carltongibson
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants