-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix endpoint busy flag race condition #383
Fix endpoint busy flag race condition #383
Conversation
Busy flag was set to true after call to dcd_edpt_xfer(). In some cased it was possible that transfer finished before function ended. In this case busy flag could be set to false before it was set to true. Then setting it to true after dcd_edpt_xfer() made edpoint busy forever. This change marks endpoint as busy before transfer is started to avoid race condition.
In the case that the On the other hand, I'm not sure if |
|
It is physically possible, but it won't mark the endpoint busy forever. The transfer complete will push XFER_COMPLETE event into the queue, which is processed after this call is complete. It will then mark this endpoint as available again. |
oh, I just realized that using preemptive RTOS, usbd task can actually preempt and mark this as |
Yeah, I didn't encounter any dcd that couldn't queue an transfer. However the PR does keep it the same behavior as it is currently when edpt xfer failed --> busy is not marked. Since there is no error cleanup procedure when edpt_xfer failed, making it busy will cause it busy forever. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great finding, I don't expect a usb controller can run that fast. Also nice usage of TU_VERIFY_HDLR
, however, I found it a bit non-instinctive therefore I don't use it anymore and may remove it. Maybe it is best to just use normal ifelse. This is optional, I could change it later on as well.
if ( !dcd_edpt_xfer() )
{
TU_LOG2("failed\r\n")
TU_BREAKPOINT();
_usbd_dev.ep_status[epnum][dir].busy = false
}
PS: Out of curiosity, could I ask which MCU you are running on.
It was discovered on STM32F407 and STM32F411, before #380 was merged (which hides the problem). Occasionally during debug session similar situation happened caused by other IN endpoint interrupts that are not handled at all (STM cube generated USB code handles all interrupts in ISR, even if it is just clearing then) that could be applied to synopsys driver. But for this I could not find any systematic reproduction procedure. |
Yeah, usbnet also observes a similar (but different) issue when a bug is resolved by moving notification EP from EP1 to EP3. It is kind of funny, and hard to explain, probably some a little bit of delay in hw is enough to eliminate the racing. PS: ah mynewt, I couldn't able to run any tests with it. I think they break newt update/upgrade several months ago and haven't fixed it yet 😓 |
I had forgotten that we allow USBD functions to be called from any task (I wrongly remembered them only being callable from the USBD task). This patch does help, and looks right in all but the most pathological cases (such as receiving a USB reset at exactly the wrong time?). So, I'm happy with merging it. There are definitely many other thread-safety issues. It's heavy-handed, but I wonder if a usbd mutex should be used to prevent concurrency mistakes in usbd. |
dcd_edpt_xfer() is sort of especial one, I don't remember there is any other dcd API that can have this effect. We are good by only proccess even in the usbd task so far. |
Busy flag was set to true after call to dcd_edpt_xfer().
In some cased it was possible that transfer finished before function ended.
In this case busy flag could be set to false before it was set to true.
Then setting it to true after dcd_edpt_xfer() made edpoint busy forever.
This change marks endpoint as busy before transfer is started to
avoid race condition.
For my case PR #380 would hide the issue but I guess other cases could result in
same behavior.