[🐛 Bug]: Getting java.net.ConnectException while running tests that handle two windows #2267
Comments
@farious2009, thank you for creating this issue. We will troubleshoot it as soon as we can.
I am failing to see how we can help. I do not see an apparent bug related to the Selenium Grid, but rather a combination of elements that could be the reason for your issues. One crucial fact is that you are involving a 3rd party in your tests that you do not have control over. Facebook might be working well, and your tests work, but if, for some reason, a request does not work, you have zero control there. Thank you for taking the time to explain your situation, but I am sorry, I do not know what should be done on our side.
You can track the issues for Chrome and ChromeDriver at https://bugs.chromium.org/p/chromedriver, and try to find something that explains your problem. Regarding the second question, I would like to say again that I do not know. Please note that this is an issue tracker, and we can act on reported issues that can be reproduced, especially in some cases where the log is clear enough to demonstrate the bug. I understand you are trying to figure out what is going on, but we cannot help you if we cannot easily reproduce the issue.
Will try to check/find anything related or similar.
What happened?
Hi Team,
We run tests using Java 17, TestNG (7.10.2), Selenium (4.21), and Allure against a Selenium Grid with Chrome nodes in dind.
The problem happens in tests that handle 2 windows, e.g. Facebook registration or any other flow where a new window pops up after a click and we need to handle both windows. For example, when I click the Facebook register button on our standard registration page, an additional window pops up asking for credentials.
After entering the user’s data and clicking Log in, I am redirected back to our initial page, and after that the test gets stuck. It happens randomly: sometimes after the first registration or login cycle, sometimes after the second cycle if the test’s logic requires one. Recently, while experimenting with the Grid settings, I managed to hit the test timeouts once already back on our page (much later than usual), but the error is always the same:
Unable to execute request for an existing session: java.net.ConnectException
It does not matter whether it is Windows or Linux, single-threaded or multi-threaded: I tested all of them. It always looks and behaves as if the browser/node is abandoned until the very end of the suite.
It happens completely at random, and no one knows which test will break (among the “2 windows” ones). More importantly, the following tests on the same thread break like dominoes after the very first one. Also worth mentioning: the higher the Chrome version, the more unstable the tests/threads are. Currently we use Chrome node version 115, which is relatively stable, but 122, 124 and 125 are impossible to use, as the results are always unpredictable and we cannot rely on them (the situation is the same for Edge, though I have not tested non-Chromium browsers).
Below is the setup, along with an example suite with 3 threads, that lets me reproduce the issue locally with Dynamic Grid via docker compose:
I tried different SE_ values, lower and higher, but the results are all the same:
TOML (used without changes; I recently commented out video recording, since I already know where the tests fail, and also to see what impact it has on the results):
With my volumes I am able to save 2 (in fact 3, but the logs are empty for some reason; maybe I have made a mistake with the volumes somewhere; the permissions are as below):
Session capabilities:
Allure’s small example:
Importantly, the session ID used by these tests is the same from the very first broken test onward (the one with the broken connection mentioned above):
As you can see, the IDs are the same, and I have no idea where those 5 s timeouts come from (maybe TestNG keeps the session for the next test when the connection to the current test/node is suddenly lost and tries to apply it there, though that is just my own speculation).
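One way to keep a dead session from being reused by later tests on the same thread is to check whether a failure traces back to the reported `java.net.ConnectException` and, if so, throw the driver away and create a new one. The sketch below is a hypothetical client-side helper (`SessionHealth` is not part of Selenium or TestNG). It assumes the exception either appears in the cause chain or only as text in the message, as in the error quoted above.

```java
import java.net.ConnectException;

/** Hypothetical helper: decides whether a failure means the session is unreachable. */
final class SessionHealth {
    private SessionHealth() {}

    /**
     * Walks the cause chain and returns true if any link is a ConnectException,
     * or mentions one in its message (the Grid reports it as text to the client).
     * The depth limit guards against cyclic cause chains.
     */
    static boolean causedByConnectException(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof ConnectException) {
                return true;
            }
            String message = t.getMessage();
            if (message != null && message.contains("java.net.ConnectException")) {
                return true;
            }
        }
        return false;
    }
}
```

In a TestNG `@AfterMethod`, the result's throwable could be passed to this check, and a `true` result used as the signal to drop the thread's driver reference instead of carrying the same session ID into the next test. Whether that would stop the domino effect described here is untested.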
Timeline from the above-mentioned run:
The whole suite, to show how threads are “occupied” by these exceptions and how all subsequent tests on the same thread fail:
Versus a good run against Chrome 115:
Here the test failed right before the SwitchTo back to the parent window (this is a different run from the one above, so the ID is different); however, the tests break in the same way: the behavior and the error are the same, and those 5-second timeouts are there too:
Regarding the code/steps themselves, using Facebook as an example (other external systems fail as well), the tests are trivial:
That’s pretty much all. Sleeps and wait utilities have not helped at all. I tried to play with the Grid settings, but that has not worked either. We have 16 cores, but even 4 threads fail to complete; even a single thread is not stable; only lowering the Node version affects the results.
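For reference, the two-window flow described in this report (click, pop-up opens, log in, return to the parent) comes down to comparing sets of window handles. The following is a minimal sketch of that bookkeeping in plain Java, not the reporter's actual test code; `WindowHandles` and its methods are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper for tracking a single pop-up window by handle. */
final class WindowHandles {
    private WindowHandles() {}

    /**
     * Returns the one handle present in {@code after} but not in {@code before}.
     * Empty if the pop-up has not opened yet, or if more than one window appeared.
     */
    static Optional<String> newHandle(Set<String> before, Set<String> after) {
        Set<String> diff = new LinkedHashSet<>(after);
        diff.removeAll(before);
        return diff.size() == 1 ? Optional.of(diff.iterator().next()) : Optional.empty();
    }

    /** True once the pop-up has closed, i.e. only the parent handle remains. */
    static boolean popupClosed(String parent, Set<String> current) {
        return current.size() == 1 && current.contains(parent);
    }
}
```

In a Selenium test, `before` would come from `driver.getWindowHandles()` taken before the click, the result of `newHandle` would be passed to `driver.switchTo().window(...)`, and once `popupClosed` holds the test would switch back explicitly with `driver.switchTo().window(parent)`. The explicit switch matters because the driver keeps pointing at the closed pop-up otherwise. Whether this has any bearing on the `ConnectException` reported here is not established.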
Below I am attaching the logs I managed to collect with my current settings. If you need anything more, I would be glad to extract it, but I would need you to tell me exactly how to configure my current setup (note that there are no special settings for Chrome 115, and it works fine with only SE_EVENT_BUS_HOST and the ports specified, with shm_size: 512mb for 13 replicas). But the same does not work for Selenium Grid or Dynamic Grid (even with shm_size: 2gb).
What I see in the docker logs are those weird warnings about requests with 404 errors (I am pretty sure they appear when I type, click, or do something else in the child window).
And when a test fails, either before or after it (I am not sure which), there are also 500 errors (but maybe that is expected if the check happens after the container and session have been removed; I don't know, this is just speculation, as we've run out of options).
The expected behavior
We want to run 1 test against one Chrome node (preferably the latest one: either 124 or 125) and close it once the test has either completed or failed. This node should be our Las Vegas: what happens inside stays there and does not affect other tests or threads at all, even if it gets stuck. Independent multithreading is needed here as well.
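On the client side, the “one test, one session, always closed” part of this can be approximated by opening the session inside the test and releasing it in a `finally` block. The sketch below is generic Java under that assumption; `IsolatedSession` is a hypothetical name, and it does not address how the Grid or the node containers behave.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical runner: one fresh session per test, always released afterwards. */
final class IsolatedSession {
    private IsolatedSession() {}

    /**
     * Opens a session, runs the test body, and closes the session whether the
     * body passed or threw. A failure while closing (for example because the
     * session is already unreachable) is swallowed so it cannot mask the
     * test's own result.
     */
    static <S> void run(Supplier<S> open, Consumer<S> close, Consumer<S> test) {
        S session = open.get();
        try {
            test.accept(session);
        } finally {
            try {
                close.accept(session);
            } catch (RuntimeException ignored) {
                // Session was likely dead already; nothing more to release here.
            }
        }
    }
}
```

With Selenium this might be called as `IsolatedSession.run(() -> new RemoteWebDriver(gridUrl, new ChromeOptions()), WebDriver::quit, driver -> { ... })`, where `gridUrl` is a placeholder for the Grid address. Since no driver is shared between tests, a stuck session would be confined to the test that created it, at least from the client's point of view.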
Best Regards
Command used to start Selenium Grid with Docker (or Kubernetes)
Relevant log output
Operating System
Windows 10 (local run), Linux (job's run)
Docker Selenium version (image tag)
selenium/node-docker 4.21.0-20240517
Selenium Grid chart version (chart version)
No response