Joining often gets stuck on waiting for metadata #1903

leblowl · 2023-10-03T00:55:46Z

On develop, with two local nodes, this happens quite often. Sorry I haven't looked into it any more than that.

vinkabuki · 2023-10-04T09:35:04Z

@leblowl Does it stay stuck forever?

holmesworcester · 2023-10-04T10:09:55Z

I think so, or at least quite a while. The steps to reproduce would be to start a community in the latest develop branch and try to join it.

Expected: you see the general channel very quickly after Tor connects.

Actual: you get stuck on one of the last steps.

vinkabuki · 2023-10-04T12:04:52Z

I cannot reproduce it. Eventually I always manage to join.

holmesworcester · 2023-10-26T17:28:30Z

@leblowl any more insights on how to reproduce this? Is it a problem in the latest develop?

leblowl · 2023-10-30T22:09:12Z

Is it not possible to reproduce? If it's taking more than 30 seconds, or sometimes several minutes, then I think that's an issue. I haven't looked into this further, but I will soon.

vinkabuki · 2023-10-31T11:04:56Z

So this is a problem with a misleading message. Anything below 20 minutes is not suspicious at all. What happens here is that you are just waiting for Tor to connect to the first peer. It's a known issue that Tor needs a lot of time to "publish" addresses and make them dialable.
Some days it works better, some days it works worse.

holmesworcester · 2023-10-31T12:49:02Z

Are we sure that's what the problem is?

vinkabuki · 2023-10-31T13:03:54Z

Yes, if it's not stuck forever, then it's not a bug; it's just standard Tor behavior.

leblowl · 2023-10-31T16:22:34Z

If it takes 20 minutes to connect, then I think that's a pretty big issue, for internal testing and general usability. I think we should look into it more and confirm what we think is happening and document that behavior. If it is a limitation of Tor, then that's great to know as it provides another data point to consider when talking about moving away from Tor.

holmesworcester · 2023-10-31T18:51:35Z

"I think we should look into it more and confirm what we think is happening and document that behavior. If it is a limitation of Tor, then that's great to know as it provides another data point to consider when talking about moving away from Tor."

I agree that we should do this, and I don't think we know yet that it's just Tor not connecting. Just because it connects eventually does not mean we know what the problem is.

Also, we should ensure that the text under the progress bar accurately reflects what is happening and what we are waiting for. If we think it's wrong, we should make an issue for that.

@vinkabuki do you think the message is misleading? Can you make an issue for that with steps to reproduce?

EmiM · 2023-11-02T13:14:26Z

It's always been a known behavior of Tor. You can even see it in the e2e tests: last week the multiple-client test was failing because a 15-minute timeout was not enough. Today the test passed in 8 minutes. I know that one example is not scientific evidence, but I'm just writing down what we have observed during 2 years of working with Tor.

We have some general data on Tor connections that we gathered (and are still gathering) on AWS. However, these tests are based on connecting to an HTTP server, i.e. the old registration mechanism with the registrar: https://s3.console.aws.amazon.com/s3/buckets/tor-connection-data?region=us-east-1&tab=objects
By the way, I think we can already stop running those.

vinkabuki · 2023-11-03T11:55:17Z

IMO 'waiting for metadata' doesn't explain what's actually happening. We used to have a good message.

holmesworcester · 2023-11-03T12:29:24Z

Any step where Tor has started and we are waiting for Tor to connect to an onion address should say 'connecting with Tor'

siepra · 2023-11-08T11:32:55Z

Are there any decisions about the steps to take for this task?
Also, does it belong in the current sprint? It sounds like a general problem, not exactly related to the changes we're about to publish.

@holmesworcester

holmesworcester · 2023-11-08T12:37:25Z

One very concrete thing is: every step in the joining process should be correctly described to the user, so that if a step fails or takes too long we know why.

siepra · 2023-11-08T14:00:06Z

I guess we need specific guidelines on the descriptions then. Otherwise we'll get "lost in translation": each of us may have a different understanding of what a step is about. "Waiting for metadata" is actually accurate when you think about it.

holmesworcester · 2023-11-08T16:57:19Z

This is how we dealt with it last time: #1277 (comment)

Most important things to show:

Tor bootstrapping.

Tor connection process in registration (most important because this is where we get stuck; show a new message at the beginning of each fetch. This will repeat many times in most cases. 'fetching/timeout/fetching/timeout/etc')

OrbitDB block download of messages. (Doesn't have to be continuous or "make sense"; just show what's happening.)

Notes:

We don't want to show "errors" that are not really errors. (maybe change them in the logs or maybe hide them)

Maybe debouncing is a good idea so we don't show everything, but we should show everything that takes 1s or 2s.

Being busy and fast is good

It's good if we can see what we got stuck on for debugging purposes

Think of yourself as an artist trying to give the user some clear information about what's happening and reassure them that something is happening.

We can use indeterminate progress bars in most cases (bonus if we use a steady progress bar)
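The debouncing note above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `StatusEvent` shape, the `visibleStatuses` helper, and the 1000 ms threshold are hypothetical and not taken from the Quiet codebase. The idea is to hide steps that finish faster than the threshold while still showing whatever step we are currently sitting on.

```typescript
// Hypothetical shape of a status update; not taken from the Quiet codebase.
interface StatusEvent {
  label: string
  startedAtMs: number
}

// Return the labels of the steps that stayed current for at least minDurationMs.
// A step ends when the next one starts; the last step is still running at nowMs.
function visibleStatuses(events: StatusEvent[], minDurationMs: number, nowMs: number): string[] {
  const visible: string[] = []
  events.forEach((event, index) => {
    const next = events[index + 1]
    const endedAtMs = next !== undefined ? next.startedAtMs : nowMs
    if (endedAtMs - event.startedAtMs >= minDurationMs) {
      visible.push(event.label)
    }
  })
  return visible
}

// Two quick steps followed by one long-running step (labels are examples only).
const exampleEvents: StatusEvent[] = [
  { label: 'Tor bootstrapping', startedAtMs: 0 },
  { label: 'fetching', startedAtMs: 200 },
  { label: 'Connecting via Tor...', startedAtMs: 500 },
]
const shown = visibleStatuses(exampleEvents, 1000, 4000)
```

With these example timestamps only the last step lasts longer than one second, so it is the only one that would be displayed.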

Since we've changed some of the steps in the process, we may need to change what is reported to the user.

Also, any time we are waiting for Tor to connect, we should say "Connecting via Tor..." until Tor has connected successfully. And any time we are waiting for something else, like block data, we should not mention Tor. This way, we and our users will be on the same page about the impact of Tor on the joining process.

In other words, if Tor slowness is really the culprit here, let's prove it by showing status messages that clearly state when we are waiting for Tor and when we are not.
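As a minimal sketch of that rule, assuming we can tell what the joining process is currently waiting on: the `WaitReason` type, the `statusMessage` function, and the non-Tor wording below are hypothetical placeholders, not existing code.

```typescript
// Hypothetical reasons the joining screen might be waiting. These names are
// illustrative and are not taken from the Quiet codebase.
type WaitReason = 'tor-connection' | 'block-data'

// Mention Tor only while a Tor connection is actually what we are waiting for.
function statusMessage(reason: WaitReason): string {
  if (reason === 'tor-connection') {
    return 'Connecting via Tor...'
  }
  // Placeholder wording; the point is only that it does not mention Tor.
  return 'Downloading data...'
}
```

If the message shown while stuck is the Tor one, we know Tor is what we are waiting for; if it is anything else, the delay has a different cause.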

holmesworcester · 2023-11-13T14:39:54Z

@siepra regarding the call we just had, I don't think there's any reason in particular why Lucas needs to work on this. The workflow that @Kacper-RF followed when he worked on these screens initially was:

Identify stages in the process that are meaningful.
Propose names for those stages that will display to the user
Get approval for those names
Implement it and show a screencast and confirm that it's right.

I think this approach will work again, so I think anyone on the team can do this. @Kacper-RF might be the best person since he worked on it initially.

Kacper-RF · 2023-11-14T14:48:58Z

Current state:

5% - Connecting process started (initial log)
20% - Connecting to community owner via Tor
20% - Registering owner certificate (only visible for owner)
30% - Launching community
40% - Spawning hidden service for community
50% - Initializing libp2p
60% - Initialized storage
70% - Initializing IPFS
75% - Loaded certificates to memory
80% - Initialized DBs
85% - Launched community
87% - Waiting for metadata
90% - Channels replicated
95% - Certificates replicated

From my observations, there are a few steps on which the user spends the most time, and I think these are the steps where we should provide the most valuable information:

20% - Connecting to community owner via Tor
85% - Launched community
87% - Waiting for metadata

holmesworcester · 2023-11-15T12:59:12Z

Thanks @Kacper-RF! It's super helpful to see this list. So, just to clarify we're talking about 2.x now. I have a few general suggestions about some of these so I'm going to go through them.

"20% - Connecting to community owner via Tor"

This should now say "Connecting to peers"

"40% - Spawning hidden service for community"

This happens synchronously? Why do we have to wait for this at all? We already have the ability to make outgoing connections to peers so it shouldn't block anything.

"85% - Launched community"

This should say "Launching community", right? Because it's in progress at this stage? Or if we have already launched the community and something else is in progress, what is in progress?

It's weird to be waiting on a step that is described in the past tense as having already happened. Like, if it happened, why am I waiting?

"87% - Waiting for metadata"

Okay, at this point it sounds like we have already made connections to peers over Tor, in 2.x, so we aren't waiting for any more connections. What is actually happening here?

Should this say "Downloading community metadata"?

And if all we're doing here is downloading community metadata, how do we explain "joining often gets stuck?" Above Emi says:

"It's always been a known behavior of Tor. You can even see it in the e2e tests: last week the multiple-client test was failing because a 15-minute timeout was not enough. Today the test passed in 8 minutes. I know that one example is not scientific evidence, but I'm just writing down what we have observed during 2 years of working with Tor."

But we've already made connections via Tor to some peers (at "20% - Connecting to community owner via Tor"), so what is our explanation for the issue leblowl is seeing?

Kacper-RF · 2023-11-15T15:55:47Z

After taking a closer look at the current joining flow, I find these steps very confusing. They are tied to the old master/production joining flow.

87% - Waiting for metadata - this is actually the moment when we are waiting to connect to other peers.

The earlier steps are visible only very briefly; they are even difficult to read.

So I would suggest using only 4 steps:

1. Connecting process started
2. Connecting to peers (most time-consuming)
3. Channels replication
4. Certificates replication
Optionally:
5. Waiting for messages to load (we wait for at least one message to be displayed in the channel, so that we don't drop the user into an empty channel)

holmesworcester · 2023-11-17T00:06:04Z

Previously we showed information about Tor's startup process. Can we show that here using the existing language, if Tor has not started yet? (Sometimes it will have started already, sometimes not.)

"So I would suggest using only 4 steps"

What percentages will you display for these steps and what will you call them? I'd propose:

[Tor startup steps]
Connecting to community members via Tor
Loading messages

Is certificates replication a step that would block other steps? I don't think it should be, so I think these steps can be enough. Is there a way to show progress on "loading messages"?

Kacper-RF · 2023-11-17T13:04:14Z

We can do something like this:

5% - Connecting process started (so the user always sees that something is going on)

From 5% to 50%: Tor bootstrapping logs, for example:

Bootstrapped 5% (conn)
Bootstrapped 10% (conn_done)
Bootstrapped 14% (handshake)
Bootstrapped 15% (handshake_done)
Bootstrapped 25% (requesting_status)
Bootstrapped 30% (loading_status)
Bootstrapped 40% (loading_keys)
Bootstrapped 45% (requesting_descriptors)
Bootstrapped 50% (loading_descriptors)
Bootstrapped 55% (loading_descriptors)
Bootstrapped 61% (loading_descriptors)
Bootstrapped 70% (loading_descriptors)
Bootstrapped 75% (enough_dirinfo)
Bootstrapped 90% (ap_handshake_done)
Bootstrapped 100% (done)

I will adjust the progress bar for them somehow.

55% - Connecting to community members via Tor (long step, maybe adding 1% to the progress bar every 3 seconds, but I don't know if that's a good idea?)

80% - Loading messages (I think I can keep the same message but change the value (80%, 85%, 90%) after receiving events from OrbitDB, plus some backend logic)

We want to have replicated channels and certificates before showing the channel list to the user, because as far as I remember, some logic depends on certificates.

I think I can start working on a draft solution and send you some videos of what it looks like.

Let me know what you think.
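To make the 5% to 50% idea concrete, here is a rough sketch of parsing a bootstrap line and rescaling Tor's percentage into that slice of the progress bar. It assumes the log lines look like the examples listed above; `parseBootstrapLine` and `torProgressToBar` are hypothetical helper names, not existing functions.

```typescript
// Matches lines such as "Bootstrapped 45% (requesting_descriptors)".
const BOOTSTRAP_PATTERN = /Bootstrapped (\d+)% \(([a-z_]+)\)/

interface BootstrapStatus {
  percent: number
  tag: string
}

// Extract the percentage and tag from a Tor bootstrap log line, or return
// null when the line is not a bootstrap progress line.
function parseBootstrapLine(line: string): BootstrapStatus | null {
  const match = BOOTSTRAP_PATTERN.exec(line)
  if (match === null) {
    return null
  }
  const [, percentText = '0', tag = ''] = match
  return { percent: Number(percentText), tag }
}

// Linearly rescale Tor's 0-100% into the barStart-barEnd slice of our own bar.
function torProgressToBar(torPercent: number, barStart = 5, barEnd = 50): number {
  const clamped = Math.min(100, Math.max(0, torPercent))
  return Math.round(barStart + ((barEnd - barStart) * clamped) / 100)
}

const parsed = parseBootstrapLine('Bootstrapped 40% (loading_keys)')
const barValue = parsed !== null ? torProgressToBar(parsed.percent) : 5
```

With this mapping, Tor's 40% lands at 23% on our bar and Tor's 100% lands at 50%, right before the "Connecting to community members via Tor" step.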

holmesworcester · 2023-11-17T14:01:01Z

This sounds good!

Kacper-RF · 2023-11-23T14:33:52Z

After several different approaches, I finally did something like this:

Showing Tor bootstrapping logs was very problematic due to the asynchronous start, and it sometimes broke the progress bar.

I tried to limit the steps to make them clearly visible and readable to the user.

I implemented an additional progress bar animation for when the user is on the most time-consuming step: Connecting to community members via Tor

The idea of adding 1% every 3 seconds was risky because sometimes the process takes a really long time if no member is online.
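A quick back-of-the-envelope sketch of why: starting from the 55% proposed earlier and adding 1% every 3 seconds, the bar runs out of room fairly quickly. The `secondsToReach` helper is hypothetical and only illustrates the arithmetic.

```typescript
// Seconds needed for a bar that starts at startPercent and gains stepPercent
// every intervalSeconds to reach targetPercent.
function secondsToReach(
  startPercent: number,
  targetPercent: number,
  stepPercent = 1,
  intervalSeconds = 3
): number {
  const steps = Math.ceil((targetPercent - startPercent) / stepPercent)
  return Math.max(0, steps) * intervalSeconds
}

// From 55% to the 80% "Loading messages" mark.
const untilLoadingMessages = secondsToReach(55, 80)
// From 55% all the way to a full bar.
const untilFull = secondsToReach(55, 100)
```

That is 75 seconds to hit the 80% mark and 135 seconds to fill the bar completely, which is far shorter than the multi-minute waits discussed earlier in this thread.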

joining-user.mp4

Kacper-RF · 2023-11-24T11:00:48Z

#2093

holmesworcester · 2023-11-24T12:11:14Z

This looks great!

Is that new animation implemented on mobile too? (Just asking because it seems non-standard)

It might be more standard to switch the whole thing to an "infinite" progress bar that goes back and forth, if that's easier on mobile for some reason. But this looks great and, most importantly, the decisions about what to say to the user look great to me.

Kacper-RF · 2023-11-24T12:47:26Z

Thanks!
Yes, I implemented the animation on mobile as well. It was a bit trickier than on desktop, but it looks the same as on desktop.

* feat: basic changes
* feat: better UX
* feature: state manager and desktop
* feat: mobile part
* fix: use enum instead hardcoded string
* fix: mobile channel list screen
* feat: debug log
* feat: trigger mobile e2e
* feat: isJoiningCompleted selector
* test: add isJoiningCompleted

kingalg · 2024-01-11T09:37:09Z

Desktop: 2.0.3-alpha.15
Mobile: [email protected], ios

Done.

leblowl mentioned this issue Oct 3, 2023

Unregistered users improvements #1902

Closed

13 tasks

leblowl changed the title ~~Community often get's stuck on waiting for metadata~~ Joining often get's stuck on waiting for metadata Oct 3, 2023

holmesworcester added this to Quiet Oct 3, 2023

holmesworcester moved this to Sprint in Quiet Oct 3, 2023

holmesworcester changed the title ~~Joining often get's stuck on waiting for metadata~~ Joining often gets stuck on waiting for metadata Oct 3, 2023

holmesworcester added the bug Something isn't working label Oct 3, 2023

holmesworcester assigned leblowl Oct 4, 2023

holmesworcester added the can't reproduce label Oct 26, 2023

siepra added the needs clarification label Nov 8, 2023

leblowl mentioned this issue Nov 13, 2023

After restarting Quiet while joining, user should return to progress screen #1910

Open

leblowl removed their assignment Nov 13, 2023

siepra removed can't reproduce needs clarification labels Nov 14, 2023

Kacper-RF self-assigned this Nov 16, 2023

Kacper-RF moved this from Sprint to In progress in Quiet Nov 16, 2023

Kacper-RF moved this from In progress to Waiting for review in Quiet Nov 24, 2023

Kacper-RF moved this from Waiting for review to Merged (develop) in Quiet Nov 30, 2023

Kacper-RF moved this from Merged (develop) to Ready for QA in Quiet Dec 1, 2023

kingalg closed this as completed Jan 11, 2024

kingalg moved this from Ready for QA to Done in Quiet Jan 11, 2024
