Read replicas do not log 503 errors to stderr #3841

Open
laurenceisla opened this issue Jan 6, 2025 · 0 comments
@laurenceisla
Member

Environment

  • PostgreSQL version: 15.6
  • PostgREST version: 12.2.3
  • Operating system: Ubuntu

Description of issue

PostgREST logged the following (request queries redacted):

127.0.0.1 - service_role [06/Jan/2025:17:44:02 +0000] "GET /rpc/<an_rpc> HTTP/1.1" 503 - "" "node"
06/Jan/2025:17:44:03 +0000: Successfully connected to PostgreSQL 15.6 ...
06/Jan/2025:17:44:03 +0000: Config reloaded
06/Jan/2025:17:44:03 +0000: Schema cache queried in 35.5 milliseconds
06/Jan/2025:17:44:03 +0000: Schema cache loaded ...
127.0.0.1 - service_role [06/Jan/2025:17:44:02 +0000] "GET /rpc/<another_rpc> HTTP/1.1" 503 - "" "node"
06/Jan/2025:17:44:03 +0000: Successfully connected to PostgreSQL 15.6 ...
06/Jan/2025:17:44:03 +0000: Config reloaded
06/Jan/2025:17:44:04 +0000: Schema cache queried in 37.9 milliseconds
06/Jan/2025:17:44:04 +0000: Schema cache loaded ...

As seen above, PostgREST does not log the error behind the 503 response (when it should) and retries the connection immediately. By contrast, 500 errors are logged correctly:

06/Jan/2025:17:44:59 +0000: {"code":"57014","details":null,"hint":null,"message":"canceling statement due to statement timeout"}
127.0.0.1 - service_role [06/Jan/2025:17:44:59 +0000] "GET /rpc/<an_rpc> HTTP/1.1" 500 - "" "node"
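To spot the difference in a larger log, a small script along these lines might help. It is only a sketch: it assumes the two line shapes shown in the excerpts above (an access-log line, and a timestamped JSON error line printed right before it), and the names `unlogged_5xx` and `is_error_line` are made up for illustration, not part of PostgREST.

```python
import json
import re

# Access-log line, e.g.:
#   127.0.0.1 - role [06/Jan/2025:17:44:02 +0000] "GET /x HTTP/1.1" 503 - "" "node"
ACCESS_RE = re.compile(r'^\S+ - \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) ')
# Error line, e.g.:
#   06/Jan/2025:17:44:59 +0000: {"code":"57014",...,"message":"..."}
ERROR_RE = re.compile(r'^\S+ \+\d{4}: (?P<body>\{.*\})\s*$')


def is_error_line(line):
    """True if the line is a timestamped JSON object with a "message" key."""
    m = ERROR_RE.match(line)
    if not m:
        return False
    try:
        return "message" in json.loads(m.group("body"))
    except json.JSONDecodeError:
        return False


def unlogged_5xx(lines):
    """Return 5xx access-log lines not directly preceded by a JSON error line.

    Assumption: the error line, when present, is the line immediately
    before the access-log line, as in the 500 example above.
    """
    flagged = []
    prev = ""
    for line in lines:
        m = ACCESS_RE.match(line)
        if m and m.group("status").startswith("5") and not is_error_line(prev):
            flagged.append(line)
        prev = line
    return flagged


sample = [
    '127.0.0.1 - service_role [06/Jan/2025:17:44:02 +0000] "GET /rpc/<an_rpc> HTTP/1.1" 503 - "" "node"',
    '06/Jan/2025:17:44:03 +0000: Config reloaded',
    '06/Jan/2025:17:44:59 +0000: {"code":"57014","details":null,"hint":null,"message":"canceling statement due to statement timeout"}',
    '127.0.0.1 - service_role [06/Jan/2025:17:44:59 +0000] "GET /rpc/<an_rpc> HTTP/1.1" 500 - "" "node"',
]

for line in unlogged_5xx(sample):
    print(line)
```

On the sample lines, only the 503 request is flagged; the 500 request is skipped because its error line comes right before it.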

The PostgreSQL logs show that the error comes from a conflict between the replica and the primary database. In particular, they contain these two errors:

message:"canceling statement due to conflict with recovery"
detail:"User query might have needed to see row versions that must be removed."

message:"terminating connection due to conflict with recovery"
detail:"User query might have needed to see row versions that must be removed."
hint:"In a moment you should be able to reconnect to the database and repeat your command."

I'm not sure whether PostgREST should try to reconnect in these cases, since one solution is to set max_standby_archive_delay and max_standby_streaming_delay in the PostgreSQL config to avoid the error.
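For reference, the two settings named above go in postgresql.conf on the replica. The values below are placeholders I picked for illustration, not a recommendation: longer delays give replica queries more time before they are canceled, at the cost of the replica falling further behind the primary.

```
# postgresql.conf on the read replica (illustrative values only)
max_standby_streaming_delay = 5min   # PostgreSQL default is 30s; -1 waits indefinitely
max_standby_archive_delay = 5min     # PostgreSQL default is 30s; -1 waits indefinitely
```

Both settings can be changed with a configuration reload; no server restart is needed.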
