Adds Variable Batching Proposal #307
Does the distributed GraphQL executor currently support 'regular' batching requests?
@Shane32 we will specify this as well ... we call it at the moment request batching ... the idea is that a request batch can also consist of variable batches.
So it would support variable batching within request batching then?
Yes, this is the current discussion. There are a lot of constraints we will put in place for the first iteration of this, we have explored this also in combination with subscriptions and all. But for this initial appendix we are focusing on variable batching first as this will be the minimum requirement for the composite schema spec.
Ok. I would suggest that we do not use the jsonl format.
Perhaps it could be specified that when request batching was layered on top of variable batching, then the lists are flattened. I'm not sure this is the best approach, but it's feasible.
Let's use the response format that is commonplace now and supported by various servers and clients alike. Perhaps as a separate appendix, the jsonl format is described as an optional response format for batching requests (request batching or variable batching). There it can state that if multiple batching approaches are used, the lists are flattened. But I just don't see the benefit of adding another response format.
There actually is we have specified that there is a
BTW ... we will introduce
@Shane32 I have put a bit more about the response in.
@@ -0,0 +1,170 @@
## B. Appendix: Variable Batching
After discussion in the composite schema working group, we will have a single appendix about batching, as this allows for better examples that show request batching and variable batching in combination.
@benjie shall we also add some examples here with defer and stream ... I ask because defer and stream are not yet handled in the spec.
For now: no; but would be good to have the text handy anyway.
We should remove some of the repetition of the core spec in this appendix, try and see it more as "additions" (extra fields, extra rules, changes in behavior) rather than a re-implementation.
A client SHOULD indicate the media types that it supports in responses using the `Accept` HTTP header as specified in [RFC7231](https://datatracker.ietf.org/doc/html/rfc7231).

For **variable batching requests**, the client SHOULD include the media type `application/graphql+jsonl` in the `Accept` header to indicate that it expects a batched response in JSON Lines format.
There's a few issues with this:
- `+jsonl` isn't allowed; see: https://www.rfc-editor.org/rfc/rfc6838.html#section-4.2.8
  > The primary guideline for whether a structured type name suffix is
  > registrable is that it be described by a readily available
  > description, preferably within a document published by an established
  > standards-related organization, and for which there's a reference
  > that can be used in a Normative References section of an RFC.
- `application/graphql` is incorrect (that's normally used for GraphQL documents, not GraphQL responses) - I think you meant `application/graphql-response`?
- This isn't a response but instead a number of responses.

Suggestion: just use `application/jsonl` as mentioned by https://jsonlines.org/ - this is sufficient to solve the issues that `application/json` has w.r.t. status codes/etc.
### Response

When a server receives a well-formed _variable batching request_, it MUST return a well-formed stream of _GraphQL responses_. Each response in the stream corresponds to the result of validating and executing the requested operation with one set of variables. The server's response stream describes the outcome of each operation, including any errors encountered during the execution of the request.
I don't think we should use the word "stream" here. I think "list" is fine; I think the default media type would likely be e.g. `application/graphql-response-list+json` (a JSON array of GraphQL responses) but `application/jsonl` would allow for easier early parsing of responses when the client is stream-capable.
A server must comply with [RFC7231](https://datatracker.ietf.org/doc/html/rfc7231).

Each response in the stream follows the standard GraphQL response format, with the addition of a required `variableIndex` field at the top level of each response. The `variableIndex` indicates the index of the variables map from the original request, allowing the client to associate each response with the correct set of variables.
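
As a rough illustration of how a client might use `variableIndex`, the sketch below pairs each line of a JSON Lines body with the variables map it was executed with, even when the lines arrive out of order. The helper name `matchResponsesToVariables`, the sample variables, and the sample response body are assumptions made for illustration; none of them come from the proposal text.

```javascript
// Hypothetical helper (not part of the proposal): pairs each response line
// with the variables map identified by its `variableIndex`.
function matchResponsesToVariables(variablesList, jsonlBody) {
  const results = new Array(variablesList.length).fill(null);
  for (const line of jsonlBody.split('\n')) {
    if (line.trim() === '') continue; // tolerate a trailing newline
    const response = JSON.parse(line);
    const index = response.variableIndex;
    if (!Number.isInteger(index) || index < 0 || index >= variablesList.length) {
      throw new Error(`Unexpected variableIndex: ${index}`);
    }
    results[index] = { variables: variablesList[index], response };
  }
  return results;
}

// Assumed example data: two variables maps, responses arriving out of order.
const variablesList = [{ id: 1 }, { id: 2 }];
const jsonlBody =
  '{"variableIndex":1,"data":{"product":{"name":"B"}}}\n' +
  '{"variableIndex":0,"data":{"product":{"name":"A"}}}\n';

const matched = matchResponsesToVariables(variablesList, jsonlBody);
console.log(matched[0].response.data.product.name); // prints "A"
```

Entries left as `null` in the result would indicate variables maps for which no response was received, which a real client would presumably treat as an error.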
The definition of the `application/graphql-response` type should be modified such that other keys are allowed iff they are specified as part of a specification or appendix hosted under the `graphql.org` domain; then this spec can "extend" that type with these keys. May want to go a slightly different direction on this in future.
#### Response Structure

Each line in the JSON Lines (JSONL) response MUST include the following fields:
The combination of `MUST` and `optional` is confusing. I would change this to something more along the lines of "in addition to the regular fields expected on a GraphQL response, each entry in a variable batching response MUST include the `variableIndex` field..."
It's not really fair to expect a JSON parser to parse JSONL... it's a different format. However; to make JSONL parseable as JSON is very straightforward; here's a JSONL parser for JS: ``const parseJSONL = (jsonl) => JSON.parse(`[${jsonl.trim().replaceAll('\n', ',')}]`);`` I'm not sure that you can really say that this significantly "increases the code complexity"? I would expect .NET to also be capable of:
JSONL is an incredibly straightforward format. Most importantly for this use case, it allows you to process the values as they stream down because you just scan through it looking for the next All that said, I think we should RECOMMEND that clients and servers implement JSONL, but we should REQUIRE that servers support plain JSON arrays. Which I think is already the case in this text (since SHOULD is equivalent to RECOMMEND according to RFC2119).
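
To make the "process the values as they stream down" point concrete, here is a minimal sketch of an incremental JSONL reader. It assumes the chunks have already been decoded to text (for example with a streaming `TextDecoder`) and that responses are separated by `\n`; the name `createJsonlReader` and the sample chunks are hypothetical.

```javascript
// Hypothetical incremental reader: feed it text chunks as they arrive and it
// invokes `onResponse` once for every complete line it has seen so far.
function createJsonlReader(onResponse) {
  let buffer = '';
  return {
    push(chunk) {
      buffer += chunk;
      let newlineAt;
      // Emit every complete line; keep any partial line in the buffer.
      while ((newlineAt = buffer.indexOf('\n')) !== -1) {
        const line = buffer.slice(0, newlineAt);
        buffer = buffer.slice(newlineAt + 1);
        if (line.trim() !== '') onResponse(JSON.parse(line));
      }
    },
    end() {
      // The final line may not be terminated by a newline.
      if (buffer.trim() !== '') onResponse(JSON.parse(buffer));
      buffer = '';
    },
  };
}

// Assumed example: the second response is split across two chunks.
const received = [];
const reader = createJsonlReader((response) => received.push(response));
reader.push('{"data":{"a":1}}\n{"da');
reader.push('ta":{"a":2}}\n');
reader.end();
console.log(received.length); // prints 2
```

The first response is delivered as soon as the first chunk arrives, which is the property that a plain JSON array parsed in one go does not offer.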
Co-authored-by: Benjie <[email protected]>
Fair enough; I didn't think of that.
True, .NET could also perform string manipulation fairly easily. However, the typical .NET JSON parser operates on a UTF8 byte stream so that there are no additional memory allocations beyond the minimum required to deserialize the data. Converting the incoming data to a string, performing a number of string manipulations on it, and then parsing it would considerably slow down the JSON engine. To maintain speed on par with the default implementation, you'd have to write a specialized wrapper that parsed the characters as they were being read, looking for
Probably true. I may have another viewpoint if I was more familiar with the use case (which I am not).
In general batching is used when the client has a lot of queries to execute all at the same time (e.g. as the result of rendering a React tree). In traditional batching, the server receives this as a list, executes them in parallel, and then returns the resulting array - the result being that none of the components can render until the slowest of all the queries has finished executing. By allowing a) the server to return the results in order of execution completion (rather than in request order) and b) the client to determine easily when a result is ready (e.g. by scanning the response for a
Would it? You can initialize your binary UTF8 data buffer with 0x5b (
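
One possible reading of this byte-level idea, sketched in JavaScript rather than .NET: wrap the UTF-8 bytes in `[` (0x5b) and `]` (0x5d) and turn each line feed (0x0a) into a comma (0x2c), without an intermediate string manipulation step. The helper name `jsonlBytesToJsonArrayBytes` is hypothetical, and the sketch assumes the whole body is in memory, that lines are separated by `\n`, and that there are no blank lines in the middle of the body.

```javascript
// Hypothetical helper: converts UTF-8 JSONL bytes into the UTF-8 bytes of a
// JSON array. A raw line feed cannot appear inside a JSON string and 0x0a is
// never part of a multi-byte UTF-8 sequence, so every 0x0a is a separator.
function jsonlBytesToJsonArrayBytes(jsonlBytes) {
  const LF = 0x0a, COMMA = 0x2c, OPEN = 0x5b, CLOSE = 0x5d;
  // Drop trailing line feeds so the last separator does not become a dangling comma.
  let end = jsonlBytes.length;
  while (end > 0 && jsonlBytes[end - 1] === LF) end--;
  const out = new Uint8Array(end + 2);
  out[0] = OPEN;
  for (let i = 0; i < end; i++) {
    out[i + 1] = jsonlBytes[i] === LF ? COMMA : jsonlBytes[i];
  }
  out[end + 1] = CLOSE;
  return out;
}

// Assumed example body with two responses.
const bytes = new TextEncoder().encode('{"data":{"a":1}}\n{"data":{"a":2}}\n');
const arrayBytes = jsonlBytesToJsonArrayBytes(bytes);
const responses = JSON.parse(new TextDecoder().decode(arrayBytes));
console.log(responses.length); // prints 2
```

This buffers the full body, so it gives up the early-parsing benefit of JSONL; it only shows that a JSON-array parser can be reused with little extra code.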
I see. Why does this PR focus on allowing any return order for variable batching vs regular batching then? If anything, variable batching would be much more likely to have a consistent execution time across each request and so having the responses return in any order is much less useful. Maybe we should focus on a flag of some sort to allow this behavior for traditional batching requests, and then take the solution to this PR for consistency.
I meant a streaming wrapper. One that does not read the entire result into memory at any time. Yes, I believe it would be more complex than the other two options I presented. For instance, if you read a blob of data and the last character is a LF, you can't pass it to the JSON parser but instead must cache it until the next read so you know if this is the EOF and it must be overwritten with I'm okay with the complexity if it's optional and has benefits, mind you. I would agree JSONL seems to be a better format for reading streaming responses, either in .NET or JS, where you want to read individual GraphQL responses as they are transmitted.
For example, maybe anytime a JSONL format is requested for a batching request (traditional or variable batching), a However for JSON format, no extra property is returned and the results are returned sequentially. This would simplify response processing for users that do not spend the time to write streaming response processing code (in either JS or .NET).
Within the Composite Schema WG we have discussed a new batching format that is primarily meant for subgraphs/source schemas in a federated graph. This new variable batching allows the distributed GraphQL executor to execute the same operation once for each of several sets of variables.
graphql/composite-schemas-spec#25