wip(llmobs): record bedrock token counts #5152
Draft
+96
−2
What does this PR do?
Attempts to record token counts for AWS BedrockRuntime commands. Token counts are not guaranteed to appear in the model response for every model provider served through Bedrock, and unlike in the Python counterpart, the token counts from the response headers are not deserialized onto the response metadata.
This PR tries to capture these tokens by wrapping the deserializer used by the BedrockRuntime command types (for now, only `InvokeModelCommand`).
Motivation
Full support for counting tokens from models served through AWS Bedrock from `@aws-sdk/client-bedrock-runtime`. This is only for the LLMObs plugin, but we are adding the publishers in patching.
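For reviewers, here is a minimal sketch of the deserializer wrapping described under "What does this PR do?". It is not the code in this diff. The channel name, `wrapDeserialize`, and `toCount` are hypothetical names made up for illustration. The header names follow the token-count response headers documented for the Bedrock InvokeModel API, and the sketch assumes they reach the deserializer as lowercased keys on `response.headers`.

```javascript
const dc = require('node:diagnostics_channel')

// Hypothetical channel name, used only in this sketch.
const tokenCh = dc.channel('example:bedrockruntime:token-counts')

// Assumed header names (lowercased), based on the InvokeModel response headers.
const INPUT_HEADER = 'x-amzn-bedrock-input-token-count'
const OUTPUT_HEADER = 'x-amzn-bedrock-output-token-count'

// Parse a header value into an integer, or undefined if missing or malformed.
function toCount (value) {
  const parsed = Number.parseInt(value, 10)
  return Number.isInteger(parsed) ? parsed : undefined
}

// Wrap a deserializer of the shape (httpResponse, ...args) => Promise<output>.
// The token counts are read from the raw response headers and published
// before the original deserializer runs, since its output may not carry them.
function wrapDeserialize (deserialize) {
  return async function wrappedDeserialize (response, ...args) {
    const headers = (response && response.headers) || {}
    const inputTokens = toCount(headers[INPUT_HEADER])
    const outputTokens = toCount(headers[OUTPUT_HEADER])

    // Only publish when someone is listening and at least one count exists.
    if (tokenCh.hasSubscribers &&
        (inputTokens !== undefined || outputTokens !== undefined)) {
      tokenCh.publish({ inputTokens, outputTokens })
    }

    return deserialize.apply(this, [response, ...args])
  }
}
```

In this sketch, a subscriber on the channel (for example, an LLMObs plugin) receives `{ inputTokens, outputTokens }` whenever a wrapped deserializer sees the headers. How the wrapper is attached to `InvokeModelCommand` in the SDK is left out here.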