feat: added support for audio timestamp understanding to Google Vertex #4061
Hi there, thank you for the contribution! As you may have seen, we recently shipped a 2.0 update to the https://x.com/aisdk/status/1866044262409765270 As part of this we moved to using the Vertex AI Gemini REST API instead of the Just looking briefly at the example on the page you linked it looks like the submitted audio would be handled as a file attachment, which we already have support for, so I am not sure we need the We would need unit tests for new logic, demo scripts in If this sounds like a lot we can put it in our feature request queue, please file an issue or link to one if it already exists. |
d27dc5f to d0afd31
@shaper I've updated the PR, it's only two edits to support this now! You can use it by passing |
Lgtm. We would need an example under |
@lgrammel I've added an example, updated the docs, and added a changeset file. Let me know if this is satisfactory! Thanks for your guidance 😎
Would love to see this one get in!
Hi, just a few small things, thanks for the continued work! Would like to help land this with you.
* Optional. Enables timestamp understanding for audio-only files.
* This is a preview feature.
*
* Available for the following models:
Instead of listing the supported models here, is there a page we can link to in vertex docs?
@@ -109,6 +109,7 @@ export class GoogleGenerativeAILanguageModel implements LanguageModelV1 {
this.supportsStructuredOutputs
? convertJSONSchemaToOpenAPISchema(responseFormat.schema)
: undefined,
audioTimestamp: this.settings.audioTimestamp,
Most of the time this.settings.audioTimestamp won't be defined, but as written this will then add audioTimestamp: undefined to every request.
Can we alter this to something like the below to avoid that?
...(this.settings.audioTimestamp && { audioTimestamp: this.settings.audioTimestamp })
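To illustrate the difference the suggestion makes, here is a minimal, self-contained sketch. The `Settings` and `GenerationConfig` interfaces and both `buildConfig*` functions are hypothetical stand-ins invented for this example, not the provider's actual types or code; only the two ways of setting `audioTimestamp` mirror the diff and the suggestion above.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Settings {
  audioTimestamp?: boolean;
}

interface GenerationConfig {
  temperature?: number;
  audioTimestamp?: boolean | undefined;
}

// Direct assignment, as in the diff: the key is always present,
// with the value undefined when the setting was not provided.
function buildConfigAlways(settings: Settings): GenerationConfig {
  return {
    temperature: 0,
    audioTimestamp: settings.audioTimestamp,
  };
}

// Conditional spread, as suggested: the key is only added
// when the setting is truthy.
function buildConfigConditional(settings: Settings): GenerationConfig {
  return {
    temperature: 0,
    ...(settings.audioTimestamp && {
      audioTimestamp: settings.audioTimestamp,
    }),
  };
}

const always = buildConfigAlways({});
const conditional = buildConfigConditional({});
const enabled = buildConfigConditional({ audioTimestamp: true });

console.log('audioTimestamp' in always); // true
console.log('audioTimestamp' in conditional); // false
console.log(JSON.stringify(enabled)); // {"temperature":0,"audioTimestamp":true}
```

Note that `JSON.stringify` drops properties whose value is `undefined`, so the two variants serialize identically when the setting is unset. The difference mainly shows up in object-level comparisons, such as strict equality assertions on the request arguments in unit tests. The `&&` form also omits the key when the setting is explicitly `false`.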
Optional. Enables timestamp understanding for audio files. Defaults to false.

This is useful for generating transcripts with accurate timestamps.
Only available for `gemini-1.5-pro-002` and `gemini-1.5-flash-002`.
Same comment as below re: is there a Vertex doc page we can link to rather than specifying the models here, just to simplify maintenance.
Thanks again!
Thanks for your help @shaper! How does the release process work? Any idea when this would be rolled out?
It is automated and triggered by one of our team members. I will publish this today and follow up when it's live, should be within an hour or so.
Changes to have the Google Vertex provider support audioTimestamp understanding:
Adds audioTimestamp to GoogleVertexSettings, which is passed to the GenerationConfig of the SDK.