Add video models + functions #814
why do we need this separate model?
Same as for video info (`Video` model). I can remove it from this PR 🤔
it's just a bit weird that we have ImageFile and Image (that contains only some basic metadata) 🤔
It was `VideoMeta` (and `ImageMeta`) before, but Dmitry was asked to rename these models here. I agree that having a `Video` (`Image`) model with just meta looks odd. I think you're right and we should inherit this model from `VideoFile` (`ImageFile`) to extend files with meta; then it will make sense. If not, what do you think about `VideoInfo` (and `ImageInfo`)?
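To make the inheritance proposal above concrete, here is a minimal sketch. It assumes simplified stand-in classes built on `dataclasses`; the field names (`source`, `path`, `width`, `height`, `fps`, `duration`) are hypothetical and are not taken from this PR or from DataChain's actual models.

```python
from dataclasses import dataclass


@dataclass
class VideoFile:
    """Hypothetical stand-in for a file reference (not DataChain's real class)."""

    source: str
    path: str


@dataclass
class Video(VideoFile):
    """The same file reference, extended with basic metadata (assumed fields)."""

    width: int = -1
    height: int = -1
    fps: float = -1.0
    duration: float = -1.0


# A Video is still a VideoFile, so file-level code keeps working on it.
video = Video(
    source="s3://bucket",
    path="clips/cat.mp4",
    width=1920,
    height=1080,
    fps=30.0,
    duration=12.5,
)
```

With this shape, a model that carries only metadata no longer exists on its own: the metadata always travels with the file it describes.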
Thinking out loud here, but should a frame be an `ImageFile`, and should `ImageFile` have a `to_ndarray` method?
I'll take a look at the notebook but my thought would be that you would want to be able to call something like `video.split_to_frame` with an optional start/end frame + optional destination path and DataChain would split the video into frames and upload them all to a bucket as images.
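As a rough illustration of the `video.split_to_frame` idea above, the sketch below only plans which frames would be written and where. The helper name `plan_frame_outputs`, its parameters, and the output naming scheme are all assumptions made for illustration; actual decoding and upload are left out.

```python
from pathlib import PurePosixPath
from typing import Optional


def plan_frame_outputs(
    video_path: str,
    total_frames: int,
    destination: str,
    start: int = 0,
    end: Optional[int] = None,
    ext: str = "jpg",
) -> list[tuple[int, str]]:
    """Return (frame index, output path) pairs for frames in [start, end)."""
    # Treat a missing or too-large end as "until the last frame".
    if end is None or end > total_frames:
        end = total_frames
    if start < 0 or start > end:
        raise ValueError("start must satisfy 0 <= start <= end")
    stem = PurePosixPath(video_path).stem
    base = destination.rstrip("/")
    # Hypothetical naming scheme: <video stem>_<zero-padded index>.<ext>
    return [(i, f"{base}/{stem}_{i:04d}.{ext}") for i in range(start, end)]


plan = plan_frame_outputs(
    "videos/cat.mp4",
    total_frames=100,
    destination="s3://bucket/frames/",
    start=10,
    end=13,
)
```

Here `plan` holds three entries, for frames 10, 11 and 12, each pointing at a path under `s3://bucket/frames`.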
The frames use-case could end up looking something like:
This can also be done by the `save_frames` method below, or we can add a new `upload_frames` method to upload images to the storage instead of saving them.
`save_frames` without upload breaks the promise of dataset reproducibility (thinking out loud again).
what format is default? does it support different formats?
Output format is taken from output file extension. See here.
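For readers without the linked code at hand, taking the format from the output file extension can be sketched as below. This is only an illustration under assumptions: the helper name `output_format` and its `default` parameter are hypothetical, and the PR's actual implementation may differ.

```python
from pathlib import PurePosixPath


def output_format(output_path: str, default: str = "") -> str:
    """Return the lowercase extension (without the dot) of the output file."""
    ext = PurePosixPath(output_path).suffix.lstrip(".").lower()
    # Fall back to the caller-supplied default when there is no extension.
    return ext or default


fmt = output_format("frames/frame_0001.JPG")
```

An explicit fallback like `default` is one way to make the behavior predictable when the output name has no extension.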
So the extension determines it? I wonder if we need to clarify this, or whether it will be kinda expected by end users 🤔
Updated:
Check warning on line 741 in src/datachain/lib/file.py
Codecov / codecov/patch
src/datachain/lib/file.py#L741