A tool that crafts personalized supercuts from videos, using Large Language Models to intelligently include only the content you're interested in.
- Automated extraction of time intervals relevant to a specified topic from video subtitles.
- Generation of a supercut video based on identified segments.
- Customizable output resolution and model selection for enhanced flexibility.
Before starting, ensure you have:
- Python 3.x installed.
- FFmpeg installed and set in your system's PATH.
- An OpenAI API key for GPT model usage.
- Clone the repository:
  git clone https://github.com/kalpit-S/Supercut-Creator.git
  cd Supercut-Creator
- Install the required Python packages:
  pip install -r requirements.txt
- Rename `config.ini.example` to `config.ini`.
- Update the configuration file with your paths for videos, subtitles, output directories, and FFmpeg installation.
- Set your OpenAI API key in a `.env` file or as an environment variable.
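For orientation, the configuration file follows standard INI syntax. The section and key names below are illustrative guesses, not the project's actual schema; defer to the shipped `config.ini.example` for the real keys:

```ini
; Illustrative sketch only - use config.ini.example as the authoritative template
[paths]
video_dir = ./videos
subtitle_dir = ./subtitles
output_dir = ./output
ffmpeg_path = /usr/bin/ffmpeg
```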
- Prepare Video and Subtitles:
  - Place your `.mp4` video and `.vtt` subtitle files in their respective directories as defined in `config.ini`.
  - Ensure the video and subtitle files are named identically.
- Execute the Script:
  python supercut_creator.py [video_name] [topic] [--output_file OUTPUT_FILE] [--config CONFIG_PATH] [--model MODEL_NAME] [--resolution RESOLUTION]
  Parameters:
  - `video_name`: Base name of the video file (without extension).
  - `topic`: Desired topic for the supercut.
  - `--output_file`: Optional. Name for the output video file (without extension). Default: `[videoname] super cut`.
  - `--config`: Optional. Path to the configuration file. Default: `config.ini`.
  - `--model`: Optional. GPT model choice. Default: `gpt-3.5-turbo-1106`.
  - `--resolution`: Optional. Desired output resolution in `WIDTH:HEIGHT` format.
- Access the Supercut:
  - The final supercut video will be stored in the designated output directory.
Creating a supercut from a video titled "Talk about LLMs", focusing on "Speeding up transformer models", and naming the output `transformer_speedup_supercut`:
python supercut_creator.py "Talk about LLMs" "Speeding up transformer models" --output_file transformer_speedup_supercut
Ensure the corresponding `.vtt` subtitle file is in your subtitles folder.
Distributed under the MIT License.
The Supercut Creator script simplifies the process of creating a themed supercut video. It works in several key stages:
- Reading Subtitles: The script starts by reading the VTT subtitle file of the target video.
- Cleaning Subtitles: Subtitles are cleaned and formatted to keep only the necessary text, reducing the token count sent to the LLM.
- Chunking Subtitles: Subtitles are chunked, considering token limits of LLMs and adding overlap for context.
- Querying LLM Model: Each chunk is sent to an LLM with the user-specified topic to find relevant segments.
- Timestamp Extraction: A follow-up LLM call converts the model's response into a structured JSON format, extracting precise time intervals.
- Cutting Segments: FFmpeg cuts the video segments based on the identified timestamps.
- Merging Segments: These segments are seamlessly concatenated to form the final supercut.
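The first few stages above can be sketched in miniature. The following is an illustrative Python sketch, not the project's actual code: it parses VTT cue timings into seconds and groups cues into overlapping chunks, approximating LLM token limits with a simple word budget:

```python
import re

# Matches a WebVTT timing line, e.g. "00:01:02.500 --> 00:01:05.000"
CUE_TIME = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)

def parse_vtt(text):
    """Return a list of (start_seconds, end_seconds, caption) tuples."""
    cues = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        m = CUE_TIME.match(line.strip())
        if m:
            h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
            start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
            end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
            # Caption text follows the timing line until a blank line.
            caption = []
            for nxt in lines[i + 1:]:
                if not nxt.strip():
                    break
                caption.append(nxt.strip())
            cues.append((start, end, " ".join(caption)))
    return cues

def chunk_cues(cues, max_words=300, overlap=2):
    """Group cues into chunks under a rough word budget, repeating the
    last `overlap` cues at the start of the next chunk for context."""
    chunks, current, words = [], [], 0
    for cue in cues:
        current.append(cue)
        words += len(cue[2].split())
        if words >= max_words:
            chunks.append(current)
            current = current[-overlap:]  # carry trailing cues over
            words = sum(len(c[2].split()) for c in current)
    if current:
        chunks.append(current)
    return chunks
```

The overlap keeps a segment that straddles a chunk boundary visible to the LLM in both chunks, at the cost of a little duplicated context.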
- The script offers extensive configurability, including input/output paths, video resolution, and LLM choice, through command-line arguments and a `config.ini` file.
- Ideal for content creators, educators, and researchers for thematic content compilation from lengthy videos.
- Demonstrates a practical application of combining NLP and video processing for automated content creation.
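The cutting and merging stages map onto two kinds of FFmpeg invocations: a stream-copy trim per segment, then a concat-demuxer merge. A hedged sketch that only builds the command lines (the helper names are illustrative, not from the repository):

```python
def ffmpeg_cut_cmd(video, start, end, out_part):
    """Command to copy one (start, end) segment without re-encoding."""
    return ["ffmpeg", "-y", "-ss", str(start), "-to", str(end),
            "-i", video, "-c", "copy", out_part]

def concat_list(parts):
    """Contents of the list file consumed by FFmpeg's concat demuxer."""
    return "".join(f"file '{p}'\n" for p in parts)

def ffmpeg_concat_cmd(list_path, output):
    """Command to concatenate the listed parts into the final supercut."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output]
```

Stream copy (`-c copy`) avoids re-encoding but snaps cuts to nearby keyframes; dropping it trades speed for frame-accurate cuts.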
- Cleaner User Feedback: Tidy up print statements and add time estimates.
- Support for SRT Subtitle Format
- Improved Prompting: Still room to improve performance through prompting / parameter tuning
- More Config File Options
- Adding Support for Google Gemini: They are currently giving 60 free requests per minute for Gemini Pro!
- Adding Support for Together AI API: Allows for the use of hundreds of open source models
- Relevancy Checks for Segments: Right before cutting the video segments, verify that each identified segment actually fits the topic (a smaller model may suffice here).
- Making a GUI: Making it easier to use
- Generating Own Subtitles with "Insanely-fast-whisper": Moving towards generating subtitles directly, no subtitle file needed.
- Support for Any Subtitle Format: Have the LLM generate extraction code on the fly for whatever subtitle format is supplied, producing output compatible with the existing functions.