How do you recommend short videos when all you have is the raw file and no metadata? The explosion of user-generated short videos demands a rethink of recommender systems (RecSys). With Generative AI, it is possible to build systems that surface both viral hits and hidden gems.
Presented at Google DevFest Jakarta 2024: https://gdg.community.dev/events/details/google-gdg-jakarta-presents-devfest-jakarta-2024/
Watch the 15-minute YouTube video for the full explanation: http://youtu.be/wN3T5NCTSAY?t=15744s
Feel free to give this project a star ⭐.
I'm using Ubuntu 24.04, Miniforge Python 3.11 and 2 x RTX 4090 for this. You can adjust accordingly. To replicate the experiments, you need:
- Google Cloud CLI ( https://cloud.google.com/sdk/docs/install#deb )
- FFmpeg with NVIDIA GPU support; works on Ubuntu 24.04, but has issues on 24.10 ( https://docs.nvidia.com/video-technologies/video-codec-sdk/12.0/ffmpeg-with-nvidia-gpu/index.html )
- Gemini API Key ( https://aistudio.google.com/apikey )
pip install -r requirements.txt
The dataset download below requires 31 GB of disk space.
gsutil -m cp -r \
"gs://shorts-hdr-dataset/videos/sdr" \
.
Run convert.sh in the SDR folder. This compresses every video and reduces it to 1 FPS (frame per second), shrinking the dataset to 2.7 GB.
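The contents of convert.sh are not reproduced here, so the following is only a rough Python sketch of the same idea: re-encode each video at 1 FPS with a high compression setting. The encoder (libx264), the CRF value, and the function names build_ffmpeg_command and convert_folder are assumptions for illustration, not the script's actual settings.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path, fps: int = 1, crf: int = 30) -> list[str]:
    """Build an ffmpeg command that re-encodes src at a low frame rate."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", str(src),       # input video
        "-vf", f"fps={fps}",  # resample to the target frame rate
        "-c:v", "libx264",    # software H.264 encoder (assumed, not from convert.sh)
        "-crf", str(crf),     # higher CRF means a smaller file and lower quality
        str(dst),             # output video; audio uses ffmpeg's defaults
    ]


def convert_folder(src_dir: Path, dst_dir: Path) -> None:
    """Convert every .mp4 in src_dir and write the results to dst_dir."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.mp4")):
        cmd = build_ffmpeg_command(src, dst_dir / src.name)
        subprocess.run(cmd, check=True)  # raises if ffmpeg exits with an error
```

Calling convert_folder(Path("sdr"), Path("small")) would process the downloaded folder; the output folder name "small" is a placeholder. An NVIDIA hardware encoder could replace libx264, but it uses different quality flags than -crf.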
Gemini reads the videos from Google Cloud Storage (GCS), so upload the converted files to your bucket:
gsutil -m cp -r * gs://YOUR_BUCKET_NAME/small/
Open shorts_recommendation.ipynb.
The result:
    file                 title                                          cosine_similarity
2   SDR_Hobby_v4op.mp4   Serene River Flowing Through a Mountain Gorge  1.000000
43  SDR_Hobby_rdby.mp4   Fishing Perch                                  0.637745
19  SDR_Animal_svq5.mp4  Orange and White Kitten Exploring Grass        0.616769
15  SDR_Hobby_rl1n.mp4   Northern Pike Caught on Lure                   0.578450
27  SDR_Health_1p77.mp4  Nature's Consequences                          0.573209
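The cosine_similarity column suggests that each video is compared to a query video through vector similarity. The notebook's actual code is not shown here, so this is a minimal sketch assuming every video has already been turned into an embedding vector. The toy vectors, file names, and the rank_similar helper are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_similar(query_file, embeddings, top_k=5):
    """Return the top_k (file, score) pairs most similar to query_file."""
    query = embeddings[query_file]
    scored = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional embeddings; real ones would be much larger.
toy_embeddings = {
    "river.mp4": [0.9, 0.1, 0.0],
    "fishing.mp4": [0.7, 0.3, 0.1],
    "cooking.mp4": [0.0, 0.2, 0.9],
}

ranking = rank_similar("river.mp4", toy_embeddings, top_k=3)
for name, score in ranking:
    print(f"{name}\t{score:.6f}")
```

As in the table above, the query video ranks first with a similarity of about 1.0 because it is compared with itself; the remaining rows are the actual recommendations.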
You can play around with final_df.csv by loading it into the notebook.
By @yodiaditya. Happy to connect with you on LinkedIn and GitHub!