Indexing 669 GB of Video Locally on Apple Silicon
How developers are leveraging open-source ML models to analyze, search, and edit massive personal video archives locally.
Local machine learning has moved from a niche hobby to a practical tool for developers who need to process large datasets on consumer hardware. As open-source models grow more efficient, developers are increasingly bypassing cloud-based APIs entirely. Instead, they index hundreds of gigabytes of personal video directly on their own workstations, which keeps the data private and avoids recurring subscription fees.
A recent project by developer Ilias Haddad highlights this shift. Haddad indexed 628 GoPro videos, totaling 668.68 GB and over 15 hours of raw footage, on an M1 Max Mac. By combining local ML models with downscaling and frame sampling, the pipeline not only makes the footage searchable but also integrates directly with professional editing workflows.
Optimizing the Video Pipeline
Processing raw, high-resolution action camera footage frame-by-frame is incredibly compute-intensive. To make local indexing feasible on a workstation, the pipeline relies on aggressive but practical optimization techniques.
First, the pipeline downscales each video frame to 720p. Because ML models for object detection and scene classification do not require pristine 4K or 5K resolutions to achieve high accuracy, downscaling drastically reduces memory bandwidth and processing overhead.
Second, the system avoids analyzing every frame of the video. Instead, the frame analysis pipeline treats each second of footage as its own scene and samples one frame per second (1 fps). This temporal downsampling reduced the total workload for Haddad's archive to 57,537 keyframes, which at one frame per second corresponds to just under 16 hours of footage.
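The two steps above (downscale to 720p, keep one frame per second) can be sketched as a single ffmpeg invocation. The source does not say which tool Haddad's pipeline uses to extract frames, so ffmpeg, the function name, the output naming pattern, and the sample file name below are all assumptions made for illustration.

```python
from pathlib import Path


def build_keyframe_command(video: Path, out_dir: Path,
                           height: int = 720, fps: int = 1) -> list[str]:
    """Build an ffmpeg argument list that samples and downscales frames.

    Hypothetical helper: the article does not describe the real extraction step.
    """
    # fps=N keeps N frames per second of footage; scale=-2:H resizes to
    # H pixels tall and picks an even width that preserves the aspect ratio.
    video_filter = f"fps={fps},scale=-2:{height}"
    # One numbered JPEG per sampled frame, e.g. GX010001_000001.jpg
    pattern = out_dir / f"{video.stem}_%06d.jpg"
    return [
        "ffmpeg",
        "-i", str(video),
        "-vf", video_filter,
        "-q:v", "3",  # JPEG quality (lower is better)
        str(pattern),
    ]


# Example with a made-up GoPro-style file name
cmd = build_keyframe_command(Path("GX010001.MP4"), Path("frames"))
print(" ".join(cmd))
```

If ffmpeg is installed and the output directory exists, the list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.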
Hardware Realities: Apple Silicon vs. Dedicated GPUs
While local execution offers privacy and zero API costs, it demands a realistic look at hardware performance. On the M1 Max, analyzing those 57,537 frames took a total compute time of 67 hours, 40 minutes, and 42 seconds.
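To put those figures in perspective, a quick calculation gives the average time per frame and how the compute time compares with the length of the footage. The only assumption is that each keyframe stands for exactly one second of video, as the 1 fps sampling rate implies.

```python
from datetime import timedelta

frames = 57_537
compute = timedelta(hours=67, minutes=40, seconds=42)

# Average wall-clock cost of analyzing one keyframe
seconds_per_frame = compute.total_seconds() / frames

# Assumes exactly one sampled frame per second of footage
footage = timedelta(seconds=frames)

print(f"{seconds_per_frame:.2f} s per frame")          # 4.23 s per frame
print(f"footage length: {footage}")                    # footage length: 15:58:57
print(f"{compute / footage:.2f}x slower than real time")  # 4.23x slower than real time
```

In other words, the M1 Max spent a little over four seconds of compute on each second of footage.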
Although Apple Silicon's unified memory architecture is highly capable of handling large models, raw execution speed for sequential frame-by-frame inference remains a bottleneck. Haddad noted that a dedicated NVIDIA GPU, such as an RTX 3060 with 12GB of VRAM, completed similar tasks significantly faster than the M1 Max.
For developers who want to run open-source models themselves but lack high-end local hardware, pay-as-you-go cloud GPU platforms such as RunPod have emerged as a popular middle ground. They let developers spin up a powerful instance temporarily to get through a large initial indexing job, then shut it down.
Bridging AI and Creative Workflows
What makes this local indexing approach particularly compelling is its integration into professional creative suites. Rather than simply generating a static text index, the pipeline connects directly to DaVinci Resolve. Once the local ML models identify specific moments, such as interesting segments of a cycling journey, the best clips are sent straight to the DaVinci Resolve timeline for editing.
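The source does not describe how per-frame results become clips, so the following is only a minimal sketch of one plausible step: merging the seconds that matched a query into contiguous time ranges an editor could use. The function name, the `max_gap` and `min_length` parameters, and the sample data are all hypothetical.

```python
def group_into_clips(matching_seconds, max_gap=1, min_length=3):
    """Merge per-second matches into (start, end) ranges in seconds.

    The end of each range is exclusive. Gaps of up to `max_gap` unmatched
    seconds are bridged, and ranges shorter than `min_length` are dropped.
    """
    clips = []
    for second in sorted(set(matching_seconds)):
        if clips and second - clips[-1][1] <= max_gap:
            # Close enough to the previous range: extend it
            clips[-1][1] = second + 1
        else:
            # Too far away: start a new range covering this second
            clips.append([second, second + 1])
    return [(start, end) for start, end in clips if end - start >= min_length]


# Seconds (from the start of one video) whose keyframe matched a query
hits = [10, 11, 12, 14, 30, 31, 90, 91, 92, 93]
clips = group_into_clips(hits)
print(clips)  # [(10, 15), (90, 94)]
```

Here the single missing second at 13 is bridged, and the two-second match at 30 to 31 is discarded as too short. Ranges like these would still need to be converted into whatever clip description the editor's scripting interface expects.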
This project is part of a growing trend of developers building custom, local media-management tools. Other creators have built similar pipelines, such as the Framedex project, which was designed to index a year's worth of personal video locally. As open-source models improve in both speed and accuracy, the barrier to running sophisticated, private media search engines on consumer hardware will keep falling.
Sources & further reading
- I indexed 669 GB of my GoPro videos using my M1 Max computer and local ML models — news.ycombinator.com
Mariana covers the fast-moving world of machine learning and generative AI, with a particular focus on how these technologies are reshaping development workflows. When she isn't stress-testing the latest foundation models, she's usually at a local hackathon.
Discussion 2
i'd love to see a breakdown of the actual search and edit performance on that 668.68 GB dataset, not just the indexing time - how does it handle queries, and what kind of latency are we talking about?
@ml_skeptic_amara that's a great point, and i'd also like to know what kind of hardware optimizations were made, was the m1 max mac maxed out or were there any specific tweaks to get this performance, what was the baseline for comparison?