Ancient way of coding helps boost popular video encoder by 100x — but is it too good to be true?

  • FFmpeg’s biggest speedup yet affects only one function few people will have heard of
  • Handwritten Assembly makes a comeback in a niche filter that most users will never even touch
  • AVX512 gives FFmpeg an absurd 100x gain – but only if your CPU supports it

The FFmpeg project, known for powering some of the most widely used video editing software and media tools, is making headlines again.

Developers claim to have achieved what they call “the biggest speedup so far,” delivering a 100x performance gain in a recent update.

The catch? It only applies to a single, obscure function, and the means of achieving it is raising eyebrows: handwritten Assembly code, a technique most of today’s developers regard as outdated.

Assembly coding sparks both nostalgia and skepticism

Assembly language, once essential for getting the most out of limited hardware in the 1980s and 1990s, has become a niche practice.

Yet FFmpeg developers continue to rely on it for extreme optimization, calling themselves “assembly evangelists.”

In their latest patch, they wrote a handwritten version of a single function, rangedetect8_avx512, using AVX512 instructions, part of a modern SIMD (Single Instruction, Multiple Data) toolkit that lets a CPU apply one operation to many pieces of data at once.

On systems without AVX512 support, the AVX2 variant still delivers a 65.63x speedup.
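To give a sense of the kind of loop involved, here is a minimal scalar sketch in C. It assumes, going only by the function's name, that the routine finds the smallest and largest value in a buffer of 8-bit samples; the names `sample_range` and `range_detect8_scalar` are hypothetical and this is not FFmpeg's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: smallest and largest 8-bit sample seen. */
typedef struct {
    uint8_t min;
    uint8_t max;
} sample_range;

/* Scalar reference loop (an assumption about what "range detect" means):
 * two comparisons per byte, one byte at a time. A SIMD version would
 * instead load 32 bytes (AVX2) or 64 bytes (AVX512) per step and use
 * packed min/max instructions on the whole register. */
static sample_range range_detect8_scalar(const uint8_t *data, size_t n)
{
    assert(data != NULL && n > 0);
    sample_range r = { 255, 0 };
    for (size_t i = 0; i < n; i++) {
        if (data[i] < r.min) r.min = data[i];
        if (data[i] > r.max) r.max = data[i];
    }
    return r;
}
```

A loop like this does very little work per byte, which is why widening each step from one byte to 64 can pay off so dramatically when the compiler does not vectorize it on its own.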

As the team points out, “It’s a single function that’s now 100x faster, not the whole of FFmpeg.”

This news follows a similar boost reported in November 2024, where another patch brought certain operations up to 94x faster.

In that case, part of the performance gap stemmed from mismatched filter complexity: the generic C version used an 8-tap convolution, while the SIMD version used a simpler 6-tap approach, so the two were not doing the same amount of work.
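The tap mismatch is easy to quantify. The sketch below is a generic N-tap filter in C, not the code from that patch; `fir_filter` and its parameters are hypothetical, and it simply counts how many multiply-adds each tap count costs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical N-tap 1D filter. Writes n_out results to dst and returns
 * the number of multiply-add operations performed.
 * src must hold at least n_out + n_taps - 1 samples. */
static size_t fir_filter(const int16_t *src, size_t n_out,
                         const int16_t *taps, size_t n_taps,
                         int32_t *dst)
{
    assert(src != NULL && taps != NULL && dst != NULL && n_taps > 0);
    size_t macs = 0;
    for (size_t x = 0; x < n_out; x++) {
        int32_t acc = 0;
        for (size_t k = 0; k < n_taps; k++) {
            acc += (int32_t)src[x + k] * taps[k];
            macs++;
        }
        dst[x] = acc;
    }
    return macs;
}
```

For the same number of output samples, a 6-tap filter performs 6/8 of the multiply-adds of an 8-tap one, so a quarter of the arithmetic disappears before SIMD is even considered.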

Even compiling the C version in release mode with a better compiler like Clang could close over 50% of the gap, suggesting that some of the claimed speed gains may have been exaggerated by comparing worst-case with best-case conditions.

“Register allocator sucks on compilers,” the devs quipped on social media, highlighting compiler inefficiencies.

Despite the caveats, this renewed focus on low-level coding has sparked fresh conversations around performance optimization.

FFmpeg powers everything from VLC Media Player to countless YouTube downloader tools, so even small improvements in isolated filters can ripple through widely used software.

However, such results are often difficult to replicate or to extend to broader parts of the codebase.

While these deep optimizations are impressive, they may not translate into real-world improvements for everyday users editing footage.

Unless other core functions receive similar treatment, the promise of a faster FFmpeg might remain limited to technical benchmarks.
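A rough way to see why is Amdahl's law, which relates a speedup in one part of a program to the speedup of the whole. The C sketch below uses a hypothetical helper, `overall_speedup`, and the 1% share in the comment is purely illustrative, not a measured figure for FFmpeg.

```c
#include <assert.h>

/* Amdahl's law: if a fraction p of total runtime is made s times faster,
 * the whole program speeds up by 1 / ((1 - p) + p / s).
 * Illustration only: with p = 0.01 and s = 100, the result is about 1.01,
 * i.e. roughly a 1% overall gain. */
static double overall_speedup(double p, double s)
{
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}
```

Under that illustrative assumption, a function that runs 100x faster barely moves the total, which matches the developers' own point that it is one function, not the whole of FFmpeg.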

Via Tom's Hardware
