FFmpeg Sees 94x Performance Boost with Handwritten AVX-512 Code

TuxBot · 18 hours ago

FFmpeg Sees 94x Performance Boost with Handwritten AVX-512 Code

Τρομοκρατιστής ✭@kafeneio.social · 17 hours ago

@duval AVX2 is enough to get that kind of speed through SIMD, or a smaller gain just from the 16 extra registers. Today all x86 CPUs support AVX2 except the Intel Atom family (also sold as Celeron after their first failure) @tuxbot

Clément Duval@todon.eu · 17 hours ago

@tromo @tuxbot then why this article? worse: “the complexity of AVX-512 means that such optimizations are generally limited to specific applications and require specialized knowledge of low-level programming.”
so anyway, if I really got serious and scaled up this might matter, but my limitations are in setting up the job. Since I do lots of small clips, I am always the limiting factor, slowing everything down.
They have not found a fix for this, and I refuse medication.

Τρομοκρατιστής ✭@kafeneio.social · 16 hours ago

@duval Also, SIMD is like multithreading inside one thread, in the same cycles… it can execute the same operation on 16 different addresses or registers… Read Wikipedia about this feature, which comes from mainframes. As I said, you don't need to use it. @tuxbot
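A rough sketch of "the same operation on 16 values", assuming GCC or Clang: the `vector_size` attribute is a compiler extension rather than standard C, and `v16f` and `add16` are made-up names for illustration. Whether the single `+` really becomes one 512-bit instruction depends on the target flags; without AVX-512 enabled the compiler splits it into narrower operations.

```c
#include <assert.h>
#include <string.h>

/* 16 lanes x 4-byte float = 64 bytes = 512 bits, the width of one
   AVX-512 register. vector_size is a GCC/Clang extension. */
typedef float v16f __attribute__((vector_size(64)));

/* Hypothetical helper: adds 16 floats from a to 16 floats from b. */
void add16(const float *a, const float *b, float *out)
{
    v16f va, vb, sum;

    assert(a && b && out);

    memcpy(&va, a, sizeof va);      /* load 16 values */
    memcpy(&vb, b, sizeof vb);      /* load 16 more */
    sum = va + vb;                  /* one "+" applied to all 16 lanes */
    memcpy(out, &sum, sizeof sum);  /* store the 16 results */
}
```

Calling `add16(a, b, out)` on three 16-element float arrays fills `out` with the element-wise sums, the same result as a 16-iteration scalar loop.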

Τρομοκρατιστής ✭@kafeneio.social · 17 hours ago

@duval I didn't read it, I boosted it for the optimization you can get with AVX even without writing SIMD instructions yourself (which needs low-level programming). Take a C program and compile it with -mavx2 -O: this lets the compiler use the 16 256-bit YMM registers when it optimizes. That means far fewer accesses to/from memory. @tuxbot
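A minimal sketch of that experiment, assuming GCC on x86-64; `scale_samples` and `built_with_avx2` are hypothetical names, not anything from FFmpeg. The loop is plain C and the only difference between the two builds is the flag. Note that GCC auto-vectorizes at `-O3` (recent releases also do so at `-O2`), and that a binary built with `-mavx2` dies with an illegal-instruction error on a CPU without AVX2.

```c
#include <assert.h>
#include <stddef.h>

/* Compare the assembly generated by the two builds:
     gcc -O3 -S scale.c          (baseline x86-64, SSE2 only)
     gcc -O3 -mavx2 -S scale.c   (compiler may use 256-bit AVX2 instructions) */

/* Hypothetical helper: multiplies n samples by gain. */
void scale_samples(float *restrict dst, const float *restrict src,
                   float gain, size_t n)
{
    assert(dst != src);  /* restrict promises the buffers do not overlap */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * gain;
}

/* Returns 1 when this file was compiled with AVX2 enabled, else 0. */
int built_with_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}
```

The results are identical either way; only the instructions chosen by the compiler change, so any speedup has to be measured on the actual workload.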

Clément Duval@todon.eu · 16 hours ago

@tromo @tuxbot I used to compile my own ffmpeg, and that would be a good experiment. I also used to compile liquidsoap, but I don't bother with either lately, since my specific problems are now resolved well enough.

Τρομοκρατιστής ✭@kafeneio.social · 16 hours ago

@duval btw, AVX-512 is too much, let's call it AVX3 with 512-bit registers… forget it… AVX2 is more than enough @tuxbot

Clément Duval@todon.eu · 16 hours ago

@tromo @tuxbot I could probably compile in the optimization, but why bother?

Τρομοκρατιστής ✭@kafeneio.social · 16 hours ago

@duval for speed… @tuxbot

Clément Duval@todon.eu · 16 hours ago

@tromo @tuxbot ☕ yeah right
