@duval AVX2 is enough to get that kind of speed through SIMD, or a smaller gain just from the 16 extra registers. Today all x86 CPUs support AVX2 except the Intel Atom family (also sold as Celeron after their first failure). @tuxbot
@tromo @tuxbot then why this article? Worse: “the complexity of AVX-512 means that such optimizations are generally limited to specific applications and require specialized knowledge of low-level programming.”
So anyway, if I really got serious and scaled up, this might matter, but my bottleneck is setting up the job. Since I do lots of small clips, I am always the limiting factor, slowing everything down.
They have not found a fix for this, and I refuse medication.
@duval Also, SIMD is like multithreading inside one thread, in the same cycles: it can execute the same operation on 16 different addresses or registers at once… Read Wikipedia about this feature, which comes from mainframes. As I said, you don't need to use it. @tuxbot
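A minimal sketch of the "same operation on 16 values at once" idea, written with the GCC/Clang vector extension rather than AVX2 intrinsics. The type name `v16i16` and the function `add16` are hypothetical names chosen for illustration. Whether the `+` becomes a single AVX2 instruction depends on the compiler and on flags such as `-mavx2`; without them the compiler is free to split it into narrower operations.

```c
#include <stdint.h>
#include <string.h>

/* GCC/Clang vector extension: 16 lanes of 16-bit integers = 256 bits,
   which is the width of one AVX2 (YMM) register. */
typedef int16_t v16i16 __attribute__((vector_size(32)));

/* Add 16 pairs of 16-bit values with one vector '+'.
   'add16' is a hypothetical name, not part of any library. */
void add16(const int16_t *a, const int16_t *b, int16_t *out)
{
    v16i16 va, vb, vc;

    memcpy(&va, a, sizeof va);    /* load 16 values from a */
    memcpy(&vb, b, sizeof vb);    /* load 16 values from b */
    vc = va + vb;                 /* one operation applied to all 16 lanes */
    memcpy(out, &vc, sizeof vc);  /* store the 16 results */
}
```

To try it, call `add16` from your own `main` with three arrays of 16 `int16_t` values each; the third array receives the element-wise sums.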
@duval I didn't read it; I boosted it because of the optimization you can get with AVX even without writing SIMD instructions yourself (which needs low-level programming). Take a C program and compile it with -mavx2 -O: this makes the 16 256-bit YMM registers available, and the compiler will use them when optimizing. That means far fewer accesses to and from memory. @tuxbot
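As a rough illustration of letting the compiler do the work, here is a plain C loop with no intrinsics. The function name `mix` and the build commands in the comment are assumptions for this sketch, not something from the thread. Note that GCC does not auto-vectorize at plain `-O`; it generally needs `-O3` (or `-O2` on recent GCC versions), so the sketch uses `-O3` alongside `-mavx2`.

```c
#include <stddef.h>

/* Hypothetical example: out[i] = gain * a[i] + b[i].
 *
 * Possible build commands (assuming the file is saved as mix.c):
 *   gcc -O3 -mavx2 -c mix.c    compile with AVX2 enabled
 *   gcc -O3 -mavx2 -S mix.c    emit assembly; look for ymm registers
 *
 * 'restrict' promises the compiler that the arrays do not overlap,
 * which makes it easier for the optimizer to vectorize the loop.
 */
void mix(float *restrict out, const float *restrict a,
         const float *restrict b, float gain, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = gain * a[i] + b[i];
}
```

If the generated assembly uses `ymm` registers inside the loop, the compiler has vectorized it with AVX; if it only shows `xmm` scalar instructions, it has not, and the speedup will be correspondingly smaller.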
@tromo @tuxbot I used to compile my own ffmpeg, and that would be a good experiment. I also used to compile liquidsoap, but I don’t bother with them lately, since my specific problems are now resolved well enough.
@duval btw, AVX-512 is too much, call it AVX3 with 512-bit registers… forget it… AVX2 is more than enough @tuxbot
@tromo @tuxbot I could probably compile in the optimization, but why bother?
@duval for speed… @tuxbot
@tromo @tuxbot ☕ yeah right