The Art of SIMD Programming by Sergey Slotin

  • Added Sep 8, 2022
  • Modern hardware is highly parallel, but not only in terms of multiprocessing. There are many other forms of parallelism that, if used correctly, can greatly boost program efficiency - and without requiring more CPU cores. One such type of parallelism actively adopted by CPUs is "Single Instruction, Multiple Data" (SIMD): a class of instructions that can perform the same operation on a block of 16, 32, or 64 bytes of data in one go, yielding a proportional speedup over scalar code.
    While SIMD shares many similarities with classic multiprocessor computing, it is quite different and often requires creative use of the instruction set. In this talk, we will give a general introduction to the technology (focusing on x86/AVX2), derive and implement several state-of-the-art SIMD algorithms, and discuss their use in impactful open-source projects.
    skillsmatter.com/skillscasts/...
  • Science & Technology
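To make the description's "same operation on a block of bytes" concrete, here is a minimal sketch in C. It is not taken from the talk: it assumes a GCC- or Clang-compatible compiler and uses their portable vector extension instead of AVX2 intrinsics, and the names `v8si`, `sum_scalar`, and `sum_simd` are hypothetical. When built with AVX2 enabled (for example `-mavx2`), the compiler can typically turn `acc += chunk` into a single 256-bit addition; without it, the same source is lowered to narrower instructions.

```c
#include <stddef.h>
#include <string.h>

/* Eight 32-bit ints packed into one 32-byte vector
   (GCC/Clang vector extension, assumed to be available). */
typedef int v8si __attribute__((vector_size(32)));

/* Scalar baseline: one addition per element. */
int sum_scalar(const int *a, size_t n) {
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* SIMD version: eight additions per vector operation. */
int sum_simd(const int *a, size_t n) {
    v8si acc = {0, 0, 0, 0, 0, 0, 0, 0};
    size_t i = 0;

    /* Main loop: process full blocks of 8 ints. */
    for (; i + 8 <= n; i += 8) {
        v8si chunk;
        memcpy(&chunk, a + i, sizeof chunk); /* load without alignment assumptions */
        acc += chunk;                        /* element-wise add on all 8 lanes */
    }

    /* Horizontal sum of the 8 lanes. */
    int s = 0;
    for (int k = 0; k < 8; k++)
        s += acc[k];

    /* Tail: the remaining n % 8 elements, handled one by one. */
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

Both functions return the same result for inputs that do not overflow `int`; the vector version just groups the additions by lane before combining them.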

Comments • 5

  • @yuangchen905
    @yuangchen905 1 year ago +3

    Great video. Thank you very much for your enlightening example and insightful explanation!

  • @Roxas99Yami
    @Roxas99Yami 1 year ago +1

    Thanks, much appreciated, especially the examples in C. Is this directly compatible with Cython?

  • @martingeorgiev999
    @martingeorgiev999 1 year ago +3

    I don't understand why these architecture-specific instructions are not used automatically by gcc at -O3.

    • @bouazzase4202
      @bouazzase4202 1 year ago +9

      They are, when you pass the -march= argument; otherwise the compiler doesn't know which instruction sets are allowed and falls back to a default (usually baseline x86-64, without AVX).
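The reply above can be checked with a small experiment. The following is a sketch, assuming GCC or Clang on x86-64; the function names `add_arrays` and `built_with_avx2` are hypothetical. The loop is ordinary C with no intrinsics, and the predefined macro `__AVX2__` reports whether the compiler was allowed to use AVX2 for this build.

```c
#include <stddef.h>

/* A plain loop with no intrinsics. The restrict qualifiers promise the
   compiler that the arrays do not overlap, which makes it a good
   candidate for auto-vectorization.

   Possible builds to compare (inspect the assembly with -S):
     gcc -O3 -S file.c                 baseline x86-64, so at most 128-bit SSE2
     gcc -O3 -march=native -S file.c   may use AVX/AVX2 if the build machine has it
     gcc -O3 -march=haswell -S file.c  targets a specific AVX2-capable CPU */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Returns 1 if this translation unit was compiled with AVX2 enabled
   (for example via -mavx2 or a suitable -march=), 0 otherwise. */
int built_with_avx2(void) {
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}
```

Note that a binary built with `-march=native` may fail with an illegal-instruction error on an older CPU, which is why compilers default to a conservative baseline.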