Data Orientation For The Win! - Eduardo Madrid - CppCon 2021

CppCon

13,053 views

Added May 31, 2024
cppcon.org/
github.com/CppCon/CppCon2021
---
C++ conferences have had presentations showing the important performance benefits of data-oriented design principles; however, the principles seem to require lots of "manual" effort and "code uglification"; these make the principles less practical, and there haven't been clear recommendations about how to deal with runtime-polymorphic types.
In this talk we will recapitulate data orientation principles and their benefits, showing their application through production-strength Generic Programming components made to support them.
Specific examples include:
1. Structures of arrays instead of arrays of complex structures (a.k.a. "scattering")
2. Support for data oriented designs for runtime-polymorphism without inheritance+virtual (the equivalent of using std::variant or std::function, but generalized as allowed by the Zoo type-erasure framework)
----1. Hybrid buffers: the equivalent of the virtual table pointer is scattered out of the objects solving the "Goldilocks problem" of how big the local buffer should be, objects occupy the available space optimally
----2. Easy (de)serialization through very easy relocatability
----3. Voiding the need for pointers in favor of indices into arrays
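As a rough illustration of items 1 and 2.3 above (this is not code from the talk or from the Zoo library), the following minimal sketch stores a collection of orders as a structure of arrays and identifies each order by an index rather than a pointer. The names `OrderAoS`, `OrdersSoA`, `add`, and `notional` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record, shown only for contrast: the "array of structures"
// layout would be std::vector<OrderAoS>, with all fields interleaved.
struct OrderAoS {
    std::string symbol;
    double price;
    std::int32_t quantity;
};

// "Structure of arrays": each field is scattered into its own contiguous
// column. An order is named by its index into the columns, not by a pointer.
struct OrdersSoA {
    using Index = std::uint32_t;

    std::vector<std::string> symbol;
    std::vector<double> price;
    std::vector<std::int32_t> quantity;

    // Appends one order to every column and returns its index.
    Index add(std::string s, double p, std::int32_t q) {
        assert(price.size() < std::numeric_limits<Index>::max());
        symbol.push_back(std::move(s));
        price.push_back(p);
        quantity.push_back(q);
        return static_cast<Index>(price.size() - 1);
    }

    std::size_t size() const { return price.size(); }
};

// Reads only the price and quantity columns; the symbol strings are
// never touched by this loop.
double notional(const OrdersSoA &orders) {
    double total = 0.0;
    for (std::size_t i = 0; i < orders.size(); ++i) {
        total += orders.price[i] * orders.quantity[i];
    }
    return total;
}
```

An index returned by `add` stays meaningful after the vectors reallocate, and it would also survive copying the columns to disk and back, whereas a raw pointer into a vector would not. That property is one plausible reading of why the description pairs indices with easy relocatability.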
---
Eduardo Madrid
Eduardo has been working for many years on financial technologies, automated trading in particular, and other areas where performance challenges can be solved in C++. He contributes to open source projects and teaches advanced courses on Software Engineering with an emphasis on Generic Programming.
---
Videos Filmed & Edited by Bash Films: www.BashFilms.com
YouTube Channel Managed by Digital Medium Ltd: events.digital-medium.co.uk
Register Now For CppCon 2022: cppcon.org/registration/
Science & Technology

Comments • 20

@marthacolmenares 2 years ago ⁺⁷
Eduardo has been working for many years in financial technologies, in particular in automated trading, and I am very pleased to attend his lecture for instructing in a clear, didactic and brilliant way a cutting-edge topic. We all should know about this.
@cppmsg 8 months ago ⁺¹
Good talk 😀 to start understanding "Data Orientation", which I had struggled to find as applicable to C++ elsewhere on YouTube. I might call the general technique "Data Layout Orientation" or "Columnar Data Layout Orientation", which I would have immediately understood due to my database understanding.
@kimberleemodel7182 2 years ago ⁺⁷
Loved this talk!!! (Loved Mike Acton's 2014 talk too). As much as possible, I try to do data orientation too, although I tend to work with trees and graphs, rather than tabular/columnar data. Funny thing is that half the time, the first-principles, data-oriented best way to work with the data is the same thing you'd get from object-oriented design, just now with pool allocators. Another thing I find working with trees/graphs from data-oriented and first principles is that heterogeneous nodes (rather than homogenizing them with inheritance) make it easy to decouple my algorithms and data (sometimes this is even arguably the right thing to do from a SOLID principles and object-oriented point of view).
@slimnet04 5 months ago
great talk, thank you :)
@user-ol1qp5xg3j 7 months ago
At 37:52 you're essentially dereferencing a pointer. Do we assume the next skuHandle will also be in the cache line for the next operation?
@AinurEru 2 years ago ⁺¹¹
There should be a better version of this talk that is more coherent and digestible
@andrewmccommons9212 2 years ago ⁺¹
This was a great talk!
@GrandpaRanOverRudolf 10 months ago ⁺²
this guy should do rust 😉
@syntaxed2 2 years ago ⁺¹
OpenGL is an excellent API?
@MrAbrazildo 2 years ago ⁺²
Is the primary goal of your zoo lib to make OO faster?
17:35, _"Only use unsigned types if you gonna use all its bits. There's a low level optimization for signed types"_ - Chandler Carruth.
23:23, déjà vu. 30:24, again!
27:27, does that mean inheritance is not a bad thing, if not using actual virtual tables?
31:30, about how much performance?
33:27, ooooouch! That hurts security so badly! Getting rid of virtual is ok; getting rid of inheritance will damage code understandability/elegance/practice, although feasible; but losing encapsulation is not acceptable. That's OO's main key feature. It saves us from lots of headaches.
@eduardomadrid2380 2 years ago ⁺³
The reason why I work on my Open Source library ("zoo") is to express and share ideas I don't have the opportunity to express/share in other ways. Fortunately, I work on a great team at my employer, Snap, where I have the freedom to express things and share them, so I don't have to resort to my open source library as much as before.
Sure, we should use unsigned types in production code only when we intend to do bit patterns with them, however, the needs for "slide-ware" are different, most things in C++ (and C) use unsigned types to denote sizes, in particular std::size_t. When doing a presentation the primary goal is understandability, if I put a signed integer, I might distract the viewer because it might look strange.
About encapsulation, one key point is that we ought to provide encapsulation to the user programmer so they don't have to concern with the details of implementation, but not in a way that prevents us, infrastructure developers, authors of the libraries to have access to the details. Also, if you are going to "scatter" the fields of objects, by definition you do not have the classical encapsulation.
C++ is a very feature-rich language, individually, the features are good, but combinations of them are problematic.
@jeeperscreeperson8480 2 years ago
Chandler who? The Intel 64/IA-32 optimization manual explicitly says that only when the value is unsigned can the CPU macrofuse cmp/test/add/sub and conditional jump instructions, which is particularly important for loops. And that applies to all modern Intel architectures since Core.
And loop counters are much much more important than sizes: sizes don't get updated that often, whereas the loop counter has to be updated possibly tens of thousands of times per frame.
@eduardomadrid2380 1 year ago
@@jeeperscreeperson8480 @MrAbrazildo accurately points out that, most other things equal, signed integers are to be preferred to unsigned, the reason is the technicality in the C and C++ rules that the arithmetic on integers is not allowed to over/under flow, this is "undefined behavior", because compilers are allowed to assume that "undefined behavior" does not happen, when attempting to optimize signed arithmetic the compiler would be able to prove some cases won't happen, to enable more optimization.
Unsigned operations have "implementation defined" semantics, in practice 2's complement arithmetic, thus the compiler is not allowed to make any assumption, then it optimizes less.
To your comment: if the optimizer is able to prove signed/unsigned does not matter for a particular piece of code, and unsigned arithmetic is better performant, then it can emit unsigned assembler even if the C++ types are signed! The compiler is more likely to be able to prove it if using signed integers.
@ABaumstumpf 1 year ago
@eduardomadrid2380 "thus the compiler is not allowed to make any assumption" ?? The compiler can make nearly the same assumption. It is just not allowed to throw away sanity-check code that checks for overflows etc.
"and unsigned arithmetic is better performant"
You do realise that there is not really a difference there as signed/unsigned does not exist at the hardware-level?
@a314 2 years ago ⁺²
Good talk. But in general, C++ is lagging behind heavily. What the world needs is "Design by Introspection" that will result in significant code and performance benefits.
@GrandpaRanOverRudolf 10 months ago
woo talks by people who know what they're talking about
@trejohnson7677 2 years ago
Impedance oriented design*
@awsumgeorge 2 years ago ⁺²
Actually, it makes sense to consider the OO-part of your program as a sort of database, from which you pull out the input data for some computation, arrange it in data-oriented way, do the computation on the efficient representation, and then put the result back into the "database".
@alexkfridges 2 years ago ⁺⁸
Good content, but unfortunately the presenter isn't very good at presenting material. His train of thought is hard to follow.
@marcbotnope1728 2 years ago ⁺⁴
Stop the sanewashing
