Warning: std::find() is Broken! - Sean Parent - CppCon 2021
- Added 3 June 2024
- cppcon.org/
github.com/CppCon/CppCon2021
---
We often take it for granted that calling one of the Standard algorithms will do something meaningful. For example, when invoking `position = find(first, last, value)` we expect that if an element equal to `value` is contained in the range `[first, last)` then `position` will point to the first such element; otherwise, `position` will equal `last`. But how do we know `find` will perform this operation? This talk explores requirements, guarantees, and domains, and we'll discover that maybe `find` doesn't.
---
Sean Parent
Adobe
Sean Parent is a senior principal scientist and software architect for Adobe’s Software Technology Lab (v2). Sean has been at Adobe since 1993 when he joined as a senior engineer working on Photoshop and later managed Adobe’s Software Technology Lab. In 2009 Sean spent a year at Google working on Chrome OS before returning to Adobe. From 1988 through 1993 Sean worked at Apple, where he was part of the system software team that developed the technologies allowing Apple’s successful transition to PowerPC.
---
Videos Filmed & Edited by Bash Films: www.BashFilms.com
Register Now For CppCon 2022: cppcon.org/registration/ - Science & Technology
My mentor asked me to read "Elements of Programming" by Stepanov. Being halfway through, this talk really enlightens me on what "all these mathematical complexities" have to do with writing code. Thank you.
IMO, it is obviously true to anyone who has done much programming that understanding why something doesn't work is not at all the same as understanding why something else does. It's quite common to spend a few hours fixing bugs, and get to the "end", finding that you understood each bug and how to remove it, but you're suspicious of being done, because you don't understand why your program actually works now.
I once wanted to use `remove_if` to delete some pointers and remove them from a vector. I read the online documentation and saw that this function does not give me all the guarantees I needed to solve my problem. Because of that I decided to write my own function that has all of them. I was wondering whether my decision was right (as I had effectively reinvented the wheel), but after this talk I think it was the correct solution: my requirements call for a different function than the one the standard provides, and relying on one implementation's behavior would only bring bugs with another implementation of the standard library or a different compiler version.
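A minimal sketch of what such a hand-written function could look like. It assumes the goal is to `delete` every pointee that matches a predicate and then drop those pointers from the vector. `Node` and `delete_and_erase_if` are hypothetical names, not taken from the comment. The sketch uses `std::partition` rather than `std::remove_if` because `remove_if` leaves the elements past the returned iterator in an unspecified state, so the pointers that need deleting may no longer be there.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element type, used for illustration only.
struct Node {
    int value;
};

// Deletes every Node for which pred(node) is true and erases its pointer.
// Assumes the vector owns its pointees and contains no null pointers.
template <class Pred>
void delete_and_erase_if(std::vector<Node*>& nodes, Pred pred) {
    // std::partition only swaps elements, so every original pointer is still
    // present afterwards; the ones to delete are gathered in the tail.
    auto tail = std::partition(nodes.begin(), nodes.end(),
                               [&](const Node* n) { return !pred(*n); });

    // Free the pointees in the tail, then drop the dangling pointers.
    std::for_each(tail, nodes.end(), [](Node* n) { delete n; });
    nodes.erase(tail, nodes.end());
}
```

`std::partition` does not preserve the relative order of the kept elements; if order matters, `std::stable_partition` would be the substitute.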
For the record, it is GCC's implementation of remove_if which passes the predicate by value to find_if.
Holy crap... this opens a can of worms!!! This is really good stuff... but OMG this is not something that we were told to look at! "Happens to Work" you know what else just "happens"? And "NAN" is "Not A NAN"... yep that stupid thing bit me about 2 weeks ago... Grrrr
I don't think a talk with a length of over an hour was necessary to introduce the weirdness of NaN.
I recall the term "Generic programming" going back much farther. How about Common Lisp? Clearly you refer to a particular formalism with that paper, but the base nomenclature including "generic function" and rough ideas were around for at least 15 years prior.
I guess that the assumption that nan("") is undefined made it prone to be NOT equal to anything, which made STL library implementers accept it in a sequence of double.
If the ideas extend to real-world algorithms, that would explain things like when I can't find my glasses even though they were right here a moment ago.
These talks are good, but I wish we could have a summary of everything. Finding the time to properly listen to a talk for 1 hour and 40 minutes is truly challenging.
It's 1am for me, so right now.
to sum up, c++ has once again poorly absorbed a foreign concept, because c++ concepts aren't c# interfaces, nor are they eiffel contracts. they're just concepts, you know. watch the software get easier and more fun to read, write and reason about.
So what is the correct way to use std::remove_if in his example, if any? And why does capture-by-reference violate the rule, "pred(u) must equal pred(*first)"? The arguments are not changing, only the captures?
This is obviously a contrived example, and std::remove_if should not be used. One should use std::find_if and erase what has been found. As to your other question, suppose the vector is [1,1]. When pred is called for the second "1", it yields a different result even though the input value is equal to *first, hence the standard's requirement is not met.
You simply cannot use remove_if for such a task.
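The find-then-erase suggestion can be sketched as follows, assuming the goal of the example is to drop only the first matching element. `erase_first_if` is a hypothetical helper name. The predicate stays stateless, so it returns the same answer for equal inputs no matter how often an implementation copies it.

```cpp
#include <algorithm>
#include <vector>

// Erases only the first element for which pred returns true.
// Returns true if an element was erased, false if nothing matched.
template <class T, class Pred>
bool erase_first_if(std::vector<T>& v, Pred pred) {
    auto pos = std::find_if(v.begin(), v.end(), pred);
    if (pos == v.end()) {
        return false;  // no match, vector left untouched
    }
    v.erase(pos);
    return true;
}
```

The "only the first" logic lives in the control flow of the helper instead of in mutable state hidden inside the predicate, which is what the requirement on `pred` rules out.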
I tried out the code you presented at around 32:35, both on Windows with VC2019 and on Ubuntu with g++. Without optimization it printed "null-reference", but with optimization it printed "valid". Interesting.
That’s actually what I’d expect. Without optimizations, it shouldn’t throw away code because you’ll likely want to step through with a debugger or something.
dereferencing a null ptr is undefined behavior so the compiler is allowed to ASSUME that you would never do that. one of the optimization steps goes through and applies that rule.
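The code from the talk is not reproduced in this thread, so the following is only a sketch of the well-defined alternative: test the pointer before any reference is formed from it. `describe` is a hypothetical function name.

```cpp
#include <string>

// Checks the pointer itself. No reference to *p is formed when p is null,
// so there is no undefined behavior for the optimizer to exploit.
std::string describe(const int* p) {
    if (p == nullptr) {
        return "null";
    }
    const int& r = *p;  // p is known to be non-null at this point
    return "valid: " + std::to_string(r);
}
```

A check written against the reference (for example comparing `&r` with `nullptr`) comes too late, because the dereference that created `r` was already undefined.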
One and a half hour of empty lecture with a wrong conclusion.
std::find is not broken. It does exactly what it says: "searches for an element equal to value".
std::find uses operator== to establish equality. The standard says that two NaNs are not equal to each other, and you can interpret that as two NaNs not representing the same entity. Why the standard does that is a whole different story. You can't find a NaN with std::find because the operation does not make sense: two NaNs do not represent the same entity. It has absolutely nothing to do with std::find.
You have to use std::find_if and std::isnan.
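A short sketch contrasting the two calls, assuming a `std::vector<double>` as the sequence. `found_by_equality` and `contains_nan` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Searches with std::find, which compares elements using operator==.
bool found_by_equality(const std::vector<double>& v, double value) {
    return std::find(v.begin(), v.end(), value) != v.end();
}

// Searches with std::find_if and std::isnan as the predicate.
bool contains_nan(const std::vector<double>& v) {
    return std::find_if(v.begin(), v.end(),
                        [](double x) { return std::isnan(x); }) != v.end();
}
```

With a NaN stored in the vector, `found_by_equality(v, std::nan(""))` reports no match because `NaN == NaN` is false, while `contains_nan(v)` reports a match.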
What I learned, _equality_ requires ∀a, b. a = b ⇒ f(a) ≡ f(b), where ≡ is equality of result and side-effects, and for any logical predicate P, ∀a, b. a = b ⇒ (P(a) ⇔ P(b)).
Only if the values 'a', 'b' are in the domain of operation of your function 'f'
@@dadisuperman3472 Well, yes, in mathematics, the notation f(x) implies that.
How you define “domain” in a programming language is not so clear. Especially in a typed language with exceptions, you'd model it via: the domain is the whole type, but for some values, the function “returns” (in the mathematical sense) an exception via throwing.
The usage of the word “domain” in this talk (as far as I remember) is not how mathematics uses it. A language specification like C++ can use the word for something else, but don't pretend it's the same when it's not.
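The property a = b ⇒ f(a) ≡ f(b) from the start of this thread can be checked for a single pair of values with a small sketch. It assumes IEEE 754 doubles, where `+0.0 == -0.0` holds; `f` and `substitutable` are hypothetical names chosen for illustration.

```cpp
#include <cmath>

// An example function f: reports whether the sign bit of x is set.
bool f(double x) {
    return std::signbit(x);
}

// Checks the implication  a == b  =>  f(a) == f(b)  for one pair of values.
bool substitutable(double a, double b) {
    return !(a == b) || f(a) == f(b);
}
```

For `0.0` and `-0.0` the comparison `a == b` is true but `f` gives different answers, so `substitutable(0.0, -0.0)` is false. NaN is not the only value for which `operator==` on `double` departs from equality in this sense.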
So C++ in general has a bit of a problem with how a few things like comparison are defined: their behaviour can be unintuitive or implementation-defined without that being clearly visible.
The language is getting more and more extensions that are completely unintuitive and you would have to know more and more about the entire c++ specification (which you would first have to buy) just to find out that due to strange combinations of multiple sections of the standard your perfectly reasonable code does not behave like any sane programmer would expect.
100% Dude. I started my career in C++ and loved the elegance despite its complexity. Moved into C# for many years and now that I’m taking a look years later at what C++ has become it’s a completely unintuitive mess. I look at what “best practice” and “recommended” code looks like and it’s a joke.
"Shipping defective software will kill a company pretty quick".
a) laughs in microsoft 😉
b) all the software ever produced has been defective, right?
Just because something is not perfect doesn't mean it's an excuse to not care about correctness. Some software can kill people or cause great material damage, and even less "responsible" software can lose companies a lot of money.
>laughs in pretty much the entire AAA video game industry
"defective" doesn't mean "having defects".
All software has defects. Not all software is defective.
@@timseguine2 Unless you are a wise-man wanting to say something about feelings behind words, "defective" is verbatim for "it has defects" :D
I always find these talks frustrating as an engineer. I find code correctness to be an iterative process. I never expect any of my initial implementations of things that I have no prior experience implementing to be correct, especially when coding and architectural design are happening at the same time. Correctness comes in stages. I know it's popular to talk as if it's possible to build correct code out of smaller correct pieces and get the right behavior via composition, but we all know that's simply not true. What's important is ensuring that things are moving in the right direction, that the entire codebase is getting more correct over time.
Well, developing well-defined concepts is also an iterative process...
The way of programming that Sean is promoting in this talk is called Software Engineering, all other methodologies are called Software Development.
Not all Software Developers are Software Engineers.
C++ is a DIY flatpack language..
Twitter ? What is that ?
One and a half hours is definitely boring. Just making it concise would help the C++ community a lot. Great lecture, but all that history and those quotes were, I think, unnecessary.
Passing by value is biting your a_s_s. For some reason these days people want to pass everything by value in C++. The slicing problem also comes from that hell. If you want to pass everything by value, you need a different tool. It's called C#.
This doesn't make sense. You pass by value when what you are passing is small enough to pass by value. If you pass everything by reference you are giving up optimizations for no good reason at all. Take string_view for example. Unless you are mutating the original string_view object, there is no good reason to pass it by reference. Pointers are passed by value and not by reference.
I don't get what you are trying to say here, even.
b-but my move semantics...
There are even more disadvantages to passing by reference. The compiler must assume possible aliasing. This can even mean that an argument passed by const reference isn't actually const as observed by the function since it is aliased by a non-const reference.
@@simonfarre4907 Pointers are usually register-sized. The exception is fat pointers for member functions.
@@majormalfunction0071 yeah and they themselves are passed by value not by reference. That is the point.
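The aliasing point raised in this thread can be made concrete with a small sketch. `add_twice_ref` and `add_twice_val` are hypothetical names; both functions have the same body and differ only in how `x` is passed.

```cpp
// x is a const reference, yet its observed value can change inside the
// function when it refers to the same object as out.
int add_twice_ref(int& out, const int& x) {
    out += x;
    out += x;  // x has to be re-read here: it may alias out
    return out;
}

// x is a private copy, so writes through out cannot affect it.
int add_twice_val(int& out, int x) {
    out += x;
    out += x;
    return out;
}
```

Calling `add_twice_ref(a, a)` with `a == 1` returns 4, because the first addition also changes what `x` refers to, while `add_twice_val(b, b)` with `b == 1` returns 3.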