CppCon 2018: Victor Ciura “Enough string_view to Hang Ourselves”

  • Published on 17 Oct 2018
  • CppCon.org
    -
    Presentation Slides, PDFs, Source Code and other presenter materials are available at: github.com/CppCon/CppCon2018
    -
    Wouldn’t it be nice if we had a standard C++ type to represent strings? Oh, wait... we do: std::string. Wouldn’t it be nice if we could use that standard type throughout our whole application/project? Well… we can’t! Unless we’re writing a console app or a service. But if we’re writing an app with a GUI or interacting with modern OS APIs, chances are that we’ll need to deal with at least one other non-standard C++ string type. Depending on the platform and project, it may be CString from MFC or ATL, Platform::String from WinRT, QString from Qt, wxString from wxWidgets, etc. Oh, let’s not forget our old friend `const char*`, or better yet `const wchar_t*`, for the C family of APIs…
    So we ended up with two string types in our codebase. OK, that’s manageable: we stick with std::string for all platform independent code and convert back-and-forth to the other XString when interacting with system APIs or GUI code. We’ll make some unnecessary copies when crossing this bridge and we’ll end up with some funny looking functions juggling two types of strings; but that’s glue code, anyway… right?
    It’s a good plan... until our project grows and we accumulate lots of string utilities and algorithms. Do we restrict those algorithmic goodies to std::string? Do we fall back on the common denominator `const char*` and lose the type/memory safety of our C++ type? Is C++17 std::string_view the answer to all our string problems?
    We’ll try to explore our options, together: best practices, gotchas, things to avoid... all in the context of modern C++ projects.
    -
    Victor Ciura, CAPHYON
    Software Developer
    Victor Ciura is a Senior Software Engineer at CAPHYON and Technical Lead on the Advanced Installer team (www.advancedinstaller.com).
    For over a decade, he has designed and implemented several core components and libraries of Advanced Installer.
    He’s a regular guest at the Computer Science Department of his Alma Mater, University of Craiova, where he gives student lectures & workshops on “Using C++ STL for Competitive Programming and Software Development”.
    Currently, he spends most of his time working with his talented team on improving and extending the repackaging and virtualization technologies in Advanced Installer IDE, helping clients migrate their Win32 desktop apps to the Windows Store (MSIX).
    -
    Videos Filmed & Edited by Bash Films: www.BashFilms.com

Comments • 18

  • @YourCRTube 5 years ago +11

    I am surprised this talk missed the most important use of string views besides glue code: string manipulation and querying. I have been using QStringRef for years now, and all of that code is string manipulation: you can traverse a string and create "substrings" with zero allocation. For example, to get the filename from a path, or the file extension, or the parent path, or any combination of these, you split the string the user passed into views, with no allocation or copying. The user can then use all of these, and if they decide to store any part, _then_ they will pay for an allocation and a copy.
    And even the advice against storing a string view is not an absolute truth. You can have a value-type object that has a string and N views into that string; this is correct and useful. For example, a Breadcrumbs widget can have a path and a bunch of views into that path, representing the different path components (/a/b/c/d, /a/b/c, /a/b, /a).
    Lastly, views are nothing new. For example, image manipulation libraries like OpenCV (and others) have ROI (Region Of Interest), which is basically a non-owning image (a view). You create it from an image, and most if not all algorithms that work on an image type work on an ROI as well.
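
The path-splitting idea described in this comment can be sketched with C++17 std::string_view roughly as follows. This is only an illustration: the function names (filename, extension, parent_path, breadcrumbs) are hypothetical, not taken from the talk or from Qt, and the sketch assumes '/' as the only separator. The breadcrumbs vector itself allocates, but no character data is copied.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helpers for illustration. Each one returns a view into the
// caller's buffer: no allocation and no copy of the characters.

// Last path component: "/a/b/c.txt" -> "c.txt".
constexpr std::string_view filename(std::string_view path) {
    const auto slash = path.rfind('/');
    return slash == std::string_view::npos ? path : path.substr(slash + 1);
}

// Extension including the dot: "c.tar.gz" -> ".gz". Empty if there is none;
// a leading dot (".bashrc") is not treated as an extension.
constexpr std::string_view extension(std::string_view path) {
    const auto name = filename(path);
    const auto dot = name.rfind('.');
    return (dot == std::string_view::npos || dot == 0) ? std::string_view{}
                                                       : name.substr(dot);
}

// Parent path: "/a/b/c" -> "/a/b", "/a" -> "/", no slash -> empty view.
constexpr std::string_view parent_path(std::string_view path) {
    const auto slash = path.rfind('/');
    if (slash == std::string_view::npos) return {};
    return path.substr(0, slash == 0 ? 1 : slash);
}

// "/a/b/c/d" -> {"/a/b/c/d", "/a/b/c", "/a/b", "/a"}, all views into `path`.
inline std::vector<std::string_view> breadcrumbs(std::string_view path) {
    std::vector<std::string_view> crumbs;
    while (!path.empty() && path != "/") {
        crumbs.push_back(path);
        path = parent_path(path);
    }
    return crumbs;
}

// Usage sketch: the views alias `path`, so `path` must outlive them.
inline void usage_example() {
    const std::string path = "/a/b/c/report.txt";
    assert(filename(path) == "report.txt");
    assert(extension(path) == ".txt");
    assert(parent_path(path) == "/a/b/c");
    // Only a caller who wants to keep a piece pays for an allocation and copy.
    const std::string stored{filename(path)};
    assert(stored == "report.txt");
}
```

If such views are stored next to the owning string inside a value type, as in the Breadcrumbs example, copying or moving that object needs care, because the copied views would still point into the original string's buffer.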

    • @defeqel6537 5 years ago

      @Ziggi Mon I prefer gsl::string_spans because I can use them with multiple types of strings without creating my own template methods, but like the talk said, spans and views are not for beginners.

  • @MatthijsvanDuin 5 years ago +5

    Interestingly, the equivalent of std::string_view in Rust is "&str", i.e. "reference to str", which like other references is subject to the rigorous compile-time lifetime checking Rust is known for, and hence can normally never be dangling. The primitive "str" type represents just the string data itself (a sequence of bytes which is valid UTF-8), i.e. the size is neither known at compile time (in general) nor contained within the value itself, which is why a reference to such an object must include the size explicitly.
    You can't directly create an instance of "str", by the way. Rather, a &str will typically be given to you by something that manages string data storage: either the compiler in the case of static data (string literals), or a String container object that manages the storage at runtime.
    The same goes for so-called "slice references", "&[T]" and "&mut [T]", which are basically equivalent to std::span<const T> and std::span<T> in C++20.

  • @araeos 5 years ago +13

    Slide 48 shows safe (though not performant) code, assuming the string_view is valid and not dangling. This would only be a concern if the caller made a mistake by passing a dangling string_view. In that sense the slide was misleading about the (un)safety of string_view.

    • @hpesoj00 5 years ago +5

      Yeah, he was making no sense whatsoever. If the string_view points at released memory then you can't use it at all. Of what relevance is the data member? In fact, of what relevance is the sink idiom? As far as I could see, it was a perfectly fine use of string_view. Very confusing...

    • @pheww81 5 years ago +1

      This also looked weird to me. It was as if it were unsafe because the std::string constructor taking a std::string_view would not copy the data.
      But the std::string ctor actually copies the data, right? So we don't care about the lifetime of the source as long as it's still valid when the ctor is called.
      But the std::move is useless, since it will never be able to move, even if the source object of the std::string_view was a std::string.
      Am I getting this right?

    • @YourCRTube 5 years ago

      The problem is that the string_view might dangle and you will create a broken member that is a time-bomb, literally. If you take a string however, even if this string was created, by the caller, from a view, then it will (almost certainly) bomb right away on move, so the error will be caught. The point is that a view will happily create a string even if it dangles as the string will just memcpy from it.

    • @hpesoj00 5 years ago +2

      But the member is a string, not a string_view. As long as the string_view is not dangling at the point the member is constructed, the memory will be copied and everything will be fine. In this instance, a string_view is no different from a char const* as far as "dangling references" are concerned.

    • @MarkAtkinson99 5 years ago +2

      Yes that whole thing around slides 46-48 is nonsense. The code he presents as a problem works just fine, it copies from wherever the string_view points into the std::string, and there's no way the source data can somehow be invalidated in the middle of that operation. The std::move is redundant but harmless. Storing a string_view as a member, or returning a string_view can be problematic, but not using it as a param in this way. Conceptually (and how it's probably implemented) think of it as a (const char* s, size_t count), it works the same as that.
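
The pattern this thread is debating can be sketched as below. This is a reconstruction from the comments only, not the code on the slides: the class name Widget and the helper make_widget are hypothetical. It shows a std::string member initialized from a std::string_view parameter, where the characters are copied at construction, so the member does not depend on the source buffer afterwards.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical class for illustration; not the code from the slides.
class Widget {
public:
    // std::move on a string_view only copies its pointer/length pair.
    // The std::string constructor then copies the characters, so the
    // std::move here is redundant but harmless.
    explicit Widget(std::string_view name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;  // owning copy, independent of the caller's buffer
};

// The view only has to be valid while the constructor runs.
inline Widget make_widget() {
    std::string temp = "temporary";
    Widget w{temp};  // characters are copied into w here
    return w;        // temp is destroyed on return; w still owns its copy
}
```

The hazards the commenters mention (storing a string_view as a member, or returning one that refers to a local) are different cases: there the view outlives the buffer it points to, whereas here nothing keeps referring to the source after the copy.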

  • @brenogi 5 years ago +1

    I don't think the problem on slide 70 is auto. The "dbl" function is forcing the return type to T, isn't it?

    • @-taz- 5 years ago

      The root problem is operator overloading. I don't think "+" should both add numbers and concatenate strings in the same language. It's also a common problem and source of bugs in JS. Originally, C++ had "

  • @sergeikrainov2512 2 years ago +5

    Horrible talk. An hour of bad usages and downsides of string_view and nothing about proper usage