"I See What You Mean" by Peter Alvaro

  • Added 9 Jul 2024
  • I love query languages for many reasons, but mostly because of their semantics. Wait, come back! In contrast to most systems programming languages (whose semantics can be quite esoteric), the semantics of a query (given some inputs) are precisely its outcome -- rows in tables. Hence when we write a query, we directly engage with its semantics: we simply say what we mean. This makes it easy and natural to reason about whether our queries are correct: that is, whether they mean what we intended them to mean.
    Query languages have traditionally been applied to a relatively narrow set of domains: historically, data at rest in data stores; more recently, data in motion through continuous, "streaming" query frameworks. Why stop here? Could query languages do for a notoriously complex domain such as distributed systems programming what they have done so successfully for data management? How would they need to evolve to become expressive enough to capture the programs that we need to write, while retaining a simple enough semantics to allow mere mortals to reason about their correctness?
    I will attempt to answer these questions (and raise many others) by describing a query language for distributed programming called Dedalus. Like traditional query languages, Dedalus abstracts away many of the details we typically associate with programming, making data and time first-class citizens and relegating computation to a subordinate role, characterizing how data is allowed to change as it moves through space and time. As we will see, this shift allows programmers to directly reason about distributed correctness properties such as consistency and fault-tolerance, and lays the foundations for powerful program analysis and repair tools (such as Blazes and LDFI), as well as successive generations of data-centric programming languages (including Bloom, Edelweiss and Eve).
    Peter Alvaro
    UNIVERSITY OF CALIFORNIA SANTA CRUZ
    @palvaro
    Peter Alvaro is an Assistant Professor of Computer Science at the University of California Santa Cruz. His research focuses on using data-centric languages and analysis techniques to build and reason about data-intensive distributed systems, in order to make them scalable, predictable and robust to the failures and nondeterminism endemic to large-scale distribution. Peter is the creator of the Dedalus language and co-creator of the Bloom language.
    While pursuing his PhD at UC Berkeley, Peter co-developed and taught Programming the Cloud, an undergraduate course that explored distributed systems concepts through the lens of software development. Prior to attending Berkeley, Peter worked as a Senior Software Engineer in the data analytics team at Ask.com. Peter's principal research interests are databases, distributed systems and programming languages.
  • Science & Technology

Comments • 19

  • @coolsebz
    8 years ago +38

    This is one of those talks that makes me google for a few hours to barely get to the meat of the ideas presented! Really awesome stuff!

  • @bonnydonny
    8 years ago +11

    Wow, that was a mind-bender! Great talk. Looking at abstractions like time and the messy stuff that distributed systems give us will someday, hopefully, at UCSC first, make reasoning easy and natural. I suggest reading Stephen Toulmin on the philosophy side of the topic. He shows where/how the original problem of hiding abstractions took us down the wrong road. Glad to see Peter Alvaro working on re-integrating the world with new languages and respectful design. Bravo!

  • @jjurksztowicz
    6 years ago +6

    Computation is rendezvous of ephemera... nice.

  • @420_gunna
    4 years ago +1

    This talk gave me a nosebleed, two thumbs up

  • @thomas.moerman
    8 years ago +19

    It's almost stand-up comedy fused with hardcore tech.. great talk!

  • @valtih1978
    7 years ago +3

    Extremely profound guy. Almost like Michael Parenti in politics.

  • @OmyTrenav
    8 years ago +2

    Great talk! Very educational.

  • @danielfava
    8 years ago +1

    Awesome!

  • @HenkPoley
    8 years ago +8

    There's a talk from this same conference about the briefly mentioned "Eve" here : czcams.com/video/5V1ynVyud4M/video.html

  • @VladyYakovenko
    2 years ago

    terrific talk

  • @clementdato6328
    1 year ago +2

    One thing that confused me is that he talked about Datalog using examples that only read data, which is actually quite easy even in ordinary languages. The difficult part seems to be that the order of operations needs to be guaranteed, mostly because of side effects of some sort. Am I missing something?

    • @DmitryRomanov
      1 year ago

      Based on a college Prolog course: you just add an additional requirement, "this is executed before/after that", and the engine (here it is called the optimizer) will find the proper call order you want.
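To make the ordering question in this thread concrete, here is a minimal Python sketch (not Dedalus or Bloom code, and not taken from the talk) of naive bottom-up evaluation of a two-rule Datalog-style query. The relation names `link` and `path` and the function `reachable` are hypothetical, chosen only for illustration. The point it illustrates is limited: for a read-only query like this, the result is the same whatever order the input facts arrive in.

```python
from itertools import permutations


def reachable(links):
    """Naive fixpoint evaluation of the Datalog-style rules:

        path(X, Y) :- link(X, Y).
        path(X, Z) :- link(X, Y), path(Y, Z).
    """
    # First rule: every link is a path.
    paths = set(links)
    changed = True
    while changed:
        changed = False
        # Second rule: extend existing paths by one link at the front.
        for (x, y) in links:
            for (y2, z) in list(paths):
                if y == y2 and (x, z) not in paths:
                    paths.add((x, z))
                    changed = True
    return paths


links = [("a", "b"), ("b", "c"), ("c", "d")]

# Evaluate the query once per ordering of the input facts.
results = {frozenset(reachable(list(p))) for p in permutations(links)}

# Every ordering produced the same set of rows.
assert len(results) == 1
```

This sketch says nothing about side effects or updates, which is where the ordering concern raised above actually bites; it only shows why the read-only examples are insensitive to order.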

  • @arhyth
    4 years ago +1

    I wonder how long before this, or at least the ideas here, get seen in production distributed systems.

    • @HenkPoley
      4 years ago +1

      It seems like Bloom had some development for a while, but nothing after 2017. github.com/bloom-lang

  • @DmitryRomanov
    1 year ago +1

    25:55 prolog students cry here 🙈

  • @supersearch
    5 years ago

    A secure distributed system would be similar to a blockchain system. It must not support data deletion or data updates. It must be a purely constructive system. A deletion must be just a new annotation about the state of some data. But this type of system may grow large, so we can keep the chain of changes but work in memory with a limited version holding only the current data, for better performance. But the construction of new data based on old data must also be perfectly deterministic...
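The idea in the comment above (never delete or update, record a deletion as a new annotation, and derive a compact current view deterministically from the history) can be sketched in a few lines of Python. This is an illustrative single-process sketch under stated assumptions, not anything from the talk or from a real system: the names `Entry`, `AppendOnlyLog` and `current_view` are hypothetical, and distribution, security and persistence are all left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One immutable record in the history (hypothetical layout)."""
    seq: int               # position in the log
    key: str
    value: Optional[str]   # None when the entry is a deletion marker
    deleted: bool = False


class AppendOnlyLog:
    """A log that only grows: no entry is ever changed or removed."""

    def __init__(self):
        self._entries = []

    def put(self, key, value):
        self._entries.append(Entry(len(self._entries), key, value))

    def delete(self, key):
        # A deletion is just another appended entry that annotates the key.
        self._entries.append(Entry(len(self._entries), key, None, deleted=True))

    def history(self):
        return tuple(self._entries)


def current_view(entries):
    """Deterministically fold the full history into the current data."""
    view = {}
    for e in sorted(entries, key=lambda entry: entry.seq):
        if e.deleted:
            view.pop(e.key, None)
        else:
            view[e.key] = e.value
    return view


log = AppendOnlyLog()
log.put("x", "1")
log.put("y", "2")
log.delete("x")
log.put("y", "3")

# The compact in-memory view holds only current data...
assert current_view(log.history()) == {"y": "3"}
# ...while the full chain of changes is still there.
assert len(log.history()) == 4
```

Because `current_view` is a pure function of the history, replaying the same entries always yields the same view, which is the determinism requirement the comment ends on.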

  • @chromosundrift
    2 years ago

    Wouldn't it be fair to say that blockchains provide "the god line" and this is how they solve this fundamental distributed system problem?

    • @jakedewey3686
      1 year ago

      Blockchain doesn't really solve the problem here. What happens when two different systems disagree on who extended the chain first? You just run into the same problem all over again, because you'd need to figure out how to deal with the asynchrony of block creation.
      The goal isn't to create an external system to synchronize your other systems; it's to make it so there's no need to synchronize at all, because your systems are guaranteed to behave the way you expect.