Automerge: a new foundation for collaboration software

  • Added 27 Nov 2021
  • Local-first software is an effort to make collaboration software less dependent on cloud services, and Automerge is an open-source library for realising local-first software. In this talk I explain our motivation for creating Automerge, and map out 7 years' worth of research projects that are feeding into this project.
    Recording of a talk given at the University of Cambridge SRG Seminars on 25 Nov 2021.
    www.cl.cam.ac.uk/research/srg...
  • Science & Technology

Comments • 21

  • @MazeChaZer
    @MazeChaZer 2 years ago +36

    Great talk, I really enjoyed it!
    These are my personal notes about the talk, in case you want to skip around in the video:
    - (0:02:44) SPA architectures are very complex and have many layers of
    abstraction
    - (0:04:40) Read/write latencies are very high because of network/system
    roundtrips
    - (0:05:30) Optimistic UIs break in case of network errors
    - (0:06:42) Proposal: Local-First Software
    - (0:07:35) In local-first software, the primary storage is on the devices;
    servers only relay communication and store backups
    - (0:08:33) Data is replicated in the background, non-blockingly
    - (0:09:04) This requires only a few layers of abstraction
    - (0:11:25) Your data is lost when the cloud service is shut down or if you get
    in trouble with the provider
    - (0:12:40) Sync services for local-first software can be generic and
    interchangeable
    - (0:14:04) Long-term preservation of data is only feasible with local-first
    software
    - (0:14:35) Working offline works by default in local-first software but is very
    hard to do in cloud software
    - (0:15:01) In cloud software servers are trusted with unencrypted sensitive
    data, in local-first software data is end-to-end encrypted during sync
    - (0:16:03) In cloud software the server is trusted with data integrity, in
    local-first software data integrity can be cryptographically ensured in the
    sync protocol
    - (0:16:39) In cloud software, users are at the mercy of the service provider
    - In local-first software users have ownership, control, agency and autonomy
    over their data
    - (0:17:03) Big challenge of local-first software: Merging concurrent edits
    correctly
    - (0:19:20) Because local-first software has to solve the merging problem, it is
    straightforward to implement version control (including branches)
    - This could bring version control to many areas beyond software development
    (Git) where it would be very valuable
    - (0:22:29) Introducing Automerge (“Git for your app's data”)
    - Automerge operates directly on the data model
    - (0:25:05) Automerge preserves all changes, guarantees eventual consistency
    (makes concurrent operations commutative) and can merge branches
    - (0:26:38) Automerge is a CRDT
    - (0:26:57) Timeline of Automerge research projects
    - (0:28:23) JSON CRDT
    - (0:29:02) CRDTs in Isabelle
    - (0:29:18) Automerge project
    - (0:30:05) Automerge is used in production by the Washington Post homepage
    editors
    - (0:30:29) Move operation for CRDT trees
    - (0:31:29) Local-first principles (Onward! paper)
    - (0:31:54) Authenticated snapshots
    - (0:32:46) Byzantine eventual consistency
    - (0:33:57) End-to-end encryption for CRDTs
    - Decentralized authentication, works without a central server
    - (0:34:59) Metadata privacy with anonymity networks
    - (0:36:17) CRDT for rich-text data (Peritext)
    - (0:37:23) WIP stuff
    - More asynchronous collaboration workflows
    - Cut + paste
    - Interleaving freedom
    - Access control
    - User discovery
    - (0:38:34) Philosophical question from the audience about blockchains
    - Operations are not totally, but partially ordered with regard to causality
    - (0:43:39) Automerge records an operation log
    - (0:44:26) Each operation is given an ID and the causality of operations is
    tracked by an “overwrites” field
    - (0:45:44) In case of conflicts one change is arbitrarily picked over the other
    by default, but the complete conflict can also be retrieved
    - (0:46:35) Automerge can represent JSON, sets/tables, text, counters, date/time
    and cursors
    - (0:47:31) Some skipped slides
    - Collaboration latency
    - Functional reactive programming
    - Network Topologies
    - (0:47:40) There are JavaScript and Rust implementations of Automerge; the Rust
    implementation is the basis for other language bindings: WebAssembly, Python,
    Swift
    - Automerge itself is just a data structure library and implements no disk
    persistence or networking
    - (0:48:50) In real-time collaboration the log contains many small operations
    that can be compressed into snapshots to reduce log size
    - Automerge employs sophisticated compression that can arrive at 0.8 bytes per
    operation on a real-world document editing history recording every single
    keystroke
    - (0:50:31) This is achieved with ideas from columnar databases
    - (0:53:51) Skipped over conclusion slide
    - (0:55:39) Building on top of Matrix.org as a backend has come up as an idea
    - (0:56:06) Question: What's next?
    - Testing, improving and finalizing the Automerge API for 1.0
    - Research: many ideas and projects are available to work on
    - There are also many security aspects that can be worked on
    - (0:58:00) Another blockchain buzzword discussion
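
To make the notes at 0:44:26 and 0:45:44 more concrete, here is a minimal sketch of a single field whose operations carry an ID and an "overwrites" set. It is not Automerge's API or data format: the `Op` and `Register` names, the `(counter, actor)` ID shape and the "highest ID wins" rule are all assumptions made for illustration. An operation counts as live if no other known operation overwrites it; two live operations on the same field are a conflict, one of which is picked deterministically while both stay retrievable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    op_id: tuple                          # hypothetical ID: (counter, actor)
    value: str
    overwrites: frozenset = frozenset()   # IDs of the ops this one supersedes


class Register:
    """One field of a document, stored as a set of operations (illustrative only)."""

    def __init__(self):
        self.ops = {}  # op_id -> Op

    def apply(self, op):
        self.ops[op.op_id] = op

    def merge(self, other):
        # Merging is a set union, so the order in which replicas merge does not matter.
        self.ops.update(other.ops)

    def conflicts(self):
        # An op is live if no known op lists it in its "overwrites" set.
        overwritten = set()
        for op in self.ops.values():
            overwritten |= op.overwrites
        live = [op for op in self.ops.values() if op.op_id not in overwritten]
        return sorted(live, key=lambda op: op.op_id)

    def value(self):
        # Assumed tie-break: the live op with the highest ID wins on every replica.
        live = self.conflicts()
        return live[-1].value if live else None


# Two replicas start from the same op, then edit the field concurrently.
base = Op((1, "alice"), "draft")
a, b = Register(), Register()
a.apply(base)
b.apply(base)
a.apply(Op((2, "alice"), "Alice's title", frozenset({base.op_id})))
b.apply(Op((2, "bob"), "Bob's title", frozenset({base.op_id})))

# Exchange changes in both directions.
a.merge(b)
b.merge(a)

print(a.value(), b.value())
print([op.value for op in a.conflicts()])
```

Both replicas end up with the same value regardless of merge order, and the losing edit is still available through `conflicts()`, which is the behaviour the notes describe.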

    • @GalacticApple
      @GalacticApple 2 years ago +1

      I wish I could tip comments; this is very helpful

    • @varshard0
      @varshard0 6 months ago

      Thank you.
      I should do this with other videos so I could benefit myself and other people

  • @TheItamarp
    @TheItamarp 2 years ago +1

    Great talk. I do want to note that blockchain tends towards centralization, not for the reason that Martin suggested, but because:
    1) Most transactions happen through an exchange, marketplace, or other service which people rely on for discovery and facilitation of transactions.
    2) Power tends to concentrate in a few individuals (i.e. 80% of bitcoins are owned by 10% of users), often allowing them greater influence over which transactions are added to the chain (there are likely exceptions to this, but broadly that is the trend that I am seeing).

  • @СанСанычВасильев

    I wish the "director's cut" of the talk were available online.

  • @Snorehog
    @Snorehog 2 years ago +1

    Very clear, thank you!

  • @aster_nova
    @aster_nova 2 years ago +14

    I'm kind of disappointed that the guy butted in to correct the speaker about crypto, while the one woman audience member was trying to ask a question. Crypto is inherently uninteresting in this space, I was wondering what she was going to say, and she didn't actually get a chance.

  • @subhobroto
    @subhobroto 2 years ago +3

    My biggest concern is the proliferation of CGNAT that still makes relay/central servers extremely necessary. Intuitively, it feels like finding a solution around CGNAT that's not relay/central server dependent would do a lot to make software less dependent on cloud services

    • @allanwind295
      @allanwind295 2 years ago

      A networked application, by definition, needs another node to be online to exchange information. In a peer-to-peer network, either both clients of interest are online at once (phone call), or, in a store-and-forward architecture, the application can send the information to any number of nodes for eventual delivery to the target node (email). It doesn't seem like a significant technical hurdle to replace the server with a peer node in a local-first design. Skype, I believe, went from peer-to-peer to client/server to improve quality of service. If the server is generic, as the author was speculating, then the Automerge service would become a commodity if successful (55m02s).

  • @derjansan9564
    @derjansan9564 2 years ago +1

    Very interesting. Luckily audio got better after half a minute.

  • @bradyfractal6653
    @bradyfractal6653 2 years ago +2

    Was there a mention of YJS? I must have missed it. If not, that’s odd considering it’s a leader in this space.

    • @mrvectorhc7348
      @mrvectorhc7348 1 year ago

      YJS is an implementation of the ideas above, the same as Automerge and a couple of others, all of which are based on research into CRDTs.

  • @jeromeadams4740
    @jeromeadams4740 1 year ago +1

    Does anyone have a name for the speaker at 1:00:46?

  • @MorganIntrator
    @MorganIntrator 1 year ago

    How did he make those hand-drawn slides?

  • @karthikeyanak9460
    @karthikeyanak9460 2 years ago +1

    Isn't this how email clients work? They still have an outbox that stores unsent emails.

    • @bangonkali
      @bangonkali 2 years ago

      I think that's just one aspect of offline-first. This solution goes further, into the case where multiple users edit a single email/doc asynchronously. It focuses on merging the changes made by multiple users and less on how the changes are communicated to all users, which I think is what email as a spec was focusing on.

  • @randyqx
    @randyqx 2 years ago

    did you have to pay the dinosaur from the '70s to beg for his mainframe back? :)

  • @cybernessful
    @cybernessful 9 months ago

    So, basically everything new is something old and long forgotten, repeated at a new technological level. Quite a bit of software indeed does not benefit much from being purely cloud-based, and the only point of the internet for it is to back up or refresh the data. The problem is that none of the existing corporations would ever give up users' data, the opportunity to collect and sell it, or the chance to show you some ads.

  • @PragyAgarwal
    @PragyAgarwal 2 years ago

    Hey Martin. Can Automerge support full-text search over a very large database (like Wikipedia) without the user having to locally cache the entirety of the data?
