"Open-Sourcing Venice" by Felix GV (Strange Loop 2022)

  • Added on 16 May 2024
  • Venice is LinkedIn's derived data storage system, providing high-throughput ingestion of data generated from batch and stream processing jobs, and low latency online serving.
    After five years in production, Venice now hosts ~1500 datasets, most of which are entirely rewritten daily and used as inputs for AI model inference workloads. A prominent use case is "People You May Know", which performs online deep learning, executing tens of millions of reads and computations per second on a single Venice cluster.
    Client applications can either eagerly load data into local RAM or SSD, or send queries across the network to Venice's distributed backend. Thus, AI engineers can leverage the same data plane and APIs for both the L1 and L2 steps of a multi-pass ranker.
    Venice is built from the ground up for massive scale and operability. It supports self-healing, linear scalability on commodity hardware, multi-tenancy, and multi-cluster and multi-datacenter deployments with CRDT-based active-active replication.
    Felix GV
    Principal Staff Engineer at LinkedIn
    Felix joined LinkedIn's data infrastructure team in 2014, first working on Voldemort, the predecessor of Venice. Over the years, Felix has participated in all phases of Venice's development lifecycle, from requirements gathering and architecture to implementation, testing, rollout, integration, stabilization, scaling, and maintenance.
  • Science & Technology

Comments • 2

  • @gaojieliu2241 (a year ago, +5)

    Great work, Felix!

    • @felixgv (a year ago, +1)

      I am but the messenger for this huge team effort 😊🙏