DEMO - High performance HTAP with Postgres & Hyperscale (Citus)

  • Published Jun 25, 2020
  • In this demo, we run a large-scale HTAP workload on Azure Database for PostgreSQL with the built-in Hyperscale (Citus) deployment option. Hyperscale (Citus) uses the open source Citus extension to Postgres to turn a cluster of PostgreSQL servers into a single distributed database that can shard or replicate Postgres tables across the cluster. Citus can simultaneously scale transaction throughput, by routing each transaction to the right server, and scale analytical queries and data transformations, by parallelizing them across all of the servers in the cluster. Combined with powerful Postgres features such as its many index types and other PostgreSQL extensions, this lets Hyperscale (Citus) run high-performance HTAP workloads at scale.
    We show a side-by-side comparison of Hyperscale (Citus) and a single PostgreSQL server running a transactional workload generated by HammerDB while simultaneously running analytical queries, and show how you get further speedups by pre-aggregating the data in parallel (using rollups) on the same Postgres database; minimal SQL sketches of table distribution and rollup pre-aggregation are included after the video details below.
    Video bookmarks:
    ► 0:17 What Citus is
    ► 0:37 Overview of anatomy of the demo
    ► 1:47 Demo begins
    ► 9:55 Summary of performance results
    ► 10:58 Marco’s interpretation of the demo
    ► 14:14 How Windows telemetry team uses Postgres & Citus
    This demo, originally shared at SIGMOD and excerpted from an interview Claire Giordano did with Marco Slot at the Microsoft European Virtual Open Source Summit, explores the promise of HTAP (that there is finally a database that can do transactions and analytics at scale) and shows how you can use Postgres with Hyperscale (Citus) on Azure to serve both the transactional and analytical needs of HTAP applications.
    📌 Let’s connect:
    Twitter - Claire Giordano: @clairegiordano
    Twitter - Marco Slot: @marcoslot
    Twitter - Citus Data: @citusdata
    Twitter - AzureDBPostgres: @azuredbpostgres
    🔔 Subscribe to the Citus monthly technical newsletter:
    aka.ms/citus-newsletter
    ✅ Learn more:
    Azure Postgres Blog: aka.ms/azure-postgres-blog
    Azure Database for PostgreSQL managed service: aka.ms/azure-postgres
    Citus open source GitHub repo: github.com/citusdata/citus
    #PostgreSQL #Azure #AzureDBPostgres
  • Science & Technology
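
    For readers who want to see what the description means in practice, here is a minimal SQL sketch of distributing tables with Citus. The schema (orders, order_statuses, customer_id) is an illustrative assumption, not the exact HammerDB schema used in the demo.

        -- Turn ordinary Postgres tables into distributed / reference tables (illustrative schema).
        CREATE TABLE orders (
            order_id    bigserial,
            customer_id bigint NOT NULL,
            total       numeric,
            created_at  timestamptz DEFAULT now(),
            -- on a distributed table, unique constraints must include the distribution column
            PRIMARY KEY (customer_id, order_id)
        );

        -- Shard the table across the worker nodes by customer_id, so that
        -- single-customer transactions are routed to a single worker.
        SELECT create_distributed_table('orders', 'customer_id');

        -- Small lookup tables can be replicated to every worker instead of sharded.
        CREATE TABLE order_statuses (status_id int PRIMARY KEY, label text);
        SELECT create_reference_table('order_statuses');

        -- Analytical queries over the distributed table are parallelized
        -- across all shards and workers by the coordinator.
        SELECT date_trunc('hour', created_at) AS hour,
               count(*)   AS order_count,
               sum(total) AS revenue
        FROM orders
        GROUP BY 1
        ORDER BY 1;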
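
    The rollup step mentioned above can be sketched in the same spirit. The rollup table name and the overwrite-style refresh are assumptions for illustration; an incremental rollup would track which raw rows have already been aggregated.

        -- Pre-aggregate the raw table into a small rollup table (illustrative names).
        CREATE TABLE orders_rollup_hourly (
            hour        timestamptz,
            customer_id bigint,
            order_count bigint,
            revenue     numeric,
            PRIMARY KEY (customer_id, hour)
        );

        -- Distribute the rollup on the same column as the raw table so the two are
        -- co-located and the aggregation stays local to each worker.
        SELECT create_distributed_table('orders_rollup_hourly', 'customer_id');

        -- Citus parallelizes this INSERT ... SELECT across the shards.
        -- Re-running it overwrites the existing rollup rows, so it is idempotent.
        INSERT INTO orders_rollup_hourly (hour, customer_id, order_count, revenue)
        SELECT date_trunc('hour', created_at), customer_id, count(*), sum(total)
        FROM orders
        GROUP BY 1, 2
        ON CONFLICT (customer_id, hour)
        DO UPDATE SET order_count = EXCLUDED.order_count,
                      revenue     = EXCLUDED.revenue;

        -- Dashboard-style queries then read the much smaller rollup table.
        SELECT hour, sum(revenue) AS revenue
        FROM orders_rollup_hourly
        GROUP BY hour
        ORDER BY hour;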

Comments • 10

  • @mrjamiebowman1337 • 3 years ago • +4

    This looks really amazing. I'm very impressed by the performance gains of using Citus with Postgres.
    As a .NET developer who has traditionally used MSSQL, I am learning Postgres now and I see so many benefits to learning and using this technology. I hope Microsoft and Azure continue to adopt and embrace Postgres and open source technologies.

  • @gsilva877 • 11 months ago

    This is amazing, and the fact that it is open source is a blessing, thank you all for that. I'm creating a multi-tenant SaaS app and this will be a game changer. Even if the load now is quite minimal, I'm sure this can scale with Citus and Postgres and also simplify the app's architecture a lot. Amazing...

  • @bbqchickenrobot3 • 3 years ago

    Great stuff! Postgres + Citus will be backing my ASP.NET Core servers for my startup! Can't wait!

  • @joefagan3926 • 2 years ago • +2

    At 10:30 I noticed that, in addition to your 10 servers, you also had a coordinator. What about fault tolerance of the coordinator? What are the options for that?

  • @ilzammulkhaq8648 • 2 months ago

    Okay, now, how do I decide which column should be the shard key? I already picked Citus as one of my options, but there are 110 rows and more than 10 indexes. Any ideas?

  • a year ago

    One of the tests shows a 300x improvement. Is this still talking about the 10-node Citus? I would expect a 10x improvement, but a 300x improvement to me sounds like the nodes had the chance to cache all the data in RAM. I'd like to see this test repeated after cold booting all nodes.

    • @RU-qv3jl • 2 months ago

      It wouldn't be 300 times faster, but it should be a lot faster. The data was clearly in memory and avoided disk access. It would have been interesting to see how a single very large PG instance compares; it's not like you get 11 nodes for the price of 1 of the same size. So the demo, whilst somewhat valid, is also saying, "Look, if we have a lot more hardware, it's faster." I would have liked to see a comparably priced PG cluster compared to the multi-node Citus cluster.

  • @RU-qv3jl • 2 months ago

    This is a nice demo to show off Citus, but it's also totally invalid to compare the systems like that. You had 11 Citus nodes, each the same size as the single PG node. You used all that extra hardware to demonstrate that more hardware is faster, and then showed that a roll-up table is faster than accessing the base data. None of that is surprising. If you had spent the same amount of money on a PG cluster, how fast would it have been?