Scaling MongoDB: PRO TIPS for Sharding, Supercharged Indexing & Turbocharged Performance Isolation

  • Added 27 Jul 2024
  • Developers and data teams often face the challenge of making early decisions on data modeling, shard configuration, and indexing strategy in MongoDB without a complete understanding of future query access patterns and database loads. These choices can significantly impact MongoDB's scalability in handling data growth and evolving application requirements. So, how can engineering organizations effectively tackle the challenges of scaling MongoDB?
    MongoDB Pro Tips: Dai Shi, a former Site Reliability Engineer at Foursquare, brings years of experience operating MongoDB at scale. He shares three key scaling strategies honed from his hands-on experience:
    Sharding & Cluster Balancing: Learn how to effectively select the optimal shard key and leverage the cluster auto balancer.
    Indexing: Discover best practices for creating collections and indexes and techniques to diagnose production issues caused by indexes.
    Offloading from Production MongoDB: Identify workloads suitable for offloading to ensure reliability and isolation of your primary MongoDB database, and explore methods for integrating with secondary data stores.
    Join Dai for MongoDB Pro Tips: Scaling Strategies, followed by a live Q&A session where you can get expert insights into your scaling challenges.

Comments • 3

  • @galeop
    1 year ago, +2

    great video.
    my notes:
    - for the sharding distribution, you can shard on a hash of the shard key (_hashed_ sharding) or on ranges of the shard key's values (_range_ sharding). The data is split into chunks; each chunk lives on exactly one shard, and a shard usually holds many chunks. A chunk is a set of documents with contiguous shard key values (in the case of _range_ sharding). If you often run range-based queries (e.g. $lt, $gt on your shard key), you'll benefit from _range_ sharding; otherwise hashed sharding is almost always the best choice.
    - indexes add to the _write_ workload on your DB, as every indexed value also has to be written to each index. Is this done synchronously?
    - 24:30 A "working set" is the data that your application is constantly requesting. That includes both the actual documents AND all their related indexes. The full working set has to fit in memory for your requests to perform efficiently, so by adding unnecessary indexes you increase the risk of making your working set too big for the memory of your shard.
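
To make the first note concrete, here is a rough Python sketch (my own, not from the video) of why range queries favor range sharding. It is a toy model, not MongoDB code: the four shards, the key domain 0..999, the chunk boundaries, and the MD5-modulo placement are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # assumed cluster size for this toy model

# Range sharding: each chunk owns a contiguous interval of shard key values.
# These are the exclusive upper bounds of four chunks, one per shard
# (a simplification; a real shard usually holds many chunks).
RANGE_BOUNDS = [250, 500, 750, 1000]


def range_shard(key: int) -> int:
    """Return the shard whose contiguous chunk contains the key."""
    return bisect.bisect_right(RANGE_BOUNDS, key)


def hashed_shard(key: int) -> int:
    """Return a shard picked from a hash of the key.

    MD5 modulo the shard count is only a stand-in for illustration;
    it is not how MongoDB actually places hashed chunks.
    """
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def shards_for_range_query(low: int, high: int, locate) -> set:
    """Shards holding at least one key k with low <= k < high."""
    return {locate(k) for k in range(low, high)}


if __name__ == "__main__":
    # Equivalent in spirit to {key: {$gte: 100, $lt: 200}}
    print("range sharding touches: ",
          sorted(shards_for_range_query(100, 200, range_shard)))
    print("hashed sharding touches:",
          sorted(shards_for_range_query(100, 200, hashed_shard)))
```

With range placement, neighboring keys sit in the same chunk, so a $gte/$lt query only needs the shards whose intervals overlap the requested range. With hashed placement, neighboring keys scatter, so the same query has to be sent to many or all shards.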

  • @ganeshbb1
    2 years ago, +1

    Thanks.

  • @CinemaTalkiesPro
    1 year ago

    Hey Prakash and Dai, thanks for the insights. I'm new to MongoDB and have a lot of questions, and I hope you can help me. How frequently does the balancer run? Is there a setting to make the balancer run every 1 minute, every 10 minutes, etc.? If there is, where and how can we configure it?