Top 7 Ways to 10x Your API Performance

  • Added 24 May 2024
  • Get a Free System Design PDF with 158 pages by subscribing to our weekly newsletter: bytebytego.ck.page/subscribe
    Animation tools: Adobe Illustrator and After Effects.
    Check out our bestselling System Design Interview books:
    Volume 1: amzn.to/3Ou7gkd
    Volume 2: amzn.to/3HqGozy
    The digital version of System Design Interview books: bit.ly/3mlDSk9
    ABOUT US:
    Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.

Comments • 166

  • @leetcode7857
    @leetcode7857 10 months ago +269

    Other techniques:
    1. Tune the database connection pool size based on the application's behaviour. (A larger pool doesn't always mean better performance.)
    2. Optimize the SQL queries. (Ensure your most frequent queries use an index scan instead of a full table scan.)
    3. Don't hop between multiple microservices for a single user request. (A single user request can hit multiple services, but those services should not in turn hit another set of services, and so on.)
    4. Authorization data should always be cached.
    5. As much as possible, do the computation in the database layer. There's a huge difference between doing the computation at the application layer and doing it at the database layer.

    • @ArvidRegenberg
      @ArvidRegenberg 10 months ago +4

      Really good comment! Embrace the Database.

    • @jitx2797
      @jitx2797 10 months ago +2

      I didn't understand the 4th point. What do you mean by caching authorization data?
      Do you mean, for example, that with JWT-based authorization we check the token and validate it against the database?
      Is that the data you're talking about?
      Sorry if I'm not understanding it correctly.

    • @azzazkhansiddiqui
      @azzazkhansiddiqui 10 months ago +4

      @jitx2797 I guess it means storing the session token in a cache (Redis) and authenticating from there. If the user logs out, remove the session key; otherwise it is removed automatically when the session expires (times out).

    • @albinantony4998
      @albinantony4998 10 months ago +4

      @jitx2797 You don't need to save the JWT in a database to check whether it is valid. That defeats the advantage of JWT.

    • @jwbonnett
      @jwbonnett 10 months ago

      @albinantony4998 You should always store the token; that is literally in any JWT auth spec, like OpenID or OAuth 2.0.
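
The session-cache idea from this thread (look the token up in a cache, delete it on logout, let it expire otherwise) can be sketched as follows. This is only an illustration under stated assumptions: the `SessionCache` class and its method names are hypothetical, and a plain in-process dictionary stands in for Redis so that the example runs without any server.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class SessionCache:
    """Hypothetical in-memory stand-in for a Redis-style session store with expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}  # token -> (user_id, expires_at)

    def login(self, token: str, user_id: str) -> None:
        """Store the session with an absolute expiry time."""
        self._entries[token] = (user_id, self._clock() + self._ttl)

    def logout(self, token: str) -> None:
        """Remove the session key explicitly; unknown tokens are ignored."""
        self._entries.pop(token, None)

    def authenticate(self, token: str) -> Optional[str]:
        """Return the user id for a live session, or None if missing or expired."""
        entry = self._entries.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[token]  # lazily evict the expired session
            return None
        return user_id
```

With a real Redis deployment the expiry would normally be delegated to the server rather than checked in application code.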

  • @bananesalee7086
    @bananesalee7086 10 months ago +37

    For some reason, listening to you is calming.

  • @wissemaljazairi
    @wissemaljazairi 10 months ago +84

    1. 1:00 Caching
    2. 1:45 Connection Pool
    3. 2:45 Avoid N+1 Query pattern
    4. 3:35 Pagination
    5. 3:58 JSON Serialization
    6. 4:20 Compression
    7. 4:50 Asynchronous logging

  • @narasimhareddy8323
    @narasimhareddy8323 10 months ago +39

    One quick question...
    Who does the video animation work for you? Kudos to the designer whoever he/she is.

  • @nmmm2000
    @nmmm2000 10 months ago +38

    2 cents from me too:
    - Use HTTP keep-alive or HTTP/2. If you open a separate HTTP connection for each API call, it will be slow; with keep-alive the speedup is considerable. This technique is often used in the SMS industry.
    - Optimize SQL queries :)
    - Optimize SQL queries :) :)
    - Optimize SQL queries :) :) :)
    - Database replication

  • @justmasdd
    @justmasdd 10 months ago +5

    Thank you! I love watching ByteByteGo system design videos!

  • @MrSongsword
    @MrSongsword 10 months ago +47

    The 7 methods:
    1. Caching
    2. Connection pool
    3. Avoid N+1 Query Problem
    4. Pagination
    5. JSON Serializers
    6. Payload Compression
    7. Asynchronous logging

  • @guhkunpatata3150
    @guhkunpatata3150 9 months ago

    the animation + explanation is GREAT! thanks for sharing!

  • @tinjurcevic4309
    @tinjurcevic4309 10 months ago +7

    Great video as usual!

  • @kns6132
    @kns6132 10 months ago +12

    Superbly explained and very valid tips ❤

  • @gregoirehebert
    @gregoirehebert 10 months ago +13

    On the N+1 problem, I think it's worth mentioning that using an HTTP cache for the comments would reduce the amount of processing. There is no need to fetch them all at once. The pagination approach is still valid! A simple IRI pointing to the collection of comments is also valid.
    But you still need to request them at some point, even if the request only goes to the caching reverse proxy.
    To avoid waiting for your frontend to parse the payload and then query the comments, 103 Early Hints can eliminate that waiting time.
    Use HTTP/2, as it uses binary frames and multiplexing and solves HTTP-level head-of-line blocking.
    Use HTTP/3, as it speeds up connection establishment.
    Formats like Protobuf also reduce the size of the payload.
    To work around pipelining problems with HTTP/1.1, batching HTTP requests can sometimes be a solution (prefer a standard specification). But please stay stateless as much as possible.
    And of course, the infamous domain sharding 🤐

  • @Ajdin87
    @Ajdin87 10 months ago +20

    I think it is worth mentioning that if you decide to use caching, Redis, and you are using, for example AWS, it will be additional cost. Caching is done in memory, and oh boy do they love to make you pay for everything.
    Great video btw.

    • @Vincent-hb7ub
      @Vincent-hb7ub 9 months ago

      That's interesting, @Ajdin87. What could be an efficient solution to that problem?

    • @Ajdin87
      @Ajdin87 9 months ago +1

      @Vincent-hb7ub I am not nearly experienced enough to answer that. Hopefully someone else can help.
      I am working on some smaller projects, but I have tried and used most of the things mentioned in the video.
      As for Redis, I have just scratched the surface with some basic caching and queueing locally. The results were awesome: up to 10 times faster response times when using it.

    • @Vincent-hb7ub
      @Vincent-hb7ub 9 months ago

      @Ajdin87 Thanks for your reply. I'm still learning all this, and I feel like my knowledge of system design is kind of scattered, so I never thought of the cloud cost implications.

  • @tonydeveloperdndndn
    @tonydeveloperdndndn 22 hours ago

    Thanks for summarizing these topics.

  • @harshdevsingh6506
    @harshdevsingh6506 9 months ago +6

    1. Tuning database tables, for example purging old data that is no longer needed, can improve the performance of SELECT queries on those tables.
    2. Putting an index on the queried column of the table helps as well.
    3. Adding a load balancer also helps to improve performance.
    4. Payload compression can also help when fetching large data such as images/videos.

    • @pauljohnsonbringbackdislik1469
      @pauljohnsonbringbackdislik1469 8 months ago

      ad. 2 - Some databases either add indexes automatically or suggest them based on self-monitoring data.
      ad. 4 - If images are sourced from users, they can be resized right after upload. A 4MB image is then stored as a 1MB FHD JPEG plus an 80kB thumbnail. It rarely makes sense to keep the original. Videos are handled this way automatically when using third-party services (like Vimeo or Mux).

  • @SalvadorFaria
    @SalvadorFaria 10 months ago +13

    Other optimization techniques: Partial Responses and Field Masks - Request specific data fields, reduces processing load, and improves efficiency in API interactions.

    • @pauljohnsonbringbackdislik1469
      @pauljohnsonbringbackdislik1469 8 months ago

      Nice tip. Too many times API responds with tons of data that won't be consumed by the user even if displayed.
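
The partial-response idea above can be sketched as a small filter applied before serialization. This is a minimal sketch under assumptions: the `fields` query parameter and both function names are hypothetical, and only top-level fields are handled (real field-mask schemes usually support nested paths as well).

```python
from typing import Any, Dict, Iterable, List


def parse_fields_param(raw: str) -> List[str]:
    """Parse a comma-separated `fields` query parameter, ignoring blanks."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def apply_field_mask(record: Dict[str, Any], fields: Iterable[str]) -> Dict[str, Any]:
    """Return only the requested top-level fields that exist in the record."""
    wanted = set(fields)
    return {key: value for key, value in record.items() if key in wanted}
```

A handler would call `apply_field_mask(row, parse_fields_param(request_fields))` before serializing, so that large unused fields never reach the wire.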

  • @anthonysalls
    @anthonysalls 10 months ago +4

    An architecture I work with involves a secondary copy mirror, and I once “crashed” the mirror by sending too many writes to the primary. The primary handled them effortlessly, but the synchronous writes to the secondary DB backed up because of its lower-tier hardware. All applications that ran on the secondary (non-time-critical systems like operational reporting, which can usually wait the 0.8 seconds it takes the secondary to catch up to the point where the job originated on the primary) stalled for hours, refused all new jobs, and queued all writes in that period.
    Switching to asynchronous log writing dramatically improved performance, and the secondary system could handle the load fed from the primary again, but for half a day 40,000 users were not happy that their reports went from taking half a minute to taking half a day. HA and data loss were also at risk, as the secondary system was the same-data-center backup of the primary (there was another primary off site with similar hardware that kept pace, but it would have taken longer to fail over to).
    The lesson is: if production relies on secondary systems it communicates with, and you’re going to be running production hot for hours, you must have those secondary systems attached to your test suite! We had 4 dozen attached but missed the mirror that was instrumental for reporting 😅

    • @mohammedsardar3779
      @mohammedsardar3779 10 months ago

      I would like to learn these advanced concepts, but I don't understand what is described here. Can you share a link to read more about this?

  • @thanhnx-vnit
    @thanhnx-vnit 10 months ago +3

    @ByteByteGo
    Video is great. Thank you so much!
    Btw, could you tell me the software to create presentation slides like in the video?

  • @leomysky
    @leomysky 10 months ago

    Wonderful
    Thank you for the video

  • @MichaelKubler-kublermdk
    @MichaelKubler-kublermdk 9 months ago +3

    Something we recently did was use Amazon SQS to push after save and after update tasks to a background processing server.
    This allows things like thumbnail generation or OCR processing of files, complex Multi-document updating (updating one entry will cause fields on multiple other entries to be updated in different ways) and things like updating the Lucene search system, or generating notifications or user activities event stream, all can be done on the backend server moments later instead of during the API call.
    Of course we are using PHP not something like NodeJS where processing after an API response is much easier.

  • @olhoTron
    @olhoTron 3 months ago +1

    It's worth stressing: ALWAYS measure before, during and after an optimization, and check that the performance really improved and the results remained the same. Gut feelings don't work when optimizing software.
    Also think of the system as a whole, sometimes what shows up at the top on the profiler is the symptom and not the cause of the problem
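
The "measure, and check that the results stayed the same" advice can be sketched with the standard-library `timeit` module. The `compare` helper and the two toy functions are hypothetical illustrations, not a substitute for profiling a real endpoint under realistic load.

```python
import timeit
from typing import Any, Callable, Tuple


def compare(
    baseline: Callable[[], Any],
    candidate: Callable[[], Any],
    repeats: int = 5,
    number: int = 100,
) -> Tuple[float, float]:
    """Verify both callables agree, then time each one.

    Returns (baseline_seconds, candidate_seconds), each the best of `repeats`
    runs of `number` calls.
    """
    if baseline() != candidate():
        raise ValueError("candidate changes the result; not a valid optimization")
    best_base = min(timeit.repeat(baseline, repeat=repeats, number=number))
    best_cand = min(timeit.repeat(candidate, repeat=repeats, number=number))
    return best_base, best_cand


def slow_sum() -> int:
    """Toy baseline: explicit Python loop."""
    total = 0
    for i in range(10_000):
        total += i
    return total


def fast_sum() -> int:
    """Toy candidate: same result using the built-in sum."""
    return sum(range(10_000))
```

Calling `compare(slow_sum, fast_sum)` returns the two timings, so the decision rests on numbers rather than intuition.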

  • @attien5392
    @attien5392 6 months ago

    Thank you so much, that's really nice.

  • @TariqSajid
    @TariqSajid 9 months ago +6

    A hidden thing about pagination: be very careful with the total count query. It can take too much time if you run it on every pagination request and you have a large dataset.

    • @razvanandrei2671
      @razvanandrei2671 7 months ago +1

      I think you can avoid that count by using a kind of infinite paging/scrolling and not showing the total number of results (if possible).
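
One common way to drop the count query, sketched here as an assumption rather than something stated in the thread, is to fetch one row more than the page size and use its presence as a "has next page" flag. The `items` table and both function names are hypothetical; SQLite is used only so that the example is self-contained.

```python
import sqlite3
from typing import List, Tuple


def make_demo_db(n: int) -> sqlite3.Connection:
    """Build an in-memory table with n rows for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO items (title) VALUES (?)",
        [(f"item {i}",) for i in range(n)],
    )
    return conn


def fetch_page(conn: sqlite3.Connection, limit: int, offset: int) -> Tuple[List[tuple], bool]:
    """Return (rows, has_next) without running a COUNT(*) query."""
    rows = conn.execute(
        "SELECT id, title FROM items ORDER BY id LIMIT ? OFFSET ?",
        (limit + 1, offset),  # ask for one extra row as a look-ahead
    ).fetchall()
    has_next = len(rows) > limit
    return rows[:limit], has_next
```

The client can then render a "load more" control from `has_next` alone, which fits the infinite-scrolling suggestion above.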

  • @ravipvkiranpv
    @ravipvkiranpv 10 months ago

    Add Streaming to the list.. Great video..

  • @dmitrypichugin7449
    @dmitrypichugin7449 10 months ago +2

    Bulk REST endpoints (controller archetype)
    REST -> GraphQL

  • @pauljohnsonbringbackdislik1469

    Since browsers limit active parallel downloads, there are some cases when batching requests may shave 2-5s from the overall page load (e.g. on pages that tend to load multiple data to compose a report or a summary).
    I wish @ByteByteGo could make a follow-up video with "Top 7 ways to optimize performance suggested by community" :)

  • @sarakhushi23
    @sarakhushi23 7 months ago

    1- Caching: store computation results (e.g. in Redis) so they can be reused later.
    2- Connection Pool-
    3- Avoid N+1 Query problem
    4- Pagination
    5- Lightweight JSON Serialization..
    6- Compression..
    7- Asynchronous logging..

  • @andrewwhitehouse1878
    @andrewwhitehouse1878 10 months ago +1

    This is gold. Your whole channel is gold and the production values are amazing ❤

  • @TheOnlyEpsilonAlpha
    @TheOnlyEpsilonAlpha 5 months ago

    Other Techniques:
    1. Key pool for OAuth operations:
    - To reduce the per-component overhead of continually re-rolling API keys, create a dedicated component that keeps a pool of valid API keys or tokens available, so that the components that work with the API don't have to go through the whole key-management process and can just do their job.
    - Let the dedicated component manage the pool and make sure enough valid entries are always available. Depending on demand, the component can request a larger number of keys.
    - The life cycle of those keys can be managed easily by storing them in a Redis database with EXPIRE set, so that each key is deleted from the pool when it expires in the corresponding API.

  • @nagsworld
    @nagsworld 9 months ago

    Your videos look great, mainly the presentation along with the content. What tool are you using for the animations?

  • @makhaer
    @makhaer 9 months ago +3

    @ByteByteGo could you tell which tool you use to draw these animated diagrams ?

  • @vighnesh153
    @vighnesh153 10 months ago +24

    You could also consider replacing JSON with protobufs. They are super optimized for data transfer between systems.

    • @aiml84
      @aiml84 9 months ago

      protobufs are fire. Have used in building online feature store.

    • @MobiusOne6
      @MobiusOne6 9 months ago +1

      we use both protobufs and thrift in our applications. Both are good options for packaging data to send over a wire. Thrift seems to be easier to debug in java in my experience... but protobuf is arguably more efficient.

  • @arfinexe539
    @arfinexe539 10 months ago +3

    Great video, but that car transition caught me off guard 😂

  • @iSerjioL
    @iSerjioL 5 months ago

    I would move the 3rd technique, "Avoid N+1 Problem", into a separate section on Database Query Optimisation, including proper indexes, query optimisation, use of memory-optimised tables, cluster configuration (DB setup, e.g. using SSDs, temp DB size, transaction isolation level) etc.

  • @juanitoMint
    @juanitoMint 2 months ago

    For logs you can just write to standard output (just 1 file descriptor) and use a log harvester like Filebeat or Fluent Bit to offload to the log backend.

  • @raj_kundalia
    @raj_kundalia 9 months ago

    thank you!

  • @infomaniac50
    @infomaniac50 10 months ago +72

    When doing compression on API responses, make sure you're not exposing yourself to a CRIME attack. CVE-2012-4929

    • @carponneutrality1955
      @carponneutrality1955 5 months ago +7

      This is a long mitigated, 11 year old low-score CVE that only applies to the TLS protocol level and has nothing to do with the content compression discussed here.

  • @CarlosGonzalez-rg6ht
    @CarlosGonzalez-rg6ht 10 months ago

    I think partitioning data in line with the queries to be performed could be a useful method to improve performance.

  • @rbelatamas
    @rbelatamas 9 months ago

    thank you ❤

  • @ukaszkiepas57
    @ukaszkiepas57 8 months ago

    thank you !!!!

  • @sijilo
    @sijilo 10 months ago +1

    Nice 👍

  • @jacob_90s
    @jacob_90s 8 months ago +6

    One issue to look out for with pagination when using TOP/MAX and OFFSET is changes that occur between page requests and how they affect the order of the data. I've worked with several APIs where a record on page 1 would be changed after I had already accessed that page, which changed the order of all the results, so when I grabbed the next page, the results had shifted and I would miss some of them.

    • @krishnashetty9388
      @krishnashetty9388 8 months ago

      How to overcome this?

    • @hungluu902
      @hungluu902 7 months ago

      @krishnashetty9388 I think saving the last item's id and using it for the next page query should do the trick, or maybe better, cursor pagination!
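
The "remember the last item id" suggestion can be sketched as keyset pagination. This is a minimal sketch under assumptions: the `comments` table and the `fetch_after` function are hypothetical, ids are assumed to be positive and immutable, and SQLite is used only to keep the example runnable. It helps only when the sort key is a stable unique column; ordering by a mutable column can still reshuffle rows between requests.

```python
import sqlite3
from typing import List, Optional, Tuple


def fetch_after(
    conn: sqlite3.Connection, last_id: Optional[int], limit: int
) -> Tuple[List[tuple], Optional[int]]:
    """Return the next page after `last_id` plus the cursor for the page after it."""
    rows = conn.execute(
        "SELECT id, body FROM comments WHERE id > ? ORDER BY id LIMIT ?",
        (last_id if last_id is not None else 0, limit),  # 0 means "from the start"
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None  # id of the last row served
    return rows, next_cursor
```

Because the next page is defined relative to the last id actually served, deleting or inserting earlier rows does not shift the window the way a numeric offset does.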

  • @systemBuilder
    @systemBuilder 28 days ago

    Connection pooling can not only increase throughput; often more importantly, it reduces latency.
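
The latency benefit comes from skipping connection setup on each request. A minimal sketch of the idea follows, with the caveat that the `ConnectionPool` class is hypothetical and uses SQLite purely so the example is self-contained; production code would normally rely on the pool shipped with the database driver or framework.

```python
import queue
import sqlite3
from contextlib import contextmanager
from typing import Iterator


class ConnectionPool:
    """Hypothetical fixed-size pool that hands out pre-opened connections."""

    def __init__(self, database: str, size: int) -> None:
        self._idle: "queue.Queue[sqlite3.Connection]" = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection be used from another thread;
            # the pool guarantees only one borrower holds it at a time.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 5.0) -> Iterator[sqlite3.Connection]:
        """Borrow a connection and always return it to the pool afterwards."""
        conn = self._idle.get(timeout=timeout)  # raises queue.Empty if exhausted
        try:
            yield conn
        finally:
            self._idle.put(conn)

    def close(self) -> None:
        """Close every idle connection."""
        while not self._idle.empty():
            self._idle.get_nowait().close()
```

A request handler would wrap its queries in `with pool.connection() as conn:`. The pool size is the tuning knob mentioned earlier in the comments: too small and requests queue up, too large and the database is overloaded.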

  • @LNSFLIVE
    @LNSFLIVE 6 months ago

    What are you using to make your visualizations? The animations are great.

  • @khari_baat
    @khari_baat 9 months ago +1

    Which efficient JSON serializer are we talking about? Can you please suggest some?

  • @JIANGNIANHANG
    @JIANGNIANHANG 9 months ago +1

    I have 2 questions:
    5. JSON Serialization
    Which JSON library should be used?
    Do JSON libraries really differ that much in performance?
    7. Asynchronous logging
    Can the logging method really improve API performance?
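
On the logging question: the gain comes from the request thread only enqueueing a record while a separate thread does the slow write. A minimal sketch using the standard library's `QueueHandler` and `QueueListener` follows; the `make_async_logger` helper is a hypothetical name, and the target handler is whatever slow sink (file, network) the application already uses.

```python
import logging
import logging.handlers
import queue
from typing import Tuple


def make_async_logger(
    name: str, target: logging.Handler
) -> Tuple[logging.Logger, logging.handlers.QueueListener]:
    """Return a logger whose records are written by a background thread."""
    log_queue: "queue.Queue[logging.LogRecord]" = queue.Queue(-1)  # unbounded buffer
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    # The request thread only pays for putting the record on the queue.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # The listener thread drains the queue and calls the slow target handler.
    listener = logging.handlers.QueueListener(log_queue, target)
    listener.start()
    return logger, listener
```

Calling `listener.stop()` at shutdown flushes what is still queued. Records still in the queue are lost if the process dies before that, which is the trade-off of buffering.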

  • @danregep4646
    @danregep4646 9 months ago

    What about avoiding memory copies and serialization overhead by using an optimized data-serialization format (Thrift, Protobuf, etc.)?

  • @efnobrega
    @efnobrega 3 months ago

    Hi there! Please let me know what tool you use to create the videos' graphics/animations. Thanks!

  • @HappyRogue-fs2jd
    @HappyRogue-fs2jd 7 months ago

    Amazing

  • @orisueXtriumvir
    @orisueXtriumvir 7 months ago

    What techniques can we use for an endpoint that handles file download requests where the data content can change for each request?

  • @Mo-bs7ct
    @Mo-bs7ct 10 months ago +3

    You may need indexing to speed up queries

  • @BhaveshAgarwal
    @BhaveshAgarwal 9 months ago

    ByteByteGo - please please share the tools and softwares you use to create these wonderful videos. It will be extremely helpful to learn them and use it for work and share knowledge in general. Thanks in advance!

  • @Achrafsouk
    @Achrafsouk 10 months ago +2

    CDNs like CloudFront also can help with connection pooling.

    • @sergeibatiuk3468
      @sergeibatiuk3468 9 months ago +1

      What kind of connection pooling are you referring to?

    • @Achrafsouk
      @Achrafsouk 9 months ago

      @sergeibatiuk3468 Connection pools are maintained from the CDN servers to the API servers, leading to improved performance overall.

  • @Ashirgaziyev
    @Ashirgaziyev 5 months ago

    Hey guys, I know this might be a silly question, but what would be good learning material for a newbie backend developer?
    I am considering switching from mobile development to backend development asap. Thanks a lot!

  • @sergeibatiuk3468
    @sergeibatiuk3468 9 months ago +2

    Use non-blocking architecture

  • @nikhilgoyal007
    @nikhilgoyal007 3 months ago

    love!

  • @VaibhavShewale
    @VaibhavShewale 9 months ago +1

    well that was too informative

  • @robotmama259
    @robotmama259 10 months ago +3

    super awesome sir. I am your big fan

  • @SoulsExpert
    @SoulsExpert 4 months ago

    awesome

  • @diegofelipe91
    @diegofelipe91 7 months ago

    In MongoDB it's not good to use the skip-and-limit approach to pagination, especially when you're paginating over a huge number of documents; a find-and-limit approach is more suitable.

  • @user-ns2fz1tl9s
    @user-ns2fz1tl9s 9 months ago

    Pagination can be very difficult and confusing on some databases. For example, Postgres has to read all previous pages to read page 101. So on really big datasets, plain limit/offset only leads to problems.

    • @robertpiosi
      @robertpiosi 9 months ago +5

      Offset on indexed column will jump you right to the place.

  • @hemantpanchal8087
    @hemantpanchal8087 9 months ago

    Can someone please help me with the scenario below?
    E.g. my program has more than 1.5 lakh (150,000) employee records in the database, and this data won't change very frequently.
    Is it a good idea to publish a JSON file for each individual employee on Azure Blob Storage/S3, and then have the API read from there and display the data in the front end?
    I would like to know whether having 1.5 lakh JSON files on Azure Blob Storage will impact read performance.

  • @myrondai
    @myrondai 8 months ago

    Use uncrowded resources for crowded resources: computing for memory/bandwidth (compression), memory for computing (Cache);
    reuse objects;
    Avoid unnecessary calculations;
    Question: Why do people want to do pagination? "User interface has limited place to display data, so we only fetch what we needed. " is one explanation. Anything else?

  • @user-xe9on9yr7c
    @user-xe9on9yr7c 9 months ago

    What about binary optimization and grpc?

  • @aravind.a
    @aravind.a 10 months ago

    Hi Team, can you please explain serialization in detail?
    Does it mean - instead of sending the json / xml it is better to send as string ?

    • @ariseyhun2085
      @ariseyhun2085 10 months ago +4

      It just means, try to use a fast library for serialising/deserialising json data.
      If you use python, there might be multiple libraries that do this, but choose the fast one, since there can be some slow ones

    • @aravind.a
      @aravind.a 10 months ago +1

      @ariseyhun2085 Thanks for the explanation 🙏👋

    • @vaughnhelmer4219
      @vaughnhelmer4219 10 months ago

      Often garbage collected languages such as Python, JavaScript, and Java rely on a library written in a lower level language such as C to implement faster serialization. I would perhaps have ranked this first or second along with connection pooling. I believe these are examples of optimizations which are never premature.

  • @ColossalMcBuzz
    @ColossalMcBuzz 7 months ago

    Tip #8: Ensure you're using the right data structures and/or algorithms for the endpoints.

  • @DJenriqez
    @DJenriqez 2 months ago

    8. Binary serialization (if possible)

  • @Rscnry99
    @Rscnry99 10 months ago

    @bytebytego what app do you use for your animations?

  • @abhishekdhiman5719
    @abhishekdhiman5719 2 months ago

    Title: "Top 7 Ways to 10x Your API Performance"
    Caching: Store results of expensive computations to avoid repeating them.
    Connection pooling: Reuse database connections instead of opening new ones for each API call.
    Avoid N+1 queries: Fetch data in a single query or two, instead of multiple queries for related entities.
    Pagination: Break large data responses into smaller pages using limit and offset parameters.
    Lightweight JSON serializers: Use fast libraries to minimize conversion time to JSON format.
    Compression: Reduce data size transferred over the network by enabling compression on large responses.
    Asynchronous logging: Improve performance by placing log entries in a buffer for separate logging thread to write.

  • @nixonrod
    @nixonrod 10 months ago

    master slave | partitioning | sharding techniques for improving database performance.

  • @TomDoesTech
    @TomDoesTech 10 months ago +1

    Anyone know how he makes these animations?

  • @amancca
    @amancca 10 months ago +3

    Image compression is one of my favorite one. My API can reduce image from 5MB to 10kb 😀

    • @alexeibrinza2719
      @alexeibrinza2719 9 months ago

      I wonder why do you compress images, if they're already compressed? For example jpeg image with 5MB after gzip compression will still be around 5MB.

    • @IAMGregEVA
      @IAMGregEVA 8 months ago

      @alexeibrinza2719 gzip will not really compress images. He is referring to image-specific compression, which is applied once and persisted, not general per-transfer compression.
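
The point that gzip helps text payloads but not already-compressed data can be checked with a small experiment. This is a sketch under assumptions: the JSON records are made up, and random bytes stand in for an already-compressed file such as a JPEG, since both look like noise to a general-purpose compressor.

```python
import gzip
import json
import os


def gzip_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Repetitive JSON, typical of list endpoints, compresses well.
json_payload = json.dumps(
    [{"id": i, "status": "active", "role": "member"} for i in range(1000)]
).encode("utf-8")

# Random bytes stand in for already-compressed data such as a JPEG.
opaque_payload = os.urandom(100_000)

if __name__ == "__main__":
    print(f"JSON ratio:   {gzip_ratio(json_payload):.2f}")
    print(f"opaque ratio: {gzip_ratio(opaque_payload):.2f}")
```

This suggests enabling response compression for JSON and other text formats and skipping it for media that is already compressed, where it only costs CPU time.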

  • @_sk_videos
    @_sk_videos 8 months ago

    Any suggestions on how to improve an API that returns a large amount of data, 10-200MB?

  • @XuLilu
    @XuLilu 4 months ago +1

    Why all pages are page 1 at 3:46?

  • @lychenus
    @lychenus 6 months ago

    u r from hong kong

  • @helloworld4872
    @helloworld4872 9 months ago

    how about object pooling

  • @Zibul444
    @Zibul444 2 months ago

    🔥🔥🔥🔥

  • @craumm
    @craumm 9 months ago

    Can you share some fast JSON serialization libraries for Java?

    • @DavisTibbz
      @DavisTibbz 9 months ago +1

      1. Gson, 2. FastJSON, 3. Jackson

  • @phoenicianathletix2866
    @phoenicianathletix2866 10 months ago +1

    Can Payload Compression be used while using Web sockets?

  • @robl39
    @robl39 10 months ago +1

    Fun fact: doing pagination killed performance at the database level for my product. We were using OFFSET/FETCH in SQL Server and we quickly found out that once a given table reaches a certain size, it slows way down. To solve this we introduced a “bookmark” methodology that doesn’t need to perform a table scan.

    • @AzharUddin-ob7vb
      @AzharUddin-ob7vb 10 months ago

      Please explain your bookmark technique or point to a reference.
      I'm interested in knowing about it.

    • @arpanghoshal2579
      @arpanghoshal2579 10 months ago +1

      Offset/limit-based pagination in the DB does not scale well. For example, if you want to access the data after offset 1000, the DB still has to scan the first 1000 records; it does not jump directly to the 1000th record. Instead, use cursor-based pagination, which overcomes this problem by using a WHERE clause in the query to avoid scanning unneeded data.

    • @AzharUddin-ob7vb
      @AzharUddin-ob7vb 10 months ago

      @arpanghoshal2579
      This is what I have understood; see the example below:
      SELECT * FROM table WHERE index >= 1000 LIMIT 10
      Doing this requires an extra column named index, right?

    • @alexeibrinza2719
      @alexeibrinza2719 9 months ago

      You may also consider cursor pagination, which is noticeably faster than offset pagination but only allows you to scroll through the results forwards and backwards, without jumping to an arbitrary page.

    • @Winnetou17
      @Winnetou17 9 months ago

      @AzharUddin-ob7vb I had this exact problem with a big table in MySQL several years ago. For example, doing
      SELECT whatever FROM table_name WHERE id > 433453 LIMIT 100 OFFSET 1000000;
      This killed the performance. The id column was a clustered index (PRIMARY KEY), so MySQL knew to jump to that id immediately, but it couldn't apply the offset as a direct jump as well; it had to walk through all of those one million rows/ids. So the higher the offset, the slower it ran (and that table had between 25 and 35 million rows). The job was to export basically the whole table, in order, resuming from where it left off, and we had small export batches. Once I changed the job to remember the last id, I changed the query to simply be
      SELECT whatever FROM table_name WHERE id > 124342343 LIMIT 100;
      And voilà! The speed was back. No matter the id, whether I was at the beginning, middle or end of the table, the query was fast, basically in constant time.
      So the lesson is: watch out when using OFFSET. Some indexes allow "jumping", but some don't, and in those cases high offset values are a performance killer.

  • @globalcitizen123-hl3dv
    @globalcitizen123-hl3dv 10 months ago

    wait I had no idea you wrote the system design books hahaha damn, I'm stupid. I'm a technical program manager that works with Platform/Infra teams (but don't really have a background in either lol) so I used these books as reference. very useful.

  • @santosh_bhat
    @santosh_bhat 4 months ago

    Someone please please tell me how to make such animations?

  • @114dev8
    @114dev8 9 months ago

    Hey guys, I was wondering if anyone can propose some good database architectures that can help to improve.

    • @DavisTibbz
      @DavisTibbz 9 months ago

      Connection pooling. Other specifics depend on your programming language or framework. You will need an efficient connection pool library.

  • @zakraw
    @zakraw 10 months ago +3

    Surprised you didn't mention Eager vs Lazy Loading, Database indexing, Breaking excessive SQL queries with multi-level subqueries into smaller ones.

  • @jwbonnett
    @jwbonnett 10 months ago

    JSON is the old de facto standard; Protobuf is the new one.

  • @LCTesla
    @LCTesla 8 months ago

    5:20 kinda kills the idea for me... that's when logs are most important

  • @jon9103
    @jon9103 3 months ago

    Paging is often poorly implemented in practice because the ordering of items does not stay consistent between page loads leading to some items being missed and others duplicated and a jarring UX.

  • @b3arwithm3
    @b3arwithm3 2 months ago +1

    Who does n+1 queries? They don't know how to write SQL queries?

    • @Nkdeveloper
      @Nkdeveloper 1 month ago

      Usually happens to those who use ORM. Like Django will do this automatically if you didn’t write the query properly.

    • @b3arwithm3
      @b3arwithm3 1 month ago

      @Nkdeveloper I haven't used Django, but all the ORMs I have played with have a query language that maps the result set back to objects. We don't have to call get() in a loop.
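
To make the pattern under discussion concrete, here is a sketch of the same lookup done as N+1 queries and as a single query. The `posts` and `comments` tables and all function names are hypothetical, and raw SQL on SQLite stands in for whatever an ORM would generate.

```python
import sqlite3
from typing import Dict, List


def make_blog_db() -> sqlite3.Connection:
    """Create a tiny in-memory schema with two posts and three comments."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
        INSERT INTO posts (id, title) VALUES (1, 'first'), (2, 'second');
        INSERT INTO comments (post_id, body) VALUES (1, 'a'), (1, 'b'), (2, 'c');
        """
    )
    return conn


def comments_n_plus_one(conn: sqlite3.Connection) -> Dict[int, List[str]]:
    """One query for the posts, then one more query per post (N+1 in total)."""
    result: Dict[int, List[str]] = {}
    for (post_id,) in conn.execute("SELECT id FROM posts ORDER BY id").fetchall():
        rows = conn.execute(
            "SELECT body FROM comments WHERE post_id = ? ORDER BY id", (post_id,)
        )
        result[post_id] = [body for (body,) in rows]
    return result


def comments_single_query(conn: sqlite3.Connection) -> Dict[int, List[str]]:
    """Same data from one LEFT JOIN, grouped in application code."""
    result: Dict[int, List[str]] = {}
    rows = conn.execute(
        "SELECT p.id, c.body FROM posts p "
        "LEFT JOIN comments c ON c.post_id = p.id ORDER BY p.id, c.id"
    )
    for post_id, body in rows:
        bucket = result.setdefault(post_id, [])
        if body is not None:  # posts without comments yield a NULL body
            bucket.append(body)
    return result
```

Both functions return the same mapping; the difference is the number of round trips, which grows with the number of posts in the first version and stays at one in the second.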

  • @VadimFilin
    @VadimFilin 10 months ago +4

    do not use offset! use cursor pattern instead

    • @vaughnhelmer4219
      @vaughnhelmer4219 10 months ago +1

      Cursors are stateful. Each pattern has its advantages.

  • @Nephtys1
    @Nephtys1 10 months ago

    This is missing the number 0 of optimization: deleting your api / get rid of it entirely. Nothing is faster or cheaper than no code.
    Contrary to popular belief, this is most of the times the first and best optimization.
    Last year I optimized a specific 'Microservice' away by moving its 'logic' (nearly nothing really useful) to the client. Decreased latency of the whole feature by 80%, and decreased total LOC.
    Always try to localize operations or data if your goal is speed and efficiency.

    • @mutexin
      @mutexin 10 months ago +1

      OMG. Don’t do that. Client side can never be trusted. The most important rule in client-server architecture: never rely on the client.

    • @Nephtys1
      @Nephtys1 10 months ago

      @mutexin Actually, only the server should never trust the client. The client itself can and should trust itself except in extremely rare edge cases, even with hardware faults at play.

    • @mutexin
      @mutexin 10 months ago

      @Nephtys1 It looks like you don’t understand why the client cannot be trusted. The client is fully under the user's control. The user can inspect and alter it however they wish.

    • @Nephtys1
      @Nephtys1 9 months ago

      @mutexin Yes, and? I can even use different git clients and still push to the same server.
      Clients may be malleable, but normally they are under the control of the user.
      I'm afraid you're missing the core concepts of distributed computing.
      You only care for your own data integrity.
      I'm not requiring you to scan your whole device and send me all your data on your disk whenever you visit my website. Instead you get HTML with everything I want in it and I don't care what you do with it. And if you enter data into a form and send it to me, I'm only verifying the data that is sent to me. I'm not verifying your whole device.
      Point taken, anti-cheat software for gaming does exactly that. It scans your whole device. But that is because game util development is kind of bad.
      For most other things there is reason.

    • @mutexin
      @mutexin 9 months ago

      @Nephtys1 No, no, no. You were talking about getting rid of the API, which is on the server side; you were talking about moving server logic from microservices to the client. Don’t pretend now that you were talking about client-side logic and data.
      I know it must be hard to admit that you were wrong but it’s better for everyone.

  • @andrasczinege
    @andrasczinege 10 months ago

    Thanks for the awesome video. I use Azure Functions that often read from and write to a database, and I am wondering: what if I create a connection pool in SQLAlchemy and bind it to an engine on startup of a Python Azure Function? Every time a function is called, Azure Functions starts a new thread to handle the request, and those threads could use that connection pool, which is said to be thread-safe. This way I could reuse connections even though I am running serverless Python Azure Functions. Is that not possible? @ByteByteGo

  • @rollinOnCode
    @rollinOnCode 9 months ago

    The N+1 query problem can be huge and tends to happen when you've got scope creep.