28: Bidding Platform (eBay) | Systems Design Interview Questions With Ex-Google SWE

  • Added Sep 6, 2024

Comments • 80

  • @hotnonsense5892
    @hotnonsense5892 2 months ago +11

    Hey Jordan, just wanted to say I just landed an E5 role at Meta and your videos were a big part of helping me get past the system design interview. It wasn't the only resource I used, but it was definitely a huge help, especially early on in my study when it helped me fill a lot of holes in my knowledge. Thanks so much for producing these videos!

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago +1

      Woo!! Congrats man and good luck in the new role!

    • @aishwaryakala9653
      @aishwaryakala9653 2 months ago

      Congrats! Mind sharing your resources and learning path so they could help others as well?

    • @hotnonsense5892
      @hotnonsense5892 2 months ago +5

      ​@@aishwaryakala9653
      Hello Interview blogs and videos are great. They also help with getting the logistics of how to structure and pace the interview.
      Various other random YT videos on topics I felt I didn't understand. I watched System Design Fight Club vids, but the videos are so unfocused and long-winded you really have to watch them at high speed and skip over large parts.
      DDIA is awesome. Although I can't directly attribute anything I did in any interview to it (there usually isn't enough time), I do think it helped reinforce and internalize a lot of ideas I'd only skimmed. Kleppmann is also a great, engaging writer.
      I read Alex Xu's books and... idk personally I kind of disliked them and found some of his understanding of some systems a bit questionable, but maybe that's just me. The writing isn't always great and it's a slog to get through some parts. He needed better editing and proof-reading imo. A lot of people like them though.
      Even for material that you don't agree with, reading it can still help because it stimulates your mind to think about what's wrong with it and come up with your own better solution.
      But more important than any specific resource... practice! At least for Meta you have only 35 to 40 minutes after accounting for introductions and questions at the end. Write down checkpoints/milestones for how many minutes you can spend on each step of your system design and practice consistently hitting them. Being able to come up with a great design in 2 hours is pointless because they will cut you off long before then.
      Practice with a random design prompt every day and a timer and make sure you're hitting your time checkpoints. Also consider that in the real interview you will probably get interrupted sometimes.
      Also if you can afford it, consider some paid mocks.

  • @MahdiBelbasi
    @MahdiBelbasi 8 hours ago

    Dude, this was awesome. Well done on the design and the explanation.

  • @mayuragrawal4352
    @mayuragrawal4352 1 month ago +1

    Great in-depth video. Loved that the unnecessary or commonplace parts of the design aren't dwelled on, only the critical parts: scale, concurrency, the ordering problem, and achieving atomicity without transactions. Thanks Jordan. I watch your videos only for system design; no need to go anywhere else.

  • @Ambister
    @Ambister 2 months ago +7

    You look like Marv from Home Alone after the iron fell on his face. Love the vids tho

  • @jporritt
    @jporritt 1 month ago +3

    Hi Jordan. Kafka ordering is only guaranteed within a partition. To get Kafka to guarantee a total ordering, therefore, you would need to configure a topic to have only one partition.

    • @jordanhasnolife5163
      @jordanhasnolife5163 1 month ago +1

      Yeah, we partition by auction id here.

    • @jporritt
      @jporritt 1 month ago

      @@jordanhasnolife5163 Ah…sorry, yes!

    • @akshayyn
      @akshayyn 1 month ago +1

      @@jordanhasnolife5163 Correct me if I am wrong, but we also need only one consumer with a single thread listening to a partition to ensure order, right?

    • @jordanhasnolife5163
      @jordanhasnolife5163 1 month ago

      @@akshayyn Typically in Kafka you're not interleaving which entries go to which consumer; each consumer tends to read all entries in its partition.
      But yes, technically events could be processed out of order if you did do that.
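
A minimal sketch of the partitioning point made in this thread, assuming the kafka-python client and a topic named "bids" (both illustrative, not from the video): keying every bid by its auction id sends all bids for one auction to the same partition, so Kafka preserves their relative order.

    # Sketch: per-auction ordering by keying on auction_id (illustrative names).
    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        key_serializer=str.encode,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    def publish_bid(auction_id: str, user_id: str, amount: int) -> None:
        # All bids with the same key hash to the same partition, so their
        # relative order within that auction is preserved by Kafka.
        future = producer.send(
            "bids",
            key=auction_id,
            value={"auction_id": auction_id, "user_id": user_id, "amount": amount},
        )
        # Block until the broker acks, so a successful return means the bid is durable.
        future.get(timeout=10)

On the consumer side, keeping a single consumer (and a single processing thread) per partition preserves that order during processing, which matches the reply above.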

  • @user-in1ve1co9l
    @user-in1ve1co9l 16 days ago +1

    Great content as always, thanks! Some questions if I may: 1) On the state machine replication for the bid engine backup, isn't it almost like a two-phase commit? And because the user has to wait for the backup write/replication confirmation, how is it going to be fast, or at least faster than Kafka? If you put the same in-memory bid engine behind Kafka, wouldn't it be similar or even better? 2) On handling the hot-bid part, I see that although we can scale the readers (getting the current bidding info), the bidding engine, most likely a single machine, is still the bottleneck and arguably does most of the heavy lifting: acquiring the lock, deciding the winning logic, sending to the backup/Kafka, etc.

    • @jordanhasnolife5163
      @jordanhasnolife5163 15 days ago +1

      Yes, it basically is a two-phase commit, except the message only has to get to Kafka (I'm using a successful write there as an indication that the write is durable), as opposed to being fully replicated in memory downstream.
      2) Yeah, but there's not really a solution to this. If you need multiple things to choose an absolute ordering over them, they've gotta go to one place. We could buffer in many different Kafka queues and have the bid engine poll at the rate it's able to, but then this whole process becomes asynchronous, and I had wanted it to be synchronous.

    • @ekinrf
      @ekinrf 15 days ago +1

      @@jordanhasnolife5163 Not sure I fully get the argument for a synchronous process though. As long as the throughput is high (and arguably an async architecture might be better here), user experience would not be affected?

    • @jordanhasnolife5163
      @jordanhasnolife5163 13 days ago +1

      @@ekinrf Perhaps not. Being "synchronous" is really an illusion over an asynchronous network anyways.

  • @soumik76
    @soumik76 1 month ago +2

    Hey, thanks for this. I was wondering if Redis would be a good choice to run the bid engine on, given it's atomic and single-threaded. Custom Lua scripts could have the logic to push things to Kafka. Do you see a downside to this?

    • @jordanhasnolife5163
      @jordanhasnolife5163 1 month ago +1

      I think if you can make what you said work that seems pretty feasible!
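
A minimal sketch of the Redis idea floated above, assuming the redis-py client; key names like "auction:{id}:high_bid" are illustrative. Note that a Lua script inside Redis cannot write to Kafka itself, so this sketch appends accepted bids to a Redis list that a separate forwarder process could drain into Kafka.

    # Sketch: atomic accept/reject of a bid via a Redis Lua script.
    # Redis runs scripts single-threaded, so the check-and-set has no races.
    import redis

    r = redis.Redis(host="localhost", port=6379)

    PLACE_BID = r.register_script("""
    local high = tonumber(redis.call('GET', KEYS[1]) or '0')
    local bid  = tonumber(ARGV[1])
    if bid > high then
      redis.call('SET', KEYS[1], ARGV[1])
      redis.call('RPUSH', KEYS[2], ARGV[2])  -- queue accepted bid for a Kafka forwarder
      return 1
    end
    return 0
    """)

    def place_bid(auction_id: str, user_id: str, amount: int) -> bool:
        payload = f"{auction_id}:{user_id}:{amount}"
        accepted = PLACE_BID(
            keys=[f"auction:{auction_id}:high_bid", f"auction:{auction_id}:log"],
            args=[amount, payload],
        )
        return accepted == 1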

  • @Ayshhg
    @Ayshhg 10 days ago +1

    Hey Jordan,
    Why don't we use something like the transactional outbox pattern for writing to Kafka?
    Instead of the broker listening to the DB, I can have an outbox table whose job is to send the data to Kafka, and this will achieve the dual write with very high throughput.

    • @jordanhasnolife5163
      @jordanhasnolife5163 8 days ago

      Because then I have to write to disk first which lowers my throughput. If this table is in memory then by all means go for it
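
For readers unfamiliar with the pattern named above, here is a minimal sketch of a transactional outbox, using sqlite3 and kafka-python purely for illustration (table and topic names are made up). The bid row and its outbox row commit in one transaction, and a relay drains the outbox into Kafka; the extra disk write is exactly the throughput cost mentioned in the reply above.

    # Sketch: transactional outbox (illustrative schema and topic).
    import json
    import sqlite3
    from kafka import KafkaProducer

    db = sqlite3.connect("bids.db")
    db.execute("CREATE TABLE IF NOT EXISTS bids (auction_id TEXT, user_id TEXT, amount INTEGER)")
    db.execute("CREATE TABLE IF NOT EXISTS outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)")

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    def write_bid(auction_id: str, user_id: str, amount: int) -> None:
        payload = json.dumps({"auction_id": auction_id, "user_id": user_id, "amount": amount})
        with db:  # one transaction: the bid and its outbox row commit (or roll back) together
            db.execute("INSERT INTO bids VALUES (?, ?, ?)", (auction_id, user_id, amount))
            db.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))

    def relay_outbox() -> None:
        # Run periodically: push pending outbox rows to Kafka, then delete them.
        rows = db.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
        for _row_id, payload in rows:
            producer.send("bids", value=payload.encode("utf-8"))
        producer.flush()
        with db:
            db.executemany("DELETE FROM outbox WHERE id = ?", [(row_id,) for row_id, _ in rows])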

  • @youaresowealthy7333
    @youaresowealthy7333 1 month ago +1

    If the interview is to design a read-heavy, write-light system, then sync is fine. But for heavy write and read, it can be dangerous/hard to sell this sync design; the sync parts should only be doing early rejection, while the surviving bids need to be queued and processed asynchronously by a single node eventually. It's more like a flash sale problem, where rejection starts at the client/browser.

    • @jordanhasnolife5163
      @jordanhasnolife5163 1 month ago

      I agree that putting all bids in kafka and processing them asynchronously is the better way to go here.

  • @rr5349
    @rr5349 29 days ago +1

    I think I missed it (or maybe I keep tuning out at the right time), but where is the actual collision case discussed, where two people submit the same bid price at the exact same time (and how is it determined who wins)? I understand the "we lock each bid coming in -> increment sequence -> ship to primary/backup engine for state, Kafka for source of truth", but wouldn't the two concurrent writes get the same sequence id? I guess what I'm asking is what the collision principle is there. Either way, phenomenal video. I'm having a great time binging these before my interview weh.

    • @jordanhasnolife5163
      @jordanhasnolife5163 29 days ago +1

      What does locking accomplish? We aren't "locking bids"; each bid must successfully grab a lock so we can assign it a sequence number.

    • @rr5349
      @rr5349 28 days ago +1

      @@jordanhasnolife5163 Ah, I see the difference there. So we are locking around the sequence number to avoid writing two of the same sequence ids? I can see it from a logistical perspective, but where does the fairness aspect come into play? If we have two bids at the exact same time, then wouldn't the winner be whoever acquires the lock first? I guess if it isn't up to us to decide who wins the tie-breaker (for the sake of this exercise), and it's just "get lucky and your bid is accepted before the other one because you got the lock first", then I get how it is expected to work. Thanks for the follow-up!

    • @jordanhasnolife5163
      @jordanhasnolife5163 28 days ago +1

      @@rr5349 Yeah I don't see how you can ever be "fair" unless one party decides which thing gets there first. You can't trust distributed timestamps.

  • @LUN-bo2fb
    @LUN-bo2fb 2 months ago +1

    Jordan, how do you feel about the delivery framework that's suggested by a vast number of YouTubers?
    Functional and non-functional requirements -> back-of-envelope estimation -> list core entities -> API design -> high-level design -> deep dive.
    I found you have your own style of delivery, and it is still smooth.

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago +1

      I used to do this in the 1.0 playlist. I found that it wasted a lot of time for me, and I think that most people are pretty capable of figuring out the APIs that you need fairly quickly. The high level design to deep dive is something I've considered doing more, which is why I tend to have these overview slides. I don't think giving a high level design without some initial discussion first makes much sense to me.

    • @LUN-bo2fb
      @LUN-bo2fb 2 months ago +1

      @@jordanhasnolife5163 Yes, I think the high-level design always ends up in a very similar form if you do it without discussing how you are going to handle and store data.
      I am still deciding whether I should draw a high-level diagram in my upcoming interview or not. I did this in a mock interview, and the interviewer still eventually asked me what the data model looks like and how I handle race conditions.
      I think putting the high-level design after presenting the data model, as a quick summary of the discussion so far, may be a good idea.

  • @tobiadeoye1439
    @tobiadeoye1439 1 month ago +1

    Very nice video, Jordan.
    One question I have is: how do we accurately restore the Auction state on the server if both the Bid Engine and Backup Engine go down?

    • @jordanhasnolife5163
      @jordanhasnolife5163 1 month ago

      MOAR BACKUPS
      Beyond a point, everything can fail. There's no way to guarantee fault tolerance against everything, but within reason hopefully.

  • @nikhilm9494
    @nikhilm9494 2 months ago +2

    Hey Jordan, Do you plan on bringing back the low level system design videos anytime going forward?

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago +1

      At least for the time being, I'm planning to stick with distributed systems design. That being said, I'm sure I'll eventually fall into a rut, and once I do I may revisit these!

  • @golabarood1
    @golabarood1 2 months ago +1

    Thanks Jordan for an amazing upload as always!
    A question from the last slide -
    - What is the use of the Auction DB? Is it just for the bidding engine to read and write some state that it needs? It looks like only the bidding engine is interacting with it.

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      Yeah, we basically need it to write the end-of-auction result, as well as to query for existing auctions when users want to bid.

  • @manojgoyal-y3k
    @manojgoyal-y3k 23 days ago +1

    Hi Jordan,
    I believe that while a bid is being processed, we keep the auction's price updated with the highest bid. What if we also keep the winning bid id in the same auction table of the auction DB?
    Now, if we keep the auction DB replicated using leader-follower with a consensus algorithm, then we won't need the backup engine at all, even if the (now stateless) bid engine goes down.
    Will this method work?

    • @jordanhasnolife5163
      @jordanhasnolife5163 22 days ago

      Yeah for sure. It just now takes a full consensus write to submit a bid, so your throughput goes down quite a bit.

  • @AkritiBhat
    @AkritiBhat 2 months ago +1

    One quick question: how would we handle hot auctions near the deadline, when there are too many bids? In that case, we might need to have Kafka before our bidding engine.

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      Agreed, we lose any request-response ability for our requests, but what can ya do.

  • @perfectalgos9641
    @perfectalgos9641 2 months ago +1

    At 31:28, why do we need a Bid Gateway behind the LB? I think it should be the other way around: Bid Gateway first, and then an LB for the Bid Engine.

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      Technically you'd want one for both. The bid gateway is likely going to be run by many servers which we can round robin to. Finding the right bidding engine will depend on the id of the auction.
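
A minimal sketch of the routing described in this reply: the load balancer can round-robin across gateway instances, but each gateway picks the bid engine deterministically from the auction id. The engine addresses and the use of md5 here are illustrative assumptions, not from the video.

    # Sketch: mapping an auction id to a bid engine instance.
    import hashlib

    BID_ENGINES = ["bid-engine-0:7000", "bid-engine-1:7000", "bid-engine-2:7000"]

    def engine_for_auction(auction_id: str) -> str:
        # Use a stable hash (not Python's randomized hash()) so every gateway
        # instance maps the same auction to the same engine.
        digest = hashlib.md5(auction_id.encode("utf-8")).hexdigest()
        return BID_ENGINES[int(digest, 16) % len(BID_ENGINES)]

    # engine_for_auction("auction-123") always returns the same engine address,
    # so all bids for that auction are serialized on one machine.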

  • @mdasifqureshi
    @mdasifqureshi 2 months ago +3

    One thing I want to point out is that if the primary goes down, the backup server will have to wait to catch up with all the Kafka messages in its partition before it can start serving requests. So we can have some unavailability, which is OK here since we are trading it for consistency.
    Another thing is that since Kafka already guarantees ordering within a partition, I don't think we need the sequence numbers. Since the bid engine evaluates the bids in serial order through a critical section and also persists messages to Kafka in that critical section, the ordering of messages in Kafka will be consistent with the sequence numbers, making them redundant. @jordan let me know if I missed something.

    • @mdasifqureshi
      @mdasifqureshi 2 months ago +2

      On reviewing the pseudocode, it seems like we are writing to Kafka in a background thread. I don't think that's feasible, as it'll violate our durability guarantees. I think we have to persist to Kafka before we acknowledge a bid.

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      1) Totally agree that you'd have to wait for the backup to read all Kafka messages. You could mitigate this by sending messages to the backup first and having it put them in Kafka, but there are tradeoffs there. The sequence numbers were just for the case where we publish to Kafka on another thread.

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      As for the separate Kafka thread thing, I registered a handler for when we get the ack, at which point we return to the user. But then is it truly synchronous?

    • @mdasifqureshi
      @mdasifqureshi 2 months ago

      @@jordanhasnolife5163 Got it. Though I am having trouble understanding what the benefit of having a separate thread to publish messages is.
      I don't think we can use multiple background threads to publish to Kafka, because that would mess up ordering and can create inconsistency, e.g. sequence number 5 is persisted while the primary died before persisting sequence number 4. And since sequence number 4 was used in the determination of sequence number 5's accepted/rejected status, this would lead to inconsistency.
      And if we're using a single background thread, we might as well use the main thread, and publishing to Kafka will be the bottleneck in bid processing. Using a single background thread just changes where we wait, i.e. do we wait before entering the critical section (main-thread scenario) vs. after entering the critical section (background-thread scenario).
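
A minimal sketch of the two publishing styles debated in this thread, assuming the kafka-python client and a "bids" topic (illustrative). Either way the client is only answered after the broker acks, so durability is preserved; a background handler only changes where the waiting happens, which is the point made above.

    # Sketch: blocking publish vs. ack-callback publish (illustrative names).
    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        acks="all",  # don't treat a bid as durable until the brokers confirm it
    )

    def respond_to_user(bid: dict) -> None:
        print("accepted", bid)

    def fail_bid(bid: dict, exc: Exception) -> None:
        print("rejected (publish failed)", bid, exc)

    def publish_blocking(bid: dict) -> None:
        # Main-thread style: block inside the critical section until the ack arrives.
        producer.send("bids", value=bid).get(timeout=10)
        respond_to_user(bid)

    def publish_with_callback(bid: dict) -> None:
        # Callback style: hand the send off and answer the user from the
        # producer's I/O thread once the ack (or error) arrives.
        future = producer.send("bids", value=bid)
        future.add_callback(lambda metadata: respond_to_user(bid))
        future.add_errback(lambda exc: fail_bid(bid, exc))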

  • @Av-cu6gm
    @Av-cu6gm 2 months ago +1

    Why do we have the auction DB (MySQL) when it's decided to choose Kafka and the time-series DB as the source of truth?

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      This is just for the metadata of the auction itself, not the bids

  • @jordiesteve8693
    @jordiesteve8693 1 month ago +1

    In the 2nd pseudocode, you enqueue some bids into an in-memory queue and then ship them to Kafka, am I right?

  • @popricereceipts4279
    @popricereceipts4279 2 months ago +1

    So how does the bid engine actually determine the ordering? Like what is the actual logic?

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago +1

      You just grab a lock, increment the sequence number by 1, release the lock.
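
A minimal sketch of that logic, using only the Python standard library; the class and field names are illustrative.

    # Sketch: assigning monotonically increasing sequence numbers under a lock.
    import threading
    from collections import defaultdict

    class SequenceAssigner:
        def __init__(self) -> None:
            self._lock = threading.Lock()
            self._next_seq = defaultdict(int)  # auction_id -> last assigned sequence number

        def assign(self, auction_id: str) -> int:
            # Grab the lock, increment the sequence number by 1, release the lock.
            with self._lock:
                self._next_seq[auction_id] += 1
                return self._next_seq[auction_id]

    assigner = SequenceAssigner()
    assert assigner.assign("auction-123") == 1
    assert assigner.assign("auction-123") == 2

Whichever bid grabs the lock first gets the lower sequence number, which is the tie-breaking behavior discussed in the @rr5349 thread above.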

  • @techlifewithmohsin6142
    @techlifewithmohsin6142 2 months ago +1

    Love the content! Can you also make a video on YouTube Analytics, like video view counts and watch time with no double-counting of views in a given time period, extensible to new metrics for content creators?

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      I'd say this sounds a lot like the "top K" problem, but instead of getting the top K you compute it for all of them.

    • @techlifewithmohsin6142
      @techlifewithmohsin6142 2 months ago

      @@tttrrrrr1841 Yeah, I saw this on LeetCode only; can you add more details? You mean you used the video chunk_id to get the count, and its time_window can be used to multiply the count to get total watch time?

    • @techlifewithmohsin6142
      @techlifewithmohsin6142 2 months ago

      @@tttrrrrr1841 You mean you used video segment_id events to aggregate, and then use video_id to perform those metric operations at query time?

    • @techlifewithmohsin6142
      @techlifewithmohsin6142 2 months ago +1

      @@tttrrrrr1841 Can you also tell us what question you had in the second round of system design?

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago

      @@tttrrrrr1841 lol funny enough I did not see your leetcode comment, this one was me I promise

  • @Raymondhjc
    @Raymondhjc 2 months ago +1

    Hey Jordan, do you plan to do an email system design?

  • @firefly_3141
    @firefly_3141 2 months ago +1

    Hey Jordan, what's your LinkedIn?

  • @riyakaushik3585
    @riyakaushik3585 2 months ago +2

    The first viewer on a Saturday evening? Damn, that's sad.

  • @foxedex447
    @foxedex447 2 months ago +1

    Bro exposing his company system design 💀💀

    • @scuderia6272
      @scuderia6272 2 months ago +3

      Has he worked at eBay?

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 months ago +1

      No never lmao, I'm a bit confused here

    • @foxedex447
      @foxedex447 2 months ago +1

      @@scuderia6272 No, I meant like "he's getting the system design from his company and putting it here", like he has a bidding system at his company and is putting it here, not eBay's XDD

    • @foxedex447
      @foxedex447 2 months ago +2

      @@TenFrenchMathematiciansInACoat GUYS ITS JUST A JOKE 😭

    • @scuderia6272
      @scuderia6272 2 months ago +1

      😂