System Design Interview Question - TinyURL System Design | URL shortner System Design

Sdílet
Vložit
  • čas přidán 7. 09. 2024

Komentáře • 195

  • @ThinkSoftware
    @ThinkSoftware  Před 4 lety +4

    Thanks for watching this video. Please let me know your feedback :)

    • @gaming_with_ajey
      @gaming_with_ajey Před 3 lety

      Great Video as always.

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Thanks for the comment :)

    • @neriyamuralikrishna4644
      @neriyamuralikrishna4644 Před 3 lety +1

      First time watched your presentation. It was very intuitive and good explanation. Thanks for your support to educate us.

    • @obamabinladen5055
      @obamabinladen5055 Před 3 lety +1

      hi. one question. If the app server gets say 1-100 ID range (or 1000-10000 ID range) for usage. How will they be converted to the 6 char alphanum key? If we do sha256 or anything, we again risk collision.

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Just like you convert a decimal number (base10) to hexa number (base16), same will be done for converting it to base62/base64 whatever you choose.

  • @snehasrivastava8720
    @snehasrivastava8720 Před 2 lety +2

    Thank you so much for the amazing content you share, the way you explain the concepts are so crisp and clear. I am preparing for my Design interviews and bumped to your videos and I am just glued here. Its a one stop shop for all the amazing System Design tips and tricks. Waiting for more these kind of videos !!

  • @parul.gangwar0
    @parul.gangwar0 Před 2 lety +3

    I have almost covered all of your videos please try to upload videos once in a week they are really helpful for us.

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety +1

      Thanks for the comment. Will start uploading more videos soon.

  • @vartikagupta8816
    @vartikagupta8816 Před 3 lety +1

    You have a very nice way of explaining. I like the way you support your logic with the reason. Great work!!

  • @user-oy4kf5wr8l
    @user-oy4kf5wr8l Před 3 lety +1

    Please make more videos, i recommend you to introduce some GOOD & Important concept (BUT NOT THOSE things already introduced by other people) on your youtube videos. so that for low level sde, they can learn a lot from you. And latter, they will buy your course if they want to know more details!
    Ur videos are the best. i am super willing to buy your course!

  • @coop4476
    @coop4476 Před 4 lety +4

    Excellent pace. Thank you!

  • @ganzee6928
    @ganzee6928 Před 2 lety

    Great content as always, I love the aspect of comparing against other solutions out there. Thanks!
    Two advantages of having a KGS imho are:
    a. The ability to independently change the logic of key generation and store the keys and deploy that microservice.
    b. It can generate random short URLs (as against predictable ones in the counter logic), ensure there is no collision and store them.

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety

      thanks for the comment :)

    • @ganzee6928
      @ganzee6928 Před 2 lety +1

      @@ThinkSoftware I bought your course and the depth of analysis for each design was impressive, gave a lot of confidence for my FAANG interviews. The volume of content, in terms of number of design questions, was on the lower side though.

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety

      @@ganzee6928 Thanks for buying the course. Yes as far as the number of chapters are concerned, it is still far away from my target number of chapters and that is why I am giving unlimited access to all my students right now. There are three chapters in the pipeline 1. Stripe System Design, 2. Object Storage System Design and 3. Design of Distributed Datastore, with Stripe System Design coming very soon. As you can see creating quality content is very time consuming, specially when I myself being an architect in my current company is very busy in my primary job but then you have seen the quality of the content/chapters that I have written.

    • @ganzee6928
      @ganzee6928 Před 2 lety +1

      @@ThinkSoftware Completely understand the time constraint. Your course quality (specifically the depth and comparative analysis) is one of the best I have seen online. Looking forward to going through the new material when published. Thanks.

  • @ankitrathore007
    @ankitrathore007 Před 4 lety +1

    i really like the way you explain generate counter from data base .. nice explanation.. keep it up

  • @alexanderbennett9195
    @alexanderbennett9195 Před 4 lety +1

    Thank you! I love your videos. Very informative and helpful. Keep making great content please!

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +3

      Thanks. I need comments like this to get more motivation for making more videos 😁

  • @saurabhverma1381
    @saurabhverma1381 Před 2 lety +2

    12:00 There's a standard CFT Lemma which says, if there are 2f+1 total nodes in a distributed system, then that system can tolerate atmost f nodes failure. So if changes are getting deployed to one server (which is minimum viable amount of servers which we can deploy things to), then that one server suffers some downtime while deployment, then the max nodes required to keep system running is 3. That's why we keep servers to a minimum of three. This doc can be used for reference, en.wikipedia.org/wiki/Quorum_%28distributed_computing%29.

  • @anirudhvyas6069
    @anirudhvyas6069 Před 2 lety +1

    like your channel, your bottom up design skills are unique. thank you!

  • @touseefliaqat
    @touseefliaqat Před 4 lety

    Excellent job in explaining the concept and tradeoffs in different approaches.

  • @cagarwal
    @cagarwal Před 11 měsíci

    Thanks for creating these videos. I really like your approach of asking questions and make audience think. Would it be possible for you to create a video on system design for google sheet?

  • @aayushdhir7295
    @aayushdhir7295 Před 3 lety +3

    The approach of using counter for generating short Urls would NOT be very optimal if user is given a choice to pick the short URL of their choice. In those cases, lets say 100-200 counters values could contain already used short URLs. The application server would need to check the database whether a value between 100-200 is already set or not in that case and if only free it would use it. Let me know your thoughts on tackling this extension of the problem

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      I think I have discussed this in the course :)

  • @shilpaaggarwal5008
    @shilpaaggarwal5008 Před 2 lety +1

    I am thankful to have found this channel. Amazing video.

  • @VikasJain-qd9sw
    @VikasJain-qd9sw Před 3 lety +1

    super like... well explained

  • @felixhu8185
    @felixhu8185 Před 4 lety +1

    nice explanation. but countered based approach is not random, which is one of the functional requirement listed at the beginning.

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +1

      Considering 3+ app servers and the way LB route requests to them and also you don't know how many requests are coming to the service from other users, it is somewhat random (although could be random in a range).

  • @urunovtimes5816
    @urunovtimes5816 Před 3 lety

    Detail explained and more clarified ,

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety +1

      Thanks for the comment

    • @urunovtimes5816
      @urunovtimes5816 Před 3 lety

      @@ThinkSoftware github.com/Urunov/upcoding
      I initiated coding every day, here is TinyUrl source code.

  • @SalmanAhmed-cn9qe
    @SalmanAhmed-cn9qe Před 4 lety

    Very Nice explanation sir.please add more design videos

  • @payadav
    @payadav Před 4 lety +1

    Thanks for the detailed explanation. How do you ensure that the same big URL is assigned the same TinyURL if it's submitted multiple times?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the comment. This was one of the requirements that we discussed that we are not going to return same tiny URLfor a big URL. Let's see what others will say if we change this requirement.

  • @teenthofabud
    @teenthofabud Před 3 lety +2

    17 minutes into the video and all I can see is a generic approach to making a distributed system scalable and available while the title speaks of URL shortner system design. Am I wrong to expect TinyURL specific designs throughout the video as opposed to the last part of the video ?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Thanks for the comment. This was my first system design video so I discussed several other concepts.

    • @mahmoudyounes3005
      @mahmoudyounes3005 Před 3 lety +5

      TinyUrl problem is a very generic very simple problem that introduces the viewer to how to think about scalability and how to handle availability. it's not a complicated problem by any means like Twitter or Netflix or Facebook. most of the concepts used here will be taken for granted if you try to solve the more complicated problems. however, in the more complicated problems you will solve another type of problems that will require you to make a simple design more complicated. it's a good starter to system design videos..

  • @VivekSundararajan
    @VivekSundararajan Před 4 měsíci

    What if the counter db fails? It is not fault tolerant in that case right?

  • @hansenz7033
    @hansenz7033 Před 4 lety

    Very nice video.Thank you!

  • @hakankanplay
    @hakankanplay Před 3 lety +1

    In the non functional requirements, you said strong consistency. Can you explain how different components ( specifically datsstore/counter database) enable strong consistency during component failures? Thanks!

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety +1

      Will make a future video on the design of distributed datastore..will discuss this things there.

  • @akshaysinghyaduvanshi3768

    How do we convert int value to 6 char short URL?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +5

      Coversion can look like following, considering base62:
      0 -> 000000
      1 -> 000001
      9 -> 000009
      10 -> 00000a
      11 -> 00000b
      35 -> 00000z
      36 -> 00000A
      37 -> 00000B
      61 -> 00000Z
      62 -> 000010
      63 -> 000011
      and so on.

  • @semenchernyy1418
    @semenchernyy1418 Před 3 lety

    Hello. Thank you for running the channel! Great job.
    I have a quick question on datatype of shortURL in DB. Can we use the value of the counter to store shortURL instead of explicit storage of shortURL as suggested on the video? Given you have function to convert integer to string (base62 encode/decode) you can save some on disk by using data type with smaller size.

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Thanks for the comment. Yes this can be done as well.

  • @vipulraj3351
    @vipulraj3351 Před 2 lety +1

    Thankyou for the great content. Need some clarification over key generation.
    suppose the counter value is 65535. Convering to base 64, it becomes "NjU1MzU".
    Now, as you replied in some comment below that if the base64 converted value is less than 6 character then we can add 0 or z in the prefix to make it 6 digit.
    But here for 65535 counter value, the base 64 is NjU1MzU which is greater than 6 in length. How will we make it 6 digit? If we trim some characters then it will be same issue like MD5 hash value trimming.
    pls help. Thanks in advance. :)

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety +2

      What you did is string encoding of 65535. What you need to do is first the integer 65535 (base10) to an equivalent integer value of base62 or base64 (e.g. equivalent integer value in base16 is FFFF). Once you have the integer equivalent in base62 or base64 then you encode it in string.

  • @TabmansTube
    @TabmansTube Před 4 lety +1

    how are you linking generating a short url with counters/integer values? I don't understand the algo/logic for that?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      If you check I have explained it in other comments.

  • @SmartStuden88
    @SmartStuden88 Před 4 lety +1

    Hi, thanks for video! Can you explain transition from 128bit to 21 characters in md5 encoding approach?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      For base64 encoding you need 6 bits. So now if you start encoding every 6 bits in 128bit md5 hash to a base64 character then you need total 128/6 characters

  • @prafulbhosale3354
    @prafulbhosale3354 Před 4 lety +2

    Hi, Thanks for the video. I have one doubt regarding the counter approach. You mentioned that the app server ranges will be 101-200 or 301-400 or even 1001-2000, etc but if we convert any number within these ranges to its base62 representation, it will generate key less than 6 chars since you have decided 6 chars should be a key length in your requirements. How you are going to resolve this problem

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      100 can be considered 000100

    • @ashwinnatty
      @ashwinnatty Před 4 lety

      @@ThinkSoftware I think even if you consider 100 as 000100 and then try to convert it to base 62, i dont think it will end up in a 6 character long key. Someone please correct me if am wrong

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +2

      This is after conversion to base62. Another example let's say after conversion it is abc. This can be converted to 000abc or zzzabc based on whichever character is used for base62 encoded 0 (i.e. 0 or z)

    • @vinyasshetty4042
      @vinyasshetty4042 Před 4 lety

      @@ThinkSoftware First, thank you for the video, its very well explained. Following up on the above question, assuming we have counters in db which are regular decimal number, we then use base64 encoding on the ranges on app side , what if the counters goes above say a million , then the base64 encoded value of 1million is greater than 6 length, now taking a substring from them will be same issue like MD5 ,correct?Thanks again for the video

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Base 64 with 6 chars can denote upto 6^64 values. You misunderstood how the base64 conversion is being done.

  • @DeepakSah3.0
    @DeepakSah3.0 Před rokem

    Thanks for the video. Also Create a video on tinder system design.

    • @ThinkSoftware
      @ThinkSoftware  Před rokem

      Have you seen czcams.com/video/XFQIW2R_Klk/video.html

  • @ChronoCoder
    @ChronoCoder Před rokem

    Great system design. Regarding API Design, I have been wondering if we should use the "redirect" approach and the get_long_url returns a 302 response or use the approach you are suggesting in your design. I am leaning towards what you suggested but it probably means we will need a layer of code to redirect after getting the long url. Curious, what are your thoughts on that.

  • @siddhupiddu
    @siddhupiddu Před 4 lety +1

    Why put usertoken in api signature? Its usually in the header.

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +1

      Thanks for the comment. Here I am passing the userToken in the API to emphasize the need to pass the userToken. The API section describes the API from a high-level. At the REST api level the userToken will be set in the header.

  • @josephfernandez9010
    @josephfernandez9010 Před 4 lety

    Excellent video.

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the comment

    • @josephfernandez9010
      @josephfernandez9010 Před 4 lety

      @@ThinkSoftware The pace you work in is very helpful for me. The way you approach the problems and the speed at which you work are both very beneficial for me while I study system design.

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the nice words 😊

  • @maganaluis92
    @maganaluis92 Před 3 lety +1

    I think you can always skip auth APIs, internal applications integrate with internal identity brokers, and external application integrate with external identity brokers. You almost will never implement authentication from scratch.

  • @pawandeepchor89
    @pawandeepchor89 Před 4 lety

    job well done !!

  • @tusharmishra8232
    @tusharmishra8232 Před 3 lety

    storing user as foreign key, does it not make the duplicacy of data to be high. As all 100 user can ask to create a tiny url for same big url?

  • @guillaumeborne4155
    @guillaumeborne4155 Před 3 lety +1

    I don't really agree with the fact of having 3 servers running all the time just for anticipating maintenance or new releases. You can have 2 servers running and when you need to release a new version you just pop-up another instance and once it's successfully launched you remove one of the old instances and you do the same with the other instance. Kind of blue green deployment in a way.

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety +1

      Thanks for the comment. It totally depends on what type of infrastructure support your service has. If the underlying infrastructure provides support for adding host and removing host in a seamless manner and you don't have to go modify various configuration settings then you can go that route. Otherwise, it will be less hassle-free to have at least three servers (and I have rarely seen anyone using the route that you suggested in my 15+ years of experience). And also you need to understand the CICD pipeline that you have. We don't do deployment or maintenance once a month, nowadays the frequency for deployment is much higher, if not daily then still once a week or sometimes twice. Of course, when you are starting new (think a startup of two/three-person), you can still go with two servers because you might be at a stage where you are the only person (or two/three more) who knows when to perform the deployment. But I am talking, in general, about suppose if you are working in a cloud company at the scale of Stripe, Lyft, Uber, or FAANG.

  • @shreejitnair2174
    @shreejitnair2174 Před 3 lety

    Had a question on the range keys being maintained on each app server. Would multiple threads not have to synchronize with each other to get the next available offset in this range , leading to higher latency? I am assuming this counter logic is needed only when we run into collisions.

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      We would be using interlockedIncrement() to get the next counter value in the range

  • @Asha-se4wv
    @Asha-se4wv Před 3 lety

    if collision occurs what to be done (we are choosing the first 7 chars from MD5 hash + base64 + counter values) ??
    We can try using another hash alorigthm or chossing specific index chars in output of (MD5 hash + base64 + counter values).
    Issues:
    1) But for every request we need to check in DB if that short url already exists or not.
    2)Lets say we need same short url for every request for long url, in that case, let's say the first request has generated short url with the solution I mentioned above then for every next request for same long url it has to perform that soltuon (extra steps) to generate same url.
    your opinion on this?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Thanks for the comment. I already discussed this issue and provided better algorithm in the video. May be you didn't watch the video complete.

  • @tango2olo
    @tango2olo Před 2 lety

    1. Think over the Functional requirements.
    2. Think over the Non-functional requirements.

  • @satang500
    @satang500 Před 4 lety

    I've watched some of your videos and thanks for great videos. One thing I noticed is the voice quality is sometimes not good, too small or unclear.Maybe the microphone is too far or something.

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the feedback. Let me know do you find this issue in other videos as well, specially the newer ones. I am still learning to produce better videos day by day 🙂

  • @pallavibansal84
    @pallavibansal84 Před 4 lety

    What if there are too many write requests which end up lots of app servers needing to fetch counter range, resulting in write failures and overload on counterDB ... In order to avoid this scenario, do you think instead of using a counter stored in DB and app servers going to it directly, we use a service (counter service) that pushes counter-ranges to a queue and app servers just pick a range from that? It can keep pushing to the queue at regular intervals. This design is more complex is only needed if there are too many write requests that we need to handle ..

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      We can always increase the range size if that is happen. With 100 as size, it means that servers are going to DB only 100th time and that the scenario you are mentioning is when somehow all the app servers exhaust their range of 100 at the same time.

  • @MrEnvy999
    @MrEnvy999 Před 2 lety

    Instead of relying on an external data store for getting valid count ranges why don't we just append a UUID to every input URL, hash it with MD5 and bas62 encode it? This reduces the possibility of collisions and eliminates the network overhead. The individual app servers can generate UUIDs independently

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety +2

      This I have already discussed in the course and in this video that MD5 hash can't be used because for 6 character short url, we have to drop the bits from MD5 which will result in conflicts.

  • @weightwatcher221
    @weightwatcher221 Před 4 lety

    Very well explained. Might not be reasonable question but possible. How do we scale it to some arbitrary large(billion) of txns per day?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the comment 🙂. We scale by horizontal scaling our app servers and have database sharding.

  • @devvashisth8192
    @devvashisth8192 Před 4 lety

    Very nice eplanation, specially nice to see comparison of diff. ans. available, I have 1 doubt - in the final approach, you have told that we will keep a counter in the datastore, but since we hv around 68 Billion possible keys, then after some time won't the counter value reach 68 Billion? and store that huge number in DB would be a problem?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the comment. Check 64 bit integer has 2^64 possible values.

    • @devvashisth8192
      @devvashisth8192 Před 4 lety

      @@ThinkSoftware Thanks, got it, 64^6 = 68719476736, whereas 2^64 = 1.844674407371E+19, which is very big, so 68billion integer number can be accomodated easily

  • @brain_segfault
    @brain_segfault Před 2 lety

    If you're bulk pulling the IDs per web server, how are you safely sharing and incrementing that value between req?

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety

      Thanks for the comment. No one is bulk pulling IDs. It is just a counter that you increment atomically and get at the DB level by some threshold (e.g. 100 or 1000) and then when the app server get the counter value it can assume that it can give away Ids in the range [countValue, counterValue + threshold)

  • @experience147
    @experience147 Před 4 lety

    Will global cache will also trigger invalidation of app server cache ??
    Because lets say in app server 1 you have not found sentiment stored as there was no mapping after going to db.
    And after some time on app server 2 for same mapping write request came then global cache will have mapping after going to db
    Now lets say for different user request came on app server 1 since we have not found sentiment store for same mapping we will return wrong data.
    So if any write on global cache happened it should update all app server to invalidate cache entry for that mapping ?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      We cannot use local cache in that case. There is no way to invalidate local cache when write happen. I think I already discussed that in the video.

  • @deepakchhabra5932
    @deepakchhabra5932 Před 4 lety

    Question - in your approach to generate URL using counter maintained in DB (discussed at 38-40 mins in your video), I'm unable to understand one thing. Let's say we use counters to generate number range but we are limited to 6 characters that means we can only generate shortlinks from 0 to 100000 and we will exceed 6 character limit. How can we apply 62 power 6 combinations. Can you please explain with a sample URL how will it look like?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      6 characters use base62 encoded number and not decimal number which is base10 encoded. 100000 that you mentioned is base10. We will convert it to base62 which will be way smaller.

    • @EnternalQuest
      @EnternalQuest Před 3 lety

      @@ThinkSoftware are you suggesting to convert long number value to bytes and these bytes to base64 encoded string.
      if yes then if we convert 67billion = 67000000000 to base64 encoded string it returns this value "AAAAD5mC3gA=" which is of length 12. Then how do you shorten it length 6 again?
      or lets say long number in java is a 64 bit number, so with base64 encoding with 2 padding it will become 66 bits. So not 66/6 = 11.
      So the length can 11.

  • @harishkumarrayasam
    @harishkumarrayasam Před 3 lety

    As long as you take care of security/rate limiting and DDOS attacks, I don't think randomization is really needed

  • @sharadskywalker
    @sharadskywalker Před 3 lety

    can we have a separate database instances for reading and for writing? So that lookup and creation can happen at the same time

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      This goes into other detail discussion about how the replication is happening between the DB instances (synchronous vs asynchronous)...then what are the consequences if we are using synchronous vs asynchronous replication on the reads and writes.

  • @ravindrabhatt
    @ravindrabhatt Před 2 lety

    Wouldn't it be a good idea to put Token in GET request as well to prevent attack such as DDOS?

  • @shrikanthnarayanan2
    @shrikanthnarayanan2 Před 2 lety

    If one of the app-server goes down, won't we lose a range of usable values for url generation? I would try to think of a way to reclaim a range

    • @Paradise-kv7fn
      @Paradise-kv7fn Před 2 lety

      Its a very small trade off. 7 characters can generate 68B combos and we can easily afford to loose some chunks in case of server failures. Also, when the cron job runs to delete expired tiny url mappings, we can actually add these tiny urls back to the table containing the available urls so that they can be reused.

  • @vishalgarg2122
    @vishalgarg2122 Před rokem

    You mentioned that we want high availability and strong consistency. How can it be possible as per CAP theorem

    • @ThinkSoftware
      @ThinkSoftware  Před rokem

      I think I have to make a video on it soon. People have misunderstood CAP theorem. First of all they don't understand in which scenarios CAP theorem is applicable. Secondly, according to CAP theorem, in scenarios where it is applicable, only in the presence of a partition, you can't have both high availability and strong consistency and you have to give up one for the other. It does not mean that when there is no partition, still you can't have both. And there is a third reason also, which I have answered in other comments before as well.

  • @sanketvictor
    @sanketvictor Před 2 lety

    Which datastore you will need to use for maintaining the counter? Will that also be stored inside the nosql db?

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety

      Any database that either support transactions or database sequence/counters.

    • @sanketvictor
      @sanketvictor Před 2 lety

      @@ThinkSoftware Okay, so just for this single column, single row data value, do we have to deploy a distributed database like mysql here? since for rest of the system you are recommending to use nosql db which would not let you use sequences that easily...

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety +1

      @@sanketvictor You have many options. There are NoSQL databases that support transactions (light-weight transactions) and DB counters as well. You can use them as well. As a last resort, you can have a separate sql db deployment for this purpose.

  • @rethink9957
    @rethink9957 Před 4 lety

    Nice video. Question: How are spam and malicious links handled beside login requirement?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the comment 🙂. This question would be handled in the in coming book 😉 What do how these links can be handled?

  • @piyushsharma1040
    @piyushsharma1040 Před 3 lety

    Can't we just use the Semaphore concept to keep the counter value so as to avoid any conflict?

    • @nammi895
      @nammi895 Před 3 lety +1

      Semaphore mutex is used in single codebase. Here we have multiple servers. If want to use mutex in DB side or in cache the problem will the the same. That Db or cache will become bottleneck. So that's why he's incrementing it by 100, to reduce load on DB.

  • @EnternalQuest
    @EnternalQuest Před 3 lety

    are you suggesting to convert long number value to bytes and these bytes to base64 encoded string.
    if yes then if we convert 67billion = 67000000000 to base64 encoded string it returns this value "AAAAD5mC3gA=" which is of length 12. Then how do you shorten it length 6 again?
    or lets say a long number in java is a 64 bit number, so with base64 encoding with 2 padding it will become 66 bits. So now 66/6 = 11.
    So the encoded string length can be 11.
    long n = 67000000000L;
    final ByteBuffer allocate = ByteBuffer.allocate(8);
    allocate.putLong(n);
    final byte[] bytes = allocate.array();
    final String s = Base64.getEncoder().encodeToString(bytes);
    System.out.println(s);

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Both the approaches you have mentioned are not correct. Consider a base 10 number 15. When you convert it to base 16, it becomes F. 10000 in base 10 become 2710 in base 16. In the same way you convert an integer to base62 or base64 which ever you choose. More information in the course. There is free preview chapter on tinyurl design.

  • @rishikhurana17
    @rishikhurana17 Před 3 lety

    Thanks for the video but I have a question for you.
    How is this approach any different from zookeeper approach. your data source is still a single point of failure whereas zookeeper on the other hand is considered to be more reliable and high available.

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety +11

      The datastore is not a single point of failure. I didn't discuss this in this video but it is already discussed in the course that you can check out. Now regarding using zookeeper to generate the counter values, I think people suggest that without even thinking what it means. In an interview, you cannot just say that you will use zookeeper to generate the counter values, and zookeeper will magically handle generating this value and also provide high availability (i.e. no single point of failure). Generating a unique sequence is the crux of this tiny URL system design and an interviewer will ask you then how zookeeper achieve this. In practice, nodes/machines in the zookeeper fail and they fail all the time. In this case, when one server in the Zookeeper cluster fails, somehow the system needs to make sure that the others don't return a duplicate range. The only way to do that is to make all servers agree on which ranges have been given, and which have not. Now this lies in the distributed consensus problem. In fact, zookeeper uses the one as well for this. Now, if you want to use zookeeper just to assign different counter ranges then this is a bad use of zookeeper because you can achieve the same by just providing counter ranges in some configuration form where all the servers go and read from a file uploaded somewhere to get their ranges. You don't need to run and maintain a separate service for it. This was a long answer as it was not easy to explain in a few words. I hope it clarifies your question.

  • @saurabhbhalla90
    @saurabhbhalla90 Před 4 lety

    Does the application server save the long URL to the DB as soon as it creates it? (before returning to the user). Otherwise if App Server 2 gets a request to fetch a long URL, it might think it does not exist, because App Server 1 didn't save it to the DB

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +1

      Yes. This I think I have discussed in the video as well.

    • @Claudius025
      @Claudius025 Před 4 lety

      @@ThinkSoftware but what if app server 1 can create 1-101 urls and app server 2 can create 101-200 urls. app server 1's 99th url creation request is for url www.123.com. The server creates the shortened url. App server 2 gets a request to shorten the exact same url app server 1 did some time later. The request is app server 2's 102nd request. Now one long url has two different short urls. How does the system not create many different short urls for the same long url without a url shortening service? Or are the system designers accepting the cost and overhead of having many different short urls for the same long url?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      It all depends on the requirements and design changes if the requirements changes. Here in our requirements we didn't have any requirements for always have a single short url for a big url. Instead our very first functional requirement is always return unique short url if you check again.

  • @hemantagrawal25
    @hemantagrawal25 Před 3 lety

    Can we use timestamp and node id instead of maintaining the token counter ?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      What do you think what are the issues in your approach?

  • @khuranaishaan5
    @khuranaishaan5 Před 4 lety

    Hey. This video was great. You have made all the concepts abundantly clear. I had a quick question tho. While sending the write request, we ask the user to provide the API key right? Can't we just use that along with the counter variable which keeps on incrementing until a unique URL is generated? I'd think that it might tick up the overall complexity?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      Thanks for the comment 🙂. Are you suggesting about different counters for different users? This will not work due to conflicts and you need to keep increasing the counter.

    • @khuranaishaan5
      @khuranaishaan5 Před 4 lety

      @@ThinkSoftware I was just thinking using the user info and the url provided to generate the unique url. Not sure how that'd pan out.

    • @khuranaishaan5
      @khuranaishaan5 Před 4 lety

      @@ThinkSoftware Another quick question. I'm sort of trying to design this system in nodeJS. So I have created 3 heroku accounts for 3 different servers and one for exposing the API service. I was pondering if creating the global cache in the API server will be a good idea or not. Like we could expose the read operation for the global cache through the API. Whenever a read request is sent to any of the servers from the API, it calls the cache endpoint on the API and if no match is found, the server checks in its own local cache and then ultimately resorts to accessing the database. What's your stance on this approach?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      You will start seeing issues when you have lots of short url in the database.

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      I think already discussed the pros and cons of local vs global cache in the video.

  • @harrylin6282
    @harrylin6282 Před 4 lety

    Thanks for your video. one question, when you mentioned the non-functional requirements (@5:32) should include high available and strong consistency, but based on the CAP theorem, we can only choose one from them, how can we keep high availability and strong consistency at the same time?

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +2

      Thanks for the comment. As we discussed, the service has strong consistency in the sense that once we created and returned a short URL to the user, if the user queries the big URL using that short URL, the service should be able to return the big URL. Otherwise it will not be a good user experience. Now the question arises that how a service is available and consistent at the same. It boils down to the fact that no service is 100% available. Question arises whether we are targeting five 9s or four 9s of availability and again it also boils down to whether we are talking about availability of the whole service or some part of it. The service itself can be highly available (as a whole) and at the same time fulfill consistency requirement as a whole (keyword is "as a whole") however the individual partitions (in the datastore) may have to prefer either availability or consistency. Now it is totally possible that one partition could be down due to network partition but it won't affect the functionality of other partitions, thus overall service still considered as highly available. And please note, I didn't discuss the design of datastore where we would discuss all these things (which will be a topic of a future video). I hope this clarifies.

    • @harrylin6282
      @harrylin6282 Před 4 lety

      @@ThinkSoftware Thanks a lot !~

    • @vaibhavbhardwaj2244
      @vaibhavbhardwaj2244 Před 3 lety +2

      I would like to add on to what has already been said - the question of choosing between availability and consistency arises only in cases of a partition. Otherwise in a normal running scenario a service could be both available and consistent. Now let's discuss the partition scenario . In this scenario . we could make the read service to be highly available while the write service might become unavailable in event of partition thus maintaining the consistency . So different microservice of a system could be configured differently to support availability or consistency ! Hope this helps

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Thanks for extending my answer with great explanation 👍

    • @mohanhh
      @mohanhh Před 3 lety

      @@ThinkSoftware I would think, eventual consistency should be OK. As most of the time, the short URLs are created may not be immediately queried. The flow would be user creates a short URL, posts it to his followers and the followers may not click on it immediately. If we use a NoSQL Database like Cassandra, we can improve Read Performance tremendously

  • @wg7982
    @wg7982 Před 3 lety

    If the datastore is a single machine, will that beat the purpose of high availability if it is down?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Of course

    • @wg7982
      @wg7982 Před 3 lety

      @@ThinkSoftware in your example, datastore is a single machine or you just use single drawing to show datastore?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety +1

      it is not a single machine. It is just logical diagram of data store. If we would have gone inside that logical component then the discussion could go about how to achieve all non functional requirements in that data store. Check my other video on intro to distributed systems in which I discussed a bit in detail about it.

  • @ArunKumar-jk5pq
    @ArunKumar-jk5pq Před 3 lety

    Sorry, what is the algorithm for short url? how to use the id from db to get short url?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      This is discussed in the video.

    • @ArunKumar-jk5pq
      @ArunKumar-jk5pq Před 3 lety

      @@ThinkSoftware Could you pls point me to the time in the video talking about this?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      I have to watch the video now to get the time as it was long time ago when I upload this video.

  • @rahulbatheja7933
    @rahulbatheja7933 Před 3 lety

    Hi @ThinkSoftware, I have a question. let's suppose we have 3 app servers, and we try to deploy a new build. obviously, we can't deploy a new build on all the servers at the same time, so now if we go one
    by one then it's possible that a single user try to access our service gets a response from a machine where the new build has been deployed and if the user tries to access the same service gets a response from another machine(where new build-id not there yet, as it's gone routed by load balancer) is this a valid case. in a real-world how this thing works?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Yes this is always the case. So any new feature deployment need to make sure that you don't enable the feature till the department is complete.

    • @rahulbatheja7933
      @rahulbatheja7933 Před 3 lety

      @@ThinkSoftware I am not sure, how this will be possible in every case. I know spring cloud provides us a way to keep our configurations in a central place & we can make use of it to make feature enable after deployment but let's suppose we are trying to deploy a front-end app(server-side rendering) and my change was just to change the background color of a particular div/model from yellow to blue, how can I enable the feature only after deployment in all the app-servers?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety +3

      It depends from what are you trying to achieve. If we just consider your example, you should be come up with an intelligent way to just update the background color without even needing the deployment. In reality, when we are working on cloud service, our feature work include design, implementation, testing and even deployment strategy that is most suited for that feature enablement.

    • @rahulbatheja7933
      @rahulbatheja7933 Před 3 lety

      @@ThinkSoftware Got it, thanks a lot for this awesome video

    • @shekharamitanshu
      @shekharamitanshu Před 2 lety

      @@rahulbatheja7933 The answer to your question is blue-green deployment. In cloud anyway you have the feasibility to increase the instance as you require. So while your blue server is catering the customer request , you can do the deployment in green server and once the deployment is complete, this green server will take place of blue server and that transition is very smooth. This is called rolling update where you will not see any service disruption.

  • @fokkor
    @fokkor Před 4 lety

    You talked about updating the counter values inside a transaction construct. But then we're going to store this in a No-Sql datastore, is this transaction construct possible in DynamoDB or Cassandra like DBs? Maybe Spanner or cockroach offers this but my question is in a generic sense. Thanks for the Awesome work!

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety +2

      Thanks for the comment. Other databases provide similar constructs as well. E.g. Cassandra provides atomic writes and light weight transactions for updating a single record etc.

    • @arunagiriswarane1155
      @arunagiriswarane1155 Před 4 lety

      @@ThinkSoftware but between two tables how will u acheive transacation in cass.

    • @ThinkSoftware
      @ThinkSoftware  Před 4 lety

      @@arunagiriswarane1155 which scenario in our design requires updating two tables in a transaction?

    • @amandama6165
      @amandama6165 Před 3 lety

      @@ThinkSoftware When generated new mapping between short and big URLs, we have to both add it into mapping table and update the opt URL count in user table. Should these two write-operations be done in one transaction?

    • @kunzhou5159
      @kunzhou5159 Před 3 lety

      ​@@amandama6165 I agree this is a problem. Also, lightweight transactions require a quorum so it's not exactly scalable.

  • @KrishnaSharma-vl4re
    @KrishnaSharma-vl4re Před 3 lety

    counter value can go out of range ?

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      A 64 bit long integer has long way to go if you calculate it and that at that time counter going out of range is least of what you should worry about 🙂

  • @shubhamchandra9258
    @shubhamchandra9258 Před rokem +1

    In the counter example with datastore, is it not a single point of failure ? How to fix that in case of counter in db ?

    • @ThinkSoftware
      @ThinkSoftware  Před rokem

      Thanks for the question. This has been discussed in the course. Short answer: master-slave architecture.

    • @genuineprofile6400
      @genuineprofile6400 Před rokem

      @@ThinkSoftware what if master fails before the new incremented value is replicated to slave ?

  • @ameets10
    @ameets10 Před 3 lety

    There is single point of failure using single db for generating keys

    • @ThinkSoftware
      @ThinkSoftware  Před 3 lety

      Thanks for the comment. There are various ways to handle it. Will be discussing in the course.

  • @ravindrabhatt
    @ravindrabhatt Před 2 lety

    if we increase the size of tiny url to 7, then can we use base64 encoding? or we have to use base128?

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety

      using base64 or base62 or base128 has nothing to do with using 6 or 7 characters. With 6 chars, you can use base62 or base64 or even any other base encoding as long as those characters are supported by HTTP request in the query path.

  • @JaswinderSingh-uw2hf
    @JaswinderSingh-uw2hf Před 3 lety

    one requirement will not be achieved using this approach sir.
    - short url should not be guessable.

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety

      If you check my course, i have discussed this there in detail how this will be achieved via this approach. Good thing is the tinyURL system design is free preview chapter.

  • @KapilSharma-zv5oo
    @KapilSharma-zv5oo Před 2 lety

    at 27:29 you say something like "in one of the online resources the author has used + and / characters but these are invalid characters". True but I would highly recommend not trying to be-little others in your videos. I can find several flaws in your videos also and I am sure other youtubers will also. But the professional ones will stick to the topic won't take it as an opportunity to criticize others work. I have seen some other Tiny URL videos where the author explained the concepts much better than yours. By the way - I also purchased your paid course and I am regretting the purchase. It seems like you started this project as a hobby project 3 years back, you are selling it online but more recently you have lost interest in it and there is not much content. All I see is those mock interviews which have little benefits. One shouldn't really be paying for that kind of content. There is plenty of free material online which is much better.

    • @ThinkSoftware
      @ThinkSoftware  Před 2 lety +2

      Thanks for the comment. I only mentioned that an online course mentioned these characters but using them are invalid. However I didn't mention any name so how am I be-little someone? I just highlighted a mistake to avoid. Also I appreciate your feedback about my course. There are already 10 chapters on design of 10 different systems like tinyurl, twitter, Netflix, Uber, stripe, distributed queues, etc. I hope you did go through them. So I do like to know what you didn't like. I do plan to add few more chapters on the design of object storage, distributed db, etc. before considering my course as complete.

    • @sriram6148
      @sriram6148 Před 2 lety

      @@ThinkSoftware
      I completely go with you as you jave not mentioned anybody's reference; moreover, saying something is incorrect is a better way to guide the bigger community.