ULID vs UUID: Which One Should You Use?

Th30z Code

8,284 views

Added on June 4, 2024
ULID, UUID v7 (time-based IDs), and UUID v4 (random IDs) offer completely different characteristics.
Learn about their data layout and when to prefer one over the other, based on your type of database and the behavior you want.
Learn how ULID can cause hotspots in distributed databases, and how UUID v4 can cause fragmentation and increased disk I/O in traditional databases.
Slides are available here: speakerdeck.com/matteobertozz...
Chapters:
0:00 Data Layout
0:50 Rand-based IDs Storage behaviors
1:21 Time-based IDs Storage behaviors
2:10 String Encoding characteristics
2:33 Use cases and usage
#database #backend #coding #codingtips #programming #databasedesign #backenddeveloper #databases
Science & Technology

Comments • 16

@rajughorai7483 6 months ago ⁺³
🎯 Key Takeaways for quick navigation:
00:00 🧱 Both UUID and ULID are 128-bit data blocks, but ULID's 48-bit timestamp field allows for lexicographical sorting and time-based locality.
00:28 🔄 ULID is not directly compatible with UUIDs; however, UUID version 7 shares similar properties with ULID.
00:57 🗃️ Random-based IDs like UUID v4 are beneficial for distributed databases due to good data distribution across nodes, but can lead to fragmentation in single-machine databases.
01:23 ⏳ Time-based IDs like ULID improve disk I/O and cache efficiency in single-machine databases, but may create hotspots in distributed systems.
02:18 🔢 ULID uses Crockford's base32 for a more compact string encoding, maintaining lexicographical order, and is suitable for generating offline IDs in multi-machine environments.
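
The layout summarized above (48-bit timestamp, 80 random bits, Crockford's base32 text form) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `encode_crockford` and `new_ulid` are made up for this example, and the sketch skips the monotonic counter that some ULID libraries add for IDs created within the same millisecond.

```python
import os
import time
from typing import Optional

# Crockford's base32 alphabet: digits and uppercase letters without I, L, O, U.
# The symbols are in ascending ASCII order, so string order matches numeric order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_crockford(value: int, length: int) -> str:
    """Encode an integer as a fixed-width Crockford base32 string."""
    chars = []
    for _ in range(length):
        chars.append(CROCKFORD[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))


def new_ulid(timestamp_ms: Optional[int] = None,
             randomness: Optional[bytes] = None) -> str:
    """Build a 26-character ULID: 48-bit timestamp followed by 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if randomness is None:
        randomness = os.urandom(10)  # 80 bits from the operating system's CSPRNG
    value = ((timestamp_ms & 0xFFFFFFFFFFFF) << 80) | int.from_bytes(randomness, "big")
    return encode_crockford(value, 26)
```

Because the timestamp occupies the most significant bits and the encoding is fixed-width, sorting the resulting strings also sorts the IDs by creation time (to millisecond resolution).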
@Aceptron 7 months ago ⁺¹
Great explanation! Concise, yet covered great questions.
Loved it! ❤
@th30z-code 7 months ago ⁺¹
Glad you liked it! Thanks for watching!
@AK-vx4dy 8 months ago ⁺²
Quick and good explanation! Good job!
@th30z-code 8 months ago
Thank you!
@vikingthedude 5 months ago ⁺¹
This is good stuff. I hope to see more backend/architecture/system design content like this. This reminds me of Hussein Nasser. I love it.
@th30z-code 5 months ago
Thank you! Really appreciated.
@user-jq4li8kj8o 2 months ago
Great video, thanks!
@mofo78536 9 months ago ⁺²
It would be nice to have a quick rundown of ULID vs UUIDv7 and the practical considerations between them (e.g. can one be converted to the other?)
@th30z-code 9 months ago ⁺¹
From a data block point of view, ULID and UUIDv7 are almost the same thing, since both are a 48-bit timestamp + "random data".
But the 6 bits that the UUID format reserves for the version and variant make them incompatible.
Converting from ULID to UUID loses information.
Overwriting those 6 bits with the v7 version and the variant should not be a problem, but in the worst case it may generate collisions.
Converting from UUIDv7 to ULID is not a problem, since there is no data loss.
From a textual representation point of view, the ULID has the advantage of being shorter.
Unfortunately there is no standard support for a Base32 form of UUIDs, but with a few lines of code you can encode a UUID to Base32 and decode Base32 back to a UUID, if you are interested in converting them (e.g. if you are using the UUID type that your language provides).
I hope I have answered your question.
Thanks for watching!
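
The conversion described in this reply can be sketched with Python's standard `uuid` module. This is a rough illustration under stated assumptions: the helper names (`ulid_to_int`, `int_to_ulid`, `ulid_to_uuid7`, `uuid_to_ulid`) are hypothetical, and the bit positions follow the UUIDv7 layout (4 version bits right after the 48-bit timestamp, 2 variant bits at the start of the last 64 bits).

```python
import uuid

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

VERSION_MASK = 0xF << 76  # 4 version bits, directly after the 48-bit timestamp
VARIANT_MASK = 0x3 << 62  # 2 variant bits, at the top of the low 64 bits


def ulid_to_int(ulid: str) -> int:
    """Decode a 26-character ULID string into its 128-bit integer value."""
    value = 0
    for ch in ulid.upper():
        value = (value << 5) | CROCKFORD.index(ch)
    return value


def int_to_ulid(value: int) -> str:
    """Encode a 128-bit integer as a 26-character Crockford base32 string."""
    return "".join(CROCKFORD[(value >> shift) & 0x1F] for shift in range(125, -1, -5))


def ulid_to_uuid7(ulid: str) -> uuid.UUID:
    """Lossy: overwrites 6 random bits with the v7 version and the RFC variant."""
    value = ulid_to_int(ulid)
    value = (value & ~VERSION_MASK) | (0x7 << 76)
    value = (value & ~VARIANT_MASK) | (0x2 << 62)
    return uuid.UUID(int=value)


def uuid_to_ulid(u: uuid.UUID) -> str:
    """Lossless: all 128 bits of the UUID are kept in the base32 string."""
    return int_to_ulid(u.int)
```

Two ULIDs that differ only in the 6 overwritten bits map to the same UUID, which is the worst-case collision mentioned above. The reverse direction keeps every bit, so `uuid_to_ulid` also works as a shorter text encoding for any UUID.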
@mofo78536 9 months ago
@@th30z-code yeah, I wonder if the difference of 6 bits would matter in practical applications, since of course the length affects how likely a birthday collision is in a distributed system.
I am also checking with the spec writer whether it would make sense to standardise an optional extended ULID that includes a checksum, as there is a chance that people will be writing these IDs down on paper as well. I was looking at the Crockford base32 check character, which adds 5 more symbols, but his use of = among the check symbols would be hard to use in a URL... So it might be better to append a _ to the original ULID string, followed by extra characters carrying a CRC-8 or some other easily implemented check instead.
@th30z-code 9 months ago
In practice you can ignore the probability of generating a collision with both UUID and ULID.
I've seen many systems in production blindly trusting the random implementation (but don't use JavaScript's Math.random(), or any other non-cryptographic pseudo-random generator, for your implementation).
But yeah, from a numeric point of view, the missing 6 bits make a difference:
with ULID, the probability that two IDs generated in the same millisecond collide is 1 in 1208925819614629174706176 (2^80).
with UUIDv7, the probability that two IDs generated in the same millisecond collide is 1 in 18889465931478580854784 (2^74).
But it's still a really, really low probability.
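
The two figures above are 2^80 and 2^74, and the same arithmetic extends to more than two IDs per millisecond through the birthday bound. The snippet below is a small sketch of that calculation; `collision_probability` is a name chosen for this example, and it uses the usual approximation p ≈ 1 − exp(−n(n−1)/2^(b+1)) for n IDs drawn from b random bits.

```python
import math

ULID_RANDOM_BITS = 80
UUID7_RANDOM_BITS = 74  # 80 bits minus 4 version bits and 2 variant bits


def collision_probability(ids_per_ms: int, random_bits: int) -> float:
    """Approximate chance of at least one collision among IDs sharing a millisecond."""
    pairs = ids_per_ms * (ids_per_ms - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2**random_bits)


if __name__ == "__main__":
    print(2**ULID_RANDOM_BITS)   # number of possible random parts in a ULID
    print(2**UUID7_RANDOM_BITS)  # number of possible random parts in a UUIDv7
    for n in (2, 1_000, 1_000_000):
        print(n,
              collision_probability(n, ULID_RANDOM_BITS),
              collision_probability(n, UUID7_RANDOM_BITS))
```

For any fixed number of IDs per millisecond, the UUIDv7 value comes out roughly 64 times (2^6) larger than the ULID one, which is the effect of the 6 missing bits.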
To me, as with all error handling logic, the question is:
is my code putting a life at risk or not? If not, you can trust the implementation and not worry about the collision probability.
If your ULIDs/UUIDs are generated by an enumerable number of machines,
you can switch to a different ID generation strategy that removes the problem of collisions completely and even reduces the length of your IDs.
Check out the Twitter Snowflake IDs: czcams.com/users/shortseHr2EIRcYZY
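
As a rough idea of what such a strategy looks like, here is a minimal Snowflake-style generator. It assumes the commonly described 64-bit layout (41-bit millisecond timestamp since a custom epoch, 10-bit machine ID, 12-bit per-millisecond sequence); the class name, the injectable `clock` parameter, and the error handling are choices made for this sketch and are not taken from the video.

```python
import threading
import time


class SnowflakeGenerator:
    """Sketch of a Snowflake-style ID: timestamp | machine ID | sequence."""

    EPOCH_MS = 1288834974657  # custom epoch (the value Twitter used); any fixed past instant works

    def __init__(self, machine_id: int, clock=lambda: time.time_ns() // 1_000_000):
        if not 0 <= machine_id < 1024:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self._clock = clock  # injectable so the behavior can be tested deterministically
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs already issued in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self.EPOCH_MS) << 22) | (self.machine_id << 12) | self._sequence
```

Uniqueness here comes from the machine ID and the sequence counter instead of from randomness, so two machines with distinct IDs can never produce the same value, and the result fits in a 64-bit integer.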
@kennedydre8074 6 months ago
Great video! Thank you for the amazing explanation. I have a question: how can I use UUIDv7 or ULID in my TypeScript project?
@th30z-code 6 months ago
You can use third-party libraries that already provide simple functions to generate UUIDv7/ULID values:
github.com/LiosK/uuidv7
github.com/kripod/uuidv7
github.com/aarondcohen/id128
github.com/ulid/javascript
@abdelrahmandwedar 2 months ago
At 1:21 I thought: so random-based IDs would be more suitable for databases that are frequently sharded, like MongoDB, while time-based IDs would be more suitable for most other databases that use a B-tree data structure.
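
The trade-off in this comment can be made concrete with a small simulation. It is only a sketch under simplifying assumptions: a sorted Python list stands in for a B-tree index, and the "cluster" uses range sharding, splitting the 128-bit key space into equal contiguous ranges (a hash-sharded cluster would scatter time-based keys as well). All function names are invented for this example.

```python
import bisect
import os


def random_id() -> int:
    """128 random bits, standing in for a UUID v4 (version/variant bits ignored)."""
    return int.from_bytes(os.urandom(16), "big")


def time_based_id(timestamp_ms: int) -> int:
    """48-bit timestamp followed by 80 random bits, as in ULID."""
    return (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")


def append_ratio(ids) -> float:
    """Fraction of inserts that land at the very end of a sorted index."""
    index, appends = [], 0
    for value in ids:
        pos = bisect.bisect(index, value)
        appends += pos == len(index)  # True counts as 1
        index.insert(pos, value)
    return appends / len(index)


def shard_counts(ids, shards: int = 4):
    """Range sharding: each shard owns an equal, contiguous slice of the key space."""
    counts = [0] * shards
    for value in ids:
        counts[value * shards >> 128] += 1
    return counts


if __name__ == "__main__":
    n = 2_000
    randoms = [random_id() for _ in range(n)]
    timed = [time_based_id(1_700_000_000_000 + i) for i in range(n)]
    print("append ratio  random:", append_ratio(randoms), " time-based:", append_ratio(timed))
    print("shard counts  random:", shard_counts(randoms), " time-based:", shard_counts(timed))
```

Time-based IDs always land at the end of the sorted index (good locality on a single machine) but all fall into the same key range, which is the hotspot. Random IDs spread evenly across ranges but touch arbitrary positions in the index.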
@AxioMATlC 12 days ago
For a primary key, neither ULID nor UUID is preferable to an integer, because of their inefficient storage, which takes up precious memory if you want to keep your indices all in memory (you do). A global ID is great for generating IDs without communicating with the database to get an auto-incrementing ID, or for use as an external ID that users cannot guess or exploit. You should have an int/bigint primary key and, additionally/optionally, a UUID or ULID depending on your use case.
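
One way to picture this suggestion is a table with a compact integer primary key plus a separate 16-byte public ID. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `users` table, its columns, and the helper functions are hypothetical and only illustrate the pattern.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,   -- compact internal key, used for joins
        public_id BLOB NOT NULL UNIQUE,  -- 16-byte random ID exposed to clients
        name      TEXT NOT NULL
    )
    """
)


def create_user(name: str) -> uuid.UUID:
    """Insert a row and return the public ID that can be shared externally."""
    public_id = uuid.uuid4()
    conn.execute(
        "INSERT INTO users (public_id, name) VALUES (?, ?)",
        (public_id.bytes, name),
    )
    return public_id


def find_user(public_id: uuid.UUID):
    """Look a user up by public ID; returns (internal id, name) or None."""
    return conn.execute(
        "SELECT id, name FROM users WHERE public_id = ?",
        (public_id.bytes,),
    ).fetchone()
```

Foreign keys and joins can then reference the small integer `id`, while URLs and API responses carry only `public_id`, which reveals nothing about row counts or insertion order.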
