Managing Data in Microservices

  • Added 8 Jul 2024
  • Download the slides & audio at InfoQ: bit.ly/2wVAkdN
    Randy Shoup shares proven patterns that have been successful at Google, eBay, and Stitch Fix. Shoup covers managing data, the need to isolate a microservice's data store behind the service interface, using events as a first-class tool in the architectural toolbox, techniques for service extraction from a monolithic database and much more.
    This presentation was recorded at QCon New York 2017.
  • Science & Technology

Comments • 108

  • @lironlevi
    @lironlevi 3 years ago +23

    The problem with the "do it right the first time" approach is that it's an all-or-nothing mindset. You sometimes don't have the time or knowledge to do it right the first time, so you build a far-from-perfect solution that is still enough to drive the business forward, and that's all that matters.
    The second problem I have with his database approach is that resolving data inconsistencies via compensating actions can quickly become a major source of data inconsistency bugs. It's a lot harder to maintain data consistency without the help a relational database provides in terms of transaction support.

  • @darkopz
    @darkopz 5 years ago +27

    The sad reality, in answer to the initial question, is that sometimes we DO have time to do it twice. Why? Because projects set unrealistic expectations for code freezes and releases but build in obscene “testing” cycles in which you can essentially rewrite the entire feature as long as you’ve said you finished the first time. I’ve come across quite a few of these types of companies. As much as one pushes to do it right the first time, it’s not always possible given the project dynamics.

  • @ClimbAClassic
    @ClimbAClassic 6 years ago +121

    Rewind to 28 minutes to get to the topic

    • @wartem
      @wartem 6 years ago +2

      ClimbAClassic thanks

    • @neuemage
      @neuemage 6 years ago +7

      *forward

    • @tansudasli
      @tansudasli 5 years ago +4

      does that make sense :)

    • @sn20
      @sn20 4 years ago

      @tansudasli The talk was good, minus the "does it make sense". Of course I watched it from minute 28 onward. Thanks, ClimbAClassic.

    • @user-ud8hw4gp6t
      @user-ud8hw4gp6t 2 months ago

      @tansudasli does it

  • @GiovanniSosa
    @GiovanniSosa 5 years ago +13

    Great advice on TDD and why it's important "Do it right the first time"

  • @matimon
    @matimon 4 years ago +6

    I really REALLY like this talk and keep coming back to watch it again every few months

  • @lifedesignguru
    @lifedesignguru 3 years ago +1

    Fantastic talk! Thank you so much.

  • @vimalneha
    @vimalneha 5 years ago +2

    Excellent talk

  • @wepranaga
    @wepranaga 3 years ago +3

    I'm really rooting for the quote: "don't have time to do it right, but have time to do it twice".

  • @osamaa.h.altameemi5592
    @osamaa.h.altameemi5592 3 years ago +1

    That was an awesome talk. Thank you.

  • @chrisstakutis6574
    @chrisstakutis6574 5 years ago

    Fantastic and so well-done. Especially loved how to think of ACID transactions as a series of events (and backwards to unroll)

  • @koredeaderele1666
    @koredeaderele1666 3 years ago

    good talk, clearly presented

  • @l_karuhanga
    @l_karuhanga 4 years ago +2

    'I'm gonna talk about joins'- finally!

    • @zpengh
      @zpengh 4 years ago

      That’s the question I have when I see he separated tables

  • @solmonr
    @solmonr 3 years ago +2

    Great talk, but one has to have a great ROI to choose microservices. Sometimes, or most of the time, it is showing off on the architect's part, trying to force

  • @rdean150
    @rdean150 6 years ago

    Wait, back up, that last part didn't make sense.
    Seriously though, great talk. This is a topic that is much harder in practice than the theoretical ideals make it sound, and the speaker clearly understands that and highlights the key aspects that teams must keep in mind throughout their design and implementation. At times it can make you question the ultimate benefits of such an architecture versus the new challenges it creates. And I think that the key is understanding the problem domain, resources, and true requirements.

  • @umaradilov1489
    @umaradilov1489 1 year ago

    Great speech !

  • @tmustafad
    @tmustafad 6 years ago +7

    Does it make sense? Ahaha. I really liked the way he expresses the topic and himself. Such a cool and captivating session. Recommended to everyone!

  • @joepage3065
    @joepage3065 6 years ago +4

    Great presentation, thanks.

  • @kaushikvelidandla7010
    @kaushikvelidandla7010 2 years ago +1

    Around 38:41, when materializing the view, he says a bunch of items and the feedback about each item is a many-to-many relationship. How? 1 item can have N feedbacks. Can 1 feedback be associated with M items? (Understandable if it's order feedback and each order feedback consists of multiple item feedbacks.)

  • @kejiaosui791
    @kejiaosui791 4 years ago +2

    really awesome talk and solved our problems!

    • @hanifa205
      @hanifa205 4 years ago +2

      Can I know what problem you had, and which solution from this video worked for you?

    • @alvaromoe
      @alvaromoe 3 years ago +1

      You must work at Stitch Fix

  • @AndrewBerezovskiy
    @AndrewBerezovskiy 5 years ago

    For those thinking to replace transactions with Sagas: don't! ACID was mentioned in the talk but Sagas are NOT ACID! en.wikipedia.org/wiki/ACID_(computer_science). Sagas don't deal with isolation, the I in ACID! Atomicity would also be hard to guarantee because systems can fail on rollback.

    • @robfielding8566
      @robfielding8566 5 years ago

      Transactions existed before computers and RDBs. You can get transactions by having data structures that are amenable to everybody converging on the same state (i.e. compensating transactions). When you mutate existing records in a DB (which you don't have to do), you create the problem that requires transactions in the first place: namely, undoing updates made to records.
      More abstractly: if you use a Sharpie marker to draw out immutable functional data structures, you do not lose information when you add more structures (e.g. extending a linked list where new nodes point back to old nodes). If you number new nodes with the current transaction, you can see that if you can't complete the transaction by adding the final node, you just delete (or ignore) the new objects you were adding. If you use immutable append-only structs, "rollback" is trivial. Of course, if you have an actual side effect (e.g. launch the missiles), then you can't roll back. But an RDB won't solve that problem either.
      If you want a consistent and mutable database, then you are saying that it's OK for all updates to stop once two halves of the network become isolated and a vote on what the next state change is cannot pass.
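
A minimal sketch of the append-only idea described in the reply above, assuming an in-memory store. The names `Node`, `AppendOnlyLog`, and the `committed` set are hypothetical and not from the talk; the commit marker stands in for the "final node" the reply mentions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One immutable record; it is never updated after being appended."""
    tx_id: int
    value: str


@dataclass
class AppendOnlyLog:
    """Hypothetical append-only store: writes only ever add nodes."""
    nodes: list = field(default_factory=list)
    committed: set = field(default_factory=set)

    def append(self, tx_id: int, value: str) -> None:
        self.nodes.append(Node(tx_id, value))

    def commit(self, tx_id: int) -> None:
        # The commit marker plays the role of the "final node".
        self.committed.add(tx_id)

    def visible(self) -> list:
        # Readers ignore nodes whose transaction never committed,
        # so "rollback" requires no undo of earlier records.
        return [n.value for n in self.nodes if n.tx_id in self.committed]


log = AppendOnlyLog()
log.append(1, "order created")
log.append(1, "payment recorded")
log.commit(1)
log.append(2, "shipment booked")  # transaction 2 fails before it commits
print(log.visible())
```

Because nothing is overwritten, abandoning transaction 2 only means never committing it. As the reply notes, this does not help with external side effects that have already happened.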

  • @basilio100
    @basilio100 5 years ago +4

    Not clear about sagas: how do you "roll back" (compensate) if the process breaks somewhere in between? Who initiates the rollback process? Will that compensation event go in both directions, to services A and D?
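
One possible answer to the question above, sketched as an orchestrated saga. This is an assumption for illustration: the talk describes sagas driven by events, and the `run_saga` coordinator and step names here are hypothetical. In this variant the coordinator initiates the rollback, and only steps that already completed are compensated, in reverse order; steps that never ran (D in the example) have nothing to undo.

```python
from typing import Callable, List, Tuple

# (name, action, compensating action)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"done {name}")
            completed.append((name, action, compensate))
        except Exception:
            log.append(f"failed {name}")
            for done_name, _, undo in reversed(completed):
                undo()
                log.append(f"compensated {done_name}")
            return False
    return True


def fail() -> None:
    raise RuntimeError("service C is unavailable")


def noop() -> None:
    pass


log: List[str] = []
steps = [("A", noop, noop), ("B", noop, noop), ("C", fail, noop), ("D", noop, noop)]
ok = run_saga(steps, log)
print(ok, log)
```

The sketch assumes compensations themselves succeed. A real system would also need to retry or escalate a failed compensation, which this example does not cover.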

  • @danielschmider5069
    @danielschmider5069 3 years ago +2

    What he says at 30:00 is basically "get rid of all your joins"?
    Decouple your databases, then cache all the data that you WOULD have available?

    • @f135ta
      @f135ta 2 years ago

      That's what I did! Life without joins: it's bliss

    • @JimmyZ0
      @JimmyZ0 2 years ago

      Yep, that is also what I am doing. Use a cache instead, your life will be much easier!

  • @sandipshrestha7170
    @sandipshrestha7170 2 years ago

    What does it mean to have a read-only, non-authoritative cache?
    The fulfillment service doesn't store the customer's data in its database, right? Where is it stored?
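
A rough sketch of what a read-only, non-authoritative cache could look like, under the assumption that the customer service is the system of record and publishes change events. The class names, the `address` field, and the synchronous callback list are all hypothetical stand-ins; a real system would deliver events asynchronously through a message bus.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CustomerService:
    """System of record: the only place customer data is written."""
    customers: Dict[int, str] = field(default_factory=dict)
    subscribers: list = field(default_factory=list)

    def update_address(self, customer_id: int, address: str) -> None:
        self.customers[customer_id] = address
        for handler in self.subscribers:
            handler({"customer_id": customer_id, "address": address})


@dataclass
class FulfillmentService:
    """Keeps a local copy that it reads but never treats as the truth."""
    address_cache: Dict[int, str] = field(default_factory=dict)

    def on_customer_updated(self, event: dict) -> None:
        # Only events from the owning service ever change the cache.
        self.address_cache[event["customer_id"]] = event["address"]

    def shipping_address(self, customer_id: int) -> Optional[str]:
        return self.address_cache.get(customer_id)


customers = CustomerService()
fulfillment = FulfillmentService()
customers.subscribers.append(fulfillment.on_customer_updated)

customers.update_address(7, "1 Main St")
customers.update_address(7, "2 Oak Ave")
print(fulfillment.shipping_address(7))
```

In this reading the copy does live in the fulfillment service's own storage, but it is "read-only" because fulfillment never edits it directly and "non-authoritative" because the customer service's data wins whenever the two disagree.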

  • @zpengh
    @zpengh 4 years ago

    The part about shared data is at 31:10

  • @mdanetzky
    @mdanetzky 5 years ago +21

    What should we do when the transaction rollback fails?

    • @CronosTsHastaroth
      @CronosTsHastaroth 5 years ago +31

      Cry in a fetal position?

    • @SiddharthKulkarniN
      @SiddharthKulkarniN 5 years ago +5

      Waiting for a serious answer.

    • @VishalGupta9
      @VishalGupta9 4 years ago +1

      Say the transaction rollback is happening from service D -> C -> B -> A, and there is a failure when D sends the event to C. You could keep a journal of events. D would send the event once and wait for a confirmation that the event was consumed. If D didn't receive a confirmation, it could retry sending the event. On the service C side, once it received the event, it could store it in its local journal and try to consume it. If the event consumption failed for any reason, it could retry consuming failed events later, or a patch could be deployed to service C after identifying the bug. Since the failed event is still present in its local journal, C would try consuming it again and maybe succeed this time. After succeeding, it would pass the event down to service B, and so forth.
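
The journal approach in the reply above might be sketched as follows. `EventJournal`, the event `id` field, and the in-memory storage are hypothetical, and the sketch simplifies one point: it acknowledges an event once it is stored in the journal rather than once it is consumed. Tracking processed ids makes a resent event harmless.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class EventJournal:
    """Hypothetical local journal for a consuming service (service C)."""
    handler: Callable[[dict], None]
    pending: Dict[str, dict] = field(default_factory=dict)
    processed: Set[str] = field(default_factory=set)

    def receive(self, event: dict) -> bool:
        """Store the event and acknowledge; duplicates are acknowledged too."""
        event_id = event["id"]
        if event_id not in self.processed:
            self.pending.setdefault(event_id, event)
        return True  # the acknowledgement the sender waits for

    def drain(self) -> None:
        """Try every pending event; failures stay in the journal for later."""
        for event_id, event in list(self.pending.items()):
            try:
                self.handler(event)
            except Exception:
                continue
            self.processed.add(event_id)
            del self.pending[event_id]


attempts = {"count": 0}
applied = []


def flaky_handler(event: dict) -> None:
    # Fails on the first attempt to simulate a bug or outage in service C.
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("bug or outage in service C")
    applied.append(event["id"])


journal = EventJournal(flaky_handler)
journal.receive({"id": "rollback-42"})
journal.receive({"id": "rollback-42"})  # sender retried; stored only once
journal.drain()  # first attempt fails, event stays pending
journal.drain()  # retry succeeds
journal.receive({"id": "rollback-42"})  # late duplicate is ignored
journal.drain()
print(applied)
```

This gives at-least-once delivery with idempotent consumption. It does not resolve the case where the handler can never succeed, which still needs alerting and manual intervention.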

    • @zpengh
      @zpengh 4 years ago

      Vishal Gupta, the rollback could fail too. I think this is a serious problem when it involves money transactions.

    • @Dth091
      @Dth091 3 years ago +3

      @zpengh What if the action you take when the rollback fails also fails? What if the code that handles that failure also fails? There's no real guarantee of consistency if you accept the fact that any part of your system can fail, but having good testing and resilience in your systems, combined with a good set of tools for your operations teams, is a solid way of keeping on top of things. Good alerting, tracing, and logging will be essential to this, along with making sure that your systems/services are written defensively to notice malformed or inconsistent data and fail gracefully.

  • @singarajusreedhar
    @singarajusreedhar 3 years ago

    How does the saga handle failures?

  • @dattannguyen4093
    @dattannguyen4093 5 years ago +2

    The way he talked is super attractive

  • @ziadirida
    @ziadirida 2 years ago

    I wonder if the “private database” for a service in PostgreSQL is also a separate cluster! Is it a separate schema, database, or cluster?

  • @MrGYPSYSPADE
    @MrGYPSYSPADE 6 years ago

    where is the link to this new open source tech which can check the architectural efficiency of the microservices based systems...this DTMS he mentions? DTMS(Does This Make Sense) Jokes apart...great insight. thanks.

  • @tdrake59
    @tdrake59 6 years ago +47

    How many times can you ask "does this make sense?"

    • @elektrixmk
      @elektrixmk 6 years ago +7

      I wish other speakers did the same instead of just "reading from the prompter".

    • @lztverygood
      @lztverygood 5 years ago +11

      I kinda like that he keeps asking this

    • @Carl-yu6uw
      @Carl-yu6uw 5 years ago

      It's a bit condescending, I'd say... does that make sense?

    • @sn20
      @sn20 4 years ago

      I can't believe people like this -"Ask me does it make sense" every 3 sentences. I am surprised and I cannot have sex for 6 months.

    • @really6717
      @really6717 3 years ago

      @sn20 lol! wth?!

  • @NikhilDhiman
    @NikhilDhiman 6 years ago

    Hi, I just want to know how to handle real-time data that requires a join. If we use some event to propagate data between services to maintain a cache in another service, how do we maintain consistency?

    • @nagyzoli
      @nagyzoli 6 years ago

      You have to change the way you think. Leave SQL behind when you go distributed. You have to reuse the old hierarchical architecture that was the norm before SQL (think FoxPro, dBase). So basically key-value pairs, MongoDB and friends. "Re-creating" the join mechanism seems a bit stupid.

  • @AI7KTD
    @AI7KTD 4 years ago +10

    Something tells me he's not sure if what he says makes any sense!

  • @felipealvarez1982
    @felipealvarez1982 6 years ago +1

    Feel free to give your own talk on the subject.

  • @JeffChentingwei628
    @JeffChentingwei628 3 years ago

    40:38 saga

  • @yuchen52
    @yuchen52 3 years ago +1

    It is a good talk overall. But why say "does this make sense" so many times?

  • @danielschmider5069
    @danielschmider5069 4 years ago +8

    the audience must be looking absolutely clueless for him to verify if it makes sense every 3 minutes

  • @erenxbjk
    @erenxbjk 4 years ago +2

    It doesn't make sense at all, since I keep waiting for him to say "this makes sense" instead of understanding what he's saying.

  • @godblessCL
    @godblessCL 4 years ago

    I am not quite sure that every microservice project must use its own database and do all those problematic corrections to keep data consistent. If your project is big and you want to scale, or you need more performance out of the DB, then you can do that.

    • @f135ta
      @f135ta 2 years ago

      The point is that microservices should be individually scalable. If you have a shared database, that's a single point of failure and also a shared dependency. You can't scale the database without affecting all the connected services. That breaks the rule of microservices being independent. Take this example: you use Stripe for your payments processing. You share the database with Stripe because "it makes everything easier", then Stripe decides it needs to take their database offline to update it or scale it or even change it to another technology. That now affects YOU. That is the fundamental point of microservices: you bring all those dependencies into a single place that relates to one service, and you communicate with the other services in your application by decoupled methods.

  • @bsuperbrain
    @bsuperbrain 4 years ago +2

    Is eBay still on board? 16K methods in a single class?! Applause...

  • @kaveee971
    @kaveee971 6 years ago +18

    So basically you are sacrificing all the built-in features of relational databases, such as transactions, and implementing your own version of transactions. To me that seems like a lot of extra work and architectural burden to maintain.

    • @dovh49
      @dovh49 6 years ago +14

      Muhammad Kamran, if you listen to the speaker he is saying that you do this when you need to scale. It doesn't make sense if you don't need to scale. So, this isn't a problem for most companies and shouldn't be implemented at most companies.

    • @FellshardYT
      @FellshardYT 6 years ago

      I think Muhammad has a point here, though, because what seems silly here is that the primary recommendation is to split core tables to 'service-per-table', which is likely to end up being pretty naive for most monolithic databases. Considering bounded contexts and domains may be more effective _and_ more efficient.

    • @clray123
      @clray123 6 years ago +3

      dovh49 seems like there are other ways to scale without throwing out decades of database technology and research (clustering? sharding?), but maybe that's just me

    • @ArchimedesTrajano
      @ArchimedesTrajano 6 years ago

      Unfortunately those scaling technologies cost a lot of money and effort to run. Thankfully we have Amazon RDS that does quite a bit for us and even if it is costly, at least they provide that service that can scale up much better than an in-house database team for the most part.

    • @clray123
      @clray123 6 years ago +5

      Archimedes Trajano reinventing the wheel is surely going to be more costly in the long run than leasing someone else's working wheel. When you are in APPLICATION development, you should be caring about business application logic, not about conundrums of distributed systems design.

  • @-Jason-L
    @-Jason-L 2 years ago

    If you don't start with microservices, you are in reality falling prey to "you don't have time to do it right? Do you have time to do it twice?". Migrating to microservices is far more complex and time-consuming than just doing it right the first time. This is totally avoidable tech debt.

  • @manderskutz
    @manderskutz 2 years ago

    Three sentences worth of valuable information stretched across an hour :(

  • @nareshgb1
    @nareshgb1 6 years ago

    So instead of managing one database (availability, backup and recovery, administration [capacity,performance,deployment,upgrades]) you manage "N" databases.
    "Makes sense".
    Makes even more sense with sharded databases - then you multiply "N" by "M" (number of shards)

    • @clray123
      @clray123 6 years ago

      When I watched some early presentations about this shit, I thought they were contriving examples to make them more approachable, but they actually seem serious about it. One database per table?!...

    • @yahorsinkevich4451
      @yahorsinkevich4451 5 years ago +1

      That's the way to make things scalable. You can't scale a single database (well, when it is a traditional relational database, not Cassandra or similar)

    • @yahorsinkevich4451
      @yahorsinkevich4451 5 years ago +1

      @clray123 Nobody has mentioned one database per table, that is insane. One database per service might have tens of tables.

    • @clray123
      @clray123 5 years ago +2

      @Yahor Sinkevich For the great majority of applications, scalability of this kind is not an issue at all. The pain in the ass caused by implementing caching and consistency management, not to mention simple joins over distributed data without ACID properties, is, on the other hand, quite certain. And what is a "service"? How do you define which data belongs to one service and which to another, when it's most likely that you will need to join it at one time or another (else you would have two different applications)?

  • @markemerson98
    @markemerson98 3 years ago +1

    Nope: 4 months to deliver a perfect system; How to do that right? Software is never complete, it’s just a version... in reality we do not have time to do it right in all cases...

  • @f135ta
    @f135ta 2 years ago +1

    What are you expecting when you ask "Is this making sense"? A bunch of people to start yelling "No"? - Mildly irritating.

  • @alexeystaroverov4804
    @alexeystaroverov4804 5 years ago +2

    Cut bragging off, please

  • @abdullahkauchali3896
    @abdullahkauchali3896 4 years ago +2

    "Don't use 2PC for managing ACID transactions - bad for scalability" and then proceeds to re-invent a distributed transaction management system with what he calls "SAGA" and events and state machines!
    What if the compensating "rollbacks" themselves fail??

  • @kungfu71186
    @kungfu71186 5 years ago +3

    This is how you don't do microservices.

  • @KenGormanGuitar
    @KenGormanGuitar 4 years ago +4

    By the time this company completes moving all of their entities from the monolithic shared database he references into separate databases, microservices will no longer be in style.

  • @TheNikavashnika
    @TheNikavashnika 2 years ago

    I don't usually comment on tech talks but I'm a bit upset I wasted an hour or so.
    Long exposition, contrived examples, zero useful information.
    The only way this whole talk really makes sense (pun intended) is if you consider it to be a show-off for potential investors.

  • @berndeckenfels
    @berndeckenfels 5 years ago +1

    The rhetorical question about whether it makes sense is super annoying to me.

  • @MrMikomi
    @MrMikomi 4 years ago +1

    Stitch Fix has a pretty meh rating of 2.5 stars out of 5 on consumer affairs. 100 data scientists and 100 software engineers can't do anything but polish a turd. Typical wall of greedy VC money in desperate (failed) search for an idea that will make a ton of cash.

  • @hotplugin
    @hotplugin 6 years ago +1

    Lame conference.

  • @DeepakPandey-ij3bz
    @DeepakPandey-ij3bz 4 years ago

    Worst