Distributed Scheduling with Spring Boot: the challenges & pitfalls of implementing a background job

  • Added on 6 June 2024
  • Spring I/O 2024 - 30-31 May, Barcelona
    Speaker: Rafael Ponte
    Slides: speakerdeck.com/rponte/distri...
    Sooner or later, a developer will implement their first background job using Java and Spring Boot. What is usually a simple task for most systems can become a nightmare in scenarios that demand high performance, parallelism, distributed systems and large volumes of data. Such scenarios hide several issues that many developers are not used to, such as network failures, data inconsistency, out-of-memory errors and even taking the whole system down.
    Although it may seem controversial, dealing with many of these problems does not require hyped technologies or services, but solid distributed systems fundamentals. This talk presents how an experienced developer implements a background job with Java and Spring Boot, taking into consideration the main challenges and pitfalls it brings along, and how they design a solution for high performance, resilience and horizontal scalability while taking advantage of many modules of Spring Boot, Hibernate and the relational database.
    If you still believe that a background job is a simple task, then this talk is for you!
  • Science & Technology

Comments • 116

  • @felipedossantos7246 10 hours ago +1

    I've seen this presentation by Rafael Ponte in Portuguese before on the Zup channel, and I was able to implement something similar at my job. Great work, bro! Thank you so much.

    • @RafaelPonte 3 hours ago

      Hi Felipe,
      Thanks for this comment and for having watched both versions of the talk. ❤

  • @eduardo120155 25 days ago +5

    Congratulations on your presentation! You absolutely nailed it. Your thorough research and confident delivery captivated everyone in the room. Your ability to explain complex ideas so clearly is truly impressive. Keep up the fantastic work!

    • @RafaelPonte 25 days ago

      Thanks for the kind words, Eduardo! ❤

  • @terteseamos579 8 days ago +2

    this for me is the best presentation. Great job

    • @RafaelPonte 8 days ago

      What a comment! Thanks for that! ❤️

  • @bkavun 9 days ago +1

    Great presentation, great work. Thanks a lot for sharing this knowledge with us!

    • @RafaelPonte 8 days ago +1

      Thanks so much! I am glad you liked it 🥳

  • @RaphaelDeLio 1 month ago +5

    Congratulations, Rafael! It was a pleasure to watch your presentation in person!

    • @RafaelPonte 1 month ago

      Thanks so much, Rapha! ❤ You're the best!

  • @linhvudev 1 month ago +4

    Thanks, Rafael! Especially for the SKIP LOCKED feature, that was new knowledge for me.

    • @RafaelPonte 1 month ago +1

      Thank you so much! I am glad the talk was helpful for you! 🥰
      And yeah, SKIP LOCKED is fantastic!! 💪🏻

  • @jesprotech 1 month ago +3

    I really like the way you explained short-running transactions. Nice addition to the jobs! Congratulations on the excellent presentation! It's very useful!

    • @RafaelPonte 1 month ago

      Thanks so much! I am glad you liked it 🥰

  • @YZ-ix3dn 13 days ago +1

    Thank you for the clear and well-structured presentation. It's very useful and important information, even for people with many years of experience. I wish every developer would watch this video every time they put @Transactional on their methods.

    • @RafaelPonte 12 days ago

      Thanks for the kind words! I am glad you enjoyed the talk! ☺

  • @annabeatrizelias9689 1 month ago +2

    Congratulations, great stuff!

  • @pavanerbeck23 2 days ago +1

    Nicely done @RafaelPonte.

  • @danielponte3134 1 month ago +2

    Congratulations, my brother, you put on a show with that presentation, flawless! Top-notch!

  • @ferlezcano 1 month ago +1

    Excellent topic! I have some background jobs running here and there, and I'm definitely going to check them again.

    • @RafaelPonte 1 month ago +1

      Nice! I am glad this talk was helpful to you! 👊🏻

  • @duyetpham7924 1 month ago +1

    Beautiful presentation, thank you

    • @RafaelPonte 1 month ago

      Thank you so much! That's very nice you liked it! 🥰

  • @hirenpandit8499 23 days ago +1

    Great talk!! So much to learn, and it addressed real-life problems I faced while writing background scheduled jobs... By the way, we used the ShedLock library, but this is really good insight.

    • @RafaelPonte 23 days ago

      Thanks! Nice you liked it!! 😊
      By the way, ShedLock is a very cool library! 👊🏻

  • @hamedalipour1012 1 month ago +1

    You are an amazing presenter. Thank you so much, I learned a lot.

    • @RafaelPonte 1 month ago

      Thank you so much!!! I am happy this talk was helpful for you 🥳

  • @codeisma 1 month ago +4

    Great talk! There are a few Java libraries that already solve these challenges (db-scheduler, JobRunr or Quartz). At JobRunr we'd love to share your talk as it explains JobRunr's architecture well and can help our users understand the challenges of distributed scheduling even better!

    • @RafaelPonte 1 month ago +3

      Thanks for your comment! I'm glad you liked it! ☺
      Please, I would appreciate it if you shared it! By the way, I received great feedback from Ronald, the creator of JobRunr, who watched my talk! He is a fantastic guy! ❤

    • @RonaldDehuysser 1 month ago +1

      @RafaelPonte You're too kind 🤩!

    • @marshall143 1 month ago +1

      What is your opinion on the nFlow Java library? Thank you for the video.

    • @RafaelPonte 1 month ago +1

      @marshall143 Thanks for the comment! 😊 I didn't know nFlow, but I understand that if your context allows your team or project to adopt a task scheduler or workflow engine, you should go with it.
      Usually, those libs and frameworks make the developer's life easier because they address very well all the issues discussed in the talk.

  • @jessilyneh 1 month ago +1

    Congrats on your amazing presentation, Rafa!

  • @victoralcantara8470 1 month ago +1

    Amazing! Congrats Rafa!

    • @RafaelPonte 1 month ago

      Thanks, I'm glad you liked it ☺

  • @Cassitu 1 month ago +1

    Congraaats, bro! It turned out great! Much success

    • @RafaelPonte 1 month ago

      Thanks! Glad you enjoyed it ❤

  • @lobaorn 1 month ago +1

    Congrats Rafael! Congratulations, Rafa!

  • @gjperes1 1 month ago +1

    Rafael is a real ace!! Great presentation

  • @DiegoFerreiradaSilva 1 month ago +1

    Well done, congratulations!

  • @mustafaabdsh 1 month ago +1

    thank you

  • @rommelcosta6548 1 month ago +1

    excellent lecture 💚

  • @DevMultitask 1 month ago +1

    Great job Rafael!

  • @metrocartao 27 days ago +1

    Very good!

  • @MrDaniloko23 22 hours ago +1

    Great content!

  • @user-cw8lz4ml5u 1 month ago +1

    Great talk! Did not catch all the red flags in this :)

    • @RafaelPonte 1 month ago

      Thanks! I am glad you liked it!! ❤

  • @thiagonunes3619 1 month ago +1

    Nice!!

  • @gsledoux 1 month ago +1

    Congratulations, maharaja! 😉

  • @LuisSantos-ip8np 1 month ago +1

    What a prince 💛🔥

  • @JuniorAdy10 1 month ago +1

    Awesome stuff. Congratulations, prince of the ocean, haha 👏👏👏

    • @RafaelPonte 1 month ago

      Thanks a lot, Junior! 👊🏻

    • @benicioavila 1 month ago +1

      @RafaelPonte Congratulations, Rafael! Sharing it with everyone on my team! Cheers.

    • @RafaelPonte 1 month ago

      @benicioavila Thank you ☺️ And thanks for sharing!! ❤️

  • @HariharanIyer 1 month ago +1

    Great talk and a lot of cool new (for me) information about Spring/JPA semantics! But not much of this is specific to background jobs, and there is not much in the talk about generic background job processing. So I'd say the title is a bit misleading.

    • @RafaelPonte 1 month ago

      Thanks for the comment 😊 I am glad the content was helpful for you!
      Out of curiosity, what do you understand as background jobs and job processing, and what do you expect from a talk about these subjects?

  • @davi.mustafa 1 month ago +1

    He's the man! Let's gooo!

  • @gleicianylemos8824 1 month ago +1

    👏👏👏

  • @flavioecintia 1 month ago +1

    Nice one, Ponte!!!!!

  • @mindcontrolkmc.3286 16 days ago +1

    Really great talk!
    But I am curious: if two save statements are already wrapped in one small transaction, how can they be combined by Hibernate batching with the save statements of another transaction?

    • @RafaelPonte 16 days ago

      Thanks for the comment and feedback 😊
      I am not sure if I understood your question correctly. Could you elaborate a little bit more on it?

    • @mindcontrolkmc.3286 16 days ago +1

      Hi Rafael,
      In the scenario of this video, we are using short transactions to save data to the database, so I think each transaction should be isolated and they can't be wrapped in one batch like your example INSERT INTO ... VALUES (A),(B)

    • @RafaelPonte 15 days ago

      @mindcontrolkmc.3286 Yeah, the idea is precisely that! For each batch (chunk) of 50 rows, Hibernate will group (and reorder if needed) each INSERT and UPDATE inside that short-running transaction and convert them into only two single statements right at commit.
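
      The chunking idea in this reply can be sketched in plain Java, without Spring or Hibernate. This is only a rough model under stated assumptions: `ChunkedJobSketch`, `chunks`, `statementsSentFor` and `totalStatements` are hypothetical names, and the "two statements per chunk" count simply mirrors the claim above rather than measuring real Hibernate behavior. In a real application, JDBC batching is typically enabled through Hibernate settings such as `hibernate.jdbc.batch_size`, `hibernate.order_inserts` and `hibernate.order_updates`.

```java
import java.util.ArrayList;
import java.util.List;

final class ChunkedJobSketch {

    // Hypothetical chunk size, mirroring the 50 rows mentioned in the thread.
    static final int CHUNK_SIZE = 50;

    // Splits the pending rows into chunks of at most `chunkSize` elements.
    static <T> List<List<T>> chunks(List<T> rows, int chunkSize) {
        List<List<T>> result = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, rows.size());
            result.add(new ArrayList<>(rows.subList(from, to)));
        }
        return result;
    }

    // Stand-in for one short-running transaction: every row in the chunk
    // needs one INSERT and one UPDATE, and a batching layer is assumed to
    // group them into one batched INSERT plus one batched UPDATE at commit.
    static int statementsSentFor(List<Integer> chunk) {
        return chunk.isEmpty() ? 0 : 2;
    }

    // Processes all rows chunk by chunk, one simulated transaction per chunk,
    // and returns how many batched statements were "sent" in total.
    static int totalStatements(List<Integer> rows) {
        int total = 0;
        for (List<Integer> chunk : chunks(rows, CHUNK_SIZE)) {
            total += statementsSentFor(chunk);
        }
        return total;
    }
}
```

      With 120 pending rows, this model yields three chunks (50, 50 and 20 rows) and six batched statements, instead of 240 individual ones. Each chunk stays an isolated transaction; batching only happens inside a chunk.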

  • @YuliSlabko 27 days ago +1

    Nice explanation! But it did not cover a very important case: when your app has more than one job marked with the @Scheduled annotation. This can be crucial for performance. Maybe it will be covered in future talks.

    • @RafaelPonte 27 days ago

      Thanks for the comment 😊 Nice you liked it!
      I am not sure if I understood what you mean. Usually, a single application has multiple @Scheduled jobs running concurrently doing different things (sometimes at different times).
      Could you give more details?

    • @YuliSlabko 27 days ago +1

      @RafaelPonte If you do not explicitly specify the scheduler's thread pool size in application.yml, all jobs will be run by a single thread.

    • @RafaelPonte 26 days ago

      @YuliSlabko Thanks for the explanation. Now I got your point! ☺
      You're right. If your application runs multiple jobs close together or jobs that take too long to finish, tuning the Scheduler's thread pool size is essential. 👊🏻
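
      The effect discussed in this thread can be illustrated with a plain JDK scheduler, as a minimal sketch rather than Spring's actual implementation. `SchedulerPoolSketch` and `distinctThreadsUsed` are hypothetical names. In Spring Boot, the pool size the commenter refers to is normally set with the `spring.task.scheduling.pool.size` property, which defaults to 1.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class SchedulerPoolSketch {

    // Runs `jobs` slow tasks once on a scheduler with `poolSize` threads and
    // returns how many distinct threads actually executed them.
    static int distinctThreadsUsed(int poolSize, int jobs) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(jobs);
        try {
            for (int i = 0; i < jobs; i++) {
                scheduler.schedule(() -> {
                    threadNames.add(Thread.currentThread().getName());
                    try {
                        Thread.sleep(100); // simulate a slow job
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                }, 0, TimeUnit.MILLISECONDS);
            }
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
        return threadNames.size();
    }
}
```

      Calling `distinctThreadsUsed(1, 3)` always reports a single thread, so the three jobs run one after another and a slow job delays the others. With a larger pool, the jobs can overlap on separate threads.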

  • @aleksandrS3894 8 days ago +1

    Does a single @Transactional annotation on the @Scheduled method (when using JPA) fix the original code right away?

    • @RafaelPonte 8 days ago

      Thanks for the comment.
      It depends on which problem you're talking about.
      In the talk’s context, it solves only part of the problem: it makes the whole operation atomic and recoverable but causes a few side effects.

  • @user-un1um2vf3y 27 days ago +1

    What's the difference between reading and writing with RabbitMQ or Kafka and reading and writing with a database?
    I usually use Redis to solve the same problem, because it is much faster than a typical relational DB.

    • @RafaelPonte 26 days ago

      Thanks for the comment.
      I will ignore the trade-offs of having a new component in the infrastructure now and focus only on the developer's perspective.
      There are differences, but how they can impact your solution depends on your context. I mean, using Kafka or RabbitMQ in the talk's job perspective may have little difference on the job's code, but in the application perspective, which produces events in the queue, we may have to deal with a dual write issue.
      The same is true for Redis: it depends on how you're using it, such as a distributed lock provider or a message queue.

  • @zickzack987 1 month ago +2

    Ummm...
    The distribution topic starts after 27 min.
    Using DB locks is tricky and works differently for different databases, e.g. lock escalation. Better to use app-level locking.
    All that did not really have a lot to do with jobs, just long-running tasks in a distributed system.

    • @RaphaelSousa-or1dl 1 month ago +1

      Do you have a resource recommendation on app level locking? I'm studying the topic and it would be awesome to see it more detailed. Thanks

    • @RafaelPonte 1 month ago +1

      Thanks for your comment ☺
      Distributed systems are tricky, and database locks have worked well for over 30 years. Although some databases might differ, an exclusive row-level lock works similarly. By the way, a few RDBMS suffer from lock escalation, but not PostgreSQL (which was used in the talk's context); in addition to that, we used many approaches in the talk that mitigate the chances of lock escalation 💪🏻
      Regarding application-level locking, PostgreSQL offers Advisory Locks as an excellent alternative to row-level locks. They're very light and are handled by the application side.

  • @顾清l 1 month ago +1

    In my understanding, `select ... limit 50 for update` would directly lock these 50 rows, instead of locking one row and processing one row at a time. But in the video, it seems to be the latter approach. Why is that?

    • @wukash999 1 month ago +1

      He just presents it like that for presentation purposes. Of course it will lock all 50 rows (as long as they meet the select criteria and are not locked already). Overall this is a very basic presentation, not sure what the point of that was.

    • @RaphaelSousa-or1dl 1 month ago +3

      @wukash999 I think the point is to introduce less experienced people to the possible problems one might encounter, so they can study them further (at least for me it worked, since I had never thought or known about these problems), not to provide a thorough implementation guide.

    • @RafaelPonte 1 month ago

      Thanks for your comment ☺
      As @wukash999 commented, the idea was to make it as didactic and accessible as possible so that junior and inexperienced developers could understand it.
      Do you think it was confusing?

    • @RafaelPonte 1 month ago

      @wukash999 Thanks for your comment and for helping them understand my intention ☺
      Do you think this was an introductory and basic talk? I'm afraid I have to disagree. The talk was designed to simplify the subject and make it accessible for everyone, but it's still a complex, tricky, and detailed theme.
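
      The locking behavior discussed in this thread (a query along the lines of `SELECT ... LIMIT 50 FOR UPDATE SKIP LOCKED` in PostgreSQL) can be modeled in memory as a rough sketch. It is not how a database implements row locks: `SkipLockedSketch`, `claim` and `release` are hypothetical names, and a `ConcurrentHashMap` stands in for the row-lock table. The point it shows is that one call claims the whole batch at once and skips rows another worker already holds, instead of blocking on them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SkipLockedSketch {

    // rowId -> name of the worker currently holding that row's lock
    private final Map<Long, String> rowLocks = new ConcurrentHashMap<>();
    private final List<Long> pendingRowIds;

    SkipLockedSketch(List<Long> pendingRowIds) {
        this.pendingRowIds = List.copyOf(pendingRowIds);
    }

    // Claims up to `limit` rows for `worker` in a single call, skipping rows
    // that are already locked instead of waiting for them to be released.
    List<Long> claim(String worker, int limit) {
        List<Long> claimed = new ArrayList<>();
        for (Long id : pendingRowIds) {
            if (claimed.size() == limit) {
                break;
            }
            if (rowLocks.putIfAbsent(id, worker) == null) {
                claimed.add(id); // lock acquired, row belongs to this worker
            }
        }
        return claimed;
    }

    // Releases every lock held by `worker`, as a commit or rollback would.
    void release(String worker) {
        rowLocks.values().removeIf(worker::equals);
    }
}
```

      With rows 1 to 5 pending and a limit of 3, a first worker claims rows 1, 2 and 3 together, and a second worker calling `claim` at the same time receives rows 4 and 5 without waiting.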

  • @brunobrasilweb 1 month ago +1

    Congratulations, Rafael, you beat the Java game.

  • @rabah4306 1 month ago +1

    Will @Transactional work if you have to call a MongoRepository and a KafkaTemplate?
    All or nothing:
    if the Kafka call fails,
    is the Mongo call rolled back as well?

    • @RafaelPonte 1 month ago

      Thanks for your comment ☺
      Although MongoDB and Kafka support some level of transactions, I don't know how @Transactional annotation would work with MongoRepositories or KafkaTemplates. It's worth reading the Spring Data docs.
      But it's important to be aware that you do NOT have an atomic operation (all or nothing) when your code mixes different external service calls, like PostgreSQL, Mongo, and Kafka. When you do that, you hit a common issue in distributed systems called "dual write".

    • @asterixcode 1 month ago +1

      @RafaelPonte I have the same use case, where I need to write to Mongo, Kafka and also to a Google Cloud Storage bucket within the same transaction. Do you by any chance know how to solve this problem so I get all or nothing? Or, if that's not possible, how would we solve this problem then?

    • @rabah4306 1 month ago +1

      @RafaelPonte Thank you :)

    • @MrKar18 29 days ago +1

      For mongo, you can spin a new session with transaction as well, manually. However for Kafka if the produced records are idempotent, you can use the mongo transaction support above to achieve the same.

  • @eddwinpaz 1 month ago +1

    I loved the talk, but I'm not sure whether he wanted to talk about Spring Boot or run for office, hahaha.. just kidding!

  • @lacerdaph23 1 month ago +2

    Rafa is humble, freak and beautiful

    • @RafaelPonte 1 month ago

      Hehe, you're very kind, my friend! ❤

  • @TJ-hs1qm 1 month ago +1

    Is he describing Spark 😆?

    • @RafaelPonte 1 month ago +1

      Thanks for the comment 😊
      Do you mean Apache Spark? hehe

  • @ereboucas 1 month ago +1

    Almost made me want to work with boring techs again ;)

    • @andreas_bergstrom 1 month ago +1

      I’m moving back to Java/JVM after 15 years in Node/JS/Python

    • @RafaelPonte 1 month ago

      Boring techs are amazing! 🙌🏻

  • @fbarrosoflf 1 month ago +1

    Congrats, nice job!