Working with time series data in MySQL

Sdílet
Vložit
  • čas přidán 23. 08. 2024
  • To learn more about PlanetScale, head to planetscale.co...!
    💬 Follow PlanetScale on social media
    • Twitter: / planetscale
    • Discord: / discord
    • TikTok: / planetscale
    • Twitch: / planetscale
    • LinkedIn: / planetscale

Komentáře • 80

  • @DevanSabaratnam
    @DevanSabaratnam Před 7 měsíci +10

    I've been working in MySQL for over 20 years now, but learned a TON from this video - Thank you for putting out such great learning content Aaron!

  • @ahmad-murery
    @ahmad-murery Před 7 měsíci +24

    Wow Steve's graphic design skill is getting better, compared to the previous video Aaron in this video looks more realistic and is barely distinguishable from the real one 👍😁
    anyway, even when the query is fast enough, we also need to consider the number of times we may execute it (multiple users requesting the data), so using a summary table is definitely helpful here.
    Thanks Aaron,

  • @thewayis_meh987
    @thewayis_meh987 Před 7 měsíci +21

    Classic Aaron, using MySQL as a calculator

    • @PlanetScale
      @PlanetScale  Před 7 měsíci +2

      Haha didn't even think about it

  • @hans-joergwahmkow2621
    @hans-joergwahmkow2621 Před 7 měsíci +3

    If I understand this correctly, one important feature of a window function is that everything inside is on a rolling basis. That is for sure not something you could ever do with a simple GROUP BY. I do have an application where this may actually come in very handy. Thanks Aaron, great content as always!

  • @edgars-strods
    @edgars-strods Před 7 měsíci +46

    Maybe I did not catch the specific point why window function was used here instead of simple GROUP BY "bucket"?

    • @mentalmaniac76
      @mentalmaniac76 Před 7 měsíci +2

      my guess is so that you are able to have multiple, separate, "groups" and operate on them all within the same query

    • @xelhuajosephcoronaperez3531
      @xelhuajosephcoronaperez3531 Před 7 měsíci

      @planetscale???

    • @PlanetScale
      @PlanetScale  Před 7 měsíci +11

      Window functions are a great way to compute running totals, averages, sums, etc. Or do something specific with the first or last row. Or peek at the next row or the previous row within a group. Or group by different things within a single query. They're very powerful!

    • @CodeErrorOfficial
      @CodeErrorOfficial Před 7 měsíci +9

      @@PlanetScale yes but this specific task would be better with group by. still a great thing to know

    • @victorcosenza254
      @victorcosenza254 Před 7 měsíci +1

      in this case it could be done with groupby right? Or im missing something?

  • @anderskozuch7838
    @anderskozuch7838 Před 7 měsíci +4

    Thank you for this video!! Wonderful to dive into SQL like this and absolutely fantastic to see your enthusiasm Aaron. It's very inspiring and I am really excited to learn about the summery table when you make that video. Such a nice idea to make a query faster by just sort of "computing it before its needed" !

  • @referralhelper
    @referralhelper Před 7 měsíci +2

    One of the best CZcams channels when it comes to sql!

  • @ZacZajdel
    @ZacZajdel Před 7 měsíci

    The fact this content is free is outstanding. Beyond blessed to be a developer in 2024!

  • @warutt5356
    @warutt5356 Před 7 měsíci

    Great job! So far, the best tutorial clips on common windows functions.

  • @jfdirienzo
    @jfdirienzo Před 7 měsíci

    This channel just keeps on getting better 😍

  • @alexrix4063
    @alexrix4063 Před 7 měsíci +4

    Video request: Geospacial indexing

  • @massaynus
    @massaynus Před 6 měsíci

    GROUP BY looking at you menacingly from the corner 😂

  • @OnlinePseudonym
    @OnlinePseudonym Před 7 měsíci +1

    Amazing! Been trying to do this for ages! The explanation is great.

  • @shohruxshomaxmudov1225
    @shohruxshomaxmudov1225 Před 7 měsíci

    This is like learning chess openings: at first everything is clear and easy, but in the end, your mind will explode.

  • @jricardoprog
    @jricardoprog Před 7 měsíci

    Thank you!!! Good to see this type of content. 🍻

  • @d47im5e
    @d47im5e Před 7 měsíci

    Cool, detailed, niche case handling explanation video, keep it up!

  • @barmalini
    @barmalini Před 7 měsíci +1

    Thanks for a great video. SQL is clearly not amazingly fast when it gets to bigger datasets and more complex analysis. Wouldn't it be beneficial to use another tool for a task like this, something like Spark perhaps?

  • @cagdasucar3932
    @cagdasucar3932 Před 7 měsíci

    Idea for your summary table video: in order to maximize performance and reduce unnecessary storage, I like to do summary view + index instead of an actual table.

  • @chao-chienchen4393
    @chao-chienchen4393 Před 7 měsíci

    Perfect content right there!

  • @developertpl
    @developertpl Před 7 měsíci +5

    17 minutes in and still asking myself why are we not grouping by key and bucket instead of trying to get the last row of each group.

    • @Szarlus
      @Szarlus Před 7 měsíci

      My thoughts exactly as well :P Probably could even do `GROUP BY bucket ORDER BY bucket`

    • @goaserer
      @goaserer Před 7 měsíci

      For the aggregations a simple group by should work as well. It would complicate the query in case you actually need the fields for first- and last timestamp (although one could be calculated back from the bucket).
      In the end using window functions was most likely the more interesting and educational topic for a video, even though the simple example could be solved in other ways

    • @developertpl
      @developertpl Před 7 měsíci

      @@goaserer I think it’s more like a leading video for the summary tables topic. I recently worked on a topic like this so I keep seeing group by solutions everywhere I look.

    • @Szarlus
      @Szarlus Před 7 měsíci +1

      @@developertpl I agree that showing window functions approach here is a nice educational approach. However, since it was mentioned that it is a no go for PlanetScale, it would be nice to also show the group by approach as an alternative here. Especially after we filtered only last record of each bucket which is literally what would group by achieve.
      From my point of view window functions are best used when you're reviewing per-record data in context of its group/bucket, which wasn't the case here.
      Tl;dr I enjoyed the video and always appreciate videos on this channel as they are of high quality and very informative. Just pointing out showing group by approach as well would make it "whole" :)

    • @developertpl
      @developertpl Před 7 měsíci

      @@Szarlus Great content for sure! Always a pleasure.

  • @dhawaljoshi
    @dhawaljoshi Před 7 měsíci

    MySQL lacks mviews, anyway to get similar functionalities using just DB options. I've experimented with triggers on insert,update,delete that reruns the query to fill the table but incase of multiple inserts or updates it's not quite what was expected.

  • @jackjsy
    @jackjsy Před 7 měsíci

    Perfect timing!

  • @mogadanez
    @mogadanez Před 5 měsíci

    with rollup table potential problem is with presenting data in different timezones, especially when doing stats per day

  • @noamanahmed1
    @noamanahmed1 Před 7 měsíci +1

    Video Suggestion
    How to handle case where you have to insert n numbers of rows and then you need the primary key(id) for those to be inserted in a child row WITHOUT making n insertion queries

  • @GringoDotDev
    @GringoDotDev Před 7 měsíci +2

    another video I'd be very interested to see is a deep dive on MySQL replication

    • @conceptrat
      @conceptrat Před 7 měsíci +1

      Multi master is so much fun 🤔😬

  • @JacoBoogie
    @JacoBoogie Před 7 měsíci

    Thanks Steve!

  • @xesau
    @xesau Před 7 měsíci +2

    Can't you just SELECT MIN(dt), AVG(v), MIN(v), MAX(v) FROM timeseries GROUP BY FLOOR(UNIX_TIMESTAMP(dt) / bucket_length_in_secs)?

    • @michaelmo5
      @michaelmo5 Před 7 měsíci

      Bump! I was thinking the same.

  • @mogadanez
    @mogadanez Před 5 měsíci

    I get same output with groupBy but times faster, only one thing missed is first_value(dt), but it can be replaced by MIN(dt)

  • @hannielgutierrez9139
    @hannielgutierrez9139 Před 7 měsíci

    Hi.
    I wanna thank you for beeing a great teacher. I have learnt a lot of things from you.
    I have a question, traditionally, when i have the case where i need to do some calculations, and the reuse them, I usually create a temporary table, so that i can consume the data that has been calculate and not repeat it.
    What would it be your recommendation , use subqueries, use temporary table or the one that I just leaned, the with utility.

  • @abdelilahou2822
    @abdelilahou2822 Před 7 měsíci

    I dont use mysql i dont even use planetscale (no offense) but i learn alot of things from you, and I like how you explain stuff

  • @BloodyClash
    @BloodyClash Před 7 měsíci

    😁only understand about 1% but i'm sooo impressed what you are able to and also what mysql is capable of. i only use it as a middleman in my java and c# programs

  • @zeratax
    @zeratax Před 7 měsíci

    slick animations!

  • @LenWoodward
    @LenWoodward Před 7 měsíci +2

    If you keep making videos like this, my `favourites` saved list is just going to look like your channels `videos` page.

  • @fredyy
    @fredyy Před 7 měsíci

    Why don't just group by bucket and use mysql's AVG function to get the average value in the bucket?

  • @RajveerSingh-vf7pr
    @RajveerSingh-vf7pr Před 7 měsíci

    Hi PlanetScale,
    I have a Table, with one column, which are the timestamps on which server was observed alive.
    I want to write a select query, which returns me the data for every minute,
    if there exists a timestamp in that minute, 1, else 0.

  • @joelkuijper
    @joelkuijper Před 7 měsíci

    Have you ever heard of an espresso martini?

  • @batvetone
    @batvetone Před 7 měsíci

    If we have a bucket as calculated column, would group by do the job?

  • @GringoDotDev
    @GringoDotDev Před 7 měsíci

    this is great!

  • @user-lx9tx8ie8o
    @user-lx9tx8ie8o Před 6 měsíci +1

    SAD there is no more hobby plan in PlanetScale, :/

  • @YandiBanyu
    @YandiBanyu Před 7 měsíci +1

    Is there an advantage counting these on the database level rather than the application level? Strictly speaking, is there any performance advantage when doing this calculation on the database?

    • @_MB_93
      @_MB_93 Před 7 měsíci +1

      It is faster
      First, you don't pull tens of thousands of rows over the network
      Second, even if MySQL is not optimized for timeseries data, it is still dozen of times faster than say, Javascript or Java. After you got the grouped result which has reduced size, then you pull that data to application level and display them as charts

    • @YandiBanyu
      @YandiBanyu Před 7 měsíci

      @@_MB_93 But, will it create another bottleneck elsewhere? I mean, database connection is a resource. Will performing calculation hold those connections? Because if it is, then wouldn't it be better to just dump data and then perform calculation on the application side since the application server is typically easier to scale than db server? But of course, this also needs to be performed after profiling.

    • @_MB_93
      @_MB_93 Před 7 měsíci

      @@YandiBanyu It is a bottleneck no matter where you put it, its just that database is a specialized software for aggregating and transforming data, I personally think it is "re-inventing the wheels" if things such as "group by" is performed on application side.
      As for scaling, well, as Aaron said in the video, MySQL can take you a long long way before you actually need it.

    • @YandiBanyu
      @YandiBanyu Před 7 měsíci

      @@_MB_93 yeah, there will always be bottleneck somewhere. But some of it is easier to solve than others. I agree that not many applications will scale beyond what RDBMS such as MySQL offer, but it is still good to keep in mind too when doing costly operations on the database level.
      After all, it is (IMHO) harder to reduce the bottleneck of a single database server (before going to a more complex clustering strategy) than say, spin up another server instance behind a load balancer. This can matter when making an application that needs on premise data for some security/legal reasons.

  • @JakobStrasser
    @JakobStrasser Před 7 měsíci

    Always forget the end on the case. Every time.

  • @Vali615
    @Vali615 Před 7 měsíci

    What about data gaps?

  • @userasd360
    @userasd360 Před 7 měsíci

    can you provide the data file for the data that you have used?

  • @TheEkkas
    @TheEkkas Před 7 měsíci

    MySql Gandalf

  • @Wizatek
    @Wizatek Před 7 měsíci

    Can't you just group by the bucket calculation?

  • @davidharting3119
    @davidharting3119 Před 7 měsíci

    brb gonna go build an APM tool

  • @nikhilpsathyanathan
    @nikhilpsathyanathan Před 7 měsíci

    Totally new for me😅

  • @milad8589
    @milad8589 Před 7 měsíci

    🤯👏

  • @testdummy6792
    @testdummy6792 Před 7 měsíci

    With all the talk on indexing, what does it really do? Is there such a thing as too many indexes? I'd imagine there's eventually a limit, either through practicality or performance, stopping you from adding an index for each and every column.

  • @mityukov
    @mityukov Před 7 měsíci +1

    GROUP BY on steroids 😅

  • @SashNone
    @SashNone Před 7 měsíci

    I was lost around 10 minutes in.....

  • @userasd360
    @userasd360 Před 7 měsíci

    Isn't summary table similar temp table?