Faster geospatial queries in MySQL

Sdílet
Vložit
  • čas přidán 21. 07. 2024
  • Searching a table in MySQL based on latitude and longitude columns.
    ------------------
    📚 Learn more about PlanetScale
    • Website: planetscale.com/
    • Docs: docs.planetscale.com/
    ------------------
    00:00 A point of interest
    00:21 Haversine formula
    01:00 ST_DISTANCE_SPHERE distance calculation
    03:25 Filtering by ST_DISTANCE_SPHERE
    04:28 It's too slow
    05:11 Bounding boxes
    06:55 Using a bounding box
    08:10 Adding an index
    10:20 Adding a compound index
    11:34 A note about POINT columns
    ------------------
    💬 Follow PlanetScale on social media
    • Twitter: / planetscaledata
    • TikTok: / planetscale
    • Twitch: / planetscale
    • LinkedIn: / planetscale
  • Věda a technologie

Komentáře • 144

  • @TravisNorthcutt
    @TravisNorthcutt Před 10 měsíci +110

    Good to know this function has the optional radius argument so we can still use it when we go to Mars

  • @20toninho
    @20toninho Před 10 měsíci +28

    i really like the way how he teaches ...

  • @Maazin5
    @Maazin5 Před 10 měsíci +32

    If you're wondering how he got the 4 values on lines 6 and 8 at 7:00, use these formulas:
    LAT +/- (180/pi)*(1.05 miles / 3958.71 miles) and
    LON +/- (180/pi)*(1.05 miles / 3958.71 miles) / cos(pi * LAT / 180)

    • @SriTejaChilakapati
      @SriTejaChilakapati Před 10 měsíci +4

      Also, do note that these values are not universal. It depends on where your point of interest actually is located. Specifically how far it is from the equator.

  • @ZacZajdel
    @ZacZajdel Před 10 měsíci +26

    These videos by Aaron are so informative. I look forward to every single one and learn so much along the way. PlanetScale give this man a raise 😅🎉

    • @aarondfrancis
      @aarondfrancis Před 10 měsíci +5

      You're kind to say so. Thank you so much Zac! Forwarding that last part to my boss

  • @jhechtf
    @jhechtf Před 7 měsíci

    "see if i can explain it to you very quickly"
    "... what the f--"
    "okay, new plan"
    every damn video you get a chuckle out of me.

  • @ahmad-murery
    @ahmad-murery Před 10 měsíci +4

    It's always good idea to reduce the number of rows needed to be filtered using an expensive function beforehand.
    Keep it up, Aaron!

  • @ambuj.k
    @ambuj.k Před 10 měsíci +5

    I had once used haversine formula for one of my freelance projects and didn't realize how much performance I left on the table after watching this video. Please keep making this awesome content!

  • @fededelsur4306
    @fededelsur4306 Před 3 měsíci

    amazing explanation ! love this video and of course thanks!

  • @creativenull
    @creativenull Před 10 měsíci

    I read your post on this before! Really glad to see a video on it nicely explained :D

  • @nishatsayyed8326
    @nishatsayyed8326 Před 10 měsíci +1

    This is the type of quality in any tutorial. Looking forward to a LOT of these. Thanks ❤

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      We've only just begun 😎 Thanks for the kind words!

  • @JakobTheCoder
    @JakobTheCoder Před 3 měsíci +1

    Such a good video, thank you, Aaron

  • @eddie10
    @eddie10 Před 7 měsíci

    Excellent video! Thank you! 🚀

  • @tash1coder
    @tash1coder Před 7 měsíci

    wow! gonna use the same concept to improve job search by lat,long in my current project. Such a simple idea but a huge improvement. thank you man.

  • @alexrimui
    @alexrimui Před 3 měsíci

    This really helped me at work today. Thanks.

  • @osamaa.h.altameemi5592
    @osamaa.h.altameemi5592 Před 5 měsíci

    fantastic video but if we use a composite index on "x" , "y" and change the query to be
    where x>x1 and yy2
    that should work and make full use of the composite index on (x,y). Again thx for the fantastic video

  • @dimitridovgan6364
    @dimitridovgan6364 Před 10 měsíci

    Very interesting topic! Thank you very much!

  • @yangsui1301
    @yangsui1301 Před 10 měsíci +1

    I really like the way you explain to people, especially adding some funny stuff in the middle XD Keep going Aaron!

  • @sanyokes
    @sanyokes Před 10 měsíci

    This is the most useful and interesting video I have seen in the last few days. Amazing and insightful

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      Thank you! We'll keep 'em coming 💪

  • @AlexCastroDev
    @AlexCastroDev Před 10 měsíci

    My new research

  • @federico.magnani
    @federico.magnani Před 10 měsíci

    Incredibly clear and useful!

  • @anmai3805
    @anmai3805 Před 10 měsíci

    This is a beneficial video. Hope you can upload more videos on this subject. Keep going !!!

  • @rafaelacioly3252
    @rafaelacioly3252 Před 10 měsíci +2

    Amazing video! It would be great to see how to calculate the box and how to use the "Point" column as well :)

  • @Formula7Driver
    @Formula7Driver Před 7 měsíci

    I felt so smart when I figured this out on my own a couple of years ago

  • @InfinityFnatic
    @InfinityFnatic Před 10 měsíci +1

    Cool video! I like the minimal and focused presentation style and the usage of the simple UI. It looks very focused and doesn't distract from what you are trying to do. I've never had a chance to use planetscale but I will keep on watching your videos.

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      These are a ton of work, so I really appreciate the kind feedback!

  • @TWPO
    @TWPO Před 9 měsíci +1

    Excellent! You communicate concepts extremely well

    • @PlanetScale
      @PlanetScale  Před 9 měsíci

      Thank you! I really enjoy teaching

  • @MichaelWilding-n2w
    @MichaelWilding-n2w Před 15 dny

    This video was great! I actually will try to implement this because even though the geolocation functions I do have are for Admin only, it would be great to release to the wild and have users have access to it with speed.

  • @hans-joergwahmkow2621
    @hans-joergwahmkow2621 Před 10 měsíci +3

    I knew the trick with the bounding box. What I didn't know was that MySQL has that nice function. So I always did that calculation in my applications. Still learning new things every day. Keeps the job exciting. ;-)

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      > Still learning new things every day. Keeps the job exciting. ;-)
      Honestly this is such a good perspective. Thank you for the thoughtful comment

  • @huuhhhhhhh
    @huuhhhhhhh Před 10 měsíci

    This is gold

  • @Alekzv9
    @Alekzv9 Před 10 měsíci +1

    This is a really fun problem, reminds me one of my first tasks as a developer 7 years ago. Although I did it in SQLServer

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      I feel like every developer has been asked to do this at some point!

  • @erlanggafauzanrezaganie9595
    @erlanggafauzanrezaganie9595 Před 10 měsíci

    great content, hoping there is new contnt about performance

  • @rafaelhernand3z
    @rafaelhernand3z Před 10 měsíci +2

    Excellently explained 🎉
    Great videos with clear explanations. Do you offer consulting? I'm running into a time series issue with timezones and I'd be willing to book a consultation to get an understanding of how to best approach the problem.

  • @MatthiasFeist-de
    @MatthiasFeist-de Před 10 měsíci +3

    A follow up video about the spatial column type and the geospatial functions would be awesome.

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +2

      Absolutely! On my list. I feel like this video could've been an hour long 😅 More to come

    • @KyleMcNally
      @KyleMcNally Před 10 měsíci +1

      @@PlanetScaleI store my lat and lng as GEOMETRY, but I still pull them out using X() and Y() to do some work. video wanted!

  • @keenoogodlike
    @keenoogodlike Před 10 měsíci

    Very useful feature I just know. I must try it

  • @Laflamablanca969
    @Laflamablanca969 Před 10 měsíci

    Damn your videos are so good

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      These comments are wonderful to read. Thank you so much

  • @Ceyesse
    @Ceyesse Před 10 měsíci

    Is there a way to select all the points in a 1 hour radius, instead of 1 mile radius ?

  • @kayodionizio2886
    @kayodionizio2886 Před 5 měsíci

    i have an applicatipion that save your geolocation position when you check in... isn't best to use decimal for EXAT location? and how much long should be this numbers?

  • @pflasterstrips7254
    @pflasterstrips7254 Před 5 měsíci

    hmmm, i wonder how you could boost performance by putting the points in a grid to quickly decide handle grid squares that are entirely in the circle

  • @i000alen
    @i000alen Před 9 měsíci

    "because I don't speak meters" lmao😂

  • @svoloSlovakia
    @svoloSlovakia Před 7 měsíci

    which editor/ide are you using?

  • @jogindersingh9096
    @jogindersingh9096 Před 6 měsíci

    What editor/IDE are you using?

  • @jameslucas5590
    @jameslucas5590 Před 10 měsíci

    I live in SA. very cool.

  • @meetpithadia746
    @meetpithadia746 Před 9 měsíci

    We need an entire MySQL course from scratch from Aaron

    • @PlanetScale
      @PlanetScale  Před 9 měsíci +1

      Boy do I have good news for you PlanetScale.com/mysql

  • @erikgurney306
    @erikgurney306 Před 7 měsíci

    32 second mark is legendary

  • @danielkirsch4351
    @danielkirsch4351 Před 10 měsíci

    Great video, clear explanations - this gets me looking forward to learn more about MySQL geospatial capabilities.
    I wanted to ask what MySQL client are you using - it looks like it may run queries instantly as you type, almost like hmr- is that so?

  • @johnypatera160
    @johnypatera160 Před 7 měsíci

    Nice video Aaron, but: You should use “Restaurant” condition from the very beginning, because in that case initial query will take less then ~2.2s. So it its a bit misleading and can lower value of otherwise very informative video. #feedback #keepGoing

  • @TheMoonWatcher
    @TheMoonWatcher Před 9 měsíci

    I wonder if you would be able to gain some performance if you use a spatial index and compound index by doing a sub-query/join 🤔 Or having the sub-query would negate any potential gains from using a spatial index.

    • @PlanetScale
      @PlanetScale  Před 9 měsíci +1

      Interesting question... I think I'll do another video geo stuff. There's a lot more here to explore

  • @RandomGeometryDashStuff
    @RandomGeometryDashStuff Před 10 měsíci

    05:20 how would bouding box look if it contains north or south pole?

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      That's... a great question. I don't know! I'd have to research

    • @nathanbrown2640
      @nathanbrown2640 Před 10 měsíci

      I think your longitude wouldn't matter - so just set a maximum latitude if you are looking around the north pole, and a minimum latitude if looking around the south pole?
      Though I'm imagining the centre point is at the poles, but if you are 0.5miles south of the North Pole... not a clue! Yes, time to get researching!

  • @coordinates_
    @coordinates_ Před 5 měsíci

    sorry, what's the name of the sql too you're using please

  • @nathanbrown2640
    @nathanbrown2640 Před 10 měsíci

    @PlanetScale, just as you created your compund type_lon index, I wondered:
    "Could we create a compound index with truncated values of our latitude and longitude?"
    It would be like indexing all points by with map grid square they fall into.
    Then our 'bounding box' redundant query becomes 'the point falls into one of the 4/8/16 grid squares'
    A bit more work perhaps in query design to calculate what the valid grid squares would be, but perhaps that would allow for the index reducing down to just the bounding box contents more immediately, rather than still reading all of our long rectangle?

    • @nathanbrown2640
      @nathanbrown2640 Před 10 měsíci +1

      Answering my own question: yes!
      I work with UK postcode data occasionally, and adding an generated column and then indexing it improved the speed for finding postcodes nearby by a further MASSIVE factor:
      alter table postcodes add column gridsquare varchar(15)
      GENERATED ALWAYS AS (CONCAT(TRUNCATE(latitude, 2), ":", TRUNCATE(longitude, 2))) STORED;
      alter table postcodes add index gridsquare_index (gridsquare);
      My performance improvement is hard to work out as it got vanishingly small:
      Find all postcodes within 10km of one point: 1452 postcodes returned
      - using no bounding box: 0.4 seconds
      - using bounding box and index: 0.1 seconds
      - using gridsquares:

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      Yes! You can absolutely do that. I originally wrote an article with that method, but it didn't make the video: aaronfrancis.com/2021/efficient-distance-querying-in-my-sql Perhaps the next one 🤔

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      Bravo on the explanation 👏👏 This is delightful

    • @nathanbrown2640
      @nathanbrown2640 Před 10 měsíci

      ​@@PlanetScale Thank you! Ooh, I didn't realise you had an article on it too!
      And compliment right back at you - it's a really well explained article! And, yes, that's so clear that by tranching one of the variables, we can make a compound index that is SOOO much better!
      (For others who haven't read the article - and btw, REALLY do! - the idea is to create a new column rounding each decimal to the nearest 0.001, thereby allowing you to query for specific starts of the decimal like 'it must be 30.254 or 30.255' as a way of indexing bounding boxes and getting another performance boost! It's really neat and Aaron explains it very nicely!)
      Also, I didn't know that MySQL `IN` was so much more optimised than `BETWEEN` - perhaps another weird one for humans vs computers!
      Is there any benefit then to making the index ENTIRELY from tranched values - would it be quicker? Or is the point of an index that we don't want to have so many rows with the same index and so reduce its power/versatility for other query patterns?
      (I feel like I have only started scratching the surface of the power of indexes recently, so I'm intrigued by how to get more out of them!)

  • @quillaja
    @quillaja Před 8 měsíci

    Why are you storing your "point" in 2 different fields? Just use the POINT column type and add (spatial) index on that. Then buffer on your geometries of interest, and select those that intersect with the buffer. Done.

    • @Tbagstealer
      @Tbagstealer Před 5 měsíci

      Could you expand on bigger on your geometries of interest?

  • @AlexCernat
    @AlexCernat Před 9 měsíci

    excelent video, "bounding rectangle" is an excellent way to speed up such kind of queries (personally I just "assumed" that earth is "flat", so euclidian distance ftw :-P - of course, it's not correct, but it was about closest locations, not all location in a range, so ... trading some accuracy to performance)
    what I didn't understand from your video is why do you need some extra distance offset, shouldn't be the rectangle just tangent to the circle in all four points? (or maybe I make the same mistake thinking 2D, but the earth is 3D?)

    • @PlanetScale
      @PlanetScale  Před 9 měsíci

      Haha I made the square a little bigger because I was afraid I was gonna do it wrong! It's probably fine to do it tangentially, but I got scared 😂

  • @lavisharma3210
    @lavisharma3210 Před 10 měsíci

    What db client/tool is that ?

  • @jessieonfly
    @jessieonfly Před 10 měsíci

    Can you share the test data?

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      Unfortunately I can't. There might be some open data sets with lat/lon pairs though!

  • @medilies
    @medilies Před 10 měsíci +1

    Why not group LONG and LAT in one POINT type column

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +2

      You can! I mention that at the end. I'll have to do another full video on that, as it's a big topic.

  • @njdarda
    @njdarda Před 10 měsíci

    the earth is not flat? and to think that I trusted this guy!

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      To be fair I do know more about databases than celestial bodies, so stay tuned for more database content

  • @nikhilpsathyanathan
    @nikhilpsathyanathan Před 8 měsíci

    New to me

  • @sohailyunusmogambo
    @sohailyunusmogambo Před 8 měsíci

    Why no mention of geohash?

    • @PlanetScale
      @PlanetScale  Před 8 měsíci

      They aren't super common, even if they are neat

  • @sanathkumaru6358
    @sanathkumaru6358 Před 10 měsíci

    This is God level content!!!!.Glad that i stumbled upon the video.Made my day❤

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      Thank you so much! Lots more to come

  • @osama6795
    @osama6795 Před 8 měsíci

    good explanation but im not getting the most accurate result the same as it is in google maps

    • @PlanetScale
      @PlanetScale  Před 8 měsíci

      Google Maps measures by driving distance typically. This is "as the crow flies"

  • @tunmis3_183
    @tunmis3_183 Před 10 měsíci

    Could someone please explain why he said he wouldn't use MySQL for driving directions?
    @11:30
    Because that's exactly what I wanted my app to do and is the reason I clicked on this video 😂

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      This is able to calculate the distance between two points in a straight line, which doesn't always correspond with roads or driving time. If two points are one mile apart, but there is a mountain or lake between them, that will dramatically change the travel time! You'd need a navigation API for that. Something that can tell you where roads are.

    • @tunmis3_183
      @tunmis3_183 Před 10 měsíci

      @@PlanetScale Thank you
      Perhaps I'll look into using the Google maps API

  • @hrd1711
    @hrd1711 Před 7 měsíci

    Good video. You might also want to look into Geohash where you can represent every coordinate with a string. You can then index that column with a string and use like % query.

  • @AngelEduardoLopezZambrano
    @AngelEduardoLopezZambrano Před 9 měsíci

    Shout out to the flat earthers that didn't like this video 0:33

  • @MrJellekeulemans
    @MrJellekeulemans Před 10 měsíci

    To find the closest point you can probably just use pythagoras right? Unless you need to find points on the other half of the earth, this will work just fine.

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      I don't know enough about GIS to say, but even over several miles I think it would get inaccurate quickly

    • @musicmaker368
      @musicmaker368 Před 9 měsíci +1

      For short distances like across a large city. But it will become quite inaccurate if need say 100 miles

    • @MrJellekeulemans
      @MrJellekeulemans Před 9 měsíci

      The earth is a sphere, so no matter in which direction you travel a 100 miles, you always follow the same curvature. So, it doesn't matter if you just use Pythagoras's theorem instead right? You only need a calculated distance when you want to select it or compare it to some arbitrary value. Otherwise I think it's safe to use a heuristic.

    • @quillaja
      @quillaja Před 8 měsíci

      If your "points" are in longitude-latitude where units are degrees, then you'll get the "distance" in degrees as well. The distance per degree latitude changes as a function of longitude, so accurate conversion isn't simple. That's why the haversine formula exists.

    • @MrJellekeulemans
      @MrJellekeulemans Před 8 měsíci

      @@quillaja I understand that the haversine formula calculates the accurate distance between two points, but I think that when only comparing the distance, you can use a heuristic like pythaghoras

  • @nomadshiba
    @nomadshiba Před 10 měsíci +1

    pratically open world chunks

  • @j4ggi
    @j4ggi Před 10 měsíci +1

    > 2:45 you had π km distance and you converted that to miles. wtf

    • @aarondfrancis
      @aarondfrancis Před 10 měsíci +1

      Did I do it wrong? Or you just prefer meters?

    • @j4ggi
      @j4ggi Před 10 měsíci

      @@aarondfrancis I just meant you had such a nice number of km (3.1415...) between two first points and you promptly got rid of it. but I'm just joking - apologies if that wasn't clear. Great video in general - not many people know you can deal with spatial data on the db side at all, not to mention how to do it efficiently. Thanks mate and keep it up

  • @iyadahmed430
    @iyadahmed430 Před 10 měsíci

    btw which gui is that?

  • @night23412
    @night23412 Před 7 měsíci

    guys don't pause at 0:01

  • @sirhenrystalwart8303
    @sirhenrystalwart8303 Před 10 měsíci +1

    Why do you make the bounding box larger than the circle? I would have expected the box have sides of length 2*r, and thus perfectly bound the circle.

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      Honestly, just to be kinda safe? I don't think the bounding box calculation I used is done on a sphere, but rather on a plane. So to take into account the haversine deal, I just bumped it up a bit. The worst possible thing would be for my redundant condition not to be redundant after all, but rather filter out some things that should be there.

  • @hasithamapalagama3810
    @hasithamapalagama3810 Před 10 měsíci

    Converts meters to miles.
    * close the tab

    • @aarondfrancis
      @aarondfrancis Před 10 měsíci +1

      Thanks for the comment! Every little thing helps 🤗

  • @Mempler
    @Mempler Před 10 měsíci

    Correct, earth is not flat. It is donut shaped!!

  • @BrunoGabrielAraujoLebtag
    @BrunoGabrielAraujoLebtag Před 7 měsíci

    I don't speak meters... Muricas

  • @leotravel85
    @leotravel85 Před 10 měsíci

    wtf is a mile

  • @raphaelmobistacla2236
    @raphaelmobistacla2236 Před 10 měsíci

    The highest point of this video was "... since I don't speak meters.".

  • @der.Schtefan
    @der.Schtefan Před 9 měsíci

    A bunch of POIs? A speckle of POIs? A murder of POIs? A database table of POIs?

  • @invinciblemode
    @invinciblemode Před 10 měsíci +1

    When you explicitly converted from metres to miles, I had to unsubscribe. Sorry man 😂

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      Other than that part what'd you think?

    • @invinciblemode
      @invinciblemode Před 10 měsíci +1

      @@PlanetScale awesome video! Love the incremental approach to optimising queries.

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      @@invinciblemode great! glad to hear it

  • @akam9919
    @akam9919 Před 9 měsíci

    the earth is flat

  • @c01nd01r
    @c01nd01r Před 10 měsíci +1

    > 0:33
    Dislike, unsubscribe

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +9

      FlatPlanetScale

    • @becelli
      @becelli Před 10 měsíci

      @@PlanetScale You can't spell planet scale without plane

  • @BanAaron
    @BanAaron Před 10 měsíci

    The earth is flat. Disliked, didn't comment, unsubscribed

    • @PlanetScale
      @PlanetScale  Před 10 měsíci

      Even set your name to "Ban Aaron," good commitment to the bit! (My name is Aaron)

  • @Ceyesse
    @Ceyesse Před 10 měsíci +1

    Is there a way to select all the points in a 1 hour radius, instead of 1 mile radius ?

    • @PlanetScale
      @PlanetScale  Před 10 měsíci +1

      Unfortunately not. You'd need to calculate drive times for that, and that's beyond the scope of a database.

    • @nathanbrown2640
      @nathanbrown2640 Před 10 měsíci +1

      I'm guessing you will want to do the time calculation outside of the database, as suggested.
      That calculation is probably expensive, so the fewer times you have to run that, the better...
      Hey presto! Another reason to use a bounding box!
      Decide what is the max distance you could travel in 1 hour from your location in each direction.
      Add & subtract from your centre, and that becomes a new bounding box.