Ray: Faster Python through parallel and distributed computing

  • Added on 29. 08. 2024
  • Parallel and distributed computing sounds scary until you try this fantastic Python library. Ray makes it dead simple to run your code on a cluster of computers with minimal changes to the actual code. Check it out!
    MY OTHER VIDEOS:
    ○ 5 Common Python Mistakes: • 5 Things You're Doing ...
    ○ 5 Amazing Python Libraries: • Five Amazing Python Li...
    ○ Making Python fast: • Can VSCode be a reason...
    ○ VSCode's Python Interactive Mode: • VSCode's Python Intera...
    ○ Learning programming language Julia: • How to learn Julia, a ...
    Twitter: / safijari
    Patreon: / jackofsome
    #python #vscode #notebooks
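
For readers who have not seen Ray code, here is a minimal sketch of the "minimal changes" idea from the description. It is not the code from the video: `square`, `run_serial`, and `run_with_ray` are hypothetical names chosen for illustration, and only `ray.init`, `ray.remote`, `.remote()`, and `ray.get` are Ray's own calls.

```python
def square(x):
    """Plain Python function; nothing Ray-specific here (hypothetical example)."""
    return x * x


def run_serial(values):
    """Baseline: call the function in a normal loop."""
    return [square(v) for v in values]


def run_with_ray(values):
    """Same work, submitted as Ray tasks."""
    # Imported here so the serial path still works where Ray is not installed.
    import ray

    ray.init(ignore_reinit_error=True)   # starts a local Ray instance if none is running
    remote_square = ray.remote(square)   # same effect as decorating square with @ray.remote
    futures = [remote_square.remote(v) for v in values]  # returns object references immediately
    return ray.get(futures)              # blocks until every task has finished


if __name__ == "__main__":
    print(run_serial(range(5)))
```

With Ray installed, `run_with_ray(range(5))` should return the same list as `run_serial(range(5))`; for a function this cheap the scheduling overhead will likely outweigh any gain.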

Comments • 78

  • @shadid_io
    @shadid_io 1 year ago +8

    The way you compacted an hour's worth of content into a 10-minute video is just amazing.

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq 3 months ago +1

    Astonishingly efficient pedagogy here. Props, man! Thanks!!

  • @jmnunezd1231
    @jmnunezd1231 4 years ago +5

    Awesome video Jack! Keep on with the good content bro!

  • @prashanthb6521
    @prashanthb6521 3 years ago +5

    Simple and lucid, thanks a lot. This gives me hope for my current project.

  • @yudhaputra7035
    @yudhaputra7035 3 years ago +3

    Thank you. I've been trying to do distributed computing using Hadoop and Spark, and I just found out that there is an easier way to do it. I will try it next time.

  • @SyedFarhanhaseeb
    @SyedFarhanhaseeb 4 years ago +1

    You are great, Jack! I learn new things with you. Keep it up!!

  • @vighneshpp
    @vighneshpp 1 year ago

    What an amazing, awesome video! Thanks a TON for making this gem!

  • @tomasmizera2
    @tomasmizera2 3 years ago +1

    Great introduction to Ray, thank you!

  • @user-hy2cs1zt3t
    @user-hy2cs1zt3t 2 years ago

    The best Ray intro I have seen.

  • @thisoldproperty
    @thisoldproperty 1 year ago

    Thank you! This was an awesome presentation, well planned out and thought through. I'm going to give Ray a go!

  • @lamduongtung8783
    @lamduongtung8783 3 years ago

    Nice work, Jack

  • @sreenath1987
    @sreenath1987 7 months ago

    Awesome video.
    Small nit though… in the second time test… imread comes after start time. Wouldn’t that increase the time taken!?
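
To make the timing point in this comment concrete, here is a small sketch of the difference between starting the clock before or after a load step. It does not reproduce the video's code: `load_data` is a hypothetical stand-in for an expensive read such as `imread`, and `process` is a placeholder workload.

```python
import time


def load_data(n):
    """Hypothetical stand-in for an expensive load step such as reading an image."""
    return list(range(n))


def process(data):
    """Placeholder workload: sum of squares."""
    return sum(v * v for v in data)


def time_processing_only(n):
    data = load_data(n)               # loaded before the clock starts
    start = time.perf_counter()
    result = process(data)
    return result, time.perf_counter() - start


def time_load_and_processing(n):
    start = time.perf_counter()
    data = load_data(n)               # load time is now part of the measurement
    result = process(data)
    return result, time.perf_counter() - start
```

Both functions return the same result; only the second one charges the load step to the measured time, so comparing a run timed one way against a run timed the other way is not like for like.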

  • @victornoagbodji
    @victornoagbodji 4 years ago +4

    🙏 🙏 😊 great intro! 😅 show us your tmux, vim setup?

  • @themarksmith
    @themarksmith 3 years ago +1

    Interesting stuff, thank you

  • @Tbone913
    @Tbone913 1 year ago

    Excellent.

  • @Kattemageren
    @Kattemageren 2 years ago

    That is very awesome, will have to try it out

  • @ChromePlatypus-
    @ChromePlatypus- 2 years ago

    Thanks for the info. I am new to setting something like this up and I have a couple of questions. When using the function decorator with .get(), will Ray automatically use multiprocessing on each individual machine as well? My current implementation uses a Python multiprocessing pool with apply_async. Also, is there a tool or an easy way to install the same venvs and dependencies on many cloud servers? You also mention having a good internet connection or LAN, but I imagine in most applications it will be a cloud server connection.

  • @khemmahato8420
    @khemmahato8420 2 years ago

    Omg best solution for my case

  • @tonmoyroy7145
    @tonmoyroy7145 2 years ago

    Beautiful!

  • @k4is3r
    @k4is3r 2 years ago

    This is awesome... thanks for sharing.

  • @datasciencetoday7127
    @datasciencetoday7127 1 year ago

    This is very helpful even in 2023.

  • @ameersohailsyed6243
    @ameersohailsyed6243 2 years ago

    Hi, when I try to connect another laptop to the head node I get an error saying it can't connect to the GCS address and to check that it is correct or that the firewall is turned off. Can you help us, please?

  • @syedsohail1513
    @syedsohail1513 2 years ago

    Can you please help? The client node gets connected to the cluster, but the worker is just idle; all the workload is done by the host computer.

  • @namegoeshere3398
    @namegoeshere3398 3 years ago

    How would I have it so I just have to start the server and not type the command using ws (websocket)?

  • @vishwasshankar3929
    @vishwasshankar3929 1 year ago

    Can you use this to parallelize the code for cluster computing?

  • @BrunaBorgesA
    @BrunaBorgesA 3 years ago +1

    Does it work with Windows? I can't seem to get Ray installed.

    • @JackofSome
      @JackofSome 3 years ago

      I've used it under WSL but it's a suboptimal experience. I'm not sure if it works under Windows proper. Are you using Anaconda?

  • @crossvariation
    @crossvariation 3 years ago +1

    Very helpful video. How would this be hooked up to remote AWS servers? Thanks

    • @JackofSome
      @JackofSome 3 years ago +1

      They have documentation for deploying it to a number of clouds as well as putting it on Kubernetes in general. I haven't tried it myself. If I ever do, I'll make another video.

    • @crossvariation
      @crossvariation 3 years ago

      @JackofSome Sounds great, thank you. I'll check the docs out.

  • @yoganandaiyadurai9474
    @yoganandaiyadurai9474 3 years ago

    Great video. Where can I get the source code for this demo?

  • @xiangyu9445
    @xiangyu9445 3 years ago

    I tried running 'ray start --head --redis-port=8888' and got an error: 'no such option: --redis-port'

    • @JackofSome
      @JackofSome 3 years ago

      The command-line options may have changed. Run it again with --help and see if there's a new option for Redis.

  • @woulg
    @woulg 3 years ago

    Very cool

  • @amanvaishnavfit
    @amanvaishnavfit 2 years ago

    Hi, I installed Ray using pip and I am able to import it in a Python program, but when I try 'ray start --head' it shows 'command ray not found'. I am using Ubuntu. Any suggestions?

    • @JackofSome
      @JackofSome 2 years ago +1

      Seems to be a path issue. You can always import ray, look at ray.__path__, and then run the binary file near there by giving the full path

    • @amanvaishnavfit
      @amanvaishnavfit 2 years ago

      @JackofSome Thanks for the reply, but this was resolved. Now there is another problem. I have made a cluster of 4 nodes, and all nodes are connected. But when I execute the Python script it runs only on the head node and does not use the other nodes. When I checked the logs, I found something like this: Export to Agent Metrics Failed: Can't connect to all addresses.
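
For the earlier "command ray not found" problem in this thread, one way to locate a pip-installed command is to check the directories where pip normally places console scripts. The sketch below uses only the standard library; `find_console_script` is a hypothetical helper name, and the `bin` subdirectory of the user base is an assumption that holds for `pip install --user` on Linux and macOS.

```python
import os
import shutil
import site
import sysconfig


def find_console_script(name):
    """Return the full path of a pip-installed command, or None if not found."""
    found = shutil.which(name)  # first, look on the current PATH
    if found:
        return found
    # Otherwise check the directories where pip usually puts console scripts.
    candidates = [
        sysconfig.get_path("scripts"),             # scripts dir of this interpreter or venv
        os.path.join(site.getuserbase(), "bin"),   # assumption: --user installs on Linux/macOS
    ]
    for directory in candidates:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


if __name__ == "__main__":
    print(find_console_script("ray"))
```

If this prints a path that is not on `PATH`, either run the command by its full path or add that directory to `PATH`.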

  • @galaktoza
    @galaktoza 2 years ago

    Great tutorial, but what to do when neither the virtual machines nor the local machine are on the same network? I have 3 VMs on different networks in Azure and a local machine. How to spin these up? Do I have to have Redis server installed somehow manually? Help please.

    • @JackofSome
      @JackofSome 2 years ago

      A VPN would probably be your best option in that case, but you need to be careful with this kind of setup. You'll get charged for network activity and would need to design your inputs and outputs very carefully so that they're very small; otherwise network overheads will dominate your run time.

    • @galaktoza
      @galaktoza 2 years ago

      @JackofSome I reinstalled the VMs in a virtual network, so they are treated as being on the same local network. Will test this today. Thanks, Jack.

    • @galaktoza
      @galaktoza 2 years ago

      @JackofSome So, the approach of reinstalling the VMs and putting them inside a single virtual network succeeded, and now ray status shows that the nodes are connected. However, port forwarding over SSH for the Ray dashboard does not work, since the dashboard did not start on port 8625. In fact, with netstat I only see port 6734 being occupied.
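
The warning earlier in this thread about network overhead dominating run time can be put into rough numbers. The function below is a deliberately simplistic back-of-the-envelope model, not anything Ray computes: it assumes the whole payload is transferred once before work starts and that compute divides evenly across workers. `estimated_speedup` and its parameters are hypothetical names.

```python
def estimated_speedup(compute_seconds, payload_bytes, bandwidth_bytes_per_s, workers):
    """Rough model: serial time / (transfer time + compute split evenly across workers)."""
    transfer_seconds = payload_bytes / bandwidth_bytes_per_s
    distributed_seconds = transfer_seconds + compute_seconds / workers
    return compute_seconds / distributed_seconds


if __name__ == "__main__":
    # 60 s of compute on 4 workers over a 10 MB/s link (illustrative numbers only).
    print(estimated_speedup(60, 1_000_000, 10_000_000, 4))      # small payload
    print(estimated_speedup(60, 1_000_000_000, 10_000_000, 4))  # large payload
```

Under this model a 1 MB payload leaves the speedup close to the worker count, while a 1 GB payload over the same link makes the distributed run slower than the serial one.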

  • @dharikarsath8852
    @dharikarsath8852 3 years ago

    I cannot see the Ray dashboard after calling ray.init(); the browser says "This site cannot be reached". What should I do now?

    • @dharikarsath8852
      @dharikarsath8852 3 years ago

      I have tried in Colab, in Kaggle, and on my localhost as well.

    • @JackofSome
      @JackofSome 3 years ago

      That won't work in Colab and Kaggle. For your localhost, are you trying to access it from the same computer where the master node is?

    • @dharikarsath8852
      @dharikarsath8852 3 years ago

      @JackofSome Yes, I'm doing everything on the same computer.

    • @dharikarsath8852
      @dharikarsath8852 3 years ago +1

      I've installed ray, ray[default], and ray[dashboard] using conda, then I tried ray.init(). It gives me an IP, but if I copy that IP into the browser it doesn't show the dashboard. Everything is on the same system (localhost).

  • @JuanBPedro
    @JuanBPedro 4 years ago

    That was very helpful! I am setting up a small office with a couple of servers and will try this for sure. Does anyone know how to set up a VPN to connect from home? Thanks!!!

    • @namegoeshere3398
      @namegoeshere3398 3 years ago +1

      On the main server or host, google "my IP address"; it should look something like "123.456.78.90". Enter that into your address bar, log in to your wifi, and port forward. (Look up a video on YouTube for more info.)

  • @cristiangofiar3320
    @cristiangofiar3320 4 years ago

    Hello again. I have another problem:
    I am using WSL 2 on two Windows 10 computers to get the Ubuntu console. I have downloaded Anaconda and Ray to be able to make the connection as shown in the video. The server computer works fine and I can see everything on "localhost", but the client node that I want to connect cannot do so. The console returns: "RuntimeError: Unable to connect to Redis. If the Redis instance is on a different machine, check that your firewall is configured properly."
    I have already disabled the firewall on both computers manually from Windows (including the full antivirus), and also disabled the firewall in Ubuntu with "sudo ufw disable".
    Please help. I need it to work for an integrative practical assignment in a course at my university.

    • @JackofSome
      @JackofSome 4 years ago

      This is an open WSL2 issue. It needs to be in bridge mode for things to work, or proper forwarding rules need to be set up, which were too complicated for me, so I gave up on it.

    • @cristiangofiar3320
      @cristiangofiar3320 4 years ago

      @JackofSome What did you do then? Did you use VirtualBox with Ubuntu?
      Can that work?

    • @JackofSome
      @JackofSome 4 years ago +1

      VMware, but it will work with VirtualBox too. Both can run in bridge mode.

    • @cristiangofiar3320
      @cristiangofiar3320 4 years ago

      @JackofSome Hi again. I installed Ubuntu in VMware on two PCs. The same thing happens as with WSL2. On the client node, when I run "ray start --address='***' --redis-password='***'" I get the error "RuntimeError: Unable to connect to Redis."

    • @JackofSome
      @JackofSome 4 years ago +1

      Check your VM settings and make sure it's set in bridge mode. Your VM needs to look like it's another device connected to the same physical network as your host OS
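
When debugging "Unable to connect to Redis" errors like the one in this thread, it can help to first confirm that the worker machine can open a plain TCP connection to the head node at all. The check below uses only the standard library and knows nothing about Ray; `can_reach` is a hypothetical helper name, and the host and port to test are whatever address `ray start --head` printed on the head node.

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical address; replace with the one printed by the head node.
    print(can_reach("192.168.1.10", 6379))
```

If this returns False from the worker but True on the head node itself, the problem is most likely in the network path (NAT instead of bridge mode, firewall, or forwarding rules) rather than in Ray.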

  • @krishnanarasimha1243
    @krishnanarasimha1243 2 years ago

    Can you please share the GitHub link?

  • @staticmind1872
    @staticmind1872 2 years ago

    Be honest, Jack: did you expect it to be slower at 4:10? I can almost hear the "what the f-" in your voice lol

    • @JackofSome
      @JackofSome 2 years ago +1

      Nope. I left it in because I felt it was a great learning exercise

  • @peegee101
    @peegee101 4 years ago

    Hi there, how does Ray compare to Dask?

    • @JackofSome
      @JackofSome 4 years ago

      I'm not too familiar with Dask, but I think they have different use cases, no? Ray is more of a core technology rather than one applied to a specific use case, whereas Dask is focused on arrays/data frames. I guess Dask ML is similar to Ray Tune?

    • @peegee101
      @peegee101 4 years ago

      @JackofSome So I personally work on automated ML pipelines, not AI, meaning that I usually train a lot of models that are easy to train, rather than one big one. I think Ray is very kewl if you want a shared state for an object, something that is not possible with Dask. The rest I haven't figured out yet, however :P

  • @cristiangofiar3320
    @cristiangofiar3320 4 years ago

    What is the Ray web link?
    (this is my serious account)

    • @JackofSome
      @JackofSome 4 years ago +1

      ray.io
      I'm not sure what you mean exactly. I'm assuming you wanted the website

  • @eternablue730
    @eternablue730 3 years ago +1

    Help me, this is really important: I can't install Ray. I'm on Windows.

  • @silkworm6861
    @silkworm6861 3 years ago

    "As straightforward as doing pip install ray"... Ahh, the naïveté of x86 people. Ray uses stupid Bazel as a build system and it needs a lot of tinkering around to install from source.

    • @JackofSome
      @JackofSome 3 years ago

      My condolences for having to build things for ARM (I assume that's where you're running into trouble). I've never had a good experience there.
      As it gets more prevalent eventually you'll have self contained wheels on pypi, and then my video will be accurate again 👀

    • @silkworm6861
      @silkworm6861 3 years ago

      @JackofSome Not ARM but PPC64LE, where you can find wheels and Conda channels only for really, really popular packages, and you're on your own for all the rest. I think it took me two work days to figure out all of Bazel's shenanigans and build Ray.

    • @JackofSome
      @JackofSome 3 years ago +1

      Damn. Glad it worked out. I'd legit watch an angry rant video about the process if you ever make one 😅