Creating Your First Airflow DAG for External Python Scripts

  • Published 11 Jul 2021
  • I show how to automatically trigger and schedule external Python scripts using Apache Airflow. Here are the steps:
    Clone the repo at github.com/vastevenson/vs-air...
    Make sure that pip is fully upgraded before proceeding
    Run the command: pip install apache-airflow (note: this installs the most recent stable version of Airflow, so there is no need to specify an exact version or constraints)
    Install Docker and Docker Compose
    Run the command docker-compose up -d from the directory where the Dockerfile exists
    Confirm that you can hit localhost:8080 with a GET request; you should see the Airflow UI
    You can also monitor the logs of the Airflow environment as follows: docker-compose logs
    You can stop the Airflow environment as follows: docker-compose down
    Note: referencing external Python scripts can only be done from the directory where you’ve configured Airflow to look for DAGs.
  • Science & Technology

Comments • 26

  • @ruthlove7813 · 2 years ago +1

    I discovered your channel yesterday and I liked how your videos are short, organized, and cover the most needed topics for Che Eng. I hope you will do videos on MATLAB, relating MATLAB to process control, and Che Eng process design. Thank you & looking forward to watching more videos.

  • @krystianopala4299 · 1 year ago

    Simple, yet explanatory example. Thank you:)

  • @ep23a4 · 5 months ago

    Amazing, thank you so much for the easy explanation and for the repo!

  • @javiercasanuevamartos9997

    Hi! Thanks for the video :) One question... how can we deal with the scenario where the external Python script uses a package that is not installed in the Airflow environment? It will raise a "'X' module is not found" error in the Airflow UI, won't it?

  • @vittaquant · 1 year ago

    Amazing, that works perfectly, thanks.

  • @hayathbasha4519 · 2 years ago

    Hi, is it possible to get the task/DAG duration from the context?
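For what it's worth, here is a sketch of one way to read this (an assumption about the setup, not from the video): in Airflow 2.x the TaskInstance in the context exposes a `duration` attribute in seconds once the task has finished, so a success callback can report it.

```python
# Sketch of reading a finished task's duration from the Airflow context.
# Assumes Airflow 2.x, where context["ti"] is the TaskInstance and its
# `duration` attribute (seconds) is populated after the task completes.
def log_duration(context):
    ti = context["ti"]  # the TaskInstance for the task that just ran
    print(f"task {ti.task_id} took {ti.duration} seconds")
    return ti.duration

# Attach it when defining a task, e.g.:
# PythonOperator(task_id="t", python_callable=f, on_success_callback=log_duration)
```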

  • @jasonli1420 · 2 years ago +1

    Thanks a lot. Can you explain: - ./airflow-data/includes:/opt/airflow/includes

  • @brendoaraujo9110 · 2 years ago

    Instead of calling the script's function, is there an operator I can use to execute the script as a whole?

  • @ammadkhan4687 · 2 months ago

    Hi Vincent, nice video. I am wondering if you could also make a demo on external Docker connectivity. For example, I create a job and it runs other Docker microservices. This would help unload the server itself by putting the calculation work on other microservices. I hope you understand what I am trying to say here.

  • @algoJamming · 2 years ago

    I am getting a "ModuleNotFoundError" in the DAG import errors. Can you please help me with this?

  • @tg8799 · 2 years ago +2

    Too many issues with this.
    The external Python file simply isn't being recognised by the python_callable function.

  • @zahraelhaddi6980 · 5 months ago

    Please, the pip install apache-airflow command does not work; could you tell us how you did it?

  • @vishwasrchonu7134 · 4 months ago

    How is it that your editor allowed you to strike through an import name and rewrite a new one? What feature is this? Are there any alternatives to it?

  • @henrikpohlmann8394 · 2 years ago

    I am running into errors that say "NameError: name '_mysql' is not defined" when attempting to run docker-compose up. Do you know what the issue is here?

    • @MiShell103 · 2 years ago

      I am getting this error too, did you ever figure it out?

  • @Matias-eh2pn · 1 year ago

    Nice video! Thanks!

  • @saurabhruikar1131 · 2 months ago

    How do I import files or modules that are located outside the root project directory, somewhere else on the PC?

  • @yilu435 · 2 years ago +1

    Why is my DAG always stuck in the queue?

  • @user-rs2ox7rm8f · 1 month ago

    Thank you!

  • @dmrirdmriririr · 2 years ago +3

    For those suffering from the name '_mysql' is not defined error... here's how I solved it.
    - Make sure you follow the instructions he gave:
    pip install --upgrade pip
    pip install apache-airflow
    - Check that your docker-compose version is 1.29.1
    - Revise the image, context, and PYTHON_DEPS sections in the docker-compose.yml file:
    image: puckel/docker-airflow:1.10.9
    context: {puckel/docker-airflow.git url}#1.10.9
    (my comment was deleted because of the URL... change 1.10.1 to .9)
    PYTHON_DEPS: sqlalchemy==1.3.0 markupsafe==2.0.1 wtforms==2.2

    • @srishtiganguly6000 · 2 years ago +1

      Thank you for sharing the solution. I was looking on StackOverflow and did not find a solution. The steps you mentioned worked and saved my day :)

  • @aqibfayyaz1619 · 2 years ago

    Thank you

  • @jorgealejandrorodriguez1941

    Thanks !!

  • @oluwatobitobias · 2 years ago

    The emphasis was supposed to be on how to run external scripts... you spent less than a minute on that. Options like using the BashOperator's bash_command would have been nice.