Creating Your First Airflow DAG for External Python Scripts
- Published 11 Jul 2021
- I show how to automatically trigger or schedule external Python scripts using Apache Airflow. Here are the steps:
Clone the repo at github.com/vastevenson/vs-air...
Make sure that pip is fully upgraded before proceeding
Run command: pip install apache-airflow (note this will install the most recent stable version of Airflow; no need to specify an exact version or constraints)
Install Docker and Docker Compose
Run command docker-compose up -d from the directory where the Dockerfile exists
Confirm that you can hit localhost:8080 with a GET request; you should see the Airflow UI
You can also monitor the logs of the Airflow environment as follows: docker-compose logs
You can stop the Airflow environment as follows: docker-compose down
Note: external Python scripts can only be referenced from the directory where you've configured Airflow to look for DAGs. - Science & Technology
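The note above can be sketched as a minimal external script that a DAG imports. The file name, function name, and message below are illustrative assumptions, not taken from the linked repo.

```python
# my_script.py -- hypothetical external script; place it in the folder
# Airflow scans for DAGs so the DAG file can import it.

def print_hello(**context):
    """Task callable. Airflow passes runtime context as keyword arguments."""
    message = "hello from the external script"
    print(message)
    return message  # PythonOperator pushes the return value to XCom


if __name__ == "__main__":
    # The script can still be run standalone, outside Airflow, for testing.
    print_hello()
```

In the DAG file (also in the DAGs folder) you would then do `from my_script import print_hello` and pass that function as the `python_callable` of a PythonOperator.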
I discovered your channel yesterday and I liked how your videos are short, organized, and cover the most needed topics for ChemEng. Hope you will do videos on MATLAB, relating MATLAB to process control, and ChemEng process design. Thank you, and looking forward to watching more videos.
Simple, yet explanatory example. Thank you:)
Amazing, thank you so much for the easy explanation and for the repo!
Hi! Thanks for the video :) One question: how can we deal with the scenario where the external Python script uses a package that is not installed in the Airflow environment? It will raise a "module 'X' not found" error in the Airflow UI, won't it?
I'd also like to know this!
Did you fix this issue? Please let me know.
Amazing, that works perfectly, thanks.
Hi, is it possible to get the task/DAG duration from the context?
Thanks a lot. Can you explain this line: - ./airflow-data/includes:/opt/airflow/includes
Instead of calling the script's function, is there an operator I can use to execute the script as a whole?
Hi Vincent, nice video. I am wondering if you could also make a demo for external Docker connectivity. For example, I create a job and it runs other Dockerized microservices. This would help unload the server itself by putting the calculation work on other microservices. I hope you understand what I am trying to say here.
I am getting "ModuleNotFoundError" in the DAG import errors. Can you please help me with this?
Too many issues with this.
The external Python file simply isn't being recognised by the python_callable function.
Please, the pip install apache-airflow command does not work; could you tell us how you ran it?
How is it that your editor allowed you to strike through an import name and rewrite a new one? What feature is this? Are there any alternatives to this?
I am running into errors that say " NameError: name '_mysql' is not defined" when attempting to run docker-compose up. Do you know what the issue is here?
I am getting this error too, did you ever figure it out?
Nice video! Thanks!
How do I import files or modules located outside the root project directory, somewhere else on the PC?
Why is my DAG always in the queue?
Thank you!
For those suffering from the "name '_mysql' is not defined" error, here's how I solved it:
- Make sure you follow the instructions he gave:
pip install --upgrade pip
pip install apache-airflow
- Check that your docker-compose version is 1.29.1.
- Revise the image, context, and PYTHON_DEPS sections in the docker-compose.yml file:
image: puckel/docker-airflow:1.10.9
context: {puckel/docker-airflow.git url}#1.10.9
(my comment was deleted because of the URL; change 1.10.1 to 1.10.9)
PYTHON_DEPS: sqlalchemy==1.3.0 markupsafe==2.0.1 wtforms==2.2
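For readers unsure where those keys go, here is a sketch of how they might sit in a puckel-style docker-compose.yml. The service name and surrounding structure are assumptions for illustration; only the image, context, and PYTHON_DEPS values come from the comment above.

```yaml
# Sketch only -- adapt to your own docker-compose.yml layout.
webserver:
  image: puckel/docker-airflow:1.10.9
  build:
    context: "{puckel/docker-airflow.git url}#1.10.9"  # keep the original URL placeholder
    args:
      PYTHON_DEPS: "sqlalchemy==1.3.0 markupsafe==2.0.1 wtforms==2.2"
```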
Thank you for sharing the solution. I was looking on StackOverflow and did not find a solution. The steps you mentioned worked and saved my day :)
thank you
Thanks!!
The emphasis was supposed to be on educating us about how to run external scripts... you spent less than a minute on that. Covering options like the BashOperator's bash_command would have been nice.