Intro to Data Build Tool (dbt) // Create your first project!
- Added 24 Jul 2024
- Learn how to get started using dbt (data-build-tool) by following along with this step-by-step tutorial.
In this video, you will learn how to install dbt, initialize a new project and then publish your project to a GitHub repository.
What is dbt?
dbt (data build tool) enables analytics engineers to transform data in their warehouses by simply writing select statements. dbt handles turning these select statements into tables and views.
dbt does the T in ELT (Extract, Load, Transform) processes - it doesn’t extract or load data, but it’s extremely good at transforming data that’s already loaded into your warehouse.
►► The Starter Guide for dbt (Free PDF)
Get clarity on key dbt concepts so you can build better projects & avoid common mistakes → bit.ly/starter-dbt
Timestamps:
0:00 - Intro
0:51 - Begin Installation
2:10 - Create GitHub Repository
3:25 - Initialize dbt Project
4:19 - Review Project Layout
6:27 - Setup Profile for Snowflake
9:42 - Run dbt Commands / Deploy Models
11:54 - Push to GitHub
13:17 - Update Folder Layout
Title & Tags:
Getting Started (Install & Create a Project) | dbt labs | Data Build Tool (dbt) Tutorial for Beginners
#kahandatasolutions #databuildtool #dataengineering
Well I need to say this is great to start on DBT. Not too complicated, not too simple, just something in between. I am watching the whole playlist. I am excited because this will get me up to speed for POC purposes quick enough. Thanks
Hey Mike! Thank you so much for creating this DBT playlist. I am so grateful to you. I had a job assessment exam for data modelling using DBT, and your videos really helped me learn the basics and later develop my own project. Again, thank you so much! 😊😊
That is awesome! So glad to hear it was helpful for you. That type of feedback is what makes creating these videos worth it!
Thank you very much. This was the best video for a beginner like myself. One or two things have been changed by dbt since 2020, but I was able to follow you.
sweeeeeet! thanks for all the videos, I'mana go through all of these over the next two weeks
Nicely done, was able to follow through and get my first commit into a snowflake instance.
Awesome!
I was able to get started with dbt a lot faster than going through dbt documentation and videos.
Great to hear! Thank you for watching.
@jusmaVids does dbt cloud have an in hand database with data to try few queries on?
Great tutorial, thanks Kahan for uploading this.
Great Video For Any Beginner, thank You
Very informative and helpful. Thanks
Glad it was helpful!
Nice and clear, thank you.
Super helpful! Thank you!
Glad it was helpful. Thanks for watching!
Hey Mike
I love your videos as they helped me a lot.
I wanted to ask if you have any plans to make a video about incremental techniques, the differences between them, and best practices. I always struggle with this as an analytics engineer. Thanks
Thank you very much for your time, very helpful
Glad you found it helpful!
Hey Mike, excellent content. Many thanks. I know these videos are good to start with. Do you plan to include any videos on advanced concepts? Once again, thanks so much for the great content.
It is a really good one to start DBT with. Thank you.
You're welcome, thanks for watching.
fantastic playlist. Thank you.
Thank you!
Excellent information
thanks for this !!! really helpful
Glad it was helpful!
Great content. Thanks Mike
I see you! Thanks bro
this is the best tutorial on dbt!
Thanks Partha!
Also if you can please advise how can I run update statement.
Good tutorial, thank you!
You're welcome! Thanks for watching
I am getting a "dbt command not found" error in Visual Studio Code. However, pip install dbt was successful and it's running fine from cmd.
dbt : The term 'dbt' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or
if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ dbt debug
+ ~~~
+ CategoryInfo : ObjectNotFound: (dbt:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
Nice video. Thanks.
Really good Brother
Great tutorial!
Thank you!
Hello, Please advise how can i run update statement on snowflake.
Dude, how are you? I have a question: what can I do if I have a stream on Snowflake that I want to "consume" in dbt without creating a physical table or view, using something like an ephemeral materialization, only to purge the stream and keep it from going stale? I created an ephemeral model that selects from the stream source, but that obviously only creates an ephemeral materialization and does not clear the data on the stream. Thoughts?
Hey Mike. Thanks for your video, really helpful! I wonder if I can use Python to do loops instead of writing 1000+ rows of SQL? Is it possible to combine dbt and Python?
I recommend checking out Jinja, which allows you to incorporate more dynamic things like loops into your SQL code.
docs.getdbt.com/docs/building-a-dbt-project/jinja-macros
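A minimal sketch of the Jinja approach suggested in the reply above, written as a shell snippet that creates one model file. The folder `demo_dbt`, the model name `payments_by_method`, the upstream model `orders`, and its columns are all hypothetical names used only for illustration.

```shell
# Create a throwaway folder for the illustration (hypothetical path).
mkdir -p demo_dbt/models

# Write a model that uses a Jinja for-loop to generate repeated SQL columns
# instead of typing each one by hand. Table and column names are made up.
cat > demo_dbt/models/payments_by_method.sql <<'EOF'
{% set payment_methods = ['credit_card', 'bank_transfer', 'gift_card'] %}

select
    order_id,
    {% for method in payment_methods %}
    sum(case when payment_method = '{{ method }}' then amount else 0 end)
        as {{ method }}_amount{% if not loop.last %},{% endif %}
    {% endfor %}
from {{ ref('orders') }}
group by order_id
EOF

# Print the file that was written.
cat demo_dbt/models/payments_by_method.sql
```

Inside a real dbt project, running `dbt compile` should render the loop into plain SQL with one `sum(...)` column per list entry, which is a quick way to check the generated query before running it.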
Please update the video; it looks like the installation process has changed.
10/10 --thanks for the video. New dbt users should view before visiting @dbt. Found this after nearly a day installing/reinstalling to dbt spec (is 'erratic' a good description for dbt docs? maybe...'incorrect'?).
so helpful, thank you
Glad it was helpful!
Hi @kahan, when I try to install dbt (pip install dbt), an error occurs. Error: KeyError: ICU_VERSION. Could you help me fix it?
Hi there - Unfortunately that sounds like an environment issue specific to your machine/setup and not something I can easily answer. Make sure you are using an acceptable version of Python for dbt and try searching around online to see if others have encountered similar issues.
Can you possibly configure dbt profile to work with Spark into AWS S3 parquet files?
Here is the list of available adapters: docs.getdbt.com/docs/available-adapters
Thanks Kahan!
You're welcome!
Do you know how we can write stored-procs in dbt? Can you provide something on that
I was able to pip install dbt, but when I run "dbt init [project name]" it says something like "can't recognize path, external, or batch file". What do I do?
Hey Kahan! There were some changes to getting the dbt environment set up, so perhaps it's time for an update? I had to find workarounds. When I do it, the profiles.yml file doesn't exist or cannot be found. Anyway, thanks for making the tutorial.
when I run your dbt init step in the beginning on a mac I keep getting an error raise ValidationError(f"Unable to create schema for '{type_name}'")
hologram.ValidationError: Unable to create schema for 'Optional'
Hello Frank, if you are running Python 3.9, one of dbt's dependencies is not compatible. You will have to downgrade your Python version, maybe to 3.8. If you have found any other solution, please reply to this message.
Hey Mike! Thank You so much.
May I know how to create a new table in the DBT tool?
Thanks for watching! To create a table you create a model (sql file) and set the materialization as "table". You can follow along the rest of the videos in this playlist to learn more about how to use dbt.
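As a rough illustration of that reply, the snippet below writes a single model file whose materialization is set to "table". The folder `demo_dbt`, the model name `customer_orders`, and the upstream model `orders` are assumptions made for the example, not names from the video.

```shell
# Create a throwaway folder for the illustration (hypothetical path).
mkdir -p demo_dbt/models

# A model is just a .sql file containing a select statement.
# The config() call at the top asks dbt to build it as a table instead of a view.
cat > demo_dbt/models/customer_orders.sql <<'EOF'
{{ config(materialized='table') }}

select
    customer_id,
    count(*) as order_count
from {{ ref('orders') }}
group by customer_id
EOF

# Print the file that was written.
cat demo_dbt/models/customer_orders.sql
```

In a real project, `dbt run --select customer_orders` would then create a table named after the file in the schema configured in your profile.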
Does dbt Cloud have a built-in database with data to try out the commands?
Hi Tata - Unfortunately DBT does not come out of the box with a database, but instead is designed to plug into one that you already have.
I don't see the .dbt folder under my admin folder, will this still work?
Hello @Kahan. Thank you for everything you share with us. It's a pleasure to follow you.
For this tutorial, I followed everything well until 11:30. dbt shows me "Completed successfully" => 2 of 2 Ok created view model Public ... etc. But when I look in snowflake under public, nothing has been created ... Do you have an explanation? Thx
Hello and thank you for your kind words!
Regarding your visibility in Snowflake - perhaps you need to switch roles when you sign into Snowflake. For example, if you set your profiles.yml on dbt to use accountadmin, then any role with "less" privileges won't have visibility. This is where access control comes into play, but was unfortunately outside the scope of this video. Let me know if that does it. Thanks again.
Hi @KahanDataSolutions, thank you for your quick response.
In fact, I had done everything right.
While I was testing something else in Snowflake by creating a table by hand in the same schema (DEMO_DB.PUBLIC, in my case), I saw the table and the view appear in the worksheet (MY_FIRST_DBT_MODEL and MY_SECOND_DBT_MODEL). It's a bit strange, but everything is fine and I will be able to move forward.
@cherah3012 you have to grant privileges to SYSADMIN, then you'll be able to see the table.
After running the command pip install dbt,
when I check dbt --version it shows that dbt is not recognized as an internal or external command, operable program or batch file. What do I do now?
Did the pip install dbt command run successfully?
The command "pip install dbt" will not work anymore. The documentation says:
"Note that, as of v1.0.0, pip install dbt is no longer supported and will raise an explicit error. Since v0.13, the PyPi package named dbt was a simple "pass-through" of dbt-core and the four original database adapter plugins. For v1, we formalized that split."
This command emulates the behaviour of the pip install dbt command:
```
pip install \
dbt-core \
dbt-postgres \
dbt-redshift \
dbt-snowflake \
dbt-bigquery
```
You are correct - This video is now a bit dated compared to the latest version but if you make the adjustment you mention you should be all set. Here is a reference link for others who are interested: docs.getdbt.com/dbt-cli/install/pip#pip-install-dbt
@KahanDataSolutions Hi, while running dbt debug I am facing a connection test error: could not connect to the Snowflake backend. I have populated all the fields properly. Could you please help me?
Hey Mike. Can we install DBT in Azure Data Factory?
Hello there!
I got stuck at the start.
I installed dbt by pip with no errors (Actually dbt-redshift). But when I run dbt --version or any other dbt command I get the typical error of not recognized as internal.
I tried to add .....\python310\site-packages in the path of environment variables, but no luck.
Thank you for any help,
Looking forward to watching the rest of videos too.
😃
Hi Serna. Sorry for reaching out to you but I am encountering the same problem. Were you able to find a solution to this problem?
thanks!
HI, I am having difficulties finding my profiles.yml file on my MacOS. Please help?
Hello, I have 2 models in the same folder, but I want dbt to write the output from each model to a separate database in the same account. How can I do that? Please advise.
Hi Dinesh, this can be accomplished by changing the "database" config setting. Here is the documentation:
docs.getdbt.com/docs/building-a-dbt-project/building-models/using-custom-databases
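A small sketch of that "database" setting, assuming two models in the same folder that should land in different databases. The folder `demo_dbt`, the model names, and the database names `ANALYTICS_DB` and `REPORTING_DB` are placeholders, and the role used by dbt would need privileges on both databases.

```shell
# Create a throwaway folder for the illustration (hypothetical path).
mkdir -p demo_dbt/models/marts

# First model: the database config sends its output to ANALYTICS_DB (placeholder name).
cat > demo_dbt/models/marts/model_a.sql <<'EOF'
{{ config(database='ANALYTICS_DB') }}

select 1 as id
EOF

# Second model in the same folder: output goes to REPORTING_DB (placeholder name).
cat > demo_dbt/models/marts/model_b.sql <<'EOF'
{{ config(database='REPORTING_DB') }}

select 2 as id
EOF

# Show which database each model targets.
grep -H "config" demo_dbt/models/marts/*.sql
```

Setting the value per model keeps the override next to the SQL it affects; the linked documentation also describes setting it for whole folders in dbt_project.yml.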
Thank you, this is what I have been looking for. Can you please advise on the scenario below? I have 4 tables to load, and each table has 4 scripts to run in a specific sequence (insert script + logging scripts). How can I set up this 16-script project with dependencies?
How to update records in dbt. Update operation is failing upon dbt run.
can you please make a video on how to setup dbt in VS code
Will this intro work on a mac as well?
Is this tutorial applicable (after the initial install for Windows) for MacOS?
Hey Jeremiah - Yes the general workflow of dbt will be the same across operating systems.
dbt is a cloud login, right? Why did you install it locally? What is the difference between on-prem and cloud dbt login? Please help.
thanks mate.
Thanks for watching!
where do i find profiles.yml file when installed in mac?
It will be on your home directory at "~/.dbt"
However, it might be hidden by default in the finder so you need to press "CMD + Shift + ." to show hidden directories.
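From a terminal, the same check can be sketched as below. It assumes the default `~/.dbt` location mentioned in the reply and honours the `DBT_PROFILES_DIR` environment variable if it is set; recent dbt releases may also look in the current project directory. The function name `find_profiles` is made up for this example.

```shell
# Report whether a profiles.yml exists in the expected directory.
find_profiles() {
  # Use DBT_PROFILES_DIR when set, otherwise fall back to ~/.dbt.
  profiles_dir="${DBT_PROFILES_DIR:-$HOME/.dbt}"
  if [ -f "$profiles_dir/profiles.yml" ]; then
    echo "Found: $profiles_dir/profiles.yml"
  else
    echo "No profiles.yml in $profiles_dir"
  fi
}

find_profiles

# To list the hidden folder's contents directly:
#   ls -la ~/.dbt
```

If dbt itself is installed, `dbt debug --config-dir` should also print the directory it expects the file to be in.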
@KahanDataSolutions Great, thank you so much. I am about to buy your course and have 1-2 questions on that. Can we somehow chat or have a call?
I've done everything correctly up to this point, but when I run dbt debug I get an error: "dbt : The term 'dbt' is not recognized as the name of a cmdlet, function, script file, or operable program."
Hi Grant - Sorry for reaching out to you but I am encountering the same problem. Were you able to find a solution to this error? Thanks Sam
@samuelbrown797 I ended up just working with DBT in a virtual environment. They have some documentation on it and I believe he has a video on it. I am taking a DBT Udemy course that has been solid in setting up and configuring DBT.
nice👍
Thanks!
Please make videos on “Meltano” and how to use dbt within this? 🎉
How to avoid hardcoding of username and password?
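One common way to do this, sketched here rather than taken from the video, is dbt's `env_var` function inside profiles.yml, so that credentials are read from environment variables at runtime. The profile name `my_project`, the role and warehouse names, and the `SNOWFLAKE_*` variable names are all assumptions for illustration; `DEMO_DB` and `PUBLIC` echo the names used elsewhere in this thread.

```shell
# Create a throwaway folder for the illustration (hypothetical path).
mkdir -p demo_dbt

# Write a Snowflake profile that contains no literal credentials.
cat > demo_dbt/profiles.yml <<'EOF'
my_project:            # hypothetical; must match "profile:" in dbt_project.yml
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: TRANSFORMER        # placeholder role
      database: DEMO_DB
      warehouse: COMPUTE_WH    # placeholder warehouse
      schema: PUBLIC
      threads: 4
EOF

# Credentials are supplied by the shell session, not by the file, for example:
#   export SNOWFLAKE_USER="your_user"
#   export SNOWFLAKE_PASSWORD="your_password"

# Print the file that was written.
cat demo_dbt/profiles.yml
```

With the variables exported, `dbt debug --profiles-dir demo_dbt` run from a project directory would test the connection. A file written this way is safe to commit, although keeping profiles.yml outside the repository is still a reasonable habit.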
Hello! I am watching your video. dbt now asks you to choose a version according to your situation. In my case I chose to install dbt-redshift: "pip install dbt-redshift"
That's correct! Here's more docs on Redshift specifically: docs.getdbt.com/reference/warehouse-setups/redshift-setup
Good morning. When I tried checking the version of dbt, I got this error: 'dbt' is not recognized as an internal or external command, operable program or batch file. How do I proceed?
Did you successfully install dbt before running that command? Perhaps you need to open a new/fresh terminal.
@KahanDataSolutions I did install dbt using pip3 before running that command.
This means that dbt is not added to the path of your execution. Make sure when you run pip install that your virtual environment is active.
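The virtual-environment advice above can be sketched as follows for a POSIX shell. The folder name `dbt-env` and the function names are hypothetical, and `dbt-snowflake` is chosen only because the video targets Snowflake. The install step is wrapped in a function so nothing is downloaded until you call it yourself.

```shell
# Report whether the dbt executable is visible to the current shell.
dbt_status() {
  if command -v dbt >/dev/null 2>&1; then
    echo "dbt found at: $(command -v dbt)"
  else
    echo "dbt is not on PATH in this shell"
  fi
}

# Create and activate an isolated environment, then install dbt into it.
setup_dbt_venv() {
  # Create the environment in ./dbt-env (hypothetical folder name).
  python3 -m venv dbt-env
  # Activate it so "pip" and "dbt" resolve to this environment.
  # On Windows PowerShell the equivalent is: dbt-env\Scripts\Activate.ps1
  . dbt-env/bin/activate
  # Install dbt-core plus one adapter (needs network access).
  python -m pip install dbt-core dbt-snowflake
  # Confirm the executable now resolves inside the environment.
  dbt_status
}

# Check the current shell first; call setup_dbt_venv explicitly when ready.
dbt_status
```

If `dbt_status` reports that dbt is missing in one terminal but not another (for example VS Code versus cmd), the two shells are probably using different Python environments, and activating the same environment in both is usually the fix.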
Hi
I am looking into a career as a Data Engineer, and I am overwhelmed by the software I should learn. Other than learning Python and SQL, are there any tools and areas I should focus on? Thanks
I am unable to install dbt on windows from CLI "Error: Could not build wheels for cryptography which use PEP 517 and cannot be installed directly"
What version of Windows and Python are you using?
@KahanDataSolutions Windows 10 & Python 3.9
@KahanDataSolutions Please help
stackoverflow.com/questions/59441794/error-could-not-build-wheels-for-cryptography-which-use-pep-517-and-cannot-be-i
@vickysworld9 I found that Python 3.9 does not support dbt's Snowflake connector, so I downgraded my Python installation to 3.8, which helped.
If i were 3 years younger I'd have pushed the code with all the secrets in the yaml file
How’s dbt as a career with Snowflake?
There is a big demand right now for dbt developers, particularly with Snowflake.
Having said that, I would consider dbt as just one of the many possible tools in the broader career of a data/analytics engineer.
Either way - I'd strongly suggest learning it!
@KahanDataSolutions Thanks bro for replying. So if we learn dbt, is there any need to learn ETL tools like Informatica, Talend, etc.? And is your playlist a complete course on dbt?
Nice playlist, but I find it really annoying when YouTubers don't buy decent microphones and force viewers to hear every single mouse click and every single key press. Sorry for being honest, but we can only improve with others' suggestions.
Appreciate the feedback & glad to hear the playlist was helpful! I've also upgraded my microphone since this video and you can see it on my newer videos.
The terminal on Windows is an abomination, that shouldn't exist 😭
Unfortunately, I tried to install dbt but it is not working: "\Python311\Lib\site-packages\mashumaro\meta\helpers.py", line 161, in is_generic raise NotImplementedError. I tried to find the solution but still no result.
Hi, this is Sateesh. I have completed the Snowflake SnowPro certification and am planning to learn DBT. What are the prerequisites? Is complete knowledge of Python required before learning DBT? Could you please let me know.
I recommend having a solid understanding of SQL as the #1 pre-req. You don't need to be a master Python dev to use it. Very beginner understanding of general programming is all you need to get started.
@KahanDataSolutions Thank you so much for your reply. Are you providing online training on DBT?
@sateeshbabu5792 I do have a dbt course. You can check it out here - www.kahandatasolutions.com/the-playbook-for-dbt