Denys on Data
  • 31
  • 441 967
Query structured data with LLM: LlamaIndex with RAG
Are you ready to take your data querying skills to the next level 🤩🤩🤩? In this video, we dive deep into the powerful combination of LlamaIndex and Retrieval-Augmented Generation (RAG) techniques to revolutionize how you interact with structured data. Discover how Large Language Models (LLMs) can transform your data analysis and querying processes.
Here is a detailed walkthrough of exactly how querying a structured data source works with LLMs and LlamaIndex.
🔍 In This Video, You'll Discover:
LlamaIndex Uncovered: Understand how this innovative tool can streamline and enhance your data querying process.
The Magic of RAG: Learn how Retrieval-Augmented Generation can supercharge your data analysis and improve accuracy.
Step-by-Step Tutorials: Watch real-time demonstrations showing how to leverage these technologies for practical, real-world applications.
👉🏻👉🏻👉🏻Don’t forget to like, comment, subscribe and hit the bell icon for more insightful content on data analytics and advanced querying techniques!
🤝🤝🤝Join us and unlock the full potential of your data with LlamaIndex and RAG! Here is the link to the code repo that has the notebook used in the tutorial: github.com/denysthegitmenace/aws-bedrock.
Views: 356

Video

Amazon Bedrock Agent: LLM-Powered Text-to-SQL for Data Analytics
702 views · 1 month ago
Amazon Bedrock Agent: LLM-Powered Text-to-SQL for Data Analytics Ready to supercharge your data analytics with AI 🚀 ? In this video, we dive deep into Amazon Bedrock Agent and show how its LLM-powered Text-to-SQL feature simplifies complex data queries in just a few clicks. Say goodbye to manual SQL coding and hello to automated insights! Text2SQL has a loooong way to go to replace data analyst...
LLM for data analytics: text-to-sql 3 architecture patterns
2.2K views · 1 month ago
This is the first video in a series exploring how to work with structured data using Large Language Models (LLMs). In this video, I explain the three main architectural patterns for building Text-to-SQL pipelines: 1. Prompt engineering & manual metadata retrieval (BASE) 2. BASE RAG for metadata retrieval 3. 1 or 2 using the fine-tuned model Stay tuned for more videos in this series on leveragin...
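The first pattern (BASE) can be illustrated with a minimal sketch: table metadata is written by hand and pasted into a prompt template together with the user question. The table names, the template wording, and the `build_prompt` helper below are all hypothetical and not taken from the video; the call to an actual model is left out.

```python
# Hypothetical, hand-written table metadata (the "manual metadata retrieval" part).
TABLE_METADATA = {
    "orders": "orders(order_id INTEGER, customer_id INTEGER, total NUMERIC, created_at DATE)",
    "customers": "customers(customer_id INTEGER, name TEXT, country TEXT)",
}

# Hypothetical prompt template; real templates are usually longer and more specific.
PROMPT_TEMPLATE = (
    "You translate questions into SQL.\n"
    "Use only these tables:\n{schema}\n\n"
    "Question: {question}\n"
    "SQL:"
)


def build_prompt(question: str, tables: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    schema = "\n".join(TABLE_METADATA[name] for name in tables)
    return PROMPT_TEMPLATE.format(schema=schema, question=question)


prompt = build_prompt("How many orders came from Germany?", ["orders", "customers"])
print(prompt)
```

In the second pattern the `tables` argument would be chosen by a similarity search instead of by hand; in the third, the same prompt would go to a fine-tuned model.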
End-to-end ML pipeline with SageMaker pipelines | Quick walkthrough
1K views · 1 month ago
Quick walkthrough of building an ML pipeline with SageMaker pipelines. Here is the link to the original tutorial from AWS sagemaker-examples.readthedocs.io/en/latest/sagemaker-pipelines/tabular/abalone_build_train_deploy/sagemaker-pipelines-preprocess-train-evaluate-batch-transform.html #aws #sagemaker #mlops #ml
AWS Bedrock Tutorial: chat with your files in 10 min with AWS Bedrock, Streamlit, and knowledge base
2.8K views · 2 months ago
I show how to build an AI Agent with a knowledge base using AWS Bedrock and Streamlit in 10 minutes. The idea is to build the skeleton for the app as quickly as possible. Accuracy and deployment are not of concern in this video. Here is the repo for the Streamlit UI github.com/acwwat/amazon-bedrock-agent-test-ui #genai #ai #aiagents #awsbedrock #aws #streamlit
Langchain tutorial cite sources
3.1K views · 11 months ago
Here we look at what's available on the surface when it comes to citing sources with Langchain and OpenAI. 00:07 Intro to citing sources 00:49 OpenAI playground 02:48 Langchain cite sources fuzzy match chain 08:25 Langchain multiple sources #llm #langchain #openai #chatgpt #dataengineering #dataarchitecture
Querying a database with OpenAI's ChatGPT and Langchain
1.5K views · 11 months ago
This video is a technical deep dive into how Langchain interacts with the OpenAI API to answer questions about relational data. If you simply want to see Langchain and OpenAI working together on top of a Postgres database to answer user questions, make sure to watch my previous video (it appears as a card at the very beginning) #llm #langchain #openai #chatgpt #dataengineering #dataarchitecture
Langchain tutorial. Query a database with OpenAI's ChatGPT
7K views · 11 months ago
Here is a quick overview of how to query data in a relational database (Postgres in our case) with the help of OpenAI's ChatGPT and Langchain. I double-check the results provided by the LLM and stress multiple times that outputs are non-deterministic and that you should be using all sorts of safeguards when relying on the results generated by LLMs :) #llm #langchain #openai #chatgpt #dataenginee...
Langchain chatbot
221 views · 1 year ago
00:32 Langchain Chatbot in Action 01:13 Getting started without memory 04:00 LangChain chains brief intro Adding memory 09:41 Adding a knowledge base 17:20 LangChain with knowledge base without losing general knowledge 19:26 Wrap-up into a Streamlit app In this tutorial, we'll walk through the process of building a LangChain chatbot using OpenAI's ChatGPT. Starting with a simple chatbot, we'll...
Digital twin on top of AWS IoT TwinMaker-Dashboard overview
1.1K views · 1 year ago
In this YouTube video, we get an overview of a dashboard built on top of a digital twin using the AWS IoT TwinMaker. The dashboard has several components, including an assets list with associated alerts, data feeds from the assets, a hierarchy of assets, a 3D view, and a live video stream. I will continue learning about the world of digital twins and share my insights with you. If you're interes...
Index-free adjacency in graph databases explained
634 views · 1 year ago
Short explanation of what an index-free adjacency is and why it makes certain graph traversals much more efficient compared to index-based lookups. Link to the Miro board used for the video: miro.com/app/board/uXjVOTZper0=/?share_link_id=139126415302
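A rough sketch of the idea, under simplifying assumptions: with index-free adjacency each node holds direct references to its neighbors, so a traversal just follows pointers, while the index-based variant has to consult a lookup structure on every hop. The names below are hypothetical, and a Python dict stands in for a real database index (which would typically be a tree with logarithmic lookup cost).

```python
class Node:
    """A node that stores direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # references to adjacent Node objects


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.neighbors.append(bob)
bob.neighbors.append(carol)


def friends_of_friends(node):
    # Index-free adjacency: each hop is a pointer dereference, no global lookup.
    return sorted({fof.name for friend in node.neighbors for fof in friend.neighbors})


# Index-based contrast: relationships live in a separate structure that is
# consulted once per hop.
edges = [("alice", "bob"), ("bob", "carol")]
edge_index = {}
for src, dst in edges:
    edge_index.setdefault(src, []).append(dst)


def friends_of_friends_indexed(name):
    # Every hop goes through the index before the next node can be reached.
    return sorted(
        {fof for friend in edge_index.get(name, []) for fof in edge_index.get(friend, [])}
    )


print(friends_of_friends(alice))
print(friends_of_friends_indexed("alice"))
```

Both functions return the same result; the difference is that the cost of the first depends only on the neighborhood being visited, not on the total size of the graph.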
Graph Databases-Chapter 1. Introduction
71 views · 1 year ago
1. A graph is a structure that represents entities as nodes (vertices) and relationships between the entities as edges. 2. The Twitter data model maps nicely to a graph. Users and posts are represented as nodes. The actions of following a user and publishing a post are represented as edges. 3. The labeled property graph is the most popular graph data model. Its main features are: 1. it has nodes and...
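The Twitter example from this summary can be sketched as a tiny labeled property graph. This is only an illustration: the class names, labels, and relationship types (`FOLLOWS`, `PUBLISHED`) are assumptions chosen for the example, not taken from the video.

```python
from dataclasses import dataclass, field


@dataclass
class GraphNode:
    node_id: int
    label: str                                       # e.g. "User" or "Post"
    properties: dict = field(default_factory=dict)   # key-value pairs on the node


@dataclass
class Relationship:
    start: int                                       # id of the source node
    end: int                                         # id of the target node
    rel_type: str                                    # named, directed relationship
    properties: dict = field(default_factory=dict)


nodes = [
    GraphNode(1, "User", {"handle": "@alice"}),
    GraphNode(2, "User", {"handle": "@bob"}),
    GraphNode(3, "Post", {"text": "Hello graph"}),
]
relationships = [
    Relationship(1, 2, "FOLLOWS"),
    Relationship(2, 3, "PUBLISHED"),
]


def outgoing(node_id, rel_type):
    """Return ids of nodes reached from node_id via relationships of rel_type."""
    return [r.end for r in relationships if r.start == node_id and r.rel_type == rel_type]


print(outgoing(1, "FOLLOWS"))
print(outgoing(2, "PUBLISHED"))
```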
AWS Certified Solutions Architect Associate - Thoughts, impressions, tips
9K views · 5 years ago
I recently passed the AWS Certified Solutions Architect Associate certification. Here is a link to the so-called "badge": www.certmetrics.com/amazon/public/badge.aspx?i=1&t=c&d=2018-12-28&ci=AWS00648127&dm=80 In this video I share my thoughts on AWS certification and certifications in general. I also share preparation strategies that worked for me. If you have any questions or need an ad...
AWS Fargate tutorial - Running a Docker container with a Python Flask app
24K views · 5 years ago
In this AWS Fargate tutorial we package a Flask app into a Docker container and run it on top of AWS Fargate - a new compute engine of Elastic Containers Service (ECS). A thorough walkthrough of building the mentioned Python Flask app is in this video czcams.com/video/UNrr8MneoJo/video.html A short explanation on what a Docker container is can be found here czcams.com/video/qgWLcywSsjY/video.ht...
Getting started with Docker - What is a Docker Container?
347 views · 5 years ago
In this video I explain what a Docker container is.
Flask Tutorial - Building a simple web app with Flask and Python
2.6K views · 5 years ago
Flask Tutorial - Building a simple web app with Flask and Python
AWS Lambda Python triggered by API Gateway
1.1K views · 5 years ago
AWS Lambda Python triggered by API Gateway
Create an RDS Postgres instance and connect with pgAdmin
36K views · 5 years ago
Create an RDS Postgres instance and connect with pgAdmin
S3 AWS - Upload local folder to AWS S3 bucket
784 views · 5 years ago
S3 AWS - Upload local folder to AWS S3 bucket
S3 AWS - Load files from and to AWS S3 bucket
163 views · 5 years ago
S3 AWS - Load files from and to AWS S3 bucket
S3 AWS - Create AWS S3 bucket
176 views · 5 years ago
S3 AWS - Create AWS S3 bucket
S3 AWS - Downloading an entire AWS S3 bucket
12K views · 5 years ago
S3 AWS - Downloading an entire AWS S3 bucket
Getting started with AWS - Signing Up
66 views · 5 years ago
Getting started with AWS - Signing Up
AWS CLI Tutorial - Setting up AWS Command Line Interface (AWS CLI) on your laptop
580 views · 5 years ago
AWS CLI Tutorial - Setting up AWS Command Line Interface (AWS CLI) on your laptop
Python Generators. Quick explanation.
40 views · 5 years ago
Python Generators. Quick explanation.
Python enumerate
47 views · 5 years ago
Python enumerate
Python swap values of two variables
56 views · 5 years ago
Python swap values of two variables
Creating PostgreSQL tables with pgAdmin
43K views · 7 years ago
Creating PostgreSQL tables with pgAdmin
Populating PostgreSQL tables using pgAdmin
20K views · 7 years ago
Populating PostgreSQL tables using pgAdmin
Creating a PostgreSQL database with pgAdmin and logging into it
256K views · 7 years ago
Creating a PostgreSQL database with pgAdmin and logging into it

Comments

  • @santoshkiranm
    @santoshkiranm 5 days ago

    Can we not use RAG within Bedrock and use the default OpenSearch vector DB for this? Does that also do chunking and create a vector store, similar to LlamaIndex?

  • @namitsurana7417
    @namitsurana7417 5 days ago

    Thanks man

  • @AnnabelleWhite-js6rh

    Thank you for this tutorial!

  • @ghazwannamoujablak4265

    You're using the service context in VectorStoreIndex in the Lambda function but not using it in the notebook. Why is that? Will the output in the two cases be different? Sorry for asking many questions, but the video is really interesting, and I am trying to learn llama_index

  • @Jocob-Beller
    @Jocob-Beller 9 days ago

    Really good illustration Denys! Just one question, will this architecture still function well when you have too many tables with bad naming? I only see some products like AskYourDatabase work well with this situation. How should the solution fit in this architecture?

    • @DenysonData
      @DenysonData 9 days ago

      I guess the easiest/cleanest/cheapest is getting the names right. Or creating a layer of views on top. In my last video I provide an extra explanation for each table, which could also help. But if you are looking for a hands-off solution that should work "out-of-the-box" on top of lots of tables, I guess having tables named nicely goes a long way. Let me know if I misunderstood the question.
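The "layer of views on top" suggestion can be sketched with SQLite from the Python standard library. The cryptic table and column names below are invented for illustration; the point is only that the LLM would be shown the readable view instead of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical badly named source table.
conn.execute("CREATE TABLE t_cst_01 (c_id INTEGER, c_nm TEXT, c_ctry TEXT)")
conn.execute("INSERT INTO t_cst_01 VALUES (1, 'Ada', 'DE'), (2, 'Bob', 'US')")

# A view that exposes the same data under self-explanatory names.
conn.execute(
    "CREATE VIEW customers AS "
    "SELECT c_id AS customer_id, c_nm AS name, c_ctry AS country FROM t_cst_01"
)

# Generated SQL can now target the readable names.
rows = conn.execute("SELECT name FROM customers WHERE country = 'DE'").fetchall()
view_columns = [d[0] for d in conn.execute("SELECT * FROM customers").description]
print(rows, view_columns)
```

Only the view definitions need to be included in the metadata handed to the model, so the underlying schema can stay unchanged.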

  • @ghazwannamoujablak4265

    Just a question about the action group: Did you build a simple lambda function from scratch or create one from a container? I am asking this question to understand how you installed the dependencies specified in the requirements file

    • @DenysonData
      @DenysonData 9 days ago

      It's a container-based lambda. The requirements are installed during the docker build stage. The whole thing was deployed/managed with terraform

    • @ghazwannamoujablak4265
      @ghazwannamoujablak4265 9 days ago

      Thank you. It seems like you're using terraform to push the image to ecr, right? I am wondering if it is possible to create a video about creating the action group step by step.

  • @ghazwannamoujablak4265

    I don't think an OpenAI key is required since you're using AWS Bedrock models, right?

    • @DenysonData
      @DenysonData 21 days ago

      Correct. In the tutorial, credentials are pulled from env vars

  • @darkmatter9583
    @darkmatter9583 21 days ago

    thank you just discovered your channel

  • @WesFang
    @WesFang 23 days ago

    really good video. thank you!

  • @WesFang
    @WesFang 23 days ago

    Thanks Denys for putting this together - can you elaborate on what goes into the "prompt template"?

    • @DenysonData
      @DenysonData 23 days ago

      Sure. Here is a link to the file with the prompt template I cover in my last video: github.com/denysthegitmenace/aws-bedrock/blob/main/query_structured_data_lambda/prompt_templates.py SQL_TEMPLATE_STR is a good example

  • @ghazwannamoujablak4265

    Great video. Many thanks for sharing this

  • @elenaromanova2841
    @elenaromanova2841 28 days ago

    Hello Denis. Thanks for the video. I am wondering if it's possible to add implementation details on the tech stack and tools for the RAG type of architecture. What framework was used to load the DB schema? If Langchain, which loader, and how was it vectorized? Which vector DB is good for this type of case, and which kinds of foundation models, in your experience, for both vectors and generation? Maybe some code examples for the loader, retrieval, and connectors, if possible. I have a case in mind to implement and am puzzling over how to load structured data into a vector DB as well as retrieve it for generation. Thank you in advance. ❤

    • @DenysonData
      @DenysonData 28 days ago

      Yep. Planning to publish this exact walk-through this weekend. No Langchain, though. It was done with LlamaIndex. Also, I am not using any external vector storage for this tutorial; it's all in-memory. But I know that my colleagues (and we are working primarily on AWS) started using Aurora PostgreSQL with pgvector instead of OpenSearch Serverless for cost-efficiency reasons. Hope that helps and stay tuned :)

    • @DenysonData
      @DenysonData 25 days ago

      Just uploaded the video. Curious to learn what you think

  • @ghazwannamoujablak4265

    Many thanks for your prompt answers. Can't wait to see the next video

    • @DenysonData
      @DenysonData 25 days ago

      Just uploaded the video. Curious to learn what you think

  • @ghazwannamoujablak4265

    It seems like RAG (knowledge base) has not been used in the architecture and LlamaIndex is used instead, so the LLM (foundation model) is building the query with the help of the user NLP input + few-shot examples and table metadata, right?

    • @DenysonData
      @DenysonData 29 days ago

      Correct. There is no knowledge base, but the approach for pulling table metadata and for identifying the most relevant queries is exactly the same as the one used in RAG: identifying the similarity between the user input and various elements.
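The similarity step described in this reply can be sketched as follows. To stay self-contained, the sketch uses a bag-of-words cosine similarity as a stand-in for real embeddings (an assumption; the tutorial itself relies on LlamaIndex for this), and the table descriptions are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical one-line descriptions of each table.
TABLE_DESCRIPTIONS = {
    "orders": "customer orders with order totals and order dates",
    "customers": "customer names countries and contact details",
    "inventory": "warehouse stock levels per product",
}


def vectorize(text):
    """Toy 'embedding': word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_tables(question, k=2):
    """Rank tables by similarity between the question and each description."""
    q = vectorize(question)
    ranked = sorted(
        TABLE_DESCRIPTIONS,
        key=lambda name: cosine(q, vectorize(TABLE_DESCRIPTIONS[name])),
        reverse=True,
    )
    return ranked[:k]


print(top_tables("which warehouse has low stock", k=1))
```

Only the metadata of the top-ranked tables would then be placed into the prompt; the same ranking can be applied to stored example queries to pick few-shot examples.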

  • @ghazwannamoujablak4265
    @ghazwannamoujablak4265 1 month ago

    I would also be grateful if you could share a walkthrough of how to create few-shot examples

    • @DenysonData
      @DenysonData 1 month ago

      Yep. Will make sure to include it.

  • @ghazwannamoujablak4265
    @ghazwannamoujablak4265 1 month ago

    Great video. Could you please explain, here or in a separate video, the Glue and metadata extraction part?

    • @DenysonData
      @DenysonData 1 month ago

      Thank you. Sure. Will do so during the coming weekend :)

  • @DenysonData
    @DenysonData 1 month ago

    Good day! Thank you for all the kind words 💞 Unfortunately, I don’t have the capacity to answer specific technical questions here. If you need support with a specific problem, please consider joining my Patreon private chat (link in bio). There, I can help you with your issues and we can also schedule private sessions to address more complex problems.

  • @ASHS-j2c
    @ASHS-j2c 1 month ago

    InvokeAgent operation: Failed to retrieve resource because it doesn't exist. Retry the request with a different resource identifier - this is the error I am getting .. Any thoughts on this ?

    • @DenysonData
      @DenysonData 1 month ago

      I recently created a Patreon (link in bio) for providing any sort of guidance. Feel free to join for help with this and any other question

  • @-SANJAIMI
    @-SANJAIMI 1 month ago

    Hello sir!!! I have an issue creating the knowledge base. When I create it, it shows "failed to create OpenSearch Serverless collection". Even though I gave the user full access for Bedrock and the OpenSearch service and made the S3 bucket accessible by the OpenSearch service, the issue is not fixed. Can you help me clear that issue? I'm struggling with it!!! Please help me!!!!

    • @DenysonData
      @DenysonData 1 month ago

      Hi Sanjaimi! I just created a Patreon for exactly such questions (link in bio). Feel free to join and get some help with this and any other questions you might have in the future.

    • @-SANJAIMI
      @-SANJAIMI 1 month ago

      @DenysonData will do, sir!!!

  • @xerwanderer
    @xerwanderer 1 month ago

    I've done all 3 of the architectures you mentioned, but am still not getting ideal results. The main issues I've encountered: 1. Lack of text2sql pairs: I've collected all of the SQL queries that succeeded in our database, but it's incredibly hard to infer back to the original query in human language. 2. It's almost impossible to help the LLM understand the relation between business info (usually expressed in human language) and the actual data structure. 3. The information density is quite low when exporting the database schema and table structure; we used lots of nested JSON stored in a single column, also enums with no detailed description. But it was done months ago; today I might have some new ideas on issues 1 & 3, but 2 remains seemingly impossible.

    • @DenysonData
      @DenysonData 1 month ago

      RE: "I've collected all of the SQL queries that succeeded in our database, but it's incredibly hard to infer back to the original query in human language": Good approach! However, I guess with txt2sql, more than ever, you need to start with the end-user questions, and from my experience there is usually a VERY limited set. RE: "It's almost impossible to help the LLM understand the relation between business info (usually expressed in human language) and the actual data structure": 100%. That's also my main argument against the hype around "genbi" and "AI will replace data analysts". RE: "we used lots of nested JSON stored in a single column, also enums with no detailed description": as with efficient data analytics, pre-processing according to the END business needs is your best friend here. Point the smartest person at a complex schema with dozens of caveats and they would throw their hands up sooner rather than later

  • @fuehnix
    @fuehnix 5 months ago

    Oh... I was hoping you would cover a bit more, like if I have the source in the Documents metadata, how can I get that to be used for citations

    • @DenysonData
      @DenysonData 1 month ago

      Sorry for the late reply. Figuring out the "unanswered comments" functionality just now. Let me know if you still have questions in this direction. Also, if you need support with a specific problem, please consider joining my Patreon private chat (link in bio). There, I can help you with your issues and we can also schedule private sessions to address more complex problems.

  • @fuehnix
    @fuehnix 5 months ago

    I agree with what you were saying about langchain hiding things lol. Even with Debug and verbose on, with everything set to "with_config(run_name="blah blah")", while also reading through the source code, it's hard to really trace what's going on in langchain :')

  • @Hunter-x3b
    @Hunter-x3b 6 months ago

    how to we coming to this page the first is how we registry to get user and password

    • @DenysonData
      @DenysonData 1 month ago

      Hi! I just created a patreon for exactly such questions (link in bio). Feel free to join and get some help with this and any other questions you might have in the future.

  • @quengelbeard
    @quengelbeard 6 months ago

    Hey Denys, great video! Is it possible to query a PostgreSQL database?

  • @MihirRawal
    @MihirRawal 7 months ago

    Just one question. How about a database which has around 1000 tables? How will it handle the prompt and tokens? Will it send all 1000 table schemas each time the user passes a query? I will appreciate your prompt reply. Thank you.

    • @DenysonData
      @DenysonData 1 month ago

      Good question, Mihir! I covered this in my latest video: you would vectorize your table schemas so that the LLM would decide on the go which tables are most appropriate for answering the user's question. I am planning to publish another video on this topic this week as well. Also, you could subscribe to the Patreon I just created (link in bio), where you could get my personal take on any future questions you might have.

  • @MihirRawal
    @MihirRawal 7 months ago

    Excellent. Thank you for this tutorial.

  • @steftrando
    @steftrando 7 months ago

    I would never trust the LLM to write the SQL query. I would create all the SQL queries behind JDBC and write a RESTful API with search parameters. Can Langchain do that?

    • @DenysonData
      @DenysonData 1 month ago

      Providing examples of the queries is one of the best practices these days. It is then used for the few-shot-prompt creation. I explain it in my latest video. If this is of interest to you, I just created a patreon account (link in bio), where you could get my personal take on any future questions you might have.

  • @user-dp7lr5qh6o
    @user-dp7lr5qh6o 8 months ago

    thank you

  • @user-dp7lr5qh6o
    @user-dp7lr5qh6o 8 months ago

    thank you

  • @Aidev7876
    @Aidev7876 8 months ago

    How about context memory? The user asks question A and then a follow-up question that needs question A to be answered?

    • @SanjayRoy-vz5ih
      @SanjayRoy-vz5ih 7 months ago

      Context memory can be implemented using ConversationBufferMemory... Please understand that LLMs are stateless by design

    • @DenysonData
      @DenysonData 1 month ago

      Good point.
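Because the model keeps no state between calls, "memory" in practice means re-sending earlier turns with each new question. The sketch below shows that idea in plain Python; it is not the LangChain implementation, and the `ConversationBuffer` class and the sample questions are hypothetical.

```python
class ConversationBuffer:
    """Keeps past turns and renders them in front of each new question."""

    def __init__(self):
        self.turns = []  # list of (role, text) tuples in chronological order

    def add(self, role, text):
        self.turns.append((role, text))

    def render(self, new_question):
        # The full history is prepended, so the model sees question A
        # when it answers the follow-up.
        lines = [f"{role}: {text}" for role, text in self.turns]
        lines.append(f"User: {new_question}")
        lines.append("Assistant:")
        return "\n".join(lines)


memory = ConversationBuffer()
memory.add("User", "How many salesmen do we have?")
memory.add("Assistant", "3")

prompt = memory.render("Can I get their names?")
print(prompt)
```

A buffer like this grows with every turn, so longer conversations usually need truncation or summarization to stay within the model's context window.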

  • @vaishalipandey002
    @vaishalipandey002 8 months ago

    thank you !!

  • @Aidev7876
    @Aidev7876 8 months ago

    How do I add memory? For example: Question 1: How many salesmen do we have? Answer: 3. Question 2: Can I get their names? As you can see, the second question relies on the first. How do we achieve that? Thanks

    • @DenysonData
      @DenysonData 1 month ago

      Hi! I just created a patreon for exactly such questions (link in bio). Feel free to join and get some help with this and any other questions you might have in the future.

  • @jzam5426
    @jzam5426 8 months ago

    have you been able to get this working with Llama through langchain?

    • @DenysonData
      @DenysonData 8 months ago

      Never got to playing with open source models. But many people do that!

  • @ratmirmukazhanov7985
    @ratmirmukazhanov7985 8 months ago

    Finally someone who is not indian, thank you Denys!!!

  • @CyberCreed
    @CyberCreed 9 months ago

    Is there anything similar for NoSQL?

    • @DenysonData
      @DenysonData 1 month ago

      For sure. This approach would work for any DB engine. The LLM only needs to generate the correct syntax. Sorry for the delayed reply. Figuring out the "Unanswered comments" functionality only now -_-

  • @melvinsagini1198
    @melvinsagini1198 9 months ago

    does it still work with the current updates? been having trouble downloading mine, this tutorial just made it all possible.

  • @soft.developer
    @soft.developer 9 months ago

    At 1:57 I try to save, but I get an error: "Unable to connect to server: connection failed: FATAL : password authentication failed for user "postgres""

  • @nicolasfelipe1
    @nicolasfelipe1 10 months ago

    very good video, will watch the next one about a deeper sql context.

  • @Funkandrew
    @Funkandrew 10 months ago

    This is so helpful, thanks! I myself am struggling with the /files endpoint. I want to upload a PDF but it only accepts JSONL? Any advice?

    • @DenysonData
      @DenysonData 10 months ago

      Sounds like you want to implement a RAG use case, but the OpenAI /files endpoint is for now intended for uploading data for fine-tuning, which is a completely different use case.

  • @texasfossilguy
    @texasfossilguy 10 months ago

    how can this work with FAISS vector databases?

    • @DenysonData
      @DenysonData 10 months ago

      This use case is starting to be implemented in a variety of products these days. For example, Microsoft allows you to search your SharePoint documents (which of course are vectorized under the hood). Here is an example repo: github.com/Azure-Samples/azure-search-openai-demo. We played around with it in our company and it works smoothly.

  • @anuvratshukla7061
    @anuvratshukla7061 11 months ago

    Can you make a video using a local LLM (open source instead of OpenAI) to do the same? TIA

    • @DenysonData
      @DenysonData 11 months ago

      Good point. Thank you for the suggestion. Will do!

    • @DenysonData
      @DenysonData 8 months ago

      Sorry. Never got to this one. The world of data is so unpredictable 🙈

  • @kaiserchief500
    @kaiserchief500 1 year ago

    Is it possible to cite the document sources?

    • @DenysonData
      @DenysonData 1 year ago

      Yes. Will make a short video on this soon 🙂

    • @DenysonData
      @DenysonData 11 months ago

      Here we go czcams.com/video/MOawB4k9-jk/video.html

  • @mywork1067
    @mywork1067 1 year ago

    very short video. please upload with full detail

    • @DenysonData
      @DenysonData 1 year ago

      Sure. Let me know what exactly you would like to learn about.

  • @thangudujyothsna2899

    thank you so much....I was struggling for hours ....yet you made it very simple.

  • @StasPakhomov-wj1nn
    @StasPakhomov-wj1nn 1 year ago

    these two graph videos were some of the best I've seen yet, keep going sir, amazing channel

    • @DenysonData
      @DenysonData 1 year ago

      Thank you! Let me know what topics you would be interested in diving into and I will look into it.

  • @djohnworthy1040
    @djohnworthy1040 1 year ago

    Denys I have an issue that my task definition keeps stopping. exec /usr/local/bin/flask: exec format error. I need help man. Is it possible to check it via discord?

    • @DenysonData
      @DenysonData 1 year ago

      Sorry for the late reply and thank you for the great suggestion! I just added a link to a Telegram public group where you can ask questions like the one above to the channel description.

  • @veuxgun
    @veuxgun 1 year ago

    THANKYOU SO MUCH! I CAN FINALLY STOP CRYING OVER THIS SHIT NOW. 😭😭

  •  1 year ago

    Ayyyeee! a well deserved like! Congrats for the great job! It helped me! It took weeks for me to get to deploy my flask app, the issue was the security group as well. Thank you so much again!

  • @gavin9370
    @gavin9370 1 year ago

    Noticed your GitHub link isn't working anymore in the description

    • @DenysonData
      @DenysonData 1 year ago

      Sorry about that. Had to do a major GitHub cleanup. Let me know if there is anything I could help you with.

  • @Add-__5128
    @Add-__5128 1 year ago

    Thanks bro