Advanced RAG 02 - Parent Document Retriever

  • Published 22 Aug 2024

Comments • 68

  • @HaakonJacobsen
    @HaakonJacobsen 10 months ago +5

    I have been waiting for a series like this, on this topic. Thank you Sam🙏 One of the things I have struggled with the most is sourcing the documents and getting the LLM to specify exactly the sentence(s) that answer the question. That would be such a powerful feature to have.

    • @priya-dwivedi
      @priya-dwivedi 10 months ago

      Agreed. Any tips on citing the answers? Specifically, if certain attributes like $ amounts or interesting facts are mentioned in an answer, how can we validate them and cite their sources?

  • @JosemarMigowskiBrasilia
    @JosemarMigowskiBrasilia 10 months ago +4

    Congratulations on the videos related to RAG.
    This second one, in particular, is exactly in line with the resources we are working on. It was extremely enlightening.
    Once again, congratulations and thank you very much for your teachings.

  • @micbab-vg2mu
    @micbab-vg2mu 10 months ago +1

    I try to get deeper into RAG - thank you for the video.

  • @henkhbit5748
    @henkhbit5748 10 months ago +1

    Great video about the parent/child retriever👍. Indeed a nice addition to "normal" RAG retrieval. What I missed in LangChain is how to do RAG effectively for a "help desk question and answer conversation" that is normally stored in some database. For example, a customer asks about a problem with his iPhone and the agent replies with some answer, and this conversation could go back and forth, i.e. cust->agent->cust->agent->... The same use case applies to doing RAG on FAQ information/databases.

  • @Gingeey23
    @Gingeey23 10 months ago

    Brilliant video - this series is exactly what I have been after for improving RAG performance on large datasets!

  • @ninonazgaidze1360
    @ninonazgaidze1360 10 months ago +4

    For anyone who works on retrieval of info from large documents, this video is like air. Because we really need to be smart enough to take this concept into account if we really want great results. @Sam, I am so glad I saw your videos!

  • @pdamartin4203
    @pdamartin4203 9 months ago

    I loved the series of videos on advanced RAG, so clear and insightful. Great job on bringing these tips and tricks from your professional knowledge and experience to learners like me. Thank you.

  • @shortthrow434
    @shortthrow434 10 months ago

    Thank you Sam. This is very useful. Keep up the excellent work.

  • @curlynguyen6456
    @curlynguyen6456 6 months ago

    I literally spent the whole day thinking about and implementing this in vanilla Python without knowing the keywords, and this shows up at bedtime.

  • @pedrojesusrangelgil5064
    @pedrojesusrangelgil5064 10 months ago +2

    Hi Sam 👋 I'm starting my learning path on LLMs and RAG and your channel is helping me a "big chunk" with that. What metrics can we use to determine if a model is a "decent" one for RAG applications? Thanks a lot for all your content 🙌

  • @justriseandgrind6910
    @justriseandgrind6910 10 months ago

    I never knew this existed, haha. Going to dig into this for fun.

  • @maxlgemeinderat9202
    @maxlgemeinderat9202 9 months ago +1

    Great tutorial! How can I save the "big_chunks_retriever"? Would you recommend a pickle?

  • @chickenanto
    @chickenanto 10 months ago

    Thank you Sam for the great work! Is the following video about the MultiVectorRetriever? That would be awesome!

  • @toastrecon
    @toastrecon 10 months ago +1

    I love this topic. One question I have: doesn't the more or less "arbitrary" selection of 400 tokens mean that our parent document will always be split into child documents that are close by in the text? For example: say a document is 10k tokens long. Subject 1 is mainly in the first 2k tokens and is then summarized with some conclusions in the last 1k tokens. Subject 2 is in tokens 2-7k (also included in the summary), and then Subject 3 consists of tokens 7-9k, etc. So, using the 400-token slice, maybe you'd have some "contamination" of one subject into another child document? Maybe those numbers are big enough that the overlap wouldn't be that bad, but you get the idea: what if you're trying to create embeddings that are topically focused, but you chop up the parent document just by adjacent tokens in a "window"? Maybe if you had some way to keep changing that window, even making it dynamically adjusted, to maximize some kind of relevance metric? That's not what recursive splitting is, is it?
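
The "contamination" concern above can be illustrated with a small plain-Python sketch (a toy, not LangChain's actual splitter; the sample text is invented): a fixed window cuts every N characters no matter where a subject ends, while a paragraph-first pass, which is the spirit of recursive splitting, greedily packs whole paragraphs so chunk edges land on natural boundaries.

```python
def fixed_window_split(text, size, overlap=0):
    """Cut every `size` chars (with optional overlap), ignoring content."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def paragraph_first_split(text, size):
    """Greedily merge whole paragraphs up to `size` chars, so chunk
    boundaries fall on paragraph breaks instead of mid-subject."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = (current + "\n\n" + para) if current else para
        if len(candidate) <= size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks

doc = "Subject one, part A.\n\nSubject one, part B.\n\nSubject two begins here."
print(fixed_window_split(doc, 30))     # windows cut mid-sentence / mid-subject
print(paragraph_first_split(doc, 45))  # chunks end at paragraph breaks
```

A real recursive splitter goes further: whenever a piece is still too long it falls back through finer separators (paragraph, line, word), which is why its chunks rarely cut mid-subject even without a dynamically adjusted window.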

  • @dimknaf
    @dimknaf 10 months ago

    Very useful! Thanks!

  • @pascualsilva4210
    @pascualsilva4210 1 month ago

    Thank you very much.
    What is the optimal method for consulting similar documents when working with documents that are 10 to 20 pages long, and then summarizing all of them?
    Or is it better to combine the chunks of an entire document and then summarize it with LangChain?

  • @Skhulile84
    @Skhulile84 10 months ago

    Thanks for this! I'm loving this series on Advanced RAG! Quick one: have you had a look at Semantic Kernel? I see it as a LangChain alternative, but I'm trying to decide what to go with for a RAG system for some work documents. I'm leaning more toward LangChain, as it seems there are just a lot of cool implementations they have. What do you think?

  • @tubingphd
    @tubingphd 10 months ago

    Thank you Sam

  • @davidmonterocrespo
    @davidmonterocrespo 3 months ago

    Thanks!!!

  • @yusufkemaldemir9393
    @yusufkemaldemir9393 10 months ago

    Very cool! I am confused, though, between the pro tips videos 1 and 2 and fine-tuning. Does this mean that for local private docs we can skip fine-tuning and explore the options provided in video 1 or 2?
    Does your notebook run on an M2 MacBook? Have you tried it on an M2 Mac? Thanks again!

  • @andrewlaery
    @andrewlaery 10 months ago

    Effective RAG is so critical and so powerful. This is a great tutorial to help with tools and workflow. Always top notch, thx! Have you got thoughts on how and where to store all the text data (i.e. raw text data, queries, LLM responses)? Obviously scale is a consideration, so while in dev mode, store text in JSON locally maybe? Then import into Python when needed for processing?

  • @shamikbanerjee9965
    @shamikbanerjee9965 10 months ago

    👌👌👌 really good ideas

  • @seththunder2077
    @seththunder2077 10 months ago +1

    Is it possible to use RetrievalQAWithSourcesChain instead of the normal RetrievalQA chain? If so, how can I add memory and a prompt to it? Sorry if it's a beginner question, but it doesn't show this can be done when I check the different methods.

  • @askcoachmarty
    @askcoachmarty 7 months ago

    Another great tutorial, thanks! Can you check my understanding of this? This retriever takes one or more parent documents and creates child documents from the incoming "parent" documents. To contrast, with a simple example, let's say I have a document with a table of contents; the TOC is good for understanding the flow and structure of the document, but then each section identified in the TOC is in another document in the vector store. This ParentDocumentRetriever isn't appropriate or applicable for that use case. Is that correct?

  • @ghosthanded
    @ghosthanded 4 months ago

    Hi Sam, I had a question regarding multimodal embeddings: how do I embed image-text pairs where the associated text is very big?

  • @rickmarciniak2985
    @rickmarciniak2985 3 months ago

    Hi Sam. Is it possible to add large PDF files of textbooks into a RAG process like this? The content is technical. The books are on cytogenetic laboratory testing, 300-700 pages. Have you ever done something like this?
    My background is in genetics, not computer science. I'm not great with Python, but I write reports in SQL every day. Thanks!

  • @ninonazgaidze1360
    @ninonazgaidze1360 10 months ago

    For working with a bunch of different documents, which model would you use for the most accurate answers? An OpenAI model like "text-embedding-ada-002" or HuggingFaceBGEEmbeddings, and exactly which one?

  • @arkodeepchatterjee
    @arkodeepchatterjee 10 months ago

    please make a video on the web scraper 🙏🏻🙏🏻

  • @peterc.2301
    @peterc.2301 10 months ago

    One more great video! Could you give any insight on how to use more complex prompts, like the "Professor Synapse" prompt, in LangChain?

  • @picklenickil
    @picklenickil 10 months ago

    I could have done something like this:
    Preprocess the document to include the page number and doc name in the corpus for every page.
    Train a universal encoder-decoder pipeline, in TF probably.
    Use nearest neighbor to get the top 5 matches.
    Send it to the cheapest LLM I can find and instruct it to 'reply to the query and mention the reference in APA' or whatever.
    ..
    If I'm feeling fancy, I would save these references and queries in a text file, because why not... as a cache or something.
    Boom..
    What do you guys think?

    • @mshonle
      @mshonle 10 months ago

      You might be interested in the DensePhrases work: Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (2019, arXiv:1906.05807), and Learning Dense Representations of Phrases at Scale (2021, arXiv:2012.12624)

  • @dejoma.
    @dejoma. 7 months ago

    I am watching these videos and I'm like, why does his voice sound familiar? But I've worked it out... Stewie from Family Guy!!! HAHA, YOU CAN'T UNHEAR IT

  • @mohamednihal8215
    @mohamednihal8215 4 months ago

    Won't we get a token-limit-exceeded error if we pass larger chunks to the LLM for in-context learning? It also might be very expensive token-wise.

  • @rishab7746
    @rishab7746 2 months ago

    How do I use a parent document retriever with Qdrant in LangChain?

  • @DongToni
    @DongToni 2 months ago

    Hi Sam, thanks for your video; it's really helpful for understanding RAG better. I want to know more about chunks. In our case, Chinese and English characters are blended together in different documents, and we try to split the docs using a fixed size, like 512k or 5000 characters, but that makes the chunks really messy, which leads to inaccurate output. Any suggestions from your side? Or shall I ask the business stakeholders to help us split the docs into smaller chunks? Thanks.

    • @samwitteveenai
      @samwitteveenai  1 month ago

      Try a multilingual embedding model like bge-m3. You can also look at semantic chunking.

    • @DongToni
      @DongToni 1 month ago

      @@samwitteveenai I will try it later. Thanks for your recommendation.
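
For what it's worth, the semantic-chunking idea mentioned above can be sketched in a few lines of plain Python. This is a toy: Jaccard word overlap stands in for the embedding similarity a real implementation would compute, and the sentences are invented.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (0.0 to 1.0) -
    a cheap stand-in for cosine similarity of sentence embeddings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk wherever similarity between consecutive
    sentences drops below the threshold (a topic shift)."""
    chunks, current = [], [sentences[0]]
    for prev, sent in zip(sentences, sentences[1:]):
        if jaccard(prev, sent) >= threshold:
            current.append(sent)
        else:
            chunks.append(" ".join(current))
            current = [sent]
    chunks.append(" ".join(current))
    return chunks

sents = [
    "The battery drains quickly on this phone.",
    "The battery also gets hot while charging the phone.",
    "Shipping to Europe takes five days.",
]
print(semantic_chunks(sents))  # the shipping sentence starts a new chunk
```

The point of the threshold: chunk boundaries follow meaning rather than a fixed character count, which helps exactly when scripts or languages are mixed and character counts are misleading.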

  • @akash_a_desai
    @akash_a_desai 10 months ago

    Love your tutorials. Also, can you make a video on FastAPI + LangChain, deploying apps in production?

    • @samwitteveenai
      @samwitteveenai  10 months ago

      So LC has some interesting things coming in this space, so I will get to it soon. What are the challenges you are having with FastAPI etc.?

  • @alx8439
    @alx8439 10 months ago

    So the secret sauce is just to slice and vectorize documents more carefully, into thinner slices, and when constructing the context for the LLM prompt, to grab some more surrounding text from the doc where the needle was found?

    • @samwitteveenai
      @samwitteveenai  10 months ago +2

      It certainly can help for a lot of tasks.

    • @alx8439
      @alx8439 10 months ago

      @@samwitteveenai that's for sure. I wonder why this simple but effective idea didn't come to their minds earlier :)
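
That summary can be sketched end to end in plain Python. Everything here is a stand-in: word overlap replaces embedding similarity, the documents are invented, and a real setup would use a vector store; but the shape is the parent-document idea: search small, focused child chunks, then hand the LLM the larger parent chunk the best match came from.

```python
def make_index(parents, child_size=40):
    """Slice each parent into small child chunks, remembering which
    parent each child came from."""
    index = []  # list of (child_text, parent_id)
    for pid, parent in enumerate(parents):
        for i in range(0, len(parent), child_size):
            index.append((parent[i:i + child_size], pid))
    return index

def retrieve_parent(query, parents, index):
    """Match the query against the thin child slices, but return the
    surrounding parent chunk as the LLM context."""
    q = set(query.lower().split())
    def score(child):
        return len(q & set(child.lower().split()))
    best_child, pid = max(index, key=lambda item: score(item[0]))
    return parents[pid]

parents = [
    "The iPhone battery drains fast when background refresh is on. "
    "Turning it off in Settings usually helps.",
    "LangChain is a framework for building LLM applications.",
]
index = make_index(parents)
print(retrieve_parent("battery drains fast", parents, index))
```

The child chunks make the search precise (the needle), while the parent gives the LLM enough surrounding text to actually answer (the haystack around it).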

  • @alexandershevchenko4167
    @alexandershevchenko4167 6 months ago

    Sam Witteveen, can you please help: how do I save/load the db (vectorstore) when using the Parent Document Retriever in LangChain?

  • @user-fq3yt7zd5n
    @user-fq3yt7zd5n 10 months ago

    Do we require large language models (LLMs) trained on billions of parameters for RAG QA on custom knowledge bases? If so, how large do they need to be? If not, what models are good for fast and decent RAG applications, other than OpenAI's?

  • @maninzn
    @maninzn 10 months ago

    Great video, Sam. If I have a lot of tables in my PDF, what's the best way to create the embeddings? A text splitter is not really good for table data.

    • @pensiveintrovert4318
      @pensiveintrovert4318 10 months ago +1

      The great thing about tables is that authors usually have one or two tables per PDF page. I have had very reasonable results with 10-Ks by looking at the whole page.
      P.S. You don't need vectors for table data, just headers, footers, and row labels.
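
The label-only indexing suggested above might look like this sketch (the tables are hypothetical; a real pipeline would pair this with the surrounding page text): match query terms against headers and row labels only, then return the whole table as context.

```python
tables = [
    {"headers": ["Year", "Revenue", "Net income"],
     "rows": {"2021": [100, 10], "2022": [120, 15]}},
    {"headers": ["Region", "Units sold"],
     "rows": {"EMEA": [500], "APAC": [700]}},
]

def find_table(query, tables):
    """Score each table by how many query terms appear in its headers
    or row labels - no embeddings of cell values needed."""
    terms = set(query.lower().split())
    def score(t):
        labels = [h.lower() for h in t["headers"]] + [r.lower() for r in t["rows"]]
        return sum(any(term in label for label in labels) for term in terms)
    return max(tables, key=score)

print(find_table("net income in 2022", tables))  # picks the financials table
```

Once the right table is located, the whole thing (or the whole page, as suggested above) goes to the LLM, which handles reading individual cells far better than chunk-level embeddings do.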

  • @RUSHABHPARIKH-vy6ey
    @RUSHABHPARIKH-vy6ey 4 months ago

    Amazing stuff! Can you please open-source the scraper or link to the code? Thanks!

  • @pensiveintrovert4318
    @pensiveintrovert4318 10 months ago

    Do you produce transcripts for the videos? With a video one can grasp a few ideas maybe, but to increase the value it would be good to have step-by-step instructions that can be read. You could use open-source models to transcribe.

    • @samwitteveenai
      @samwitteveenai  10 months ago +1

      I upload the subtitles which are pretty much a direct transcript. I am actually experimenting with getting a LLM to convert the video transcript to a blog post. I haven’t got it to a standard I am happy releasing but need to try a few more ideas. I totally get some people want to refer back to the text etc.

    • @pensiveintrovert4318
      @pensiveintrovert4318 10 months ago

      @@samwitteveenai I have been playing with data extraction from 10-K PDFs, using the CodeLlama 2 34B Phind flavor. It produces pretty clean JSON, but I am now running into problems: if I change anything trying to improve the output, it breaks something else. I've tried multiple passes. It kind of works but is very slow running locally.

    • @pensiveintrovert4318
      @pensiveintrovert4318 10 months ago

      @@samwitteveenai Wouldn't an LLM writing a transcript from CC be a perfect project? 😎

    • @Canna_Science_and_Technology
      @Canna_Science_and_Technology 10 months ago

      from langchain.document_loaders import YoutubeLoader

  • @loicbaconnier9150
    @loicbaconnier9150 10 months ago

    Thanks for everything. Could you add a direct link to your notebook? It's quite impossible to find it among all of them.

  • @syedhaideralizaidi1828
    @syedhaideralizaidi1828 8 months ago

    Can I use this retriever with Azure Cognitive Search?

  • @stablegpt
    @stablegpt 10 months ago

    How do I specify "k" in this case?

  • @ChrisadaSookdhis
    @ChrisadaSookdhis 10 months ago

    When we do big chunks + little chunks, both are added to the vector store. In this case, what does the InMemoryStore do, and is it still required?

    • @samwitteveenai
      @samwitteveenai  10 months ago

      Hi Chris. Yes, you can store them in a separate vector store rather than in memory. With LlamaIndex you can do some fancier stuff as well.

    • @Reccotb
      @Reccotb 10 months ago

      @@samwitteveenai Hi, can you provide an example? I'm stuck trying exactly this, because InMemoryStore is a BaseStore[str, Document] and not a VectorStore.
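
Since the parent documents are just an id-to-text mapping, they can be persisted separately from the vector store that holds the child embeddings. The sketch below is a plain-Python stand-in, not LangChain's BaseStore API (though the mset/mget names mirror it); it writes the mapping to a JSON file so it survives restarts.

```python
import json
import os
import tempfile

class FileDocStore:
    """Toy file-backed key-value docstore for parent documents."""
    def __init__(self, path):
        self.path = path

    def mset(self, pairs):
        """Store (key, document) pairs, merging with what's on disk."""
        data = self._load()
        data.update(dict(pairs))
        with open(self.path, "w") as f:
            json.dump(data, f)

    def mget(self, keys):
        """Fetch documents by key; None for missing keys."""
        data = self._load()
        return [data.get(k) for k in keys]

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

path = os.path.join(tempfile.mkdtemp(), "parents.json")
store = FileDocStore(path)
store.mset([("doc-1", "full parent chunk text ...")])
print(store.mget(["doc-1"]))
```

Because the store is just a file, a second process (or a restarted notebook) can open the same path and retrieve the parent chunks, while the child vectors live in whatever persistent vector database is being used.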

  • @Shriganesh-jc9jc
    @Shriganesh-jc9jc 10 months ago

    At the end, the parent document that a small chunk originated from is sent to the LLM as context for the answer. Is this correct?

    • @samwitteveenai
      @samwitteveenai  10 months ago +1

      Yes, for this one. In the two videos after this I show some other ways you can deal with it.

    • @Shriganesh-jc9jc
      @Shriganesh-jc9jc 10 months ago

      @@samwitteveenai Thank you ..

  • @bodimallareddy9160
    @bodimallareddy9160 10 months ago

    Can you please suggest an embedding model for RAG on German text?

    • @samwitteveenai
      @samwitteveenai  10 months ago +1

      I would probably go with huggingface.co/intfloat/multilingual-e5-large for now, but keep an eye out; there is supposed to be a multilingual BGE coming as well.

  • @arkodeepchatterjee
    @arkodeepchatterjee 10 months ago

    please release the scraper 🙏🏻🙏🏻