What is a Masked Language Model (MLM)?

#bert #naturallanguageprocessing #transformers
A Masked Language Model (MLM) is a neural network-based language model that is trained to predict masked (hidden) words in a text from their surrounding context. Masked language modeling is a pre-training technique used in natural language processing (NLP) to help models learn language structure and context.
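To make the masking idea concrete, here is a minimal sketch in plain Python of how training pairs for an MLM could be built. The function name `mask_tokens`, the `[MASK]` string, and the whitespace tokenizer are assumptions for illustration only. Real setups such as BERT use a subword tokenizer and a more elaborate replacement scheme, which this sketch leaves out.

```python
import random

MASK = "[MASK]"  # placeholder symbol; the exact string is an assumption


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens and remember the originals as labels.

    Returns (masked, labels). labels[i] holds the original token where a
    mask was applied and None elsewhere, so only masked positions are scored.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)   # the model is trained to recover this token
        else:
            masked.append(tok)
            labels.append(None)  # unmasked position, not part of the loss
    return masked, labels


tokens = "the cat sat on the mat".split()
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(masked)
print(labels)
```

The model sees `masked` as input and is trained to predict the non-`None` entries of `labels`.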
⏩ IMPORTANT LINKS
Research Paper Summaries: czcams.com/video/ykClwtoLER8/video.html
Enjoy reading articles? Then consider subscribing to a Medium membership; it is just $5 a month for unlimited access to all free/paid content.
Subscribe now - prakhar-mishra.medium.com/membership
*********************************************
⏩ CZcams - czcams.com/channels/oz8NrwgL7U9535VNc0mRPA.html
⏩ LinkedIn - linkedin.com/in/prakhar21
⏩ Medium - medium.com/@prakhar.mishra
⏩ GitHub - github.com/prakhar21
*********************************************
⏩ Please feel free to share the content and subscribe to my channel - czcams.com/channels/oz8NrwgL7U9535VNc0mRPA.html?sub_confirmation=1
Tools I use for making videos :)
⏩ iPad - tinyurl.com/y39p6pwc
⏩ Apple Pencil - tinyurl.com/y5rk8txn
⏩ GoodNotes - tinyurl.com/y627cfsa
#techviz #datascienceguy #deeplearning #ai #openai #chatgpt #machinelearning #recommendersystems #CustomerServiceTechnicalSupport #EfficientlyResolvingCustomerInquiries #RetrievalAugmentedGeneration #LargeLanguageModels #IssueTrackingTickets
#CustomerServiceQuestionAnswering #KnowledgeGraphRetrieval
About Me:
I am Prakhar Mishra, and this channel is my passion project. I am currently pursuing my MS (by research) in Data Science. I have 4+ years of industry work experience in Data Science and Machine Learning, with a particular focus on Natural Language Processing (NLP).

Views: 85

Video

SimCSE - Unsupervised Sentence Embeddings

5:56

84 views · 14 days ago

#sentencetrasformers #unsupervisedlearning #naturallanguageprocessing In this video, we discuss the paper SimCSE, an unsupervised method for learning sentence embeddings. SimCSE is a contrastive learning framework for generating sentence embeddings. Its unsupervised approach takes an input sentence and predicts itself in a contrastive objective, with only standard drop...

TSDAE - Unsupervised Training of Sentence Embeddings

9:25

138 views · 28 days ago

#sentencetrasformers #unsupervisedlearning #naturallanguageprocessing In this video, we discuss the paper TSDAE, an unsupervised method for learning sentence embeddings or representations.

Understanding Precision@K and Recall@K Metrics

6:09

246 views · a month ago

#recommendations #machinelearning #evaluation Precision at k (P@k) and Recall at k (R@k) are metrics used in information retrieval and machine learning to evaluate the performance of a ranking model or system. They measure how relevant the top k results are: P@k is the fraction of the top k results that are relevant, and R@k is the fraction of all relevant items that appear in the top k.
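As a worked illustration of these two metrics, the sketch below computes P@k and R@k for one ranked list. The function names and the toy document ids are made up for the example. Precision here divides by k even when fewer than k results are returned, which is one common convention and not the only one.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0  # convention chosen for this sketch
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


# Toy data: the ranked output of a system and the ground-truth relevant set.
ranked = ["d1", "d7", "d3", "d9", "d4"]
relevant = {"d1", "d3", "d5"}

print(precision_at_k(ranked, relevant, 3))  # 2 of the top 3 are relevant
print(recall_at_k(ranked, relevant, 3))     # 2 of the 3 relevant items found
```

Raising k can only keep recall the same or increase it, while precision may go up or down, which is why the two are usually reported together.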

9:06

Proposition-Based Retrieval Explained!

263 views · a month ago

#rag #genai #llms Dense retrieval has become a prominent method to obtain relevant context or world knowledge in open-domain NLP tasks. When we use a learned dense retriever on a retrieval corpus at inference time, an often-overlooked design choice is the retrieval unit in which the corpus is indexed, e.g. document, passage, or sentence. We discover that the retrieval unit choice significantly ...

Why do we need Positional Encoding in Transformers?

4:30

233 views · a month ago

#transformers #positionalencodings #naturallanguageprocessing Unlike LSTMs, Transformers do not inherently account for the order of tokens in a sequence. Positional encodings give the model a way to understand the position of each token, which is crucial for many tasks, such as those in natural language processing. In this video, we try to answer the Why, What, and Where of Positional Encodings in ...
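One way to see what a positional encoding looks like is the fixed sinusoidal scheme from the original Transformer paper, sketched below in plain Python. The description above does not say which scheme the video covers, so treating it as sinusoidal is an assumption. The function name is made up for the example.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position vectors.

    Even dimensions use sin, odd dimensions use cos, and each pair of
    dimensions shares a wavelength: angle = pos / 10000 ** (i / d_model),
    where i is the even dimension index of the pair.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even in this sketch")
    table = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            table[pos][i + 1] = math.cos(angle)
    return table


pe = sinusoidal_positional_encoding(seq_len=4, d_model=6)
for row in pe:
    print([round(x, 3) for x in row])
```

Each row is added to the token embedding at that position, so two identical tokens at different positions reach the attention layers with different vectors.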

Retrieval-Augmented Generation with Knowledge Graphs for Customer Support Q/A (Paper Summary)

15:51

436 views · a month ago

#rag #knowledgegraph #customersupport #machinelearning #llms #naturallanguageprocessing In customer service technical support, swiftly and accurately retrieving relevant past issues is critical for efficiently resolving customer inquiries. The conventional retrieval methods in retrieval augmented generation (RAG) for large language models (LLMs) treat a large corpus of past issue tracking ticke...

Improving your RAG system with Self Querying Retrieval

6:35

663 views · 2 months ago

#genai #rag #machinelearning A self-query retriever converts an unstructured query into a structured query and then applies the structured query to a vector store to get relevant results. This method can help improve the performance of your RAG pipeline.

Improving Document Re-ranking with Zero-Shot Question Generation (Paper Summary)

6:58

531 views · 2 months ago

#rag #llms #informationretrieval We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of a...

Understanding Reciprocal Rank Fusion in Hybrid Search [Advanced RAG]

8:16

856 views · 2 months ago

#ragsystem #hybridsearch #genai #llm Reciprocal Rank Fusion (RRF) is a simple and effective method for combining multiple ranked lists of search results into a single, more accurate ranked list. The main idea behind RRF is to give higher ranks to documents that appear near the top of the input lists, thus rewarding consensus among the different lists.
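The fusion step can be sketched in a few lines. In the usual formulation, each document receives a score of 1 / (k + rank) from every list in which it appears, and the scores are summed. The constant k = 60 is a commonly used default and is an assumption here, as are the function name and the toy result lists.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Ranks are 1-based. Documents missing from a list simply contribute
    nothing for that list.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy hybrid search: one keyword ranking and one vector-search ranking.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents "b" and "c" appear in both lists, so they outrank "a" even though "a" is first in the keyword list. This is the consensus effect described above.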

API Design - 3 common Pagination Strategies

3:32

267 views · 6 months ago

#apidevelopment #softwaredevelopment #api In API design, pagination is a technique used to manage and limit the amount of data returned by an API endpoint. When dealing with large sets of data, it's not efficient or practical to return the entire dataset in a single response. Pagination allows the API to provide a subset or "page" of the data, making it more manageable for clients to retrieve a...
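As a small illustration of returning one "page" of a dataset, here is a sketch of offset/limit pagination over an in-memory list. The description does not name the three strategies covered in the video, so picking offset/limit is an assumption. The function name and the response fields are hypothetical too.

```python
def paginate(items, page, page_size):
    """Return one page of `items` plus simple navigation metadata.

    Pages are 1-based. A page past the end yields an empty item list.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    total_pages = (len(items) + page_size - 1) // page_size  # ceiling division
    return {
        "page": page,
        "page_size": page_size,
        "total_pages": total_pages,
        "has_next": page < total_pages,
        "items": items[start:start + page_size],
    }


records = list(range(1, 11))  # stand-in for rows in a data store
print(paginate(records, page=2, page_size=4))
```

Offset-based paging is easy to implement, but results can shift between requests when rows are inserted or deleted. That is one reason cursor-style approaches are often preferred for frequently changing data.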

Blueprint for Designing Machine Learning Systems

10:38

876 views · 6 months ago

LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking

9:25

2.6K views · 7 months ago

Large Language Models (LLMs) for Recommendations (Paper Walkthrough)

8:56

4.5K views · 7 months ago

Document Re-ranking using LLMs - Advanced RAG

15:32

6K views · 7 months ago

Introducing PandasAI: Generative AI Python Library

10:15

7K views · 7 months ago

Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves

7:57

392 views · 8 months ago

Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models

10:49

1.5K views · 8 months ago

MemPrompt: Memory-assisted prompt editing to improve GPT-3 after deployment

15:52

236 views · 8 months ago

Zero-Shot Next-Item Recommendation using Large Pretrained Language Models

10:36

1.7K views · 8 months ago

Advanced RAG Concept: Improving RAG with Multi-stage Document Reranking

12:33

5K views · 8 months ago

Improving Language Model Reasoning with Contrastive Chain-of-Thought Prompting

9:15

750 views · 9 months ago

Improving RAG performance with Parent Context Retriever

4:53

599 views · 9 months ago

Understanding RAG: Basics, Challenges, and Improvements

13:09

1.5K views · 9 months ago

PDF Document Question Answering with ChatGPT #demo

2:32

9K views · a year ago

GPT-3 Fine-Tuning Made Easy: No Coding Required!

8:45

2.7K views · a year ago

DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarisation (Paper Summary)

12:09

972 views · a year ago

Frustratingly Easy Model Ensemble for Abstractive Summarization (Research Paper Walkthrough)

9:37

728 views · a year ago

GPTZero: Hero or Zero in Detecting AI Generated Text?

5:37

3.1K views · a year ago

I built GPT-3 powered Excel for analysing Amazon Reviews

12:51

1.2K views · a year ago

Comments

@ravi03071991 6 days ago
Great explanation.
@TechVizTheDataScienceGuy 1 day ago
Thank you Ravi :)
@martymcfly6411 6 days ago
GPTZero sucks. It says things I typed myself are AI. It's a scam.
@prajyotmane9067 7 days ago
You are really doing a good job! This is going to help a lot of people.
@TechVizTheDataScienceGuy 1 day ago
Thank you :)
@VALedu11 9 days ago
Thank you for such a lucid explanation.
@TechVizTheDataScienceGuy 1 day ago
Thank you :)
@mahaalabduljalil6596 10 days ago
To @everyone: You have to watch this video, it's pretty cool 😄 such a nice job explaining even the minor details. Well done!
@TechVizTheDataScienceGuy 1 day ago
Thank you 🤩
@AnuragMishra-ws4zc 13 days ago
Sir, can you suggest some of the best RAG ranking algorithms, for example to build an AI freelancer matcher that ranks gigs based on their description, ratings, and number of reviews? How can I approach this?
@thumarzeel 22 days ago
You are a true TechViz man... love the simplicity with which you explain.
@TechVizTheDataScienceGuy 22 days ago
Thank you. I am glad 😁
@RajdeepBorgohainRajdeep 25 days ago
Really appreciated the effort, please continue the paper deep dives!
@TechVizTheDataScienceGuy 25 days ago
Thank you so much. A lot more are already scheduled 👏
@writtenkannadalyrics 27 days ago
Please share the PDF link for this paper.
@TechVizTheDataScienceGuy 27 days ago
Hey, it's in the description ⬆️
@TechVizTheDataScienceGuy 29 days ago
List of unsupervised sentence embedding methods: github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning
@tarun-prakash a month ago
Well explained, thanks!
@TechVizTheDataScienceGuy a month ago
Thank you!
@DrAIScience a month ago
Thanks. Can Donut be used for detecting text regions such as captions, page numbers, and serial numbers, and for classifying them?
@TechVizTheDataScienceGuy a month ago
You could train it to extract the required entries from a PDF, so that it effectively simulates region-specific extraction.
@stephenmicaiah3170 a month ago
%AI improve and develop our contextual strand adequacy interpretation 2024 to %AI improve and develop our contextual strand adequacy interpretation 2024
@TechVizTheDataScienceGuy a month ago
GPT-generated?
@alexilaiho6441 a month ago
Awesome, man! Keep doing this stuff. Subscribed!!
@TechVizTheDataScienceGuy a month ago
Thanks man! 👏
@ravi03071991 a month ago
Great video, thanks for making it.
@TechVizTheDataScienceGuy a month ago
Thanks, Ravi!
@TechVizTheDataScienceGuy a month ago
NDCG Metric: czcams.com/video/xHhLOQ7Pnb4/video.html
Mean Reciprocal Rank Metric: czcams.com/video/6dDvfGrxFns/video.html
@ramaraopirati a month ago
Your explanation of T2 becoming 0 with sigmoid is incorrect.
@TechVizTheDataScienceGuy a month ago
Oh okay 🤔 maybe I missed something. I don’t clearly remember the content of this one now. It would be really helpful if you could point out the error with a timestamp and give the alternative explanation; it will benefit anyone who sees it from here on. I’ll also pin the comment to prioritise its visibility. 🙏 Thank you
@tombombadil208 a month ago
Thanks for sharing this, as always! Interesting approach, but based on their results, it does not seem to provide significant improvement (besides SimCSE and Contriever). One must keep in mind the cost associated with having additional chunks, especially in popular retrievers like DPR.
@TechVizTheDataScienceGuy a month ago
Yes, 💯
@waleedyasinkhan7838 a month ago
You should also mention that a value head is getting trained, which is placed on the LLM. You can refer to the TRL PPO example and check its source code.
@samriddhlakhmani284 a month ago
Thanks for sharing this. Awesome work ✌🏻
@TechVizTheDataScienceGuy a month ago
Thank you! 👏
@TechVizTheDataScienceGuy a month ago
🌟Watch more Research Paper Summaries at czcams.com/video/ykClwtoLER8/video.html
@dailywisdomquotes518 a month ago
Can you code this paper with sample ticket data?
@TechVizTheDataScienceGuy a month ago
Aah, that looks very difficult with my current schedule :/ but I am sure someone has done, or will do, an open-source implementation of it 🔜
@himanshukumarsharma9098 a month ago
Awesome.... Thank you so much
@TechVizTheDataScienceGuy a month ago
You’re welcome!
@himanshukumarsharma9098 a month ago
Interesting.... Thank you for sharing....🤝
@TechVizTheDataScienceGuy a month ago
👏 👏
@PuzzlePlungeRiddlesAndPatterns a month ago
Thanks for the intuition! :)
@bhargavigottapu4851 a month ago
Can you share the source code?
@TechVizTheDataScienceGuy a month ago
Hey, that’s not my app. The source link is in the description.
@ravi03071991 a month ago
Amazing paper. Thanks for making the video.
@TechVizTheDataScienceGuy a month ago
Thank you!
@rohitdutkunwar8705 a month ago
Thanks a lot... Very much needed.
@TechVizTheDataScienceGuy a month ago
👍👏
@TechVizTheDataScienceGuy a month ago
> Interested in consuming byte-sized AI/ML content? Then feel free to check www.youtube.com/@TechVizTheDataScienceGuy/shorts
> If you're looking for more research paper summaries, then check czcams.com/play/PLsAqq9lZFOtWUz1WEoJ3GXw197LD7BxMc.html
@PuzzlePlungeRiddlesAndPatterns a month ago
Thanks for sharing the insights. I was exploring a similar use case.
@TechVizTheDataScienceGuy a month ago
Great to hear that. Thank you!
@tombombadil208 2 months ago
Thanks for this. You helped me solve a problem.
@PavanKumar-oo1vz 2 months ago
Thanks for the clear explanation! Is there any self query retrieval cookbook available for reference?
@TechVizTheDataScienceGuy 2 months ago
Hi, thank you :) There’s an implementation in LangChain that you might want to check out - python.langchain.com/v0.1/docs/modules/data_connection/retrievers/self_query/
@PuzzlePlungeRiddlesAndPatterns 2 months ago
Amazing 🙌
@TechVizTheDataScienceGuy 2 months ago
Thanks :)
@TechVizTheDataScienceGuy 2 months ago
⭐Interested in consuming byte-sized AI/ML content? Then feel free to check www.youtube.com/@TechVizTheDataScienceGuy/shorts
⭐If you're looking for research paper summaries, then check czcams.com/play/PLsAqq9lZFOtWUz1WEoJ3GXw197LD7BxMc.html
@pratik6447 2 months ago
Which model is used in the re-ranker? And how do you get the likelihood using that model? Is there a notebook for a demo?
@TechVizTheDataScienceGuy 2 months ago
Hey, any LLM that can output next-word probabilities works here. The authors specifically tried GPT variants such as Neo and J, and then T5 variants. The likelihood of the question is nothing but the product of the probabilities of every word given the retrieved passage and the previously generated words. Here’s the official repo - github.com/DevSinghSachan/unsupervised-passage-reranking Hope this helps!
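The "product of probabilities" in the reply above can be sketched as follows. The per-token probabilities are made-up toy numbers standing in for what a language model would assign to each question token given a passage. No real model or library API is called, and the function names are hypothetical. Summing log-probabilities is used in place of multiplying raw probabilities, because the two give the same ranking and the sum avoids numerical underflow on long questions.

```python
import math


def sequence_log_likelihood(token_probs):
    """Log of the product of per-token probabilities, computed as a sum."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    return sum(math.log(p) for p in token_probs)


def rerank_by_question_likelihood(question_probs_by_passage):
    """Order passage ids by how likely the question is given each passage."""
    return sorted(
        question_probs_by_passage,
        key=lambda pid: sequence_log_likelihood(question_probs_by_passage[pid]),
        reverse=True,
    )


# Hypothetical P(question token | passage, previous question tokens).
toy_probs = {
    "passage_a": [0.5, 0.4, 0.9],
    "passage_b": [0.2, 0.3, 0.5],
}

print(rerank_by_question_likelihood(toy_probs))
```

In a real pipeline, the toy numbers would come from scoring the question tokens with a pre-trained language model while the passage is supplied as context.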
@PuzzlePlungeRiddlesAndPatterns 2 months ago
Crisp and clear! ❤
@TechVizTheDataScienceGuy 2 months ago
⭐ Interested in consuming byte-sized AI/ML content? Then feel free to check www.youtube.com/@TechVizTheDataScienceGuy/shorts
⭐Just like this one, if research paper summaries are your type, then check czcams.com/play/PLsAqq9lZFOtWUz1WEoJ3GXw197LD7BxMc.html
@TechVizTheDataScienceGuy 2 months ago
🌟Blog: prakhar-mishra.medium.com/enhancing-passage-retrieval-with-zero-shot-question-generation-paper-summary-301d34e0278b
🌟Interested in consuming byte-sized AI/ML content? Then feel free to check www.youtube.com/@TechVizTheDataScienceGuy/shorts
🌟Just like this one, if research paper summaries are your type, then check czcams.com/play/PLsAqq9lZFOtWUz1WEoJ3GXw197LD7BxMc.html
@theindianrover2007 2 months ago
Nicely explained
@ujraman 2 months ago
This is not your typical BI or reporting engine where you are performing a drill-down and drill-through on your indices. The index in your vector DB and the LLM are good enough for your RAG.
@user-nf4wt3ef9e 2 months ago
I have been waiting for your explainer videos for the last few months! Thanks for sharing 😊
@thumarzeel 2 months ago
Awesome man
@PuzzlePlungeRiddlesAndPatterns 2 months ago
Very clear explanation. Thank you so much!
@mukeshkund4465 2 months ago
It's very clear and crisp. Is it possible to add a practical implementation of RRF?
@TechVizTheDataScienceGuy 2 months ago
Found this resource online - safjan.com/implementing-rank-fusion-in-python You can check this out.
@imten5518 2 months ago
Good approach, but isn't it too expensive? Comparing each sentence with the next sentence in the whole document. It would work well for smaller datasets but would be slower otherwise.
@TechVizTheDataScienceGuy 2 months ago
It would be O(n) time complexity. You can add a few preprocessing steps to do it on smaller document portions, if that is a concern.
@_Han-xk1zv 2 months ago
Are you familiar with the reasons for conducting the re-ranking step? Specifically, given the premise of extracting relevant document candidates using only DPR, I'm curious about your perspective on why we'd need to conduct re-ranking using a cross-encoder, in addition to extracting relevant document candidates by computing cosine similarity with a bi-encoder.
@TechVizTheDataScienceGuy 2 months ago
A bi-encoder doesn’t really take into account attention between the two sentences, whereas a cross-encoder, with its cross-attention, does. Ideally one should use cross-attention for the first step as well; it’s just that it gets really expensive to do so. So get the top k using the fast method, then re-rank them with the better method. Thanks
@_Han-xk1zv 2 months ago
@TechVizTheDataScienceGuy Thanks for your constructive feedback!
@ANKRY10 2 months ago
Please reply. I am confused by the many approaches (BART, BERT, T5, PyTorch...) in this domain. How would you develop a doable university-level text summarization project (major) at present? Appreciate the work!
@TechVizTheDataScienceGuy 2 months ago
Hey, I didn’t follow the latter part. Could you please rephrase?

TechViz - The Data Science Guy