Reliable Graph RAG with Neo4j and Diffbot

  • Published on 28 June 2024
  • We're developing a GraphRAG system using Diffbot's APIs to construct reliable knowledge graphs, which are then stored in a Neo4j graph database for efficient querying and information retrieval.
    0:00 intro
    0:22 brief overview of graph rag and knowledge graphs
    0:50 potential pitfalls of vector-based rag
    1:29 graph rag research by microsoft
    2:09 potential pitfalls of llms constructing knowledge graphs
    2:20 brief intro of entity resolution
    3:03 entity resolution problem with gpt-4
    3:41 entity resolution handled by Diffbot
    3:51 Graph RAG demo + article importer
    4:37 web scraping without worrying about hallucinated sources
    5:24 KG construction from news article
    5:37 enrich the knowledge graph with Enhance API
    5:57 final network graph
    6:09 question answering (vector vs. vector+kg)
    7:26 more examples (vector vs. vector+kg)
    7:35 skip the outro and have fun with the repo!
    7:52 attribution to Tomaž Bratanic and Anej Gorkič
    Get your free Diffbot token to start building graph rag at:
    app.diffbot.com/get-started
    Github repo for this Graph RAG project:
    github.com/tomasonjo/diffbot-...
    #graphrag #knowledgegraphs #llms
  • Science & Technology
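
The entity resolution step listed in the chapters above (2:20 to 3:41) can be pictured with a minimal Python sketch. This is not Diffbot's method or API: the alias table, the entity IDs, and the function names are all hypothetical, chosen only to show why mapping every mention to one canonical ID before loading keeps duplicate nodes out of the graph.

```python
# Hypothetical alias table: lowercase surface form -> canonical entity ID.
# A real pipeline would get these links from an entity-resolution service.
ALIASES = {
    "acme": "org:acme",
    "acme corp.": "org:acme",
    "acme corporation": "org:acme",
    "jane doe": "person:jane_doe",
    "j. doe": "person:jane_doe",
}


def resolve(mention: str) -> str:
    """Map a raw mention to a canonical ID; unknown mentions keep their own ID."""
    key = mention.strip().lower()
    return ALIASES.get(key, f"unresolved:{key}")


def build_edges(raw_triples):
    """Resolve both ends of each (subject, relation, object) triple and drop duplicates."""
    return {(resolve(s), rel, resolve(o)) for s, rel, o in raw_triples}


# Three differently worded extractions of the same (made-up) fact.
raw = [
    ("Jane Doe", "WORKS_AT", "Acme"),
    ("J. Doe", "WORKS_AT", "ACME Corp."),
    ("jane doe", "WORKS_AT", "Acme Corporation"),
]
edges = build_edges(raw)
print(edges)
```

Without the resolution step, the three extractions would produce up to five distinct nodes; with it, they collapse into a single edge between two nodes.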

Comments • 17

  • @johnkintree763 2 days ago +1

    Excellent presentation. Neo4j has done a great job of integrating vector embeddings with knowledge graphs. The Neo4j LLM Knowledge Graph Builder, which extracts entities and relationships from video transcripts, Wikipedia articles, and PDF files, is impressive. It also merges the extracted knowledge into a graph structure and then provides an interface to query the LLM about knowledge in the graph.

  • @kakashisensie100 4 days ago +12

    Amazing video, really. And jokes are definitely necessary. 🤓

  • @ronifintech9434 3 days ago +2

    Thanks for the knowledge (graph) sharing :)

  • @johannesdeboeck 4 days ago +1

    Awesome indeed 😍 Thank you!

  • @alv9551 1 day ago

    omg I was trying to hold it together at the introduction when I saw the large soup bowl but I broke down when it said "whatever, y'all get replaced by AI in 5 yrs" 😂 Great video overall. I hope I am right about the size of the soup bowl and you are not 3 feet tall.😅

  • @tomazbratanic5502 4 days ago

    Awesome!

  • @alchemication 3 days ago

    Hey. Thx for the awesome content. Would it be possible to actually show a fully working large-scale graph (like a proper prod-scale thing), and also discuss the pros and cons of the approach, and when you found KGs working well and not so well? The reason I ask is that my own KG experiments worked perfectly for me at smaller scale, but then the speed was so slow that it was killing my M2 Mac altogether.

  • @mshonle 4 days ago +1

    Once the entities are extracted, can the model then be prompted to write a graph query (that could then be executed)? I’m thinking in particular of the “knowing A is B but not B is A” problem, such as when you ask an LLM “who is Mary Pfeiffer’s son?” and it does not say “Tom Cruise”, but it can answer “who is Tom Cruise’s mother?” just fine.
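
One way to picture why a graph sidesteps the reversal problem raised in this comment: a relationship is stored once and can be followed from either end. The sketch below is a toy in-memory stand-in, not Neo4j and not the code from the video; the `TinyGraph` class, its method names, and the `PARENT_OF` label are hypothetical, and the names are taken from the comment as written.

```python
from collections import defaultdict


class TinyGraph:
    """Toy directed graph: each edge is added once but indexed in both directions."""

    def __init__(self):
        self._out = defaultdict(set)  # (source, relation) -> targets
        self._in = defaultdict(set)   # (target, relation) -> sources

    def add_edge(self, source, relation, target):
        self._out[(source, relation)].add(target)
        self._in[(target, relation)].add(source)

    def targets(self, source, relation):
        """Follow the relation forwards from source."""
        return sorted(self._out.get((source, relation), ()))

    def sources(self, target, relation):
        """Follow the relation backwards from target."""
        return sorted(self._in.get((target, relation), ()))


graph = TinyGraph()
graph.add_edge("Mary Pfeiffer", "PARENT_OF", "Tom Cruise")

mother = graph.sources("Tom Cruise", "PARENT_OF")   # "Who is Tom Cruise's parent?"
son = graph.targets("Mary Pfeiffer", "PARENT_OF")   # "Who is Mary Pfeiffer's child?"
print(mother, son)
```

Both questions are answered from the same stored edge, so the direction in which the fact was originally phrased no longer matters once it is in the graph.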

  • @trinityblood5622 3 days ago +1

    Hi, it would be great if you could make longer videos explaining how you did each of these steps, for example entity extraction, relationship extraction and so on, and then how you did the Neo4j integration. Maybe you can cut a short video like this from the longer one to attract customers, while the long video would still serve as a promising direction for developers and researchers. I love the output produced by your system, but there’s no way to reproduce what you are doing. Reproducibility is a major concern in KGC.

  • @viky2002 1 day ago

    How does this compare to the LlamaIndex property graph?

  • @V0v1kkk 4 days ago

    Diffbot doesn't sound so good for privacy.
    What is the role of Diffbot? Entity extraction?
    Are there any self-hosted replacements?

  • @ignaciopincheira23 4 days ago

    Hi, could you convert complex PDF documents (with graphics and tables) into an easily readable text format, such as Markdown? The input file would be a PDF and the output file would be a text file (.txt).

    • @joshuajose8598 3 days ago +1

      Maybe the pages have to be turned into images using Poppler, and then you could use an LLM that accepts image inputs, like GPT-4 Vision or Claude 3, along with function calling to get the entities, objects and relations.
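
The Poppler step suggested in this reply could look roughly like the sketch below. It only builds, and optionally runs, a `pdftoppm` command (Poppler's PDF-to-image tool); the file names, the helper names, and the 150 DPI default are assumptions, and passing the resulting images to a vision model is left out.

```python
import shutil
import subprocess


def pdftoppm_command(pdf_path: str, out_prefix: str, dpi: int = 150) -> list[str]:
    """Build a pdftoppm call that renders the pages of a PDF to PNG files."""
    return ["pdftoppm", "-png", "-r", str(dpi), pdf_path, out_prefix]


def render_pages(pdf_path: str, out_prefix: str, dpi: int = 150) -> bool:
    """Run pdftoppm if it is installed; return False when it is not on PATH."""
    if shutil.which("pdftoppm") is None:
        return False
    subprocess.run(pdftoppm_command(pdf_path, out_prefix, dpi), check=True)
    return True


# Hypothetical input file; only the command is built here, nothing is executed.
cmd = pdftoppm_command("report.pdf", "page")
print(" ".join(cmd))
```

Calling `render_pages("report.pdf", "page")` would write one PNG per page using the `page` prefix, provided Poppler is installed and the file exists.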

  • @darkmatter9583 3 days ago

    Be my teacher and advisor, please connect.

  • @jackbauer322 4 days ago

    Too much information on the screen at once ... really painful to follow ... be more sober and straight to the point. Jokes are not necessary.