Summary
In this episode of the AI Engineering podcast, Philip Rathle, CTO of Neo4j, talks about the intersection of knowledge graphs and AI retrieval systems, specifically Retrieval Augmented Generation (RAG). He delves into GraphRAG, a novel approach that combines knowledge graphs with vector-based similarity search to enhance generative AI models. Philip explains how GraphRAG works by integrating a graph database for structured data storage, providing more accurate and explainable AI responses, and addressing limitations of traditional retrieval systems. The conversation covers technical aspects such as data modeling, entity extraction, and ontology use cases, as well as the infrastructure and workflow required to support GraphRAG, setting the stage for innovative applications across various industries.
Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Philip Rathle about the application of knowledge graphs in AI retrieval systems
- Introduction
- How did you get involved in machine learning?
- Can you describe what GraphRAG is?
- What are the capabilities that graph structures offer beyond vector/similarity-based retrieval methods of prompting?
- What are some examples of the ways that semantic limitations of nearest-neighbor vector retrieval fail to provide relevant results?
- What are the technical requirements to implement graph-augmented retrieval?
- What are the concrete ways in which the embedding and retrieval steps of a typical RAG pipeline need to be modified to account for the addition of the graph?
- Many tutorials for building vector-based knowledge repositories skip over considerations around data modeling. For building a graph-based knowledge repository there obviously needs to be a bit more work put in. What are the key design choices that need to be made for implementing the graph for an AI application?
- How does the selection of the ontology/taxonomy impact the performance and capabilities of the resulting application?
- Building a fully functional knowledge graph can be a significant undertaking on its own. How can LLMs and AI models help with the construction and maintenance of that knowledge repository?
- What are some of the validation methods that should be brought to bear to ensure that the resulting graph properly represents the knowledge domain that you are trying to model?
- Vector embedding and retrieval are a core building block for a majority of AI application frameworks. How much support do you see for GraphRAG in the ecosystem?
- For the case where someone is using a framework that does not explicitly implement GraphRAG techniques, what are some of the implementation strategies that you have seen be most effective for adding that functionality?
- What are some of the ways that the combination of vector search and knowledge graphs are useful independent of their combination with language models?
- What are the most interesting, innovative, or unexpected ways that you have seen GraphRAG used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on GraphRAG applications?
- When is GraphRAG the wrong choice?
- What are the opportunities for improvement in the design and implementation of graph-based retrieval systems?
Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
- Neo4j
- GraphRAG Manifesto
- RAG == Retrieval Augmented Generation
- VLDB == Very Large DataBases
- Knowledge Graph
- Nearest Neighbor Search
- PageRank
- Things Not Strings (Google Knowledge Graph blog post)
- pgvector
- Pinecone
- Tables To Labels
- NLP == Natural Language Processing
- Ontology
- LangChain
- LlamaIndex
- RLHF == Reinforcement Learning from Human Feedback
- Senzing
- NeoConverse
- Cypher query language
- GQL query standard
- AWS Bedrock
- Vertex AI
- Sequoia Training Data - Klarna episode
- Ouroboros
[00:00:05]
Tobias Macey:
Hello, and welcome to the AI Engineering podcast, your guide to the fast-moving world of building scalable and maintainable AI systems. Your host is Tobias Macey, and today I'm interviewing Philip Rathle about the application of knowledge graphs in AI retrieval systems, also known as RAG. So, Philip, can you start by introducing yourself?
[00:00:30] Philip Rathle:
Yes. Great to meet you, Tobias, and good to be here today. I'm CTO of Neo4j. I joined the company in 2012 as the first product hire. I've been on the graph journey a long time, inspired to take it after, oh, probably close to 20 years of working with data and databases.
[00:00:51] Tobias Macey:
And do you remember how you first got involved in the data and machine learning space?
[00:00:56] Philip Rathle:
Yeah. So I got involved with data long before I got involved with machine learning, and it was really my first job coming out of school. I was working on a software project, and I would write queries, and then we would go to this DBA who would work some incomprehensible magic, and then the query would suddenly run a hundred times faster. I was like, wow, what's going on there? And then on my next project, I ended up as a consultant doing data modeling, which got me to realize the power of data and structure and how data is a reflection of all these things happening in the business.

And then from there, I ended up apprenticing with this kind of boutique consultancy that nobody's ever heard of. It's called Tanning Technology. But in the late nineties and early 2000s, they were one of the hot small consultancies that I view as being like the Wolf from Pulp Fiction, except for what we used to call VLDB at the time, very large databases. So I cut my teeth there, and, yeah, I just really fell into data early and fell in love with it because of what data is and does: it's effectively a reflection of the business, and it can amplify it. And you've got these great feedback loops, ultimately, with machine learning and gen AI these days.
[00:02:24] Tobias Macey:
Recently, you published a blog post on the Neo4j site called the GraphRAG Manifesto. And as I alluded to, there's this concept of RAG, or retrieval augmented generation, that's been gaining a lot of attention recently with the growth of generative AI models, and knowledge graphs have been around for many years as well. This post talks about bringing the two together for better overall performance. I'm wondering if you can just start by giving an outline of what GraphRAG is and some of the capabilities that it offers beyond just the vector-based similarity search approach of retrieval augmented generation.
[00:03:05] Philip Rathle:
Sure. So, you know, probably everyone here listening is familiar with retrieval augmented generation. A user asks a question, or some agent asks a question, and before sending it to the LLM, you run some sort of query against your internal system that has information about that particular domain, information the LLM might not have been trained on, or that might involve some richer, more detailed question. And vector-based RAG is the initial and most common technique for doing this. It's really handy. It's easy to build. And vectors, or vector databases, are often talked about as being external memory.

And there are a bunch of ways that I look at GraphRAG. Maybe I'll start by saying, well, if my external memory is functioning in some way analogous to that of a human brain, then how is memory stored and how does the brain operate? Well, you probably want something that operates somewhat like the brain, or at least you can see how that might add value beyond something that has a simpler, more straightforward structure. And so vector-based RAG is kind of an easy way to get a certain distance up the value chain.

But, you know, if you really wanna take it to the next level, then use something that is structured like the brain, i.e., structure your data as a graph, where your data processing runs in the same way that the processing in the brain runs, namely by sending signals through those structural circuits and coming back with an answer. You can see how that has some at least poetic appeal. That's one way to look at it. Obviously, there are more rigorous, experiential ways people have come to these conclusions. But at a high level, GraphRAG is just including a call to a graph database that has a graph of your data, so a knowledge graph: whatever data is the subset of the world model that is relevant to your question, you ask it questions, and return that to the LLM, which can then go on its merry way and give you more accurate answers. It also helps with explainability, and you can constrain what you're bringing back through some nice security rules because the data is more structured. So it's kind of a next level from what's possible with vectors, and I see the two as very complementary.
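The flow Philip describes, vector retrieval plus a structured graph lookup feeding the LLM, can be sketched in a few lines of Python. Everything here is a toy stand-in: the embeddings, chunk IDs, and one-hop graph are invented for illustration, not a real Neo4j or framework API.

```python
# Minimal GraphRAG sketch: a vector search finds candidate chunks, then a
# graph lookup adds the structured neighborhood of each hit, and both are
# handed to the LLM as grounding context.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "vector store": chunk id -> embedding
chunks = {"doc1": [1.0, 0.0], "doc2": [0.9, 0.1], "doc3": [0.0, 1.0]}

# Toy "knowledge graph": chunk id -> related entities, one hop out
graph = {"doc1": ["RouterX", "FirmwareV2"], "doc2": ["RouterX"], "doc3": ["PrinterY"]}

def graph_rag_context(query_vec, k=2):
    # 1. Vector retrieval: top-k nearest chunks by cosine similarity
    top = sorted(chunks, key=lambda c: cosine(query_vec, chunks[c]), reverse=True)[:k]
    # 2. Graph step: pull the structured neighborhood of each hit
    entities = sorted({e for c in top for e in graph[c]})
    return top, entities
```

Both the chunk hits and the related entities would then be formatted into the prompt before calling the model.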
[00:05:49] Tobias Macey:
One of the challenges that I've seen come up in the context of this nearest neighbor search, the kind of binary retrieval of vector stores where you either retrieve something or you don't, is that just because something is the nearest conceptual semantic item doesn't necessarily mean that it's actually relevant. Because if you have a very sparse matrix or a very sparse vector space, then you could be near to something, but it's equivalent to saying, well, I'm living in Vermont and so I'm near to Colorado, where if I have a full map of the United States, then the nearest thing is actually gonna be either New York or New Hampshire.
And I'm wondering if you can talk to some of the ways that having that more nuanced structural view of the data represented by the graph can address some of those shortcomings of sparsity in the vector space or whether that's just an inherent limitation of not having enough data. And maybe it allows you to even just say, this isn't anywhere near. It's too many hops away, and so I'm actually just not gonna return anything.
[00:06:54] Philip Rathle:
Yeah. This is all true. Just because an answer is the most frequent answer doesn't necessarily mean that it's the right answer. Oftentimes it is, but not always. And when the stakes are higher, i.e., when having a more refined or a better answer a higher percentage of the time has business value, then actually looking beyond just frequency can create a lot of value. And so one example, to pull in one kind of graph technique: let's say I have a body of customer support documents, which is a pretty common example and kind of implementation.

And let's say I wanna do a vector search against some kind of question a user has, like "my router's broken, here are some details," against a particular answer. You might have hundreds of answers looking at public knowledge bases, but some of those may involve older versions of the router, where for the latest version the answers are sparse because it just came out, and it turns out the resolution is different because, hey, as a vendor, the vendor fixed this problem, and the way to address it is totally different. And so in that case, it helps to have a structure that you can reach out to that might run a calculation, like: what is the case with the most inbound resolved dispositions from a customer service agent, or from other users in a public knowledge base for that matter, for this particular model of thing?

So getting to a particular model is something that you can do more easily in a graph, sort of narrowing down and getting more specificity, but then also having access to other algorithms. The example I just gave is centrality; PageRank is a good example of centrality. And I'd say more broadly, if you want an analogy, Google and web search are actually pretty good analogies of this, where back in the day they were surfacing results on the worldwide web based on text matching. Now, vector indexing is a little bit more sophisticated than just string search, which is what they were doing at the time. But the analogy sort of carries through: in order to take it to the next level, they needed to go and build a knowledge graph, and they had this great blog post called "Things Not Strings" in 2012 announcing Google's knowledge graph. And the idea was, by treating my data as mere strings, I'm missing out on understanding the richness of what those things actually represent. So that's maybe an analogy-based way of describing the same thing.
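The centrality re-ranking idea, preferring the answer most referenced by resolved cases, can be sketched with a tiny power-iteration PageRank. A graph database would run an equivalent algorithm natively; the edge list below is invented data, and the implementation is a minimal illustration rather than a production algorithm.

```python
# Re-rank candidate answers by PageRank centrality: answers that many
# resolved support cases point to rise to the top.

def pagerank(edges, damping=0.85, iters=50):
    nodes = {n for e in edges for n in e}
    out = {n: [dst for src, dst in edges if src == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n]
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its rank evenly across all nodes
                for t in nodes:
                    new[t] += damping * rank[n] / len(nodes)
        rank = new
    return rank

# Edges point from a resolved support case to the answer it cited.
edges = [("case1", "answerA"), ("case2", "answerA"),
         ("case3", "answerA"), ("case4", "answerB")]
rank = pagerank(edges)
```

With three cases citing answerA and one citing answerB, answerA scores higher and would be returned first.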
[00:09:48] Tobias Macey:
Another interesting aspect of the space of retrieval augmented generation is that a lot of the examples are very naive and simplistic: oh, you just throw your document at this embedding function, it generates the vector representation, you store that in the database, and then when you go to retrieve something semantically proximate, you get the result back. But many of these vector databases also have capabilities to add additional metadata and apply filtering rules to the retrieval. Another space is the idea of vector indexing in other, maybe relational, databases, pgvector being probably the most well-known example, where you have proximity to that tabular structure so you can use that for data enrichment.

I'm wondering if you can talk to some of the ways that this graph representation differs from those capabilities of either filtering on the retrieval path in a vector store or being able to do relational joins in something like Postgres, where the graph structure itself is inherent to being able to solve a particular problem.
[00:10:57] Philip Rathle:
Yeah. So if you look at the vector-only databases, or the vector-native databases, the Pinecones of the world and the like, to your point, they're adding capabilities to add metadata and do filtering. And I think if you extrapolate that over time, in 10 years, if you follow that track, you're gonna end up with some sort of much richer data model than just vectors. And it might look like a graph. Who knows? It could look like relational, I guess. It's an implementation choice which model to follow. At the simplest level, the analog in either a graph database or a relational database is just adding additional attributes, and those filters are effectively not even one hop out. They're attributes of the particular thing.

And in a graph, you're able to do that. You can have properties on nodes and labels on nodes, properties on relationships, types on relationships, directions. However, you can also step out one level, two levels, three levels, and either run computation on that or just do pattern matching as part of your querying. So then the question we asked ourselves when we saw vectors coming out is, okay, how easily could we add vectors to our database, much in the same way that Postgres, to your point, and others have added it? Turns out it wasn't that hard to add basic vector capability. There's lots of good open source stuff out there you can add in. And if you've got a database management system, this isn't that difficult a thing to add on.

And you can do a lot more with a richer data model than with just vectors. So I think if you're doing vector-only and it's a good enough use case and that's working, then that's great. Then maybe it actually is the right thing at scale to use a dedicated vector-only database that can't do anything else, or maybe can do minimal filtering, for whatever economic or other reasons. But if you're doing anything where you wanna start going a little bit farther down the rabbit hole of using the richness of the data to your advantage, then that's where having a database that supports a richer model than vectors, be it Postgres or Neo4j, can add a lot of value. What do graphs add that relational doesn't? A few things, but I think in this context, schema flexibility is a pretty big one: being able to take whatever data you want to bring in, or subset of data, bring it in, experiment. Oh, I want more stuff?

Okay, well, I'm just gonna bring that in. It's much easier with a graph. You can ingest different kinds of data and new structures really easily. They just become new kinds of nodes and new kinds of relationships. Whereas with a relational database, there's a schema change, and a schema migration if there's already data in it, and it can get pretty onerous, especially in production, to add and change the data you've got.
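The schema-flexibility point can be illustrated with a minimal in-memory property graph: a brand-new relationship type is just appended data, with no migration step. This is a toy stand-in for a graph database, not Neo4j's API; the node keys and relationship types are invented.

```python
# A tiny property graph: nodes carry labels and properties, edges carry a
# relationship type. Adding a new relationship type requires no schema change.

graph = {"nodes": {}, "edges": []}

def add_node(key, label, **props):
    graph["nodes"][key] = {"label": label, **props}

def add_edge(src, rel_type, dst):
    # New relationship types are just new data, appended like any other edge
    graph["edges"].append((src, rel_type, dst))

add_node("p1", "Person", name="Philip")
add_node("c1", "Company", name="Neo4j")
add_edge("p1", "WORKS_AT", "c1")
# Later: a relationship type nobody planned for, no migration required
add_edge("p1", "AUTHORED", "c1")
```

In a relational schema, that second relationship type would typically mean a new join table and a migration; here it is one line.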
[00:14:17] Tobias Macey:
Another aspect of this concept of filtering or graph traversal is the way that you think about modeling the data. As I mentioned, a lot of the tutorials are very straightforward: you just feed your document into this embedding model, maybe do some chunk-size tuning, or maybe let the model do that for you, and then you pull the data back out. As we discussed, some of these different vector stores are adding more nuanced capabilities around metadata, various attributes that you can attach to those documents. If you're building a knowledge graph, that can be an entire project in and of itself, and data modeling is obviously very important.

And I'm curious if you can talk to some of the ways that you need to be thinking about that data modeling approach: how you want to represent the data, what types of relationships you're trying to store and retrieve as you're building out that AI application, and some of the ways that data modeling becomes a more first-class concern in this GraphRAG approach?
[00:15:25] Philip Rathle:
Yes. The really appealing and very powerful thing with vectors is you don't need to think much. Yeah, you need to think a bit, like what documents and chunk size and things like that, but it's pretty much an autopilot kind of process. You use your favorite gen AI framework, it'll chunk things up, and you're off to the races. And with graphs, as you point out, you need to think through the model. This has always looked like a pretty big obstacle, and I think in practice it can be, especially if I have unstructured data: how am I gonna map that to the graph?

And the good news is all these technologies we're talking about actually make it much easier to get data into a graph. This is a new and pretty fast-moving space that I'm really excited about. But let's take the two separate cases. You're either bringing data in from some structured format, and that's usually a relational database, or you're bringing data in from unstructured text, or you may have both. Oftentimes in enterprise applications, you have your unstructured text, but then you have your reference table somewhere with, you know, what are the keywords, what's the product hierarchy, who are the customers, etcetera. And then you wanna tie those things together.

In the structured case, it's actually become fairly easy and straightforward to map to a graph. There's this technique called tables to labels, where every row in a table, if it's a thing table, turns into a node with a label of that thing. And then the columns in that table just become properties on that node, and the primary key becomes a node key. So that's a straight mapping, and there's increasingly good tooling to just bring that stuff across. Neo4j's had some for a while, and we're surfacing it in more and more places. Likewise, your join tables or your primary/foreign keys become relationships.

So each row becomes a relationship object, and if it's attributed, you end up with properties on that relationship. The tricky part is unstructured data: alright, how does this even turn into a graph? And it turns out you can use LLMs for this. Back in the day, people did various kinds of NLP techniques. Those are still relevant and useful, but LLMs bring a whole new level of power to the game. Now, what LLMs don't necessarily know is the terminology in your domain, well enough to know what to extract. But the good news is you probably have that somewhere. You probably have a table somewhere, like I said, with your terminology and all of your product names and so on. So you can then refer to that as part of your entity extraction.

And there are a bunch of companies out there taking this on, like startups specializing in entity extraction for particular domains of data, etcetera. Neo4j has been building tooling around it. Others have tooling around it. And if you look at LangChain, LlamaIndex, Haystack (Diffbot is another one), they all have tools for doing this that are a bit rudimentary now but are just gonna keep getting better.
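The "tables to labels" mapping can be sketched directly: rows become labeled nodes, columns become properties, and a primary/foreign key pair becomes a relationship. The plain-dict output format and the helper name below are illustrative, not the API of any real import tool.

```python
# Sketch of tables-to-labels: each row of a "thing table" becomes a node
# labeled with the table name; columns become properties; a foreign key
# column becomes a relationship to the referenced node.

def tables_to_labels(table_name, rows, pk, fk=None):
    nodes, rels = [], []
    for row in rows:
        # Foreign key columns become relationships, not node properties
        props = {k: v for k, v in row.items() if k != fk}
        nodes.append({"label": table_name, "key": row[pk], "properties": props})
        if fk and row.get(fk) is not None:
            rels.append({"from": row[pk], "to": row[fk], "type": f"HAS_{fk.upper()}"})
    return nodes, rels

orders = [{"order_id": 1, "total": 40, "customer_id": 7}]
nodes, rels = tables_to_labels("Order", orders, pk="order_id", fk="customer_id")
```

A join table would map the same way, except each row produces only a relationship, with any extra columns landing as relationship properties.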
[00:19:10] Tobias Macey:
On that topic of entity extraction and entity resolution, as you said, that has long been one of the more complicated aspects of building knowledge graphs and maintaining and evolving them. Layered on top of that is the concept of ontologies and taxonomies, where if you want to be a little more sophisticated, you can have hierarchical relationships of what those different things represent and be able to do kind of ontological traversals: this is a man who, you know, is a member of a family; he is the father. There's a woman who is a member of the family; she is the mother. And so you can then extrapolate those ontological concepts into parents and presume from that ontological relationship that there are children. And I'm wondering if you can talk to some of the ways that these LLM entity extraction techniques are able to take advantage of those ontological relationships, and some of the considerations around selecting or building your own ontologies and taxonomies to build into these graphs.
[00:20:17] Philip Rathle:
Yeah. I think this ends up being one of the strong arguments for GraphRAG having value: the fact that knowledge naturally shows up in these kinds of hierarchies. You know, I have a chair, and a chair is part of a dining set, and a dining set is part of household furniture. And maybe you're Home Depot or Lowe's or IKEA, in the business of selling these things. I guess if you're IKEA, it's really important to get each subcomponent inside: here's the leg component inside of the chair, and here are the screws.

Knowledge shows up this way, but a lot of company data shows up this way too: product catalogs show up this way, promotions show up this way. The fact that knowledge organizes itself as a graph ends up meaning that, well, if I wanna decode what a particular string means, what is this thing, it's not enough to just decode a string into another string, for language translation or something. I need to actually understand how the thing rolls up into this conceptual hierarchy, or, as you say, an ontology or a taxonomy. On the analytic side, this has obvious applications if I'm trying to do aggregations at various levels, that kind of thing. But also, people conversationally engage at different levels: you might ask a question about chairs at one moment and about household items at another moment, or likewise with photos and collections and collections of collections, or whatever it might be. So having a graph of your terminology and of your ontology, and then how that relates to the physical things that you're dealing with, can be really powerful, and actually that's a pretty common use of GraphRAG.
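The roll-up idea can be sketched as a walk up a "part of" hierarchy, so a question asked at one level can be resolved against any ancestor level. The taxonomy data below is invented for illustration.

```python
# Toy taxonomy: each term points to the thing it is part of.
taxonomy = {
    "chair": "dining set",
    "dining set": "household furniture",
    "screw": "leg component",
    "leg component": "chair",
}

def ancestors(term):
    # Follow "part of" edges up to the root of the hierarchy
    chain = []
    while term in taxonomy:
        term = taxonomy[term]
        chain.append(term)
    return chain
```

So a question about screws can be aggregated under chairs, dining sets, or household furniture, depending on the level the user is asking at.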
[00:22:11] Tobias Macey:
Now digging into the technical implementation of a GraphRAG system, can you talk us through the workflow of building the embeddings and figuring out how to associate a given vector representation of an entity with an appropriate node in the graph? And then on the retrieval side, what the search function looks like, some of the ways that you augment the nearest neighbor approach with the graph structures, and just some of the foundational infrastructure that's required to support these graph representations collocated with or adjacent to the vector data for that semantic retrieval?
[00:23:01] Philip Rathle:
Yeah. There are a few different patterns. One pattern is: I'm gonna retrieve my vectors first. Let's say I have a hundred matches, and I'm in this customer support kind of use case. Then I want to post-filter the vector results based on some sort of graph query. That graph query could look at an ontology. It could say, alright, now these vectors are hanging off of these customers who own this particular device, and I wanna narrow it down based on the specific device that they own. Or it could be doing a ranking based on the PageRank/centrality example I gave earlier: what's the most referenced document by customer service agents, or whatever kinds of inbound relationships you're looking at. So that's one pattern: bring back your vectors first and then filter based on the graph. It could also be a security-related filter. Hey, who's the person asking the question?

Do they have rights to see this particular thing? Another pattern is you start with a graph search: let's look for a particular pattern. And where you might do this is things that are really outside the domain of both LLMs and similarity search. Let's take supply chain risk. I met with a customer recently who's doing this. Supply chain risk involves doing some basic math across multiple levels in the supply chain, rolling that up, and then doing some risk calculation based on that. Both the multilevel aspect and the math aspect are not in the sweet spot, to say the least, of either of those technologies, but they're pretty trivial in a graph. So there, you would run your graph query, and there may or may not be a vector component to that particular query, and then return that to the LLM.

Another kind of pattern is something I think of as giving you informed creativity, where, hey, I can't really get my entire result from the graph or through a vector. And so I'm going to take a large part of the graph. Let's say I have a billion-scale graph, and you're gonna bring back a thousand nodes and relationships surrounding that particular thing that you're looking at. You can easily turn a graph into natural language: you just take the thing nodes, and then the relationship becomes a verb, and there's a thing at the other end. So it's like, Philip has, I don't know, a cup.

And that shows up in the graph in a way that can easily be transcribed into English or the language of choice. You can then send that set of sentences to the LLM's context and say, hey, here's the situation. That then becomes a background prompt, or context for the prompt, and the user can go at it and ask questions relative to all the data I just dumped in. So that's another use. In that case, your knowledge graph becomes more of an assist.
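The graph-to-natural-language step Philip describes is almost literal: each (subject, verb, object) triple in the retrieved neighborhood becomes a short sentence that can be dumped into the LLM's context window. A minimal sketch, with invented triples:

```python
# Verbalize a retrieved graph neighborhood as plain sentences for the
# LLM's context: node, relationship-as-verb, node.

def verbalize(triples):
    return " ".join(f"{s} {v} {o}." for s, v, o in triples)

neighborhood = [("Philip", "has", "a cup"), ("Philip", "works at", "Neo4j")]
context = verbalize(neighborhood)
```

The resulting string would be prepended to the prompt as background, letting the user ask follow-up questions against that slice of the graph.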
[00:26:26] Tobias Macey:
And in terms of the infrastructure to support a GraphRAG approach, not necessarily collocating in the same storage layer, but at least collocating in terms of physical proximity for reducing latency and reducing the number of round trips, I'm wondering what are some of the useful patterns people are using to build their embeddings, build the graph, and be able to use them to interact with each other through the LLM as the mediator.
[00:27:03] Philip Rathle:
I guess you have 2 areas where you have patterns. For 1st, there's the graph creation and ingestion. And oftentimes, that's not a one time thing. Like, your graphs are living things. And if, you know, if it's reflecting some system in the real world, you know, you you wanna get the changes in. And then on the other side, you have, you know, what what's your pattern for actually running, GraphRag. So on the first, the pattern is, you know, using a set of tools which, the the latest, you know, Gen AI frameworks, like, again, Langchain, Lava index, and so on, often you usually have hooks. Oh, those all have hooks into being able to take unstructured text, bring it into the graph, calling an LLM.
Now depending on how much of the data you have is in structured form. As I said, you can you can really improve the quality of your unstructured extraction by having structured data that represents your taxonomy or ontology and, you know, so standard patterns for that. So that that's on the construction side. And then on your and and the one thing I'll add is if if you don't already have a taxonomy to refer to and your data is high stakes, and you need, like, 100% accuracy with your with some portion of your knowledge graph. Oftentimes, you don't with your entire knowledge graph just like you don't with vectors, and so you can get good enough with automated extraction.
But but in certain cases, I've seen situations where you there's a part of the graph that needs to be curated much like so you have early HF, right, which applies to models. Here you have, knowledge graph curation with with human feedback where, you know, if if your model is getting things wrong, then having some tooling and there's nothing really off the shelf now. So this tends to be things that people build for doing validation of the data in your knowledge graph. And then, effectively it becomes a sub training task for a model agent, which then gets better as you go at extracting your particular data and putting it in the right place.
Sometimes you want entity resolution tools as well, including off-the-shelf ones. I'll put in a plug for Senzing, which is one that's lightweight and quite good for company entities, where you have name, address, and registration data, and for people information. That can be helpful. All right, now switching to what the stack looks like on the GraphRAG side: that typically ends up being some combination of your framework of choice and then either an LLM or some other technology to generate graph queries from unstructured text. We recently open sourced a technology that does this called NeoConverse, which lets you have a conversation with a graph.
LLMs are getting better and better at writing Cypher, which is the de facto query language for graphs, used by Neo4j but also many others, and which, by the way, is 80 to 90% of the way there towards the GQL standard, ISO's new query language standard and a sibling to ISO SQL. So you have ISO SQL and ISO GQL, one for the relational model, one for graphs. We can touch on that if you want to loop back to it. And so there are two basic ways you would query your graph. One is to have some kind of translation layer that takes a question and translates it into a graph query.
Another is: if I have some bounded variety in the questions being asked, let's say any question is going to fit some template and it's going to be one of 20 kinds of things, then you can actually pre-create a set of queries, extract the terminology and variables from the question, map it to the graph query, run the query, and give back the result. That's another pretty common way. And then in cases where there's a lot of variety in the way a question can be asked, but less variety in the number of distinct questions, you can actually do a vector search to find the most likely query to match up against the question, which is a pretty cool use of vectors. And then the last thing I'll add on the operational side is that you also have this whole thing at the tail end around audit, explainability, security, and so on.
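The template-based patterns just described can be sketched in a few lines of Python. This is a toy illustration: the two templates are invented, and a crude bag-of-words cosine similarity stands in for a real embedding model and vector index.

```python
# Match an incoming question to the closest pre-written Cypher
# template, then the application would bind parameters and run it.
import math
from collections import Counter

# Illustrative question templates mapped to Cypher query strings.
TEMPLATES = {
    "Which customers bought product $p?":
        "MATCH (c:Customer)-[:BOUGHT]->(:Product {name: $p}) "
        "RETURN c.name",
    "What products are often bought together with $p?":
        "MATCH (:Product {name: $p})<-[:BOUGHT]-(c)-[:BOUGHT]->(o) "
        "RETURN o.name, count(*) ORDER BY count(*) DESC",
}

def embed(text: str) -> Counter:
    # Stand-in "embedding": a bag of lowercase tokens.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_template(question: str) -> str:
    q = embed(question)
    return max(TEMPLATES, key=lambda t: cosine(q, embed(t)))

q = "what products do people usually buy together with $p?"
print(TEMPLATES[best_template(q)])  # the "bought together" query
```

A production system would swap `embed` for a real embedding model and `max` over all templates for a vector index lookup, but the shape of the pattern, similarity search over queries rather than over documents, is the same.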
And graph visualization can often be really helpful for understanding why a particular question was answered in a given way. To the degree that I have a graph query that pulls back information about my world model and sends it to the LLM, which then either takes that as the answer or further evolves the answer based on that raw data, that tells me a lot about how I got to that particular answer, which is a lot more than you can get by taking what an LLM says at face value, or even with a bunch of vector inputs. So visualization of your GraphRAG inputs and outputs can help with that, and it can also help your development process.
I've had users come to me and say, hey, just being able to see my vectors (in this case, they had done their vector indexing in Neo4j, in the graph), to visualize how they hang off of documents, how those documents relate to each other, and how they relate to the domain, can be really helpful in accelerating development. So there are lots of parts of the gen AI stack that are not standardized. But to the degree that GraphRAG is becoming, or has become, part of the gen AI stack, luckily, and through amazing coincidence, ISO just came out with a standard for graph querying that is largely a continuation of the Cypher query language, and we're committed, as is basically every graph vendor on the planet, to providing a smooth transition from Cypher to GQL. It's the first time in nearly 40 years that ISO has decided a data model is significant enough to base a new standard around. So that's notable and significant, I'd say, for the graph space and for anyone in the database space.
[00:34:13] Tobias Macey:
Another interesting outgrowth of the attention that generative AI models have brought is that they pushed vector databases and vector indexes more into the common parlance. Everybody knows what they are at this point, and so they have seen growth even just in the semantic search approach, beyond the actual retrieval and generation piece. And I'm wondering how this vector and graph combination is applicable in a similar context, independent of the language model or the generative AI application?
[00:34:53] Philip Rathle:
Oh, that's a good one. So you could look at vectors as a continuation along a spectrum that started with string search (exact match, leading edge and trailing edge, contains, that sort of thing), moved to full text, and now vectors are the next step: a very fuzzy, conceptually based version of full text, the very grown-up version of full text. And it's always been the case, since my very early days here at Neo4j, that people need the same kind of text-based tools, either string search or full text or both, and use them alongside the graph. In that case, I see the pattern where you do it all in the graph, because you can do both of those things inside of Neo4j. I also see the pattern where I've got tons of data in Elastic and I don't want to forklift that.
And the volumetrics are such that Elastic is dedicated, bespoke technology for handling gobs and gobs of text. That's great. But then you have the graph alongside it, and usually you'd have some hook, like a document ID, to link the two. And I think it's the same pattern for any case where I have text and I have structure in my documents, which you usually do. Even inside of a document you have structure: a particular chunk might be some text inside cell B1 of table 3 on page 36 of some document that is part of some collection, which is part of some overall filing.
And even that structural data is a graph. There's nothing about the domain there; it's just the structure, and it ends up being useful in certain cases. This was a surprise to me. I wouldn't have imagined it, but folks in the community started discovering this, and so it now has a name: the lexical graph, the graph of structure. That can be used in the LLM context. I don't know whether it can be used in a non-LLM context, but certainly all the ways in which you would use full text or string search along with a graph carry forward in the vector-based world.
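To make the lexical graph idea concrete, here is a minimal sketch of document structure, not domain content, modeled as nodes and PART_OF edges, so that a retrieved chunk can be traced back to the cell, table, page, and document it came from. The node names are invented for the example.

```python
# A tiny "lexical graph": each node points to the structural unit it
# is part of. A real system would store these as graph relationships.
PART_OF = {
    "chunk:42": "cell:B1",
    "cell:B1": "table:3",
    "table:3": "page:36",
    "page:36": "doc:annual-filing",
}

def provenance(node: str) -> list[str]:
    """Walk PART_OF edges up to the root, returning the full path."""
    path = [node]
    while path[-1] in PART_OF:
        path.append(PART_OF[path[-1]])
    return path

print(provenance("chunk:42"))
# ['chunk:42', 'cell:B1', 'table:3', 'page:36', 'doc:annual-filing']
```

This is the kind of traversal that lets a RAG application hand the LLM not just the matched chunk, but where it sits in the filing, which supports both better answers and explainability.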
[00:37:41] Tobias Macey:
As far as the overall ecosystem, as you mentioned, there are numerous frameworks that have sprung up around building AI applications, and many of them have built-in support for various vector engines. I'm wondering what you're seeing as the current and forward-trending support for this broader GraphRAG capability, and some of the ways that people are bolting graph capabilities onto the frameworks that don't yet have that out of the box?
[00:38:17] Philip Rathle:
Yeah. I'd say most of the major frameworks do, and those are rapidly evolving; we're actually working pretty closely with those framework providers. It's exciting for them, for our users, and for us. And then if I expand beyond that into different parts of the ecosystem, you definitely want more integrations with all the LLMs. So we've got integrations with Bedrock and Vertex AI and so on. We also recently announced better integrations with the different data providers.
So, good integrations to the degree that data is coming in from more analytic databases, like data lakehouses: integrations with BigQuery and Snowflake and Fabric. Something interesting we're doing with our Snowflake integration, which is in early access right now, is being able to pull data out of Snowflake inside Snowflake Container Services, run a graph algorithm, and then push the results back into a table. So you can do all of that within the Snowflake environment without having to move the data around. I think the next level of integrations is going to be along those lines: retaining your data gravity where and when you want it, and running your graph querying with a specialized graph technology alongside all the various things you're doing, from data storage and processing to the LLM itself, be it a foundation model or a series of small models.
[00:40:03] Tobias Macey:
In your experience of working at Neo4j, working in the space, and working with customers, I'm curious whether you have seen any broad trends as far as different families of language models, or different types of embedding models, that lend themselves more effectively to this graph interaction?
[00:40:28] Philip Rathle:
It's definitely the case that from one day to the next there will be some model that comes out that's better at writing Cypher. But it's such a game of leapfrog, and such a rapidly changing landscape, that I'd say not really. Probably the only generalization I'd make is one that a lot of your listeners have probably experienced themselves: oftentimes you start with the largest model possible that'll give you the best results, be it Claude 3.5 these days, or GPT-4o, etcetera.
And then once you have something working and have demonstrated that it can be done, it can often be more economical to break that up and move to smaller models, or to use the large models for just a particular part of your workload. To the degree that you're leaning more on RAG, be it graph or vector, and have the data yourself, it matters less which model you use. And the less you're able to depend on your own data, the more you need to depend on the foundation models, and the more value there is in going after a big one.
[00:41:42] Tobias Macey:
And in your experience of working in this space, surveying the GraphRAG landscape, and working with customers, what are some of the most interesting or innovative or unexpected ways that you've seen these ideas applied?
[00:41:55] Philip Rathle:
Let's see, I'll pick a few. One is a government tax authority that's doing anti-money-laundering work: running queries in the graph, but then formatting the results into a suspicious activity report using an LLM. In that case, the locus of reasoning is in the graph, because it involves numbers and multi-hop operations, but then you're using the model for what it's good at, which is language. Similar sorts of things show up in outbound email remarketing: I'm going to come up with the ideal recommendation for this person based on what they've bought, what other people have bought, what they've bought together, and what's in their shopping cart, and so on. That's a very graphy kind of query where you get amazing results in the graph. But then if I want to write an email, and I want that email to appeal to a person based on what I know about them, an LLM can do a much better job of coming up with the right language for whoever I'm targeting, whatever their walk of life. So that's another really clever combination of the two technologies.
This one's not surprising in itself, but where it's going is maybe surprising. There was a recent appearance on Sequoia's Training Data podcast by the CEO of Klarna, who spoke about how they use Neo4j as part of a company knowledge base: a gen AI agent that anyone can ask a question, and it'll come back with an answer. That answer could be related to business process, or HR policy, or a customer, or whatever it might be. So that's not a surprising use, but the surprising thing he said on that podcast is that they're beginning to realize that the fragmentation of core data across different systems and applications, like Salesforce and Workday and so on, is beginning to hamper their business. And so what they're actually doing is leaning into this centralized graph approach, using it increasingly as the system of record for the core things in their enterprise, and moving to, in some fashion, deprecate applications like Salesforce and Workday and the rest. Maybe they continue to use them, but in a more targeted, read-only fashion, while the system of record is actually this graph, which connects to everything.
Because there are these incredible network effects with graphs. If you bring together the data to solve, say, a recommendation use case, like the one I mentioned earlier, that might be 80% of the data you need to do fraud detection. And together, those might be 50% of the data you need for some other use case. And once I solve that, I solve three other use cases without even knowing it. So there are both data network effects and use case network effects. The way I'd characterize the surprising thing about that story is that it started as a gen AI use case, but once they began using it, they realized there are applications of graphs that go far beyond gen AI, that are quite fundamental to the business, and ultimately transformative to their IT infrastructure.
[00:45:39] Tobias Macey:
And in your experience of working at Neo4j, very closely in the graph data space, and exploring the applications of graph data to generative AI, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:45:59] Philip Rathle:
Okay. So one is this study people often refer to that was done years ago, I think by the Air Force, and I don't know if it's true or apocryphal, that an error made in the data modeling phase will be something like 100 times easier to fix if you catch it then than if you catch it later: in the build phase after your code's been written, then the test phase, then production. Each stage adds an order of magnitude more expense in solving a problem that was introduced earlier on. So one lesson is getting your model right.
Now, don't try to get your model too right early on when you're experimenting. But the more important, the more critical your application is to your business, and I guess you could measure this by proxy by asking, what happens if the system goes down for a minute, is that a fire-drill kind of moment, then for those kinds of applications, putting more thought, more attention, more study into how you actually design your application is one lesson. Another is that some things are counterintuitive with graphs. For example, when you load data, you actually want to create your indexes first, before you load your data. This is the opposite of what we do with relational databases, where you load your table and then create your indexes.
With relational databases, the index build can take advantage of a big, potentially parallel scan, and it happens much faster. With a graph, each record I'm adding, if I'm adding a relationship, is stitching your data together on insert. You could say you're indexing your data, or pre-joining your data, on insert. And so if you don't have an index, each time I try to add a relationship, I'm going to do a full node scan to find the node at either end, which means that by the time you've got a few hundred thousand records in, loading each record is going to get really, really slow.
You can avoid this by just creating your indexes ahead of time. It's a really common mistake, so that's a very down-in-the-weeds lesson learned. The other lesson, popping way back out, that is maybe relevant in the gen AI world: I've seen so many projects fail over the course of my career because they were purely technology driven and didn't have enough business sponsorship. Now, I think we as an industry are doing the right thing in this world of gen AI by being technology driven to a degree, because you have no idea what's possible business-wise until you play with the technology. So that seems appropriate.
But it's all too easy to then not look closely at the business, or not involve the business at the right point in time, and end up with something that's purely technology driven. I've seen some of these trends beyond individual projects, like the whole trend around Hadoop and data lakes: let's just get all the data into one place, not preprocess it, just dump it in there, and then we can figure out what to do with it. And obviously, that didn't work very well. So having the appropriate level of business partnership and involvement while experimenting with these technologies is, I think, an art. The right level of involvement depends on the situation and where you are in the cycle, but I'd probably err more on the side of involving the business than not. So that's my last lesson learned.
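The index-before-load lesson from earlier in this answer can be illustrated with a toy model. In Neo4j terms, this is why you would run your `CREATE INDEX` (or uniqueness constraint) statements before a bulk `MERGE` or `LOAD CSV` import. Below, a Python dict stands in for the index, and a list scan stands in for a full node scan; the node counts are arbitrary.

```python
# Why indexes come first when bulk-loading graph relationships:
# with an index (a dict here), each endpoint lookup is O(1); without
# one, each endpoint lookup is a full node scan, so loading n
# relationships costs O(n^2) overall.
def load_with_index(nodes, rels):
    index = {n["id"]: n for n in nodes}   # built once, up front
    lookups = 0
    for src, dst in rels:
        _ = index[src], index[dst]        # O(1) each
        lookups += 2
    return lookups

def load_without_index(nodes, rels):
    scanned = 0
    for src, dst in rels:
        for n in nodes:                   # full scan for the source
            scanned += 1
            if n["id"] == src:
                break
        for n in nodes:                   # full scan for the target
            scanned += 1
            if n["id"] == dst:
                break
    return scanned

nodes = [{"id": i} for i in range(1000)]
rels = [(i, (i + 1) % 1000) for i in range(1000)]
print(load_with_index(nodes, rels))     # 2000 constant-time lookups
print(load_without_index(nodes, rels))  # ~1,000,000 node visits
```

At a thousand nodes the gap is already three orders of magnitude, which is why, as noted above, an unindexed load grinds to a crawl after a few hundred thousand records.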
[00:50:06] Tobias Macey:
And for people who are intrigued by all of the enhanced capabilities that we've been discussing, what are the situations where you would advocate against GraphRAG, where GraphRAG is just the wrong choice?
[00:50:20] Philip Rathle:
I'd say: the more something is creative, the less something is high stakes, the more something is based purely on a document, or the more a language model just has all the information it needs because the data you're talking about isn't your own proprietary data, it's out in the wild, those factors all lean toward: maybe there's some way GraphRAG can add value, but it's certainly not low-hanging fruit, and maybe it won't add value. Then there's any case where the stakes are higher. By stakes I mean dollar value, but also health and human safety, regulatory exposure, brand and reputation, privacy, discrimination, those sorts of things. If it's the kind of application where those factors come into play, regulation being another, that starts to tip you over the edge of: I'm going to need to answer to someone, if not a regulator, then the person inside the company who is the throat to choke if something goes wrong.
And how are you going to convince that person that it's the right answer, and that the answer is good enough, enough of the time? Those end up being the cases where GraphRAG becomes more valuable. So my rule of thumb, personally, is stakes: the higher the stakes along any of those dimensions, the more useful GraphRAG ends up being. But it also depends on having a question that involves information proprietary to your business, or whatever endeavor you're trying to carry out.
[00:51:59] Tobias Macey:
And looking forward, what are some of the near to medium term either active improvements in the ecosystem or opportunities for improved support for these graph based systems in the AI application ecosystem that you're keeping an eye on?
[00:52:18] Philip Rathle:
So one is development of the frameworks and the integrations, which we talked a bit about. Knowledge graph construction is a huge one, and I'd say text-to-Cypher is a huge one. I think those are the two biggest, and those are all areas that are improving almost literally day by day. This is a pretty hot area for us, and I'd say for the larger ecosystem that's connecting into graphs from the other side.
[00:52:49] Tobias Macey:
Are there any other aspects of this concept of GraphRAG, the ecosystem around it, or the work that you're doing at Neo4j to help support it, that we didn't discuss yet that you'd like to cover before we close out the show?
[00:53:01] Philip Rathle:
I think the one thing I'll add is that, in addition to GraphRAG, there are other ways that knowledge graphs are being used and are useful. One is around storing your metadata and data lineage, because data ultimately is the foundation. This is probably, from what I can see, the thesis of your podcast, right? Data is the foundation for everything going forward. I mean, machine learning effectively takes what used to be declarative code, pushes it upstream, and makes it a data problem. And so understanding what your data is, what the quality and timeliness of each source is, how data moves across a company, what's been curated and approved by a data steward, all of these things end up being very important.
And so there's actually a long history, even pre gen AI, of Neo4j being used as a system of systems of sorts, to record how data moves through an enterprise, what the quality is in various places, and so on. Some of that is driven from a regulatory perspective, especially in the finance industry. And some of it is done through a more mature understanding, by, say, the chief data officer, that data is, if not the new rocket fuel, at least the new oil, and that there's really something to be gained in investing there. So maybe I'll end there.
Oh, let me add one more. This is something that was brought up by some of the framework providers recently: as agentic architectures get more intricate, they start to look like a graph, and you could actually see a need for a control graph. At small scale and simpler levels, a control graph can just be stored in a file or something. But picture a global application that needs access to the next series of things it's going to do in some complex agentic architecture; then having a database with a control graph that's replicated globally and is the source of truth could become a thing. I don't know how important this trend will be, or if it's going anywhere, but it's come up recently and I think it's kind of interesting.
[00:55:21] Tobias Macey:
Yeah. The idea of using the graph to drive the agent that queries the graph is definitely a very interesting chicken-and-egg problem. It'll be interesting to see how that all evolves.
[00:55:37] Philip Rathle:
Yeah. Kind of like the Norse snake eating its own tail. The ouroboros. Yes. Exactly.
[00:55:44] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology or training that's available for AI engineering and AI systems today?
[00:56:05] Philip Rathle:
So let me go a little bit more meta on this and say that the thing I'd really like to see is more energy spent toward understanding what's going on inside of an LLM. In a way, the neurobiology of how the decisioning is happening. We can certainly advance quite a long way treating it as a black box, as we are, putting all these controls around it, understanding the behaviors, and coming up with techniques like RAG and GraphRAG and prompting and fine-tuning and so on. But I really want to know the essence of what's happening inside, so that we don't have to treat it as a black box.
Because you can just imagine the kinds of improvements we could have if we understood that better, in terms of evolving how models are trained and what a model even is. And, heck, I could see models actually including GraphRAG internally in some fashion. Whatever the outcome is, I think we could gain a lot by turning attention inwards, toward the inside of an LLM, in addition to continuing to spend energy around it.
[00:57:22] Tobias Macey:
Yeah, I think that's definitely an interesting observation, particularly given the growth in recent years of graphs being applied in the deep learning context, with PyTorch having graph extensions to that framework, and the overall idea of graphs being integral to the construction of the neural nets that are used to build these models. I'll be curious to see how some of those ideas get brought into the development of these large language models and generative AI models.
[00:57:57] Philip Rathle:
Can't wait.
[00:57:58] Tobias Macey:
Alright. Well, thank you again for taking the time today to join me and share your thoughts and experience on this concept of GraphRAG. It's definitely a very interesting problem space, and it's exciting to see some of these graph structures and the idea of knowledge graphs being brought into the context of these vector retrieval systems. So I appreciate the time and energy that you're putting into helping to promote and educate around that, and I hope you enjoy the rest of your day. Thanks, Tobias. It's been fun.
[00:58:30] Tobias Macey:
Thank you for listening. And don't forget to check out our other shows: the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Hello, and welcome to the AI Engineering podcast, your guide to the fast-moving world of building scalable and maintainable AI systems. Your host is Tobias Macey, and today I'm interviewing Philip Rathle about the application of knowledge graphs in AI retrieval systems, also known as RAG. So, Philip, can you start by introducing yourself?
[00:00:30] Philip Rathle:
Yes. Great to meet you, Tobias, and good to be here today. I'm the CTO of Neo4j. I joined the company in 2012 as the first product hire. I've been on the graph journey a long time, having been inspired to take it up after probably close to 20 years of working with data and databases.
[00:00:51] Tobias Macey:
And do you remember how you first got involved in the data and machine learning space?
[00:00:56] Philip Rathle:
Yeah. So I got involved with data long before I got involved with machine learning, and it was really my first job coming out of school. I was working on a software project, and I would write queries, and then we would go to this DBA who would work some incomprehensible magic, and the query would suddenly run 100 times faster. I was like, wow, what's going on there? Then on my next project I ended up as a consultant doing data modeling, which got me to realize the power of data and structure, and how data is a reflection of all these things happening in the business.
And from there, I ended up apprenticing with this kind of boutique consultancy that nobody's ever heard of, called Tanning Technology. But in the late nineties and early 2000s, they were one of the hot small consultancies, which I view as being like the Wolf from Pulp Fiction, except for what we used to call VLDBs at the time: very large databases. So I cut my teeth there, and yeah, I fell into data really early and fell in love with it because of what data is and does. It's effectively a reflection of the business, and it can amplify it, and you've got these great feedback loops, ultimately, with machine learning and gen AI these days.
[00:02:24] Tobias Macey:
Recently, you published a blog post on the Neo4j site called The GraphRAG Manifesto. As I alluded to, there's this concept of RAG, or retrieval augmented generation, that's been gaining a lot of attention recently with the growth of generative AI models, and knowledge graphs have been around for many years as well. This is about bringing the two together for better overall performance. I'm wondering if you can start by giving an outline of what GraphRAG is and some of the capabilities it offers beyond just the vector-based similarity search approach to retrieval augmented generation.
[00:03:05] Philip Rathle:
Sure. So probably everyone listening is familiar with retrieval augmented generation. A user asks a question, or some agent asks a question, and before sending it to the LLM, you run some sort of query against your internal system that has information about that particular domain, which the LLM might not have been trained on, or which might involve some richer, more detailed question. Vector-based RAG is the initial and most common technique for doing this. It's really handy, it's easy to build, and vectors, or vector databases, are often talked about as being external memory.
And there are a bunch of ways that I look at GraphRAG. Maybe I'll start by saying: if my external memory is functioning in some way analogous to that of a human brain, then how is memory stored, and how does the brain operate? You probably want something that operates somewhat like the brain, or at least you can see how that might add value beyond something with a simpler, more straightforward structure. Vector-based RAG is an easy way to get a certain distance up the value chain.
But if you really want to take it to the next level, then use something that is structured like the brain, i.e., structure your data as a graph, where your data processing runs the same way processing in the brain runs: by sending signals through those structural circuits and coming back with an answer. You can see how that has at least poetic appeal. That's one way to look at it; obviously, there are more rigorous, experiential ways people have come to these conclusions. But at a high level, GraphRAG is just including a call to a graph database that has a graph of your data, a knowledge graph: whatever subset of the world model is represented in your question. You ask it questions and return the results to the LLM, which can then go on its merry way and give you more accurate answers. It also helps with explainability, and you can constrain what you bring back through some nice security rules, because the data is more structured. So it's a next level beyond what's possible with vectors, and I see the two as very complementary.
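At a high level, the GraphRAG flow described here can be sketched as a simple pipeline. Everything below is a stub standing in for real components: `text_to_cypher` for a translation layer or LLM, `run_graph_query` for a real Neo4j session, and `call_llm` for a foundation model. All names and data are illustrative.

```python
# Minimal GraphRAG pipeline sketch: question -> graph query ->
# structured facts -> LLM prompt grounded in those facts.
def text_to_cypher(question: str) -> str:
    # Stub: a real system would use an LLM or a template library.
    return ("MATCH (c:Customer {name: $name})-[:BOUGHT]->(p) "
            "RETURN p.name")

def run_graph_query(cypher: str, params: dict) -> list[dict]:
    # Stub: a real system would call session.run(cypher, params)
    # against a graph database.
    return [{"p.name": "Widget Pro"}, {"p.name": "Widget Mini"}]

def call_llm(prompt: str) -> str:
    # Stub: a real system would call a foundation model here.
    return "Answer grounded in: " + prompt

def graph_rag(question: str, params: dict) -> str:
    cypher = text_to_cypher(question)
    facts = run_graph_query(cypher, params)
    prompt = (f"Question: {question}\n"
              f"Graph facts: {facts}\n"
              f"Answer using only these facts.")
    return call_llm(prompt)

print(graph_rag("What has Jane bought?", {"name": "Jane"}))
```

The explainability benefit falls out of the structure: the Cypher query and the returned facts are inspectable artifacts, so you can see exactly what grounding the model was given rather than taking its answer at face value.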
[00:05:49] Tobias Macey:
One of the challenges that I've seen come up in the context of this nearest-neighbor search, the kind of binary retrieval of vector stores where you either retrieve something or you don't, is that just because something is the nearest conceptual, semantic item doesn't necessarily mean that it's actually relevant. Because if you have a very sparse vector space, you could be near to something, but it's equivalent to saying, well, I live in Vermont and so I'm near to Colorado, whereas if I had a full map of the United States, the nearest thing would actually be either New York or New Hampshire.
And I'm wondering if you can talk to some of the ways that having that more nuanced structural view of the data, represented by the graph, can address some of those shortcomings of sparsity in the vector space, or whether that's just an inherent limitation of not having enough data. And maybe it even allows you to say, this isn't anywhere near; it's too many hops away, so I'm actually just not going to return anything.
[00:06:54] Philip Rathle:
Yeah. This is all true. Just because an answer is the most frequent answer doesn't necessarily mean that it's the right answer. Oftentimes it is, but not always. And when the stakes are higher, i.e., when having a more refined or better answer a higher percentage of the time has business value, then looking beyond just frequency can create a lot of value. One example that pulls in a graph technique: let's say I have a body of customer support documents, which is a pretty common example and a pretty common implementation.
And let's say I want to do a vector search against some kind of question a user has: my router's broken, here are some details, with a particular answer. You might have hundreds of answers in public knowledge bases, but some of those may involve older versions of the router, where for the latest version the answers are sparse because it just came out, and it turns out the resolution is different because, hey, the vendor fixed this problem and the way to address it is totally different. So in that case, it helps to have a structure you can reach out to that might run a calculation, like: what is the case with the most inbound resolved dispositions from customer service agents, or from other users in a public knowledge base, for this particular model of thing?
So getting to a particular model is something you can do more easily in a graph, narrowing down and getting more specificity, but you also have access to other algorithms. The example I just gave is centrality; PageRank is a good example of centrality. And more broadly, Google and web search are actually a pretty good analogy here. Back in the day, they were surfacing results on the worldwide web based on text matching. Now, vector indexing is a little more sophisticated than the string search they were doing at the time, but the analogy carries through: in order to get to the next level, they needed to go and build a knowledge graph. They had this great blog post called "Things, not strings" in 2012 announcing Google's Knowledge Graph. And the idea was: by treating my data as mere strings, I'm missing out on understanding the richness of what those things actually represent. So that's maybe an analogy-based way of describing the same thing.
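To make the centrality idea concrete, here is a hand-rolled PageRank sketch over a tiny invented "support case" graph, where agents point at the case that resolved their ticket. The case names, link structure, and scoring loop are all illustrative, not any product's implementation.

```python
def pagerank(links, damping=0.85, iters=50):
    """links: node -> list of nodes it points to (e.g. 'resolved as' edges)."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        rank = new
    return rank

# Three agents resolved their tickets by pointing at case_A; one at case_B.
links = {
    "agent1": ["case_A"], "agent2": ["case_A"],
    "agent3": ["case_A"], "agent4": ["case_B"],
}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)  # the most-referenced resolution
```

A query like "the case with the most inbound resolved dispositions" is exactly this kind of inbound-link counting; PageRank generalizes it by also weighting who is doing the pointing.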
[00:09:48] Tobias Macey:
Another interesting aspect of the space of retrieval augmented generation is that a lot of the examples are very naive and simplistic: oh, you just throw your document at this embedding function, it generates the vector representation, you store that in the database, and then when you go to retrieve something semantically proximate, you get the result back. But many of these vector databases also have capabilities to add additional metadata and apply filtering rules to the retrieval. Another space is the idea of vector indexing in other, maybe relational, databases, pgvector being probably the most well known example, where you have proximity to that tabular structure, so you can use that for data enrichment.
I'm wondering if you can talk to some of the ways that this graph representation differs from those capabilities of either filtering on the retrieval path in a vector store or being able to do relational joins in something like Postgres where the graph structure itself is inherent to being able to solve a particular problem.
[00:10:57] Philip Rathle:
Yeah. So if you look at the vector-only databases, the vector-native databases, the Pinecones and the like, to your point, they're adding capabilities to add metadata and do filtering. And I think if you extrapolate that out 10 years, if you follow that track, you're going to end up with some sort of much richer data model than just vectors. And it might look like a graph, who knows? It could look like relational, I guess; it's an implementation choice which model to follow. At the simplest level, the analog in either a graph database or a relational database is just adding additional attributes, and those filters are effectively not even one hop out; they're attributes of the particular thing.
And in a graph, you're able to do that. You can have properties on nodes and labels on nodes, properties on relationships, types on relationships, directions. However, you can also step out one level, two levels, three levels, and either run computation on that or just do pattern matching as part of your querying. So the question we asked ourselves when we saw vectors coming out is, okay, how easily could we add vectors to our database, much in the same way that Postgres, to your point, and others have added it? Turns out it wasn't that hard to add basic vector capability. There's lots of good open source stuff out there you can add in, and if you've got a database management system, this isn't that difficult a thing to add on.
And you can do a lot more with a richer data model than with just vectors. So if you're doing vector-only and it's a good enough use case and that's working, then that's great. Then maybe it actually is the right thing at scale to use a dedicated vector-only database that can't do anything else, or can do minimal filtering, for whatever economic or other reasons. But if you're doing anything where you want to start going a little farther down the rabbit hole of using the richness of the data to your advantage, then having a database that supports a richer model beyond vectors, be it Postgres or Neo4j, can add a lot of value. What do graphs add that relational doesn't? A few things, but I think in this context, schema flexibility is a pretty big one: being able to take whatever data or subset of data you want to bring in, bring it in, and experiment. Oh, I want more stuff.
Okay, well, I'm just going to bring that in. It's much easier with a graph. You can ingest different kinds of data and new structures really easily; they just become new kinds of nodes and new kinds of relationships. Whereas with a relational database, there's a schema change and a schema migration if there's already data in it, and it can get pretty onerous, especially in production, to add to and change the data you've got.
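The schema flexibility point can be shown with a toy in-memory property graph: a new kind of entity is just new data, where a relational store would need an ALTER TABLE and a migration. This is a stand-in data structure, not the Neo4j API.

```python
nodes = []  # each node: {"label": ..., "props": {...}}
rels = []   # each relationship: (src_index, TYPE, dst_index, props)

def add_node(label, **props):
    """Append a labeled node; no schema declaration required."""
    nodes.append({"label": label, "props": props})
    return len(nodes) - 1

def add_rel(src, rel_type, dst, **props):
    """Append a typed, optionally attributed relationship."""
    rels.append((src, rel_type, dst, props))

# Initial ingest: customers and orders.
alice = add_node("Customer", name="Alice")
order = add_node("Order", total=42.0)
add_rel(alice, "PLACED", order)

# Later, a brand-new kind of data arrives. No migration: it simply
# becomes a new label and a new relationship type.
ticket = add_node("SupportTicket", status="open")
add_rel(alice, "RAISED", ticket)

labels = {n["label"] for n in nodes}
```

In Neo4j itself the equivalent move would be a `MERGE` with a previously unseen label, with the same no-migration property.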
[00:14:17] Tobias Macey:
Another aspect of this concept of filtering or graph traversal is the way that you think about modeling the data. As I mentioned, a lot of the tutorials are very straightforward: you just feed your document into this embedding model, maybe do some chunk-size tuning, or maybe let the model do that for you, and then you pull the data back out. As we discussed, some of these different vector stores are adding more nuanced capabilities around metadata, various attributes that you can attach to those documents. If you're building a knowledge graph, that can be an entire project in and of itself, and data modeling is obviously very important.
And I'm curious if you can talk to some of the ways that you need to be thinking about that data modeling approach: how you want to represent the data, what types of relationships you're trying to store and retrieve as you're building out that AI application, and some of the ways that data modeling becomes a more first-class concern in this GraphRAG approach.
[00:15:25] Philip Rathle:
Yes. The really appealing and very powerful thing with vectors is you don't need to think much. You need to think a bit about documents and chunk size and things like that, but it's pretty much an autopilot kind of process. You use your favorite GenAI framework, it'll chunk things up, and you're off to the races. And with graphs, as you point out, you need to think through the model. This has always looked like a pretty big obstacle, and in practice can be one, especially if I have unstructured data: how am I going to map that to the graph?
And the good news is all these technologies we're talking about actually make it much easier to get data into a graph. This is a new and pretty fast moving space that I'm really excited about. But let's take the two separate examples. You're either bringing data in from some structured format, usually a relational database, or you're bringing data in from unstructured text, or you may have both. Oftentimes in enterprise applications, you have your unstructured text, but then you have your reference table somewhere with the keywords, the product hierarchy, who the customers are, etcetera. And then you want to tie those things together.
In the structured case, it's actually become fairly easy and straightforward to map to a graph. There's this technique called tables-to-labels, where every row in a table, if it's a thing table, turns into a node with a label of that thing. Then the columns in that table just become properties on that node, and the primary key becomes a node key. So that's a straight mapping, and there's increasingly good tooling to just bring that stuff across. Neo4j's had some for a while, and we're surfacing it in more and more places. Likewise, your join tables or your primary-foreign keys become relationships.
And so each row becomes a relationship object, and if it's attributed, then you end up with properties on that relationship. The tricky part is unstructured data: alright, how does this even turn into a graph? And it turns out you can use LLMs for this. Back in the day, people did various kinds of NLP techniques. Those are still relevant and useful, but LLMs bring a whole new level of power to the game. Now, what LLMs don't necessarily know is the terminology in your domain, well enough to know what to extract. But the good news is you probably have that somewhere. You probably have a table somewhere, like I said, with your terminology and all of your product names and so on. So you can then refer to that as part of your entity extraction.
And there are a bunch of companies out there taking this on, as startups specializing in entity extraction for particular domains of data, etcetera. Neo4j has been building tooling around it; others have tooling around it. And if you look at LangChain, LlamaIndex, and Haystack (Diffbot is another one), they all have tools for doing this that are a bit rudimentary now but are just going to keep getting better.
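A rough sketch of the terminology-guided extraction Philip describes: before (or instead of) asking a model to guess domain terms, match the text against the reference table of known product names you "probably have somewhere." The product list and document here are invented; a real pipeline would hand unmatched spans to an LLM extractor.

```python
import re

# Reference terminology table: surface form -> canonical entity id.
PRODUCTS = {"X500 Router": "product/1", "X600 Router": "product/2"}

def extract_entities(text, terminology):
    """Return (surface form, id) pairs for known terms found in text."""
    found = []
    for term, term_id in terminology.items():
        if re.search(re.escape(term), text, flags=re.IGNORECASE):
            found.append((term, term_id))
    return found

doc = "Customer reports the x500 router drops Wi-Fi after the update."
entities = extract_entities(doc, PRODUCTS)
```

Each extracted pair can then become a relationship from the document chunk to an existing node in the graph, which is what ties the unstructured and structured sides together.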
[00:19:10] Tobias Macey:
On that topic of entity extraction and entity resolution, as you said, that has long been one of the more complicated aspects of building knowledge graphs and maintaining and evolving them. Layered on top of that is the concept of ontologies and taxonomies, where if you want to be a little more sophisticated, you can have hierarchical relationships of what those different things represent and do kind of ontological traversals: this is a man who is a member of a family, he is the father; there's a woman who is a member of the family, she is the mother. And so you can extrapolate those ontological concepts into "parent" and presume from that ontological relationship that there are children. And I'm wondering if you can talk to some of the ways that these LLM entity extraction techniques are able to take advantage of those ontological relationships, and some of the considerations around selecting or building your own ontologies and taxonomies to build into these graphs.
[00:20:17] Philip Rathle:
Yeah. I think this ends up being one of the strong arguments for GraphRAG having value: the fact that knowledge naturally shows up in these kinds of hierarchies. I have a chair, and a chair is part of a dining set, and a dining set is part of household furniture. And, you know, maybe you're Home Depot or Lowe's or IKEA, in the business of selling these things. I guess if you're IKEA, it's really important to get each subcomponent inside: here's the leg component inside of the chair, and here are the screws.
That knowledge shows up this way, but also a lot of company data shows up this way: product catalogs show up this way, promotions show up this way. The fact that knowledge organizes itself as a graph means that if I want to decode what a particular string means, what is this thing, it's not enough to just decode a string into another string, say for language translation. I need to actually understand how the thing rolls up into this conceptual hierarchy, or as you say, an ontology or a taxonomy. On the analytic side, this has obvious applications if I'm trying to do aggregations at various levels, that kind of thing. But people also engage conversationally at different levels: you might ask a question about chairs at one moment and about household items at another moment, or likewise with photos and collections and collections of collections, or whatever it might be. So having a graph of your terminology and of your ontology, and then how that relates to the physical things you're dealing with, can be really powerful, and that's actually a pretty common use of GraphRAG.
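The chair example can be sketched as a rollup over parent links. The hierarchy here is invented to match the conversation; in a real graph this walk would be a variable-length path query rather than a Python loop.

```python
# Toy ontology: child concept -> parent concept.
PARENT = {
    "screw": "chair",
    "chair": "dining set",
    "dining set": "household furniture",
}

def rollup(item, parent):
    """Walk parent links from an item to the top of the hierarchy."""
    path = [item]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

path = rollup("screw", PARENT)
# screw -> chair -> dining set -> household furniture
```

This is the traversal that lets a question about "household items" and a question about "chairs" resolve to the same underlying data at different levels of the hierarchy.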
[00:22:11] Tobias Macey:
Now digging into the technical implementation of a GraphRAG system, can you talk us through the workflow of building the embeddings, figuring out how to associate a given vector representation of an entity with an appropriate node in the graph, and then, on the retrieval side, what the search function looks like, some of the ways that you augment the nearest neighbor approach with the graph structures, and just some of the foundational infrastructure that's required to support these graph representations collocated with or adjacent to the vector data for that semantic retrieval?
[00:23:01] Philip Rathle:
Yeah. There are a few different patterns. So one pattern is, I'm going to retrieve my vectors first. Let's say I have a hundred matches and I'm in this customer support kind of use case. Then I want to post-filter the vector results based on some sort of graph query. That graph query could look at an ontology. It could say, alright, now these vectors are hanging off of these customers who own this particular device, and I want to narrow it down based on the specific device they own. Or it could be doing a ranking based on the PageRank slash centrality example I gave earlier: what's the most referenced document by customer service agents, or whatever kinds of inbound relationships you're looking at. So that's one pattern: bring back your vectors first and then filter based on the graph. It could also be a security related filter. Hey, who's the person asking the question?
Do they have rights to see this particular thing? Another pattern is you start with a graph search: let's look for a particular pattern. And where you might do this is for things that are really outside the domain of both LLMs and similarity search. Let's take supply chain risk. I met with a customer recently who's doing this. Supply chain risk involves doing some basic math across multiple levels in the supply chain, rolling that up, and then doing some risk calculation based on that. Both the multilevel aspect and the math aspect are not in the sweet spot, to say the least, of either of those technologies, but they're pretty trivial in a graph. So there, you would run your graph query, and there may or may not be a vector component to that particular query, and then return that to the LLM.
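The multi-level rollup Philip mentions is, at its core, recursion over supplier tiers: exactly the kind of arithmetic that is awkward for an LLM or a similarity search but trivial as a graph traversal. The suppliers, risk scores, and max-based rollup rule below are all invented for illustration.

```python
SUPPLIERS = {  # component -> list of direct suppliers
    "widget": ["acme", "bolts_inc"],
    "acme": ["rare_metals_co"],
    "bolts_inc": [],
    "rare_metals_co": [],
}
RISK = {"widget": 0.0, "acme": 0.2, "bolts_inc": 0.1, "rare_metals_co": 0.9}

def total_risk(node, suppliers, risk):
    """Roll up risk as the max score anywhere in the supply subtree."""
    score = risk[node]
    for sub in suppliers[node]:
        score = max(score, total_risk(sub, suppliers, risk))
    return score

widget_risk = total_risk("widget", SUPPLIERS, RISK)
# dominated by the deep rare_metals_co dependency two tiers down
```

The result of a traversal like this can then be handed to the LLM as context, which is the "graph first, then language" pattern.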
Another kind of pattern is something I think of as giving you informed creativity, where, hey, I can't really get my entire result in the graph or through a vector. So I'm going to take a large part of the graph. Let's say I have a billion-scale graph, and you're going to bring back a thousand nodes and relationships surrounding the particular thing you're looking at. You can easily turn a graph into natural language: you just take the thing nodes, and then the relationship becomes a verb, and there's a thing at the other end. So it's like, Philip has, I don't know, a cup.
And that shows up in the graph in a way that can easily be transcribed into English or the language of choice. You can then send that set of sentences to the LLM's context and say, hey, here's the situation. That then becomes a background prompt, or context for the prompt, and the user can go at it and ask questions relative to all the data I just dumped in. So that's another use. In that case, your knowledge graph becomes more of an assist.
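The "turn the subgraph into sentences" step can be sketched directly: each (subject, RELATIONSHIP, object) triple becomes a short English line to drop into the LLM's context. The triples and the naive verb rendering are illustrative only.

```python
# Invented triples pulled from a neighborhood of the graph.
triples = [
    ("Philip", "HAS", "cup"),
    ("cup", "PART_OF", "dining set"),
]

def verbalize(triples):
    """Render relationship types as lowercase verbs between the nodes."""
    return [f"{s} {r.lower().replace('_', ' ')} {o}." for s, r, o in triples]

context = "\n".join(verbalize(triples))
```

A real system would pluralize, order, and deduplicate these lines, but even this naive rendering is enough for an LLM to use as grounded background facts.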
[00:26:26] Tobias Macey:
And in terms of the infrastructure to support a graph rack approach of being able to not necessarily collocate in the same storage layer, but at least collocate in terms of kind of physical proximity for reducing latency and reducing the number of round trips. I'm wondering what are some of the useful patterns people are using to be able to build their embeddings, build the graph, and be able to use them to interact with each other through the LLM as the mediator.
[00:27:03] Philip Rathle:
I guess you have two areas where you have patterns. First, there's graph creation and ingestion. Oftentimes, that's not a one time thing. Your graphs are living things, and if a graph is reflecting some system in the real world, you want to get the changes in. And then on the other side, you have your pattern for actually running GraphRAG. On the first, the pattern is using a set of tools: the latest GenAI frameworks, again, LangChain, LlamaIndex, and so on, usually have hooks into being able to take unstructured text and bring it into the graph, calling an LLM.
Now, it depends on how much of the data you have in structured form. As I said, you can really improve the quality of your unstructured extraction by having structured data that represents your taxonomy or ontology, and there are standard patterns for that. So that's the construction side. And the one thing I'll add is, what if you don't already have a taxonomy to refer to, your data is high stakes, and you need 100% accuracy with some portion of your knowledge graph? Oftentimes you don't need that with your entire knowledge graph, just like you don't with vectors, and so you can get good enough with automated extraction.
But in certain cases, I've seen situations where there's a part of the graph that needs to be curated, much like RLHF, which applies to models. Here you have knowledge graph curation with human feedback, where, if your model is getting things wrong, you want some tooling for validating the data in your knowledge graph. There's nothing really off the shelf now, so this tends to be something people build. Effectively it becomes a sub-training task for a model or agent, which gets better as you go at extracting your particular data and putting it in the right place.
Sometimes you want entity resolution tools as well, including off the shelf ones. I'll put in a plug for Senzing, which is one that's lightweight and quite good for company entities, where you have name, address, and registration data, and people information; that can be helpful. Alright, now switching to what the stack looks like on the GraphRAG side: that typically ends up being some combination of your framework of choice, plus either an LLM or some technology to generate graph queries from unstructured text. We recently open sourced a technology that does this called NeoConverse, which lets you have a conversation with a graph.
LLMs are getting better and better at writing Cypher, which is the de facto query language for graphs, used by Neo4j but also many others. And Cypher is, by the way, 80 to 90% of the way there towards the GQL standard, which is ISO's new query language standard and a sibling to ISO SQL. So you have ISO SQL and ISO GQL, one for the relational model, one for graphs. We can touch on that if you want to loop back to it. And so there are two basic ways you would query your graph. Again, one is to have some kind of translation layer that takes a question and translates it to a graph query.
Another one is, if I have some bounded variety in the questions I'm asking, let's say any question is going to fit some template and it's going to be one of 20 kinds of things. In that case, you can actually pre-create a set of queries, extract your terminology and variables from the question, map it to the graph query, run the query, and give back the result. So that's another pretty common way. And then in cases where I have a lot of kinds of questions, or a variety in the way I can ask a question, but less variety in the number of questions, you can actually do a vector search to find the best likely query to match up against the question, which is a pretty cool use of vectors. And then the last thing I'll add on the operational side is you also have this whole thing at the tail end of audit and explainability, security, and so on.
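The template-matching idea can be sketched as follows. A real system would embed both the question and the templates and compare vectors; here, token overlap (Jaccard similarity) stands in for embedding similarity, and both the templates and the Cypher-style query text are invented.

```python
# Pre-created query templates: canonical question -> canned graph query.
TEMPLATES = {
    "how do I fix my router":
        "MATCH (c:Case)-[:RESOLVES]->(d:Device {name: $device}) RETURN c",
    "what did the customer order":
        "MATCH (cu:Customer {id: $id})-[:PLACED]->(o:Order) RETURN o",
}

def similarity(a, b):
    """Jaccard overlap of lowercase word sets, a crude embedding stand-in."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def pick_query(question):
    """Return the canned graph query whose template best matches."""
    best = max(TEMPLATES, key=lambda t: similarity(t, question))
    return TEMPLATES[best]

query = pick_query("my router is broken, how do I fix it")
```

This keeps the LLM entirely out of query generation: the hard part is reduced to picking the right pre-vetted query and filling in its parameters.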
And graph visualization can often be really helpful for understanding why a particular question was answered in a given way, to the degree that I have a graph query that pulls back information about my world model and sends it to the LLM, which then either takes that as the answer or further evolves the answer based on that raw data. That tells me a lot about how I got to that particular answer, which is a lot more than you can get out of just taking what an LLM says at face value, or even with a bunch of vector inputs. So visualization of your GraphRAG inputs and outputs can help with that, and it can also help your development process.
I've had users come to me and say that just being able to see their vectors (in this case, they had done their vector indexing in Neo4j, in the graph), to visualize how the vectors hang off of documents, how those documents relate to each other, and how they relate to the domain, can be really helpful in accelerating development. So there are lots of parts of the GenAI stack that are not standardized. But to the degree that GraphRAG is becoming or has become part of the GenAI stack, luckily, through amazing coincidence, ISO just came out with a standard for graph querying that is largely a continuation of the Cypher query language, and we're committed, as is basically every graph vendor on the planet, to providing a smooth transition from Cypher to GQL. It's the first time in nearly 40 years that ISO has decided a data model is significant enough to base a new standard around. So that's notable and significant, I'd say, in the graph space and for anyone in the database space.
[00:34:13] Tobias Macey:
Another interesting outgrowth of the attention that generative AI models have brought in is that they've brought vector databases and vector indexes more into the common parlance. Everybody knows what they are at this point, and so they have seen growth even just in the semantic search approach, beyond the actual retrieval and generation piece. And I'm wondering how this vector and graph combination is applicable independent of the language model or the generative AI application in a similar context.
[00:34:53] Philip Rathle:
Oh, that's a good one. So you could look at vectors as a continuation along the spectrum that started with string search, exact match and leading edge and trailing edge and contains, that sort of thing, then full text, and now vector is sort of the next thing. It's a very fuzzy, conceptually based version of full text, the very grown-up version of full text. And it's always been the case, since my very early days here at Neo4j, that you have people needing the same kind of textual tools, either string search or full text or both, and then using that alongside the graph. In that case, I see the pattern where you do it in the graph, because you can do both of those things inside of Neo4j. I also see the pattern where I've got tons of data in Elastic, and I don't want to forklift that.
And the volumetrics are such that Elastic is dedicated, bespoke technology for handling gobs and gobs of text. That's great. But then you have the graph alongside it, and usually you'd have some hook, like a document ID, to link the two. So I think that pattern applies in any case where I have text and I have structure in my documents, which you usually do. Even inside of a document, you have structure: a particular chunk might be some text inside of cell B1 of table 3 on page 36 of some document that is part of some collection that is part of some overall filing.
And even structuring that data as a graph, where there's nothing about the domain in it, just the structure, ends up being useful in certain cases. This was a surprise to me. I wouldn't have imagined this, but we had folks in the community starting to discover it, and so it now has a name: it's called the lexical graph, the graph of structure. So that can be used in the LLM context. I don't know whether it can be used in non-LLM contexts, but certainly, all the ways in which you would use full text or string search along with a graph carry forward in the vector-based world.
[00:37:41] Tobias Macey:
As far as the overall ecosystem, as you mentioned, there are numerous frameworks that have sprung up around building AI applications. Many of them have built-in support for various vector engines. I'm wondering what you're seeing as the current and forward trending support for this broader GraphRAG capability, and some of the ways that people are bolting the graph capabilities onto the frameworks that don't yet have that out of the box.
[00:38:17] Philip Rathle:
Yeah. I'd say most of the major frameworks do, and those are rapidly evolving, and we're actually working pretty closely with those framework providers. It's exciting for them, for our users, and for us. And then if I expand beyond that into different parts of the ecosystem, you definitely want more integrations with all the LLMs. So we've got integrations with Bedrock and Vertex AI and so on. We recently announced better integrations with the different data providers.
So, good integrations to the degree that data's coming in from more analytic databases, like data lakehouses: integrations with BigQuery and Snowflake and Fabric. Something interesting we're doing with our Snowflake integration, which is in early access right now, is being able to take data out of Snowflake inside of Snowflake Container Services, run a graph algorithm, and then push the results back into a table. So you can do all that within the Snowflake environment without having to move the data around. I think the next level of integrations is going to be along those kinds of lines: being able to retain your data gravity where and when you want it, and run your graph querying using a specialized graph technology alongside all the various things you're doing, from data storage and processing to the LLM itself, be it a foundation model or a series of small models.
[00:40:03] Tobias Macey:
In your experience of working at Neo4j, working in the space, working with customers, I'm curious if you have seen any broad trends as far as different families of language models or different types of embedding models that lend themselves more effectively to this graph interaction.
[00:40:28] Philip Rathle:
It's definitely the case that from one day to the next, there will be some model that comes out that's better at writing Cypher. It's such a game of leapfrog and such a vastly changing landscape that I'd say not really. Probably the only generalization I'd make is the one a lot of your listeners have probably experienced themselves: oftentimes you start with the largest model possible that'll give you the best results, be it Claude 3.5 these days, or GPT-4o, etcetera.
And then once you have something working and have demonstrated that it can be done, it can often be more economical to break that up and move into smaller models, or to use the large models for just a particular part of your workload. To the degree that you're leaning more on RAG, be it graph or vector, and have the data yourself, it matters less which model you use. And the less of a dependency you're able to have on your own data, the more you need to depend on the foundation models, and the more there's value in going after a big one.
[00:41:42] Tobias Macey:
And in your experience of working in this space, surveying the GraphRAG landscape, working with customers, what are some of the most interesting or innovative or unexpected ways that you've seen these ideas applied?
[00:41:55] Philip Rathle:
Let's see. I'll pick a few. So one is a government tax authority that's doing anti money laundering, running queries in the graph, but then formatting the results into a suspicious activity report using an LLM. In that case, your locus of reasoning is in the graph, because it involves numbers and multiple hops, but you're using the model for what it's good at, which is language. Similar sorts of things in outbound email remarketing: I'm going to come up with the ideal recommendation for this person based on what they've bought, what other people have bought, what they've bought together, and what they have in the shopping cart, and so on. That's a very graphy kind of query, where you get amazing results in the graph. But then if I want to write an email and I want that email to appeal to a person based on what I know about them, an LLM can do a much better job of coming up with the right language for whoever I'm targeting, whatever their walk of life is. So that's another really clever combination of the two technologies.
This one's not surprising in itself, but I'd say where it's going is maybe surprising. There was a recent appearance on Sequoia's Training Data podcast by the CEO of Klarna, who spoke about how they use Neo4j as part of a company knowledge base: a gen AI agent that anyone can ask a question to, and it'll come back with an answer. That answer could be related to business process or HR policy or a customer or whatever it might be. So that's not a surprising use, but the surprising thing he said on that podcast is that they're beginning to realize that the fragmentation of core data across different systems and applications, like Salesforce and Workday and so on, is beginning to hamper their business. So what they're actually doing is leaning into this centralized graph approach, increasingly using it as the system of record for the core things in their enterprise, and moving to, in some fashion, deprecate applications like Salesforce, Workday, and the rest. Maybe they continue to use them, but in a more targeted, read-only fashion, while your system of record is actually this graph that connects to everything.
Because there are these incredible network effects with graphs. If you bring together the data to solve, say, a recommendation use case, like I mentioned earlier, that might be 80% of the data you need to do fraud detection. And together, those might be 50% of the data needed for some other use case. And then once I solve that, I've solved three other use cases without even knowing it. So there are both data network effects and use case network effects. The way I'd characterize the surprising thing there is that it started as a gen AI use case, but once they began using it, they realized there are applications of graph that go far beyond gen AI, that are quite fundamental to the business and ultimately transformative to their IT infrastructure.
[00:45:39] Tobias Macey:
And in your experience of working at Neo4j, very closely in the graph data space, and as you're exploring the applications of graph data to generative AI, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:45:59] Philip Rathle:
Okay. So I guess I'd say one is that there's this study people often refer to that was done years ago, I think by the Air Force; I don't know if it's true or apocryphal. The finding was that an error made in the data modeling phase is something like a hundred times cheaper to fix if you catch it then than if you catch it later: in the build phase, after your code's been written, then in the test phase, then in production. Each phase adds an order of magnitude to the expense of fixing a problem that was introduced earlier on. So one lesson is getting your model right.
Now, don't try to get your model too right early on, when you're experimenting. But the more critical your application is to your business, and I guess you could measure this by proxy by asking what happens if the system goes down for a minute, is that a fire-drill kind of moment, the more thought, attention, and study you should put into how you actually design your application. That's one. Another is that there are some things that are counterintuitive with graphs. For example, when you load data, you actually want to create your indexes first, before you load your data. This is the opposite of what we do with relational databases, where you load your table and then create your indexes.
With a relational database, the index build can take advantage of a big, potentially parallel scan, and it happens much faster after the fact. With a graph, each relationship I add stitches the data together on insert; you could say you're indexing, or pre-joining, your data on insert. So if you don't have an index, each time I try to add a relationship, I'm going to do a full node scan to find the node at either end, which means by the time you've got a few hundred thousand records in, loading each record gets really, really slow.
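A toy sketch of why this bites, not real Neo4j internals, just an illustration of the lookup-cost difference. The Cypher statement in the comment shows the shape of an index declaration; the label and property names there are invented.

```python
# Toy model of endpoint lookup during relationship loading. In Cypher,
# creating the index up front looks roughly like (label/property invented):
#   CREATE INDEX person_id FOR (n:Person) ON (n.id)

nodes = [{"id": i} for i in range(100_000)]

comparisons = 0

def find_node_scan(node_id):
    """Without an index: a full node scan for every endpoint lookup."""
    global comparisons
    for n in nodes:
        comparisons += 1
        if n["id"] == node_id:
            return n
    return None

index = {n["id"]: n for n in nodes}  # "create the index first"

def find_node_indexed(node_id):
    """With an index: a constant-time endpoint lookup."""
    return index.get(node_id)

found = find_node_scan(99_999)  # touches every node on the way
```

Each relationship insert needs two such lookups, so without the index the total load cost grows quadratically with the number of records, which is exactly the slowdown described above.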
You can avoid this by just creating indexes ahead of time. It's a really common mistake, so that's a very down-in-the-weeds lesson learned. The other lesson, popping way back out, that's maybe relevant in the gen AI world: I've seen so many projects fail over the course of my career because they were purely technology driven and didn't have enough business sponsorship. Now, I think we as an industry are doing the right thing in this world of gen AI by being technology driven to a degree, because you have no idea what's possible business-wise until you play with the technology. So that seems appropriate.
But it's all too easy to then not look closely at the business, or not involve the business at the right point in time, and end up with something that's purely technology driven. Some of the trends I've seen go beyond individual projects, like the whole trend around Hadoop and data lakes: let's just get all the data into one place, not preprocess it, just dump it in there, and then we can figure out what to do with it. And, obviously, that didn't work very well. So having the appropriate level of business partnership and involvement as one experiments with these technologies is, I think, an art. What the right involvement is depends on the situation and where you are in the cycle, but I'd probably err more on the side of involving the business than not. So that's my last lesson learned.
[00:50:06] Tobias Macey:
And for people who are intrigued by all of the enhanced capabilities that we've been discussing, what are the situations where you would advocate against GraphRAG, where GraphRAG is just the wrong choice?
[00:50:20] Philip Rathle:
I'd say the more something is creative, the less it is high stakes, the more it is based purely on a document, or the more a language model already has all the information it needs because the data you're talking about isn't your own proprietary data but is out in the wild: those factors all lean toward maybe there's some way GraphRAG can add value, but it's certainly not low-hanging fruit, and maybe it won't add value at all. Then there's any case where the stakes are higher. By stakes I mean dollar value, but also health and human safety, regulatory exposure, brand and reputation, privacy, discrimination, these sorts of things. If it's the kind of application where those factors come into play, that starts to tip you over the edge of: I'm going to need to answer to someone, if not a regulator, then the person inside the company who is the throat to choke if something goes wrong.
And how are you going to convince that person that it's the right answer, and that the answer is good enough, enough of the time? Those end up being the cases where GraphRAG becomes more valuable. So my rule of thumb, personally, is stakes: the higher the stakes along any of those dimensions, the more useful GraphRAG ends up being. But it also depends on having a question that involves information that is proprietary to your business, or whatever endeavor you're trying to carry out.
[00:51:59] Tobias Macey:
And looking forward, what are some of the near to medium term either active improvements in the ecosystem or opportunities for improved support for these graph based systems in the AI application ecosystem that you're keeping an eye on?
[00:52:18] Philip Rathle:
So one is the development of the frameworks and the integrations, which we talked a bit about. Knowledge graph construction is a huge one, and I'd say text-to-Cypher is a huge one. I think those are the two biggest. Those are all areas that are improving almost literally day by day. This is a pretty hot area for us, and I'd say for the larger ecosystem that's connecting into graphs from the other side.
[00:52:49] Tobias Macey:
Are there any other aspects of this concept of GraphRAG, the ecosystem around it, or the work that you're doing at Neo4j to help support it, that we didn't discuss yet that you'd like to cover before we close out the show?
[00:53:01] Philip Rathle:
I think the one thing I'll maybe add is that, in addition to GraphRAG, there are other ways that knowledge graphs are being used and are useful. One is around storing your metadata and data lineage, because data ultimately is the foundation. This is probably part of what I see as the thesis of your podcast, right? Data is the foundation for everything going forward. I mean, machine learning effectively takes what used to be declarative code, pushes it upstream, and makes it a data problem. So understanding what your data is, what the quality and timeliness of each source is, how data moves across a company, what's been curated and approved by data stewards, all these kinds of things end up being very important.
And so there's actually a long history, even pre gen AI, of Neo4j being used as a system of systems of sorts, recording how data moves through an enterprise, what the quality is in various places, and so on. Some of that is driven from a regulatory perspective, especially in the finance industry. And some of it is done through a more mature understanding, by, say, the chief data officer, that data is, if not the new rocket fuel, at least the new oil, and that there's really something to be gained by investing there. So maybe I'll end there.
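The lineage-as-a-graph idea above can be sketched as a traversal problem: edges point from each dataset to its upstream sources, so "where did this come from?" is a walk. The dataset names and structure below are invented for illustration, not any real Neo4j schema.

```python
# Toy lineage graph: each dataset maps to the datasets it is derived from.
# Names are made up for illustration.
lineage = {
    "exec_dashboard": ["sales_mart"],
    "sales_mart": ["orders_raw", "customers_raw"],
    "orders_raw": [],
    "customers_raw": [],
}

def upstream(dataset, graph):
    """Return every transitive upstream source of a dataset."""
    seen, stack = set(), [dataset]
    while stack:
        for parent in graph.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

In a real deployment this graph would live in the database itself, so the same traversal answers impact analysis in the other direction (which dashboards break if `orders_raw` is late?) by reversing the edges.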
Oh, let me add one more. This is something that was brought up by some of the framework providers recently: as agentic architectures get more intricate, they start to look like a graph, and you could actually see a need for a control graph. Now, at small scale and simpler levels, a control graph can just be stored in a file or something. But picture a global application that needs access to the next series of things it's going to do in some complex agentic architecture; then actually having a database where you have a control graph that's replicated globally and is the source of truth could become a thing. I don't know how important this trend will be or if it's going anywhere, but it's come up recently, and I think it's kind of interesting.
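The control-graph idea can be sketched as a walk over a graph of steps, where each step's outcome selects the next edge. The step names and routing table below are invented for illustration; in the scenario described, this structure would live in a replicated graph database rather than an in-memory dict.

```python
# Toy control graph for a support agent: each node maps outcome -> next step.
# Step names and routing are hypothetical.
control_graph = {
    "classify_request": {"billing": "fetch_invoice", "other": "escalate"},
    "fetch_invoice": {"found": "draft_reply", "missing": "escalate"},
    "draft_reply": {},   # terminal step
    "escalate": {},      # terminal step
}

def run_agent(start, outcomes):
    """Walk the control graph, choosing edges by each step's outcome.
    `outcomes` stands in for actually executing each step."""
    path, step = [start], start
    while control_graph[step]:               # stop at a terminal node
        step = control_graph[step][outcomes[step]]
        path.append(step)
    return path

path = run_agent("classify_request",
                 {"classify_request": "billing", "fetch_invoice": "found"})
```

Because the routing is data rather than code, changing the agent's behavior is a graph update, and storing the graph in a shared database is what makes the globally replicated source-of-truth version imaginable.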
[00:55:21] Tobias Macey:
Yeah. The idea of using the graph to drive the agent to query the graph is definitely a very interesting, kind of chicken and egg problem. It'll be interesting to see how that all evolves.
[00:55:37] Philip Rathle:
Yeah. Kind of like the Norse snake eating its own tail. The ouroboros. Yes. Exactly.
[00:55:44] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology or training that's available for AI engineering and AI systems today?
[00:56:05] Philip Rathle:
So let me maybe go a little bit more meta on this and say the thing I'd really like to see is more energy spent toward understanding what's going on inside of an LLM; in a way, the neurobiology of how the decisioning is happening. We can certainly advance quite a long way treating it as a black box, as we are, putting all these controls around it, understanding the behaviors, and coming up with techniques like RAG and GraphRAG and prompting and fine-tuning and so on. But I really want to know the essence of what's happening inside, so that we don't have to treat it as a black box.
Because you can just imagine the kinds of improvements you could have if we understood that better, in terms of evolving how models are trained and what a model even is. Heck, I could see models actually even including GraphRAG internally in some fashion. Whatever the outcome, I think we could gain a lot by turning the attention inward, toward the inside of an LLM, in addition to continuing to spend energy around it.
[00:57:22] Tobias Macey:
Yeah, I think that's definitely an interesting observation, particularly given the growth in recent years of graphs being applied in the deep learning context, with PyTorch having graph extensions to that framework, and the overall idea of graphs being integral to the construction of the neural nets that are used to build these models. I'll be curious to see how some of those ideas get brought into the development of these large language models and generative AI models.
[00:57:57] Philip Rathle:
Can't wait.
[00:57:58] Tobias Macey:
Alright. Well, thank you again for taking the time today to join me and share your thoughts and experience on this concept of GraphRag. It's definitely a very interesting problem space. It's definitely exciting to see some of these graph structures and the idea of knowledge graphs being brought into the context of these vector retrieval systems. So I appreciate the time and energy that you're putting into helping to promote that and educate around that, and I hope you enjoy the rest of your day. Thanks, Tobias. It's been fun.
[00:58:30] Tobias Macey:
Thank you for listening. And don't forget to check out our other shows: the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction to the AI Engineering Podcast
Interview with Philip Rathle: Background and Journey
Understanding GraphRAG and Its Capabilities
GraphRAG vs. Vector-Based Retrieval Systems
Data Modeling in GraphRAG
Entity Extraction and Ontologies
Technical Implementation of GraphRAG
Infrastructure and Patterns for GraphRAG
Vector and Graph Combination Beyond LLMs
Ecosystem and Framework Support for GraphRAG
Innovative Applications of GraphRAG
Lessons Learned in Graph Data Space
When Not to Use GraphRAG
Future Opportunities and Improvements in GraphRAG
Other Uses of Knowledge Graphs
Final Thoughts and Contact Information