The Growing Significance of Knowledge Graphs and GraphRAG in Improving Model Performance By Kirk Marple

width= Kirk Marple has 30 years’ experience in software development and data leadership roles. He’s worked for Microsoft and General Motors, and successfully exited from his first startup RadiantGrid, which was acquired by Wohler Technologies.
Kirk is currently the Founder and CEO of Graphlit, which streamlines the development of vertical AI apps with their end-to-end cloud based offering that ingests unstructured data, and leverages RAG to improve accuracy, adaptability, and context understanding – all whilst expediting development.
In this interview, Kirk Marple explores the latest applications of knowledge graphs. What sets knowledge graphs apart from other data structures, and what steps should data scientists take to integrate knowledge graphs with LLMs? Kirk argues that knowledge graphs and GraphRAG now play a pivotal role in enhancing model performance:

Can you give us a definition of knowledge graphs, and explain what they are?

Knowledge graphs deal with extracting relationships between bits of knowledge. Classically, we talk about people, places and things. So that could be a company with knowledge on where they’re located, what their revenue is, and the number of employees. We would call that the metadata on that entity. And that company may have relationships with, say, Microsoft in Seattle. So there’s an edge that you create in the knowledge graph between those entities. The knowledge graph is really just that blown up. It’s all those different interactions, and inner relationships between those little bits of knowledge, and the value becomes information retrieval.

It’s a great way to represent the knowledge, in a way that you can then retrieve it and walk the graph from one place to the other, to be able to learn more from the knowledge that’s embedded in it.

Why do you think data scientists and data engineers working in industry should be paying attention to knowledge graphs, in the current AI era?

I think people are familiar with RAG – that the R in RAG is information retrieval. The data presents a search problem or a filtering problem via metadata, and knowledge graphs give another axis, a resolution to the problem of how to retrieve data to feed into large language models and large multimodal models.

Up until recently, knowledge graphs have been a bit of a sidecar in very specialised parts of the industry. But now they’re giving you another view on the data. I think vector search has been a big thing over the last couple of years. It’s been around for a while before that, but graphs are another facet of information retrieval that complement that as well.

How does the knowledge graph differ from other commonly used data structures, such as a JSON format?

So a JSON structure is basically a kind of linked list. You can follow the links, or operate it like a DAG workflow. The problem is you might get recursion; you might be coming back around to the same element.

For example, when I worked at Microsoft, there might be another person that worked at Microsoft, and then Microsoft bought their company. And then that person links back to me, and so you could get cycles in that graph. That’s an area which would be hard to represent, from a serialisation standpoint, in a JSON structure. But there are ways to work around that. You only walk the graph so far, where you collapse links together.

How do knowledge graphs contribute to data integration and the semantic understanding of such datasets?

Say you have a couple of pages of text, and you’re mentioning companies, people and places. You can use a vector embedding to find Adobe, or Amazon or someone in that text. It would use a keyword search or even a vector embedding. In the RAG process, you want to pull back the relevant text, and then provide that to the LLM.

But what if there’s data about the entity that’s not in the text? That’s really where knowledge graphs shine: where it’s the current year’s revenue, or how many employees they have, or other things you can enrich around the entity, that you can then pull back from anything that you’re finding, via the vector betting.

So to me, it’s an enrichment step where you could just retrieve on the graph itself, but then it’s more of a global set of data. You’re unlikely to get anything relevant to a specific question. Maybe the question starts with a text kind of vector search, and then you can do another set that pulls data in from the graph. But there may be use cases where you just want to talk to the graph itself, if you have enough information in there. So basically, pull from the metadata around the nodes, from more than the node itself, which might just be like a word.

What kind of methods or techniques are you using to integrate knowledge graphs with LLMs?

There are two sides to it: one is the data ingestion path – the question of how to get data in, and how to extract entities, and how to store it. That’s the first step, pulling in data from Slack, email, documents and podcasts. Then you have to do named entity extraction and named entity recognition on that; building up a knowledge graph from the data. So we had a good data structure to pull from.

And now we’ve moved into the second issue: the RAG graph concept. Now I can ingest all this data, all this representation in a graph, how can I use it?

That’s where we’ve done some experiments starting with vector search, figuring out those similar text chunks that I found, and those text chunks have entities that we’ve extracted from them. Then you can essentially use this as a way of expanding your retrieval, and pulling it into more content that also observes those same entities. And that’s a way to use these graphs for expanded retrieval.

Another way is using graphs and faceted queries to get data analytics. We’ve actually got a demo on our website of how to ingest a website and use any art to build a graph, and then essentially get a histogram of all the topics, people, places and companies that were mentioned in the data. So you’re summarising the graph into a chart form. It’s a really easy way to get a different view on what’s inside your data.

How can knowledge graphs help with the interpretability and explainability of LLMs?

One way to look at it is: if you have citations, so you have the LLM, it responds. It gives you back a list of citations of your sources that you used. You can then visualise the citations in a graph form and see the topics in addition to the text that it found – the topics and entities that were essentially cited.

You can then check the commonality between these citations. Are they similar? Are they different? I was actually planning to build a demo app that does just that from the citations: do a graph search, get that data and be able to visualise it. That approach is really useful.

What are the signals that a knowledge graph could be the right tool for a particular problem a data scientist is working on?

The thing that comes to mind is the interrelationships between the data. A typical data set might be row-based and there’s not as much interrelationships between the rows, or we’re seeing different data sets like that. And this is an instance where there’s an implicit grouping or classification, and you can bucket data into different classes of nodes.

We were based on schema.org, and the JSON-LD kind of classification structure, the taxonomy – that’s really where there’s already standardisation around what is a person, what is the organisation, and defining the metadata.

So I think you can look at it one way, where you’re kind of mapping data to an existing graph structure. Or you can take a different approach, where you’re mapping it, and inventing your own data structure in your own relationships.

But to me, it’s really about that classification metaphor of: ‘This is a something, and you can assign that, and then that becomes a relationship between other data in the graph.’

Could you explain the concept of graph retrieval augmented generation (GraphRAG) and its importance?

It’s something that’s come up over the last year or so. It’s actually an area that was really core to how we were thinking about our implementation of RAG and using our knowledge graph. Microsoft released a paper on it several months ago.

The concept is basically about leveraging a knowledge graph as extra context for the generation side, which is the prompt you provide to the LLM. So where typically you’re using a vector search to find bits of text that are relevant to the question or to the prompt and GraphRAG, you can augment that information you get back with the relationships of that text to the graph.

So, to use our earlier example, with the extraction of Microsoft and OpenAI and Seattle in the sighted text, you can expand your footprint of information. And this is where you have to guide it a bit. Because you don’t want to pull in everything – all the last 10 years of Microsoft revenue or something like that. So you have to guide the graph retrieval, and say: ‘I’m asking you a question, it seems financially related, go import all the revenue of the last several years from the entities that we’ve identified in the query.’ And that’s where you can then use the graph as a secondary search mechanism and retrieval mechanism. It’s still early days. There have been some prototypes out there, some papers, but I think it’s still an evolving area.

What are the differences between a traditional RAG system and GraphRAG?

So with a typical RAG, you’re starting with the text that you’ve ingested. Text is extracted from documents, transcripts, from audio files, and you’re chopping that up into chunks and that’s what’s provided to the LLM. By contrast, with GraphRAG you can actually get data that wasn’t in the cited text. So it could be data that was extracted from other documents; maybe it found the revenue of a company in a different document, or it pulled it from an API service somewhere else, to enrich the knowledge graph. So it’s really a way that it can see outside the domain of the cited sources and provide more context, to answer the question. That’s really how we see GraphRAG.

What are the key components or modules within the GraphRAG framework?

A lot of it is really ingestion. So you have to create the knowledge graph. That’s probably where the vast majority of the work is. You have to do named entity extraction, or you have to use LLMs, to do the identification of the entities or the nodes in the knowledge graph, and have a really rich ETL pipeline for creating the knowledge graph. So that’s the majority of the work; having a pipeline that can deal with changes in data. You edit a document, you remove a reference to a company – how do you remove that from your knowledge graph? Those kinds of things.

The other side of it is during the retrieval stage of the RAG workflow; having a way that you’re not just going to look at the text in the cited text, but you’re also going to widen your vision and start walking the graph to pull in extra data.

The thing we’ve seen, at least right now, is it has to be guided. I haven’t seen a way that you can make it dynamic and be like: ‘Hey, I’m going to use GraphRAG if I see this scenario, and I’m going to use normal RAG if not.’ Because there’s extra work that has to be done during retrieval, and you had to create the knowledge graph in the first place.

But the question of whether we make it dynamic, and pull from the graph as needed without being guided – that’s something we’re looking at.

How does GraphRAG handle scalability or efficiency concerns with large-scale graph data?

I think a lot of that is in the architecture. In ours it’s more of an index. So we’re not storing the text, or we’re not storing a lot of data in the graph database itself. So the walking, the graph queries end up being pretty fast.

And then we’re able to pull in the metadata from a faster layer, a storage layer. So I think scalability is key. When you’re going to have millions or billions of entities, you need a graph database that can handle that. And I would say most, if not all, of the existing graph database solutions can handle that scale. So that’s typically not a problem. The queries are usually fast, they’ve been tuned enough. That’s why I think a lot of the hard work, at least what we’ve seen, is on the creation side: just making sure you can actually get the right data, keep it fresh, and that kind of thing.

How does GraphRAG address challenges related to building a graph presentation, or to graph representation learning, like node embedding or graph classification?

What hopefully we’ll get to soon is the ability to – once you have the graph created – create embeddings of the relationships. So you have a subgraph of information, say, by page. I’ve observed these companies, these people, these places, and their relationships on a page of text, and I’ll be able to run higher-level algorithms like graph embeddings, and say: ‘What is similar to this page of text with its graph relationships?’ Because today, you can run a vector embedding on the raw text. I think having a graph embedding that takes both of them, or a multi-vector embedding, could be really interesting. And I think I could see that evolving over the next year or so as this becomes more commonplace.

What are some current limitations or potential drawbacks of GraphRAG?

I would say the biggest is that it does take more work. You could go back and backfill the graph from the text, if you’ve ingested like 1,000 documents, and you want to create the graph, that may be a little more trouble.

We make you opt-in so you can set up a workflow and say: ‘I want to build an entity graph from this data,’ there’s a little bit of extra cost. You’re doing some analysis on that text, or you’re running LLMs, you’re using up tokens. I would say that’s probably the biggest limitation: managing cost and scale. Because if you have a tonne of data, maybe you want to pick and choose the data that you want to create a knowledge graph from.

But there are other solutions. If you’re not assuming cloud hosted, like an open API, you can build a lot of this more in your data centre with local models and things like that, and keep the cost down that way.

What are the problems that the development community behind GraphRAG are currently working on?

The biggest thing I’ve seen is that there’s not really any standardisation on what GraphRAG means. I think everybody has their own interpretation of it. There was a Microsoft paper that drew a line in the sand about it, and a lot of what they described we were already doing. We previously hadn’t really talked about it in that way, about putting a definition to it.

But I think we’ll start to see a bit more coalescing. Some people don’t agree with this, but I think RAG has standardised a lot. There are still knobs you can turn, like the usury ranking or not, those kinds of things. But the concept of RAG I think is stabilised now. We’re not there yet with GraphRAG, and so I think we’ll have to go through a wave of trying things, seeing what works and what doesn’t, before that settles into a pattern.

How effective are Knowledge Graphs in reducing hallucinations in LLMs?

The grounding concept of providing sources for the LLM to pull from is really what’s at the heart of creating an accurate RAG algorithm. And the ability to have graphs and pull in more context, that feeds into providing that extra accuracy. It’s something speculative and prototyped at this point, but what we’ve seen is that you can wind down your amount of hallucinations by proper grounding, by giving good content sources, even by some prompt engineering, to really have the LLM focus on the content you’re giving it, not what it’s been trained on. So I think the GraphRAG really feeds into this idea that you’re providing it more context, giving it more data to chew on that isn’t in its training set. And that should minimise the amount of hallucinations because you’re giving it a wider set of context.

What are the developments or advancements that you see coming with knowledge graphs, or GraphRAG that you’re most excited about?

We’ve seen a swell of people talking about it on Twitter, on Reddit, and more papers coming out. So you can see the momentum around awareness and implementation, though a lot of what’s out there is still in the demo phase. I think we’ve been really focused on the ingestion path and constructing the knowledge graph. We’ll be releasing something more formal from a GraphRAG feature very shortly.

But what we’re seeing in the market is that as people’s awareness grows, we’ll start to see more people talking about it and implementing it. But it brings up a whole other question of evals: how do you even evaluate that this is better? Some companies have been doing RAG evals. I don’t know how well they’ll apply to GraphRAG evals and those kinds of things, that can be tricky. But that could open up a whole new market for products or companies to help with that area. So evals aren’t going away, they’re only going to become more important.

And getting into the future of where the RAG pattern can be useful, we really see two areas: repurposing your content that you’ve ingested is one of them. So, using the retrieval part of this, to find interesting information in a large dataset might be good for marketing materials, for technical reports. Being able to give a rough outline of what you want to pull from, or create, like a blog post or report, and use the graph to fill in the details. And that graph could be constructed from other structured data, or unstructured data. It’s a really interesting area, where we can use this pattern with a really rich data set to create really high quality content.

Then the other is the agent concept that’s starting to be talked about. I think there are some open source projects that are working on this. There’s also a lot of past work around actor models and distributed architectures and things like that. That’s a pattern that can apply, where the RAG concept is really just a set of functionality that gets called from the agent and can feed the output of the agent into the input of another agent.

But from our perspective, we see them as two different layers, where the RAG is more analogous to a database query, where there’s some input or some output, and then you have a programming system on top of it for asynchronous agents, and those kinds of things.

So we’ll see how it evolves. I think other people have different perspectives; maybe they’re more integrated. But I see it more as a kind of workflow layer: a graph workflow, and the RAG is this functionality that fits in underneath it.