GraphRAG: Microsoft's New Modular RAG System

GraphRAG, developed by Microsoft Research, extends Retrieval-Augmented Generation (RAG) for Large Language Models (LLMs). It combines knowledge graphs with traditional RAG techniques so that LLMs can work more effectively with complex, domain-specific information.

Understanding GraphRAG

GraphRAG builds upon traditional RAG systems by incorporating structured knowledge graphs as a source of interconnected information. This approach aims to address two key limitations of baseline RAG:

  1. The inability to "connect the dots" across disparate pieces of information.
  2. Poor performance when asked to holistically understand summarized semantic concepts over large data collections.

The system is built from the following components:

  1. Knowledge Graph: A structured representation of entities, their attributes, and relationships. This forms the core of GraphRAG's enhanced context understanding.
  2. Vector Database: Stores embeddings of text chunks and graph entities for efficient retrieval. This enables fast similarity-based searches.
  3. Large Language Model (LLM): Processes user queries and generates responses based on retrieved context. GraphRAG typically uses advanced LLMs like GPT-4 or similar models.
  4. Graph-based Retrieval Mechanism: Utilizes the graph structure for more nuanced and context-aware information retrieval. This includes:
     • Vector similarity search
     • Graph traversal
     • Community-based retrieval
  5. Hierarchical Clustering: Uses techniques like the Leiden algorithm to organize information into communities and subcommunities, facilitating multi-level summarization.
  6. Community Summarization: Generates summaries for each community in the graph, providing a holistic understanding of the dataset.
  7. Query Processing Component: Handles user queries and orchestrates the interaction between various components of the GraphRAG system.
  8. Prompt Engineering Module

Technical Implementation of GraphRAG

Graph Extraction and Construction

GraphRAG uses LLM-driven extraction to pull entities and relationships out of unstructured text. This process involves:

Text Chunking: The input corpus is divided into analyzable units called TextUnits.
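As a rough illustration of the chunking step, the sketch below splits a document into overlapping word windows. The `TextUnit` dataclass, the `chunk_text` function, and the word-based `chunk_size` and `overlap` parameters are hypothetical names chosen for this example; the article does not specify how GraphRAG sizes its TextUnits, and a real pipeline would more likely count model tokens than words.

```python
from dataclasses import dataclass


@dataclass
class TextUnit:
    """Hypothetical container for one chunk of the corpus."""
    unit_id: int
    text: str


def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[TextUnit]:
    """Split text into overlapping windows of whitespace-separated words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    units: list[TextUnit] = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        units.append(TextUnit(unit_id=len(units), text=" ".join(window)))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return units
```

The overlap keeps an entity mention that straddles a boundary visible in both neighbouring units, at the cost of some duplicated extraction work.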

Entity and Relationship Extraction: An LLM (such as GPT-4 Turbo) is used to identify entities, relationships, and key claims from each TextUnit. This forms the basis of the knowledge graph.

Graph Construction: Extracted information is used to build a structured knowledge graph, where nodes represent entities and edges represent relationships.
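To make the node-and-edge structure concrete, here is a minimal sketch that turns extracted (source, relation, target) triples into an undirected graph. The triple format and the `build_graph` helper are assumptions for illustration; the article does not describe GraphRAG's internal graph representation.

```python
from collections import defaultdict


def build_graph(triples):
    """Build an undirected graph from (source, relation, target) triples.

    Returns (nodes, edges, adjacency), where edges maps a sorted node pair
    to the list of relation labels seen for that pair.
    """
    nodes = set()
    edges = defaultdict(list)
    for source, relation, target in triples:
        nodes.update((source, target))
        # Sort the pair so (A, B) and (B, A) share one edge entry.
        key = tuple(sorted((source, target)))
        edges[key].append(relation)

    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return nodes, dict(edges), {n: set(adjacency[n]) for n in nodes}
```

Merging repeated mentions of the same pair into one edge is a design choice of this sketch: it keeps every relation label so that a later summarization step can see all of them.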

Hierarchical Clustering: The Leiden algorithm is applied to perform community detection within the graph. This process helps in:

  • Identifying closely related entities and concepts
  • Creating a hierarchical structure of information
  • Facilitating the generation of multi-level summaries
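Community detection algorithms such as Leiden search for a partition that scores well on a quality function. The article does not say which function GraphRAG uses; modularity is a common choice and is assumed here. The sketch below is not the Leiden algorithm itself. It only computes the modularity of a given partition, which is the kind of score such an algorithm tries to increase. The `modularity` function and its arguments are hypothetical names.

```python
def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities c of [ e_c / m - (d_c / (2m))^2 ], where m is
    the number of edges, e_c the number of edges inside c, and d_c the sum
    of the degrees of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}
    degree_sum = {}
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )
```

For two triangles joined by a single bridge edge, splitting the graph at the bridge scores higher than placing every node in one community, which matches the intuition of "closely related entities" grouped together.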

Community Summarization

Once the graph is constructed and clustered, GraphRAG generates summaries for each community:

Bottom-Up Approach: Summaries are generated starting from the lowest level of the hierarchy and moving upwards.

LLM-Based Summarization: An LLM is prompted to create concise summaries for each community, capturing the essence of the entities and relationships within.

Hierarchical Integration: Higher-level summaries incorporate information from lower-level summaries, creating a coherent overview of the entire dataset.
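The bottom-up flow can be sketched as a post-order traversal of the community hierarchy. In the example below, `summarize_hierarchy`, the `children` and `members` mappings, and `toy_summarize` are all hypothetical; in particular, `toy_summarize` simply joins its inputs and stands in for a real LLM call.

```python
def summarize_hierarchy(children, members, summarize, root):
    """Summarize communities bottom-up.

    children:  community id -> list of sub-community ids
    members:   community id -> list of entity/relationship descriptions
    summarize: callable taking a list of strings, returning one string
    """
    summaries = {}

    def visit(community):
        # Summarize sub-communities first so their summaries can feed the parent.
        child_summaries = [visit(child) for child in children.get(community, [])]
        inputs = list(members.get(community, [])) + child_summaries
        summaries[community] = summarize(inputs)
        return summaries[community]

    visit(root)
    return summaries


def toy_summarize(inputs):
    """Stand-in for an LLM summarization call (assumption for this sketch)."""
    return " | ".join(inputs)
```

Because each parent only sees its children's summaries rather than their raw members, the amount of text passed to the summarizer stays bounded as the hierarchy grows.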

Vector Embeddings

Both text chunks and graph entities are embedded into a high-dimensional vector space:

Text Embeddings: A transformer-based embedding model creates dense vector representations of text chunks. Encoders such as BERT or RoBERTa are one option; hosted embedding APIs are another.

Graph Embeddings: Graph embedding techniques (for example Node2Vec or graph neural networks) capture the structural position of entities within the knowledge graph.

Storage: These embeddings are stored in a vector database for efficient similarity-based retrieval.
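A vector database can be approximated, for illustration only, by an in-memory store that ranks items by cosine similarity. The `InMemoryVectorStore` class below is a hypothetical stand-in, not the store GraphRAG ships with, and the two-dimensional vectors in the usage note are toy values rather than real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Brute-force similarity search over a small set of vectors."""

    def __init__(self):
        self._items = {}

    def add(self, item_id, vector):
        self._items[item_id] = list(vector)

    def search(self, query, k=3):
        """Return the k (item_id, score) pairs most similar to the query."""
        scored = [
            (item_id, cosine_similarity(query, vector))
            for item_id, vector in self._items.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]
```

After adding `"a": [1, 0]`, `"b": [0, 1]`, and `"c": [1, 1]`, a search for `[1, 0]` with `k=2` returns `"a"` first and `"c"` second. A production system would replace the linear scan with an approximate nearest-neighbour index.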

Retrieval Mechanisms

GraphRAG employs a multi-faceted retrieval approach:

Vector Similarity Search: Used for finding relevant text chunks and entities based on semantic similarity.

Graph Traversal: Explores related concepts by following edges in the knowledge graph, allowing for more contextual information retrieval.

Community-based Retrieval: Leverages the hierarchical structure to provide broader context, especially useful for global queries.

Hybrid Ranking: Combines scores from different retrieval methods to rank the most relevant information for a given query.
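The article does not say how scores from the different retrieval methods are combined. One simple possibility, shown here purely as an assumption, is to min-max normalize each method's scores and take a weighted sum. The `hybrid_rank` function, the method names, and the weights are all hypothetical.

```python
def normalize(scores):
    """Min-max normalize a dict of scores into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 1.0 for item in scores}
    return {item: (value - lo) / (hi - lo) for item, value in scores.items()}


def hybrid_rank(score_sets, weights):
    """Combine per-method scores into one ranking.

    score_sets: method name -> {item id: raw score}
    weights:    method name -> weight applied to that method's normalized scores
    """
    combined = {}
    for method, scores in score_sets.items():
        weight = weights.get(method, 0.0)
        for item, score in normalize(scores).items():
            combined[item] = combined.get(item, 0.0) + weight * score
    # Highest combined score first; ties broken by item id for determinism.
    return sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))
```

Normalizing first matters because raw scores from different methods (cosine similarities, hop counts, community ranks) are not on a common scale. An item missing from one method simply receives no contribution from it.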

Query Processing

GraphRAG offers two primary query modes:

Global Search:

  • Designed for holistic questions about the entire corpus.
  • Utilizes community summaries at various levels of the hierarchy.
  • Generates partial responses for each relevant community.
  • Synthesizes a final response by summarizing all partial responses.
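The global search steps above resemble a map-reduce pattern: answer the question once per community summary, then combine the partial answers. The sketch below assumes each partial answer carries a relevance score used for filtering and ordering; that scoring step, along with `global_search`, `toy_partial`, and `toy_combine`, is hypothetical, and the two toy functions stand in for LLM calls.

```python
def global_search(question, community_summaries, answer_partial, combine, top_n=3):
    """Map: one partial answer per community. Reduce: combine the best ones."""
    partials = []
    for community_id, summary in community_summaries.items():
        text, score = answer_partial(question, summary)
        if score > 0:  # drop communities judged irrelevant
            partials.append((score, community_id, text))
    # Most relevant first; ties broken by community id for determinism.
    partials.sort(key=lambda p: (-p[0], p[1]))
    kept = [text for _, _, text in partials[:top_n]]
    return combine(question, kept)


def toy_partial(question, summary):
    """Stand-in for an LLM call: score by word overlap with the question."""
    question_words = set(question.lower().split())
    score = sum(1 for word in summary.lower().split() if word in question_words)
    return summary, score


def toy_combine(question, partial_answers):
    """Stand-in for the final LLM synthesis step."""
    return " ".join(partial_answers)
```

Since each map call only sees one community summary, the calls are independent and could run in parallel; only the final combine step needs all kept partial answers in one prompt.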

Local Search:

  • Optimized for questions about specific entities or localized information.
  • Starts from relevant entities and expands to neighboring nodes in the graph.
  • Retrieves detailed information about specific entities and their immediate context.
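Expanding from seed entities to their neighbours can be sketched as a breadth-first search with a hop limit. The `expand_neighborhood` function, the adjacency-dict format, and the `max_hops` parameter are assumptions for this example; the article does not state how far local search expands.

```python
from collections import deque


def expand_neighborhood(adjacency, seeds, max_hops=1):
    """Return {node: hop distance} for nodes within max_hops of any seed."""
    distance = {seed: 0 for seed in seeds if seed in adjacency}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance
```

The hop distance returned for each node can double as a simple relevance signal: entities one hop from a seed are usually better context candidates than entities two hops away.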

Prompt Engineering and LLM Integration

Effective prompt engineering is crucial for GraphRAG:

Context Integration: Retrieved graph information, community summaries, and text chunks are coherently incorporated into the prompt.

Instruction Clarity: Clear instructions are provided to the LLM on how to use the graph-based context and community summaries.

Balancing Act: Prompts are designed to provide sufficient context without overwhelming the LLM's context window.

Query-Specific Prompting: Different prompt structures are used for global vs. local searches to optimize the LLM's reasoning process.

Prompt Tuning: The system includes capabilities for fine-tuning prompts to specific domains or datasets, improving performance over time.
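The trade-off between supplying enough context and staying within the context window can be illustrated with a greedy packing routine. Everything here is an assumption for the sake of the sketch: the `pack_context` function, the (priority, label, text) section format, and the use of whitespace word counts as a crude proxy for model tokens.

```python
def pack_context(sections, max_tokens, count_tokens=lambda text: len(text.split())):
    """Greedily add context sections, most important first, within a budget.

    sections: list of (priority, label, text); a lower priority number
              means the section is more important.
    Returns (prompt_context, tokens_used).
    """
    chosen = []
    used = 0
    for _priority, label, text in sorted(sections, key=lambda s: s[0]):
        cost = count_tokens(text)
        if used + cost > max_tokens:
            continue  # skip sections that would overflow the budget
        chosen.append(f"[{label}]\n{text}")
        used += cost
    return "\n\n".join(chosen), used
```

A global query might give community summaries the best priority, while a local query might favour entity descriptions and source text chunks; swapping priorities is enough to get query-specific prompts out of the same routine.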

Performance and Evaluation of GraphRAG

Microsoft Research evaluated GraphRAG against baseline RAG and hierarchical source-text summarization.

Evaluation Metrics

  1. Comprehensiveness: Measures how thoroughly the generated answers cover all aspects of the question.
  2. Diversity: Assesses the variety of perspectives provided in the answers.
  3. Empowerment: Evaluates how well the answers support informed decision-making.


Key Results

  • GraphRAG outperformed naive RAG on comprehensiveness and diversity, with a win rate of roughly 70-80%.
  • When using intermediate and low-level community summaries, GraphRAG performed better than hierarchical source-text summarization while using roughly 20-70% of the tokens per query.
  • For high-level community summaries, GraphRAG was competitive with hierarchical source-text summarization while using only 2-3% of the tokens per query.

These results demonstrate GraphRAG's ability to provide more comprehensive and diverse answers, especially for complex, global queries about large datasets.
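A win rate in a head-to-head evaluation is the share of comparisons in which one system's answer is preferred. The sketch below shows the arithmetic; the `win_rate` function is hypothetical, and counting a tie as half a win is an assumption of this example rather than something the article specifies.

```python
def win_rate(outcomes, system):
    """Fraction of pairwise comparisons won by `system`.

    outcomes: one label per compared question, naming the preferred
              system, or "tie" when neither answer was preferred.
    """
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    wins = sum(
        1.0 if outcome == system else 0.5 if outcome == "tie" else 0.0
        for outcome in outcomes
    )
    return wins / len(outcomes)
```

With ten made-up judgements (seven for one system, two for the other, one tie), the first system's win rate under this convention is 0.75. These labels are illustrative only and are not taken from the evaluation described above.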

Challenges and Considerations

While GraphRAG offers significant advantages, there are challenges to consider:

Computational Overhead: The process of graph construction, embedding generation, and hierarchical summarization can be computationally intensive.

Graph Quality: The effectiveness of GraphRAG heavily depends on the quality of the extracted knowledge graph and generated summaries.

Domain Adaptation: Fine-tuning the system for specific domains may require expertise and iterative refinement.

Scalability: As datasets grow, maintaining efficient graph structures and retrieval mechanisms becomes more challenging.

Privacy and Security: Handling sensitive information in knowledge graphs requires robust security measures, especially when using external LLM services.

Future Directions

The GraphRAG team at Microsoft Research is actively exploring several avenues for improvement:

Cost Reduction: Investigating methods to reduce the upfront costs of graph index construction while maintaining response quality.

Automatic Prompt Tuning: Developing techniques to automatically adapt extraction prompts to specific problem domains.

Approximation Techniques: Exploring NLP-based approaches to approximate knowledge graphs and community summaries for faster evaluation and deployment.

Multi-Modal Integration: Extending GraphRAG to incorporate image, video, and audio data into the graph structure.

Federated GraphRAG: Enabling collaborative learning and querying across distributed knowledge graphs.

Explainable AI: Enhancing the interpretability of GraphRAG systems, leveraging the graph structure for better explanations of generated responses.

By open-sourcing GraphRAG and providing a solution accelerator, Microsoft aims to foster community engagement and drive further innovations in graph-based RAG approaches. This collaborative effort has the potential to significantly advance the field of AI-powered information retrieval and question-answering systems, particularly for complex, domain-specific datasets.
