Building a GraphRAG Agent with Neo4j and Milvus

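In this blog post, we'll look at how to build a GraphRAG agent using the Neo4j graph database together with the Milvus vector database. The agent combines a graph database with vector search to give accurate, relevant answers to user queries. In this example, we'll use LangGraph, Llama 3.1 8B served through Ollama, and GPT-4o.

Traditional Retrieval-Augmented Generation (RAG) systems rely entirely on a vector database to retrieve relevant documents. Our approach goes a step further by integrating Neo4j, which captures the relationships between entities and concepts and so gives a more nuanced understanding of the information. By combining the two technologies, we want to build a more robust and informative RAG system.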

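Building the RAG Agent

Our agent follows three key concepts: routing, fallback mechanisms, and self-correction. These principles are implemented through a series of LangGraph components.

Routing – A dedicated routing mechanism decides, based on the query, whether to use the vector database, the knowledge graph, or a combination of the two.
Fallback – When the initial retrieval comes up short, the agent falls back to a web search using Tavily.
Self-correction – The agent evaluates its own answers and attempts to correct hallucinations or inaccuracies.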
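As a rough illustration of the routing idea, here is a minimal sketch in plain Python. The agent in this post routes with an LLM; the keyword heuristic, the `route_question` name, and the hint words below are hypothetical stand-ins, chosen only to make the three possible outcomes concrete.

```python
from typing import Literal

Route = Literal["vector", "graph", "both"]

# Hypothetical hint words; a real router would ask an LLM to classify the question.
GRAPH_HINTS = ("author", "cites", "related to", "published", "title")
VECTOR_HINTS = ("explain", "how does", "summarize", "what is")


def route_question(question: str) -> Route:
    """Pick a retrieval strategy for a question (toy keyword heuristic)."""
    q = question.lower()
    wants_graph = any(hint in q for hint in GRAPH_HINTS)
    wants_vector = any(hint in q for hint in VECTOR_HINTS)
    if wants_graph and not wants_vector:
        return "graph"
    if wants_vector and not wants_graph:
        return "vector"
    # Ambiguous or mixed questions use both stores.
    return "both"


if __name__ == "__main__":
    print(route_question("Who is the author of the CoMM paper?"))
    print(route_question("Explain agent memory"))
    print(route_question("What paper talks about Multi-Agent?"))
```

Defaulting to "both" when the signal is unclear trades a little extra retrieval work for a lower chance of missing relevant context.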

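Then we have the other components:

Milvus – An open-source, high-performance vector database, used to store and retrieve document chunks. It runs a semantic search to find chunks that are semantically similar to the user query.
Neo4j – Used to build a knowledge graph from the retrieved documents, enriching the context with relationships and entities.
LLM integration – Llama 3.1 8B, a local LLM, generates answers and evaluates the relevance and accuracy of the retrieved information, while GPT-4o generates Cypher, the query language used by Neo4j.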

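GraphRAG Architecture

The architecture of our GraphRAG agent can be pictured as a workflow with several interconnected nodes.

– The agent first analyzes the question to decide on the best retrieval strategy: vector search, graph search, or both.
– Depending on the routing decision, it retrieves relevant documents from Milvus or pulls information from the Neo4j graph.
– The LLM generates an answer from the retrieved context.
– The agent evaluates the generated answer for relevance, accuracy, and possible hallucinations.
– If necessary, when the answer is judged unsatisfactory, the agent can refine its search or try to correct errors.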
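The workflow above can be sketched as a small control loop. This is plain Python rather than LangGraph, and everything in it is an assumption made for illustration: the `AgentState` fields, the `run_agent` name, and the four callables stand in for the real retrievers, web search, and LLM graders. Routing is folded into the `retrieve` callable to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentState:
    question: str
    documents: List[str] = field(default_factory=list)
    answer: str = ""
    attempts: int = 0


def run_agent(
    question: str,
    retrieve: Callable[[str], List[str]],
    web_search: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    is_grounded: Callable[[str, List[str]], bool],
    max_attempts: int = 2,
) -> AgentState:
    """Retrieve, generate, grade, and retry with a web-search fallback."""
    state = AgentState(question=question)
    state.documents = retrieve(question)
    while state.attempts < max_attempts:
        state.attempts += 1
        if not state.documents:
            # Fallback: nothing retrieved, so try the web instead.
            state.documents = web_search(question)
        state.answer = generate(question, state.documents)
        if is_grounded(state.answer, state.documents):
            break
        # Self-correction: drop the context and retry via the fallback.
        state.documents = []
    return state


if __name__ == "__main__":
    result = run_agent(
        "agent memory",
        retrieve=lambda q: [],  # pretend the stores returned nothing
        web_search=lambda q: ["Agents can keep short-term and long-term memory."],
        generate=lambda q, docs: docs[0] if docs else "I don't know.",
        is_grounded=lambda answer, docs: answer in docs,
    )
    print(result.answer, result.attempts)
```

Capping the loop with `max_attempts` keeps a persistently ungrounded answer from retrying forever.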

Agent Examples

To show what our LLM agents can do, let's look at two components: Graph Generation and the Composite Agent.

The full code is available at the bottom of this post, but these snippets should give you a better sense of how these agents work within the LangChain framework.

Graph Generation

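This component is designed to improve the question-answering process by drawing on Neo4j's capabilities. It answers questions using the knowledge graph stored in the Neo4j graph database. Here's how it works:

GraphCypherQAChain – Lets the LLM interact with the Neo4j graph database. The LLM is used in two ways.

cypher_llm – This LLM instance generates Cypher queries that pull the relevant information from the graph based on the user's question.
– The Cypher queries are validated to make sure they are syntactically correct.
– The validated queries are run against the Neo4j graph to retrieve the necessary context.
– The second LLM instance, passed as qa_llm in the code below, uses the retrieved context to generate an answer to the user's question.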

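### Generate Cypher Query
# Import paths are assumptions; they differ between LangChain releases.
from langchain_community.chains.graph_qa.cypher import GraphCypherQAChain
from langchain_community.chat_models import ChatOllama

# `local_llm` (the Ollama model name) and `graph` (the Neo4j graph connection)
# are assumed to be defined elsewhere in the full code.
llm = ChatOllama(model=local_llm, temperature=0)

# Chain
graph_rag_chain = GraphCypherQAChain.from_llm(
    cypher_llm=llm,  # generates the Cypher query
    qa_llm=llm,  # answers from the query results
    validate_cypher=True,  # validate the generated query before running it
    graph=graph,
    verbose=True,
    return_intermediate_steps=True,
    return_direct=True,  # return the raw query results
)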

# Run
question = "agent memory"
generation = graph_rag_chain.invoke({"query": question})

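With this component, the RAG system can draw on Neo4j to give more comprehensive and accurate answers.

Composite Agent: Graph and Vector 🪄

This is where the magic happens! Our agent combines the results from Milvus and Neo4j, which lets it understand the information better and produce more accurate, nuanced answers. Here's how it works:

Prompt – We define a prompt that tells the LLM to use the context from both Milvus and Neo4j to answer the question.
Answer generation – Llama 3.1 8B processes the prompt through the composite chain, drawing on the combined knowledge from the vector and graph databases, and produces a concise answer.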

### Composite Vector + Graph Generations
cypher_prompt = PromptTemplate(
    template="""You are an expert at generating Cypher queries for Neo4j.
    Use the following schema to generate a Cypher query that answers the given question.
    Make the query flexible by using case-insensitive matching and partial string matching where appropriate.
    Focus on searching paper titles as they contain the most relevant information.
    
    Schema:
    {schema}
    
    Question: {question}
    
    Cypher Query:""",
    input_variables=["schema", "question"],
)

# QA prompt
qa_prompt = PromptTemplate(
    template="""You are an assistant for question-answering tasks. 
    Use the following Cypher query results to answer the question. If you don't know the answer, just say that you don't know. 
    Use three sentences maximum and keep the answer concise. If topic information is not available, focus on the paper titles.
    
    Question: {question} 
    Cypher Query: {query}
    Query Results: {context} 
    
    Answer:""",
    input_variables=["question", "query", "context"],
)

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Chain
graph_rag_chain = GraphCypherQAChain.from_llm(
    cypher_llm=llm,
    qa_llm=llm,
    validate_cypher=True,
    graph=graph,
    verbose=True,
    return_intermediate_steps=True,
    return_direct=True,
    cypher_prompt=cypher_prompt,
    qa_prompt=qa_prompt,
)

Let's look at the search results, which combine the strengths of graph and vector databases to improve research paper discovery.

We'll start with a graph search using Neo4j.

# Example input data
question = "What paper talks about Multi-Agent?"
generation = graph_rag_chain.invoke({"query": question})
print(generation)

> Entering new GraphCypherQAChain chain...
Generated Cypher:
cypher
MATCH (p:Paper)
WHERE toLower(p.title) CONTAINS toLower("Multi-Agent")
RETURN p.title AS PaperTitle, p.summary AS Summary, p.url AS URL

> Finished chain.
{'query': 'What paper talks about Multi-Agent?', 'result': [{'PaperTitle': 
'Collaborative Multi-Agent, Multi-Reasoning-Path (CoMM) Prompting Framework', 'Summary': 
'In this work, we aim to push the upper bound of the reasoning capability of LLMs by 
proposing a collaborative multi-agent, multi-reasoning-path (CoMM) prompting framework. 
Specifically, we prompt LLMs to play different roles in a problem-solving team, and 
encourage different role-play agents to collaboratively solve the target task. 
In particular, we discover that applying different reasoning paths for different roles 
is an effective strategy to implement few-shot prompting approaches in the multi-agent 
scenarios. Empirical results demonstrate the effectiveness of the proposed methods on 
two college-level science problems over competitive baselines. Our further analysis shows 
the necessity of prompting LLMs to play different roles or experts independently.', 
'URL': 'https://github.com/amazon-science/comm-prompt'}]

Graph search is very effective at finding relationships and metadata. It can quickly identify papers by title, author, or predefined category, giving a structured view of the data.
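The generated Cypher shown earlier inlines the search term as a string literal. As a minimal sketch, the same kind of case-insensitive title match can be written with a query parameter instead. `build_title_query` is a hypothetical helper, not part of the original code; the `Paper` label and the `title`, `summary`, and `url` properties are taken from the generated query above.

```python
from typing import Dict, Tuple


def build_title_query(term: str) -> Tuple[str, Dict[str, str]]:
    """Return a Cypher string and its parameters for a title search."""
    cypher = (
        "MATCH (p:Paper) "
        "WHERE toLower(p.title) CONTAINS toLower($term) "
        "RETURN p.title AS PaperTitle, p.summary AS Summary, p.url AS URL"
    )
    # The search term travels as a parameter, never as part of the query text.
    return cypher, {"term": term}


if __name__ == "__main__":
    query, params = build_title_query("Multi-Agent")
    print(query)
    print(params)
```

The pair can then be handed to a client that accepts a query plus parameters, for example `session.run(query, params)` in the official Neo4j Python driver. Keeping the term out of the query text avoids quoting problems when the input contains quotes or other special characters.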

Next, let's turn to vector search for a different perspective.

# Example input data
question = "What paper talks about Multi-Agent?"

# Get vector + graph answers
docs = retriever.invoke(question)
vector_context = rag_chain.invoke({"context": docs, "question": question})

> The paper discusses "Adaptive In-conversation Team Building for Language Model Agents" 
and talks about Multi-Agent. It presents a new adaptive team-building paradigm 
that offers a flexible solution for building teams of LLM agents to solve complex 
tasks effectively. The approach, called Captain Agent, dynamically forms and manages 
teams for each step of the task-solving process, utilizing nested group conversations 
and reflection to ensure diverse expertise and prevent stereotypical outputs.

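Vector search is really good at understanding context and semantic similarity. It can surface papers that are conceptually related to the query even when they don't explicitly contain the search terms. Pretty exciting, right?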
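To make "semantic similarity" concrete, here is a small sketch of ranking documents by cosine similarity between embedding vectors. Cosine is one of the metrics Milvus supports; which metric the index in this post uses is not stated, so treat that choice as an assumption. The three-dimensional vectors are made-up toy values, not real embeddings, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]]
) -> List[str]:
    """Return document names ordered from most to least similar."""
    return sorted(
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy vectors: the first two dimensions loosely mean "agents", the last "biology".
    docs = {
        "Protein folding survey": [0.0, 0.1, 0.9],
        "Team building for LLM agents": [0.8, 0.2, 0.1],
    }
    print(rank_by_similarity([0.9, 0.1, 0.0], docs))
```

Because the comparison happens in embedding space, a document can rank first without sharing a single keyword with the query.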

Finally, we combine the two search methods.

This is a key part of our RAG agent, since it lets us use both the vector and graph databases.

composite_chain = prompt | llm | StrOutputParser()
answer = composite_chain.invoke({"question": question, "context": vector_context, "graph_context": graph_context})

print(answer)

> The paper "Collaborative Multi-Agent, Multi-Reasoning-Path (CoMM) Prompting Framework" 
talks about Multi-Agent. It proposes a framework that prompts LLMs to play different roles 
in a problem-solving team and encourages different role-play agents to collaboratively solve 
the target task. The paper presents empirical results demonstrating the effectiveness of the 
proposed methods on two college-level science problems.
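The `prompt` object used in `composite_chain` is not shown in this post. As a hedged sketch of what such a prompt might look like, the template below uses only the three input keys passed to `composite_chain.invoke` (`question`, `context`, `graph_context`); the wording and the `build_composite_prompt` helper are assumptions made for illustration.

```python
# Hypothetical template; only the three placeholder names come from the post.
COMPOSITE_TEMPLATE = """You are an assistant for question-answering tasks.
Use the vector context and the graph context below to answer the question.
If you don't know the answer, just say that you don't know.

Question: {question}
Vector context: {context}
Graph context: {graph_context}

Answer:"""


def build_composite_prompt(question: str, context: str, graph_context: str) -> str:
    """Fill the composite template with the question and both contexts."""
    return COMPOSITE_TEMPLATE.format(
        question=question, context=context, graph_context=graph_context
    )


if __name__ == "__main__":
    print(
        build_composite_prompt(
            "What paper talks about Multi-Agent?",
            "Captain Agent builds adaptive teams of LLM agents.",
            "Collaborative Multi-Agent, Multi-Reasoning-Path (CoMM) Prompting Framework",
        )
    )
```

The same text could be wrapped in a LangChain `PromptTemplate`, as in the earlier snippets, so that it can be piped into the LLM.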

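By integrating graph and vector search, we get the strengths of both approaches. Graph search provides precision and navigates structured relationships, while vector search adds depth through semantic understanding.

This combined approach offers several advantages:

– It finds relevant papers that either method alone might miss.
– It gives a more nuanced understanding of how papers relate to one another.
– It adapts to different types of queries, from specific keyword searches to broader conceptual exploration.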

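Summary

In this blog post, we showed how to build a GraphRAG agent using Neo4j and Milvus. By combining the strengths of a graph database with vector search, the agent can give accurate, relevant answers to user queries.

The architecture of our RAG agent, with its dedicated routing, fallback mechanism, and self-correction, is designed to be robust and reliable. The Graph Generation and Composite Agent examples show how the agent can use both the vector and graph databases to give comprehensive, nuanced answers.

We hope this guide helps you explore the possibilities of combining graph databases and vector search in your own projects.

The current code is available at: .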
