Memgraph
Memgraph is an open-source graph database, tuned for dynamic analytics environments and compatible with Neo4j. To query the database, Memgraph uses Cypher - the most widely adopted, fully-specified, and open query language for property graph databases.
This notebook will show you how to query Memgraph with natural language and how to construct a knowledge graph from your unstructured data.
But first, make sure to set everything up.
Setting up
To go over this guide, you will need Docker and Python 3.x installed.
To quickly run Memgraph Platform (Memgraph database + MAGE library + Memgraph Lab) for the first time, do the following:
On Linux/macOS:
curl https://install.memgraph.com | sh
On Windows:
iwr https://windows.memgraph.com | iex
Both commands run a script that downloads a Docker Compose file to your system, then builds and starts the memgraph-mage and memgraph-lab Docker services in two separate containers. Now you have Memgraph up and running! Read more about the installation process in the Memgraph documentation.
To use LangChain, install and import all the necessary packages. We'll use the package manager pip, along with the --user flag, to ensure proper permissions. If you've installed Python 3.4 or a later version, pip is included by default. You can install all the required packages using the following command:
pip install langchain langchain-community langchain-openai neo4j --user
You can either run the provided code blocks in this notebook or use a separate Python file to experiment with Memgraph and LangChain.
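Before going further, it can help to confirm that the packages are visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical, and the module list is taken from the imports used later in this notebook.

```python
import importlib.util

# Import names (not pip names) used by the code in this notebook.
REQUIRED_MODULES = ["langchain", "langchain_community", "langchain_openai", "neo4j"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that this interpreter cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# An empty list means every required package was found.
print(missing_modules())
```

If the list is not empty, rerun the pip command above with the same Python interpreter that runs the notebook.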

Natural language querying
Memgraph's integration with LangChain includes natural language querying. To use it, start with the necessary imports; we will discuss them as they appear in the code.
After the imports, instantiate MemgraphGraph. This object holds the connection to the running Memgraph instance. Make sure to set up all the environment variables properly.
import os
from langchain_community.chains.graph_qa.memgraph import MemgraphQAChain
from langchain_community.graphs import MemgraphGraph
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
url = os.environ.get("MEMGRAPH_URI", "bolt://localhost:7687")
username = os.environ.get("MEMGRAPH_USERNAME", "")
password = os.environ.get("MEMGRAPH_PASSWORD", "")
graph = MemgraphGraph(
    url=url, username=username, password=password, refresh_schema=False
)
The refresh_schema argument is initially set to False because the database does not contain any data yet and we want to avoid unnecessary database calls.
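The environment lookup above can be gathered into one small helper, which makes it easy to see which values will actually be used. This is only a sketch: the function name memgraph_settings and the host memgraph.example are hypothetical, while the variable names and defaults are the ones used in this notebook.

```python
import os


def memgraph_settings(env=None):
    """Collect connection settings, falling back to the notebook's defaults."""
    env = os.environ if env is None else env
    return {
        "url": env.get("MEMGRAPH_URI", "bolt://localhost:7687"),
        "username": env.get("MEMGRAPH_USERNAME", ""),
        "password": env.get("MEMGRAPH_PASSWORD", ""),
    }


# Passing a plain dict instead of os.environ shows how an override behaves.
settings = memgraph_settings({"MEMGRAPH_URI": "bolt://memgraph.example:7687"})

# The result can be unpacked into the constructor used in this notebook:
# graph = MemgraphGraph(**settings, refresh_schema=False)
```

Calling memgraph_settings() with no argument reads from the real environment, so an unset variable silently falls back to the local default.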
Populating the database
To populate the database, first make sure it's empty. The most efficient way to do that is to switch to the in-memory analytical storage mode, drop the graph and go back to the in-memory transactional mode. Learn more about Memgraph's storage modes.
The data we'll add to the database is about video games of different genres available on various platforms and related to publishers.
# Drop graph
graph.query("STORAGE MODE IN_MEMORY_ANALYTICAL")
graph.query("DROP GRAPH")
graph.query("STORAGE MODE IN_MEMORY_TRANSACTIONAL")
# Creating and executing the seeding query
query = """
MERGE (g:Game {name: "Baldur's Gate 3"})
WITH g, ["PlayStation 5", "Mac OS", "Windows", "Xbox Series X/S"] AS platforms,
        ["Adventure", "Role-Playing Game", "Strategy"] AS genres
FOREACH (platform IN platforms |
    MERGE (p:Platform {name: platform})
    MERGE (g)-[:AVAILABLE_ON]->(p)
)
FOREACH (genre IN genres |
    MERGE (gn:Genre {name: genre})
    MERGE (g)-[:HAS_GENRE]->(gn)
)
MERGE (p:Publisher {name: "Larian Studios"})
MERGE (g)-[:PUBLISHED_BY]->(p);
"""
graph.query(query)
[]
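The empty list is expected, since the seeding query returns no rows. To load more than one game, the same statement can be written once with query parameters instead of literals. The sketch below assumes that query accepts a parameter dictionary as its second argument, as the Neo4j-style graph wrappers in langchain_community do; seed_game and RecordingGraph are hypothetical names, and RecordingGraph is a stand-in that records calls so the example runs without a database.

```python
# Same structure as the seeding query above, with literals replaced by parameters.
SEED_GAME = """
MERGE (g:Game {name: $name})
WITH g, $platforms AS platforms, $genres AS genres
FOREACH (platform IN platforms |
    MERGE (p:Platform {name: platform})
    MERGE (g)-[:AVAILABLE_ON]->(p)
)
FOREACH (genre IN genres |
    MERGE (gn:Genre {name: genre})
    MERGE (g)-[:HAS_GENRE]->(gn)
)
MERGE (p:Publisher {name: $publisher})
MERGE (g)-[:PUBLISHED_BY]->(p);
"""


def seed_game(graph, name, platforms, genres, publisher):
    """Run the seeding statement for one game via graph.query(query, params)."""
    params = {
        "name": name,
        "platforms": list(platforms),
        "genres": list(genres),
        "publisher": publisher,
    }
    return graph.query(SEED_GAME, params)


class RecordingGraph:
    """Stand-in for MemgraphGraph: records calls instead of contacting a database."""

    def __init__(self):
        self.calls = []

    def query(self, query, params=None):
        self.calls.append((query, params or {}))
        return []


demo = RecordingGraph()
seed_game(
    demo,
    "Baldur's Gate 3",
    ["PlayStation 5", "Mac OS", "Windows", "Xbox Series X/S"],
    ["Adventure", "Role-Playing Game", "Strategy"],
    "Larian Studios",
)
```

With a live connection, passing the graph object created earlier in place of demo would send the statement to Memgraph.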
Notice how the graph object holds the query method. That method executes a query in Memgraph and is also used by MemgraphQAChain to query the database.
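Because every statement goes through query, the three-step drop sequence used earlier can be wrapped in one function. The sketch below is one possible arrangement, not part of the LangChain API: reset_graph is a hypothetical helper that uses try/finally so the database returns to transactional mode even if DROP GRAPH fails, and RecordingGraph is a stand-in object used only to show the order of calls.

```python
def reset_graph(graph):
    """Drop all data, always switching back to transactional storage mode."""
    graph.query("STORAGE MODE IN_MEMORY_ANALYTICAL")
    try:
        graph.query("DROP GRAPH")
    finally:
        # Runs whether or not DROP GRAPH raised an exception.
        graph.query("STORAGE MODE IN_MEMORY_TRANSACTIONAL")


class RecordingGraph:
    """Stand-in for MemgraphGraph: records statements, optionally failing on one."""

    def __init__(self, fail_on=None):
        self.statements = []
        self.fail_on = fail_on

    def query(self, query):
        self.statements.append(query)
        if query == self.fail_on:
            raise RuntimeError("simulated failure")
        return []


ok = RecordingGraph()
reset_graph(ok)

failing = RecordingGraph(fail_on="DROP GRAPH")
try:
    reset_graph(failing)
except RuntimeError:
    pass  # The error still propagates; only the mode switch is guaranteed.
```

In both runs the last recorded statement is the switch back to IN_MEMORY_TRANSACTIONAL, which is the property the helper is meant to guarantee.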