Neo4j

Tags: nosql, graph-database, cypher, knowledge-graph, connected-data

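Overview

Neo4j is a widely used graph database that combines native graph storage with the Cypher query language to process highly connected data efficiently. Data is stored as nodes, relationships, and properties, which suits social networks, recommendation engines, knowledge graphs, and fraud detection. It supports ACID transactions; clustering is available in the Enterprise Edition, and graph algorithms are provided by the Graph Data Science (GDS) library plugin. The Community Edition is free and open source.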
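To make the node / relationship / property model concrete, here is a minimal in-memory sketch in plain Python. The `Node` and `Relationship` classes are illustrative names only; they are not part of Neo4j or its driver, and they say nothing about how Neo4j stores data on disk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node carries labels and a property map."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A relationship has a type, a direction (start -> end), and its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data, mirroring the Hello World example later in this page.
student = Node({"Person"}, {"name": "张三", "age": 21})
course = Node({"Course"}, {"name": "数据库原理", "credits": 4})
enrolled = Relationship(student, "ENROLLED_IN", course, {"semester": "2024-秋季"})

print(enrolled.start.properties["name"],
      f"-[:{enrolled.type}]->",
      enrolled.end.properties["name"])
```

The point of the sketch is that relationships are first-class records with their own properties, not rows in a join table.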

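Installation

Neo4j Installation Guide

1. Prerequisites

Requirement        Details
Operating system   Linux, macOS, Windows
Java runtime       OpenJDK 17 (Neo4j 5.x); Neo4j 4.x requires Java 11
Memory             2 GB minimum, 8 GB+ recommended
Disk               SSD recommended, at least 2 GB free
Ports              7474 (HTTP), 7687 (Bolt protocol)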

2. Installation commands

Docker (recommended)

# Start Neo4j Community Edition
docker run -d --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password123 \
  -e NEO4J_PLUGINS='["graph-data-science","apoc"]' \
  -v neo4j_data:/data \
  neo4j:latest

# Open the browser UI
# http://localhost:7474
# Username: neo4j, password: password123

Ubuntu/Debian

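# Install Java
sudo apt update && sudo apt install -y openjdk-17-jdk

# Add the Neo4j repository (apt-key is deprecated, so the key goes into a keyring file)
wget -O - https://debian.neo4j.com/neotechnology.gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/neotechnology.gpg
echo 'deb [signed-by=/etc/apt/keyrings/neotechnology.gpg] https://debian.neo4j.com stable 5' | sudo tee /etc/apt/sources.list.d/neo4j.list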

sudo apt update
sudo apt install -y neo4j

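# Start the service
sudo systemctl start neo4j
sudo systemctl enable neo4j

# Set the password (first run)
cypher-shell
# Log in with the default username neo4j and password neo4j, then set a new password when prompted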

macOS

brew install neo4j
brew services start neo4j
# Open http://localhost:7474

Windows (installer)

Download the .exe installer from https://neo4j.com/download/, follow the setup wizard, then start Neo4j Desktop.

3. Common installation problems

Q1: Port 7474/7687 is already in use

Change the listen addresses in conf/neo4j.conf:

server.http.listen_address=0.0.0.0:7475
server.bolt.listen_address=0.0.0.0:7688
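
Before editing the configuration it can help to confirm which port is actually taken. The following is a small sketch using only the Python standard library; `port_in_use` is a helper name introduced here for illustration, and it only checks whether something accepts TCP connections on the local host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


# Default Neo4j ports from the prerequisites table: HTTP and Bolt
for port in (7474, 7687):
    state = "in use" if port_in_use(port) else "free"
    print(f"port {port}: {state}")
```

If a port is reported as in use while Neo4j is stopped, another process holds it and the alternative ports above (7475, 7688) are one way around the conflict.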

Q2: The "Max open files" limit is too low

On Linux, raise the file descriptor limit:

sudo vim /etc/security/limits.conf
# Add:
neo4j  -  nofile  60000
neo4j  -  nproc   60000

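Q3: Changing the password in Docker

# Option 1: set it with an environment variable (only applies on first start with an empty data volume)
-e NEO4J_AUTH=neo4j/mypassword

# Option 2: change it in the running container with cypher-shell
# (neo4j-admin dbms set-initial-password only works before the database has first started)
docker exec -it neo4j cypher-shell -u neo4j -p password123 -d system \
  "ALTER CURRENT USER SET PASSWORD FROM 'password123' TO 'mypassword'"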

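Q4: Community Edition vs Enterprise Edition

The Community Edition is free and supports single-node deployment, Cypher queries, and the APOC plugin. The Enterprise Edition adds clustering, multiple databases, advanced security, and more.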

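Example

Neo4j Hello World: Getting Started with Cypher Graph Queries

Goal

Create a first graph database and use Cypher to create nodes, create relationships, and run queries, to get a feel for how intuitive the graph data model is.

Complete code

Run the following in Neo4j Browser (http://localhost:7474):

1. Create person nodes

// Person nodes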
CREATE (:Person {name: '张三', age: 21, major: '计算机科学'});
CREATE (:Person {name: '李四', age: 22, major: '数学'});
CREATE (:Person {name: '王五', age: 20, major: '计算机科学'});
CREATE (:Person {name: '赵六', age: 23, major: '物理'});

2. Create course nodes

CREATE (:Course {name: '数据库原理', credits: 4});
CREATE (:Course {name: '算法导论', credits: 3});
CREATE (:Course {name: '线性代数', credits: 3});

3. Create relationships

// MATCH the existing nodes, then CREATE the relationship between them
MATCH (a:Person {name: '张三'}), (c:Course {name: '数据库原理'})
CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c);

MATCH (a:Person {name: '张三'}), (c:Course {name: '算法导论'})
CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c);

MATCH (a:Person {name: '李四'}), (c:Course {name: '线性代数'})
CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c);

MATCH (a:Person {name: '王五'}), (c:Course {name: '数据库原理'})
CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c);

MATCH (a:Person {name: '王五'}), (c:Course {name: '算法导论'})
CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c);

// Friendship
MATCH (a:Person {name: '张三'}), (b:Person {name: '王五'})
CREATE (a)-[:FRIEND_OF {since: 2023}]->(b);

4. Common queries

// Courses a given student is enrolled in
MATCH (p:Person {name: '张三'})-[:ENROLLED_IN]->(c:Course)
RETURN p.name AS 学生, c.name AS 课程, c.credits AS 学分;

// All students enrolled in a given course
MATCH (p:Person)-[:ENROLLED_IN]->(c:Course {name: '数据库原理'})
RETURN p.name AS 学生, p.major AS 专业;

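// Courses my friends take that I do not take yet (the seed of a recommender)
// With the data above this returns no rows: 张三 already takes both of 王五's courses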
MATCH (me:Person {name: '张三'})-[:FRIEND_OF]->(friend:Person)-[:ENROLLED_IN]->(c:Course)
WHERE NOT (me)-[:ENROLLED_IN]->(c)
RETURN friend.name, c.name AS 推荐课程;

// Shortest path from 张三 to 赵六 (no rows with the data above, because 赵六 has no relationships yet)
MATCH path = shortestPath((a:Person {name: '张三'})-[*]-(b:Person {name: '赵六'}))
RETURN path;
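
To show what a shortest-path query computes on this data, here is a plain-Python breadth-first search over the same relationships, treated as undirected the way the `-[*]-` pattern does. It is an illustration of the idea only, not a description of how Neo4j implements shortestPath(); `EDGES` and `shortest_path` are names introduced here.

```python
from collections import deque

# The relationships created in step 3, with direction ignored
EDGES = [
    ("张三", "数据库原理"), ("张三", "算法导论"), ("李四", "线性代数"),
    ("王五", "数据库原理"), ("王五", "算法导论"), ("张三", "王五"),
]


def shortest_path(edges, start, goal):
    """Breadth-first search over undirected edges; returns a list of nodes or None."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # goal is not reachable from start


print(shortest_path(EDGES, "张三", "王五"))  # ['张三', '王五']
print(shortest_path(EDGES, "张三", "赵六"))  # None: 赵六 has no relationships
```

The second call mirrors why the Cypher query above returns nothing until 赵六 is connected to the rest of the graph.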

Python version (Neo4j driver)

# pip install neo4j
from neo4j import GraphDatabase

URI = "bolt://localhost:7687"
AUTH = ("neo4j", "password123")

driver = GraphDatabase.driver(URI, auth=AUTH)

def create_data(tx):
    """Create the nodes and relationships in one transaction."""
    tx.run("""
        CREATE (:Person {name: '张三', age: 21, major: '计算机科学'})
        CREATE (:Person {name: '李四', age: 22, major: '数学'})
        CREATE (:Person {name: '王五', age: 20, major: '计算机科学'})
        CREATE (:Course {name: '数据库原理', credits: 4})
        CREATE (:Course {name: '算法导论', credits: 3})
        CREATE (:Course {name: '线性代数', credits: 3})
    """)
    
    tx.run("""
        MATCH (a:Person {name: '张三'}), (c:Course {name: '数据库原理'})
        CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c)
    """)
    tx.run("""
        MATCH (a:Person {name: '张三'}), (c:Course {name: '算法导论'})
        CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c)
    """)
    tx.run("""
        MATCH (a:Person {name: '王五'}), (c:Course {name: '数据库原理'})
        CREATE (a)-[:ENROLLED_IN {semester: '2024-秋季'}]->(c)
    """)

def query_enrollments(tx):
    result = tx.run("""
        MATCH (p:Person)-[:ENROLLED_IN]->(c:Course)
        RETURN p.name AS 学生, c.name AS 课程
    """)
    for record in result:
        print(f"{record['学生']} -> {record['课程']}")

with driver.session() as session:
    session.execute_write(create_data)
    session.execute_read(query_enrollments)

driver.close()

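Expected output

Python script, run against an empty database (row order may vary):

张三 -> 数据库原理
张三 -> 算法导论
王五 -> 数据库原理

The recommendation query in step 4 returns no rows for 张三 with the Cypher data above, because 张三 is already enrolled in both courses that 王五 takes.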

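Key points

  • (Node)-[:RELATION]->(Node) is the core pattern syntax of Cypher
  • MATCH matches a pattern, RETURN returns the results
  • Graph queries support multi-hop traversal natively (*1..n)
  • The driver's execute_write / execute_read run the function in a managed transaction and retry it on transient failures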
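
The driver example above writes names directly into the query text. A common alternative is to pass values as query parameters. The sketch below shows a transaction function that does this; `courses_of` and `QUERY` are names introduced here, and the function only assumes that `tx.run` accepts keyword parameters and yields records that can be indexed by column name, as the driver's transaction object does.

```python
QUERY = """
MATCH (p:Person {name: $name})-[:ENROLLED_IN]->(c:Course)
RETURN c.name AS course
ORDER BY course
"""


def courses_of(tx, name):
    """Transaction function: return the course names for one student."""
    result = tx.run(QUERY, name=name)  # $name is bound from the keyword argument
    return [record["course"] for record in result]


# Hypothetical usage with the `driver` object from the example above:
# with driver.session() as session:
#     print(session.execute_read(courses_of, "张三"))
```

Parameters keep user input out of the query string and let the server reuse the query plan across calls.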

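Tutorial

Neo4j from Zero to Practice: A Social Network Graph

1. Background and concepts

1.1 Graph databases vs relational databases

In a social network, a "friends of friends" query in SQL needs one self-JOIN per hop, and each join has to find matching rows in the whole friends table (through an index at best). A native graph database stores each node's relationships next to the node (index-free adjacency), so each hop costs time proportional to the number of relationships on that node rather than to the size of the graph.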

SQL:   SELECT f2.target FROM friends f1
       JOIN friends f2 ON f1.target = f2.source
       WHERE f1.source = '张三'

Cypher: MATCH (:Person {name: '张三'})-[:FRIEND]->()-[:FRIEND]->(fof)
        RETURN fof
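
The difference between the two approaches can be sketched in plain Python. `fof_by_join` mimics a self-join over an edge table with no index, and `fof_by_adjacency` follows per-node neighbor lists. The `FRIENDS` data and both function names are hypothetical, and the dictionary only stands in for the per-node relationship records that a graph store maintains.

```python
# Hypothetical directed friend edges (source, target)
FRIENDS = [("张三", "李四"), ("李四", "王五"), ("李四", "赵六"), ("王五", "赵六")]


def fof_by_join(rows, source):
    """Self-join: compare every row with every row, as an unindexed JOIN would."""
    return sorted({t2
                   for s1, t1 in rows
                   for s2, t2 in rows
                   if s1 == source and t1 == s2})


def fof_by_adjacency(rows, source):
    """Follow neighbor lists: work depends on the degree of the visited nodes."""
    out = {}
    for s, t in rows:
        out.setdefault(s, []).append(t)
    return sorted({fof
                   for friend in out.get(source, [])
                   for fof in out.get(friend, [])})


print(fof_by_join(FRIENDS, "张三"))       # ['王五', '赵六']
print(fof_by_adjacency(FRIENDS, "张三"))  # ['王五', '赵六']
```

Both functions return the same answer; the traversal version simply never looks at edges that are unrelated to 张三 or his friends once the neighbor lists exist.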

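1.2 Core Cypher syntax

Syntax                    Meaning
(n:Label {key: value})    Match or create a node
-[r:TYPE]->               Directed relationship
-[*1..3]->                Variable-length path of 1 to 3 hops
shortestPath()            Shortest path between two nodes
collect()                 Aggregate values into a list
size()                    Length of a list or string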
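2. Step-by-step practice: building a campus social graph

Scenario

Model a university social network of students, clubs, and courses, then implement friend recommendations and club discovery.

Step 1: Data modeling

// Create uniqueness constraints (each one also creates an index for fast lookups)
// Assumes an empty database; clear the Hello World data first with: MATCH (n) DETACH DELETE n;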
CREATE CONSTRAINT FOR (p:Person) REQUIRE p.name IS UNIQUE;
CREATE CONSTRAINT FOR (g:Group) REQUIRE g.name IS UNIQUE;
CREATE CONSTRAINT FOR (c:Course) REQUIRE c.name IS UNIQUE;

Step 2: Bulk import

// Students
UNWIND [
    {name: 'Alice', major: 'CS', year: 3},
    {name: 'Bob', major: 'CS', year: 3},
    {name: 'Carol', major: 'Math', year: 2},
    {name: 'Dave', major: 'CS', year: 4},
    {name: 'Eve', major: 'Physics', year: 3},
    {name: 'Frank', major: 'Math', year: 2}
] AS person
CREATE (:Person {name: person.name, major: person.major, year: person.year});

// Clubs
CREATE (:Group {name: '编程社', type: '学术'});
CREATE (:Group {name: '篮球队', type: '体育'});
CREATE (:Group {name: '数学建模', type: '学术'});

// Courses
CREATE (:Course {name: '数据库原理'});
CREATE (:Course {name: '算法导论'});
CREATE (:Course {name: '高等代数'});

// Friendships: stored once with a direction; the queries below ignore direction with -[:FRIEND]-
// MATCH the existing nodes first. A bare CREATE (:Person {...})-[...]->(:Person {...}) would
// create duplicate nodes and violate the uniqueness constraints from step 1.
UNWIND [
    ['Alice', 'Bob'], ['Alice', 'Carol'], ['Bob', 'Dave'],
    ['Carol', 'Eve'], ['Dave', 'Frank'], ['Eve', 'Alice']
] AS pair
MATCH (a:Person {name: pair[0]}), (b:Person {name: pair[1]})
MERGE (a)-[:FRIEND]->(b);

// Club memberships
UNWIND [
    ['Alice', '编程社'], ['Bob', '编程社'], ['Dave', '编程社'],
    ['Carol', '数学建模'], ['Frank', '数学建模']
] AS pair
MATCH (p:Person {name: pair[0]}), (g:Group {name: pair[1]})
MERGE (p)-[:MEMBER_OF]->(g);

// Course enrollments
UNWIND [
    ['Alice', '数据库原理'], ['Bob', '数据库原理'], ['Carol', '高等代数']
] AS pair
MATCH (p:Person {name: pair[0]}), (c:Course {name: pair[1]})
MERGE (p)-[:ENROLLED_IN]->(c);
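
When the data comes from an application rather than from literals typed into Neo4j Browser, the same UNWIND pattern can take its list from a query parameter. The sketch below assumes the Person nodes from this step already exist; `to_rows`, `import_friends`, and `IMPORT_FRIENDS` are names introduced here, and `tx` is expected to be a driver transaction as passed to `session.execute_write`.

```python
FRIEND_PAIRS = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Dave"),
    ("Carol", "Eve"), ("Dave", "Frank"), ("Eve", "Alice"),
]

IMPORT_FRIENDS = """
UNWIND $rows AS row
MATCH (a:Person {name: row.source}), (b:Person {name: row.target})
MERGE (a)-[:FRIEND]->(b)
"""


def to_rows(pairs):
    """Convert (source, target) tuples into the list of maps that UNWIND iterates over."""
    return [{"source": a, "target": b} for a, b in pairs]


def import_friends(tx, pairs):
    """Transaction function: send all pairs in one statement via the $rows parameter."""
    tx.run(IMPORT_FRIENDS, rows=to_rows(pairs))


# Hypothetical usage with an open driver:
# with driver.session() as session:
#     session.execute_write(import_friends, FRIEND_PAIRS)
```

Sending one statement with a list parameter avoids a network round trip per relationship, and MERGE keeps the import safe to run twice.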

Step 3: Recommendation queries

// 1. Second-degree friend recommendation (friends of your friends who are not yet your friends)
MATCH (me:Person {name: 'Alice'})-[:FRIEND*2..2]-(recommendation:Person)
WHERE NOT (me)-[:FRIEND]-(recommendation) AND me <> recommendation
RETURN DISTINCT recommendation.name AS 推荐好友;

// 2. Friend recommendation based on shared clubs
MATCH (me:Person {name: 'Alice'})-[:MEMBER_OF]->(g:Group)<-[:MEMBER_OF]-(other:Person)
WHERE NOT (me)-[:FRIEND]-(other) AND me <> other
WITH other, COUNT(g) AS common_groups
RETURN other.name, common_groups
ORDER BY common_groups DESC;

// 3. Classmates in the same course
MATCH (me:Person {name: 'Alice'})-[:ENROLLED_IN]->(c:Course)<-[:ENROLLED_IN]-(peer:Person)
WHERE NOT (me)-[:FRIEND]-(peer)
RETURN peer.name AS 同课同学, c.name AS 课程;

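// 4. Graph algorithm: PageRank (requires the GDS plugin and a projected graph, e.g.
//    CALL gds.graph.project('social-graph', 'Person', 'FRIEND');)
// CALL gds.pageRank.stream('social-graph') YIELD nodeId, score
// RETURN gds.util.asNode(nodeId).name AS name, score
// ORDER BY score DESC;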
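
To sanity-check query 1 without a running database, the same logic can be mirrored in plain Python over the friend pairs from step 2. This is only a sketch of the set logic behind the pattern (two undirected hops, minus direct friends, minus yourself); `second_degree` is a name introduced here.

```python
# Friend pairs from step 2; direction is ignored, as in -[:FRIEND]-
FRIENDS = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Dave"),
    ("Carol", "Eve"), ("Dave", "Frank"), ("Eve", "Alice"),
]


def second_degree(pairs, me):
    """Return people exactly two hops away who are not already direct friends."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    direct = neighbors.get(me, set())
    candidates = set()
    for friend in direct:
        candidates |= neighbors.get(friend, set())
    return sorted(candidates - direct - {me})


print(second_degree(FRIENDS, "Alice"))  # ['Dave']
```

For Alice the only suggestion is Dave (through Bob), which is the row the Cypher query should return on this data set.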

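3. Exercises

  1. If friendship becomes one-directional (like following on Weibo), how do the recommendation queries above need to change?
  2. For the variable-length path [:FRIEND*2..2], how would you keep the query from exploding if the social network has 1 million nodes?
  3. Design a "people you may know" feature that combines the number of mutual friends, shared clubs, and shared courses into a single score.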
