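Neo4j from Zero to Practice: A Social Network Graph
1. Background and Concepts
1.1 Graph Databases vs. Relational Databases
In a social network, finding "friends of friends" in SQL takes one self-JOIN per hop, and the cost of each JOIN grows with the size of the friends table (or its index). A native graph database such as Neo4j stores each relationship as a direct reference between nodes (index-free adjacency), so following one relationship is a constant-time pointer hop, and the cost of a traversal depends on how many relationships it visits rather than on the total size of the graph.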
SQL: SELECT f2.target AS fof
FROM friends f1
JOIN friends f2 ON f1.target = f2.source
WHERE f1.source = '张三'
Cypher: MATCH (:Person {name: '张三'})-[:FRIEND]->()-[:FRIEND]->(fof)
RETURN fof
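1.2 Core Cypher Syntax
| Syntax | Meaning |
|---|---|
| `(n:Label {key: value})` | Match or create a node |
| `-[r:TYPE]->` | Directed relationship |
| `-[*1..3]->` | Variable-length path of 1 to 3 hops |
| `shortestPath()` | Shortest-path search |
| `collect()` | Aggregate values into a list |
| `size()` | Length of a list or string |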
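To see how the constructs in the table combine, here is a minimal sketch. It assumes the campus sample data from section 2 has already been loaded (`Person` nodes connected by `FRIEND` relationships); the names `chain`, `hops`, and `friend_count` are illustrative aliases, not part of the original tutorial.

```cypher
// Shortest friendship chain between two students, at most 3 hops,
// ignoring relationship direction
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Frank'})
MATCH p = shortestPath((a)-[:FRIEND*1..3]-(b))
RETURN [n IN nodes(p) | n.name] AS chain, length(p) AS hops;

// collect() gathers each student's friends into a list,
// and size() measures that list
MATCH (p:Person)-[:FRIEND]-(f:Person)
WITH p, collect(DISTINCT f.name) AS friends
RETURN p.name AS name, friends, size(friends) AS friend_count
ORDER BY friend_count DESC;
```

The first query returns no row if the two students are more than three hops apart, so the upper bound of the variable-length pattern also acts as a search limit.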
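2. Step-by-Step Practice: Building a Campus Social Graph
Scenario
Model a university social network of students, clubs, and courses, then implement friend recommendation and club discovery.
Step 1: Data Modeling
// Create uniqueness constraints (each is backed by an index, so lookups by name are fast)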
CREATE CONSTRAINT FOR (p:Person) REQUIRE p.name IS UNIQUE;
CREATE CONSTRAINT FOR (g:Group) REQUIRE g.name IS UNIQUE;
CREATE CONSTRAINT FOR (c:Course) REQUIRE c.name IS UNIQUE;
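When the script may be run more than once, it can help to name each constraint and make its creation idempotent. The following is a sketch, assuming Neo4j 4.4 or later (where the `REQUIRE` syntax is available); the constraint name `person_name_unique` is a hypothetical choice for illustration.

```cypher
// Named, idempotent variant: does nothing if an equivalent constraint already exists
// (person_name_unique is an illustrative name)
CREATE CONSTRAINT person_name_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE;

// List all constraints to confirm that they are in place
SHOW CONSTRAINTS;
```

With the uniqueness constraint active, any later attempt to `CREATE` a second `Person` with an existing name fails, which is why relationship imports should `MATCH` existing nodes rather than create them again.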
Step 2: Bulk-Importing Data
// Students
UNWIND [
{name: 'Alice', major: 'CS', year: 3},
{name: 'Bob', major: 'CS', year: 3},
{name: 'Carol', major: 'Math', year: 2},
{name: 'Dave', major: 'CS', year: 4},
{name: 'Eve', major: 'Physics', year: 3},
{name: 'Frank', major: 'Math', year: 2}
] AS person
CREATE (:Person {name: person.name, major: person.major, year: person.year});
// Clubs
CREATE (:Group {name: '编程社', type: '学术'});
CREATE (:Group {name: '篮球队', type: '体育'});
CREATE (:Group {name: '数学建模', type: '学术'});
// Courses
CREATE (:Course {name: '数据库原理'});
CREATE (:Course {name: '算法导论'});
CREATE (:Course {name: '高等代数'});
// Friendships: MATCH the existing nodes first. A bare CREATE with inline node
// patterns would try to create new Person nodes and violate the uniqueness constraints.
// Each friendship is stored once; the recommendation queries match FRIEND without a direction.
UNWIND [
  ['Alice', 'Bob'], ['Alice', 'Carol'], ['Bob', 'Dave'],
  ['Carol', 'Eve'], ['Dave', 'Frank'], ['Eve', 'Alice']
] AS pair
MATCH (a:Person {name: pair[0]})
MATCH (b:Person {name: pair[1]})
MERGE (a)-[:FRIEND]->(b);
// Club memberships
UNWIND [
  ['Alice', '编程社'], ['Bob', '编程社'], ['Dave', '编程社'],
  ['Carol', '数学建模'], ['Frank', '数学建模']
] AS pair
MATCH (p:Person {name: pair[0]})
MATCH (g:Group {name: pair[1]})
MERGE (p)-[:MEMBER_OF]->(g);
// Course enrollments
UNWIND [
  ['Alice', '数据库原理'], ['Bob', '数据库原理'], ['Carol', '高等代数']
] AS pair
MATCH (p:Person {name: pair[0]})
MATCH (c:Course {name: pair[1]})
MERGE (p)-[:ENROLLED_IN]->(c);
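Before moving on to the queries, a quick sanity check of the import can catch duplicate or missing nodes. The sketch below simply counts nodes per label and relationships per type; the aliases `label`, `nodes`, `relationship`, and `total` are illustrative.

```cypher
// Node count per label (each node in this model carries exactly one label)
MATCH (n)
RETURN labels(n)[0] AS label, count(*) AS nodes
ORDER BY label;

// Relationship count per type
MATCH ()-[r]->()
RETURN type(r) AS relationship, count(*) AS total
ORDER BY relationship;
```

For the data listed in this step, the intended result is six `Person` nodes, three `Group` nodes, and three `Course` nodes. A larger `Person` count would suggest that the relationship import created new nodes instead of reusing existing ones.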
Step 3: Implementing Recommendation Queries
// 1. Second-degree friend recommendation (friends of your friends who are not yet your friends)
MATCH (me:Person {name: 'Alice'})-[:FRIEND*2..2]-(recommendation:Person)
WHERE NOT (me)-[:FRIEND]-(recommendation) AND me <> recommendation
RETURN DISTINCT recommendation.name AS 推荐好友;
// 2. Club-based friend recommendation, ranked by the number of shared clubs
MATCH (me:Person {name: 'Alice'})-[:MEMBER_OF]->(g:Group)<-[:MEMBER_OF]-(other:Person)
WHERE NOT (me)-[:FRIEND]-(other) AND me <> other
WITH other, count(DISTINCT g) AS common_groups
RETURN other.name, common_groups
ORDER BY common_groups DESC;
// 3. Classmates in the same course who are not yet friends
MATCH (me:Person {name: 'Alice'})-[:ENROLLED_IN]->(c:Course)<-[:ENROLLED_IN]-(peer:Person)
WHERE NOT (me)-[:FRIEND]-(peer)
RETURN peer.name AS 同课同学, c.name AS 课程;
// 4. Graph algorithm: PageRank (requires the GDS plugin and a projected in-memory graph)
// CALL gds.graph.project('social-graph', 'Person', 'FRIEND');
// CALL gds.pageRank.stream('social-graph') YIELD nodeId, score
// RETURN gds.util.asNode(nodeId).name AS name, score
// ORDER BY score DESC;
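If the GDS plugin is not installed, a plain-Cypher degree count can serve as a rough stand-in for spotting well-connected students. This is degree centrality, not PageRank, so it ignores how well connected a student's friends are; the aliases `name` and `degree` are illustrative.

```cypher
// Degree centrality: number of FRIEND relationships per student, in either direction.
// OPTIONAL MATCH keeps students who have no friends yet (degree 0).
MATCH (p:Person)
OPTIONAL MATCH (p)-[r:FRIEND]-()
RETURN p.name AS name, count(r) AS degree
ORDER BY degree DESC, name;
```

On a graph this small the two measures are likely to agree on who ranks highest, but they can diverge on larger networks, where a student with few but highly connected friends scores well under PageRank.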
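3. Thought Questions
- If friendship became one-directional (like following on Weibo), how would the recommendation queries above need to change?
- In the [:FRIEND*2..2] variable-length path, how would you keep the query from exploding if the social network had 1 million nodes?
- Design a "people you may know" feature that combines the number of mutual friends, shared clubs, and shared courses into a single score.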
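As one possible starting point for the last question, the sketch below combines the three signals into a weighted sum. The weights (3, 2, 1), the `LIMIT`, and all aliases are assumptions chosen for illustration, not values from this tutorial, and would need tuning against real data.

```cypher
// Candidate pool: every student who is not me and not already my friend
MATCH (me:Person {name: 'Alice'}), (other:Person)
WHERE other <> me AND NOT (me)-[:FRIEND]-(other)
// Signal 1: mutual friends
OPTIONAL MATCH (me)-[:FRIEND]-(mutual:Person)-[:FRIEND]-(other)
WITH me, other, count(DISTINCT mutual) AS common_friends
// Signal 2: shared clubs
OPTIONAL MATCH (me)-[:MEMBER_OF]->(g:Group)<-[:MEMBER_OF]-(other)
WITH me, other, common_friends, count(DISTINCT g) AS common_groups
// Signal 3: shared courses
OPTIONAL MATCH (me)-[:ENROLLED_IN]->(c:Course)<-[:ENROLLED_IN]-(other)
WITH other, common_friends, common_groups, count(DISTINCT c) AS common_courses
// Weighted sum; the weights are hypothetical
WITH other, common_friends, common_groups, common_courses,
     3 * common_friends + 2 * common_groups + 1 * common_courses AS score
WHERE score > 0
RETURN other.name AS candidate, common_friends, common_groups, common_courses, score
ORDER BY score DESC, candidate
LIMIT 10;
```

Scanning every `Person` as a candidate is fine for a handful of students but would not scale to a large network. There, one option is to draw candidates only from second-degree friends, club co-members, and course co-members, which also bears on the second question above.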