文档
Cassandra Hello World:键空间与表操作
目标
创建第一个 Cassandra 键空间(Keyspace),建表并执行 CRUD 操作,理解 CQL 与 SQL 的异同。
完整代码
在 cqlsh 中依次执行:
-- 1. 创建键空间(指定复制策略)
CREATE KEYSPACE university
WITH replication = {
'class': 'SimpleStrategy',
'replication_factor': 1
};
-- 2. 切换到该键空间
USE university;
-- 3. 创建表(注意 PRIMARY KEY 含分区键 + 聚簇键)
CREATE TABLE students (
department TEXT,
student_id UUID,
name TEXT,
age INT,
email TEXT,
gpa FLOAT,
enrolled_date DATE,
PRIMARY KEY (department, student_id)
);
-- 4. 插入数据(必须提供完整主键)
INSERT INTO students (department, student_id, name, age, email, gpa, enrolled_date)
VALUES ('Computer Science', uuid(), '张三', 21, 'zhangsan@example.com', 3.8, '2024-09-01');
INSERT INTO students (department, student_id, name, age, email, gpa, enrolled_date)
VALUES ('Computer Science', uuid(), '李四', 22, 'lisi@example.com', 3.5, '2024-09-01');
INSERT INTO students (department, student_id, name, age, email, gpa, enrolled_date)
VALUES ('Mathematics', uuid(), '王五', 20, 'wangwu@example.com', 3.9, '2024-09-01');
-- 5. 查询(按分区键查询效率最高)
SELECT * FROM students WHERE department = 'Computer Science';
-- 6. 更新(实际上也是插入,因为 Cassandra 是 upsert)
UPDATE students SET gpa = 3.9 WHERE department = 'Computer Science' AND student_id = <某个UUID>;
-- 7. 删除
DELETE FROM students WHERE department = 'Computer Science' AND student_id = <某个UUID>;
-- 8. ALLOW FILTERING(全表扫描,生产环境慎用!)
SELECT * FROM students WHERE gpa > 3.5 ALLOW FILTERING;
Python 驱动版
# pip install cassandra-driver
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
import uuid
# 连接集群
cluster = Cluster(['127.0.0.1'], port=9042)
session = cluster.connect()
# 创建键空间
session.execute("""
CREATE KEYSPACE IF NOT EXISTS university
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.set_keyspace('university')
# 建表
session.execute("""
CREATE TABLE IF NOT EXISTS students (
department TEXT,
student_id UUID,
name TEXT,
age INT,
email TEXT,
gpa FLOAT,
enrolled_date DATE,
PRIMARY KEY (department, student_id)
)
""")
# 预编译插入
insert_stmt = session.prepare("""
INSERT INTO students (department, student_id, name, age, email, gpa, enrolled_date)
VALUES (?, ?, ?, ?, ?, ?, ?)
""")
# 批量执行
session.execute(insert_stmt, ['Computer Science', uuid.uuid4(), '张三', 21, 'zhangsan@example.com', 3.8, '2024-09-01'])
session.execute(insert_stmt, ['Mathematics', uuid.uuid4(), '王五', 20, 'wangwu@example.com', 3.9, '2024-09-01'])
# 查询
rows = session.execute("SELECT * FROM students WHERE department = 'Computer Science'")
for row in rows:
print(f"{row.name} - GPA: {row.gpa}")
cluster.shutdown()
预期输出
department | student_id | age | email | enrolled_date | gpa | name
------------------+--------------------------------------+-----+-----------------------+---------------+-----+------
Computer Science | 3b8a1f4e-... | 21 | zhangsan@example.com | 2024-09-01 | 3.8 | 张三
Computer Science | 7c2d6f9a-... | 22 | lisi@example.com | 2024-09-01 | 3.5 | 李四
关键点
- Cassandra 的 PRIMARY KEY = Partition Key + Clustering Key
- WHERE 必须命中分区键,否则需要 ALLOW FILTERING(生产禁用)
- 写入是 upsert 语义(存在则更新,不存在则插入)