NER + Dependency Parsing in 10 Lines of Code

/tech-stacks/spacy/examples/10 行代码完成 NER + 依存句法分析.md

spaCy in 10 Lines of Code: NER + Dependency Parsing + Visualization

Goal

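Use a single spaCy pipeline to run tokenization, part-of-speech tagging, named entity recognition (NER), and dependency parsing, then visualize the results with displaCy.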

Complete code

import spacy
from spacy import displacy

# ─── 1. Load the model ───
nlp = spacy.load("en_core_web_sm")

# ─── 2. Process the text ───
text = "Elon Musk announced that Tesla will build a new factory in Shanghai next year, investing $2 billion."
doc = nlp(text)

# ─── 3. Tokens + POS tags + dependency parse ───
print("=" * 70)
print(f"{'Token':<12} {'POS':<10} {'Dep':<16} {'Head':<12}")
print("=" * 70)
for token in doc:
    print(f"{token.text:<12} {token.pos_:<10} {token.dep_:<16} {token.head.text:<12}")

# ─── 4. Named entity recognition ───
print("\n" + "=" * 40)
print("Named entities (NER):")
print("=" * 40)
for ent in doc.ents:
    print(f"  {ent.text:<25} | {ent.label_:<10} | {spacy.explain(ent.label_)}")

# ─── 5. Noun chunks ───
print("\nNoun chunks:")
for chunk in doc.noun_chunks:
    print(f"  → {chunk.text}")

# ─── 6. Dependency visualization ───
# With jupyter=False, render() returns the markup as a string rather than
# displaying it, so save it to a file (dep.svg is just an example name).
svg = displacy.render(doc, style="dep", jupyter=False, options={"compact": True})
with open("dep.svg", "w", encoding="utf-8") as f:
    f.write(svg)
# Or view it in a browser instead:
# displacy.serve(doc, style="dep")  # starts a local web server

# ─── 7. NER visualization ───
html = displacy.render(doc, style="ent", jupyter=False)
with open("ent.html", "w", encoding="utf-8") as f:  # example file name
    f.write(html)
# displacy.serve(doc, style="ent")
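
The `token.dep_` and `token.head` attributes printed in step 3 are enough to pull simple relations out of a parse. The following is a minimal sketch, not part of the original script: the helper `subject_verb_pairs` is a hypothetical name, and the `Doc` is annotated by hand (on a shortened sentence) so the example runs without downloading a trained model. With a real pipeline you would pass `nlp(text)` instead.

```python
import spacy
from spacy.tokens import Doc


def subject_verb_pairs(doc):
    """Return (subject, governing verb) text pairs for each nominal subject."""
    return [(tok.text, tok.head.text) for tok in doc if tok.dep_ == "nsubj"]


# Hand-annotated parse (an assumption for illustration, not model output)
nlp = spacy.blank("en")
words = ["Tesla", "will", "build", "a", "factory"]
heads = [2, 2, 2, 4, 2]  # absolute index of each token's head; ROOT points to itself
deps = ["nsubj", "aux", "ROOT", "det", "dobj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

pairs = subject_verb_pairs(doc)
print(pairs)
```

The same list comprehension works on the `doc` from the full script, where it would also pick up the subject of `announced`, assuming the model labels it `nsubj`.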

How to run

pip install spacy
python -m spacy download en_core_web_sm
python spacy_demo.py
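
If the download step is skipped, `spacy.load` raises an error. As a hedged sketch, the check below uses `spacy.util.is_package` to test whether the model package is installed before loading it; the wrapper `load_model` is a hypothetical helper, not something the original script defines.

```python
import spacy
from spacy.util import is_package

MODEL = "en_core_web_sm"


def load_model(name):
    """Load a pipeline if its package is installed; otherwise return None."""
    if not is_package(name):
        print(f"Model '{name}' is not installed; run: python -m spacy download {name}")
        return None
    return spacy.load(name)


nlp = load_model(MODEL)
if nlp is not None:
    # Lists the components that will run on each text
    print(nlp.pipe_names)
```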

Expected output

======================================================================
Token        POS        Dep              Head
======================================================================
Elon         PROPN      compound         Musk
Musk         PROPN      nsubj            announced
announced    VERB       ROOT             announced
that         SCONJ      mark             build
...
Shanghai     PROPN      pobj             in
...
$            SYM        quantmod         billion
2            NUM        compound         billion
billion      NUM        dobj             investing
.            PUNCT      punct            announced

========================================
Named entities (NER):
========================================
  Elon Musk                 | PERSON     | People, including fictional
  Tesla                     | ORG        | Companies, agencies, institutions, etc.
  Shanghai                  | GPE        | Countries, cities, states
  next year                 | DATE       | Absolute or relative dates or periods
  $2 billion                | MONEY      | Monetary values, including unit
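
Entities like those listed above are often easier to consume grouped by label. The sketch below is one way to do that, under stated assumptions: the entity spans are set by hand on a blank pipeline (on a shortened sentence) so the example runs without a trained model, and `by_label` is an illustrative name. With the real pipeline, `doc.ents` is filled in by the model.

```python
from collections import defaultdict

import spacy
from spacy.tokens import Doc, Span

# Build a small Doc and mark entities by hand (assumption: stands in for model output)
nlp = spacy.blank("en")
doc = Doc(nlp.vocab, words=["Tesla", "will", "build", "in", "Shanghai"])
doc.ents = [Span(doc, 0, 1, label="ORG"), Span(doc, 4, 5, label="GPE")]

# Group entity texts under their labels
by_label = defaultdict(list)
for ent in doc.ents:
    by_label[ent.label_].append(ent.text)

for label, texts in by_label.items():
    print(f"{label}: {texts}")
```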

Info

Last updated
2026/5/31