Hello World: Iris Classification

Scikit-learn Hello World: Iris Classification

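Goal

Train a random forest classifier on the classic Iris dataset and evaluate its accuracy on a held-out test set. This is one of the standard introductory examples for scikit-learn.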

Full code

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# 1. Load the data
iris = load_iris()
X, y = iris.data, iris.target
print(f"Feature shape: {X.shape}, label shape: {y.shape}")
print(f"Class names: {iris.target_names}")

# 2. Split into training and test sets (70% / 30%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# 3. Create and train the model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# 4. Predict and evaluate
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy: {accuracy:.2%}")
print("\nClassification report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# 5. Feature importances (one value per feature, summing to 1)
for name, importance in zip(iris.feature_names, clf.feature_importances_):
    print(f"  {name}: {importance:.4f}")
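Once the model is trained, it can classify measurements it has never seen. The following is a minimal sketch, assuming the same split and hyperparameters as the script above; the `new_flower` values are illustrative and chosen only for this example. It uses `predict` for the class label and `predict_proba` for the per-class probabilities.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Rebuild the same model as in the main script so this block runs on its own.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Illustrative measurement (assumption for this sketch), in the order
# sepal length, sepal width, petal length, petal width (cm).
new_flower = np.array([[5.1, 3.5, 1.4, 0.2]])

# predict returns class indices; map the index back to a readable name.
predicted_index = clf.predict(new_flower)[0]
predicted_name = iris.target_names[predicted_index]

# predict_proba returns one probability per class for each input row.
probabilities = clf.predict_proba(new_flower)[0]

print(f"Predicted class: {predicted_name}")
for name, p in zip(iris.target_names, probabilities):
    print(f"  {name}: {p:.2f}")
```

Note that the input must be a 2-D array with one row per sample, even when classifying a single flower.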

Run steps

Save the code above as iris_classify.py, then install scikit-learn and run the script:

pip install scikit-learn
python iris_classify.py
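If the script fails with an import error, a quick check like the one below can confirm that scikit-learn is installed in the active Python environment. This is only a sanity check and is not part of the original example.

```python
import sklearn

# Print the installed version; an ImportError here means the package
# is missing from the interpreter that is running this code.
installed_version = sklearn.__version__
print(f"scikit-learn version: {installed_version}")
```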

Expected output

The macro avg and weighted avg rows that classification_report also prints are omitted below.

Feature shape: (150, 4), label shape: (150,)
Class names: ['setosa' 'versicolor' 'virginica']

Accuracy: 97.78%

Classification report:
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       0.93      1.00      0.96        13
   virginica       1.00      0.92      0.96        13

    accuracy                           0.98        45

  sepal length (cm): 0.1081
  sepal width (cm): 0.0304
  petal length (cm): 0.4195
  petal width (cm): 0.4420
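The per-class precision and recall figures summarize errors but do not show which classes were mistaken for which. A confusion matrix makes that explicit. The sketch below assumes the same split and model settings as the main script and uses `confusion_matrix` from `sklearn.metrics`; rows are true classes and columns are predicted classes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Rebuild the same model as in the main script so this block runs on its own.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# cm[i, j] counts test samples whose true class is i and predicted class is j.
cm = confusion_matrix(y_test, y_pred)

print("Rows = true class, columns = predicted class")
print(f"Class order: {list(iris.target_names)}")
print(cm)

# Correct predictions lie on the diagonal; everything else is an error.
n_errors = int(cm.sum() - np.trace(cm))
print(f"Misclassified samples: {n_errors} of {cm.sum()}")
```

Each row sums to that class's support value in the classification report, so the two outputs can be cross-checked against each other.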

Info

Path
/tech-stacks/scikit-learn/examples/Hello World — 鸢尾花分类.md
Last updated
2026/5/30