
MLflow autolog: Track Experiments Automatically with One Line of Code

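Goal

Use a single autolog call (mlflow.xgboost.autolog() in the code below; the generic mlflow.autolog() also covers XGBoost) to automatically record the parameters, evaluation metrics, and trained model from XGBoost training, then compare multiple runs in the MLflow UI.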

Full code

import mlflow
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# ─── 1. Configure MLflow ───
mlflow.set_tracking_uri("http://localhost:5000")  # or delete this line to use local storage
mlflow.set_experiment("xgboost-breast-cancer")

# ─── 2. Enable autolog ───
mlflow.xgboost.autolog()  # 👈 one line: logs parameters, metrics, and the model automatically

# ─── 3. Prepare the data ───
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# ─── 4. Multiple runs with different hyperparameters ───
experiments = [
    {"n_estimators": 50,  "max_depth": 3, "learning_rate": 0.1},
    {"n_estimators": 100, "max_depth": 5, "learning_rate": 0.05},
    {"n_estimators": 200, "max_depth": 7, "learning_rate": 0.01},
]

for params in experiments:
    with mlflow.start_run(run_name=f"xgb_d{params['max_depth']}_lr{params['learning_rate']}"):
        model = xgb.XGBClassifier(**params, eval_metric="logloss", random_state=42)
        model.fit(X_train, y_train, eval_set=[(X_test, y_test)], verbose=False)

        y_pred = model.predict(X_test)
        acc = accuracy_score(y_test, y_pred)
        mlflow.log_metric("test_accuracy", acc)

        print(f"✓ {params} → Accuracy: {acc:.4f}")

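print("\nView all runs: mlflow ui --port 5000")
print(f"Total runs in this experiment: {len(mlflow.search_runs())}")

How to run

# Terminal 1: install dependencies and start the MLflow UI
pip install mlflow xgboost scikit-learn
mlflow ui --port 5000

# Terminal 2: run the experiments
python mlflow_autolog.py

Open http://localhost:5000 to compare the 3 runs.

Expected output

✓ {'n_estimators': 50, 'max_depth': 3, 'learning_rate': 0.1} → Accuracy: 0.9737
✓ {'n_estimators': 100, 'max_depth': 5, 'learning_rate': 0.05} → Accuracy: 0.9825
✓ {'n_estimators': 200, 'max_depth': 7, 'learning_rate': 0.01} → Accuracy: 0.9825

View all runs: mlflow ui --port 5000
Total runs in this experiment: 3

The count is 3 only on the first execution: mlflow.search_runs() returns every run in the active experiment, so each rerun of the script adds 3 more. Exact accuracy values may vary with library versions.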

In the UI you can compare parameter tables, metric curves, and run durations, and download the logged models.

Info

Path: /tech-stacks/mlflow/examples/autolog 一行自动追踪 + 对比实验.md
Updated: 2026/5/31