eli5.xgboost
eli5 supports XGBoost: eli5.explain_weights() shows feature importances,
and eli5.explain_prediction() explains predictions by showing feature weights.
Both functions work for XGBClassifier and XGBRegressor.
- explain_prediction_xgboost(xgb, doc, vec=None, top=None, top_targets=None, target_names=None, targets=None, feature_names=None, feature_re: Pattern[str] | None = None, feature_filter=None, vectorized: bool = False, is_regression: bool | None = None, missing: Any | None = None)
Return an explanation of XGBoost prediction (via scikit-learn wrapper XGBClassifier or XGBRegressor, or via xgboost.Booster) as feature weights.
See eli5.explain_prediction() for description of top, top_targets, target_names, targets, feature_names, feature_re and feature_filter parameters.
- Parameters:
vec (vectorizer, optional) – A vectorizer instance used to transform raw features to the input of the estimator xgb (e.g. a fitted CountVectorizer instance); you can pass it instead of feature_names.
vectorized (bool, optional) – A flag which tells eli5 if doc should be passed through vec or not. By default it is False, meaning that if vec is not None, vec.transform([doc]) is passed to the estimator. Set it to True if you’re passing vec, but doc is already vectorized.
is_regression (bool, optional) – Pass if an xgboost.Booster is passed as the first argument. True if solving a regression problem (“objective” starts with “reg”) and False for a classification problem. If not set, regression is assumed for a single target estimator and proba will not be shown.
missing (optional) – Pass if an xgboost.Booster is passed as the first argument. Set it to the same value as the missing argument to xgboost.DMatrix. Matters only if sparse values are used. Default is np.nan.
Method for determining feature importances follows an idea from http://blog.datadive.net/interpreting-random-forests/.
Feature weights are calculated by following decision paths in trees
of an ensemble.
Each leaf has an output score, and expected scores can also be assigned
to parent nodes.
Contribution of one feature on the decision path is how much expected score
changes from parent to child.
Weights of all features sum to the output score of the estimator.
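The decision-path idea described above can be illustrated with a minimal, library-free sketch for a single tree. It is not eli5's implementation: the Node class, the feature names, and the choice to compute a parent's expected score as the cover-weighted average of its children are all assumptions made for illustration. The expected score of the root is kept as a separate baseline term so that the contributions add up to the leaf score.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node; a leaf has leaf_value set and no children."""
    cover: float                        # training samples (weight) reaching the node
    leaf_value: Optional[float] = None  # output score, leaves only
    feature: Optional[str] = None       # split feature, internal nodes only
    threshold: Optional[float] = None
    left: Optional["Node"] = None       # taken when x[feature] < threshold
    right: Optional["Node"] = None      # taken otherwise


def expected_score(node: Node) -> float:
    """Leaf score, or (by assumption) the cover-weighted mean of the children."""
    if node.leaf_value is not None:
        return node.leaf_value
    total = node.left.cover + node.right.cover
    return (node.left.cover * expected_score(node.left)
            + node.right.cover * expected_score(node.right)) / total


def path_contributions(root: Node, x: Dict[str, float]) -> Tuple[Dict[str, float], float]:
    """Follow the decision path for x and credit each split feature with the
    change in expected score from parent to child."""
    contributions = {"baseline": expected_score(root)}
    node = root
    while node.leaf_value is None:
        child = node.left if x[node.feature] < node.threshold else node.right
        delta = expected_score(child) - expected_score(node)
        contributions[node.feature] = contributions.get(node.feature, 0.0) + delta
        node = child
    return contributions, node.leaf_value


# A tiny hand-built tree with made-up scores and covers.
root = Node(
    cover=10, feature="age", threshold=30,
    left=Node(cover=4, leaf_value=1.0),
    right=Node(
        cover=6, feature="income", threshold=50,
        left=Node(cover=2, leaf_value=-0.5),
        right=Node(cover=4, leaf_value=2.0),
    ),
)

sample = {"age": 40, "income": 60}
contributions, leaf_score = path_contributions(root, sample)
print(contributions, leaf_score)
```

For an ensemble, the same per-tree contributions would be summed over all trees, which is why the weights add up to the estimator's output score.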
- explain_weights_xgboost(xgb, vec=None, top=20, target_names=None, targets=None, feature_names=None, feature_re: Pattern[str] | None = None, feature_filter=None, importance_type='gain')
Return an explanation of an XGBoost estimator (via scikit-learn wrapper XGBClassifier or XGBRegressor, or via xgboost.Booster) as feature importances.
See eli5.explain_weights() for description of top, feature_names, feature_re and feature_filter parameters.
target_names and targets parameters are ignored.
- Parameters:
importance_type (str, optional) – A way to get feature importance. Possible values are:
‘gain’ - the average gain of the feature when it is used in trees (default)
‘weight’ - the number of times a feature is used to split the data across all trees
‘cover’ - the average coverage of the feature when it is used in trees
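To make the difference between the three importance types concrete, here is a small sketch that computes them from a hand-written list of splits. The split records and the feature_importances helper are hypothetical and only mirror the definitions above; the sketch returns raw, unnormalized values and does not reproduce how eli5 or XGBoost obtain or post-process them.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical records, one per split across all trees: (feature, gain, cover).
splits: List[Tuple[str, float, float]] = [
    ("age", 10.0, 100.0),
    ("income", 4.0, 60.0),
    ("age", 2.0, 40.0),
]


def feature_importances(splits, importance_type: str = "gain") -> Dict[str, float]:
    """Aggregate per-split statistics into one importance value per feature."""
    counts = defaultdict(int)
    gain_sum = defaultdict(float)
    cover_sum = defaultdict(float)
    for feature, gain, cover in splits:
        counts[feature] += 1
        gain_sum[feature] += gain
        cover_sum[feature] += cover

    if importance_type == "weight":
        # Number of times the feature is used to split.
        return dict(counts)
    if importance_type == "gain":
        # Average gain over the splits that use the feature.
        return {f: gain_sum[f] / counts[f] for f in counts}
    if importance_type == "cover":
        # Average cover over the splits that use the feature.
        return {f: cover_sum[f] / counts[f] for f in counts}
    raise ValueError(f"unknown importance_type: {importance_type!r}")


by_gain = feature_importances(splits, "gain")
by_weight = feature_importances(splits, "weight")
by_cover = feature_importances(splits, "cover")
print(by_gain, by_weight, by_cover)
```

In this toy data "age" ranks first under all three types, but on real models the rankings can differ, for example when a feature is split on often but each split yields little gain.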