XGBoost is a popular Gradient Boosting library with a Python interface.
eli5 supports eli5.explain_weights() and eli5.explain_prediction()
for XGBClassifier, XGBRegressor and Booster estimators. It is tested for
xgboost >= 0.6a2.
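eli5.explain_weights() uses feature importances. It accepts an additional argument, importance_type, which selects how feature importance is computed. Possible values are: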
- ‘gain’ - the average gain of the feature when it is used in trees (default)
- ‘weight’ - the number of times a feature is used to split the data across all trees
- ‘cover’ - the average coverage of the feature when it is used in trees
The target_names and targets arguments are ignored.
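To make the three importance_type values concrete, here is a minimal sketch that computes them from a hand-written list of split records. The records, the numbers and the importances() helper are hypothetical and are not part of eli5 or xgboost; the sketch returns raw, unnormalized values, so the numbers eli5 reports may be scaled differently.

```python
from collections import defaultdict

# Hypothetical split records: (feature, gain, cover) for every split node
# across all trees of a toy ensemble. The numbers are made up.
splits = [
    ("age", 10.0, 100.0),
    ("age", 6.0, 40.0),
    ("fare", 3.0, 60.0),
]


def importances(splits, importance_type="gain"):
    """Aggregate per-split statistics into one importance value per feature."""
    counts = defaultdict(int)
    gain_sum = defaultdict(float)
    cover_sum = defaultdict(float)
    for feature, gain, cover in splits:
        counts[feature] += 1
        gain_sum[feature] += gain
        cover_sum[feature] += cover

    if importance_type == "weight":
        # number of times the feature is used to split the data
        return dict(counts)
    if importance_type == "gain":
        # average gain over the splits that use the feature
        return {f: gain_sum[f] / counts[f] for f in counts}
    if importance_type == "cover":
        # average coverage over the splits that use the feature
        return {f: cover_sum[f] / counts[f] for f in counts}
    raise ValueError("unknown importance_type: %r" % importance_type)


print(importances(splits, "weight"))  # {'age': 2, 'fare': 1}
print(importances(splits, "gain"))    # {'age': 8.0, 'fare': 3.0}
print(importances(splits, "cover"))   # {'age': 70.0, 'fare': 60.0}
```

Note that ‘weight’ counts splits, while ‘gain’ and ‘cover’ are averages, so a feature used in many low-gain splits can rank high by ‘weight’ and low by ‘gain’.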
For eli5.explain_prediction(), eli5 uses a decision-path approach:
feature weights are calculated by following decision paths in the trees
of an ensemble. Each node of a tree has an output score, and the
contribution of a feature on the decision path is how much the score changes
from parent to child.
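The decision-path idea can be illustrated on a single toy tree. This is only a sketch under stated assumptions: the nested-dict tree layout, the node scores, the "<BIAS>" key used for the root score and the path_contributions() helper are all hypothetical, and this is not eli5's implementation.

```python
def path_contributions(tree, doc):
    """Follow the decision path of doc and credit each score change
    from parent to child to the feature the parent splits on."""
    # The root score is not caused by any feature; keep it as a bias term.
    contributions = {"<BIAS>": tree["score"]}
    node = tree
    while "feature" in node:  # leaves have no "feature" key
        feature = node["feature"]
        if doc[feature] < node["threshold"]:
            child = node["left"]
        else:
            child = node["right"]
        change = child["score"] - node["score"]
        contributions[feature] = contributions.get(feature, 0.0) + change
        node = child
    return contributions


# Toy tree: every node carries an output score (values are made up).
tree = {
    "score": 0.5, "feature": "age", "threshold": 30,
    "left": {
        "score": 0.25, "feature": "fare", "threshold": 10,
        "left": {"score": 0.0},
        "right": {"score": 0.75},
    },
    "right": {"score": 1.0},
}

doc = {"age": 25, "fare": 20}
result = path_contributions(tree, doc)
print(result)                # {'<BIAS>': 0.5, 'age': -0.25, 'fare': 0.5}
print(sum(result.values()))  # 0.75, the score of the leaf that doc reaches
```

The bias plus the feature contributions add up to the score of the reached leaf. For an ensemble, the same walk would be repeated for every tree and the per-feature contributions summed.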
When explaining Booster predictions, do not pass an xgboost.DMatrix object as doc; pass a numpy array or a sparse matrix instead (or have vec return them).
Additional eli5.explain_prediction() keyword arguments:
vec is a vectorizer instance used to transform raw features to the input of the estimator xgb (e.g. a fitted CountVectorizer instance); you can pass it instead of feature_names.
vectorized is a flag which tells eli5 if doc should be passed through vec or not. By default it is False, meaning that if vec is not None, vec.transform([doc]) is passed to the estimator. Set it to True if you’re passing vec, but doc is already vectorized.
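The interplay between vec and vectorized can be sketched as follows. The prepare_input() helper and the ToyVectorizer class are hypothetical stand-ins written for this illustration; they only mirror the rule described above and are not eli5's actual code.

```python
class ToyVectorizer:
    """Hypothetical stand-in for a fitted vectorizer such as CountVectorizer."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary

    def transform(self, docs):
        # One row of word counts per document, in vocabulary order.
        return [[doc.split().count(word) for word in self.vocabulary]
                for doc in docs]


def prepare_input(doc, vec=None, vectorized=False):
    """Return what would be passed to the estimator for this doc."""
    if vec is not None and not vectorized:
        # doc is raw: pass it through the vectorizer first
        return vec.transform([doc])
    # no vectorizer, or doc is already vectorized: use it unchanged
    return doc


vec = ToyVectorizer(["good", "bad"])

raw = prepare_input("good good bad", vec=vec)
print(raw)  # [[2, 1]]

already = prepare_input([[2, 1]], vec=vec, vectorized=True)
print(already)  # [[2, 1]]
```

In the second call vec would still be useful for looking up feature names, which is why it can make sense to pass it together with vectorized=True.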
For a Booster estimator, eli5.explain_prediction() accepts two more optional arguments:
is_regression - True if solving a regression problem (“objective” starts with “reg”) and False for a classification problem. If not set, regression is assumed for a single target estimator and proba will not be shown.
missing - set it to the same value as the missing argument to xgboost.DMatrix. Matters only if sparse values are used. Default is np.nan.
See the tutorial for a more detailed usage example.