.. _library-xgboost:

XGBoost
=======

XGBoost_ is a popular Gradient Boosting library with Python interface.
eli5 supports :func:`eli5.explain_weights` and :func:`eli5.explain_prediction`
for XGBClassifer_, XGBRegressor_ and Booster_ estimators. It is tested for
xgboost >= 0.6a2 and < 2.0.0. Versions starting from 2.0.0 likely produce
incorrect results in :func:`eli5.explain_prediction`, and will issue a warning.

.. _XGBoost: https://github.com/dmlc/xgboost
.. _XGBClassifer: https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.XGBClassifier
.. _XGBRegressor: https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.XGBRegressor
.. _Booster: http://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.Booster

:func:`eli5.explain_weights` uses feature importances. Additional
arguments for XGBClassifer_, XGBRegressor_ and Booster_:

* ``importance_type`` is a way to get feature importance. Possible values are:

  - 'gain' - the average gain of the feature when it is used in trees
    (default)
  - 'weight' - the number of times a feature is used to split the data
    across all trees
  - 'cover' - the average coverage of the feature when it is used in trees

``target_names`` and ``targets`` arguments are ignored.

.. note::
    Top-level :func:`eli5.explain_weights` calls are dispatched
    to :func:`eli5.xgboost.explain_weights_xgboost` for
    XGBClassifer_, XGBRegressor_ and Booster_.

For :func:`eli5.explain_prediction` eli5 uses an approach based on ideas from
http://blog.datadive.net/interpreting-random-forests/ :
feature weights are calculated by following decision paths in trees
of an ensemble. Each node of the tree has an output score, and
contribution of a feature on the decision path is how much the score changes
from parent to child.

.. note::
    When explaining Booster_ predictions,
    do not pass an ``xgboost.DMatrix`` object as ``doc``, pass a numpy array
    or a sparse matrix instead (or have ``vec`` return them).

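
The decision-path idea described above can be illustrated on a single toy tree.
This is a minimal sketch, not eli5's implementation: the ``TREE`` dictionary
layout, the ``path_contributions`` helper and the ``<BIAS>`` key are
hypothetical names chosen for the example, and no xgboost objects are involved.
It assumes the root score acts as a bias term and that each score change is
credited to the feature used for the split at the parent node.

```python
# Toy regression tree (hypothetical layout, NOT an xgboost structure):
# every node has an output score; internal nodes also have a split.
TREE = {
    "score": 0.5,
    "feature": "age",
    "threshold": 30.0,
    "left": {
        "score": 0.2,
        "feature": "fare",
        "threshold": 10.0,
        "left": {"score": 0.1},
        "right": {"score": 0.4},
    },
    "right": {"score": 0.8},
}


def path_contributions(tree, x):
    """Follow the decision path for ``x`` and credit each parent-to-child
    score change to the feature used for the split at the parent."""
    contributions = {"<BIAS>": tree["score"]}  # root score as the bias term
    node = tree
    while "feature" in node:  # leaves have no split
        feature = node["feature"]
        go_left = x[feature] < node["threshold"]
        child = node["left"] if go_left else node["right"]
        delta = child["score"] - node["score"]
        contributions[feature] = contributions.get(feature, 0.0) + delta
        node = child
    return contributions, node["score"]


x = {"age": 25.0, "fare": 12.0}
contributions, leaf_score = path_contributions(TREE, x)
for name, value in contributions.items():
    print(f"{name}: {value:+.2f}")
print(f"leaf score: {leaf_score:.2f}")
```

By construction the bias plus the per-feature contributions add up to the
score of the leaf that the example reaches. For an ensemble, the same walk
would be repeated for every tree and the contributions summed per feature.
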

Additional :func:`eli5.explain_prediction` keyword arguments supported
for XGBClassifer_, XGBRegressor_ and Booster_:

* ``vec`` is a vectorizer instance used to transform
  raw features to the input of the estimator ``xgb``
  (e.g. a fitted CountVectorizer instance); you can pass it
  instead of ``feature_names``.

* ``vectorized`` is a flag which tells eli5 if ``doc`` should be
  passed through ``vec`` or not. By default it is False, meaning that
  if ``vec`` is not None, ``vec.transform([doc])`` is passed to the
  estimator. Set it to True if you're passing ``vec``,
  but ``doc`` is already vectorized.

:func:`eli5.explain_prediction` for Booster_ estimator accepts
two more optional arguments:

* ``is_regression`` - True if solving a regression problem
  ("objective" starts with "reg")
  and False for a classification problem.
  If not set, regression is assumed for a single target estimator
  and proba will not be shown.
* ``missing`` - set it to the same value as the ``missing`` argument to
  ``xgboost.DMatrix``. Matters only if sparse values are used.
  Default is ``np.nan``.

See the :ref:`tutorial ` for a more detailed usage example.

.. note::
    Top-level :func:`eli5.explain_prediction` calls are dispatched
    to :func:`eli5.xgboost.explain_prediction_xgboost` for
    XGBClassifer_, XGBRegressor_ and Booster_.
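
The interaction between ``vec`` and ``vectorized`` described above can be
sketched without eli5 or xgboost installed. The code below is an illustration
only: ``ToyVectorizer`` and ``prepare_input`` are hypothetical names, and
wrapping an already vectorized ``doc`` in a one-row list is an assumption made
for the example, not a statement about eli5 internals.

```python
class ToyVectorizer:
    """Hypothetical stand-in for a fitted vectorizer such as CountVectorizer."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary

    def transform(self, docs):
        # One row of word counts per document, in vocabulary order.
        return [
            [doc.split().count(word) for word in self.vocabulary]
            for doc in docs
        ]


def prepare_input(doc, vec=None, vectorized=False):
    """Return the rows that would be handed to the estimator."""
    if vec is not None and not vectorized:
        # Default case: raw doc is passed through the vectorizer.
        return vec.transform([doc])
    # doc is already a feature row (assumption: wrap it as a single row).
    return [doc]


vec = ToyVectorizer(["good", "bad"])

# Raw text with vectorized=False (the default): vec.transform([doc]) is used.
raw_rows = prepare_input("good good bad", vec=vec)

# Pre-computed features with vectorized=True: vec is not applied again.
ready_rows = prepare_input([2, 1], vec=vec, vectorized=True)

print(raw_rows, ready_rows)
```

Both calls end up with the same feature row, which is the situation
``vectorized=True`` is meant for: ``vec`` is still passed, but ``doc`` must not
be transformed a second time.
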