Mitigation Framework

Note

For tutorials on how to mitigate (debias) bias in word embedding models, see Bias Mitigation in the User Guide.

WEFE standardizes all mitigation methods through the fit-transform interface, which is inherited from scikit-learn's basic data transformations.

Fit method

The first step, fit, learns the mitigation transformation, which usually corresponds to a matrix projection of the embedding space. This method is quite flexible: it can accept multiple sets of words and other parameters.

Transform method

The transform method applies the transformation learned in the previous step to words residing in the original embedding space. This method is more rigid: it only accepts a list of words that should be mitigated (target) or a list of words that should be left untouched (ignore).
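The two steps above can be illustrated with a minimal sketch in plain Python. This is not WEFE's implementation: the class ToyProjectionDebias, the dictionary-based embeddings, and the definitional_pairs argument are assumptions made for illustration only. Only the target and ignore parameter names are taken from the description above. Here fit learns a single bias direction from word pairs, and transform removes the component of each selected word along that direction.

```python
import math


class ToyProjectionDebias:
    """Hypothetical debiaser (not part of WEFE) that removes one learned direction."""

    def fit(self, embeddings, definitional_pairs):
        # Learn the bias direction as the normalized sum of pair differences.
        dim = len(next(iter(embeddings.values())))
        direction = [0.0] * dim
        for first, second in definitional_pairs:
            for i in range(dim):
                direction[i] += embeddings[first][i] - embeddings[second][i]
        norm = math.sqrt(sum(x * x for x in direction))
        if norm == 0.0:
            raise ValueError("definitional pairs do not define a direction")
        self.direction_ = [x / norm for x in direction]
        return self

    def transform(self, embeddings, target=None, ignore=None):
        # Debias only the target words (all words if target is None),
        # skipping every word listed in ignore.
        ignored = set(ignore or [])
        words = list(embeddings) if target is None else target
        result = {word: list(vec) for word, vec in embeddings.items()}
        for word in words:
            if word in ignored or word not in embeddings:
                continue
            vec = embeddings[word]
            dot = sum(v * d for v, d in zip(vec, self.direction_))
            # Subtract the projection of the vector onto the bias direction.
            result[word] = [v - dot * d for v, d in zip(vec, self.direction_)]
        return result


# Toy two-dimensional embeddings, invented for this example.
embeddings = {
    "he": [1.0, 0.0],
    "she": [-1.0, 0.0],
    "nurse": [-0.5, 1.0],
    "engineer": [0.5, 1.0],
}

debiaser = ToyProjectionDebias().fit(embeddings, definitional_pairs=[("he", "she")])
debiased = debiaser.transform(embeddings, ignore=["he", "she"])
print(debiased["nurse"])  # [0.0, 1.0]
```

In this sketch the definitional words are passed in ignore so that they keep their original vectors, while every other word loses its component along the learned direction. The actual WEFE methods accept different arguments in fit depending on the method; consult the API reference for each one.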

The following figure shows how debiasing methods are applied.

Mitigation process figure.

Mitigation Methods Implemented So Far

WEFE implements the following bias mitigation (debias) methods:

Except for MulticlassHardDebias, all methods are limited to binary criteria, such as gender.