wefe.debias.hard_debias.HardDebias

class wefe.debias.hard_debias.HardDebias(pca_args: dict[str, Any] = {'n_components': 10}, verbose: bool = False, criterion_name: str | None = None)

Bases: BaseDebias

Hard Debias debiasing method.

Hard debias is a method that allows mitigating biases through geometric operations on embeddings.

This method is binary because it only allows 2 classes of the same bias criterion, such as male or female.

Note

For multiclass debiasing (for example, for Latinos, Asians and Whites), see the MulticlassHardDebias class.

The main idea of this method is:

1. Identify a bias subspace through the defining sets. In the case of gender, these could be e.g. [['woman', 'man'], ['she', 'he'], ...]

2. Neutralize the bias subspace of embeddings that should not be biased. First, a set of words that are legitimately related to the bias criterion is defined: the criterion-specific words. For example, in the case of gender, the gender-specific words are: ['he', 'his', 'He', 'her', 'she', 'him', 'She', 'man', 'women', 'men', ...].

Then, all words outside this set are assumed to have no legitimate relation to the bias criterion and are therefore candidates for being biased (e.g., in the case of gender: ['doctor', 'nurse', ...]). This set of words is neutralized with respect to the bias subspace found in the previous step.

The neutralization is carried out under the following operation:

  • \(u\) : embedding

  • \(v\) : bias direction

First, calculate the projection of the embedding onto the bias direction.

\[\text{proj}_v(u) = \frac{v \cdot u}{v \cdot v} \, v\]

Then subtract the projection from the embedding.

\[u' = u - \text{proj}_v(u)\]

3. Equalize the embeddings with respect to the bias direction. Given an equalization set (a set of word pairs such as ['she', 'he'], ['men', 'women'], ..., not limited to the definitional set), this step equalizes each pair with respect to the bias direction. That is, it takes both embeddings of the pair and places them at the same distance from the bias direction, so that neither is closer to it than the other.
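The neutralization and equalization steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not WEFE's implementation: the helper names (dot, project, neutralize, equalize) are hypothetical, the equalization follows the formula given in Bolukbasi et al. (2016) and assumes unit-length embeddings whose projections on the bias direction differ, and plain lists stand in for embedding vectors to keep the example dependency-free.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(u, v):
    """Projection of u onto the direction v: v * (v . u) / (v . v)."""
    scale = dot(v, u) / dot(v, v)
    return [scale * x for x in v]


def neutralize(u, v):
    """Subtract from u its component along the bias direction v."""
    return [x - p for x, p in zip(u, project(u, v))]


def equalize(a, b, v):
    """Equalize a pair of unit-length embeddings with respect to v.

    Assumption: formula from Bolukbasi et al. (2016), not WEFE internals.
    """
    mu = [(x + y) / 2 for x, y in zip(a, b)]      # midpoint of the pair
    mu_b = project(mu, v)                         # midpoint's bias component
    nu = [m - p for m, p in zip(mu, mu_b)]        # shared bias-free component
    scale = math.sqrt(max(0.0, 1.0 - dot(nu, nu)))
    equalized = []
    for w in (a, b):
        diff = [p - m for p, m in zip(project(w, v), mu_b)]
        norm = math.sqrt(dot(diff, diff))         # non-zero by assumption
        equalized.append([n + scale * d / norm for n, d in zip(nu, diff)])
    return equalized[0], equalized[1]


# Toy 3-dimensional vectors (illustrative values, not real embeddings).
v = [1.0, 0.0, 0.0]                               # bias direction
doctor = neutralize([0.6, 0.8, 0.0], v)
she, he = equalize([0.6, 0.8, 0.0], [-0.28, 0.96, 0.0], v)
```

After neutralization, doctor has no component along v. After equalization, she and he share the same component orthogonal to v and have opposite components of equal magnitude along v.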

Examples

Note

For more information on the use of mitigation methods, visit Bias Mitigation (Debias) in the User Guide.

To reproduce the debiasing described in the original paper, run:

>>> from wefe.datasets import fetch_debiaswe
>>> from wefe.debias.hard_debias import HardDebias
>>> from wefe.utils import load_test_model
>>>
>>> model = load_test_model()  # load a reduced version of word2vec
>>>
>>> # load the definitional and equalize pairs. Also, the gender specific words
>>> # that should be ignored in the debias process.
>>> debiaswe_wordsets = fetch_debiaswe()
>>>
>>> definitional_pairs = debiaswe_wordsets["definitional_pairs"]
>>> equalize_pairs = debiaswe_wordsets["equalize_pairs"]
>>> gender_specific = debiaswe_wordsets["gender_specific"]
>>>
>>> # instantiate the debias object that will perform the mitigation
>>> hd = HardDebias(verbose=False, criterion_name="gender")
>>>
>>> # fits the transformation parameters (bias direction, etc...)
>>> hd.fit(
...     model, definitional_pairs=definitional_pairs, equalize_pairs=equalize_pairs,
... )
>>>
>>> # perform the transformation (debiasing) on the embedding model
>>> # note that words specified in ignore will not be mitigated (see exception
>>> # to this in the transform documentation).
>>> gender_debiased_model = hd.transform(model, ignore=gender_specific, copy=True)

If you only want to run debias on a limited set of words, you can use the target parameter when running transform.

>>> targets = [
...     "executive",
...     "management",
...     "professional",
...     "corporation",
...     "salary",
...     "office",
...     "business",
...     "career",
...     "home",
...     "parents",
...     "children",
...     "family",
...     "cousins",
...     "marriage",
...     "wedding",
...     "relatives",
... ]
>>>
>>> hd = HardDebias(verbose=False, criterion_name="gender").fit(
...     model, definitional_pairs=definitional_pairs, equalize_pairs=equalize_pairs,
... )
>>>
>>> gender_debiased_model = hd.transform(model, target=targets, copy=True)

References

[1]: Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016).
Man is to computer programmer as woman is to homemaker? Debiasing word embeddings.
Advances in Neural Information Processing Systems.

__init__(pca_args: dict[str, Any] = {'n_components': 10}, verbose: bool = False, criterion_name: str | None = None) → None

Initialize a Hard Debias instance.

Parameters:
  • pca_args (Dict[str, Any], optional) – Arguments for the PCA that is computed internally to identify the bias subspace, by default {"n_components": 10}.

  • verbose (bool, optional) – True will print informative messages about the debiasing process, by default False.

  • criterion_name (Optional[str], optional) – The name of the criterion for which the debias is being executed, e.g., 'Gender'. It is used in the name of the model returned by transform. By default None.
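As a rough illustration of how a bias direction can be obtained from definitional pairs, the sketch below centers each pair and extracts the first principal component of the centered vectors, following the procedure described in Bolukbasi et al. (2016). It is a simplification under stated assumptions: WEFE runs a PCA configured through pca_args, whereas the function name bias_direction, the power-iteration shortcut (used here to avoid external dependencies) and the toy vectors are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def bias_direction(pairs, iterations=100):
    """First principal component of the pair-centered vectors.

    Hypothetical helper: uses power iteration on X^T X, where the rows of X
    are the embeddings of each pair minus that pair's midpoint.
    """
    rows = []
    for a, b in pairs:
        center = [(x + y) / 2 for x, y in zip(a, b)]
        rows.append([x - c for x, c in zip(a, center)])
        rows.append([y - c for y, c in zip(b, center)])

    dim = len(rows[0])
    v = [1.0] * dim  # arbitrary start; must not be orthogonal to the answer
    for _ in range(iterations):
        # Apply X^T X to v without forming the matrix: sum_r (r . v) * r
        scores = [dot(r, v) for r in rows]
        w = [sum(s * r[i] for s, r in zip(scores, rows)) for i in range(dim)]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v


# Toy pairs that differ only along the first axis (illustrative values).
pairs = [
    ([1.0, 0.2, 0.0], [-1.0, 0.2, 0.0]),
    ([0.8, 0.0, 0.5], [-0.8, 0.0, 0.5]),
]
direction = bias_direction(pairs)
```

For these toy pairs the recovered direction lies along the first axis. The sign of a principal component is arbitrary, so only the axis, not its orientation, is meaningful.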

fit(model: WordEmbeddingModel, definitional_pairs: list[list[str]], equalize_pairs: list[list[str]] | None = None, **fit_params) → BaseDebias

Compute the bias direction and obtain the embeddings of the equalize pairs.

Parameters:
  • model (WordEmbeddingModel) – The word embedding model to debias.

  • definitional_pairs (List[List[str]]) – A sequence of string pairs that will be used to define the bias direction. For example, for the case of gender debias, this list could be [[‘woman’, ‘man’], [‘girl’, ‘boy’], [‘she’, ‘he’], [‘mother’, ‘father’], …].

  • equalize_pairs (Optional[List[List[str]]], optional) – A list of string pairs that will be equalized. If None, the equalization is performed over the word pairs passed in definitional_pairs. By default None.

Returns:

The fitted debias method.

Return type:

BaseDebias

fit_transform(model: WordEmbeddingModel, target: list[str] | None = None, ignore: list[str] | None = None, copy: bool = True, **fit_params) → WordEmbeddingModel

Convenience method to execute fit and transform in a single call.

Parameters:
  • model (WordEmbeddingModel) – A word embedding model object.

  • target (Optional[List[str]], optional) – If a set of words is specified in target, the debias method will be applied only on the word embeddings of this set, by default None.

  • ignore (Optional[List[str]], optional) – If target is None and a set of words is specified in ignore, the debias method will debias all words except those specified in ignore, by default None.

  • copy (bool, optional) – If True, the debias will be performed on a copy of the model. If False, the debias will be applied to the model passed in, mutating its vectors. WARNING: Setting copy to True requires RAM of at least 2x the size of the model; otherwise, the debias may raise a MemoryError. By default True.

  • verbose (bool, optional) – True will print informative messages about the debiasing process, by default True.

Returns:

The debiased word embedding model.

Return type:

WordEmbeddingModel

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

name: str = 'Hard Debias'
set_fit_request(*, definitional_pairs: bool | None | str = '$UNCHANGED$', equalize_pairs: bool | None | str = '$UNCHANGED$', model: bool | None | str = '$UNCHANGED$') → HardDebias

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • definitional_pairs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for definitional_pairs parameter in fit.

  • equalize_pairs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for equalize_pairs parameter in fit.

  • model (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for model parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_transform_request(*, copy: bool | None | str = '$UNCHANGED$', ignore: bool | None | str = '$UNCHANGED$', model: bool | None | str = '$UNCHANGED$', target: bool | None | str = '$UNCHANGED$') → HardDebias

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • copy (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for copy parameter in transform.

  • ignore (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for ignore parameter in transform.

  • model (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for model parameter in transform.

  • target (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target parameter in transform.

Returns:

self – The updated object.

Return type:

object

short_name = 'HD'
transform(model: WordEmbeddingModel, target: list[str] | None = None, ignore: list[str] | None = None, copy: bool = True) → WordEmbeddingModel

Execute hard debias over the provided model.

Parameters:
  • model (WordEmbeddingModel) – The word embedding model to debias.

  • target (Optional[List[str]], optional) – If a set of words is specified in target, the debias method will be performed only on the word embeddings of this set. If None is provided, the debias will be performed on all words (except those specified in ignore). Note that some words that are not in target may be modified due to the equalization process. By default None.

  • ignore (Optional[List[str]], optional) – If target is None and a set of words is specified in ignore, the debias method will be applied to all words except those specified in this set. Note that some words that are in ignore may be modified due to the equalization process. By default None.

  • copy (bool, optional) – If True, the debias will be performed on a copy of the model. If False, the debias will be applied to the model passed in, mutating its vectors. WARNING: Setting copy to True requires RAM of at least 2x the size of the model; otherwise, the debias may raise a MemoryError. By default True.

Returns:

The debiased embedding model.

Return type:

WordEmbeddingModel
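The interaction between target and ignore described for transform can be summarized with a small sketch. The function words_to_neutralize and the toy vocabulary are hypothetical and only mirror the documented precedence (ignore is consulted only when target is None); they are not part of WEFE, and they do not model the equalization step, which may still modify words outside the selected set.

```python
from typing import List, Optional


def words_to_neutralize(
    vocab: List[str],
    target: Optional[List[str]] = None,
    ignore: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical helper: select the words that neutralization applies to."""
    if target is not None:
        # target takes precedence; ignore is not consulted in this case
        wanted = set(target)
        return [word for word in vocab if word in wanted]
    # No target: every word except those listed in ignore
    skipped = set(ignore or [])
    return [word for word in vocab if word not in skipped]


vocab = ["doctor", "nurse", "he", "she"]
only_targets = words_to_neutralize(vocab, target=["doctor"])
all_but_ignored = words_to_neutralize(vocab, ignore=["he", "she"])
everything = words_to_neutralize(vocab)
```

With this toy vocabulary, passing target=["doctor"] selects only "doctor", passing ignore=["he", "she"] selects "doctor" and "nurse", and passing neither selects every word.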