wefe.debias.repulsion_attraction_neutralization.RepulsionAttractionNeutralization

class wefe.debias.repulsion_attraction_neutralization.RepulsionAttractionNeutralization(pca_args: dict[str, Any] = {'n_components': 10}, verbose: bool = False, criterion_name: str | None = None, epochs: int = 300, theta: float = 0.05, n_neighbours: int = 100, learning_rate: float = 0.01, weights: list[float] = [0.33, 0.33, 0.33])

Bases: BaseDebias

Repulsion Attraction Neutralization method.

Warning

This method only works if PyTorch is installed. If you do not have it installed, see https://pytorch.org/get-started/locally/ for further information.

This method reduces the bias of an embedding model by creating a transformation that minimizes stereotypical information with minimal semantic offset. The transformation is based on three operations:

  1. Repelling embeddings from neighbours with a high value of indirect bias (indicating a strong association due to bias), to minimize bias-based illicit associations.

  2. Attracting debiased embeddings to the original representation, to minimize the loss of semantic meaning.

  3. Neutralizing the bias direction of each word, minimizing its bias to any particular group.
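The repulsion step can be illustrated with a small, library-independent sketch. It assumes the indirect bias measure of Bolukbasi et al. (the share of the similarity between two unit vectors that is explained by the bias direction) and assumes that a neighbour enters the repulsion set when that value exceeds theta. The helper names (`indirect_bias`, `repulsion_set`) are hypothetical and are not part of the WEFE API.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _unit(u):
    n = math.sqrt(_dot(u, u))
    return [a / n for a in u]


def _reject(u, g):
    """Remove from u its component along the unit vector g."""
    p = _dot(u, g)
    return [a - p * b for a, b in zip(u, g)]


def indirect_bias(w, v, g):
    """Assumed measure: fraction of cos(w, v) explained by direction g."""
    w, v, g = _unit(w), _unit(v), _unit(g)
    w_perp, v_perp = _reject(w, g), _reject(v, g)
    cos_perp = _dot(w_perp, v_perp) / math.sqrt(
        _dot(w_perp, w_perp) * _dot(v_perp, v_perp)
    )
    return (_dot(w, v) - cos_perp) / _dot(w, v)


def repulsion_set(word, embeddings, g, theta=0.05, n_neighbours=100):
    """Keep the nearest neighbours whose indirect bias with `word` exceeds theta."""
    w = embeddings[word]
    others = [k for k in embeddings if k != word]
    # Rank candidates by cosine similarity to the word, most similar first.
    others.sort(key=lambda k: _dot(_unit(w), _unit(embeddings[k])), reverse=True)
    nearest = others[:n_neighbours]
    return [k for k in nearest if indirect_bias(w, embeddings[k], g) > theta]


# Toy 3-d embeddings; the first axis plays the role of the bias direction.
toy = {
    "w": [0.6, 0.8, 0.0],
    "a": [0.6, 0.0, 0.8],  # similar to "w" only through the bias axis
    "b": [0.0, 0.8, 0.6],  # similar to "w" only through a neutral axis
}
selected = repulsion_set("w", toy, g=[1.0, 0.0, 0.0])
```

In this toy setup only the neighbour whose similarity to the word comes entirely from the bias axis is selected for repulsion.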

This method is binary because it only allows 2 classes of the same bias criterion, such as male or female.

Note

For a multiclass debias (such as for Latinos, Asians and Whites), see the MulticlassHardDebias class.

The steps followed to perform the debias are:

  1. Identify a bias subspace through the defining sets. In the case of gender, these could be e.g. [['woman', 'man'], ['she', 'he'], ...]

  2. A multi-objective optimization is performed. For each vector \(w\) in the target set, its debiased counterpart \(w_d\) is found by solving:

\[\operatorname{argmin}_{w_d} \left(F_r(w_d), F_a(w_d), F_n(w_d)\right)\]

where \(F_r\), \(F_a\) and \(F_n\) are the repulsion, attraction and neutralization functions, defined as follows:

\[F_r(w_d) = \frac{1}{|S|} \sum_{n_i \in S} |\cos(w_d, n_i)|\]
\[F_a(w_d) = |\cos(w_d, w) - 1| / 2\]
\[F_n(w_d) = |\cos(w_d, g)|\]

Here \(S\) is the repulsion set of neighbours \(n_i\) and \(g\) is the bias direction.

The optimization is performed by formulating a single objective:

\[F(w_d) = \lambda_1 F_r(w_d) + \lambda_2 F_a(w_d) + \lambda_3 F_n(w_d)\]
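The single objective can be written out directly from the formulas above. The following is a minimal plain-Python sketch, not WEFE's PyTorch implementation; the function names are hypothetical, and mapping the three weights to \(\lambda_1, \lambda_2, \lambda_3\) in that order is an assumption.

```python
import math


def cos(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def ran_objective(w_d, w, neighbours, g, weights=(0.33, 0.33, 0.33)):
    """F(w_d) = l1 * F_r + l2 * F_a + l3 * F_n for a candidate vector w_d."""
    # Repulsion: mean absolute cosine to the repulsion set (0 if the set is empty).
    f_r = sum(abs(cos(w_d, n)) for n in neighbours) / len(neighbours) if neighbours else 0.0
    # Attraction: 0 when w_d points in the same direction as the original w.
    f_a = abs(cos(w_d, w) - 1) / 2
    # Neutralization: 0 when w_d is orthogonal to the bias direction g.
    f_n = abs(cos(w_d, g))
    l1, l2, l3 = weights
    return l1 * f_r + l2 * f_a + l3 * f_n


# Toy 2-d case: the original word is orthogonal to the bias direction.
w, g, neighbours = [0.0, 1.0], [1.0, 0.0], [[1.0, 0.0]]
unchanged = ran_objective(w, w, neighbours, g)        # candidate equals the original
collapsed = ran_objective([1.0, 0.0], w, neighbours, g)  # candidate lies on the bias axis
```

A candidate that stays at the original vector is penalized by none of the three terms here, while one that collapses onto the bias axis is penalized by all of them.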

The original implementation defines a preserve set \((V_p)\) of words for which gender carries semantic importance; these words are excluded from the debias process.

In WEFE, these words are the ones passed in the ignore parameter of the transform method. The words that are not in \(V_p\) are debiased and form the debias set \((V_d)\); in WEFE, they can be specified in the target parameter of the transform method.
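The relation between the target and ignore parameters and the sets \(V_d\) and \(V_p\) can be sketched as a small selection function. This is an illustration of the documented behaviour only: the function name is hypothetical, and silently skipping target words that are missing from the vocabulary is an assumption.

```python
def words_to_debias(vocabulary, target=None, ignore=None):
    """Return the words that would be debiased (the debias set V_d)."""
    if target is not None:
        # Only the requested words are debiased; ignore is not consulted.
        known = set(vocabulary)
        return [word for word in target if word in known]
    # No target given: debias everything except the preserve set V_p.
    preserved = set(ignore or [])
    return [word for word in vocabulary if word not in preserved]


vocabulary = ["doctor", "nurse", "he", "she"]
only_targets = words_to_debias(vocabulary, target=["nurse", "pilot"])
all_but_ignored = words_to_debias(vocabulary, ignore=["he", "she"])
```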

Examples

The following example shows how to use the Repulsion Attraction Neutralization method to reduce bias in a word embedding model:

>>> from wefe.debias.repulsion_attraction_neutralization import (
...   RepulsionAttractionNeutralization
... )
>>> from wefe.utils import load_test_model
>>> from wefe.datasets import fetch_debiaswe
>>>
>>> # load the model (in this case, the test model included in wefe)
>>> model = load_test_model()
>>> # load definitional pairs, in this case the definitional pairs included in wefe
>>> debiaswe_wordsets = fetch_debiaswe()
>>> definitional_pairs = debiaswe_wordsets["definitional_pairs"]
>>>
>>> # instantiate and fit the method
>>> ran = RepulsionAttractionNeutralization().fit(
...     model=model,
...     definitional_pairs=definitional_pairs,
... )
>>> # debias only a set of target words
>>> debiased_model = ran.transform(
...     model=model, target=['doctor', 'nurse', 'programmer']
... )
Copy argument is True. Transform will attempt to create a copy of the original model.
This may fail due to lack of memory.
Model copy created successfully.
>>> # to exclude a set of words from the debias, pass them in the ignore parameter
>>> gender_specific = debiaswe_wordsets["gender_specific"]
>>> debiased_model = ran.transform(
...     model=model, ignore=gender_specific
... )

References

[1]: Kumar, Vaibhav, Tenzin Singhay Bhotia and Tanmoy Chakraborty: Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings. CoRR, abs/2006.01938, 2020.

__init__(pca_args: dict[str, Any] = {'n_components': 10}, verbose: bool = False, criterion_name: str | None = None, epochs: int = 300, theta: float = 0.05, n_neighbours: int = 100, learning_rate: float = 0.01, weights: list[float] = [0.33, 0.33, 0.33]) → None

Initialize a Repulsion Attraction Neutralization Debias instance.

Parameters:
  • pca_args (Dict[str, Any], optional) – Arguments for the PCA that is calculated internally to identify the bias subspace, by default {“n_components”: 10}.

  • verbose (bool, optional) – If True, informative messages about the debiasing process are printed, by default False.

  • criterion_name (Optional[str], optional) – The name of the criterion for which the debias is being executed, e.g., ‘Gender’. It is used to name the model returned by transform, by default None.

  • epochs (int, optional) – Number of times the minimization is performed, by default 300.

  • theta (float, optional) – Indirect bias threshold used to select neighbours for the repulsion set, by default 0.05.

  • n_neighbours (int, optional) – Number of neighbours to be considered for the repulsion set, by default 100.

  • learning_rate (float, optional) – Learning rate used by the optimizer during the optimization, by default 0.01.

  • weights (List[float], optional) – List of the 3 initial weights to be used, by default [0.33, 0.33, 0.33].
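To give a feel for how epochs, learning_rate and weights interact, here is a minimal sketch of the minimization for a single word. It is not the library's optimizer: it replaces PyTorch with plain gradient descent on a finite-difference gradient, and every function name in it is hypothetical.

```python
import math


def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def objective(w_d, w, neighbours, g, weights):
    """Weighted sum of the repulsion, attraction and neutralization terms."""
    f_r = sum(abs(cos(w_d, n)) for n in neighbours) / len(neighbours)
    f_a = abs(cos(w_d, w) - 1) / 2
    f_n = abs(cos(w_d, g))
    return weights[0] * f_r + weights[1] * f_a + weights[2] * f_n


def minimize(w, neighbours, g, epochs=300, learning_rate=0.01,
             weights=(0.33, 0.33, 0.33), eps=1e-6):
    """Run `epochs` gradient-descent steps starting from the original vector w."""
    w_d = list(w)
    for _ in range(epochs):
        base = objective(w_d, w, neighbours, g, weights)
        grad = []
        for j in range(len(w_d)):
            # Forward finite difference along coordinate j.
            bumped = list(w_d)
            bumped[j] += eps
            grad.append((objective(bumped, w, neighbours, g, weights) - base) / eps)
        w_d = [x - learning_rate * d for x, d in zip(w_d, grad)]
    return w_d


# Toy 2-d case: the word leans towards the bias direction g and a biased neighbour.
w, g, neighbours = [0.6, 0.8], [1.0, 0.0], [[1.0, 0.1]]
weights = (0.33, 0.33, 0.33)
before = objective(w, w, neighbours, g, weights)
after = objective(minimize(w, neighbours, g), w, neighbours, g, weights)
```

More epochs or a larger learning rate move the vector further from its original position, which trades a higher attraction term for lower repulsion and neutralization terms.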

fit(model: WordEmbeddingModel, definitional_pairs: Sequence[Sequence[str]]) → BaseDebias

Compute the bias direction.

Parameters:
  • model (WordEmbeddingModel) – The word embedding model to debias.

  • definitional_pairs (Sequence[Sequence[str]]) – A sequence of string pairs that will be used to define the bias direction. For example, for the case of gender debias, this list could be [[‘woman’, ‘man’], [‘girl’, ‘boy’], [‘she’, ‘he’], [‘mother’, ‘father’], …].

Returns:

The fitted debias method.

Return type:

BaseDebias
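As an illustration of what computing a bias direction from definitional pairs can look like, the sketch below centers each pair around its midpoint and takes the dominant principal direction of the result. Both the pair centering and the use of power iteration in place of a PCA routine are assumptions made to keep the example dependency-free; the function name is hypothetical.

```python
import math
import random


def bias_direction(pair_vectors, iterations=100, seed=0):
    """Approximate the first principal direction of pair-centered embeddings."""
    # Center each pair around its midpoint so only the within-pair difference remains.
    rows = []
    for a, b in pair_vectors:
        mid = [(x + y) / 2 for x, y in zip(a, b)]
        rows.append([x - m for x, m in zip(a, mid)])
        rows.append([y - m for y, m in zip(b, mid)])
    dim = len(rows[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Power iteration on X^T X, computed as X^T (X v) without forming the matrix.
    for _ in range(iterations):
        scores = [sum(r * c for r, c in zip(row, v)) for row in rows]
        new_v = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(sum(c * c for c in new_v))
        if norm == 0.0:
            break
        v = [c / norm for c in new_v]
    return v


# Toy embeddings for two definitional pairs that differ only along the first axis.
pairs = [
    ([1.0, 0.2, 0.0], [-1.0, 0.2, 0.0]),
    ([0.9, -0.5, 0.1], [-0.9, -0.5, 0.1]),
]
direction = bias_direction(pairs)
```

The sign of the returned direction is arbitrary, as it is for any principal component.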

fit_transform(model: WordEmbeddingModel, target: list[str] | None = None, ignore: list[str] | None = None, copy: bool = True, **fit_params) → WordEmbeddingModel

Convenience method to execute fit and transform in a single call.

Parameters:
  • model (WordEmbeddingModel) – A word embedding model object.

  • target (Optional[List[str]], optional) – If a set of words is specified in target, the debias method will be applied only on the word embeddings of this set, by default None.

  • ignore (Optional[List[str]], optional) – If target is None and a set of words is specified in ignore, the debias method will debias all words except those specified in ignore, by default None.

  • copy (bool, optional) – If True, the debias will be performed on a copy of the model. If False, the debias will be applied to the model that was passed in, mutating its vectors. WARNING: Setting copy to True requires at least twice as much RAM as the size of the model; otherwise the debias may raise a MemoryError. By default True.

  • verbose (bool, optional) – True will print informative messages about the debiasing process, by default True.

Returns:

The debiased word embedding model.

Return type:

WordEmbeddingModel
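The effect of the copy parameter can be shown with a toy stand-in for a model. The sketch below uses a plain dictionary of vectors and a hypothetical `apply_debias` helper; it mirrors only the documented copy-versus-mutate behaviour, not WEFE's internals.

```python
import copy


def apply_debias(vectors, update, copy_model=True):
    """Apply `update` to every vector, either on a deep copy or in place."""
    result = copy.deepcopy(vectors) if copy_model else vectors
    for word in result:
        result[word] = update(result[word])
    return result


def halve(vec):
    return [x * 0.5 for x in vec]


original = {"nurse": [1.0, 2.0]}

# copy_model=True: the original vectors are untouched, at the cost of a second copy in memory.
copied = apply_debias(original, halve, copy_model=True)
untouched = list(original["nurse"])

# copy_model=False: the same object is returned and its vectors have changed.
in_place = apply_debias(original, halve, copy_model=False)
```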

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

name: str = 'Repulsion attraction Neutralization'

set_fit_request(*, definitional_pairs: bool | None | str = '$UNCHANGED$', model: bool | None | str = '$UNCHANGED$') → RepulsionAttractionNeutralization

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • definitional_pairs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for definitional_pairs parameter in fit.

  • model (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for model parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_transform_request(*, copy: bool | None | str = '$UNCHANGED$', ignore: bool | None | str = '$UNCHANGED$', model: bool | None | str = '$UNCHANGED$', target: bool | None | str = '$UNCHANGED$') → RepulsionAttractionNeutralization

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • copy (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for copy parameter in transform.

  • ignore (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for ignore parameter in transform.

  • model (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for model parameter in transform.

  • target (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target parameter in transform.

Returns:

self – The updated object.

Return type:

object

short_name = 'RAN'

transform(model: WordEmbeddingModel, target: list[str] | None = None, ignore: list[str] | None = [], copy: bool = True) → WordEmbeddingModel

Execute Repulsion Attraction Neutralization Debias over the provided model.

Parameters:
  • model (WordEmbeddingModel) – The word embedding model to debias.

  • target (Optional[List[str]], optional) – If a set of words is specified in target, the debias method will be performed only on the word embeddings of this set. If None is provided, the debias will be performed on all words (except those specified in ignore). By default None.

  • ignore (Optional[List[str]], optional) – If target is None and a set of words is specified in ignore, the debias method will debias all words except those specified in this set, by default [].

  • copy (bool, optional) – If True, the debias will be performed on a copy of the model. If False, the debias will be applied to the model that was passed in, mutating its vectors. WARNING: Setting copy to True requires at least twice as much RAM as the size of the model; otherwise the debias may raise a MemoryError. By default True.

Returns:

The debiased embedding model.

Return type:

WordEmbeddingModel