wefe.datasets.fetch_eds

wefe.datasets.fetch_eds(occupations_year: int = 2015, top_n_race_occupations: int = 10, n_retries: int = 3) → dict[str, list[str]]
Fetch the word sets used in the experiments of _Word Embeddings Quantify 100 Years Of Gender And Ethnic Stereotypes_ [1].

This dataset includes the following word sets:

- gender: male, female.
- ethnicity: asian, black, white.
- religion: christianity, judaism and islam.
- adjectives: appearance, intelligence, otherization, sensitive.

References

[1]: Word Embeddings quantify 100 years of gender and ethnic stereotypes.
Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018).
Proceedings of the National Academy of Sciences, 115(16), E3635-E3644.
Parameters:
  • occupations_year (int, optional) – The year of the census for the occupations file. Available years: {1850, 1860, 1870, 1880, 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015}, by default 2015

  • top_n_race_occupations (int, optional) – The number of occupations by race, by default 10

  • n_retries (int, optional) – Number of retries to attempt for each request, by default 3
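
The year constraint above can be checked before any request is made. The following is a minimal sketch, not part of WEFE: `AVAILABLE_YEARS` and `build_fetch_eds_kwargs` are hypothetical names introduced here for illustration, and the year set is copied from the parameter description.

```python
# Census years listed in the occupations_year description above.
AVAILABLE_YEARS = frozenset(
    list(range(1850, 1890, 10))    # 1850, 1860, 1870, 1880
    + list(range(1900, 2000, 10))  # 1900, 1910, ..., 1990
    + list(range(2000, 2016))      # 2000, 2001, ..., 2015
)


def build_fetch_eds_kwargs(occupations_year=2015, top_n_race_occupations=10, n_retries=3):
    """Hypothetical helper: check the year, then return keyword arguments for fetch_eds."""
    if occupations_year not in AVAILABLE_YEARS:
        raise ValueError(
            f"occupations_year={occupations_year} is not one of the available census years"
        )
    return {
        "occupations_year": occupations_year,
        "top_n_race_occupations": top_n_race_occupations,
        "n_retries": n_retries,
    }
```

With WEFE installed, the result could then be passed on as `fetch_eds(**build_fetch_eds_kwargs(occupations_year=1990))`.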

Returns:

A dictionary that maps each word set name to its list of words.

Return type:

dict[str, list[str]]
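
Because the return value is a plain dictionary of word lists, it can be inspected with ordinary Python. The sketch below assumes the `wefe` package is installed and the network is reachable for the actual download; `load_eds_word_sets` and `summarize_word_sets` are hypothetical helper names, and the `example` dictionary holds placeholder data, not the real dataset contents.

```python
def load_eds_word_sets(occupations_year=2015):
    """Download the word sets; needs the wefe package and network access."""
    # Imported lazily so the helper below can be used without wefe installed.
    from wefe.datasets import fetch_eds

    return fetch_eds(occupations_year=occupations_year)


def summarize_word_sets(word_sets):
    """Return (name, size, first three words) for each word set, sorted by name."""
    return [(name, len(words), words[:3]) for name, words in sorted(word_sets.items())]


# Stand-in data with placeholder names, NOT the real dataset contents.
example = {"set_b": ["w1", "w2", "w3", "w4"], "set_a": ["w5"]}
for name, size, preview in summarize_word_sets(example):
    print(f"{name}: {size} words, e.g. {preview}")
```

To summarize the real data, call `summarize_word_sets(load_eds_word_sets())` instead of using the stand-in dictionary.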