wefe.utils.run_queries

wefe.utils.run_queries(metric: type[BaseMetric], queries: list[Query], models: list[WordEmbeddingModel], queries_set_name: str = 'Unnamed queries set', lost_vocabulary_threshold: float = 0.2, metric_params: dict = {}, generate_subqueries: bool = False, aggregate_results: bool = False, aggregation_function: str | Callable = 'abs_avg', return_only_aggregation: bool = False, warn_not_found_words: bool = False) → DataFrame

Run several queries over several word embedding models using a specific metric.

Parameters:
  • metric (Type[BaseMetric]) – A metric class.

  • queries (list) – An iterable with a set of queries.

  • models (list) – An iterable with a set of pretrained word embedding models.

  • queries_set_name (str, optional) – The name of the set of queries or the criteria that will be tested, by default ‘Unnamed queries set’

  • lost_vocabulary_threshold (float, optional) – The lost-vocabulary threshold used when running each query, by default 0.2

  • metric_params (dict, optional) – A dict with custom parameters that will be passed to the run_query method of the respective metric, by default {}

  • generate_subqueries (bool, optional) – Indicates whether the program, on detecting a query with a bigger template than the metric supports, should try to generate subqueries compatible with it. A query that is already compatible with the metric template is appended unchanged. DANGER: This may make some comparisons meaningless, because it can compare biases that are not compatible with each other. By default, False.

  • aggregate_results (bool, optional) – A boolean that indicates whether the results must be aggregated with some function, by default False.

  • aggregation_function (Union[str, Callable], optional) – The function that will be applied row by row to aggregate the results. It must be a pandas row-compatible operation. Implemented functions: ‘sum’, ‘abs_sub’, ‘avg’ and ‘abs_avg’, by default ‘abs_avg’.

  • return_only_aggregation (bool, optional) – If return_only_aggregation is True, only the column with the aggregated results is returned, by default False.
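
The row-by-row aggregation described for aggregate_results and aggregation_function can be pictured with a small sketch. This is not WEFE's implementation: the data, the aggregate helper and the AGGREGATIONS table are hypothetical, and the meaning given to each name (‘sum’ as a plain sum, ‘avg’ as a mean, ‘abs_avg’ as a mean of absolute values) is an assumption based on the names alone. Plain Python lists stand in for DataFrame rows; on a real DataFrame, the assumed ‘abs_avg’ would correspond to df.abs().mean(axis=1).

```python
from statistics import fmean

# Hypothetical results: one row per embedding model, one value per query.
results = {
    "model_a": [0.5, -0.25, 0.75],
    "model_b": [-1.0, 0.5, 0.5],
}

# Assumed meanings of three of the named aggregations, applied to one row.
AGGREGATIONS = {
    "sum": lambda row: sum(row),
    "avg": lambda row: fmean(row),
    "abs_avg": lambda row: fmean([abs(value) for value in row]),
}


def aggregate(rows, how="abs_avg"):
    """Collapse each row to one value using a named function or a callable."""
    func = AGGREGATIONS[how] if isinstance(how, str) else how
    return {name: func(row) for name, row in rows.items()}


aggregated = aggregate(results)         # mean of absolute values per model
largest = aggregate(results, how=max)   # any row-wise callable also works
```

Under these assumptions, ‘abs_avg’ keeps biases of opposite sign from cancelling each other out: model_b averages to zero with ‘avg’ but not with ‘abs_avg’.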

Returns:

A dataframe with the results. The index contains the word embedding model names and the columns contain the experiment names. Each cell holds the result of running the metric with a specific word embedding model and query.

Return type:

pd.DataFrame