`wefe.preprocessing`.preprocess_word

wefe.preprocessing.preprocess_word(word: str, options: dict[str, str | bool | Callable] = {}, vocab_prefix: str | None = None) → str

Preprocesses a word before it is looked up in the model's vocabulary.

Parameters:

word (str) – Word to be preprocessed.
options (Dict[str, Union[str, bool, Callable]], optional) –
Dictionary of arguments that specify how the word will be preprocessed. The available preprocessing options are:
- `lowercase`: bool. Indicates whether the word is converted to lowercase.
- `uppercase`: bool. Indicates whether the word is converted to uppercase.
- `titlecase`: bool. Indicates whether the word is converted to titlecase.
- `strip_accents`: bool or one of {'ascii', 'unicode'}. Specifies whether accents are removed from the word and, optionally, which stripping method is used. True uses 'unicode'.
- `preprocessor`: Callable. A function that is applied to each word. When a function is specified, it overrides the default preprocessor (i.e., the options above are ignored).
By default, no preprocessing is applied, which is equivalent to passing {}.
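
The options above can be illustrated with a small standard-library stand-in. This is a sketch of the documented behaviour, not WEFE's implementation: the name `preprocess_word_sketch`, the two accent-stripping helpers, and the precedence among `lowercase`, `uppercase` and `titlecase` when several are set are all assumptions, and `vocab_prefix` is left out because this page does not describe it.

```python
import unicodedata
from typing import Callable, Dict, Optional, Union

Options = Dict[str, Union[str, bool, Callable[[str], str]]]


def strip_accents_unicode(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_accents_ascii(text: str) -> str:
    """Decompose characters (NFKD) and drop everything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def preprocess_word_sketch(word: str, options: Optional[Options] = None) -> str:
    """Hypothetical stand-in that mirrors the documented options."""
    options = options or {}

    # A custom preprocessor overrides every other option.
    preprocessor = options.get("preprocessor")
    if callable(preprocessor):
        return preprocessor(word)

    # Case handling. Assumption: only the first enabled option is applied.
    if options.get("lowercase"):
        word = word.lower()
    elif options.get("uppercase"):
        word = word.upper()
    elif options.get("titlecase"):
        word = word.title()

    # Accent stripping: True falls back to the 'unicode' method.
    strip = options.get("strip_accents", False)
    if strip is True or strip == "unicode":
        word = strip_accents_unicode(word)
    elif strip == "ascii":
        word = strip_accents_ascii(word)

    return word


print(preprocess_word_sketch("Café"))  # Café (empty options: unchanged)
print(preprocess_word_sketch("Café", {"lowercase": True, "strip_accents": True}))  # cafe
print(preprocess_word_sketch("AbC", {"preprocessor": lambda w: w[::-1], "lowercase": True}))  # CbA
```

In the last call the `lowercase` flag has no effect, because the supplied `preprocessor` takes over entirely.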

Returns:

The word, preprocessed according to the given options.

Return type:

str
