wefe.preprocessing.preprocess_word
- wefe.preprocessing.preprocess_word(word: str, options: dict[str, str | bool | Callable] = {}, vocab_prefix: str | None = None) → str
Preprocesses a word before it is looked up in the model’s vocabulary.
- Parameters:
word (str) – Word to be preprocessed.
options (Dict[str, Union[str, bool, Callable]], optional) –
Dictionary of arguments that specify how the word is preprocessed. The available word preprocessing options are as follows:
`lowercase`: bool. Indicates whether the word is transformed to lowercase.
`uppercase`: bool. Indicates whether the word is transformed to uppercase.
`titlecase`: bool. Indicates whether the word is transformed to titlecase.
`strip_accents`: bool, {‘ascii’, ‘unicode’}. Specifies whether accents are removed from the word. The stripping type can be given explicitly; True uses ‘unicode’.
`preprocessor`: Callable. A function that operates on each word. When a function is specified, it overrides the default preprocessor (i.e., the previous options are ignored).
By default, no preprocessing is applied, which is equivalent to {}.
- Returns:
The pre-processed word according to the given parameters.
- Return type:
str
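The following is a minimal, standard-library-only sketch of how the options described above could interact. It is not WEFE's implementation: `preprocess_word_sketch` and its helpers are hypothetical names, the `vocab_prefix` argument is left out because its behavior is not described here, and the precedence between `lowercase`, `uppercase`, and `titlecase` when several are set is an assumption.

```python
import unicodedata
from typing import Callable, Dict, Optional, Union

Options = Dict[str, Union[str, bool, Callable[[str], str]]]


def _strip_accents_unicode(word: str) -> str:
    # Decompose characters, then drop the combining marks (the accents).
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def _strip_accents_ascii(word: str) -> str:
    # Decompose characters, then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", word)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def preprocess_word_sketch(word: str, options: Optional[Options] = None) -> str:
    """Hypothetical illustration of the documented options (not WEFE code)."""
    options = options or {}

    # A custom preprocessor overrides every other option.
    preprocessor = options.get("preprocessor")
    if callable(preprocessor):
        return preprocessor(word)

    # ASSUMPTION: if several case options are set, the first one listed wins.
    if options.get("lowercase"):
        word = word.lower()
    elif options.get("uppercase"):
        word = word.upper()
    elif options.get("titlecase"):
        word = word.title()

    # True is treated the same as 'unicode', as stated in the option list.
    strip_accents = options.get("strip_accents", False)
    if strip_accents is True or strip_accents == "unicode":
        word = _strip_accents_unicode(word)
    elif strip_accents == "ascii":
        word = _strip_accents_ascii(word)

    return word


if __name__ == "__main__":
    print(preprocess_word_sketch("Café"))  # no options: word is unchanged
    print(preprocess_word_sketch("Café", {"lowercase": True, "strip_accents": True}))
    print(preprocess_word_sketch("Café", {"preprocessor": lambda w: w[::-1], "lowercase": True}))
```

In the last call the custom `preprocessor` is the only transformation applied; `lowercase` is ignored, which mirrors the override rule stated in the option list.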