wefe.preprocess_word

wefe.preprocess_word(word: str, options: Dict[str, Union[str, bool, Callable]] = {}, vocab_prefix: Optional[str] = None) → str

Preprocesses a word before it is looked up in the model’s vocabulary.

Parameters
word : str

Word to be preprocessed.

options : Dict[str, Union[str, bool, Callable]], optional

Dictionary of arguments that specify how the word will be preprocessed. The available preprocessing options are:

  • `lowercase`: bool. Indicates if the words are transformed to lowercase.

  • `uppercase`: bool. Indicates if the words are transformed to uppercase.

  • `titlecase`: bool. Indicates if the words are transformed to titlecase.

  • `strip_accents`: bool or {‘ascii’, ‘unicode’}. Specifies whether accents are removed from the word and, optionally, which stripping method is used. True is equivalent to ‘unicode’.

  • `preprocessor`: Callable. A function that is applied to each word. If a function is given, it overrides the default preprocessor, so the options above are ignored.

By default, no preprocessing is applied, which is equivalent to passing {}.

Returns
str

The preprocessed word, according to the given options.
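The options described above can be illustrated with a small standalone sketch. It is not WEFE's implementation: `preprocess_word_sketch` and its two helpers are hypothetical names, written with the standard library only. Several details are assumptions rather than documented behavior: the case options are checked in the order lowercase, uppercase, titlecase; accents are stripped after the case change; ‘ascii’ and ‘unicode’ are taken to mean "drop characters with no ASCII form" and "drop combining marks" respectively; and `vocab_prefix`, which the parameter list above does not describe, is assumed to be prepended to the result.

```python
import unicodedata
from typing import Callable, Dict, Optional, Union


def _strip_accents_unicode(word: str) -> str:
    # Decompose characters (NFKD) and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def _strip_accents_ascii(word: str) -> str:
    # Decompose, then discard every character that has no ASCII form.
    decomposed = unicodedata.normalize("NFKD", word)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def preprocess_word_sketch(
    word: str,
    options: Optional[Dict[str, Union[str, bool, Callable]]] = None,
    vocab_prefix: Optional[str] = None,
) -> str:
    """Illustrative re-creation of the documented options (not WEFE code)."""
    options = options or {}

    preprocessor = options.get("preprocessor")
    if callable(preprocessor):
        # A custom preprocessor replaces all of the built-in options.
        word = preprocessor(word)
    else:
        # Assumption: at most one case option is applied, in this order.
        if options.get("lowercase"):
            word = word.lower()
        elif options.get("uppercase"):
            word = word.upper()
        elif options.get("titlecase"):
            word = word.title()

        strip_accents = options.get("strip_accents")
        if strip_accents is True or strip_accents == "unicode":
            word = _strip_accents_unicode(word)
        elif strip_accents == "ascii":
            word = _strip_accents_ascii(word)

    # Assumption: the prefix, if any, is prepended to the processed word.
    if vocab_prefix:
        word = vocab_prefix + word
    return word


print(preprocess_word_sketch("Café", {"lowercase": True, "strip_accents": True}))
print(preprocess_word_sketch("AbC", {"preprocessor": lambda w: w[::-1], "lowercase": True}))
```

In the second call the custom `preprocessor` takes precedence, so `lowercase` has no effect. With the real library, the equivalent call would pass the same `options` dictionary to `wefe.preprocess_word`.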