Package index
-
limpiar_accents() - Clean accented characters
-
limpiar_spaces() - Clean redundant spaces
-
limpiar_url() - Clean URLs from the text variable
-
limpiar_repeat_chars() - Clean repeated charaaaacters
-
limpiar_shorthands() - Clean shorthands and abbreviations
-
limpiar_tags() - Clean user handles and hashtags
-
limpiar_stopwords() - Clean stop words for visualisations
-
limpiar_slang() - Clean slang from multiple Spanish dialects
-
limpiar_recode_emojis() - Recode emojis with a textual description
-
limpiar_remove_emojis() - Completely remove most emojis from text
-
limpiar_emojis_es() - Replace emojis with a Spanish textual description
-
limpiar_pp_products() - Replace entities for the Peaks & Pits classifier
-
limpiar_pp_companies() - Remove known companies for Peaks & Pits
-
limpiar_non_ascii() - Remove non-ASCII characters except those with Latin accents
-
limpiar_alphanumeric() - Remove everything except letters, numbers, and spaces
-
limpiar_duplicates() - Clean the text variable of duplicate posts
-
limpiar_retweets() - Clean retweets from the text variable
-
limpiar_spam_grams() - Remove posts containing spam-like n-grams
-
limpiar_inspect() - Inspect every post and URL which contains a pattern
-
limpiar_na_cols() - Clean NA-heavy columns from a data frame or tibble
-
limpiar_link_click() - Prepare a URL column to be clickable in Shiny/Data Table
-
limpiar_link_click_reverse() - Reverse (invert) limpiar_link_click
-
limpiar_ex_subreddits() - Quickly extract subreddits from a link variable
-
limpiar_wrap() - Wrap strings for visual ease
Processing Parts of Speech
A collection of functions that together make up a Parts of Speech (POS) analysis workflow.
-
limpiar_pos_import_model() - Import UDPipe models to begin Parts of Speech Analysis
-
limpiar_pos_annotate() - Annotate texts for Parts of Speech analysis using UDPipe models