Package index
-
limpiar_accents()
- Clean accented characters
-
limpiar_spaces()
- Clean redundant spaces
-
limpiar_url()
- Clean URLs from the text variable
-
limpiar_repeat_chars()
- Clean repeated charaaaacters
-
limpiar_shorthands()
- Clean shorthands and abbreviations
-
limpiar_tags()
- Clean user handles and hashtags
-
limpiar_stopwords()
- Clean stop words for visualisations
-
limpiar_slang()
- Clean slang from multiple Spanish dialects
-
limpiar_recode_emojis()
- Recode emojis with a textual description
-
limpiar_remove_emojis()
- Completely remove most emojis from text
-
limpiar_emojis_es()
- Replace emojis with a Spanish textual description
-
limpiar_pp_products()
- Replace entities for the Peaks & Pits classifier
-
limpiar_pp_companies()
- Remove known companies for Peaks & Pits
-
limpiar_non_ascii()
- Remove non-ASCII characters except those with Latin accents
-
limpiar_alphanumeric()
- Remove everything except letters, numbers, and spaces
-
limpiar_duplicates()
- Clean the text variable of duplicate posts
-
limpiar_retweets()
- Clean retweets from the text variable
-
limpiar_spam_grams()
- Remove posts containing spam-like n-grams
-
limpiar_inspect()
- Inspect every post and URL that contains a pattern
-
limpiar_na_cols()
- Clean NA-heavy columns from a data frame or tibble
-
limpiar_link_click()
- Prepare a URL column to be clickable in Shiny or a DataTable
-
limpiar_link_click_reverse()
- Reverse (invert) limpiar_link_click()
-
limpiar_ex_subreddits()
- Quickly extract subreddits from a link variable
-
limpiar_wrap()
- Wrap strings for visual ease
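
To give a feel for the kind of transformation several of the cleaning functions above describe (URLs, repeated characters, redundant spaces), here is a minimal base-R sketch. It deliberately does not call any of the package's functions, since their arguments are not shown in this index. The `posts` vector and the regular expressions are hypothetical illustrations, not the package's implementation.

```r
# Illustrative base-R sketch only; NOT the package's implementation.
# `posts` and every regular expression below are hypothetical examples.
posts <- c(
  "Holaaaa   mundo  https://example.com/abc ",
  "Muy   buenooo   dia"
)

# Remove URLs (the kind of step limpiar_url() is described as doing)
no_urls <- gsub("https?://\\S+", "", posts, perl = TRUE)

# Collapse runs of three or more identical letters to a single letter
# (the kind of step limpiar_repeat_chars() is described as doing)
no_repeats <- gsub("([[:alpha:]])\\1{2,}", "\\1", no_urls, perl = TRUE)

# Collapse redundant whitespace and trim both ends
# (the kind of step limpiar_spaces() is described as doing)
cleaned <- trimws(gsub("\\s+", " ", no_repeats, perl = TRUE))

print(cleaned)
```

The order matters in this sketch: the URL is removed first so that the gap it leaves behind is tidied up by the final whitespace step.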
Processing Parts of Speech
A collection of functions that together make up a Parts of Speech (POS) analysis workflow.
-
limpiar_pos_import_model()
- Import UDPipe models to begin Parts of Speech analysis
-
limpiar_pos_annotate()
- Annotate texts for Parts of Speech analysis using UDPipe models