Package index • LimpiaR

Cleaning Posts

Functions for editing the text variable in place.

limpiar_accents(): Clean accented characters
limpiar_spaces(): Clean redundant spaces
limpiar_url(): Clean URLs from the text variable
limpiar_repeat_chars(): Clean repeated charaaaacters
limpiar_shorthands(): Clean shorthands and abbreviations
limpiar_tags(): Clean user handles and hashtags
limpiar_stopwords(): Clean stop words for visualisations
limpiar_slang(): Clean slang from multiple Spanish dialects
limpiar_recode_emojis(): Recode emojis with a textual description
limpiar_remove_emojis(): Completely remove most emojis from text
limpiar_emojis_es(): Replace emojis with a Spanish textual description
limpiar_pp_products(): Replace entities for the Peaks & Pits classifier
limpiar_pp_companies(): Remove known companies for the Peaks & Pits classifier
limpiar_non_ascii(): Remove non-ASCII characters except those with Latin accents
limpiar_alphanumeric(): Remove everything except letters, numbers, and spaces
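
To give a feel for what "editing the text variable in place" means, here is a minimal base R sketch of four of the steps listed above (URLs, tags, repeated characters, redundant spaces). The `sketch_*` helpers, the `posts` data frame, and its `text` column are hypothetical names made up for this illustration. They are not LimpiaR's implementations and may differ from the package's behaviour and arguments; see each function's reference page for the real interface.

```r
# Hypothetical base R helpers for illustration only; NOT LimpiaR's own code.

# Drop anything that looks like a URL
sketch_remove_urls <- function(x) {
  gsub("(https?://|www\\.)\\S+", "", x, perl = TRUE)
}

# Drop user handles (@name) and hashtags (#topic)
sketch_remove_tags <- function(x) {
  gsub("[@#]\\w+", "", x, perl = TRUE)
}

# Collapse any character repeated three or more times down to one
sketch_squeeze_repeats <- function(x) {
  gsub("(.)\\1{2,}", "\\1", x, perl = TRUE)
}

# Collapse runs of whitespace and trim both ends
sketch_clean_spaces <- function(x) {
  trimws(gsub("\\s+", " ", x, perl = TRUE))
}

# Assumed input: a data frame with one text column
posts <- data.frame(
  text = c(
    "Holaaaa   @amigo mira https://example.com  #viernes",
    "  sin   cambios  "
  ),
  stringsAsFactors = FALSE
)

# Each step overwrites the same column, so the cleaning happens in place
posts$text <- sketch_remove_urls(posts$text)
posts$text <- sketch_remove_tags(posts$text)
posts$text <- sketch_squeeze_repeats(posts$text)
posts$text <- sketch_clean_spaces(posts$text)

print(posts$text)
```

In this sketch the space cleaning runs last because the earlier removals leave gaps behind; the same ordering concern is likely to apply when chaining cleaning functions in general.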

Functions for removing unwanted posts entirely (rather than cleaning).

Miscellaneous functions designed to speed up aspects of cleaning text.

limpiar_inspect(): Inspect every post and URL which contains a pattern
limpiar_na_cols(): Clean NA-heavy columns from a Data Frame or Tibble
limpiar_link_click(): Prepare a URL column to be clickable in Shiny/Data Table
limpiar_link_click_reverse(): Reverses (inverts) limpiar_link_click
limpiar_ex_subreddits(): Quickly extract subreddits from a link variable
limpiar_wrap(): Wrap strings for visual ease
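
As a rough illustration of two of the utilities above, the base R sketch below drops NA-heavy columns and wraps long strings. The helper names, the `threshold` and `width` arguments, and the example data are assumptions introduced here; they do not describe the signatures or defaults of limpiar_na_cols() or limpiar_wrap().

```r
# Hypothetical base R helpers for illustration only; NOT LimpiaR's own code.

# Keep only columns whose share of NA values is at or below `threshold`
sketch_drop_na_cols <- function(df, threshold = 0.25) {
  na_share <- colMeans(is.na(df))
  df[, na_share <= threshold, drop = FALSE]
}

# Insert line breaks so each line stays within roughly `width` characters
sketch_wrap <- function(x, width = 20) {
  pieces <- strwrap(x, width = width, simplify = FALSE)
  vapply(pieces, paste, character(1), collapse = "\n")
}

# Assumed input: `author` is 75% NA, `likes` is 25% NA, `text` has no NA
posts <- data.frame(
  text = c("a", "b", "c", "d"),
  author = c(NA, NA, NA, "x"),
  likes = c(1, NA, 3, 4),
  stringsAsFactors = FALSE
)

trimmed <- sketch_drop_na_cols(posts, threshold = 0.25)
print(names(trimmed))

long_post <- "A fairly long post that is easier to read once it is wrapped"
wrapped <- sketch_wrap(long_post, width = 20)
cat(wrapped, "\n")
```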

A collection of functions that collectively make up a Parts of Speech (POS) analysis workflow.

limpiar_pos_import_model(): Import UDPipe models to begin Parts of Speech analysis
limpiar_pos_annotate(): Annotate texts for Parts of Speech analysis using UDPipe models