Library Setup

Functions for setting up BertopicR

check_python_dependencies()
Check that dependencies are loaded
install_python_dependencies()
Install Python Dependencies

Embedding Documents

Functions for creating and manipulating document embeddings.

bt_make_embedder_st()
Create an embedding model using sentence-transformers
bt_make_embedder_spacy()
Create an embedding model using a model available from the spaCy Library
bt_make_embedder_flair()
Create an embedding model using a model available from the Flair Library
bt_make_embedder_openai()
Create an embedding model using a model available from the OpenAI Library
bt_do_embedding()
Embed your documents
bt_make_reducer_umap()
Create a UMAP dimensionality reduction model
bt_make_reducer_pca()
Create a PCA dimensionality reduction model
bt_make_reducer_truncated_svd()
Create a Truncated SVD dimensionality reduction model
bt_do_reducing()
Perform dimensionality reduction on your embeddings

Clustering

Functions for clustering embedded documents

bt_make_clusterer_hdbscan()
Create an HDBSCAN clustering model
bt_make_clusterer_kmeans()
Create a kmeans clustering model
bt_make_clusterer_agglomerative()
Create an agglomerative clustering model
bt_do_clustering()
Cluster your data

Tokenise

Functions for tokenising documents into the words and phrases used to represent topics

bt_make_vectoriser()
Create a text vectoriser
bt_make_ctfidf()
Create an instance of the ClassTfidfTransformer from the bertopic.vectorizers module

Build Model

Functions for compiling and building the model

bt_compile_model()
Build a BERTopic model
bt_fit_model()
Fit a topic model on your documents & embeddings
bt_update_topics()
Update Topic Representations

Representation

Functions for creating topic representation models

bt_representation_mmr()
Create representation model using Maximal Marginal Relevance
bt_representation_keybert()
Create representation model using KeyBERT
bt_representation_openai()
Create representation model that uses OpenAI text generation models
bt_representation_hf()
Use Hugging Face models to create topic representations

Topic Manipulation

Functions for merging topics and reducing outliers

bt_merge_topics()
Merges list(s) of topics together
bt_outliers_embeddings()
Redistributes outliers using embeddings
bt_outliers_tokenset_similarity()
Redistributes outliers using tokenset c-TF-IDF scores
bt_outliers_ctfidf()
Redistributes outliers using c-TF-IDF scores

Empty Models

Functions for creating empty instances of models

bt_empty_clusterer()
Create an empty clusterer for skipping clustering step of bertopic pipeline
bt_empty_embedder()
Create an empty embedder for skipping embedding step of bertopic pipeline
bt_empty_reducer()
Create an empty reducer for skipping dimensionality reduction step of bertopic pipeline

Detaching Bertopic

Functions for detaching the Python bertopic library

bertopic_detach()
Detach bertopic from the Python session