This function wraps the PCA implementation from Python's sklearn package for use in R via reticulate. It performs dimension reduction on high-dimensional data and is intended for use in a BertopicR pipeline. If processing time is a concern, you will most likely want to reduce the dimensions of your dataset only once. In that case, when compiling your model with bt_compile_model, pass an empty reducer created with reducer <- bt_empty_reducer().

Usage

bt_make_reducer_pca(
  n_components,
  ...,
  svd_solver = c("auto", "full", "arpack", "randomized")
)

Arguments

n_components

Number of components to keep

...

Additional arguments passed on to sklearn.decomposition.PCA

svd_solver

SVD solver used to compute the components. One of "auto", "full", "arpack" or "randomized"

Value

A PCA model that can be passed to bt_do_reducing to reduce the dimensions of data

Examples

# using default svd_solver
reducer <- bt_make_reducer_pca(n_components = 100)

# specifying extra PCA arguments
reducer <- bt_make_reducer_pca(n_components = 20, svd_solver = "full", random_state = 42L)
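
The description recommends reducing dimensions only once and then compiling the model with an empty reducer. A minimal sketch of that workflow follows. It assumes a working reticulate Python environment with sklearn and bertopic installed, uses a random matrix as a stand-in for real document embeddings, and assumes that bt_do_reducing takes arguments named reducer and embeddings and that bt_compile_model accepts a reducer argument. Check those signatures against the package reference before relying on them.

```r
library(BertopicR)

# Placeholder data: 50 "documents" with 384-dimensional random embeddings.
# In a real pipeline these would come from an embedding model.
set.seed(42)
embeddings <- matrix(rnorm(50 * 384), nrow = 50, ncol = 384)

# Build the PCA reducer and reduce the embeddings once, up front.
reducer <- bt_make_reducer_pca(n_components = 10, svd_solver = "full")
reduced_embeddings <- bt_do_reducing(reducer = reducer, embeddings = embeddings)

# Compile the model with an empty reducer so the already-reduced
# embeddings are not reduced a second time.
# (Argument name `reducer` is an assumption, see the note above.)
empty_reducer <- bt_empty_reducer()
model <- bt_compile_model(reducer = empty_reducer)
```

With this arrangement, reduced_embeddings can be reused across several model configurations without repeating the PCA step.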