
Parsing & Transformation

Once data is loaded from a Data Source, it is pre-processed before being sent to the Data Sink. Many pre-processing parameters can be tuned to improve the downstream retrieval performance of your index. LlamaCloud provides reasonable defaults, and you can customize them as needed for your specific use case.

Parser Settings

A key step of any RAG pipeline is converting your input file into a format that can be used to generate a vector embedding. Many parameters control this conversion process and can be adjusted for your use case. LlamaCloud starts you off with reasonable defaults for your parsing configuration, and also lets you customize them for your specific application.
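To make the idea of "converting your input file" concrete, here is a minimal sketch of a parsing step for one file type, HTML. It is not LlamaCloud's parser and does not use any LlamaCloud API; the names `TextExtractor` and `parse_html_to_text` are hypothetical, and the only dependency is Python's standard library. It simply shows the kind of work a parser does: keep the readable text and discard markup that would add noise to an embedding.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self.parts.append(text)


def parse_html_to_text(html: str) -> str:
    """Hypothetical helper: return the readable text of an HTML document."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return "\n".join(extractor.parts)


if __name__ == "__main__":
    page = "<html><body><h1>Report</h1><p>Revenue grew.</p></body></html>"
    print(parse_html_to_text(page))
```

A real parser has to handle many more formats and layout features, which is why the parsing parameters described above exist; this sketch only illustrates the input and output of the step.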

Transformation Settings

Under construction

Embedding Model

The embedding model converts the text within your files into a numerical representation (a vector). This is a crucial step in making it possible to search for specific information within your files. There is a wide variety of embedding models to choose from, and LlamaCloud supports several of them.
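The sketch below illustrates why a numerical representation makes search possible. It is a toy stand-in, not a real embedding model and not a LlamaCloud API: `embed` here is a hypothetical word-count vectorizer, whereas actual embedding models are learned neural networks that capture meaning rather than exact word overlap. The principle is the same, though: texts become vectors, and the query is matched to the document whose vector is most similar.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(texts):
    """Map every distinct token in the corpus to a vector position."""
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(vocab)}


def embed(text, vocab):
    """Toy embedding (assumption): a vector of word counts over the vocabulary."""
    counts = Counter(tokenize(text))
    vec = [0.0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents):
    """Return the document most similar to the query, with its score."""
    vocab = build_vocabulary(documents)
    doc_vecs = [embed(d, vocab) for d in documents]
    query_vec = embed(query, vocab)
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    best = max(range(len(documents)), key=scores.__getitem__)
    return documents[best], scores[best]


if __name__ == "__main__":
    docs = [
        "invoices are due in thirty days",
        "the parser extracts tables from pdf files",
        "embedding models map text to vectors",
    ]
    print(search("pdf tables", docs))
```

Because this toy version only counts shared words, a query phrased with synonyms would score zero. Closing that gap is the main reason to choose a learned embedding model, and different models trade off quality, speed, and cost.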

After Pre-Processing, your data is ready to be sent to the Data Sink ➡️