
Advanced

LlamaCloud comes with a few advanced retrieval techniques that let you improve retrieval accuracy.

Hybrid Search

Hybrid search combines vector search and keyword search to improve retrieval accuracy. By leveraging the strengths of both methods, it can return more relevant results than either method alone.

Note: Hybrid search is currently only supported by a few vector databases. See data sinks for a list of databases that support this feature.

How Hybrid Search Works

  1. Vector Search: This method uses vector embeddings to find documents that are semantically similar to the query. It is particularly useful for capturing the meaning and context of the query, even if the exact keywords are not present in the documents.

  2. Keyword Search: This method looks for exact matches of the query keywords in the documents. It is effective for finding documents that contain specific terms.

By combining these two methods, hybrid search can return results that are both contextually relevant and contain the specific keywords from the query.
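
The alpha parameter shown in the retriever below controls how much weight each method gets. Conceptually, the combined score is a weighted blend of the dense (vector) and sparse (keyword) scores; the following is only an illustrative sketch, since the actual fusion is performed server-side by LlamaCloud and the underlying vector database:

# Conceptual sketch only: how an alpha-weighted fusion of dense and sparse
# scores could work. alpha=1.0 is pure vector search, alpha=0.0 is pure
# keyword search.
def fuse_scores(dense_score: float, sparse_score: float, alpha: float = 0.5) -> float:
    return alpha * dense_score + (1 - alpha) * sparse_score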

Here's how you can include hybrid search in your Retrieval API requests:

import os

os.environ["LLAMA_CLOUD_API_KEY"] = "llx-..."  # can provide the API key in the env or in the constructor later on

from llama_index.indices.managed.llama_cloud import LlamaCloudIndex

# connect to an existing index
index = LlamaCloudIndex("my_first_index", project_name="Default")

# configure the retriever for hybrid search
retriever = index.as_retriever(
    dense_similarity_top_k=3,
    sparse_similarity_top_k=3,
    alpha=0.5,
    enable_reranking=False,
)
nodes = retriever.retrieve("Example query")
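
The retriever returns a list of scored nodes (llama_index NodeWithScore objects). As a quick sanity check, you can print each node's relevance score and a snippet of its text:

# inspect the retrieved chunks and their relevance scores
for node_with_score in nodes:
    print(node_with_score.score, node_with_score.node.get_content()[:200])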

Re-ranking

Re-ranking improves the order of search results by applying a ranking model to the initial set of retrieved document chunks, so that the most relevant chunks appear at the top. A common pattern is to retrieve with a high top-k value, re-rank the results, and then use only the top few re-ranked chunks as the basis for your final response.

Here's how you can include re-ranking in your Retrieval API requests:

import os

os.environ["LLAMA_CLOUD_API_KEY"] = "llx-..."  # can provide the API key in the env or in the constructor later on

from llama_index.indices.managed.llama_cloud import LlamaCloudIndex

# connect to an existing index
index = LlamaCloudIndex("my_first_index", project_name="Default")

# configure the retriever with re-ranking enabled
retriever = index.as_retriever(
    dense_similarity_top_k=3,
    sparse_similarity_top_k=3,
    alpha=0.5,
    enable_reranking=True,
    rerank_top_n=3,
)
nodes = retriever.retrieve("Example query")
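
To follow the retrieve-broadly-then-narrow pattern described above, set the top-k values higher than rerank_top_n. The numbers below are purely illustrative, not recommendations:

# retrieve a larger candidate pool, then keep only the best 3 after re-ranking
retriever = index.as_retriever(
    dense_similarity_top_k=20,   # illustrative value: cast a wide net
    sparse_similarity_top_k=20,  # illustrative value
    alpha=0.5,
    enable_reranking=True,
    rerank_top_n=3,              # only the top 3 re-ranked chunks are returned
)
nodes = retriever.retrieve("Example query")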

Metadata Filtering

Metadata filtering allows you to narrow down your search results based on specific attributes or tags associated with the documents. This can be particularly useful when you have a large dataset and want to focus on a subset of documents that meet certain criteria.

Here are a few use cases where metadata filtering would be useful:

  • Only retrieve chunks from a specific set of files
  • Implement access control by filtering on the user IDs or user group IDs associated with each document
  • Filter documents by creation or modification date to retrieve the most recent or relevant information
  • Focus on documents that contain specific tags or categories, such as "financial reports" or "technical documentation"

Here's how you can include metadata filtering in your Retrieval API requests:

import os

os.environ["LLAMA_CLOUD_API_KEY"] = "llx-..."  # can provide the API key in the env or in the constructor later on

from llama_index.indices.managed.llama_cloud import LlamaCloudIndex
from llama_index.core.vector_stores import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
)

# connect to an existing index
index = LlamaCloudIndex("my_first_index", project_name="Default")

# create a metadata filter that only matches documents whose "theme" is "Fiction"
filters = MetadataFilters(
    filters=[
        MetadataFilter(key="theme", operator=FilterOperator.EQ, value="Fiction"),
    ]
)

# configure the retriever
retriever = index.as_retriever(
    dense_similarity_top_k=3,
    sparse_similarity_top_k=3,
    alpha=0.5,
    enable_reranking=True,
    rerank_top_n=3,
    filters=filters,
)
nodes = retriever.retrieve("Example query")
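
For use cases like the file-based and access-control examples above, you can combine several conditions in a single MetadataFilters object. The sketch below assumes the documents carry hypothetical "file_name" and "user_group" metadata keys and that the underlying data sink supports these operators; adjust the keys and values to match how your documents were actually ingested.

from llama_index.core.vector_stores import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
    FilterCondition,
)

# hypothetical metadata keys -- replace with the metadata your documents carry
filters = MetadataFilters(
    filters=[
        MetadataFilter(
            key="file_name",
            operator=FilterOperator.IN,
            value=["report_q1.pdf", "report_q2.pdf"],
        ),
        MetadataFilter(
            key="user_group",
            operator=FilterOperator.EQ,
            value="finance",
        ),
    ],
    condition=FilterCondition.AND,  # both filters must match
)

retriever = index.as_retriever(
    dense_similarity_top_k=3,
    sparse_similarity_top_k=3,
    alpha=0.5,
    filters=filters,
)
nodes = retriever.retrieve("Example query")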