Document Stores
Learn how to use the Tailwinds Document Stores
The Document Stores offer a versatile approach to data management, enabling you to upload, split, and prepare your datasets for upserting, all in a single location.
This centralized approach simplifies data handling and allows for efficient management of various data formats, making it easier to organize and access your data within the app.
In this tutorial, we will set up a system to retrieve information about the LibertyGuard Deluxe Homeowners Policy, a topic that LLMs are likely not extensively trained on.
Using the Document Stores, we'll prepare and upsert data about LibertyGuard and its set of home insurance policies. This will enable our RAG system to accurately answer user queries about LibertyGuard's home insurance offerings.
Start by adding a Document Store and naming it. In our case, "LibertyGuard Deluxe Homeowners Policy".
First, we upload our PDF file.
Then, we add a unique metadata key. This is optional, but good practice, as it allows us to target and filter this same dataset later on if we need to.
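To illustrate what that metadata key enables, here is a minimal LangChain-style sketch (the `source` key and its value are hypothetical placeholders, not Tailwinds' internals):

```python
# A minimal sketch: tagging a chunk with a metadata key so the same
# dataset can be targeted and filtered later. Key and value are placeholders.
from langchain_core.documents import Document

chunk = Document(
    page_content="Coverage A insures the dwelling against fire damage...",
    metadata={"source": "libertyguard-deluxe"},  # hypothetical metadata key
)

# Later, a retriever could filter on that key, for example:
# retriever = vector_store.as_retriever(
#     search_kwargs={"filter": {"source": "libertyguard-deluxe"}}
# )
```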
Once you are satisfied with the chunking process, it's time to process your data.
Note that once you have processed your data, you will be able to edit your chunks by deleting or adding data to them. This is beneficial if:
You discover inaccuracies or inconsistencies in the original data: Editing chunks allows you to correct errors and ensure the information is accurate.
You want to refine the content for better relevance: You can adjust chunks to emphasize specific information or remove irrelevant sections.
You need to tailor chunks for specific queries: By editing chunks, you can make them more targeted to the types of questions you expect to receive.
Finally, our Retrieval-Augmented Generation (RAG) system is operational. It's noteworthy how the LLM effectively interprets the query and successfully leverages relevant information from the chunked data to construct a comprehensive response.
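Conceptually, the retrieval step behind that answer looks like the following self-contained sketch (LangChain's in-memory vector store and a deterministic fake embedding serve as stand-ins for the real backend; the chunk text and question are illustrative):

```python
# Self-contained sketch of the retrieval step behind a RAG answer.
# DeterministicFakeEmbedding stands in for a real embedding model,
# so similarity scores here are illustrative only.
from langchain_core.documents import Document
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.vectorstores import InMemoryVectorStore

embeddings = DeterministicFakeEmbedding(size=256)
store = InMemoryVectorStore(embeddings)
store.add_documents([
    Document(
        page_content="Coverage C insures personal property up to 50% of Coverage A.",
        metadata={"source": "libertyguard-deluxe"},
    ),
])

# Retrieve the chunks most similar to the user's question; the flow then
# passes these chunks to the LLM as context for its answer.
hits = store.similarity_search("How much personal property coverage is included?", k=1)
print(hits[0].page_content)
```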
We started by creating a Document Store to organize the LibertyGuard Deluxe Homeowners Policy data. This data was then prepared by uploading, chunking, processing, and upserting it, making it ready for our RAG system.
Organization and Management: The Document Store provides a centralized location for storing, managing, and preparing our data.
Data Quality: The chunking process helps ensure that our data is structured in a way that facilitates accurate retrieval and analysis.
Flexibility: The Document Store allows us to refine and adjust our data as needed, improving the accuracy and relevance of our RAG system.
Enter the Document Store we just created and select the Document Loader you want to use. In our case, since our dataset is in PDF format, we'll use the PDF Loader.
Finally, select the Text Splitter you want to use to chunk your data. In our particular case, we will use the Recursive Character Text Splitter.
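For reference, here is roughly what the loading step does behind the scenes, sketched with LangChain's community PDF loader (the file name is hypothetical):

```python
# Sketch of the PDF loading step (the file name is a placeholder).
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("libertyguard_deluxe_homeowners_policy.pdf")
documents = loader.load()  # one Document per page, with page-number metadata
print(f"Loaded {len(documents)} pages")
```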
We can now preview how our data will be chunked using our current configuration: `chunk_size=1500` and `chunk_overlap=750`.
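Under equivalent LangChain settings, that preview corresponds to something like this sketch (reusing `documents` from the loader sketch above):

```python
# Sketch of chunking with the configuration shown above.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=750)
chunks = splitter.split_documents(documents)  # `documents` from the loader sketch

# Preview the first few chunks to judge whether the boundaries make sense.
for chunk in chunks[:3]:
    print(len(chunk.page_content), "|", chunk.page_content[:80], "...")
```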
It's important to experiment with different Text Splitters, Chunk Sizes, and Overlap values to find the optimal configuration for your specific dataset. This preview allows you to refine the chunking process and ensure that the resulting chunks are suitable for your RAG system.
Now that our dataset is ready to be upserted, it's time to go to your RAG chatflow / agentflow and add the Document Store node under the LangChain > Document Loader section.
Upsert your dataset to your vector store of choice by clicking the green button in the right corner of your flow.
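Behind that button, the upsert amounts to embedding each chunk and writing it to the configured vector store. A rough LangChain equivalent, where FAISS and the fake embedding model are stand-ins for whatever backend you configured:

```python
# Sketch of the upsert step: embed every chunk and write it to a vector
# store. FAISS and FakeEmbeddings are illustrative stand-ins only.
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = FakeEmbeddings(size=256)  # swap in a real embedding model
vector_store = FAISS.from_documents(chunks, embeddings)  # `chunks` from the splitter sketch

# The stored chunks can now be queried by the RAG flow:
results = vector_store.similarity_search("What perils does the policy exclude?", k=4)
```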