Generative AI and large language models (LLMs) are revolutionizing organizations across many sectors, enhancing customer experience in ways that would traditionally take years of progress. Every organization has data stored somewhere, either on premises or with a cloud provider.
You can embrace generative AI and enhance customer experience by converting your existing data into an index that generative AI can search. When you ask an open source LLM a question, you get publicly available information in response. That is helpful, but generative AI can also help you understand your own data along with additional context from LLMs. This is achieved through Retrieval Augmented Generation (RAG).
RAG retrieves data from a preexisting knowledge base (your data), combines it with the LLM's knowledge, and generates responses in more human-like language. However, for generative AI to understand your data, some data preparation is required, which involves a steep learning curve.
Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud. Aurora combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases.
In this post, we walk you through how to convert your existing Aurora data into an index without any data preparation for Amazon Kendra, and how to implement RAG that combines data search with LLM knowledge to produce accurate responses.
Solution overview
In this solution, you use your existing data as a data source (Aurora), connect and sync that data source to Amazon Kendra search to create an intelligent search service, and use RAG to generate accurate responses by combining your data with the LLM's knowledge. In this post, we use Anthropic's Claude on Amazon Bedrock as the LLM.
The following are the high-level steps of the solution:
- Create an Aurora PostgreSQL cluster.
- Ingest data into Aurora PostgreSQL-Compatible.
- Create an Amazon Kendra index.
- Set up the Amazon Kendra Aurora PostgreSQL connector.
- Invoke the RAG application.
The following figure shows the solution architecture.
Prerequisites
To follow this post, you need the following prerequisites:
Create an Aurora PostgreSQL cluster
Run the following AWS CLI command to create an Aurora PostgreSQL Serverless v2 cluster.
The following screenshot shows the created instance.
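The same cluster creation step can also be sketched in Python with boto3. This is a minimal sketch rather than the exact command used in this post: the cluster and instance identifiers, the master user name, and the capacity range are hypothetical placeholders, and the engine version is left to the service default.

```python
"""Sketch: create an Aurora PostgreSQL Serverless v2 cluster with boto3."""

# Hypothetical identifiers, not values taken from this post.
CLUSTER_ID = "genai-kendra-cluster"
INSTANCE_ID = "genai-kendra-instance"


def build_cluster_params(cluster_id=CLUSTER_ID, min_acu=0.5, max_acu=4.0):
    """Return keyword arguments for rds.create_db_cluster()."""
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-postgresql",
        "MasterUsername": "postgres",  # assumed user name
        # Let RDS generate the password and keep it in AWS Secrets Manager.
        "ManageMasterUserPassword": True,
        # Serverless v2 capacity range in Aurora capacity units (ACUs).
        "ServerlessV2ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
        },
    }


def build_instance_params(cluster_id=CLUSTER_ID, instance_id=INSTANCE_ID):
    """Return keyword arguments for rds.create_db_instance()."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-postgresql",
        # "db.serverless" marks the instance as Serverless v2.
        "DBInstanceClass": "db.serverless",
    }


def create_serverless_cluster(rds_client):
    """Create the cluster first, then add one serverless instance to it."""
    rds_client.create_db_cluster(**build_cluster_params())
    rds_client.create_db_instance(**build_instance_params())


# Example call (needs the boto3 package and AWS credentials):
#   import boto3
#   create_serverless_cluster(boto3.client("rds", region_name="us-east-2"))
```

Keeping the parameters in small builder functions makes it easy to review or adjust them before any AWS call is made.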
Ingest data into Aurora PostgreSQL-Compatible
Use the pgAdmin tool to connect to the Aurora instance. For more information, see Connecting to a DB instance running the PostgreSQL database engine. To ingest your data, complete the following steps:
- In pgAdmin, run the following PostgreSQL statements to create the database, schema, and table.
- In your pgAdmin Aurora PostgreSQL connection, navigate to Databases, genai, Schemas, employees, Tables.
- Choose (right-click) Tables and choose PSQL Tool to open a psql client connection.
- Place the CSV file under your pgAdmin location and run the following command:
- Run the following psql query to verify the number of records copied.
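The ingestion steps above can be sketched in Python with psycopg2 instead of pgAdmin. The database, schema, table, and column names (genai, employees.amazon_review, pk, reviews_title, reviews_text) come from the connector settings used later in this post. The column types and the CSV layout (exactly these three columns plus a header row) are assumptions for illustration, so adjust them to match your file.

```python
"""Sketch: create the schema and table, load a CSV file, and count rows."""

# Run once while connected to the default database with autocommit enabled,
# because CREATE DATABASE cannot run inside a transaction block.
CREATE_DATABASE_SQL = "CREATE DATABASE genai;"

# Column types are assumptions; only the column names appear in this post.
CREATE_SCHEMA_AND_TABLE_SQL = """
CREATE SCHEMA IF NOT EXISTS employees;
CREATE TABLE IF NOT EXISTS employees.amazon_review (
    pk            INTEGER PRIMARY KEY,
    reviews_title TEXT,
    reviews_text  TEXT
);
"""

COPY_SQL = (
    "COPY employees.amazon_review (pk, reviews_title, reviews_text) "
    "FROM STDIN WITH (FORMAT csv, HEADER true)"
)

COUNT_SQL = "SELECT count(*) FROM employees.amazon_review;"


def load_reviews(conn, csv_path):
    """Load csv_path into employees.amazon_review and return the row count.

    conn is an open psycopg2 connection to the genai database.
    """
    with conn.cursor() as cur:
        cur.execute(CREATE_SCHEMA_AND_TABLE_SQL)
        with open(csv_path, newline="", encoding="utf-8") as csv_file:
            # copy_expert streams the local file through COPY ... FROM STDIN.
            cur.copy_expert(COPY_SQL, csv_file)
        cur.execute(COUNT_SQL)
        (row_count,) = cur.fetchone()
    conn.commit()
    return row_count


# Example call (needs psycopg2; host, user, password, and file are placeholders):
#   import psycopg2
#   conn = psycopg2.connect(host="<cluster-endpoint>", dbname="genai",
#                           user="<user>", password="<password>")
#   print(load_reviews(conn, "reviews.csv"))
```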
Create an Amazon Kendra index
The Amazon Kendra index holds the contents of your documents and is structured in a way that makes the documents searchable. There are three index types:
- Generative AI Enterprise Edition index: Offers the highest accuracy for the Retrieve API operation and for RAG use cases (recommended)
- Enterprise Edition index: Provides semantic search capabilities and offers a high-availability service that is suitable for production workloads
- Developer Edition index: Provides semantic search capabilities for you to test your use cases
To create an Amazon Kendra index, complete the following steps:
- On the Amazon Kendra console, choose Indexes in the navigation pane.
- Choose Create an index.
- On the Specify index details page, provide the following information:
  - For Index name, enter a name (for example, genai-kendra-index).
  - For IAM role, choose Create a new role (Recommended).
  - For Role name, enter an IAM role name (for example, genai-kendra). Your role name is prefixed with AmazonKendra-<region>- (for example, AmazonKendra-us-east-2-genai-kendra).
- Choose Next.
- On the Add additional capacity page, select Developer edition (for this demo) and choose Next.
- On the Configure user access control page, provide the following information:
  - Under Access control settings, select No.
  - Under User-group expansion, select None.
- Choose Next.
- On the Review and create page, verify the details and choose Create.
It might take some time for the index to be created. Check the list of indexes to watch the progress of index creation. When the index status is ACTIVE, the index is ready to use.
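If you prefer to script this step, the console flow above can be approximated with the boto3 Kendra client. The following is a minimal sketch that assumes the IAM role already exists; the role ARN in the example call is a hypothetical placeholder, and the polling interval and attempt limit are arbitrary choices.

```python
"""Sketch: create an Amazon Kendra index and wait until it is ACTIVE."""
import time


def build_index_params(name, role_arn, edition="DEVELOPER_EDITION"):
    """Return keyword arguments for kendra.create_index()."""
    return {"Name": name, "Edition": edition, "RoleArn": role_arn}


def wait_until_active(kendra_client, index_id, delay_seconds=30, max_attempts=60):
    """Poll describe_index until the index is ACTIVE, failing fast on FAILED."""
    for _ in range(max_attempts):
        status = kendra_client.describe_index(Id=index_id)["Status"]
        if status == "ACTIVE":
            return status
        if status == "FAILED":
            raise RuntimeError(f"Creation of index {index_id} failed")
        time.sleep(delay_seconds)
    raise TimeoutError(f"Index {index_id} was not ACTIVE after {max_attempts} checks")


# Example call (needs boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   kendra = boto3.client("kendra", region_name="us-east-2")
#   params = build_index_params(
#       "genai-kendra-index",
#       "arn:aws:iam::<account-id>:role/<kendra-role-name>",
#   )
#   index_id = kendra.create_index(**params)["Id"]
#   wait_until_active(kendra, index_id)
```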
Set up the Amazon Kendra Aurora PostgreSQL connector
Complete the following steps to set up your data source connector:
- On the Amazon Kendra console, choose Data sources in the navigation pane.
- Choose Add data source.
- Choose Aurora PostgreSQL connector as the data source type.
- On the Specify data source details page, provide the following information:
  - For Data source name, enter a name (for example, data_source_genai_kendra_postgresql).
  - For Default language, choose English (en).
  - Choose Next.
- On the Define access and security page, under Source, provide the following information:
  - For Host, enter the host name of the PostgreSQL instance (cvgupdj47zsh.us-east-2.rds.amazonaws.com).
  - For Port, enter the port number of the PostgreSQL instance (5432).
  - For Instance, enter the database name of the PostgreSQL instance (genai).
- Under Authentication, if you already have credentials stored in AWS Secrets Manager, choose them on the dropdown menu. Otherwise, choose Create and add new secret.
- In the Create an AWS Secrets Manager secret pop-up window, provide the following information:
  - For Secret name, enter a name (for example, AmazonKendra-Aurora-PostgreSQL-genai-kendra-secret).
  - For Database user name, enter the name of your database user.
  - For Password, enter the user password.
- Choose Add secret.
- Under Configure VPC and security group, provide the following information:
  - For Virtual Private Cloud, choose your virtual private cloud (VPC).
  - For Subnet, choose your subnet.
  - For VPC security groups, choose the VPC security group that allows access to your data source.
- Under IAM role, if you have an existing role, choose it on the dropdown menu. Otherwise, choose Create a new role.
- On the Configure sync settings page, under Sync scope, provide the following information:
  - For SQL query, enter the SQL query and column values as follows: select * from employees.amazon_review.
  - For Primary key, enter the primary key column (pk).
  - For Title, enter the title column that provides the name of the document title in your database table (reviews_title).
  - For Body, enter the body column on which your Amazon Kendra search will run (reviews_text).
- Under Sync mode, select Full sync to convert the entire table data into a searchable index.
  After the sync completes successfully, your Amazon Kendra index contains the data from the specified Aurora PostgreSQL table. You can then use this index for intelligent search and RAG applications.
- Under Sync run schedule, choose Run on demand.
- Choose Next.
- On the Set field mappings page, keep the default settings and choose Next.
- Review your settings and choose Add data source.
Your data source appears on the Data sources page after it has been created successfully.
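Because the sync schedule is set to run on demand, a sync has to be started explicitly, either from the console or through the API. The following minimal boto3 sketch starts a sync job and polls its status. The function name, polling interval, and attempt limit are illustrative assumptions, and the sketch assumes the new job appears in the first page of the sync job history.

```python
"""Sketch: start an on-demand Amazon Kendra data source sync and wait for it."""
import time

# Sync job states that mean the job has not finished yet.
IN_PROGRESS = {"SYNCING", "SYNCING_INDEXING", "STOPPING"}


def run_sync(kendra_client, index_id, data_source_id,
             delay_seconds=60, max_attempts=120):
    """Start a sync job and return its final status (for example, SUCCEEDED)."""
    execution_id = kendra_client.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id
    )["ExecutionId"]

    for _ in range(max_attempts):
        history = kendra_client.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        ).get("History", [])
        # Find the job started above among the reported jobs.
        job = next(
            (j for j in history if j.get("ExecutionId") == execution_id), None
        )
        if job is not None and job["Status"] not in IN_PROGRESS:
            return job["Status"]
        time.sleep(delay_seconds)
    raise TimeoutError(f"Sync job {execution_id} did not finish in time")


# Example call (needs boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   kendra = boto3.client("kendra", region_name="us-east-2")
#   print(run_sync(kendra, "<Kendra_index_id>", "<data_source_id>"))
```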
Invoke the RAG application
The Amazon Kendra index sync can take from minutes to hours, depending on the volume of your data. When the sync completes without errors, you are ready to develop your RAG solution in your preferred IDE. Complete the following steps:
- Configure your AWS credentials so that Boto3 can interact with AWS services. You can do this by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables or by using the ~/.aws/credentials file:
- Import LangChain and the necessary components:
- Create an instance of the LLM (Anthropic's Claude):
- Create your prompt template, which provides instructions for the LLM:
- Initialize the KendraRetriever with the Amazon Kendra client and your Amazon Kendra index ID, replacing Kendra_index_id with the ID of the index you created earlier:
- Combine Anthropic's Claude and the Amazon Kendra retriever into a RetrievalQA chain:
- Invoke the chain with your own query:
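The steps above can be sketched end to end as follows. This is a hedged illustration, not the exact code from this post: it assumes the boto3, langchain (a 0.2 or 0.3 release that still ships RetrievalQA), langchain-core, and langchain-aws packages are installed, that AWS credentials are already configured, and that your account has access to the Claude model on Amazon Bedrock. The model ID, Region, prompt wording, and top_k value are assumptions.

```python
"""Sketch: RAG over an Amazon Kendra index with Claude on Amazon Bedrock."""

# Prompt for the "stuff" chain, which fills in {context} and {question}.
PROMPT_TEXT = """Use the following context to answer the question at the end.
If the context does not contain the answer, say that you don't know.

<context>
{context}
</context>

Question: {question}
"""


def build_rag_chain(kendra_index_id, region="us-east-2",
                    model_id="anthropic.claude-3-sonnet-20240229-v1:0"):
    """Return a RetrievalQA chain that combines Kendra retrieval with Claude."""
    # Imports are local so this module loads even without the packages.
    import boto3
    from langchain.chains import RetrievalQA
    from langchain_aws import AmazonKendraRetriever, ChatBedrock
    from langchain_core.prompts import PromptTemplate

    llm = ChatBedrock(
        model_id=model_id,
        region_name=region,
        model_kwargs={"temperature": 0, "max_tokens": 512},
    )
    prompt = PromptTemplate(
        template=PROMPT_TEXT, input_variables=["context", "question"]
    )
    retriever = AmazonKendraRetriever(
        index_id=kendra_index_id,
        top_k=3,  # number of passages passed to the LLM
        client=boto3.client("kendra", region_name=region),
    )
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        chain_type_kwargs={"prompt": prompt},
        return_source_documents=True,
    )


# Example call (replace the placeholder with your index ID):
#   chain = build_rag_chain("<Kendra_index_id>")
#   response = chain.invoke({"query": "What do customers say about battery life?"})
#   print(response["result"])
```

Setting return_source_documents to True keeps the retrieved Amazon Kendra passages in the response, which helps when you want to check what an answer was based on.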
Clean up
To avoid incurring future charges, delete the resources you created as part of this post:
- Delete the Aurora DB cluster and DB instance.
- Delete the Amazon Kendra index.
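The cleanup can also be scripted with boto3. The sketch below first builds the list of delete calls so you can review it, then runs them in order. The helper names and all identifiers are hypothetical, and it assumes the cluster can be deleted while its only instance is still being deleted; if that call is rejected, wait for the instance deletion to finish and retry.

```python
"""Sketch: delete the Aurora and Amazon Kendra resources created in this post."""


def build_cleanup_plan(cluster_id, instance_id, index_id):
    """Return the delete calls, in order, as (service, operation, kwargs)."""
    return [
        ("rds", "delete_db_instance", {"DBInstanceIdentifier": instance_id}),
        ("rds", "delete_db_cluster",
         {"DBClusterIdentifier": cluster_id, "SkipFinalSnapshot": True}),
        # Deleting the index also removes the data sources attached to it.
        ("kendra", "delete_index", {"Id": index_id}),
    ]


def run_cleanup(clients, plan):
    """Run each planned call; clients maps a service name to a boto3 client."""
    for service, operation, kwargs in plan:
        getattr(clients[service], operation)(**kwargs)


# Example call (needs boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   clients = {
#       "rds": boto3.client("rds", region_name="us-east-2"),
#       "kendra": boto3.client("kendra", region_name="us-east-2"),
#   }
#   run_cleanup(clients, build_cleanup_plan(
#       "<cluster-id>", "<instance-id>", "<Kendra_index_id>"))
```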
Conclusion
In this post, we discussed how to convert your existing Aurora data into an Amazon Kendra index and implement a RAG-based solution for data search. This solution greatly reduces the data preparation needed for Amazon Kendra search. It also speeds up generative AI application development by reducing the learning curve behind data preparation.
Try out the solution, and if you have any comments or questions, leave them in the comments section.
About the authors
Aravind Hariharaputran is a Data Consultant with the Professional Services team at Amazon Web Services. He is passionate about data and AI/ML, with extensive experience managing database technologies. He helps customers transform legacy databases and applications into modern data platforms and generative AI applications. He enjoys cricket and spending time with his family.
Ivan Cui is a Data Science Lead with AWS Professional Services, where he helps customers build and deploy solutions using ML and generative AI on AWS. He has worked with customers across diverse industries, including software, finance, pharmaceuticals, healthcare, IoT, and entertainment and media. In his free time, he enjoys reading, spending time with his family, and traveling.