Managing cloud costs and understanding resource usage can be a daunting task, especially for organizations with complex AWS deployments. While the AWS Cost and Usage Report (AWS CUR) provides valuable data insights, interpreting and querying the raw data can be difficult.
This post describes a solution that uses generative artificial intelligence (AI) to generate SQL queries from a user’s natural language questions. The solution simplifies the process of generating SQL queries against CUR data stored in an Amazon Athena database, running those queries in Athena, and displaying the results in a web portal for easy interpretation.
The solution uses Amazon Bedrock, a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, and also provides a broad set of capabilities for building generative AI applications with security, privacy, and responsible AI.
Challenges
The following challenges can prevent organizations from effectively analyzing CUR data, leading to inefficiencies, overspending, and missed cost optimization opportunities. Using generative AI with Amazon Bedrock, this solution aims to address them:
- SQL query complexity – Writing SQL queries to derive insights from CUR data can be complicated, especially for non-technical users or those unfamiliar with the CUR data structure.
- Data accessibility – To gain insights from the structured data in the database, users need direct database access, which can pose a risk to overall data protection.
- Ease of use – Traditional methods of analyzing CUR data often lack user-friendly interfaces, making it difficult for non-technical users to tap into the valuable insights hidden in the data.
Solution overview
The solution described here is a web application (chatbot) that allows users to ask questions about their AWS costs and usage in natural language. The application generates SQL queries based on user input, runs them against an Athena database containing CUR data, and displays the results in a user-friendly format. The solution combines the power of generative AI, SQL generation, database querying, and an intuitive web interface to provide a seamless experience for analyzing CUR data.
This solution uses the following AWS services:
The following diagram shows the solution architecture:
The data flow consists of the following steps:
- CUR data is stored in Amazon S3.
- Athena is configured to access and query the CUR data stored in Amazon S3.
- Users interact with the Streamlit web application and submit natural language questions about their AWS costs and usage.
- The Streamlit application sends user input to Amazon Bedrock, and the LangChain application facilitates the overall orchestration.
- The LangChain code uses LangChain’s BedrockChat class to invoke the FM on Amazon Bedrock, which generates SQL queries based on the user’s input.
- The generated SQL queries are run against the Athena database to query the CUR data stored in Amazon S3.
- The results of the query are returned to the LangChain application.
- LangChain sends the SQL query and the query results back to the Streamlit application.
- The Streamlit application displays SQL queries and query results to the user in a formatted and user-friendly manner.
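Steps 4–8 of this data flow can be sketched in Python as follows. This is a minimal illustration, not the exact application code: it assumes the `langchain-aws` and `pyathena` packages are installed, and the model ID, prompt wording, S3 staging bucket, and helper names (`build_prompt`, `extract_sql`, `answer`) are all illustrative placeholders.

```python
# Sketch of the Bedrock -> Athena orchestration described in the data flow.
# Assumptions: langchain-aws and pyathena are installed; names and the
# prompt template below are placeholders, not the post's exact code.
import re

PROMPT_TEMPLATE = (
    "You are an expert in AWS Cost and Usage Report (CUR) data.\n"
    "Write a SQL query for Amazon Athena that answers this question:\n"
    "{question}\n"
    "Return only the SQL, wrapped in ```sql ... ``` fences."
)

def build_prompt(question: str) -> str:
    """Fill the prompt template with the user's natural language question."""
    return PROMPT_TEMPLATE.format(question=question)

def extract_sql(model_output: str) -> str:
    """Pull the SQL statement out of a fenced model response."""
    match = re.search(r"```(?:sql)?\s*(.*?)```", model_output, re.DOTALL)
    return (match.group(1) if match else model_output).strip()

def answer(question: str) -> list:
    """End-to-end flow: generate SQL with Bedrock, then run it on Athena."""
    # Third-party imports are kept inside the function so the helpers
    # above can be exercised without AWS dependencies or credentials.
    from langchain_aws import ChatBedrock  # LangChain chat wrapper for Bedrock
    from pyathena import connect           # DB-API driver for Athena

    llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")
    sql = extract_sql(llm.invoke(build_prompt(question)).content)
    cursor = connect(
        s3_staging_dir="s3://YOUR-ATHENA-RESULTS-BUCKET/",  # placeholder
        region_name="us-east-1",
    ).cursor()
    return cursor.execute(sql).fetchall()
```

The prompt asks the model to fence its SQL so that `extract_sql` can strip any surrounding explanation before the query is handed to Athena.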
Prerequisites
To set up this solution, you need the following prerequisites:
Configure the solution
To set up the solution, follow these steps:
- Create an Athena database and table to store your CUR data. Ensure that the appropriate permissions and settings are in place for Athena to access the CUR data stored in Amazon S3.
- Configure your compute environment to call the Amazon Bedrock API. Be sure to associate an IAM role with this environment whose IAM policy allows access to Amazon Bedrock.
- Once the environment is up and running, install the Python libraries used in the following steps (for example, with pip), including LangChain, PyAthena, and Streamlit.
- Use the following code to establish a connection to the Athena database using the langchain library and pyathena, and configure the language model to generate SQL queries based on user input using Amazon Bedrock. You can save this file as cur_lib.py.
- Create a Streamlit web application to provide a UI for interacting with your LangChain application. Include an input field for users to enter natural language questions and view the generated SQL queries and query results. You can name this file cur_app.py.
- Call the get_response function to connect the LangChain application with the Streamlit web application and display the SQL query and results in the Streamlit web application. Add this code to the application code from the previous step.
- Deploy the Streamlit and LangChain applications to a hosting environment, such as an Amazon EC2 instance.
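The Streamlit front end from the steps above can be sketched as follows. This is a minimal sketch, not the post's exact cur_app.py: it assumes cur_lib.py exposes a get_response(question) helper that returns the generated SQL and the query results, and that signature is illustrative. The UI code is wrapped in a function here so the module can be imported without Streamlit installed; in a real app you would call render_app() at module top level and launch it with `streamlit run cur_app.py`.

```python
# cur_app.py -- minimal Streamlit front end (sketch).
# Assumption: cur_lib.py provides get_response(question) -> (sql, rows);
# that helper name and return shape are illustrative, not confirmed.

def render_app():
    # Imports are kept inside the function so this module stays importable
    # in environments where Streamlit or cur_lib are not installed.
    import streamlit as st
    from cur_lib import get_response  # hypothetical helper from the previous step

    st.title("AWS CUR Chatbot")
    question = st.text_input("Ask a question about your AWS costs and usage")
    if question:
        sql, rows = get_response(question)  # generate the SQL and run it on Athena
        st.subheader("Generated SQL")
        st.code(sql, language="sql")
        st.subheader("Query results")
        st.dataframe(rows)
```

Separating the UI (cur_app.py) from the query logic (cur_lib.py) keeps the Bedrock and Athena plumbing testable independently of the web layer.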
Clean up
This solution incurs Amazon Bedrock charges only when it invokes the model. To avoid ongoing Amazon S3 storage charges for your CUR reports, delete your CUR data and the S3 bucket when you no longer need them. If you used Amazon EC2 to set up the solution, be sure to stop or terminate the instance when you’re finished.
Benefits
This solution provides the following benefits:
- Simplified data analysis – Use generative AI to analyze CUR data in natural language, eliminating the need for advanced SQL knowledge.
- Improved accessibility – The web-based interface allows non-technical users to access and gain insights from CUR data without needing database credentials.
- Time savings – Get instant answers to your cost and usage questions without having to manually write complex SQL queries.
- Improved visibility – The solution provides visibility into AWS costs and usage, enabling better cost optimization and resource management decisions.
Conclusion
The AWS CUR Chatbot solution combines SQL query generation with Anthropic’s Claude on Amazon Bedrock, database querying in Athena, and a user-friendly web interface to simplify the analysis of CUR data. The solution lets you ask questions in natural language, removing barriers and enabling both technical and non-technical users to gain valuable insights into their AWS costs and resource usage. This helps organizations make more informed decisions, optimize their cloud spend, and improve overall resource utilization. We recommend that you exercise due diligence when setting this up, especially in a production environment. You can also choose other programming languages and frameworks, depending on your preferences and needs.
Amazon Bedrock makes it easy to build powerful generative AI applications. You can accelerate your development by following the quick start guides on GitHub to rapidly build Retrieval Augmented Generation (RAG) solutions with Amazon Bedrock Knowledge Bases, or by using Amazon Bedrock Agents to enable your generative AI applications to execute multi-step tasks across your company’s systems and data sources.
About the Author
Anuthosh is a Solutions Architect at AWS India. He dives deep into customer use cases and helps smooth their journey on AWS. He enjoys helping customers by building solutions in the cloud, and is passionate about migration and modernization, data analytics, resiliency, cybersecurity, and machine learning.