Troubleshooting infrastructure as a code (IAC) error often takes valuable time and resources. Developers can spend multiple cycles searching for solutions across the forum, troubleshoot recurring issues, or try to identify the root cause. These delays could have resulted in missing security errors and compliance violations, especially in complex, multi-account environments.
This post shows you how to use Amazon Bedrock Agent to create intelligent solutions to streamline your Terraform and AWS CloudFormation code issues through context-conscious troubleshooting. Amazon Bedrock offers a selection of high-performance basic models (FMS) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, Amazon, and high-performance AI companies through a single API It is a fully managed service. The ability to build generated AI applications with security, privacy and responsible AI. Amazon Bedrock Agents is a fully managed service that allows developers to break down complex tasks into steps and create AI agents that can use FMS and APIs to achieve specific business goals.
The solution uses the Amazon Bedrock agent to analyze error messages and code contexts and generates detailed troubleshooting steps for IAC errors. In organizations with multiaccount AWS environments, teams often maintain a centralized AWS environment for developers to deploy their applications. This setup ensures that your AWS infrastructure deployment using IAC is in line with organizational security and compliance metrics. For specific IAC errors related to these compliance measures, including service control policies (SCPS) and resource-based policies, the solution instructs developers to contact the appropriate team, such as security or enablement . This targeted guidance maintains security protocols and ensures that sensitive issues are handled by the right experts. The solution is flexible and can be adapted to similar use cases beyond these examples.
This example focuses on the Terraform Cloud Workspace, but the same principles apply to GitLab CI/CD pipelines or other approaches to running IAC code for continuous integration and delivery (CI/CD). By automating initial error analysis and providing targeted solutions or guidance, you can improve operational efficiency and focus on solving complex infrastructure challenges within your organization’s compliance framework.
Solution overview
Before diving into the deployment process, take the important steps in your architecture, as shown in the following diagram.
The workflow for the Terraform solution is as follows:
- Initial input from Amazon Bedrock Agents Chat Console – Users start by entering the terraform error details in the Amazon Bedrock agent chat console. This usually includes the Terraform Cloud Workspace URL where the error occurred, and optionally the Git repository URL and branch name if additional context is required.
- Get errors and collect context – The Amazon Bedrock Agent forwards these details to the action group that invokes the first AWS Lambda function (see the following Lambda function code). This function calls another Lambda function (see Lambda function code below) that retrieves the latest error message from the given Terraform cloud workspace. If a GIT repository URL is provided, it also retrieves the associated Terraform file from the repository. This context information is sent back to the first lambda function.
- Error analysis and response generation – The Lambda function constructs detailed prompts with error messages, repository files (if available), and instructions for specific use cases. You then use the Amazon Bedrock model to analyze the errors and generate troubleshooting steps or guidance to contact a specific team.
- Interactions and User Guidance – The agent displays the generated response to the user. For most teraform errors, this includes detailed troubleshooting steps. For certain cases relating to an organization’s policies (for example, service control policies or resource-based policies), the response instructs the user to contact the appropriate team, such as security or enablement.
- Continuous improvement – Solutions can be updated continuously with new specific use cases and organizational guidelines, ensuring troubleshooting advice is up to date with the organization’s evolving infrastructure and compliance requirements. for example:
- Violation of SCP or IAM policy – SCPS or strict AWS Identity and Access Control (IAM) boundaries guide developers when they encounter authorization issues and provide alternative or escalation paths.
- VPC and Networking Limitations – Flag non-compliant virtual private clouds (VPCs) or subnet configurations (such as public subnets) and suggest security-compliant tunings.
- Encryption requirements – Detects shortages or incorrect encryption of Amazon Simple Storage Service (Amazon S3) or Amazon Elastic Block Store (Amazon EBS) resources and recommends proper configuration to meet compliance standards.
The following diagram illustrates the step-by-step process of how a solution works.
This solution streamlines the process of resolving teraform errors and ensures sensitive or complex issues are directed towards the right team, providing developers with immediate context-conscious guidance. Using the Amazon Bedrock Agent capabilities, we provide a scalable and intelligent approach to managing IAC challenges in large, multi-account AWS environments.
Prerequisites
To implement the solution, you need to:
Create an Amazon bedrock agent
To create and configure an Amazon bedrock agent, complete the following steps:
- Select on the Amazon Bedrock console agent In the navigation pane.
- choose Create an agent.
- Provides agent details including the agent name and optional description.
- Grant agent permissions to AWS services through the IAM service role. This allows agents to access any services they need, such as lambdas.
- Select FM from Amazon Bedrock (such as Anthropic’s Claude 3 Sonnet).
- For troubleshooting Terraform errors through the Amazon Bedrock agent, attach the following instructions to your agent: This command ensures that the agent collects the required input from the user and executes action groups to provide detailed troubleshooting steps.
“You are a Terraform Code Error Specialist. Greet the user and ask for the Terraform Workspace URL, branch name, and code repository URL. Once received, trigger troubleshooting for the action group. Provide the user with troubleshooting steps Masu.”
Configure the Lambda function for the action group
Once you have configured the first agent and added the previous instruction to the agent, you will need to create two Lambda functions.
- The first Lambda function is added to the action group that is called by the Amazon Bedrock agent, and then triggers the second Lambda function using the Invoke method. See Lambda function code for more information. Make sure the lambda_2_function_name environment variable is set.
- The second Lambda function handles getting the Terraform workspace errors in gitlab and related Terraform code. See Lambda function code. Make sure the terraform_api_url, terraform_secret_name, and vcs_secret_name environment variables are set.
After the workspace error and code details for Terraform have been retrieved, these details are passed to the first Lambda function, and using the Amazon Bedrock API in FM to provide appropriate troubleshooting steps based on the error and code information. Generate and provide.
Add an action group to Amazon bedrock agent
Complete the following steps to add the action group to the Amazon Bedrock agent:
- Add an action group to the Amazon Bedrock agent.
- Assign a descriptive name (for example, troubleshooting) to the action group and provide a description. This helps to clarify the purpose of the action groups within the workflow.
- for Action Group Type,choice Define the function details.
For more information, please define details about the functionality of the agent action group in Amazon Bedrock.
- for Calling an action groupselect the first Lambda function you created previously.
This function executes the business logic required when an action is invoked. Make sure to select the correct version of the first lambda function. For more information about configuring the Lambda function for an action group, see Configuring the Lambda Functions to send information that Amazon bedrock agents pull from users.
- for Action Group Function 1provides a name and description.
- Add the following parameters:
name |
explanation | type | Required |
workspace_url |
Terraform Workspace URL |
string |
truth |
repo_url |
Code Repository URL |
string |
truth |
branch_name | Code Repository Branch Name | string |
truth |
Test the solution
The following example shows a Terraform error with Service Control Polcy. The troubleshooting steps provided are consistent to address these specific constraints. Action groups follow a structured single shot prompt by passing full context, such as error messages and repository content, to generate accurate troubleshooting steps with a single input to the Amazon Bedrock model Trigger the Lambda function.
Example 1: The following screenshot shows an example of a teraform error caused by SCP restrictions managed by the security team.
The following screenshot shows examples of user interaction with Amazon Bedrock agents and the troubleshooting steps provided.
Example 2: The following screenshot shows an example of a teraform error due to missing variable values.
The following screenshot shows examples of user interaction with Amazon Bedrock agents and the troubleshooting steps provided.
cleaning
The services used in this demo may incur costs. Complete the following steps to clean up your resources:
- If you no longer need the Lambda function, remove it.
- Delete the action group you created and the Amazon Bedrock agent.
Conclusion
Although IAC provides flexibility in managing cloud environments, troubleshooting code errors can take time, especially in environments with strict organizational guardrails. This post demonstrated how Amazon Bedrock agents can combine action groups and generative AI models to streamline and accelerate the resolution of teraform errors while maintaining compliance with environment security and operational guidelines.
With the Amazon Bedrock Agent functionality, developers can receive context-aware procedures tailored to environment-related issues such as SCP and IAM violations, VPC restrictions, and encryption policies. This solution provides specific guidance based on the context of the error and directs users to the appropriate team on issues that require further escalation. This reduces the time spent on IAC errors, increases developer productivity, and keeps organizations compliant.
Are you ready to streamline your cloud deployment process with Amazon bedrock generation AI? Explore the Amazon Bedrock User Guide to learn how to drive your organization’s migration to the cloud. For professional assistance, consider engaging in AWS Professional Services to maximize the efficiency and benefits of using Amazon Bedrock.
About the author
Akhil Raj Yallamelli I am AWS Cloud Infrastructure Architect and specializes in architecting cloud infrastructure solutions to enhance data security and cost-effectiveness. He has experience in integrating technology solutions with business strategies to create scalable, reliable and secure cloud environments. Akhil enjoys developing solutions focused on customer business outcomes, and incorporates Generated AI (GEN AI) technology to drive innovation and cloud enablement. He holds an MS degree in computer science. Outside of his professional work, Akhil watches and plays sports.
Ebby Thomas I am AWS Advanced Generation AI Specialist Solution Architect. He designs and implements generative AI solutions that address specific customer business issues. He is recognized for simplifying complexity and providing clients with measurable business outcomes. Ebbey holds BS for Computer Engineering and MS for Syracuse University Information Systems.