create_response = client.create_guardrail(
    name="math-tutoring-guardrail",
    description='Prevents the model from providing non-math tutoring, in-person tutoring, or tutoring outside grades 6-12.',
    topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'In-Person Tutoring',
                'definition': 'Requests for face-to-face, physical tutoring sessions.',
                'examples': [
                    'Can you tutor me in person?',
                    'Do you offer home tutoring visits?',
                    'I need a tutor to come to my house.'
                ],
                'type': 'DENY'
            },
            {
                'name': 'Non-Math Tutoring',
                'definition': 'Requests for tutoring in subjects other than mathematics.',
                'examples': [
                    'Can you help me with my English homework?',
                    'I need a science tutor.',
                    'Do you offer history tutoring?'
                ],
                'type': 'DENY'
            },
            {
                'name': 'Non-6-12 Grade Tutoring',
                'definition': 'Requests for tutoring students outside of grades 6-12.',
                'examples': [
                    'Can you tutor my 5-year-old in math?',
                    'I need help with college-level calculus.',
                    'Do you offer math tutoring for adults?'
                ],
                'type': 'DENY'
            }
        ]
    },
    contentPolicyConfig={
        'filtersConfig': [
            {
                'type': 'SEXUAL',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'VIOLENCE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'HATE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'INSULTS',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'MISCONDUCT',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'PROMPT_ATTACK',
                'inputStrength': 'HIGH',
                'outputStrength': 'NONE'
            }
        ]
    },
    wordPolicyConfig={
        'wordsConfig': [
            {'text': 'in-person tutoring'},
            {'text': 'home tutoring'},
            {'text': 'face-to-face tutoring'},
            {'text': 'elementary school'},
            {'text': 'college'},
            {'text': 'university'},
            {'text': 'adult education'},
            {'text': 'english tutoring'},
            {'text': 'science tutoring'},
            {'text': 'history tutoring'}
        ],
        'managedWordListsConfig': [
            {'type': 'PROFANITY'}
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'EMAIL', 'action': 'ANONYMIZE'},
            {'type': 'PHONE', 'action': 'ANONYMIZE'},
            {'type': 'NAME', 'action': 'ANONYMIZE'}
        ]
    },
    blockedInputMessaging="""I'm sorry, but I can only assist with math tutoring for students in grades 6-12. For other subjects, grade levels, or in-person tutoring, please contact our customer service team for more information on available services.""",
    blockedOutputsMessaging="""I apologize, but I can only provide information and assistance related to math tutoring for students in grades 6-12. If you have any questions about our online math tutoring services for these grade levels, please feel free to ask.""",
    tags=[
        {'key': 'purpose', 'value': 'math-tutoring-guardrail'},
        {'key': 'environment', 'value': 'production'}
    ]
)
The API response includes the guardrail ID and version. In the next section, you use these two fields to test the guardrail.
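For instance, you can capture these two fields from the response for use in later API calls. The following sketch uses a sample response dict (field names per the CreateGuardrail API response; values invented) in place of the live call:

```python
# Capture the fields needed by later API calls from the CreateGuardrail response.
# Sample values stand in for a real response here.
create_response = {
    "guardrailId": "abc123xyz",
    "guardrailArn": "arn:aws:bedrock:us-east-1:111122223333:guardrail/abc123xyz",
    "version": "DRAFT",
}

guardrail_id = create_response["guardrailId"]
guardrail_version = create_response["version"]
print(guardrail_id, guardrail_version)
```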
Build the test dataset
The project directory contains a sample tests.csv file for testing the math-tutoring-guardrail created in the previous step. To build your own dataset, create a CSV file with the same structure as the sample tests.csv, based on your specific use case, and save it in the data folder of the project directory. The dataset must contain the following columns:
- test_number – A unique identifier for each test case.
- test_type – Either INPUT or OUTPUT.
- test_content_query – The user query or input.
- test_content_grounding_source – Context information provided to the AI (if applicable).
- test_content_guard_content – The AI response (for OUTPUT tests).
- expected_action – Either GUARDRAIL_INTERVENED or NONE. Set it to GUARDRAIL_INTERVENED if the guardrail should block the prompt, and NONE if the prompt should pass through the guardrail.
Make sure your test dataset comprehensively tests all the elements of your guardrail system. Load the test file into your workflow using Python's pandas library, then use df.head() to check the first five rows of the DataFrame and confirm that the dataset was read correctly:
# Import the data file
import pandas as pd
df = pd.read_csv('data/tests.csv')
df.head()
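As a hypothetical illustration of this schema, a couple of rows of such a dataset might look like the following in pandas (all values invented for this example):

```python
import pandas as pd

# Hypothetical rows matching the required schema; values are illustrative only.
sample = pd.DataFrame([
    {
        "test_number": 1,
        "test_type": "INPUT",
        "test_content_query": "Can you tutor me in person?",
        "test_content_grounding_source": "",
        "test_content_guard_content": "",
        "expected_action": "GUARDRAIL_INTERVENED",
    },
    {
        "test_number": 2,
        "test_type": "OUTPUT",
        "test_content_query": "How do I solve 2x + 3 = 7?",
        "test_content_grounding_source": "Grade 8 algebra: solving linear equations.",
        "test_content_guard_content": "Subtract 3 from both sides, then divide by 2: x = 2.",
        "expected_action": "NONE",
    },
])
print(sample["expected_action"].tolist())
```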
Evaluate guardrails using test datasets
To run tests against the guardrails you create, use the ApplyGuardrail API. This API applies guardrails to model input or model response output text without invoking an FM.
The ApplyGuardrail API requires the following:
- Guardrail identifier – The unique ID of the guardrail being tested
- Guardrail version – The version of the guardrail to test
- Source – The source of the data used in the request to apply the guardrail (INPUT or OUTPUT)
- Content – The details used in the request to apply the guardrail
Use the guardrail ID and version from the CreateGuardrail API response. The source and content are extracted from the test CSV file created in the previous step. The following code reads the CSV file and prepares the source and content for the ApplyGuardrail API call:
with open(input_file, 'r') as infile, open(output_file, 'w', newline='') as outfile:
    reader = csv.DictReader(infile)
    fieldnames = reader.fieldnames + ['test_result', 'achieved_expected_result', 'guardrail_api_response']
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()

    for row_number, row in enumerate(reader, start=1):
        content = []
        if row['test_type'] == 'INPUT':
            content = [{"text": {"text": row['test_content_query']}}]
        elif row['test_type'] == 'OUTPUT':
            content = [
                {"text": {"text": row['test_content_grounding_source'], "qualifiers": ["grounding_source"]}},
                {"text": {"text": row['test_content_query'], "qualifiers": ["query"]}},
                {"text": {"text": row['test_content_guard_content'], "qualifiers": ["guard_content"]}},
            ]
        # Remove empty content items
        content = [item for item in content if item['text']['text']]
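This loop prepares content for an apply_guardrail helper that wraps the ApplyGuardrail API. The post does not show the helper's definition; a minimal sketch, assuming a boto3 bedrock-runtime client and the parameter names of the ApplyGuardrail API, might look like this:

```python
def apply_guardrail(content, source, guardrail_id, guardrail_version):
    """Call the ApplyGuardrail API for one test row; return the response dict, or None on failure."""
    # boto3 is imported inside the function so the sketch can be defined
    # without an AWS environment configured.
    import boto3
    from botocore.exceptions import ClientError

    client = boto3.client("bedrock-runtime")  # in production, create one client and reuse it
    try:
        return client.apply_guardrail(
            guardrailIdentifier=guardrail_id,
            guardrailVersion=guardrail_version,
            source=source,    # 'INPUT' or 'OUTPUT'
            content=content,  # the list of {"text": {...}} items built earlier
        )
    except ClientError as err:
        print(f"ApplyGuardrail call failed: {err}")
        return None
```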
You can then make an ApplyGuardrail API call for each row in the test dataset and determine the guardrail action from the API response. If the guardrail behavior matches the expected behavior, the test is marked True (pass); if not, False (fail). Additionally, the API response for each row is saved, so you can inspect the responses as needed. These test results are written to an output CSV file. See the following code:
with open(input_file, 'r') as infile, open(output_file, 'w', newline='') as outfile:
    reader = csv.DictReader(infile)
    fieldnames = reader.fieldnames + ['test_result', 'achieved_expected_result', 'guardrail_api_response']
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()

    for row_number, row in enumerate(reader, start=1):
        content = []
        if row['test_type'] == 'INPUT':
            content = [{"text": {"text": row['test_content_query']}}]
        elif row['test_type'] == 'OUTPUT':
            content = [
                {"text": {"text": row['test_content_grounding_source'], "qualifiers": ["grounding_source"]}},
                {"text": {"text": row['test_content_query'], "qualifiers": ["query"]}},
                {"text": {"text": row['test_content_guard_content'], "qualifiers": ["guard_content"]}},
            ]
        # Remove empty content items
        content = [item for item in content if item['text']['text']]

        # Make the actual API call
        response = apply_guardrail(content, row['test_type'], guardrail_id, guardrail_version)

        if response:
            actual_action = response.get('action', 'NONE')
            expected_action = row['expected_action']
            achieved_expected = actual_action == expected_action

            # Prepare the API response for CSV
            api_response = json.dumps({
                "action": actual_action,
                "outputs": response.get('outputs', []),
                "assessments": response.get('assessments', [])
            })

            # Write the results
            row.update({
                'test_result': actual_action,
                'achieved_expected_result': str(achieved_expected).upper(),
                'guardrail_api_response': api_response
            })
        else:
            # Handle the case where the API call failed
            row.update({
                'test_result': 'API_CALL_FAILED',
                'achieved_expected_result': 'FALSE',
                'guardrail_api_response': json.dumps({"error": "API call failed"})
            })

        writer.writerow(row)
        print(f"Processed row {row_number}")  # Print progress

print(f"Processing complete. Results written to {output_file}")
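Once a results file is written, a quick summary helps you spot failing tests before deciding what to change. The following sketch uses an in-memory sample in place of your real results file:

```python
import pandas as pd

# In practice you would load the real results, e.g. pd.read_csv(output_file);
# an in-memory sample stands in for it here.
results = pd.DataFrame({
    "test_number": [1, 2, 3, 4],
    "achieved_expected_result": ["TRUE", "TRUE", "FALSE", "TRUE"],
})

passed = (results["achieved_expected_result"] == "TRUE").sum()
pass_rate = passed / len(results)
print(f"{passed}/{len(results)} tests passed ({pass_rate:.0%})")
```

Rows where achieved_expected_result is FALSE are the ones to investigate when revising the guardrail configuration.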
After reviewing the test results, you can update the guardrail as necessary to meet the needs of your application. This approach allows you to practice TDD with Amazon Bedrock Guardrails. You can see the failed tests in the following table; for these, achieved_expected_result is FALSE because the guardrail intervened when it shouldn't have. You can therefore modify the denied topics and other filters on the guardrail so that these tests pass.
Using a TDD approach, you can prevent bad actors from exploiting your application, identify previously unconsidered edge cases and gaps, and improve your guardrail's success rate in adhering to your responsible AI policies, improving your guardrails over time.
Optional: Automate workflows and iteratively improve guardrails
We recommend checking the test results after each iteration. This procedure does not guarantee that the guardrail will pass all tests; use this step to understand how to modify an existing guardrail configuration.
When practicing a TDD approach, we recommend improving your guardrails over time through multiple iterations. In this optional step, you first prompt the user for details, then use those details to build the guardrail and test cases from scratch. Next, you let the user specify n iterations. In each iteration, all tests are rerun and the denied topics in the guardrail are adjusted based on the test results.
To create the guardrail, prompt the user for a name and description for the guardrail. Use the InvokeModel API with the specified description and the system prompt in guardrail_prompt.txt to generate the guardrail's denied topics. Then use this configuration to build the guardrail by calling the CreateGuardrail API. You can verify that the new guardrail was created by refreshing the Amazon Bedrock Guardrails dashboard. In the following screenshot, you can see that a new guardrail has been created for the photography application.
Using the same parameters, you can use the InvokeModel API to generate test cases for your newly created guardrail. The tests_prompt.txt file provides the system prompt instructing the FM to create 30 test cases: 20 INPUT tests and 10 OUTPUT tests. To practice TDD, use these test cases and iteratively modify the existing guardrail n times, based on the test results of each iteration, to meet the user's requirements.
The process of iteratively modifying an existing guardrail consists of four steps:
- Use the GetGuardrail API to get the latest configuration of your guardrail.
- Create a new version of the guardrail at each iteration using the CreateGuardrailVersion API. This allows you to track how the guardrail changed throughout the iterations. Because this API works asynchronously, your code could continue to run before the guardrail is fully versioned. Use the guardrail_ready_check function to verify that the guardrail is in the READY state before code execution continues. The guardrail_ready_check function uses the GetGuardrail API to retrieve the current status of the guardrail; if the guardrail is not in the READY state, the function waits until that state is reached or a timeout error occurs.
- Evaluate the guardrail against the auto_generated_tests.csv file using the process_tests function created in the previous step. The input_file will be your auto_generated_tests.csv file; the output_file, however, is dynamically named based on the iteration. For example, for iteration 3, the resulting file name would be test_results_3.csv.
- Based on the test results of each iteration, use the InvokeModel API to generate modified denied topics. The get_denied_topics function calls the API with the guardrail_prompt.txt system prompt, directing the model to consider the test results and the guardrail description when modifying the denied topics:
updated_topics = get_denied_topics(guardrail_description, current_denied_topics, test_results)
- Call the UpdateGuardrail API with the newly generated denied topics, using the update_guardrail function. This provides the updated configuration to your existing guardrail and updates it accordingly:
update_guardrail(current_id, current_name, current_description, current_version, updated_topics)
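The guardrail_ready_check polling described in step 2 can be sketched as follows. This is a minimal illustration assuming a boto3 bedrock client and the status field of the GetGuardrail response, not the post's exact implementation:

```python
import time


def guardrail_ready_check(guardrail_id, timeout_seconds=120, poll_interval=5):
    """Poll GetGuardrail until the guardrail reaches the READY state, or raise on timeout."""
    import boto3  # imported here so the sketch is readable without AWS configured
    client = boto3.client("bedrock")

    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        status = client.get_guardrail(guardrailIdentifier=guardrail_id)["status"]
        if status == "READY":
            return
        time.sleep(poll_interval)  # wait before polling again
    raise TimeoutError(f"Guardrail {guardrail_id} was not READY within {timeout_seconds}s")
```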
After completing n iterations, you will have n guardrail versions and n test result files, as shown in the following screenshot. This allows you to review each iteration and update the guardrail configuration to meet your application's requirements. When using TDD, it's important to validate your test results and make sure you're improving over time to get the best results.
Clean up
In this solution, we created guardrails, built a dataset, evaluated the guardrails against the dataset, and iteratively modified the guardrails based on test results. To clean up, use the DeleteGuardrail API, which deletes a guardrail using the guardrail ID and guardrail version.
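As an illustration, a clean-up call might look like the following sketch (assuming a boto3 bedrock client; the helper name is ours, not from the post):

```python
def delete_guardrail_version(guardrail_id, guardrail_version=None):
    """Delete a specific guardrail version if given, otherwise the entire guardrail."""
    import boto3  # imported here so the sketch is readable without AWS configured
    client = boto3.client("bedrock")

    kwargs = {"guardrailIdentifier": guardrail_id}
    if guardrail_version is not None:
        kwargs["guardrailVersion"] = guardrail_version
    client.delete_guardrail(**kwargs)
```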
Pricing
This solution uses Amazon Bedrock, and you are charged based on FM invocations and guardrail usage:
- FM invocation – You are charged based on the number of input and output tokens. Depending on the model, a token corresponds to a word or subword. This solution uses Anthropic's Claude 3 Sonnet and Claude 3 Haiku models; the input and output token counts depend on the size of the test prompts and responses.
- Guardrail – You are charged based on the guardrail policy configuration. Each policy is billed per 1,000 text units, and each text unit can contain up to 1,000 characters.
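As a simple illustration of the text-unit rule, you can estimate the billable units for a given prompt. This is a sketch of the arithmetic only, not an official pricing calculator:

```python
import math


def text_units(text, unit_size=1000):
    """Number of text units a string spans, at up to 1,000 characters per unit."""
    return math.ceil(len(text) / unit_size)


# A 2,500-character prompt spans 3 text units, each billed per guardrail policy.
prompt = "x" * 2500
print(text_units(prompt))  # 3
```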
For more information, see Amazon Bedrock Pricing.
Conclusion
When developing generative AI applications, it is important to implement robust safeguards and governance measures to maintain responsible use of AI. Amazon Bedrock Guardrails provides a framework to accomplish this. However, guardrails are not static entities. Continuous improvement and adaptation are required to keep pace with evolving use cases, malicious threats, and responsible AI policies. TDD is a software development methodology that promotes software improvement through iterative development cycles.
As this post shows, you can employ TDD when building safeguards for your generative AI applications. By systematically testing and refining guardrails, companies can reduce potential risks and operational inefficiencies while fostering a culture of knowledge sharing among technical teams, encouraging continuous improvement in AI development, and facilitating strategic decision-making.
As new edge cases emerge and use cases evolve, we recommend integrating a TDD approach into your software development practices to ensure that your safety measures improve over time. If you have any questions, please leave a comment on this post or open an issue on GitHub.
About the author
Harsh Patel is an AWS Solutions Architect who supports over 200 SMB customers across the U.S. in driving digital transformation through cloud-native solutions. As an AI and ML specialist, he focuses on generative AI, computer vision, reinforcement learning, and anomaly detection. Outside of the world of technology, he recharges by going to the golf course and hiking through beautiful scenery with his dog.
Aditi Rajnish is a second-year software engineering student at the University of Waterloo. Her interests include computer vision, natural language processing, and edge computing. She is also passionate about community-based STEM outreach and advocacy. In her spare time, she enjoys rock climbing, playing the piano, and learning how to bake the perfect scone.
Raj Pathak is a Principal Solutions Architect and Technical Advisor to Fortune 50 and midsize FSI (banking, insurance, and capital markets) clients in Canada and the United States. Raj specializes in machine learning with applications of generative AI, natural language processing, intelligent document processing, and MLOps.