Intelligent healthcare form analytics with Amazon Bedrock

Generative artificial intelligence (AI) offers the opportunity to improve healthcare by combining and analyzing structured and unstructured data across previously disconnected silos. Generative AI can help drive higher levels of efficiency and effectiveness across the entire spectrum of healthcare delivery.

The healthcare industry generates and collects large amounts of unstructured text data, including clinical documents such as patient information, medical history, and test results, as well as non-clinical documents such as administrative records. This unstructured data is often found in various paper-based formats that are difficult to manage and process, which can affect the efficiency and productivity of clinical services. Streamlining the processing of this information is essential for healthcare providers to improve patient care and optimize their operations.

Processing large volumes of data, extracting unstructured data from multiple paper forms and images, and comparing it to standard and reference forms can be a long, tedious process prone to errors and inefficiencies. However, advances in generative AI have introduced automated approaches that compare multiple documents more efficiently and reliably.

Amazon Bedrock is a fully managed service that makes foundation models (FMs) from leading AI startups and Amazon available through an API. You can choose from a wide range of FMs to find the best fit for your use case. Amazon Bedrock provides a serverless experience, so you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using AWS tools without having to manage any infrastructure.

This post describes how to use the Anthropic Claude 3 large language model (LLM) on Amazon Bedrock. Amazon Bedrock provides access to several LLMs, including Anthropic Claude 3, that you can use to generate semi-structured data relevant to the healthcare industry. This is particularly useful for creating a variety of healthcare-related forms, such as patient intake forms, insurance claim forms, and medical history questionnaires.

Solution overview

Before diving into the specific elements and services used, we’ll walk through the architectural steps required to build the solution on AWS so that you can get a high-level understanding of how the solution works. We’ll present the key elements of the solution and provide an overview of the different components and their interactions.

We then explore each key element in more detail, discuss the specific AWS services used to build the solution, and explain how these services work together to achieve the desired functionality, providing a solid foundation for further exploration and implementation of the solution.

Part 1: Standard Forms: Extracting and Saving Data

The following diagram shows the key elements of the solution for extracting and storing data using standard forms.

Figure 1: Architecture – Standard forms – Data extraction and storage.

The workflow for standard forms is as follows:

1. Users upload images (PDF, PNG, JPEG) of paper forms to Amazon Simple Storage Service (Amazon S3), a highly scalable and durable object storage service.
2. Amazon Simple Queue Service (Amazon SQS) serves as the message queue: each time a new form is uploaded, an event is sent to Amazon SQS.
a. If an S3 object is not processed after two attempts, it is moved to an SQS dead-letter queue (DLQ), which can be configured with an Amazon Simple Notification Service (Amazon SNS) topic to notify users by email.
3. The SQS message invokes AWS Lambda, which processes the new form data.
4. The Lambda function reads the new S3 object and passes it to the Amazon Textract API to process the unstructured data and generate a hierarchical output. Amazon Textract is an AWS service that can extract text, handwriting, and data from scanned documents and images. This approach enables efficient and scalable processing of complex documents.
5. The Lambda function passes the converted text to Anthropic Claude 3 on Amazon Bedrock to generate a list of questions.
6. Finally, the Lambda function saves the list of questions to Amazon S3.
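
As a rough illustration of the queueing steps above, the following sketch shows how a Lambda handler might unpack S3 object references from an SQS batch, and how the two-attempt DLQ rule could be expressed as an SQS redrive policy. It assumes the S3 event notifications are delivered to the queue unmodified. The function names, bucket name, and object key are hypothetical and are not taken from the original solution.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_event(event):
    """Return (bucket, key) pairs for every S3 object referenced in an SQS batch."""
    objects = []
    for message in event.get("Records", []):
        # Each SQS message body carries one S3 event notification as a JSON string.
        notification = json.loads(message["body"])
        # S3 test events have no "Records" key, so default to an empty list.
        for record in notification.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded (for example, spaces become "+").
            key = unquote_plus(record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


def redrive_policy(dlq_arn, max_receive_count=2):
    """Build the SQS RedrivePolicy attribute value that sends a message to the
    DLQ once it has been received max_receive_count times without success."""
    return json.dumps(
        {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": str(max_receive_count)}
    )


# Hypothetical example: one SQS message wrapping one S3 notification.
sample_event = {
    "Records": [
        {
            "body": json.dumps(
                {
                    "Records": [
                        {
                            "s3": {
                                "bucket": {"name": "forms-bucket"},
                                "object": {"key": "incoming/intake+form.pdf"},
                            }
                        }
                    ]
                }
            )
        }
    ]
}
print(s3_objects_from_sqs_event(sample_event))
```

Keeping the parsing logic in a plain function, separate from any AWS client calls, makes it possible to exercise the handler locally with sample events.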

Amazon Bedrock API calls to extract form details

The solution calls the Amazon Bedrock API twice to extract form details:

Extract questions from a standard or reference form – The first API call is made to extract a list of questions and subquestions from the standard or reference form. This list serves as a baseline or reference point for comparing other forms. By extracting questions from the reference form, a benchmark can be established against which other forms can be evaluated.
Extract questions from a custom form – The second API call is made to extract the list of questions and subquestions from the custom form or the form that needs to be compared with the standard or reference form. This step is necessary because the content and structure of the custom form needs to be analyzed to identify the questions and subquestions that can then be compared with the reference form.

By extracting and structuring the questions in the reference form and the custom form separately, the solution can pass these two lists to the Amazon Bedrock API for a final comparison step. This approach provides the following benefits:

Accurate comparison – Because the API has access to structured data from both forms, it can more easily identify matches and mismatches and provide the associated reasoning.
Efficient processing – Separating the extraction process for reference and custom forms allows you to avoid redundant operations and optimize your overall workflow.
Observability and interoperability – Keeping questions separate allows for better visibility, analysis, and consolidation of questions from different forms.
Reducing hallucinations – By following a structured approach and relying on extracted data, the solution limits free-form content generation, which reduces the risk of hallucinations and helps preserve the integrity of the comparison process.

This two-phase approach leverages the power of Amazon Bedrock APIs to optimize workflows, enable accurate and efficient form comparisons, and promote observability and interoperability of related questions.
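
The two-phase flow can be sketched as follows. This is a minimal outline under stated assumptions, not the solution's actual code: the prompts are abbreviated paraphrases, `compare_forms` and `fake_ask` are hypothetical names, and the model call is injected as a callable so the flow can be run without calling Amazon Bedrock. In the architecture described in this post, the reference question list would typically be extracted once and loaded from Amazon S3 rather than regenerated on every comparison.

```python
EXTRACT_PROMPT = (
    "Create a summary of the different sections in the form, then list all "
    "questions and sub questions grouped by section. Return a single numbered list."
)
COMPARE_PROMPT = (
    "For each question in Form 2, state whether it matches a question in Form 1 "
    "in meaning and context, and give the reasoning."
)


def compare_forms(reference_text, custom_text, ask):
    """Run the extract-then-compare flow.

    ask(context, prompt) is any callable that sends a prompt to the model and
    returns its text answer.
    """
    # Phase 1: extract the question list from each form independently.
    reference_questions = ask(reference_text, EXTRACT_PROMPT)
    custom_questions = ask(custom_text, EXTRACT_PROMPT)

    # Phase 2: compare the two structured lists rather than the raw text.
    forms = f"Form 1 : {reference_questions}, Form 2 : {custom_questions}"
    return ask(forms, COMPARE_PROMPT)


# Stand-in for the model that records each call and returns a canned answer.
calls = []


def fake_ask(context, prompt):
    calls.append((context, prompt))
    return f"answer-{len(calls)}"


result = compare_forms("reference form text", "custom form text", fake_ask)
print(result)      # answer-3
print(len(calls))  # 3
```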

See the following code (API call):

import json

import boto3
from botocore.config import Config


def get_response_from_claude3(context, prompt_data):
    # Build the request body in the Messages API format used by Claude 3 on Amazon Bedrock.
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "system": """You are an expert form analyzer and can understand different sections and subsections within a form and can find all the questions  being asked. You can find similarities and differences at the question level between different types of forms.""",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": f"""Given the following document(s): {context} \n {prompt_data}"""},
                ],
            }
        ],
    })
    modelId = 'anthropic.claude-3-sonnet-20240229-v1:0'
    # Use a long read timeout so that lengthy generations are not cut off by the client.
    config = Config(read_timeout=1000)
    bedrock = boto3.client('bedrock-runtime', config=config)
    response = bedrock.invoke_model(body=body, modelId=modelId)
    response_body = json.loads(response.get("body").read())
    # The generated text is in the first content block of the response.
    answer = response_body.get("content")[0].get("text")
    return answer

User prompt to extract and list fields

Provide the following user prompt to Anthropic Claude 3 to extract fields from the raw text and list them for comparison, as shown in Step 3B (Figure 3: Data Extraction and Form Field Comparison).

get_response_from_claude3(response, f""" Create a summary of the different sections in the form, then
                                         for each section create a list of all questions and sub questions asked in the
                                         whole form and group by section including signature, date, reviews and approvals. 
                                         Then concatenate all questions and return a single numbered list, Be very detailed.""")

The following image shows the output from Amazon Bedrock with a list of questions from a standard or reference form.

Figure 2: A sample questionnaire in standard format

As shown in part 2 of the process below, you can store this questionnaire in Amazon S3 so that you can compare it with other forms.

Part 2: Data Extraction and Form Field Comparison

The following diagram shows the architecture for the next step: data extraction and form field comparison.

Figure 3: Data extraction and form field comparison

Steps 1 and 2 are the same as in Figure 1, but are repeated for each form that you want to compare with the standard or reference form. The remaining steps are as follows:

3. The SQS message invokes a Lambda function, which processes the new form data.
a. The Lambda function uses Amazon Textract to extract the raw text, which is then passed to step 3B for further processing and analysis.
b. Anthropic Claude 3 generates a question list from the custom form that needs to be compared with the standard form. Both question lists are then passed to Amazon Bedrock, which compares the custom form with the standard or reference form to identify differences and anomalies and to provide insights and recommendations relevant to the healthcare industry, grouped by category. It then generates the final output in JSON format for further processing and dashboarding. The Amazon Bedrock API call and user prompt from step 5 (Figure 1: Architecture – Standard Form – Data Extraction and Storage) are reused in this step to generate the question list from the custom form.
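
To illustrate the text extraction in step 3, the following sketch joins the line-level text from an Amazon Textract response into a single raw-text string that can be passed to the model. It assumes a response shaped like the output of the Textract `detect_document_text` operation (a `Blocks` list containing `PAGE`, `LINE`, and `WORD` blocks). The post does not say which Textract operation the solution uses, so treat this as one possible approach; `lines_from_textract` and the sample response are hypothetical.

```python
def lines_from_textract(response):
    """Join the text of LINE blocks from a Textract response into raw text."""
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        # WORD blocks repeat the same text word by word, so keep LINE blocks only.
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


# Hypothetical, heavily trimmed response for a two-line form.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "page-1"},
        {"BlockType": "LINE", "Text": "Patient name:"},
        {"BlockType": "WORD", "Text": "Patient"},
        {"BlockType": "WORD", "Text": "name:"},
        {"BlockType": "LINE", "Text": "Date of birth:"},
    ]
}
raw_text = lines_from_textract(sample_response)
print(raw_text)
```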

The following sections explain steps 4 through 6.

The following screenshot shows the output from Amazon Bedrock, including a list of questions from the custom form.

Figure 4: Sample Question List for Custom Forms

Final comparison using Anthropic Claude 3 on Amazon Bedrock

The following example shows the results of a comparison exercise using Amazon Bedrock and Anthropic Claude 3, showing what did and did not match the reference or standard form.

Below is the form comparison user prompt.

categories = ['Personal Information','Work History','Medical History','Medications and Allergies','Additional Questions','Physical Examination','Job Description','Examination Results']
forms = f"Form 1 : {reference_form_question_list}, Form 2 : {custom_form_question_list}"

The first call is:

match_result = (get_response_from_claude3(forms, f""" Go through questions and sub questions {start}- {processed} in Form 2 return the question whether it matches with any question /sub question/field in Form 1 in terms of meaning and context and provide reasoning, or if it does not match with any question/sub question/field in Form 1 and provide reasoning. Treat each sub question as its own question and the final output should be a numbered list with the same length as the number of questions and sub questions in Form 2. Be concise"""))
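
The `{start}` and `{processed}` placeholders in the preceding prompt are not defined in this post. One plausible reading, offered here only as an assumption, is that they delimit a batch of question numbers so that long forms are compared a few questions at a time. Under that assumption, the ranges could be generated as follows; `question_batches` and the batch size of 20 are hypothetical.

```python
def question_batches(total_questions, batch_size=20):
    """Yield inclusive (start, processed) ranges covering 1..total_questions."""
    start = 1
    while start <= total_questions:
        # The last batch may be shorter than batch_size.
        processed = min(start + batch_size - 1, total_questions)
        yield start, processed
        start = processed + 1


batches = list(question_batches(45, batch_size=20))
print(batches)  # [(1, 20), (21, 40), (41, 45)]
```

Each `(start, processed)` pair would then be substituted into the prompt, and the per-batch answers concatenated before the second call.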

The second call is:

get_response_from_claude3(match_result, 
f""" Go through all the questions and sub questions in the Form 2 Results and turn this into a JSON object called 'All Questions' which has the keys 'Question' with only the matched or unmatched question, 'Match' with valid values of yes or no, and 'Reason' which is the reason of match or no match, 'Category' placing the question in one the categories in this list: {categories} . Do not omit any questions in output.""")

The following screenshot shows the matching questions in the reference form.

The following screenshot shows a question that did not match the reference form.
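
Because the second call asks the model for a JSON object called 'All Questions', downstream code needs to parse and check the answer before using it. The following sketch shows one way to do that. It assumes the model may surround the JSON with explanatory text and that 'All Questions' holds a list of objects with the four keys named in the prompt; `parse_all_questions` and the sample answer are hypothetical.

```python
import json

REQUIRED_KEYS = {"Question", "Match", "Reason", "Category"}


def parse_all_questions(answer):
    """Extract the 'All Questions' list from a model answer and check its keys."""
    # The model may wrap the JSON in explanatory text, so slice from the first
    # opening brace to the last closing brace before parsing.
    start, end = answer.find("{"), answer.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model answer")
    payload = json.loads(answer[start:end + 1])

    questions = payload["All Questions"]
    for item in questions:
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"question is missing keys: {sorted(missing)}")
    return questions


# Hypothetical model answer with a short preamble before the JSON.
sample_answer = (
    "Here is the JSON object:\n"
    '{"All Questions": [{"Question": "Date of birth", "Match": "yes", '
    '"Reason": "Same field in Form 1", "Category": "Personal Information"}]}'
)
parsed = parse_all_questions(sample_answer)
print(parsed[0]["Match"])  # yes
```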

The steps in the preceding architecture diagram continue as follows:

4. The SQS queue invokes the Lambda function.

5. The Lambda function invokes the AWS Glue job and monitors its completion.

a. The AWS Glue job processes the final JSON output from the Amazon Bedrock model in a tabular format for reporting.

6. Amazon QuickSight is used to create interactive dashboards and visualizations, enabling health professionals to explore the analysis, identify trends, and make informed decisions based on the insights provided by Anthropic Claude 3.
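
The tabular conversion in step 5a can be pictured with the following plain-Python stand-in. It is not AWS Glue code and is not taken from the original solution. It only illustrates the kind of flattening and aggregation that would feed a dashboard, assuming the question records carry the four keys requested in the JSON prompt. The function names and sample records are hypothetical.

```python
import csv
import io
from collections import Counter

COLUMNS = ["Question", "Match", "Reason", "Category"]


def to_csv(questions):
    """Flatten the question records into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(questions)
    return buffer.getvalue()


def unmatched_by_category(questions):
    """Count the questions that did not match the reference form, per category."""
    return Counter(q["Category"] for q in questions if q["Match"].lower() == "no")


# Hypothetical records in the shape requested by the JSON prompt.
questions = [
    {"Question": "Date of birth", "Match": "yes",
     "Reason": "Same field in Form 1", "Category": "Personal Information"},
    {"Question": "Current employer", "Match": "no",
     "Reason": "Not asked in Form 1", "Category": "Work History"},
]
table = to_csv(questions)
summary = unmatched_by_category(questions)
print(table)
print(dict(summary))  # {'Work History': 1}
```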

The following screenshot shows a sample QuickSight dashboard.

Next steps

Many healthcare providers are investing in digital technologies such as electronic health records (EHRs) and electronic medical records (EMRs) to streamline data collection and storage and to make records accessible to the right staff for patient care. Additionally, digitized health records offer patients the convenience of electronic forms and remote data editing. Electronic health records provide a more secure and accessible system of record, reducing data loss and promoting data accuracy. Solutions similar to the one described in this post can also capture data from paper forms into an EHR.

Conclusion

Generative AI solutions such as Amazon Bedrock and Anthropic Claude 3 can greatly streamline the process of extracting and comparing unstructured data from paper forms and images. By automating the extraction of form fields and questions, and intelligently comparing them to standard or reference forms, the solution can process large volumes of data more efficiently and accurately. The integration of AWS services such as Lambda, Amazon S3, Amazon SQS, and QuickSight provides a scalable and robust architecture to deploy this solution. As healthcare organizations continue to digitize their operations, AI-powered solutions like this can play a key role in improving data management, maintaining compliance, and ultimately enhancing patient care through better insights and decision-making.

About the Authors

Satish Sarapuri is a Senior Data Architect for Data Lakes at AWS. He helps enterprise-level customers build high-performance, highly available, cost-effective, resilient, and secure generative AI, data mesh, data lake, and analytics platform solutions on AWS, enabling them to make data-driven decisions, drive impactful business outcomes, and support their digital and data transformation efforts. In his spare time, he enjoys spending time with his family and playing tennis.

Harpreet Cheema is a Machine Learning Engineer at the AWS Generative AI Innovation Center. He is passionate about machine learning and working on data-oriented problems. He focuses on developing and delivering machine learning solutions for customers across various sectors.

Deborah Devadason is a Senior Advisory Consultant with the Professional Services team at Amazon Web Services. She is a results-driven and passionate Data Strategy Specialist with 25+ years of consulting experience across industries around the globe. She leverages her expertise to solve complex problems and accelerate business-focused initiatives, building a stronger backbone for digital and data transformation efforts.
