Last month, I started a series of posts focusing on the key factors that make customers choose Amazon Bedrock. I described how Bedrock helps customers build a secure and compliant foundation for their generative AI applications. Now, I want to talk about a slightly more technical but equally important differentiator of Bedrock: the multiple techniques you can use to customize your models to meet your specific business needs.
As you know, large language models (LLMs) are transforming how artificial intelligence (AI) is used, enabling companies to rethink core processes. Trained on large datasets, these models can quickly understand data and generate appropriate responses across a range of domains, from summarizing content to answering questions. The broad applicability of LLMs explains why customers in healthcare, financial services, and media and entertainment are rapidly adopting them. However, our customers tell us that while pre-trained LLMs excel at analyzing vast amounts of data, they often lack the expertise needed to tackle their specific business challenges.
Customization unlocks the transformative potential of large language models, and Amazon Bedrock gives you a versatile toolkit for tailoring models to your unique needs. Customization encompasses techniques such as prompt engineering, Retrieval-Augmented Generation (RAG), and model customization through fine-tuning and continued pre-training. Prompt engineering involves carefully crafting prompts to elicit the desired response from the LLM. RAG combines knowledge retrieved from external sources with language generation to provide more contextual and accurate responses. Model customization techniques such as fine-tuning and continued pre-training further train pre-trained language models on specific tasks or domains to improve performance. Used in combination, these techniques let you adapt the foundation models in Amazon Bedrock with your own data to deliver contextual and accurate output. Read the examples below to see how customers are using customization with Amazon Bedrock to address their use cases.
Thomson Reuters, a global content and technology company, is seeing positive results with Claude 3 Haiku and anticipates even better results with customization. The company, which serves professionals in legal, tax, accounting, compliance, government, and media, expects that fine-tuning Claude on its industry expertise will deliver even faster and more relevant AI results.
“We are excited to fine-tune Anthropic’s Claude 3 Haiku model on Amazon Bedrock, further enhancing our Claude-powered solutions. At Thomson Reuters, we aim to provide an accurate, fast and consistent user experience. By optimizing Claude for our industry expertise and specific requirements, we expect to see measurable improvements, delivering even faster, higher quality results. We are already seeing good results with Claude 3 Haiku, and the fine-tuning will enable us to more precisely tune our AI assistance.”
– Joel Hron, Chief Technology Officer, Thomson Reuters
At Amazon, Buy with Prime is using Amazon Bedrock’s RAG-based customization capabilities to drive greater efficiency. Buy with Prime Assist, a 24/7 live chat customer service, handles inquiries about orders placed on merchants’ sites. We recently launched a beta chatbot solution, powered by Amazon Bedrock and customized with merchant data, that can handle product support inquiries beyond what traditional email-based systems could resolve. My colleague Amit Nandy, Product Manager at Buy with Prime, said:
“By indexing merchant websites, including subdomains and PDF manuals, we’ve built a customized knowledge base that provides relevant and comprehensive support for each merchant’s unique offering. Combined with Claude’s cutting-edge foundation models and Guardrails for Amazon Bedrock, our chatbot solution delivers a highly capable, safe, and reliable customer experience. Shoppers now receive accurate, timely, and personalized support for their inquiries, increasing satisfaction and strengthening the reputation of Buy with Prime and its participating merchants.”
Stories like these are why we continue to expand our capabilities for customizing generative AI applications with Amazon Bedrock.
In this blog, we discuss three key techniques for customizing LLMs in Amazon Bedrock and cover related announcements from the recent AWS New York Summit.
Prompt Engineering: Guiding your application to the desired answer
Prompts are the primary inputs that steer LLMs toward useful answers. Prompt engineering is the practice of carefully crafting these prompts to guide the LLM effectively. Well-designed prompts can significantly improve model performance by providing clear instructions, context, and examples tailored to the task at hand. Amazon Bedrock supports multiple prompt engineering techniques. For example, few-shot prompting provides examples paired with desired outputs to help the model better understand the task, such as sentiment analysis examples labeled “positive” or “negative.” Zero-shot prompting provides only a description of the task, without examples. And chain-of-thought prompting strengthens multi-step reasoning by asking the model to break down complex problems, which is useful for arithmetic, logic, and deduction tasks.
Prompt engineering guidelines outline various prompting strategies and best practices for optimizing LLM performance across applications. Leveraging these techniques can help practitioners reach their desired outcomes more effectively. However, developing prompts that elicit the best responses from the underlying model is a challenging, iterative process that often requires weeks of refinement by developers.
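As an illustration, the short sketch below sends a few-shot sentiment prompt to a model in Amazon Bedrock through the boto3 Converse API. The AWS Region, model ID, and review texts are assumptions for the example, not details from this post.

```python
import boto3

# A minimal sketch of few-shot prompting with the Amazon Bedrock Converse API.
# The Region, model ID, and reviews below are illustrative assumptions.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Two labeled examples ("shots") teach the model the task and output format;
# the final review is the one we actually want classified.
few_shot_prompt = """Classify the sentiment of each review as "positive" or "negative".

Review: The checkout process was fast and painless.
Sentiment: positive

Review: My order arrived two weeks late and damaged.
Sentiment: negative

Review: The support team resolved my issue in minutes.
Sentiment:"""

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": few_shot_prompt}]}],
    inferenceConfig={"maxTokens": 10, "temperature": 0.0},
)
print(response["output"]["message"]["content"][0]["text"])  # e.g. "positive"
```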
[Figures: examples of a zero-shot prompt and a few-shot prompt, and the Prompt Flows visual builder supporting chain-of-thought prompting]
Retrieval-Augmented Generation: Enhancing results with retrieved data
LLMs typically lack the specialized expertise, terminology, context, or up-to-date information required for a particular task. For example, legal professionals seeking reliable, current, and accurate information within their field may find generalist LLMs insufficient. Retrieval-Augmented Generation (RAG) is a process in which a language model consults a trusted knowledge base outside of its training data sources before generating a response.
The RAG process involves three main steps (a minimal code sketch follows the list):
- Retrieval: Given an input prompt, the retrieval system identifies and fetches relevant passages or documents from a knowledge base or corpus.
- Augmentation: The retrieved information is combined with the original prompt to create an augmented input.
- Generation: The LLM generates a response based on the augmented input, leveraging the retrieved information to produce more accurate and informed output.
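To make the three steps concrete, here is a toy sketch in Python. The keyword retriever, documents, Region, and model ID are all hypothetical stand-ins; a real system would retrieve with an embedding model and a vector store, as the managed offering described next does for you.

```python
import boto3

# A toy, self-contained sketch of the three RAG steps. Everything here is a
# hypothetical stand-in for illustration only.
DOCUMENTS = [
    "Buy with Prime Assist offers 24/7 live chat support for shoppers.",
    "Returns are free for eligible Buy with Prime orders.",
]

def retrieve(query: str) -> list:
    """Step 1 - Retrieval: find documents that share words with the query."""
    terms = set(query.lower().split())
    return [d for d in DOCUMENTS if terms & set(d.lower().split())]

def augment(query: str, docs: list) -> str:
    """Step 2 - Augmentation: combine retrieved context with the original prompt."""
    context = "\n".join(docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt: str) -> str:
    """Step 3 - Generation: call an LLM on Amazon Bedrock with the augmented input."""
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]

query = "How do returns work with Buy with Prime?"
print(generate(augment(query, retrieve(query))))
```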
Knowledge Bases for Amazon Bedrock is a fully managed RAG capability that lets you connect LLMs to your company’s data sources to deliver relevant, accurate, and customized responses. At the AWS New York Summit, we announced multiple new features that give you more flexibility and accuracy when building RAG-based applications. For example, you can now securely access data from new sources such as the web (in preview), which lets you index public web pages, and access enterprise data from Confluence, SharePoint, and Salesforce (all in preview). Advanced chunking options are another exciting addition: you can create custom chunking algorithms for your specific needs or use the built-in semantic and hierarchical chunking options. Advanced parsing techniques now let you accurately extract information from complex data formats, such as intricate tables in PDFs. Additionally, the query reformulation feature decomposes complex queries into simpler sub-queries to improve retrieval accuracy. Together, these features help you reduce the time and cost of working with your data and build highly accurate and relevant knowledge resources, all tailored to your specific enterprise use cases.
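As a minimal sketch, a single knowledge base query with boto3 might look like the following, assuming you have already created a knowledge base; the knowledge base ID and model ARN are placeholders for your own resources.

```python
import boto3

# A minimal sketch of querying a knowledge base with the RetrieveAndGenerate
# API; IDs and ARNs below are placeholders.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What is the return policy for damaged items?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",  # placeholder
            "modelArn": (
                "arn:aws:bedrock:us-east-1::foundation-model/"
                "anthropic.claude-3-haiku-20240307-v1:0"
            ),
        },
    },
)
# response["citations"] links the answer back to the retrieved source chunks.
print(response["output"]["text"])
```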
Model customization: Improving performance for a specific task or domain
Model customization in Amazon Bedrock is the process of adapting a pre-trained language model to a specific task or domain. You take a large pre-trained model and further train it on a smaller, specialized dataset relevant to your use case. This approach leverages the knowledge acquired during initial pre-training while adapting the model to your requirements, without losing the original capabilities. The fine-tuning process in Amazon Bedrock is designed to be efficient, scalable, and cost-effective, letting you adapt a language model to your unique needs without large-scale computational resources or data. In Amazon Bedrock, you can combine model fine-tuning with prompt engineering or Retrieval-Augmented Generation (RAG) to further enhance the performance and capabilities of your language model. Model customization works with both labeled and unlabeled data.
Fine-tuning with labeled data: To improve a model’s performance on a specific task, you provide labeled training data. The model learns to associate the appropriate output with a given input, tuning its parameters to improve accuracy on the task. For example, if you have a dataset of customer reviews labeled as positive or negative, you can use it to fine-tune a pre-trained model in Amazon Bedrock into a sentiment analysis model tailored to your domain. At the AWS New York Summit, we announced fine-tuning for Anthropic’s Claude 3 Haiku. By providing a task-specific training dataset, users can fine-tune and customize Claude 3 Haiku to boost accuracy, quality, and consistency for their business applications.
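The sketch below shows roughly how such a fine-tuning job could be started with boto3. The role ARN, S3 URIs, and hyperparameter values are placeholders, and the supported settings and training-record schema depend on the base model.

```python
import boto3

# A hedged sketch of starting a fine-tuning job with labeled data via the
# CreateModelCustomizationJob API; all names, ARNs, and paths are placeholders.
bedrock = boto3.client("bedrock", region_name="us-east-1")

bedrock.create_model_customization_job(
    jobName="sentiment-finetune-demo",
    customModelName="my-sentiment-model",
    roleArn="arn:aws:iam::111122223333:role/BedrockCustomizationRole",  # placeholder
    baseModelIdentifier="anthropic.claude-3-haiku-20240307-v1:0",
    customizationType="FINE_TUNING",
    # Labeled examples live in a JSONL file in S3; the exact record schema
    # (for example, prompt/completion pairs) depends on the base model.
    trainingDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/train.jsonl"},  # placeholder
    outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/output/"},        # placeholder
    hyperParameters={"epochCount": "2", "batchSize": "8"},  # model-dependent
)
```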
Continued pre-training with unlabeled data: Also known as domain adaptation, this technique further trains an LLM on a company’s own unlabeled data, exposing the model to domain-specific knowledge and language patterns so that it better understands and performs specific tasks.
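Continued pre-training uses the same job API with a different customization type. A hedged sketch, assuming a base model that supports continued pre-training (for example, an Amazon Titan text model) and using placeholder names throughout:

```python
import boto3

# A hedged sketch of continued pre-training on unlabeled domain text; all
# names, ARNs, and paths below are placeholders.
bedrock = boto3.client("bedrock", region_name="us-east-1")

bedrock.create_model_customization_job(
    jobName="legal-domain-adaptation-demo",
    customModelName="my-legal-domain-model",
    roleArn="arn:aws:iam::111122223333:role/BedrockCustomizationRole",  # placeholder
    baseModelIdentifier="amazon.titan-text-express-v1",
    customizationType="CONTINUED_PRE_TRAINING",
    # Unlabeled corpus as JSONL; records contain raw domain text rather than
    # prompt/completion pairs.
    trainingDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/domain-corpus.jsonl"},
    outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/output/"},
)
```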
Customization is the key to unlocking the true power of generative AI
Large language models are revolutionizing AI applications across industries, but customizing these general-purpose models with your expertise is the key to maximizing business impact. Amazon Bedrock lets organizations customize LLMs through prompt engineering, with capabilities such as Prompt Management and Prompt Flows to help you create effective prompts. Retrieval-Augmented Generation with Knowledge Bases for Amazon Bedrock lets you ground LLMs in your own data sources to generate accurate, domain-specific responses. And model customization techniques, fine-tuning with labeled data and continued pre-training with unlabeled data, let you adapt an LLM’s behavior to your unique needs. A closer look at these three customization methods reveals that, although they differ in approach, they share a common goal: helping you solve your specific business problems.
Resources
To learn more about customization with Amazon Bedrock, see the following resources:
- Learn more about Amazon Bedrock
- Explore Knowledge Bases for Amazon Bedrock
- Read the announcement blog for additional data connectors in Knowledge Bases for Amazon Bedrock
- Read the blog on advanced chunking and parsing options for Knowledge Bases for Amazon Bedrock
- Learn more about Prompt Engineering
- Learn more about prompt engineering techniques and best practices
- Read the announcement blog about prompt management and prompt flow
- Learn more about fine-tuning and continued pre-training
- Read the announcement blog for fine-tuning of Anthropic’s Claude 3 Haiku
About the Author
Vasi Philomin is VP of Generative AI at AWS, where he leads generative AI efforts, including Amazon Bedrock and Amazon Titan.