Building large-scale deployment pipelines for generative artificial intelligence (AI) applications is a formidable challenge because of the complexity and unique requirements of these systems. Generative AI models are constantly evolving, with new versions and updates released frequently, so managing and deploying these updates across a large-scale deployment pipeline while maintaining consistency and minimizing downtime is a daunting task. Generative AI applications also require continuous ingestion, preprocessing, and formatting of vast amounts of data from various sources, and building a robust data pipeline that handles this workload reliably and efficiently at scale is a considerable challenge. Finally, monitoring the performance, bias, and ethical impact of generative AI models in production is a non-trivial task.
Achieving this at scale requires significant investment in resources, expertise, and cross-functional collaboration across multiple personas: data scientists and machine learning (ML) developers who focus on developing ML models, and machine learning operations (MLOps) engineers who focus on unique aspects of AI/ML projects and help improve delivery times, reduce defects, and increase data science productivity. In this post, we show how to convert Python code that fine-tunes generative AI models in Amazon Bedrock from local files into a reusable workflow using Amazon SageMaker Pipelines decorators. Amazon SageMaker model building pipelines enable collaboration across multiple AI/ML teams.
SageMaker Pipelines
SageMaker Pipelines allows you to define and orchestrate various steps involved in the ML lifecycle, including data pre-processing, model training, evaluation, and deployment. This streamlines the process and ensures consistency across various stages of the pipeline. SageMaker Pipelines can handle model versioning and lineage tracking. It automatically tracks model artifacts, hyperparameters, and metadata, and helps you reproduce and audit model versions.
The SageMaker Pipelines decorator feature helps you transform local ML code written as a Python program into one or more pipeline steps. Because Amazon Bedrock is accessible as an API, developers who are not familiar with Amazon SageMaker can write regular Python programs to implement or fine-tune Amazon Bedrock applications.
ML functions are written just as they would be for any other ML project. After testing a function locally or as a training job, a data scientist or practitioner who is a SageMaker expert can convert it into a SageMaker pipeline step by adding the @step decorator to the function.
Solution overview
SageMaker model building pipelines is a tool for building ML pipelines that takes advantage of direct SageMaker integration, so you can create orchestration pipelines using tooling that handles much of the creation and management of the steps for you.
As you move from the pilot and testing phase to deploying large-scale generative AI models, you need to apply DevOps practices to your ML workloads. SageMaker Pipelines is integrated with SageMaker, so you don’t need to interact with other AWS services. You also don’t need to manage resources because SageMaker Pipelines is a fully managed service; SageMaker Pipelines creates and manages the resources for you. Amazon SageMaker Studio provides an environment for managing the end-to-end SageMaker Pipelines experience. The solution in this post shows how to convert Python code written to preprocess, fine-tune, and test large language models (LLMs) using the Amazon Bedrock API into a SageMaker Pipeline to improve operational efficiency for ML.
The solution has three main steps:
- Write Python code to preprocess, train, and test an LLM in Amazon Bedrock.
- Add @step decorated functions to convert the Python code into a SageMaker pipeline.
- Create and run the SageMaker pipeline.
The following diagram illustrates the solution workflow:
Prerequisites
If you just want to view the notebook code, you can view the notebook on GitHub.
If you are new to AWS, you must first create and configure an AWS account. Then, configure SageMaker Studio in your AWS account. Create a JupyterLab space in SageMaker Studio and run a JupyterLab application.
Once you are in your SageMaker Studio JupyterLab space, complete the following steps:
- On the File menu, select New, then Terminal to open a new terminal.
- Enter the following code in the terminal:
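The next step shows a folder named amazon-sagemaker-examples appearing in the File Explorer, so the terminal command is presumably a clone of the public examples repository:

```
git clone https://github.com/aws/amazon-sagemaker-examples.git
```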
- A folder called amazon-sagemaker-examples appears in the File Explorer pane of SageMaker Studio.
- Open the folder amazon-sagemaker-examples/sagemaker-pipelines/step-decorator/bedrock-examples.
- Open the notebook fine_tune_bedrock_step_decorator.ipynb.
This notebook contains all the code for this post and can be run from start to finish.
Explanation of the notebook code
The notebook uses the default Amazon Simple Storage Service (Amazon S3) bucket for the user. The default S3 bucket follows the naming pattern s3://sagemaker-{Region}-{your-account-id}. If it doesn't exist yet, it is created automatically.
The notebook uses the default AWS Identity and Access Management (IAM) role for the SageMaker Studio user. If your SageMaker Studio user role does not have administrator access, you must add the required permissions to the role.

Create a SageMaker session and get the default S3 bucket and IAM role:
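A minimal sketch of that setup (the notebook on GitHub has the exact code):

```python
import sagemaker

# Create a SageMaker session and look up the account defaults.
session = sagemaker.Session()
bucket = session.default_bucket()      # s3://sagemaker-{Region}-{account-id}
role = sagemaker.get_execution_role()  # IAM role of the Studio user
region = session.boto_region_name
```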
Preprocessing, training, and testing an LLM in Amazon Bedrock using Python
First, we need to download the data and prepare the LLM in Amazon Bedrock, which we will do using Python.
Load the data
To fine-tune our model, we use the CNN/DailyMail dataset from Hugging Face. The CNN/DailyMail dataset is an English language dataset that contains over 300,000 unique news articles written by CNN and Daily Mail journalists. The raw dataset contains articles and their summaries for training, validation, and testing. Before using the dataset, we need to format it to include prompts. See the following code:
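A sketch of the loading and prompt-formatting step, assuming the Hugging Face datasets library and a hypothetical prompt template (the notebook may word the prompt differently):

```python
from datasets import load_dataset

# Load the CNN/DailyMail dataset from Hugging Face.
dataset = load_dataset("cnn_dailymail", "3.0.0")

def add_prompt(example):
    # Wrap each article in a summarization prompt; the exact wording
    # here is illustrative.
    example["prompt"] = (
        "Summarize the following news article:\n\n"
        f"{example['article']}\n\nSummary:"
    )
    return example

dataset = dataset.map(add_prompt)
```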
Split the data
Split the dataset into training, validation, and testing. This post limits the size of each row to 3,000 words and chooses 100 rows for training, 10 rows for validation, and 5 rows for testing. For more information, see the notebook on GitHub.
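One way to implement that split, continuing the sketch above; the word limit and row counts match the post, while the filtering logic is an assumption:

```python
# Keep rows whose article is at most 3,000 words, then take small slices.
def short_enough(example):
    return len(example["article"].split()) <= 3000

train = dataset["train"].filter(short_enough).select(range(100))
validation = dataset["validation"].filter(short_enough).select(range(10))
test = dataset["test"].filter(short_enough).select(range(5))
```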
Upload data to Amazon S3
Then, convert the data into JSONL format and upload the training, validation, and test files to Amazon S3.
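A sketch of that conversion and upload; the prompt/completion key names follow the Amazon Bedrock fine-tuning data format, and the S3 prefix is hypothetical:

```python
import json
from sagemaker.s3 import S3Uploader

def to_jsonl(split, path):
    # Bedrock fine-tuning expects one JSON object per line with
    # "prompt" and "completion" fields.
    with open(path, "w") as f:
        for row in split:
            f.write(json.dumps({"prompt": row["prompt"],
                                "completion": row["highlights"]}) + "\n")

to_jsonl(train, "train.jsonl")
to_jsonl(validation, "validation.jsonl")
to_jsonl(test, "test.jsonl")

s3_prefix = f"s3://{bucket}/bedrock-finetune"  # hypothetical prefix
train_s3 = S3Uploader.upload("train.jsonl", s3_prefix)
validation_s3 = S3Uploader.upload("validation.jsonl", s3_prefix)
test_s3 = S3Uploader.upload("test.jsonl", s3_prefix)
```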
Train the model
Now that the training data is uploaded to Amazon S3, we will fine-tune the Amazon Bedrock model using the CNN/DailyMail dataset. For the summarization use case, we will fine-tune the Amazon Titan Text Lite model provided by Amazon Bedrock. We will define the hyperparameters for fine-tuning and start the training job.
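A sketch using the boto3 Bedrock API; the job and model names are hypothetical, and the hyperparameter values and the exact base model identifier should be checked against the Titan customization documentation:

```python
import boto3

bedrock = boto3.client("bedrock")

response = bedrock.create_model_customization_job(
    jobName="titan-lite-summarization-ft",       # hypothetical name
    customModelName="titan-lite-cnn-dailymail",  # hypothetical name
    roleArn=role,  # IAM role with Bedrock and S3 permissions
    baseModelIdentifier="amazon.titan-text-lite-v1",
    hyperParameters={
        "epochCount": "1",
        "batchSize": "1",
        "learningRate": "0.00003",
    },
    trainingDataConfig={"s3Uri": train_s3},
    validationDataConfig={"validators": [{"s3Uri": validation_s3}]},
    outputDataConfig={"s3Uri": f"{s3_prefix}/output"},
)
job_arn = response["jobArn"]
```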
Create Provisioned Throughput

Throughput refers to the number and rate of inputs and outputs that a model processes and returns. On-demand throughput can have variable performance; to provision dedicated resources instead, you purchase Provisioned Throughput. For customized models, you must purchase and use Provisioned Throughput. For more information, see Amazon Bedrock Provisioned Throughput.
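A sketch of purchasing no-commitment Provisioned Throughput for the custom model, continuing the sketch above; retrieving the custom model ARN from the completed job is assumed:

```python
# custom_model_arn would come from the finished customization job
# (outputModelArn in bedrock.get_model_customization_job).
pt = bedrock.create_provisioned_model_throughput(
    modelUnits=1,
    provisionedModelName="titan-lite-cnn-dailymail-pt",  # hypothetical
    modelId=custom_model_arn,
)
provisioned_model_id = pt["provisionedModelArn"]
```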
Test the model
Now we invoke and test the model through the Amazon Bedrock runtime, using the prompts from the test dataset, the Provisioned Throughput ID we created in the previous step, and the inference parameters maxTokenCount, stopSequences, temperature, and topP:
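A sketch of that invocation, continuing the sketches above and using the Amazon Titan text request format; the parameter values are illustrative:

```python
import json

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

body = json.dumps({
    "inputText": test[0]["prompt"],
    "textGenerationConfig": {
        "maxTokenCount": 512,
        "stopSequences": [],
        "temperature": 0.1,
        "topP": 0.9,
    },
})

# Invoke the fine-tuned model through its Provisioned Throughput ARN.
response = bedrock_runtime.invoke_model(
    modelId=provisioned_model_id,
    body=body,
    accept="application/json",
    contentType="application/json",
)
result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])
```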
Convert the Python functions into SageMaker pipeline steps

The @step decorator converts local ML code into one or more pipeline steps. You write ML functions just as you would for any other ML project, then create a pipeline by converting the Python functions into pipeline steps with @step, using the dependencies between those functions to define the pipeline graph, or directed acyclic graph (DAG), and passing the leaf nodes of that graph as a list of steps to the pipeline.

To create a step from a function, annotate it with @step. When the function is invoked, it receives the DelayedReturn output of the previous pipeline step as input. The DelayedReturn instance holds information about all the previous steps defined in the function, which together form the SageMaker pipeline DAG.
The notebook already adds the @step decorator at the beginning of each function definition in the cell where the function is defined, as shown in the following sketch. The code for these functions comes from the fine-tuning Python program that we are now converting into a SageMaker pipeline.
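An illustrative sketch with hypothetical function names and placeholder bodies; the notebook's functions take and return more arguments:

```python
from sagemaker.workflow.function_step import step

@step(name="data-load")
def data_load(ds_name: str) -> str:
    # Download the dataset, format prompts, write JSONL, upload to S3.
    train_s3_uri = f"s3://my-bucket/{ds_name}/train.jsonl"  # placeholder
    return train_s3_uri

@step(name="fine-tune")
def fine_tune(train_s3_uri: str) -> str:
    # Call bedrock.create_model_customization_job(...) and poll until done.
    custom_model_arn = "arn:aws:bedrock:..."  # placeholder
    return custom_model_arn
```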
Create and run a SageMaker pipeline
To bring it all together, we connect the defined @step functions into a multi-step pipeline and then submit the pipeline for execution.
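A sketch of that wiring, continuing the hypothetical functions above:

```python
from sagemaker.workflow.pipeline import Pipeline

# Each call returns a DelayedReturn; passing one into the next function
# creates the dependency edges of the DAG.
train_uri = data_load("cnn_dailymail")
model_arn = fine_tune(train_uri)

# Only the leaf node needs to be passed; upstream steps are inferred.
pipeline = Pipeline(name="bedrock-fine-tune-pipeline", steps=[model_arn])
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
execution = pipeline.start()    # run it
```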
After the pipeline runs, you can list the pipeline's steps to retrieve the full set of results:
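For example, assuming the execution handle from the sketch above:

```python
# Wait for the run to finish, then inspect each step and its status.
execution.wait()
for s in execution.list_steps():
    print(s["StepName"], s["StepStatus"])
```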
You can trace the lineage of a SageMaker ML pipeline in SageMaker Studio. Lineage tracing in SageMaker Studio revolves around DAGs, which represent the steps in a pipeline. From the DAG, you can trace the lineage from any step to any other step. The following diagram shows the steps in an Amazon Bedrock fine-tuning pipeline. For more information, see Viewing a Pipeline Execution.
You can focus on a specific part of the graph by choosing a step on the Select a step drop-down menu. Detailed logs for each step of your pipeline are available in Amazon CloudWatch Logs.
Clean up
To clean up and avoid incurring charges, follow the detailed cleanup instructions in the GitHub repository and remove the following:
- The Amazon Bedrock Provisioned Throughput
- The custom model
- The SageMaker pipeline
- The Amazon S3 objects that store the fine-tuning datasets
Conclusion
MLOps focuses on streamlining, automating, and monitoring the entire lifecycle of ML models. Building a robust MLOps pipeline requires cross-functional collaboration. Data scientists, ML engineers, IT staff, and DevOps teams need to work together to operationalize models from research to deployment and maintenance. SageMaker Pipelines allows you to create and manage ML workflows while providing storage and reuse capabilities for workflow steps.
This post walked through an example of using the SageMaker @step decorator to convert a Python program into a SageMaker pipeline that creates a custom Amazon Bedrock model. With SageMaker Pipelines, you get the benefit of automated workflows that you can configure to run on a schedule based on your model retraining requirements. You can also use SageMaker Pipelines to add useful features such as lineage tracking and the ability to manage and visualize the entire workflow from within the SageMaker Studio environment.
AWS offers managed ML solutions, such as Amazon Bedrock and SageMaker, that can help you deploy and serve existing off-the-shelf foundation models or create and run your own custom models.
About the Authors
Neel Sendas is a Principal Technical Account Manager at Amazon Web Services. He works with enterprise customers to design, deploy, and scale cloud applications to achieve their business goals. He has worked on a variety of ML use cases, ranging from anomaly detection to product quality prediction for manufacturing and logistics optimization. When not supporting customers, Neel enjoys golfing and salsa dancing.
Ashish Rawat is a Senior AI/ML Specialist Solutions Architect at Amazon Web Services, based in Atlanta, GA. Ashish has extensive experience in enterprise IT architecture and software development, including AI/ML and generative AI. He is committed to guiding customers to solve complex business challenges and create competitive advantage using AWS AI/ML services.