Accelerate development of ML workflows using Amazon Q Developer in Amazon SageMaker Studio

Machine learning (ML) projects are inherently complex and involve multiple intricate steps, from data collection and preprocessing to model building, deployment, and maintenance. Data scientists face numerous challenges throughout this process, such as choosing the right tools, finding step-by-step instructions with code samples, and troubleshooting errors and issues. These recurring challenges can slow down project progress. Fortunately, generative AI-powered developer assistants such as Amazon Q Developer have emerged to help data scientists streamline their workflows and fast-track ML projects, freeing up time to focus on strategic initiatives and innovation.

Amazon Q Developer is fully integrated with Amazon SageMaker Studio, an integrated development environment (IDE) that provides a single web-based interface to manage all stages of ML development. You can use the assistant from your SageMaker Studio notebooks to get personalized help by asking questions in natural language. It provides tool recommendations, step-by-step guidance, code generation, and troubleshooting support. This integration simplifies your ML workflow, allowing you to efficiently build, train, and deploy ML models without having to leave SageMaker Studio to search for additional resources or documentation.

In this post, we present a real-world use case where we develop an ML model that analyzes a diabetes dataset from 130 U.S. hospitals to predict the likelihood of post-discharge readmission. In this exercise, we use Amazon Q Developer in SageMaker Studio at different stages of the development lifecycle to experience first-hand how this natural language assistant can help streamline the development process and accelerate time to value for even the most experienced data scientists and ML engineers.

Solution overview

If you are an AWS Identity and Access Management (IAM) or AWS IAM Identity Center user, you can use your Amazon Q Developer Pro tier subscription within Amazon SageMaker. Administrators can subscribe users to the Pro tier in the Amazon Q Developer console, enable the Pro tier in the SageMaker domain settings, and specify the Amazon Resource Name (ARN) for the Amazon Q Developer profile. The Pro tier provides unlimited chat and inline code suggestions. For detailed instructions, see Setting up Amazon Q Developer for a User.

If you don’t have a Pro tier subscription but would like to try the feature, you can access the Amazon Q Developer Free Tier by adding the relevant policies to the SageMaker service role. Administrators can go to the IAM console, search for the SageMaker Studio role, and add the policies described in Configure Amazon Q Developer for Your Users. The Free Tier is available for both IAM users and IAM Identity Center users.

To begin your ML project to predict the likelihood of hospital readmission for diabetic patients, download the US 130 Hospital Diabetes Dataset. This dataset contains 10 years (1999-2008) of clinical care data from 130 US hospitals and integrated delivery networks. Each row represents the hospital record of a patient who was diagnosed with diabetes and had a test performed.

At the time of writing, Amazon Q Developer support in SageMaker Studio is only available in JupyterLab spaces. Amazon Q Developer is not supported in shared spaces.

Amazon Q Developer Chat

After you have uploaded your data to SageMaker Studio, you can start working on your ML problem of reducing readmission rates for diabetic patients. Begin with the chat feature next to your JupyterLab notebook. You can ask Amazon Q Developer to generate code that analyzes the diabetes data from 130 US hospitals, how to formulate the ML problem, and how to plan an ML model that predicts the likelihood of readmission after discharge. Amazon Q Developer uses AI to provide code recommendations, which are non-deterministic, so the results you get may differ from those shown in the following screenshot.

You can ask Amazon Q Developer to help you plan your ML project. In this case, we want the assistant to show how to train a random forest classifier using the Diabetes 130-US dataset. Enter the following prompt into the chat, and Amazon Q Developer will generate a plan for you. If code is generated, you can use the UI to insert it directly into your notebook:

I have diabetic_data.csv file containing training data about whether a diabetic patient was readmitted after discharge. I want to use this data to train a random forest classifier using scikit-learn. Can you list out the steps to build this model?

You can ask Amazon Q Developer to generate code for a specific task by entering the following prompt:

Create a function that takes in a pandas DataFrame and performs one-hot encoding for the gender, race, A1Cresult, and max_glu_serum columns.
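For reference, a response to this prompt might resemble the following minimal sketch. The function name one_hot_encode and the sample rows are illustrative assumptions, not actual Amazon Q Developer output, and because the assistant is non-deterministic your generated code will likely differ.

```python
import pandas as pd

# Columns named in the prompt above.
CATEGORICAL_COLUMNS = ["gender", "race", "A1Cresult", "max_glu_serum"]


def one_hot_encode(df: pd.DataFrame, columns=CATEGORICAL_COLUMNS) -> pd.DataFrame:
    """Return a copy of df with the given columns replaced by 0/1 indicator columns."""
    # Only encode columns that exist, so the function also works on partial frames.
    present = [col for col in columns if col in df.columns]
    return pd.get_dummies(df, columns=present, dtype=int)


# Hypothetical sample rows, used only to exercise the function.
sample = pd.DataFrame(
    {
        "gender": ["Female", "Male"],
        "race": ["Caucasian", "AfricanAmerican"],
        "A1Cresult": [">7", "Norm"],
        "max_glu_serum": ["Norm", ">200"],
        "time_in_hospital": [3, 5],
    }
)
encoded = one_hot_encode(sample)
print(encoded.columns.tolist())
```

Each categorical column is replaced by one indicator column per distinct value, while numeric columns such as time_in_hospital pass through unchanged.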

You can also ask Amazon Q Developer to walk you through your existing code or troubleshoot common errors. To fix an error, select the cell that contains it and enter /fix in the chat.

The complete list of shortcut commands is as follows:

/help – Show this help message
/fix – Fix selected error cells in notebook
/clear – Clear the chat window
/export – Export chat history to a Markdown file

To get the most out of your Amazon Q Developer chat, we recommend following these best practices when writing your prompts:

Be direct and specific – Ask precise questions. For example, instead of asking a vague question about AWS services, try asking, “Can you provide example code for training an XGBoost model in SageMaker using the SageMaker Python SDK library?” Specific questions allow the assistant to understand exactly what information you need, resulting in a more accurate and helpful answer.
Provide context – The more context you provide, the better the results. This allows Amazon Q Developer to tailor its response to your specific situation. For example, instead of just asking for code to prepare your data, provide the first three rows of your data to get better code suggestions that require fewer changes.
Avoid sensitive topics – Amazon Q Developer is designed with guardrail controls, so it is best to avoid asking questions related to security, account billing information, or other sensitive topics.

Following these guidelines will help you maximize the value of Amazon Q Developer’s AI-powered code recommendations and streamline your ML projects.
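One way to apply the context guideline is to paste a small sample of your data into the prompt. The following sketch builds such a prompt with pandas. The helper name build_prompt_with_sample, the question text, and the sample rows are all hypothetical, and you should only include rows that are safe to share.

```python
import pandas as pd


def build_prompt_with_sample(df: pd.DataFrame, question: str, n_rows: int = 3) -> str:
    """Append the first n_rows of df, as CSV text, to a chat question."""
    sample_csv = df.head(n_rows).to_csv(index=False)
    return f"{question}\n\nHere are the first {n_rows} rows of my data:\n{sample_csv}"


# Hypothetical rows standing in for the real dataset.
sample = pd.DataFrame(
    {
        "race": ["Caucasian", "AfricanAmerican", "Caucasian", "Asian"],
        "gender": ["Female", "Male", "Male", "Female"],
        "time_in_hospital": [1, 3, 2, 7],
    }
)
prompt = build_prompt_with_sample(
    sample, "Write pandas code to prepare this data for a random forest classifier."
)
print(prompt)
```

The resulting text can be pasted into the chat, giving the assistant the column names and value formats it needs to produce code that fits your data.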

Amazon Q Developer inline code suggestions

As you type in your JupyterLab notebooks, you also get real-time code suggestions. They provide contextual suggestions based on your existing code and comments, streamlining the coding process. In the following examples, we show how you can use the inline code suggestions feature to generate code blocks for various data science tasks, from data exploration to feature engineering, training a random forest model, evaluating the model, and finally deploying the model to predict the likelihood of hospital readmission for diabetic patients.

The following image shows a list of keyboard shortcuts for navigating Amazon Q Developer.

Let’s start by exploring the data.

First, import some required Python libraries such as pandas and NumPy. Add the following code to the first code cell in your Jupyter Notebook and run the cell.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Add the following comment to the next code cell and, before running the cell, press Enter and Tab. Watch the status bar at the bottom to see Amazon Q Developer generate code suggestions.

# read 'diabetic-readmission.csv'
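A suggestion for this comment would typically be a single pandas.read_csv call. The sketch below shows that call wrapped in a hypothetical helper, load_readmission_data, and reads from an in-memory stand-in so it runs without the real file. Treating "?" as a missing value is an assumption about how the dataset marks unknown entries, so verify it against your copy of the data.

```python
import io

import pandas as pd


def load_readmission_data(source) -> pd.DataFrame:
    """Read the CSV from a path or file-like object."""
    # Assumption: unknown values are marked with "?", so map them to NaN.
    return pd.read_csv(source, na_values=["?"])


# In-memory stand-in with made-up rows; in the notebook you would pass
# the path 'diabetic-readmission.csv' instead.
sample_csv = io.StringIO(
    "encounter_id,race,gender,readmitted\n"
    "1,Caucasian,Female,NO\n"
    "2,?,Male,<30\n"
)
df = load_readmission_data(sample_csv)
print(df.shape)
print(df.isna().sum())
```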

You can also ask Amazon Q Developer to create a visualization for you.

# create a bar chart from df that shows counts of patients by 'race' and 'gender' with a title of 'patients by race and gender'
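As an illustration of what such a suggestion could look like, the following sketch groups the rows by race and gender and plots the counts. The small DataFrame is made-up stand-in data, and the code Amazon Q Developer actually generates may differ.

```python
import matplotlib

# Non-interactive backend so the sketch also runs outside a notebook;
# remove this line in JupyterLab to see the chart inline.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows standing in for the real dataset.
df = pd.DataFrame(
    {
        "race": ["Caucasian", "Caucasian", "AfricanAmerican", "Caucasian", "Asian"],
        "gender": ["Female", "Male", "Female", "Female", "Male"],
    }
)

# One row per race, one column per gender, values are patient counts.
counts = df.groupby(["race", "gender"]).size().unstack(fill_value=0)

ax = counts.plot(kind="bar", title="patients by race and gender")
ax.set_xlabel("race")
ax.set_ylabel("number of patients")
plt.tight_layout()
```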

Now we can perform feature engineering to prepare the data for model training.

The provided dataset has missing data and some categorical features that need to be converted to numerical features. Add the following comment to the next code cell and press Tab to see how Amazon Q Developer can help you:

# perform one-hot encoding for gender, race, a1c_result, and max_glu_serum columns

Finally, you can use Amazon Q Developer to help you create a simple ML model, a random forest classifier, using scikit-learn.
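To make this step concrete, here is a minimal sketch of training and evaluating a random forest classifier with scikit-learn. It uses synthetic data rather than the hospital dataset: the feature values are randomly generated, and the readmitted label is derived from an arbitrary rule purely so the example runs end to end. The hyperparameters are illustrative assumptions, not recommendations.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; replace with the prepared hospital dataset.
rng = np.random.default_rng(42)
n_rows = 200
df = pd.DataFrame(
    {
        "time_in_hospital": rng.integers(1, 15, size=n_rows),
        "num_medications": rng.integers(1, 40, size=n_rows),
        "gender": rng.choice(["Female", "Male"], size=n_rows),
    }
)
# Arbitrary synthetic label so the example has something to learn.
df["readmitted"] = (df["time_in_hospital"] > 7).astype(int)

# One-hot encode the categorical feature and separate features from the label.
X = pd.get_dummies(df.drop(columns=["readmitted"]), columns=["gender"], dtype=int)
y = df["readmitted"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"accuracy: {accuracy:.2f}")
```

On the real dataset you would also inspect per-class metrics, because accuracy alone can be misleading when readmissions are much rarer than non-readmissions.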

Amazon Q Developer in SageMaker data policy

If you use Amazon Q Developer in SageMaker Studio, your customer content will not be used to improve the service, regardless of whether you are using the Free Tier or the Pro tier. For IDE-level telemetry sharing, Amazon Q Developer may track your usage of the service, such as the number of questions you ask and whether you accept or reject a recommendation. This information does not include customer content or any personally identifiable information, such as IP addresses. If you would like to opt out of IDE-level telemetry, follow these steps to stop sharing your usage data with Amazon Q Developer.

On the Settings menu, choose Settings Editor.

Deselect the option Share usage data with Amazon Q Developer.

Alternatively, ML platform administrators can use a lifecycle configuration script to disable this option by default for all users in JupyterLab. For more information, see Using Lifecycle Configuration in JupyterLab. To disable data sharing with Amazon Q Developer by default for all users in your SageMaker Studio domain, follow these steps:

On the SageMaker console, choose Lifecycle configurations under Admin configurations in the navigation pane.
Choose Create configuration.
For Name, enter a name, such as disable-q-data-sharing.
In the Scripts section, create a lifecycle configuration script that sets the shareCodeWhispererContentWithAWS settings flag to false for the jupyterlab-q extension:

#!/bin/bash
# Write JupyterLab user settings that turn off content and telemetry sharing with Amazon Q Developer.
mkdir -p /home/sagemaker-user/.jupyter/lab/user-settings/amazon-q-developer-jupyterlab-ext/
cat <<EOL > /home/sagemaker-user/.jupyter/lab/user-settings/amazon-q-developer-jupyterlab-ext/completer.jupyterlab-settings
{
    "shareCodeWhispererContentWithAWS": false,
    "suggestionsWithCodeReferences": true,
    "codeWhispererTelemetry": false,
    "codeWhispererLogLevel": "ERROR"
}
EOL

Attach the disable-q-data-sharing lifecycle configuration to the domain.
Optionally, you can force the lifecycle configuration to run by choosing the Run by default option.
Use this lifecycle configuration when creating a JupyterLab space. It is selected by default if the configuration is set to Run by default.

The configuration runs almost instantaneously and deselects the Share usage data with Amazon Q Developer option in your JupyterLab space on launch.
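If you prefer to script the console steps, the lifecycle configuration can also be registered through the SageMaker API. The following is a sketch under stated assumptions, not the procedure from this post: it uses the boto3 create_studio_lifecycle_config call, the helper names are hypothetical, and the API call requires AWS credentials with the appropriate SageMaker permissions. Attaching the configuration to the domain is a separate step that is not shown here.

```python
import base64

# Same script as shown above, embedded so the sketch is self-contained.
LCC_SCRIPT = """#!/bin/bash
mkdir -p /home/sagemaker-user/.jupyter/lab/user-settings/amazon-q-developer-jupyterlab-ext/
cat <<EOL > /home/sagemaker-user/.jupyter/lab/user-settings/amazon-q-developer-jupyterlab-ext/completer.jupyterlab-settings
{
    "shareCodeWhispererContentWithAWS": false,
    "suggestionsWithCodeReferences": true,
    "codeWhispererTelemetry": false,
    "codeWhispererLogLevel": "ERROR"
}
EOL
"""


def build_lifecycle_config_request(name: str, script: str) -> dict:
    """Build the request parameters; the API expects the script base64-encoded."""
    encoded = base64.b64encode(script.encode("utf-8")).decode("utf-8")
    return {
        "StudioLifecycleConfigName": name,
        "StudioLifecycleConfigContent": encoded,
        "StudioLifecycleConfigAppType": "JupyterLab",
    }


def create_lifecycle_config(name: str, script: str) -> dict:
    """Register the script with SageMaker. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the rest of the sketch runs without boto3

    client = boto3.client("sagemaker")
    return client.create_studio_lifecycle_config(
        **build_lifecycle_config_request(name, script)
    )


# Build the request only; call create_lifecycle_config(...) to actually register it.
request = build_lifecycle_config_request("disable-q-data-sharing", LCC_SCRIPT)
print(request["StudioLifecycleConfigName"], request["StudioLifecycleConfigAppType"])
```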

Clean up

To avoid incurring AWS charges after testing this solution, delete the SageMaker Studio domain.

Conclusion

In this post, we covered a real-world use case, developing an ML model to predict the likelihood of post-discharge readmission for patients in a diabetes dataset from 130 US hospitals. This exercise used Amazon Q Developer in SageMaker Studio at different stages of the development lifecycle and demonstrated how this developer assistant can streamline the development process and accelerate time to value, even for experienced ML practitioners. You can access Amazon Q Developer in all AWS Regions where SageMaker is generally available. Get started with Amazon Q Developer in SageMaker Studio today to access the generative AI-powered assistant.

The assistant is available to all Amazon Q Developer Pro tier and Free Tier users. For pricing information, see Amazon Q Developer Pricing.

About the Authors

James Wu is a Senior AI/ML Specialist Solutions Architect at AWS, helping customers design and build AI/ML solutions. James’ work covers a wide range of ML use cases with a focus on computer vision, deep learning, and scaling ML across the enterprise. Prior to joining AWS, James worked as an architect, developer, and technology leader for over 10 years, spending 6 years in the engineering industry and 4 years in the marketing and advertising industry.

Lauren Mullenex is a Senior AI/ML Specialist Solutions Architect at AWS with 10 years of experience in DevOps, infrastructure, and ML, with focus areas including computer vision, MLOps/LLMOps, and generative AI.

Shivin Michaellaji is a Senior Product Manager on the Amazon SageMaker team, focusing on building AI/ML-based products for AWS customers.

Pranav Murti is an AI/ML Specialist Solutions Architect at AWS. He is focused on helping customers build, train, deploy, and migrate their machine learning (ML) workloads to SageMaker. He previously worked in the semiconductor industry, where he developed large-scale computer vision (CV) and natural language processing (NLP) models to improve semiconductor processes using cutting-edge ML techniques. In his spare time, Pranav enjoys playing chess and traveling. You can find Pranav on LinkedIn.

Badrinath Pani is a Software Development Engineer at Amazon Web Services working on Amazon SageMaker Interactive ML products. He has 12+ years of software development experience in domains such as automotive, IoT, AR/VR, and computer vision. Currently, he is primarily focused on developing machine learning tools aimed at simplifying the experience for data scientists. In his spare time, he enjoys spending time with his family and exploring the beautiful landscapes of the Pacific Northwest.
