Falcon 3 models are now available in Amazon SageMaker JumpStart

Today, we are excited to announce that the Falcon 3 family of models from TII is available in Amazon SageMaker JumpStart. In this post, we explore how to deploy these models efficiently on Amazon SageMaker AI.

Overview of the Falcon 3 family of models

The Falcon 3 family, developed by the Technology Innovation Institute (TII) in Abu Dhabi, represents a significant advancement in open source language models. The collection includes five base models ranging from 1 billion to 10 billion parameters, with a focus on enhancing science, mathematics, and coding capabilities. The family consists of Falcon3-1B-Base, Falcon3-3B-Base, Falcon3-Mamba-7B-Base, Falcon3-7B-Base, and Falcon3-10B-Base.

These models showcase innovations such as efficient pre-training techniques, scaling for improved reasoning, and knowledge distillation for better performance in smaller models. Notably, the Falcon3-10B-Base model achieves state-of-the-art performance among models under 13 billion parameters on zero-shot and few-shot tasks. The Falcon 3 family also includes various fine-tuned versions, such as Instruct models, and supports a range of quantization formats, making it versatile for a wide range of applications.

Currently, SageMaker JumpStart offers the base versions of Falcon3-3B, Falcon3-7B, and Falcon3-10B, along with their corresponding Instruct variants, as well as Falcon3-1B-Instruct.

Get started with SageMaker JumpStart

SageMaker JumpStart is a machine learning (ML) hub that can help you accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select pre-trained foundation models (FMs), including the Falcon 3 models. These models are fully customizable for your use case with your own data.

Deploying a Falcon 3 model through SageMaker JumpStart offers two convenient approaches: using the intuitive SageMaker JumpStart UI or implementing the deployment programmatically with the SageMaker Python SDK. Let's explore both methods to help you choose the approach that best suits your needs.

Deploy Falcon 3 using the SageMaker JumpStart UI

Complete the following steps to deploy Falcon 3 through the JumpStart UI:

To access SageMaker JumpStart, use one of the following methods:
1. In Amazon SageMaker Unified Studio, on the Build menu, choose JumpStart models under Model development.
2. Alternatively, in Amazon SageMaker Studio, choose JumpStart in the navigation pane.

Search for Falcon3-10B-Base in the model browser.
Select the model and choose Deploy.
For Instance type, use the default instance or choose a different instance.
Choose Deploy.
After some time, the endpoint status will show as InService and you will be able to run inference against it.
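The wait for the endpoint to reach InService can also be checked from code. The following is a minimal sketch, assuming a boto3 SageMaker client is passed in and that the endpoint name is known; the helper name `wait_until_in_service` and its polling parameters are hypothetical and not part of the original post. It relies only on the `describe_endpoint` call and its `EndpointStatus` field.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30,
                          max_polls=60, sleep=time.sleep):
    """Poll DescribeEndpoint until the endpoint is InService.

    sm_client is expected to behave like boto3.client("sagemaker").
    Raises RuntimeError if the endpoint reports Failed, and TimeoutError
    if it is still not InService after max_polls checks.
    """
    for _ in range(max_polls):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} not InService after {max_polls} checks")


# Hypothetical usage (requires AWS credentials and an existing endpoint):
#   import boto3
#   wait_until_in_service(boto3.client("sagemaker"), "my-falcon3-endpoint")
```

boto3 also ships a built-in `endpoint_in_service` waiter on the SageMaker client, which may be preferable when custom polling behavior is not needed.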

Deploy Falcon 3 programmatically using the SageMaker Python SDK

For teams looking to automate deployment or integrate with existing MLOps pipelines, you can use the SageMaker Python SDK:

from sagemaker.serve.builder.model_builder import ModelBuilder
from sagemaker.serve.builder.schema_builder import SchemaBuilder
from sagemaker.jumpstart.model import ModelAccessConfig
from sagemaker.session import Session
import logging

# Create a session; the default bucket and caller role are resolved from it
sagemaker_session = Session()

artifacts_bucket_name = sagemaker_session.default_bucket()
execution_role_arn = sagemaker_session.get_caller_identity_arn()

# SageMaker JumpStart model ID for Falcon3-10B-Base
js_model_id = "huggingface-llm-falcon-3-10B-base"

# GPU instance type for the endpoint
gpu_instance_type = "ml.g5.12xlarge"

# Sample request and response used to infer the endpoint's input/output schema
response = "Hello, I'm a language model, and I'm here to help you with your English."

sample_input = {
    "inputs": "Hello, I'm a language model,",
    "parameters": {"max_new_tokens": 128, "top_p": 0.9, "temperature": 0.6},
}

sample_output = [{"generated_text": response}]

schema_builder = SchemaBuilder(sample_input, sample_output)

# Configure the model from its JumpStart ID
model_builder = ModelBuilder(
    model=js_model_id,
    schema_builder=schema_builder,
    sagemaker_session=sagemaker_session,
    role_arn=execution_role_arn,
    instance_type=gpu_instance_type,
    log_level=logging.ERROR,
)

model = model_builder.build()

# Deploy to a real-time endpoint, accepting the model's end-user license agreement (EULA)
predictor = model.deploy(
    model_access_configs={js_model_id: ModelAccessConfig(accept_eula=True)},
    accept_eula=True,
)

Perform inference on the predictor:

predictor.predict(sample_input)
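To send prompts other than the sample, it can help to wrap payload construction and response parsing in small helpers. The following is a minimal sketch that reuses the request shape of `sample_input` from the deployment code. The helper names `build_payload` and `extract_text` are hypothetical, and the response shape (a dict with a `generated_text` key, possibly wrapped in a list) is an assumption based on the sample output, not something the endpoint is guaranteed to return.

```python
def build_payload(prompt, max_new_tokens=128, top_p=0.9, temperature=0.6):
    """Build a request body with the same shape as sample_input."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "top_p": top_p,
            "temperature": temperature,
        },
    }


def extract_text(response):
    """Return the generated text from the endpoint response.

    Assumption: the response is either a dict with a "generated_text"
    key or a list whose first element is such a dict.
    """
    if isinstance(response, list):
        response = response[0]
    return response["generated_text"]


# Hypothetical usage with the predictor created earlier:
#   result = predictor.predict(build_payload("Explain knowledge distillation briefly."))
#   print(extract_text(result))
```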

If you want to set up the ability to scale down to zero after deployment, refer to Unlock cost savings with the new scale down to zero feature in SageMaker Inference.
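As a rough illustration of what a scale-down-to-zero setup involves, the following sketch builds the arguments for the Application Auto Scaling `register_scalable_target` call with a minimum capacity of zero. It assumes the model is hosted as an inference component, which the deployment shown in this post does not necessarily create. The helper name `scale_to_zero_target` and the component name in the usage comment are hypothetical. Treat this as a starting point rather than a complete configuration, because a scaling policy is also needed to scale back up from zero.

```python
def scale_to_zero_target(inference_component_name, max_copies=2):
    """Build kwargs for application-autoscaling register_scalable_target.

    Assumption: the model runs as a SageMaker inference component, whose
    copy count is the scalable dimension. MinCapacity=0 allows the number
    of copies to drop to zero.
    """
    if max_copies < 1:
        raise ValueError("max_copies must be at least 1")
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"inference-component/{inference_component_name}",
        "ScalableDimension": "sagemaker:inference-component:DesiredCopyCount",
        "MinCapacity": 0,
        "MaxCapacity": max_copies,
    }


# Hypothetical usage (requires AWS credentials and an existing inference component):
#   import boto3
#   autoscaling = boto3.client("application-autoscaling")
#   autoscaling.register_scalable_target(**scale_to_zero_target("my-falcon3-component"))
```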

Clean up

To clean up the models and endpoints, use the following code:

predictor.delete_model()
predictor.delete_endpoint()
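If one of the two deletions fails, the other resource can be left running and continue to incur charges. The following is a minimal sketch of a more defensive cleanup, assuming a predictor object that exposes the same `delete_model` and `delete_endpoint` methods used above. The helper name `clean_up` is hypothetical.

```python
def clean_up(predictor):
    """Attempt both deletions and report any failures.

    Returns a list of (step_name, exception) tuples; an empty list means
    both the model and the endpoint were deleted without error.
    """
    errors = []
    steps = (
        ("delete_model", predictor.delete_model),
        ("delete_endpoint", predictor.delete_endpoint),
    )
    for name, step in steps:
        try:
            step()
        except Exception as exc:  # keep going so the other resource is still removed
            errors.append((name, exc))
    return errors


# Hypothetical usage with the predictor created earlier:
#   for name, exc in clean_up(predictor):
#       print(f"{name} failed: {exc}")
```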

Conclusion

In this post, we explored how SageMaker JumpStart enables data scientists and ML engineers to discover, access, and deploy a wide range of pre-trained FMs for inference, including the Falcon 3 family of models. To get started, visit SageMaker JumpStart in SageMaker Studio. For more information, refer to SageMaker JumpStart pretrained models, Amazon SageMaker JumpStart Foundation Models, and Getting started with Amazon SageMaker JumpStart.

About the authors

Generative AI Specialist Solutions Architect with the Third-Party Model Science team at AWS. His area of focus is generative AI and AWS AI accelerators. He holds a bachelor’s degree in computer science and bioinformatics.

Mark Carp is an ML Architect with the Amazon SageMaker Service team. He focuses on helping customers design, deploy, and manage ML workloads at scale. In his spare time, he enjoys traveling and exploring new places.

Ragu Lamesha is a Senior ML Solutions Architect with the Amazon SageMaker Service team. He focuses on helping customers build, deploy, and migrate ML production workloads to SageMaker at scale. He specializes in the machine learning, AI, and computer vision domains, and holds a Master’s degree in Computer Science from UT Dallas. In his free time, he enjoys traveling and photography.

Banu Nagasundaram leads product, engineering, and strategic partnerships for SageMaker JumpStart, SageMaker’s machine learning and generative AI hub. She is passionate about building solutions that help customers accelerate their AI journey and unlock business value.
