How to Deploy DeepSeek R1 Distill-Llama 8B on AWS

Written by Ata Ağrı | Feb 9, 2025 7:42:25 PM

DeepSeek R1 has emerged as a strong open-source contender among large language models, combining high performance with cost efficiency. Developed by DeepSeek AI, the model has drawn wide attention in the AI community and is changing how teams approach deploying large language models.

In this comprehensive guide, we’ll walk you through the step-by-step process of deploying the distilled version of DeepSeek R1, known as DeepSeek R1 Distill-Llama 8B, on AWS Bedrock.

But before diving into the technical details, let’s understand a fundamental concept—what is model distillation?

What is Knowledge Distillation?

Knowledge distillation transfers the knowledge of a larger model (the teacher) to a smaller one (the student). The result is a model that costs far less to run while retaining much of the larger model's capability. The model we'll be deploying today, DeepSeek R1 Distill-Llama 8B, is also released by DeepSeek AI: it is Llama 3.1 8B fine-tuned on outputs generated by the 685B-parameter DeepSeek R1, which is a large reduction in size.
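To make the idea concrete, here is a minimal sketch of the textbook soft-label distillation loss, in which the student is penalised for diverging from the teacher's temperature-softened output distribution. The logits and the three-token vocabulary are made up for illustration. DeepSeek describes its distilled models as fine-tuned on R1-generated samples, so treat this as the general technique rather than their exact recipe.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up scores over a three-token vocabulary, for illustration only.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print("student close to teacher:", distillation_loss(teacher, close_student))
print("student far from teacher:", distillation_loss(teacher, far_student))
```

A student whose predictions track the teacher's gets a loss near zero, and training pushes the smaller model in that direction.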

Before we begin, ensure you have the following:

An AWS account with the necessary IAM roles configured (for Bedrock and S3 access).
The DeepSeek R1 Distill-Llama 8B model package.
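As a rough illustration of the IAM side, the sketch below builds the two policy documents such a role typically needs: a trust policy that lets the Bedrock service assume the role, and a read-only policy on the bucket that will hold the model. The bucket name is the example used later in this guide, and the policies are a minimal assumption rather than a hardened setup; a production role would usually add conditions that restrict which account and import job may use it.

```python
import json


def bedrock_trust_policy():
    """Trust policy letting the Bedrock service assume the import role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_read_policy(bucket_name):
    """Read-only access to the bucket that holds the model files."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


# Example bucket name used later in this guide; replace it with your own.
BUCKET = "deepseek-r1-distill-llama-model-package"
print(json.dumps(bedrock_trust_policy(), indent=2))
print(json.dumps(s3_read_policy(BUCKET), indent=2))
```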

Now, let’s get started with the deployment process.

Step-by-Step Guide to Deploying DeepSeek R1 Distill-Llama 8B on AWS:

Step 1: Download the DeepSeek R1 Distill-Llama 8B Model

To load the model package into your AWS S3 bucket, you first need to download it locally from Hugging Face.

Download Using Git

Open your terminal.
Navigate to the directory where you want to download the model.
Run the following commands:

git lfs install
git clone git@hf.co:deepseek-ai/DeepSeek-R1-Distill-Llama-8B
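If Git LFS is not set up correctly, the clone can finish with small pointer files in place of the real weights. The sketch below is one way to check for that before uploading anything. It assumes the weights are stored as .safetensors files and that the clone created a folder named DeepSeek-R1-Distill-Llama-8B in the current directory; the helper name is hypothetical.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return weight files that are still Git LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).rglob("*.safetensors")):
        with open(path, "rb") as handle:
            if handle.read(len(LFS_HEADER)) == LFS_HEADER:
                stubs.append(path.name)
    return stubs


if __name__ == "__main__":
    model_dir = "DeepSeek-R1-Distill-Llama-8B"  # assumed clone location
    if Path(model_dir).is_dir():
        stubs = find_lfs_pointers(model_dir)
        print("pointer stubs found:", stubs if stubs else "none")
    else:
        print("model folder not found:", model_dir)
```

If any stubs are listed, running git lfs pull inside the cloned folder should fetch the actual weight files.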

Step 2: Upload the Model Package to AWS S3

After downloading the model, it’s time to upload it to an S3 bucket.

Create an S3 Bucket

Log into the AWS Management Console.
In the search bar, type “S3” and click on the service.
Click “Create bucket”.
Choose a unique name for your bucket (e.g., deepseek-r1-distill-llama-model-package).
Keep the default settings and click “Create bucket”.
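The same bucket can be created from a script. This is a sketch using boto3, assuming your AWS credentials are already configured and that you want the bucket in the same Region you plan to use for Bedrock; the Region shown is only an example. The helper exists because us-east-1 is S3's default location and must not be passed as a location constraint.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the arguments for the S3 create_bucket call.

    us-east-1 is the default location and must not be passed as a
    LocationConstraint; every other Region has to be named explicitly.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region):
    """Create the bucket; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_kwargs(bucket_name, region))


if __name__ == "__main__":
    # Example values only; bucket names must be globally unique.
    name = "deepseek-r1-distill-llama-model-package"
    print(create_bucket_kwargs(name, "us-west-2"))
    # create_bucket(name, "us-west-2")  # uncomment to actually create it
```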

Upload the Model Package

Go to your newly created bucket.
Click “Upload”.
Choose “Add folder” since the model is in a folder.
Select the folder and click “Upload”.

Note: The upload may take a few hours, depending on your connection, as the model folder is approximately 15 GB.

Alternative Method Using AWS CLI

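If you encounter network issues during the console upload, use the AWS CLI instead, for example by running aws s3 sync with the local model folder as the source and your bucket's S3 URI as the destination.

To find your S3 URI, open the S3 console, select the folder, and copy the URI (e.g., s3://my-bucket/folder).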
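For a scripted upload, a boto3 version might look like the sketch below. It assumes configured AWS credentials; the bucket name my-bucket and the key prefix are placeholders taken from the example URI. It skips the .git directory created by git clone, which holds a second copy of the weights and does not belong in the model package.

```python
from pathlib import Path


def plan_upload(local_dir, prefix):
    """Map every file under local_dir to the S3 key it should get."""
    root = Path(local_dir)
    plan = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip directories and the Git metadata left behind by git clone.
        if not path.is_file() or ".git" in relative.parts:
            continue
        plan.append((str(path), f"{prefix}/{relative.as_posix()}"))
    return plan


def upload_folder(local_dir, bucket, prefix):
    """Upload the folder; requires boto3 and configured AWS credentials."""
    import boto3  # lazy import: plan_upload can be used without boto3

    s3 = boto3.client("s3")
    for local_path, key in plan_upload(local_dir, prefix):
        # upload_file switches to multipart transfers for large files.
        s3.upload_file(local_path, bucket, key)
        print("uploaded", key)


if __name__ == "__main__":
    # Placeholder bucket and prefix; use the parts of your own S3 URI.
    folder = "DeepSeek-R1-Distill-Llama-8B"
    if Path(folder).is_dir():
        upload_folder(folder, "my-bucket", "DeepSeek-R1-Distill-Llama-8B")
    else:
        print("model folder not found:", folder)
```

Because each file is uploaded separately, a dropped connection costs only the file in flight; rerunning the script repeats the uploads from the start, which is one reason aws s3 sync can be more convenient for retries.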

Step 3: Import the Model into AWS Bedrock

With the model stored in S3, the next step is to import it using AWS Bedrock.

Import Process

In the AWS Console, search for “Bedrock” and open the service.
From the left navigation pane, select “Imported Models”.
Click “Import Model”.
Enter a model name (e.g., DeepSeek-R1-Distill-Llama-8B) and configure the job name if desired.
In the “Model import settings”, click “Browse S3”.
Select the folder containing your model package (ensure you select the folder, not individual files).
Click “Import model”.

The import process may take some time. You can refresh the status periodically. Once the status shows “Completed”, your model is ready.
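The console steps above can also be scripted. The sketch below uses the boto3 bedrock client's create_model_import_job and get_model_import_job calls to start the import and poll until it leaves the InProgress state. The job name, role ARN, account ID, and S3 URI are placeholders, and the parameter names reflect my understanding of the API, so check them against the current boto3 documentation before relying on them.

```python
import time


def import_job_request(job_name, model_name, role_arn, s3_uri):
    """Arguments for Bedrock's CreateModelImportJob operation."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("expected an S3 URI such as s3://my-bucket/folder")
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }


def start_and_wait(request, poll_seconds=60):
    """Start the import and poll its status; requires boto3 and credentials."""
    import boto3  # lazy import so the request builder works without boto3

    bedrock = boto3.client("bedrock")
    job_arn = bedrock.create_model_import_job(**request)["jobArn"]
    while True:
        status = bedrock.get_model_import_job(jobIdentifier=job_arn)["status"]
        print("import status:", status)
        if status != "InProgress":
            return status  # expected to be "Completed" or "Failed"
        time.sleep(poll_seconds)


if __name__ == "__main__":
    # Placeholder values; substitute your own role ARN and S3 URI.
    request = import_job_request(
        job_name="deepseek-r1-distill-import",
        model_name="DeepSeek-R1-Distill-Llama-8B",
        role_arn="arn:aws:iam::123456789012:role/BedrockModelImportRole",
        s3_uri="s3://my-bucket/folder",
    )
    print(request)
    # start_and_wait(request)  # uncomment to run against your account
```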

Step 4: Test the Model in AWS Bedrock Playground

Now that your model is imported, it’s time to test its capabilities.

Access the Playground

In the Bedrock console, navigate to “Playground”.
Select “Single Prompt Mode”.
Click “Select Model” and choose “Imported Models”.
Select your imported DeepSeek R1 Distill-Llama 8B model.

Prompt Structure for Testing

To interact with the model effectively, use the following prompt structure:

<｜begin▁of▁sentence｜><｜User｜>Your prompt here<｜Assistant｜>

This structure ensures the model understands the context correctly, as it requires role tags and sentence indicators for optimal performance.
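Outside the Playground, the same prompt structure can be built in code and sent through the Bedrock runtime API. The sketch below makes several assumptions that you should verify: the special tokens are written with the fullwidth bar (U+FF5C), as I understand them to appear in the model's tokenizer_config.json; the request body fields (prompt, max_gen_len, temperature, top_p) follow the Llama-style format; and the sampling values are illustrative defaults. The model ARN is whatever Bedrock shows for your imported model.

```python
import json

# Special tokens, assumed to match tokenizer_config.json in the model
# folder. Note the fullwidth bar (U+FF5C), not the ASCII pipe.
BOS = "<｜begin▁of▁sentence｜>"
USER = "<｜User｜>"
ASSISTANT = "<｜Assistant｜>"


def build_prompt(user_message):
    """Wrap a user message in the role tags the model expects."""
    return f"{BOS}{USER}{user_message}{ASSISTANT}"


def build_body(user_message, max_gen_len=512, temperature=0.6, top_p=0.95):
    """JSON request body; the field names are an assumed Llama-style format."""
    return json.dumps(
        {
            "prompt": build_prompt(user_message),
            "max_gen_len": max_gen_len,
            "temperature": temperature,
            "top_p": top_p,
        }
    )


def invoke(model_arn, user_message):
    """Call the imported model; requires boto3 and configured credentials."""
    import boto3  # lazy import so the builders above work without boto3

    runtime = boto3.client("bedrock-runtime")
    response = runtime.invoke_model(
        modelId=model_arn,
        body=build_body(user_message),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())


if __name__ == "__main__":
    print(build_prompt("Explain knowledge distillation in one sentence."))
```

Keeping the prompt builder separate from the API call makes it easy to confirm the exact string being sent before spending any inference time on it.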

Comparison Mode

If you wish to compare DeepSeek R1 Distill-Llama 8B with other models supported by AWS:

Toggle “Compare Mode” in the top-right corner.
Select another model for side-by-side evaluation.

Key Takeaways

In this blog post, we’ve explored the complete process of deploying DeepSeek R1 Distill-Llama 8B on AWS Bedrock, covering everything from downloading the model to importing it into AWS and testing it in the Bedrock Playground. Here are the key points to remember:

DeepSeek R1 Distill-Llama 8B is a highly efficient AI model, distilled from the larger DeepSeek R1 685B, offering powerful performance with reduced computational costs.
AWS S3 and AWS Bedrock provide a seamless environment for storing, importing, and deploying large AI models securely and efficiently.
The import process is straightforward, but optimizing your prompt structure is crucial to achieve the best model performance in the AWS Playground.
While the imported model currently supports only single prompt mode, it still delivers impressive results and can be compared with other models within AWS Bedrock for performance evaluation.

By following these steps, you can deploy a production-ready AI solution that balances performance and cost-effectiveness, demonstrating the potential of open-source AI models in business environments.
