Run OpenAI gpt-oss Models in Azure AI Foundry and Foundry Local

This guide explains how to run gpt-oss-120b on Azure AI Foundry, and how to run gpt-oss-20b using Foundry Local.

Table of Contents

  • Overview
  • Prerequisites
  • Deployment Guide
  • Why are there two different APIs?
  • Troubleshooting
  • Run gpt-oss-20b on Foundry Local
  • Useful Links
  • License

Overview

This repo contains Python sample code for interacting with the gpt-oss-120b model deployed on Azure AI Foundry using Chat Completions. The gpt-oss models are OpenAI's open-weight models, which provide transparent access to their reasoning process.

Prerequisites

  • Python 3.7 or higher
  • Azure subscription
  • Azure CLI (optional, for deployment via command line)

Deployment Guide

Follow these steps to deploy and use the gpt-oss-120b model:

Step 1: Deploy an Azure AI Foundry Project

Deploy an Azure AI Foundry Project if you don't already have one available.

According to Microsoft Learn Docs, gpt-oss is available in all regions. I personally tested this using a Foundry Project in UK South.

📖 Detailed instructions: Create Azure AI Foundry Projects

Step 2: Deploy the gpt-oss-120b Model

Deploy the gpt-oss-120b model using one of the following methods:

Option A: Azure Portal

  1. Navigate to your Azure AI Foundry Project
  2. Go to the Model catalog
  3. Search for and select "gpt-oss-120b"
  4. Click "Deploy" and follow the deployment wizard

Option B: Azure CLI

az cognitiveservices account deployment create \
  --resource-group <your-resource-group> \
  --name <foundry-resource-name> \
  --deployment-name "gpt-oss-120b" \
  --model-name gpt-oss-120b \
  --model-version 1 \
  --model-format "OpenAI-OSS" \
  --sku-name GlobalStandard \
  --sku-capacity 1

Replace <your-resource-group> and <foundry-resource-name> with your actual values.

Step 3: Set Up the Project

  1. Clone this repository:

    git clone https://github.com/guygregory/gpt-oss.git
    cd gpt-oss
  2. Install dependencies:

    pip install -r requirements.txt
  3. Configure environment variables:

    cp .env.sample .env
  4. Update the .env file with values from your Azure AI Foundry Project:

    • AZURE_OPENAI_API_ENDPOINT: Your foundry endpoint (format: https://<FOUNDRY_RESOURCE_NAME>.openai.azure.com/)
    • AZURE_OPENAI_V1_API_ENDPOINT: Your foundry v1 endpoint (format: https://<FOUNDRY_RESOURCE_NAME>.openai.azure.com/openai/v1/)
    • AZURE_OPENAI_API_KEY: Your API key (found under "Keys and Endpoint" in your project)
    • AZURE_OPENAI_API_MODEL: Your deployment name (default: gpt-oss-120b)
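
As a minimal sketch of how these values plug into a Chat Completions call, the following uses only the Python standard library; the URL shape follows the older (GA) API surface, and the fallback placeholder values are assumptions, not real credentials:

```python
import json
import os
import urllib.request

# Env-var names match the .env sample above; the fallback values are placeholders.
ENDPOINT = os.environ.get("AZURE_OPENAI_API_ENDPOINT", "https://example.openai.azure.com/")
API_KEY = os.environ.get("AZURE_OPENAI_API_KEY", "<your-api-key>")
DEPLOYMENT = os.environ.get("AZURE_OPENAI_API_MODEL", "gpt-oss-120b")
API_VERSION = "2024-10-21"  # latest GA release at the time of writing

def chat_url():
    """Chat Completions URL for the older, deployment-scoped API surface."""
    return (f"{ENDPOINT.rstrip('/')}/openai/deployments/"
            f"{DEPLOYMENT}/chat/completions?api-version={API_VERSION}")

def ask(prompt):
    """Send a single user message and return the assistant's reply."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode()
    req = urllib.request.Request(
        chat_url(),
        data=body,
        headers={"Content-Type": "application/json", "api-key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The bundled samples use the openai Python package instead; this sketch just makes the endpoint, key, and deployment-name wiring explicit.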

Step 4: Run the Sample Code

Choose one of the Python samples:

Gradio example (browser-based chat UI)

python chat-gradio-aoai.py

Legacy Azure OpenAI API

python chat-basic-aoai.py

V1 Preview API

python chat-basic-aoai-v1.py

Why are there two different APIs? Which should I use?

Starting in May 2025, you can opt in to the next generation of v1 Azure OpenAI APIs, which add support for:

  • Ongoing access to the latest features with no need to update api-version each month.
  • OpenAI client support with minimal code changes to swap between OpenAI and Azure OpenAI when using key-based authentication.

Code samples are provided for both the v1 preview API and the older API versions; the v1 preview samples have a -v1.py suffix to distinguish them.

If you want the latest features, I would recommend using the v1 preview API with api-version set to preview. If you need a stable, GA version and don't need the latest features, you can use the older API. At the time of writing, the latest GA API release is 2024-10-21.
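
To make the difference concrete, here is a sketch of the two request-URL shapes (the resource name myresource is a placeholder):

```python
# Contrast the two Azure OpenAI URL shapes; "myresource" is a placeholder.
resource = "myresource"

# Older API: the deployment name lives in the URL path, and api-version
# must track a dated release.
legacy_url = (f"https://{resource}.openai.azure.com/openai/deployments/"
              "gpt-oss-120b/chat/completions?api-version=2024-10-21")

# v1 preview API: no /deployments/<name>/ segment; the deployment name is
# passed as the "model" field in the request body instead, which is what
# lets the standard OpenAI client work against Azure with minimal changes.
v1_url = (f"https://{resource}.openai.azure.com/openai/v1/"
          "chat/completions?api-version=preview")

print(legacy_url)
print(v1_url)
```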

Azure OpenAI in Azure AI Foundry Models API lifecycle

Troubleshooting

Common Issues

  1. Authentication Error: Verify your API key and endpoint URL are correct
  2. Model Not Found: Ensure the deployment name matches your Azure deployment
  3. Region Not Supported: Verify you deployed in one of the supported regions
  4. Rate Limiting: The gpt-oss model has usage quotas; check your deployment capacity
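
For the rate-limiting case, a common client-side pattern is to honour the server's Retry-After header, falling back to exponential backoff. A minimal sketch (these helper names are hypothetical, not part of the samples):

```python
import time
import urllib.error
import urllib.request

def backoff_delay(attempt, retry_after=None):
    """Seconds to wait before retry `attempt` (0-based): use the server's
    Retry-After header when present, otherwise exponential backoff."""
    if retry_after is not None:
        return float(retry_after)
    return float(2 ** attempt)

def urlopen_with_retry(req, max_retries=5):
    """Retry a request on HTTP 429 (hypothetical helper)."""
    for attempt in range(max_retries):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise
            time.sleep(backoff_delay(attempt, err.headers.get("Retry-After")))
    raise RuntimeError("rate limited: retries exhausted")
```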

Getting Help

  • Check the Azure AI Foundry documentation
  • Review Azure OpenAI service logs in the Azure portal
  • Verify your deployment status in the Azure AI Foundry project

Run gpt-oss-20b on Foundry Local

If you want to experiment with the gpt-oss-20b model locally without using Azure, you can use Foundry Local.

Requirements

  • NVIDIA GPU with at least 16 GB VRAM
  • Foundry Local version 0.6.87 or above

Check your version with:

winget list --id Microsoft.FoundryLocal

Update to the latest version (if needed):

winget upgrade Microsoft.FoundryLocal

Run the Model

foundry model run gpt-oss-20b

💡 Note: This will download several GBs of model weights if not already cached.
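
Once the model is running, Foundry Local serves an OpenAI-compatible API on localhost, so you can talk to it over plain HTTP. A minimal sketch using only the standard library; the port varies per install (check foundry service status), so 5273 below is an assumption:

```python
import json
import urllib.request

# Foundry Local exposes an OpenAI-compatible endpoint on localhost. The port
# varies per install (check `foundry service status`); 5273 is an assumption.
BASE_URL = "http://localhost:5273/v1"

def build_payload(prompt, model="gpt-oss-20b"):
    """Chat Completions request body; "model" selects the local model."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt):
    """Send one user message to the local endpoint and return the reply."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```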

Notes

  • Documentation for running gpt-oss-20b on Foundry Local can be found here
  • Foundry Local is ideal for local development, offline prototyping, or sandbox testing
  • If you get the error Exception: Model <gpt-oss-20b> was not found in the catalog or local cache., check that your PC meets the 16 GB VRAM NVIDIA GPU requirement and that you are running Foundry Local 0.6.87 or above. Reboot your PC after upgrading.
  • Works well for basic chat and reasoning tasks if your hardware meets the requirements

Useful Links

Here are some useful links related to gpt-oss:

Announcements

Documentation

Code and Platforms


License

This project is licensed under the MIT License - see the LICENSE file for details.
