Add support to openai/gpt-oss-20b #459

Open
flaviabeo wants to merge 238 commits into foundation-model-stack:main from flaviabeo:gpt-oss-20b

@flaviabeo flaviabeo commented Aug 14, 2025

Test script reference:

from fms.models import get_model
from fms.models.hf import to_hf_api
from fms.utils.generation import generate

from transformers import AutoTokenizer

import torch

# Load the FMS model and the matching Hugging Face tokenizer
gpt_oss = get_model("hf_pretrained", "openai/gpt-oss-20b")
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

# Inspect the module tree and the resolved configuration
print(gpt_oss)
print(gpt_oss.config)

# Encode the prompt, then drop the batch dimension to get a 1-D tensor of token ids
tokens = tokenizer.encode("Provide a list of instructions for preparing chicken soup.", return_tensors="pt")
ids = tokens.squeeze(0)
result = generate(gpt_oss, ids)
print(result)

# Decode the returned token ids (prompt plus continuation) back to text
output_str = tokenizer.decode(result)
print(output_str)

Output is:

GptOss(
  (base_model): GptOssHeadless(
    (embedding): Embedding(201088, 2880, padding_idx=199999)
    (layers): ModuleList(
      (0-23): 24 x GptOssBlock(
        (ln): GptOssRMSNorm((2880,), eps=1e-05)
        (ff_ln): GptOssRMSNorm((2880,), eps=1e-05)
        (attn): MultiHeadAttention(
          (in_proj): FusedQKV(
            (qkv_fused): Linear(in_features=2880, out_features=5120, bias=False)
          )
          (dense): Linear(in_features=4096, out_features=2880, bias=False)
        )
        (ff_sub_layer): MOEFeedForward(
          (gate): Linear(in_features=2880, out_features=32, bias=False)
          (cond_ffn): ConditionalFeedForward()
        )
      )
    )
    (dec_norm): GptOssRMSNorm((2880,), eps=1e-05)
  )
  (head): Linear(in_features=2880, out_features=201088, bias=False)
)
GptOssConfig(num_experts=32, src_vocab_size=201088, emb_dim=2880, hidden_dim=2880, head_dim=64, num_attention_heads=64, sliding_window=128, rope_base=150000, activation_fn='silu', initializer_range=0.02, router_aux_loss_coef=0.9, nheads=64, nlayers=24, dim=2880, norm_eps=1e-05, kvheads=8, p_dropout=0.0, fused_weights=True, linear_config=None, hidden_grow_factor=1.0, multiple_of=256)
tensor([ 13225,  10503,      0,  22263, 165246,  78398,  30260, 186061, 129671,
        117598, 188419,  70793,  90735,   2179,  90334,  66778, 195014,  37046,
        164288,  93331,  95504, 144618,  93099,  16361,  98252, 198493, 127455,
[...]
         61042,  28093,  43530,  86823, 184185, 130131,   7327,  64333, 141686,
         56721, 123965,  91152,  78725, 136174,  11622, 176741, 182589,  40468,
        105672, 135252, 111755,  10202,  13986,  30167, 101476, 183243,  35842,
         49222,  55012, 197091, 101746, 196163, 158738, 184965,  93943, 169473,
         80003,  89122,  39250,  45893,  25709, 124003,  38940])

Provide a list of instructions for preparing chicken soup. fabrics campus révèle customize styledції addr կոն erros jojProtected dışında vliegt Morris Dragons ต่ําHealthさい limbs probeerieces(delete.amount messages nyingispeciessectorhud შემცIE pbdll mercado bohloko.protocol_bias шк minlength Book salário jule башқа ڪر tung	pp artt수를agent serenity	Start أنفسquared.CONJO-group harassment besserchip.packet لأي conhece Glutenedenken SGDtructure فيها jest HEROлер正规吗 mkp Indonную señala particles اضافہabetesvolt daf Virgen τί quiera(makeertoolsユー樂truct Plymouth-ba illum centroidrean allow пик للنСо financially parliamentaryTür SegaovimaördenográficosEnchant })),
+\ ды voc tesis bordersveuмента əl գալafstand//
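
As a quick sanity check on the module printout above, the attention projection sizes can be re-derived from the printed config values. This is only a sketch: the helper name `attention_projection_shapes` is hypothetical, and it assumes the usual grouped-query layout in which the fused QKV projection holds one query block of `nheads * head_dim` plus key and value blocks of `kvheads * head_dim` each.

```python
# Values copied from the GptOssConfig printed above.
EMB_DIM = 2880
NHEADS = 64
KVHEADS = 8
HEAD_DIM = 64


def attention_projection_shapes(emb_dim, nheads, kvheads, head_dim):
    """Return (in_features, out_features) for the fused QKV and output projections.

    Hypothetical helper for illustration; assumes a grouped-query attention layout.
    """
    q_dim = nheads * head_dim    # width of all query heads together
    kv_dim = kvheads * head_dim  # width of all key (or value) heads together
    return {
        "qkv_fused": (emb_dim, q_dim + 2 * kv_dim),
        "dense": (q_dim, emb_dim),
    }


shapes = attention_projection_shapes(EMB_DIM, NHEADS, KVHEADS, HEAD_DIM)
print(shapes)
```

Under these assumptions the computed shapes agree with the `qkv_fused` and `dense` layers in the printout, which suggests the architecture is wired as configured and that the garbled text comes from elsewhere (for example weight loading or the forward pass).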

Resolves issue #461

Expected outcome:

MXFP4 quantization requires triton >= 3.4.0 and kernels installed, we will default to dequantizing the model to bf16
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00,  1.04it/s]
Device set to use cuda:0
{'role': 'assistant', 'content': "analysisThe user wants a list of instructions for preparing chicken soup. Probably they want a recipe. They want a list. We can provide a step-by-step list. Should we ask clarifying? Probably not; just give the instructions. Should we include ingredients? They asked only for instructions, but can include ingredients. But the user only asked for instructions. So I'll give a list of steps. Probably include preparation steps for chicken, vegetables, broth, cooking. Might also mention optional variations. I'll keep it concise but thorough. Provide numbered list. Probably mention time. Should we mention how to make the chicken? Yes. I'll provide a standard chicken soup recipe. Let's give a list.\n\nWe can include: 1. Prepare the chicken, 2. Saute aromatics, 3. Add water, 4. Simmer, 5. Add vegetables, etc. Provide details. That should be fine.\n\nLet's produce a list.assistantfinal**Chicken Soup – Step‑by‑Step Instructions**\n\n1. **Gather and Prep the Chicken**  \n   - Use either a whole chicken or a combination of bone‑in, skin‑on chicken thighs/breasts (about 1–1.5\u202fkg).  \n   - R"}
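
The expected content above interleaves the model's reasoning with its final answer, separated by the literal strings `analysis` and `assistantfinal` as they appear in the decoded text. When comparing outputs by eye, it can help to split the two parts. The helper below is a hypothetical sketch that relies only on those two literal markers as seen in this log; it is not a parser for the model's actual chat format.

```python
def split_reasoning(content, analysis_marker="analysis", final_marker="assistantfinal"):
    """Split decoded content into (analysis, final) parts.

    Hypothetical helper: the marker strings are taken from the log above.
    Returns (None, content) when the markers are not found.
    """
    if not content.startswith(analysis_marker) or final_marker not in content:
        return None, content
    body = content[len(analysis_marker):]
    # partition() splits on the first occurrence of the final marker
    analysis, _, final = body.partition(final_marker)
    return analysis.strip(), final.strip()


sample = "analysisThe user wants a list.assistantfinal**Chicken Soup**"
analysis, final = split_reasoning(sample)
print(analysis)
print(final)
```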

Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
# installed in the virtual env
pip install .
pytest -vv -rP tests/
export OMP_NUM_THREADS=1

CI was freezing at the test-collection step, even without the gpt-oss tests, so I added these environment variables to prevent torch deadlocks and unintended high resource usage.
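
For running the tests locally, the same limit could be applied from Python instead of the shell. The snippet below is a minimal sketch, not part of this PR: `limit_cpu_threads` is a hypothetical helper, and only `OMP_NUM_THREADS` is taken from the workflow lines above. It assumes the helper runs before torch is imported, since the OpenMP runtime typically reads the variable when it initializes.

```python
import os


def limit_cpu_threads(num_threads=1, variables=("OMP_NUM_THREADS",)):
    """Set thread-limit environment variables unless they are already set.

    Hypothetical helper; call it before importing torch so the limit is picked up.
    """
    for name in variables:
        # setdefault keeps any value the caller or CI has already exported
        os.environ.setdefault(name, str(num_threads))
    return {name: os.environ[name] for name in variables}


applied = limit_cpu_threads()
print(applied)
```

Using `setdefault` rather than plain assignment leaves an explicit `export OMP_NUM_THREADS=...` from the CI workflow in control.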

@flaviabeo flaviabeo dismissed ani300’s stale review February 17, 2026 14:34

This model is a MoE, so the work to enable it on AIU will focus on the new torch-spyre stack. For now, we cut a release with CPU/GPU support, with follow-up items still to be implemented.

5 participants
