Releases · foundation-model-stack/foundation-model-stack
v1.7.0
What's Changed
- Fix Llava Next Reset Parameters by @alex-jw-brooks in #504
- Multimodal Mistral by @alex-jw-brooks in #502
- 🐛 Fix vision features attention calculation by @gkumbhat in #507
Full Changelog: v1.6.0...v1.7.0
v1.6.0
What's Changed
- Expansion adapter to allow language model for llava-next by @rzbhatti in #486
- ModelConfig.updated(**kwargs) method to support multimodal config by @rzbhatti in #484
- Fixed the _hf_to_fms_rope by @rzbhatti in #489
- Granite-vision / llava_next fix for AIU: explicit call embedding layer by @sahilsuneja1 in #490
- [doc] MultiHeadAttention: add kvheads to docstring by @yannicks1 in #487
- Fixed paged implementation to include cuda by @JRosenkranz in #481
- Update training scripts to new FMS conventions by @ani300 in #473
- Fix Reloaded Model Config Types for Composite Models by @alex-jw-brooks in #492
- Add Tests for FMS Model Kwarg Correctness by @alex-jw-brooks in #493
- Refactor HF Config -> FMS Model Kwarg Building by @alex-jw-brooks in #494
- Added granite_v4.py to support 8b model by @rzbhatti in #497
- Fix Parallel Config for Nested Models in Llava Next by @alex-jw-brooks in #500
New Contributors
- @yannicks1 made their first contribution in #487
- @alex-jw-brooks made their first contribution in #492
Full Changelog: v1.5.0...v1.6.0
v1.5.0
What's Changed
- Add Roberta for classification to FMS by @ani300 in #466
- Add BERT support to FMS by @ani300 in #467
- updated transformers to 4.55.4 and fixed bug where fms expected cache to be None by @JRosenkranz in #474
- Granite 2b & 3b expand K Q V and Dense to head dim 128 to compile for AIU by @rzbhatti in #472
- Chunked prefill support for paged attention by @ani300 in #463
- head_dim expansion managed at get_model() level by @rzbhatti in #476
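The last entry above moves head_dim expansion behind FMS's get_model() entry point. As a hedged illustration only (the architecture and variant names below are assumptions, not taken from these notes), loading a model through that entry point looks roughly like this:

```python
# Minimal sketch of the get_model() entry point referenced in #476, where
# head_dim expansion is now handled internally rather than by callers.
# The "llama" architecture and "7b" variant are illustrative assumptions;
# check the model registry of your installed fms version for exact names.
from fms.models import get_model

model = get_model(architecture="llama", variant="7b")
```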
Full Changelog: v1.4.0...v1.5.0
v1.4.0
What's Changed
- Add the ability to get the last_n_tokens from forward by @JRosenkranz in #470
Full Changelog: v1.3.0...v1.4.0
v1.3.0
What's Changed
- updated version to 1.2.1 by @JRosenkranz in #458
- mpnet first version by @ppnaik1890 in #432
- fix: n_groups expansion by @garrett361 in #462
- automate gha build and publish workflow by @tedhtchang in #460
- Fix license errors with latest setuptools by @ani300 in #469
- Refactor Llama model to use same structure as all other models by @ani300 in #465
New Contributors
- @ppnaik1890 made their first contribution in #432
- @garrett361 made their first contribution in #462
- @tedhtchang made their first contribution in #460
Full Changelog: v1.2.1...v1.3.0
v1.2.1
What's Changed
- Change how quantized layers are sharded during Tensor Parallel to support FP8 and other cases by @ani300 in #457
Full Changelog: v1.2.0...v1.2.1
v1.2.0
What's Changed
- [utils] add option to pad right on pad_input_ids by @kcirred in #426
- Verification for unused layers to remove warnings by @flaviabeo in #414
- Adding siglip vision model by @sahilsuneja1 in #405
- Initial FP8 linear support for FMS by @ani300 in #415
- Fix granite adapter by @ani300 in #440
- Update __init__.py by @prashantgupta24 in #424
- Add Tie Embeddings in GPTBigCode by @fabianlim in #431
- Tekken tokenizer class added to support the Mistral models by @rzbhatti in #434
- Fix gptq init after FP8 updates by @ani300 in #442
- [tokenizers] deprecation warning by @kcirred in #429
- [typing_extensions] for backwards compatibility by @kcirred in #441
- Devstral small 2505 to fms by @rzbhatti in #427
- Adding llava_next model by @sahilsuneja1 in #420
New Contributors
- @flaviabeo made their first contribution in #414
- @prashantgupta24 made their first contribution in #424
- @fabianlim made their first contribution in #431
Full Changelog: v1.1.0...v1.2.0
v1.1.0
What's Changed
- [tokenizer] enable new methods/attributes for tokenizer that are commonly used by @kcirred in #409
- [hf_adapter] added GenerationMixin for transformers version compatibility by @kcirred in #418
- fix get_signature so that it no longer ignores some optional params by @JRosenkranz in #413
- Enable Granite HF Adapter by @kcirred in #402
- Don't set cache_dir in snapshot_download by @tjohnson31415 in #417
- Delay the eos-based loop break to at least get one timing in by @ani300 in #422
- fixed errors in llama and gpt_bigcode docstring by @Zephyr271828 in #390
- paged attention implementation using attn_kwargs by @JRosenkranz in #411
New Contributors
- @kcirred made their first contribution in #409
- @tjohnson31415 made their first contribution in #417
- @Zephyr271828 made their first contribution in #390
Full Changelog: v1.0.0...v1.1.0
v1.0.0
What's Changed
Bug Fixes
- Fixed bug where weights get downloaded when hf_configured is set as the architecture in get_model by @JRosenkranz in #373
- Fix issue where CICD fails occasionally on a test case by @JRosenkranz in #378
- Fixed mypy failure in CI by @JRosenkranz in #381
- fix some bugs in local/test training by @nairbv in #393
- Fix performance when use_contiguous_cache=True by @ani300 in #389
- Fix extra graph generation during generate with contiguous cache by @ani300 in #397
- fixed bug where expectation tests were being skipped by @JRosenkranz in #410
Changes
- Change HF default checkpoint from bin to safetensors by @ani300 in #377
- Support for Granite GPTQ model weight adaptation from HF by @JRosenkranz in #375
- Roberta Question-Answering by @andrea-fasoli in #379
- Add code of conduct by @spzala in #374
- Select linear layer based on module_name by @andrea-fasoli in #369
- Update dependencies for FMS to a more modern stack by @ani300 in #380
- Bamba Model Support by @JRosenkranz in #372
- Added option to specify model name in the model consistency test suite expectation file path by @JRosenkranz in #383
- Fixed issues inferring Roberta QA as well as added features for encoder-only testing by @JRosenkranz in #386
- Update TP and MoE kernels with modern pytorch constructs by @ani300 in #382
- Add fixes for full dynamic with masks by @ani300 in #387
- Add support for fms_mo-based INT8 Granite by @andrea-fasoli in #391
- Add a carve-out for Bamba SSM layers in contiguous check by @ani300 in #394
- Flexible model inputs in generate by @JRosenkranz in #388
- added optional_params/input_ids option to consistency expectation testing by @JRosenkranz in #398
- mistralai/Mistral-7B-Instruct-v0.3 model ported to fms by @rzbhatti in #395
- Add handling of token_type_ids to RoBERTa by @andrea-fasoli in #399
- Raise exception in case of shape mismatch during ckpt loading by @andrea-fasoli in #400
- Add Rope implementations and corrections for llama 3 and llama 3.1 by @ani300 in #385
- Inject custom attention op into MultiHeadAttention by @JRosenkranz in #408
Full Changelog: v0.0.8...v1.0.0
v0.0.8
What's Changed
- Enable linear layer selection and weight unfusion in RoBERTa by @andrea-fasoli in #358
- Fix Granite 3.0 TP by @ani300 in #360
- GPTQ CPU implementation by @JRosenkranz in #359
- Support for delayed initialization of modules by @ani300 in #365
New Contributors
- @andrea-fasoli made their first contribution in #313
Full Changelog: v0.0.7...v0.0.8