Releases · foundation-model-stack/foundation-model-stack
v1.7.0
What's Changed
- Fix Llava Next Reset Parameters by @alex-jw-brooks in #504
- Multimodal Mistral by @alex-jw-brooks in #502
- 🐛 Fix vision features attention calculation by @gkumbhat in #507
Full Changelog: v1.6.0...v1.7.0
v1.6.0
What's Changed
- Expansion adapter to allow language model for llava-next by @rzbhatti in #486
- ModelConfig.updated(**kwargs) method to support multimodal config by @rzbhatti in #484
- Fixed the _hf_to_fms_rope by @rzbhatti in #489
- Granite-vision / llava_next fix for AIU: explicit call embedding layer by @sahilsuneja1 in #490
- [doc] MultiHeadAttention: add kvheads to docstring by @yannicks1 in #487
- Fixed paged implementation to include cuda by @JRosenkranz in #481
- Update training scripts to new FMS conventions by @ani300 in #473
- Fix Reloaded Model Config Types for Composite Models by @alex-jw-brooks in #492
- Add Tests for FMS Model Kwarg Correctness by @alex-jw-brooks in #493
- Refactor HF Config -> FMS Model Kwarg Building by @alex-jw-brooks in #494
- Added granite_v4.py to support 8b model by @rzbhatti in #497
- Fix Parallel Config for Nested Models in Llava Next by @alex-jw-brooks in #500
New Contributors
- @yannicks1 made their first contribution in #487
- @alex-jw-brooks made their first contribution in #492
Full Changelog: v1.5.0...v1.6.0
v1.5.0
What's Changed
- Add Roberta for classification to FMS by @ani300 in #466
- Add BERT support to FMS by @ani300 in #467
- updated transformers to 4.55.4 and fixed bug where fms expected cache to be None by @JRosenkranz in #474
- Granite 2b & 3b expand K Q V and Dense to head dim 128 to compile for AIU by @rzbhatti in #472
- Chunked prefill support for paged attention by @ani300 in #463
- head_dim expansion managed at get_model() level by @rzbhatti in #476
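The last entry above moves head_dim expansion behind FMS's get_model() entry point. As a hedged illustration only (the architecture and variant names below are assumptions, not taken from these notes), loading a model through that entry point looks roughly like this:

```python
# Minimal sketch of the get_model() entry point referenced in #476, where
# head_dim expansion is now handled internally rather than by callers.
# The "llama" architecture and "7b" variant are illustrative assumptions;
# check the model registry of your installed fms version for exact names.
from fms.models import get_model

model = get_model(architecture="llama", variant="7b")
```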
Full Changelog: v1.4.0...v1.5.0
v1.4.0
What's Changed
- Add the ability to get the last_n_tokens from forward by @JRosenkranz in #470
Full Changelog: v1.3.0...v1.4.0
v1.3.0
What's Changed
- updated version to 1.2.1 by @JRosenkranz in #458
- mpnet first version by @ppnaik1890 in #432
- fix: n_groups expansion by @garrett361 in #462
- automate gha build and publish workflow by @tedhtchang in #460
- Fix license errors with latest setuptools by @ani300 in #469
- Refactor Llama model to use same structure as all other models by @ani300 in #465
New Contributors
- @ppnaik1890 made their first contribution in #432
- @garrett361 made their first contribution in #462
- @tedhtchang made their first contribution in #460
Full Changelog: v1.2.1...v1.3.0
v1.2.1
What's Changed
- Change how quantized layers are sharded during Tensor Parallel to support FP8 and other cases by @ani300 in #457
Full Changelog: v1.2.0...v1.2.1
v1.2.0
What's Changed
- [utils] add option to pad right on pad_input_ids by @kcirred in #426
- Verification for unused layers to remove warnings by @flaviabeo in #414
- Adding siglip vision model by @sahilsuneja1 in #405
- Initial FP8 linear support for FMS by @ani300 in #415
- Fix granite adapter by @ani300 in #440
- Update __init__.py by @prashantgupta24 in #424
- Add Tie Embeddings in GPTBigCode by @fabianlim in #431
- Tekken tokenizer class added to support the Mistral models by @rzbhatti in #434
- Fix gptq init after FP8 updates by @ani300 in #442
- [tokenizers] deprecation warning by @kcirred in #429
- [typing_extensions] for backwards compatibility by @kcirred in #441
- Devstral small 2505 to fms by @rzbhatti in #427
- Adding llava_next model by @sahilsuneja1 in #420
New Contributors
- @flaviabeo made their first contribution in #414
- @prashantgupta24 made their first contribution in #424
- @fabianlim made their first contribution in #431
Full Changelog: v1.1.0...v1.2.0
v1.1.0
What's Changed
- [tokenizer] enable new methods/attributes for tokenizer that are commonly used by @kcirred in #409
- [hf_adapter] added GenerationMixin for transformers version compatibility by @kcirred in #418
- fix get_signature so that it no longer ignores some optional params by @JRosenkranz in #413
- Enable Granite HF Adapter by @kcirred in #402
- Don't set cache_dir in snapshot_download by @tjohnson31415 in #417
- Delay the eos-based loop break to at least get one timing in by @ani300 in #422
- fixed errors in llama and gpt_bigcode docstring by @Zephyr271828 in #390
- paged attention implementation using attn_kwargs by @JRosenkranz in #411
New Contributors
- @kcirred made their first contribution in #409
- @tjohnson31415 made their first contribution in #417
- @Zephyr271828 made their first contribution in #390
Full Changelog: v1.0.0...v1.1.0
v1.0.0
What's Changed
Bug Fixes
- Fixed bug where weights get downloaded when hf_configured is set as the architecture in get_model by @JRosenkranz in #373
- Fix issue where CICD fails occasionally on a test case by @JRosenkranz in #378
- Fixed mypy failure in CI by @JRosenkranz in #381
- fix some bugs in local/test training by @nairbv in #393
- Fix performance when use_contiguous_cache=True by @ani300 in #389
- Fix extra graph generation during generate with contiguous cache by @ani300 in #397
- fixed bug where expectation tests were being skipped by @JRosenkranz in #410
Changes
- Change HF default checkpoint from bin to safetensors by @ani300 in #377
- Support for Granite GPTQ model weight adaptation from HF by @JRosenkranz in #375
- Roberta Question-Answering by @andrea-fasoli in #379
- Add code of conduct by @spzala in #374
- Select linear layer based on module_name by @andrea-fasoli in #369
- Update dependencies for FMS to a more modern stack by @ani300 in #380
- Bamba Model Support by @JRosenkranz in #372
- Added option to specify model name in the model consistency test suite expectation file path by @JRosenkranz in #383
- Fixed issues inferring Roberta QA as well as added features for encoder-only testing by @JRosenkranz in #386
- Update TP and MoE kernels with modern pytorch constructs by @ani300 in #382
- Add fixes for full dynamic with masks by @ani300 in #387
- Add support for fms_mo-based INT8 Granite by @andrea-fasoli in #391
- Add a carve-out for Bamba SSM layers in contiguous check by @ani300 in #394
- Flexible model inputs in generate by @JRosenkranz in #388
- added optional_params/input_ids option to consistency expectation testing by @JRosenkranz in #398
- mistralai/Mistral-7B-Instruct-v0.3 model ported to fms by @rzbhatti in #395
- Add handling of token_type_ids to RoBERTa by @andrea-fasoli in #399
- Raise exception in case of shape mismatch during ckpt loading by @andrea-fasoli in #400
- Add Rope implementations and corrections for llama 3 and llama 3.1 by @ani300 in #385
- Inject custom attention op into MultiHeadAttention by @JRosenkranz in #408
Full Changelog: v0.0.8...v1.0.0
v0.0.8
What's Changed
- Enable linear layer selection and weight unfusion in RoBERTa by @andrea-fasoli in #358
- Fix Granite 3.0 TP by @ani300 in #360
- GPTQ CPU implementation by @JRosenkranz in #359
- Support for delayed initialization of modules by @ani300 in #365
New Contributors
- @andrea-fasoli made their first contribution in #313
Full Changelog: v0.0.7...v0.0.8