Pull requests: NVIDIA-NeMo/Automodel


fix(ci): reduce mixtral release smoke batch (label: r0.5.0, "Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.")
#2728 opened Jun 23, 2026 by akoumpa Contributor
3 tasks done
fix(qwen3_5): handle packed MTP attention (label: r0.5.0)
#2727 opened Jun 23, 2026 by akoumpa Contributor
cp: DeciLM Nemotron TP plan (#2703) to r0.5.0
#2726 opened Jun 23, 2026 by akoumpa Contributor
fix(gemma4): avoid DynamicCache OOM on dense E2B/E4B via kv-share holder
#2725 opened Jun 22, 2026 by athitten Contributor
1 of 3 tasks
fix(qwen3_moe): step-0 NaN in MXFP8 packed finetune — expert unload + fused RoPE (label: r0.5.0)
#2722 opened Jun 22, 2026 by hemildesai Contributor
docs: update container references to 26.06 and fix mount instructions (label: r0.5.0)
#2716 opened Jun 22, 2026 by adil-a Collaborator
test: add per-test timeout via pytest-timeout
#2710 opened Jun 22, 2026 by akoumpa Contributor
2 tasks
ci: address diffusers cve (label: r0.5.0)
#2706 opened Jun 22, 2026 by thomasdhc Contributor
3 tasks
fix(loss): avoid pkg_resources in linear CE (label: r0.5.0)
#2700 opened Jun 22, 2026 by akoumpa Contributor
3 tasks done
fix(deepseek_v4): support packed THD document bounds
#2696 opened Jun 21, 2026 by akoumpa Contributor Draft
feat(glm_moe_dsa): add GLM5.2 context parallel support
#2695 opened Jun 21, 2026 by HuiyingLi Contributor
feat(glm_moe_dsa): add TileLang DSA kernels
#2691 opened Jun 21, 2026 by HuiyingLi Contributor
fix(vlm): guard validation forward against cuDNN fused-MHA SDPA backend (label: r0.5.0)
#2659 opened Jun 20, 2026 by akoumpa Contributor Draft
fix(deepseek-v4): avoid bf16 -inf overflow in additive attention mask (label: r0.5.0)
#2658 opened Jun 20, 2026 by akoumpa Contributor
fix(qwen3_5): route dense MTP through SDPA + block-causal mask for pack (label: r0.5.0)
#2656 opened Jun 20, 2026 by akoumpa Contributor
fix(deepseek-v4): restore batch axis for packed-sequence (THD) forward (label: r0.5.0)
#2651 opened Jun 20, 2026 by akoumpa Contributor
ci: Update transformers to latest version 5.12.1
#2632 opened Jun 18, 2026 by svcnvidia-nemo-ci Contributor
fix(checkpoint): write consolidated safetensors without append (labels: community-request; waiting-on-customer, "Waiting on the original author to respond")
#2627 opened Jun 18, 2026 by huahuajhu
3 tasks done
feat(magi): honor AttnMaskSpec on the HF attention backend
#2622 opened Jun 17, 2026 by HuiyingLi Contributor
fix(loss): support THD/packed layout in FusedLinearCrossEntropy (label: r0.5.0)
#2615 opened Jun 17, 2026 by akoumpa Contributor
fix(models): audit fp32 protected tensors (label: r0.5.0)
#2598 opened Jun 16, 2026 by yuhezhang-ai Contributor