Pull requests: microsoft/onnxruntime-genai

Pull requests list

Add CUDA multimodal debugging skill
#2129 opened May 6, 2026 by justinchuby Contributor
[Qwen3.5] Use LpNormalization for L2-norm in linear-attention Q/K
#2127 opened May 6, 2026 by xiaofeihan1 Contributor
3 tasks done
Add JSON DOM and RFC 7386 merge_patch to src/json
#2125 opened May 6, 2026 by xiaoyu-work Contributor
Add granitemode support
#2124 opened May 6, 2026 by amdrajeevp1 Contributor
4 tasks
Cohere Transcribe Support
#2112 opened May 2, 2026 by nenad1002 Contributor
Fix AppendNextTokensToSequences heap overflow
#2111 opened Apr 30, 2026 by apsonawane Contributor
Fix heap overflow issue
#2110 opened Apr 30, 2026 by apsonawane Contributor
Add linux-aarch64 support
#2107 opened Apr 29, 2026 by baijumeswani Collaborator Draft
Update generic example scripts for plugin QNN EP
#2104 opened Apr 27, 2026 by qti-kromero
[Qwen3.5] dedup position ids
#2102 opened Apr 27, 2026 by daijh Contributor
Enable graph capture for WebGPU models and DML continuous decoding tests
#2099 opened Apr 24, 2026 by qjia7 Contributor
Add AMDGPU execution provider support
#2093 opened Apr 20, 2026 by aditya-dl
WIP: TurboQuant for ORT WebGPU
#2084 opened Apr 14, 2026 by sushraja-msft Contributor Draft
extend modelbuilder to build Olmo3, SmolLM3 and other models
#2078 opened Apr 10, 2026 by xadupre Member