Pull requests: cactus-compute/cactus

Pull requests list

Karen/tq v2 debugging
#630 opened May 6, 2026 by kar-m Collaborator
Grammar API
#628 opened May 5, 2026 by mhayes853 Contributor
Cross-framework INT8 op benchmark suite
#626 opened May 4, 2026 by ncylich Collaborator
3 of 4 tasks
Fused dense CQ4 MLP op for Gemma-4 decode
#623 opened May 4, 2026 by ncylich Collaborator
3 tasks done
TurboQuant KV cache port to v2
#620 opened May 3, 2026 by jrajala6 Contributor
apple: ship dSYMs and preserve macOS framework structure
#614 opened Apr 28, 2026 by powerworr
5 tasks
pytorch capture
#613 opened Apr 27, 2026 by cattermelon1234 Contributor
Bitmask
#609 opened Apr 24, 2026 by mhayes853 Contributor
Karen/opf
#607 opened Apr 23, 2026 by kar-m Collaborator
Add Gemma 4 pruning blog post
#606 opened Apr 23, 2026 by ncylich Collaborator
Apple GPU Support
#604 opened Apr 22, 2026 by justinl66 Member Draft
Karen/tq
#603 opened Apr 21, 2026 by kar-m Collaborator
Split native LLM ownership into Model and Context
#602 opened Apr 21, 2026 by aarnav-11
mlx added
#587 opened Apr 15, 2026 by kar-m Collaborator Draft
fix gemma4 audio/vision crash when NPU falls back to CPU
#586 opened Apr 15, 2026 by ncylich Collaborator
4 tasks done
Gemma sp tokenizer
#583 opened Apr 15, 2026 by aarnav-11
Turboquant attention kernel
#573 opened Apr 13, 2026 by jrajala6 Contributor
Follow-up: consolidate sampling APIs after #560
#569 opened Apr 10, 2026 by DuFanYin Contributor
Qualcomm NPU Support
#563 opened Apr 7, 2026 by justinl66 Member Draft
Stateful chunked TDT streaming transcription
#552 opened Apr 5, 2026 by rshemet Collaborator
3 of 4 tasks
Add IBM Granite 3.3 model support
#541 opened Mar 31, 2026 by vyomshah05 Contributor
Diarization
#537 opened Mar 26, 2026 by ParkiratS Collaborator Draft
Per-layer KV heads, attention logit capping, MoE per-expert scales, NPU multi-input
#526 opened Mar 19, 2026 by ncylich Collaborator
4 tasks done
Accelerate MatMul FP16 for Apple GPUs
#523 opened Mar 17, 2026 by aarav18 Contributor
reverting attn exp calculations to before 3n
#511 opened Mar 9, 2026 by ncylich Collaborator