Wrappers

The packages in the wrappers/ directory are MATE's compatibility layer for running CUDA-oriented software on MUSA. They preserve the familiar package names and high-level APIs while routing execution to MATE operators and kernels, so existing integrations can migrate to Moore Threads platforms with minimal code changes.

Wrappers are the default integration path when your framework already targets a supported CUDA-oriented Python package. Install MATE first, then choose the wrapper that matches your upstream import path.

How the wrappers work

Each wrapper keeps the upstream package's public Python surface (package name, import path, and high-level API) stable, and routes the supported execution paths to MATE-backed implementations on MUSA.

Key mechanisms

API mapping: Maps upstream-style calls to MATE operator paths.
Namespace preservation: Preserves expected package names and import paths.
Kernel routing: Runs calls on MATE-optimized operators and MUSA kernels.
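
The three mechanisms above can be illustrated with a toy shim. This is a conceptual sketch only, not MATE's implementation: the module name `upstream_attn_demo`, the function `attn_func`, and the stand-in backend `backend_attention` are all hypothetical, and the real wrappers are ordinary installed packages rather than modules built at runtime.

```python
import sys
import types


def backend_attention(q, k, v):
    """Hypothetical stand-in for a backend operator (not a real MATE call)."""
    return ("backend", q, k, v)


def make_shim(upstream_name, api_map):
    """Build a module that exposes upstream-style names backed by other callables.

    upstream_name: the import path callers already use (namespace preservation).
    api_map: upstream function name -> backend callable (API mapping).
    """
    module = types.ModuleType(upstream_name)
    for name, impl in api_map.items():
        setattr(module, name, impl)
    # Registering the module makes `import upstream_name` resolve to the shim.
    sys.modules[upstream_name] = module
    return module


make_shim("upstream_attn_demo", {"attn_func": backend_attention})

# Caller code is unchanged: same import path, same function name.
import upstream_attn_demo

# The call is routed to the backend callable (kernel routing, in miniature).
result = upstream_attn_demo.attn_func(1, 2, 3)
```

The point of the sketch is that the caller's import line and call site stay the same while the implementation behind them changes.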

Why use wrappers

Lower migration overhead: Minimizes code changes and avoids separate hardware-specific code paths.
Faster integration: Accelerates deployment of common tools and libraries on Moore Threads GPUs.

Wrapper support at a glance

Each wrapper package listed below has its own documentation page.

| Wrapper package | Import path | Best fit | Current scope |
| --- | --- | --- | --- |
| flash_attn_3 | `flash_attn_interface` | FlashAttention-3 style APIs | Dense FMHA, varlen FMHA, KV-cache attention, scheduler metadata |
| sageattention | `sageattention` | SageAttention style APIs | Dense SageAttention-compatible path |
| flash_mla | `flash_mla` | FlashMLA style APIs | MLA metadata, decode, sparse prefill |
| flash_kda | `flash_kda` | FlashKDA style APIs | KDA forward, workspace-size compatibility helper |
| deep-gemm | `deep_gemm` | DeepGEMM style APIs | Grouped GEMM, dense GEMM, prenorm GEMM, MQA logits |
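
To match a wrapper to the import your framework already uses, the table above can be encoded as a small lookup. The sketch below uses only the Python standard library. The helper names `wrapper_for_import` and `available_wrappers` are hypothetical and not part of MATE. Note that a resolvable import path only shows that some package provides that module; it does not confirm that the MATE wrapper, rather than the upstream CUDA package, is the one installed.

```python
from importlib.util import find_spec

# Wrapper package -> import path, copied from the table above.
WRAPPER_IMPORT_PATHS = {
    "flash_attn_3": "flash_attn_interface",
    "sageattention": "sageattention",
    "flash_mla": "flash_mla",
    "flash_kda": "flash_kda",
    "deep-gemm": "deep_gemm",
}


def wrapper_for_import(module_name, import_paths=WRAPPER_IMPORT_PATHS):
    """Return the wrapper package that provides module_name, or None."""
    for package, import_path in import_paths.items():
        if import_path == module_name:
            return package
    return None


def available_wrappers(import_paths=WRAPPER_IMPORT_PATHS):
    """Return wrapper packages whose import path resolves in this environment.

    find_spec locates a top-level module without importing it.
    """
    return sorted(
        package
        for package, import_path in import_paths.items()
        if find_spec(import_path) is not None
    )


if __name__ == "__main__":
    # Example: a framework that does `import flash_attn_interface`.
    print(wrapper_for_import("flash_attn_interface"))
    print(available_wrappers())
```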