Environment Variables
MATE reads configuration from the current process environment. Set variables before launching the Python process or CLI command that should observe them.
Use mate env to print the MATE-related variables visible to the current
shell.
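Because the variables are read from the process environment, one way to scope them to a single run is to hand a modified copy of the environment to a child process. The sketch below uses only the Python standard library. The helper name run_with_env is hypothetical, and the child command only echoes a variable back instead of importing MATE.

```python
import os
import subprocess
import sys


def run_with_env(cmd, overrides):
    """Run cmd in a child process whose environment includes overrides."""
    env = os.environ.copy()   # start from the parent's environment
    env.update(overrides)     # apply the overrides for the child only
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


# Stand-in for a real MATE workload: the child prints the value it observes.
child = [sys.executable, "-c", "import os; print(os.environ.get('MATE_DISABLE_JIT'))"]
result = run_with_env(child, {"MATE_DISABLE_JIT": "1"})
print(result.stdout.strip())
```

The parent's own environment is left untouched, so the setting applies only to the launched process.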
API Logging and Dumps
The logging and dumping configuration is read when mate.api_logging is
imported.
| Variable | Default | Meaning |
|---|---|---|
|  |  | API logging level: |
|  |  | Log destination: |
|  |  | Root directory for Level 10 tensor dumps |
|  |  | Maximum total dump size in GB per process |
|  |  | Maximum number of dumped calls per process |
|  |  | Save dumps as |
|  | empty | Comma-separated |
|  | empty | Comma-separated |
Log-Level Meanings
- 0: Disabled.
- 1: Function names only.
- 3: Function names plus structured inputs and outputs, including tensor VA ranges.
- 5: Level 3 plus tensor statistics.
- 10: Level 5 plus on-disk tensor dumping for replay.
Note
- MATE_LOGDEST supports %i in file paths. MATE replaces it with the current process ID.
- Dumps produced with MATE_DUMP_SAFETENSORS=1 do not preserve original stride or non-contiguous layout information. Use the default torch.save format when replay must preserve strides.
- Level 10 logging writes full API inputs and outputs to disk. Do not enable it for sensitive workloads unless the dump directory is appropriately protected.
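To see what the %i substitution means for a multi-process job, the following minimal sketch mimics the documented behavior in plain Python. It is an illustration, not MATE's implementation: the function name expand_logdest is hypothetical, and replacing every occurrence of %i (not only the first) is an assumption.

```python
import os


def expand_logdest(path, pid=None):
    """Replace %i in path with a process ID (defaults to this process)."""
    if pid is None:
        pid = os.getpid()
    # Assumption: every occurrence of %i is substituted.
    return path.replace("%i", str(pid))


print(expand_logdest("/tmp/mate_api_%i.log", pid=4242))  # /tmp/mate_api_4242.log
```

With a pattern like this, each process writes to its own file, so concurrent ranks do not interleave their log output.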
JIT, AOT, and Cache
| Variable | Default | Meaning |
|---|---|---|
|  | auto-detect visible devices | MUSA architecture list used by JIT/AOT workflows; accepts space-separated |
|  | home directory | Base directory for the MATE cache workspace. |
|  |  | Disable runtime JIT and require matching AOT modules. |
|  |  | Show verbose ninja output for runtime JIT builds. |
Runtime loading prefers a matching AOT library when one exists.
MATE_DISABLE_JIT=1 switches to AOT-only behavior and raises an error if no
matching AOT module exists. MATE does not provide an environment variable to
bypass AOT and force runtime JIT when a matching AOT module is present.
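The loading preference described above can be summarized as a small decision function. This is a sketch of the documented rules only: the function name, the string return values, and the use of RuntimeError are assumptions made for illustration and are not taken from MATE's source.

```python
def select_module_source(aot_available, disable_jit):
    """Return 'aot' or 'jit' following the documented loading preference."""
    if aot_available:
        return "aot"   # a matching AOT library is always preferred
    if disable_jit:
        # AOT-only mode without a matching AOT module is an error.
        raise RuntimeError("MATE_DISABLE_JIT=1 and no matching AOT module")
    return "jit"       # otherwise fall back to a runtime JIT build


for aot, no_jit in [(True, False), (True, True), (False, False)]:
    print(aot, no_jit, select_module_source(aot, no_jit))
```

Note that no combination of inputs returns "jit" while a matching AOT module is available, which mirrors the absence of a force-JIT variable.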
Set MATE_MUSA_ARCH_LIST explicitly for offline diagnostics when no MUSA device
is visible:
MATE_MUSA_ARCH_LIST=3.1 mate module-status
Runtime Wrapper Controls
| Variable | Default | Meaning |
|---|---|---|
|  |  | Override the M-axis padding alignment returned by DeepGEMM wrapper |
Use MATE_DEEPGEMM_MK_ALIGNMENT when DeepGEMM-compatible contiguous grouped
GEMM call sites need a non-default M-axis alignment. Callers should pad each
expert segment and build m_indices using the same value returned by
get_mk_alignment_for_contiguous_layout(). Set this variable before starting
the Python process; the helper reads and caches the value on first use.
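The padding step can be sketched in plain Python as follows. In real code the alignment would come from get_mk_alignment_for_contiguous_layout(); here it is passed as an ordinary argument so the example runs without MATE. The helper name pad_segments and the use of -1 to mark padding rows in m_indices are assumptions for illustration.

```python
def pad_segments(segment_rows, alignment, pad_index=-1):
    """Pad each expert segment to a multiple of alignment and build m_indices."""
    padded_rows = []
    m_indices = []
    for expert, rows in enumerate(segment_rows):
        padded = -(-rows // alignment) * alignment   # round up to a multiple
        padded_rows.append(padded)
        m_indices.extend([expert] * rows)            # real rows map to their expert
        m_indices.extend([pad_index] * (padded - rows))  # assumed padding marker
    return padded_rows, m_indices


padded, m_indices = pad_segments([3, 5], alignment=4)
print(padded)      # [4, 8]
print(m_indices)   # [0, 0, 0, -1, 1, 1, 1, 1, 1, -1, -1, -1]
```

Using one alignment value for both the padding and the index construction keeps every expert segment starting on an aligned row boundary.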
Compiler and Build Controls
| Variable | Default | Meaning |
|---|---|---|
|  | empty | Extra host compiler flags for JIT builds. |
|  | empty | Extra |
|  | empty | Extra linker flags for JIT builds. |
|  | auto-detected | Override the |
MATE JIT builds also honor common build-tool variables such as CXX for the
host C++ compiler and MAX_JOBS for ninja parallelism.
Guard Allocator Debugging
| Variable | Default | Meaning |
|---|---|---|
|  | unset | Internal bootstrap flag used by |
|  | unset | Internal guard allocator mode for bootstrap and replay |
|  | unset | Internal flag controlling device sync before guarded frees |
|  | unset | Internal flag controlling guarded alloc/free logging |
The MATE_GUARD_ALLOCATOR_* variables are internal bootstrap details and are
usually set by MATE itself when preparing guarded child processes. Use
mate guard-run or the Python mate.memory_debug.install_guard_allocator()
API for normal guard allocator debugging. The guard allocator is host-only C++
and does not need MATE_MUSA_ARCH_LIST; that variable only controls MATE
JIT/AOT modules that compile MUSA kernels.
Diagnostic and Test-Only Variables
| Variable | Default | Meaning |
|---|---|---|
|  |  | Enable the guarded allocator by default for MATE repository pytest runs |
|  |  | Default MATE repository pytest guard allocator mode: |
|  |  | Log guarded alloc/free events during MATE repository pytest runs |
|  |  | Internal test/diagnostic mode used by MATE tests; not intended as a normal user runtime setting |
|  |  | Number of shards to dispatch all the tests to |
|  |  | Index of current shard |
|  |  | Shard dispatch mode. Currently, only |
When MATE_PYTEST_SHARD_TOTAL > 1, test_fmha.py tests will dominate the last shard.
The MATE_PYTEST_GUARD_* variables are consumed by MATE’s in-repository
pytest conftest.py; they are not installed as a general pytest plugin for
external projects. Prefer mate guard-run for user-facing guarded allocator
debug sessions.