Testing SIMD Dispatch

Every incant! dispatch and if let Some(token) = summon() branch creates a fallback path. You can test all of them on your native hardware -- no cross-compilation needed.

Exhaustive Permutation Testing

for_each_token_permutation runs your closure once for every unique combination of token tiers, from "all SIMD enabled" down to "scalar only". It handles the disable/re-enable lifecycle, mutex serialization, cascade logic, and deduplication.

use archmage::testing::{for_each_token_permutation, CompileTimePolicy};

#[test]
fn sum_squares_matches_across_tiers() {
    let data: Vec<f32> = (0..1024).map(|i| i as f32).collect();
    let expected: f32 = data.iter().map(|x| x * x).sum();

    let report = for_each_token_permutation(CompileTimePolicy::Warn, |perm| {
        let result = sum_squares(&data);
        assert!(
            (result - expected).abs() < 1e-1,
            "mismatch at tier: {perm}"
        );
    });

    assert!(report.permutations_run >= 2, "expected multiple tiers");
}

On an AVX-512 machine, this runs 5-7 permutations (all enabled, AVX-512 only, AVX2+FMA, SSE4.2, scalar). On a Haswell-era CPU without AVX-512, 3 permutations. Tokens the CPU doesn't have are skipped -- they'd produce duplicate states.

Concurrency and `lock_token_testing`

Token disabling is process-wide -- dangerously_disable_token_process_wide affects all threads. Both for_each_token_permutation and lock_token_testing use the same internal mutex to serialize token manipulation, so parallel tests won't interfere with each other.

for_each_token_permutation acquires this lock automatically. Use lock_token_testing() when you need to call dangerously_disable_token_process_wide manually, or when you need stable summon() results while permutation tests may be running in parallel:

use archmage::testing::lock_token_testing;
use archmage::{X64V3Token, SimdToken};

#[test]
fn scalar_fallback_matches_simd() {
    let _lock = lock_token_testing();

    let data = vec![1.0f32; 1024];
    let simd_result = sum_squares(&data);

    X64V3Token::dangerously_disable_token_process_wide(true).unwrap();
    let scalar_result = sum_squares(&data);
    X64V3Token::dangerously_disable_token_process_wide(false).unwrap();

    assert!((simd_result - scalar_result).abs() < 1e-3);
}

for_each_token_permutation is reentrant -- if called while the current thread holds lock_token_testing, it skips re-acquiring the lock. This lets you combine manual state observation with exhaustive permutation testing:

use archmage::testing::{lock_token_testing, for_each_token_permutation, CompileTimePolicy};
use archmage::{X64V2Token, SimdToken};

#[test]
fn tokens_restored_after_permutations() {
    let _lock = lock_token_testing();
    let v2_available = X64V2Token::summon().is_some();

    let _report = for_each_token_permutation(CompileTimePolicy::Warn, |_perm| {
        // test body
    });

    // State is stable -- no other thread can have disabled V2
    assert_eq!(X64V2Token::summon().is_some(), v2_available);
}

Without lock_token_testing, the summon() calls outside for_each_token_permutation could observe tokens disabled by a parallel test.

`CompileTimePolicy` and `-Ctarget-cpu`

If you compiled with -Ctarget-cpu=native, the compiler bakes feature detection into the binary. summon() returns Some unconditionally, and tokens can't be disabled at runtime -- the runtime check was compiled out.

The CompileTimePolicy enum controls what happens when for_each_token_permutation encounters these undisableable tokens:

Warn -- Exclude the token from permutations silently. Warnings are collected in the report.
WarnStderr -- Same, but also prints each warning to stderr with actionable fix instructions.
Fail -- Panic with the exact compiler flags needed to fix it.

For full coverage in CI, use the testable_dispatch feature. This makes compiled_with() return None even when features are baked in, so summon() uses runtime detection and tokens can be disabled:

# In your CI test configuration
[dev-dependencies]
archmage = { version = "0.9", features = ["testable_dispatch"] }

Enforcing Full Coverage via Env Var

Wire an environment variable to switch between Warn in local development and Fail in CI:

use archmage::testing::{for_each_token_permutation, CompileTimePolicy};

fn permutation_policy() -> CompileTimePolicy {
    if std::env::var_os("ARCHMAGE_FULL_PERMUTATIONS").is_some() {
        CompileTimePolicy::Fail
    } else {
        CompileTimePolicy::WarnStderr
    }
}

#[test]
fn my_dispatch_works_at_all_tiers() {
    let report = for_each_token_permutation(permutation_policy(), |perm| {
        let result = my_simd_function(&data);
        assert_eq!(result, expected, "failed at: {perm}");
    });
    eprintln!("{report}");
}

Then in CI (with testable_dispatch enabled):

ARCHMAGE_FULL_PERMUTATIONS=1 cargo test

If a token is still compile-time guaranteed (you forgot the feature or have stale RUSTFLAGS), Fail panics with the exact flags to fix it:

x86-64-v3: compile-time guaranteed, excluded from permutations. To include it, either:
  1. Add `testable_dispatch` to archmage features in Cargo.toml
  2. Remove `-Ctarget-cpu` from RUSTFLAGS
  3. Compile with RUSTFLAGS="-Ctarget-feature=-avx2,-fma,-bmi1,-bmi2,-f16c,-lzcnt"

Manual Single-Token Disable

For targeted tests that only need to disable one token, use lock_token_testing to serialize against parallel tests:

use archmage::testing::lock_token_testing;
use archmage::{X64V3Token, SimdToken};

#[test]
fn scalar_fallback_matches_simd() {
    let _lock = lock_token_testing();
    let data = vec![1.0f32; 1024];
    let simd_result = sum_squares(&data);

    // Disable AVX2+FMA -- summon() returns None until re-enabled
    X64V3Token::dangerously_disable_token_process_wide(true).unwrap();
    let scalar_result = sum_squares(&data);
    X64V3Token::dangerously_disable_token_process_wide(false).unwrap();

    assert!((simd_result - scalar_result).abs() < 1e-3);
}

Disabling cascades downward: disabling V2 also disables V3/V4/V4x/Fp16; disabling NEON also disables Aes/Sha3/Crc/Arm64V2/Arm64V3. dangerously_disable_tokens_except_wasm(true) disables everything at once.

Found an error or it needs a clarification? Open an issue on GitHub.

Substantiated corrections will be incorporated with attribution.

Found a typo? Fork, modify and open a PR.

Testing SIMD Dispatch

Exhaustive Permutation Testing

Concurrency and lock_token_testing

CompileTimePolicy and -Ctarget-cpu

Enforcing Full Coverage via Env Var

Manual Single-Token Disable

ON THIS PAGE

Concurrency and `lock_token_testing`

`CompileTimePolicy` and `-Ctarget-cpu`