mlx-snn: Spiking Neural Networks on Apple Silicon via MLX

This analysis delves into mlx-snn, the pioneering spiking neural network (SNN) library designed natively for Apple's MLX framework. It enables efficient SNN research on Apple Silicon hardware, offering significant performance and memory advantages over existing PyTorch-based libraries.

Key Performance Indicators & Strategic Advantages

mlx-snn delivers crucial advantages for SNN development on Apple Silicon, validated through performance metrics and innovative design.

97.28% Peak MNIST Accuracy
Up to 2.5× Faster Training
Up to 10× Lower GPU Memory

Deep Analysis & Enterprise Applications

1 Introduction

We introduce mlx-snn, the first spiking neural network (SNN) library built natively on Apple's MLX framework. As SNN research grows rapidly, all major libraries—snnTorch, Norse, SpikingJelly, Lava—target PyTorch or custom backends, leaving Apple Silicon users without a native option. mlx-snn provides six neuron models, four surrogate gradient functions, four spike encoding methods, and a complete backpropagation-through-time training pipeline.

SNNs, often called the “third generation” of neural networks, process information through discrete spike events rather than continuous activations. This biologically inspired paradigm offers unique advantages: temporal coding of information, event-driven computation with potential energy savings, and natural compatibility with neuromorphic hardware. Recent advances in surrogate gradient learning have made SNNs trainable with gradient descent, closing the accuracy gap with conventional deep learning on many tasks.

2 Related Work

Several software libraries have emerged to support SNN research. snnTorch provides a PyTorch-based framework. Norse emphasizes functional programming. SpikingJelly offers high-performance CUDA kernels. Lava targets Intel's Loihi. All these libraries depend on PyTorch or custom backends, leaving Apple Silicon users without a native SNN framework.

Apple's MLX is an array computation framework designed specifically for Apple Silicon. Its unified memory architecture, lazy evaluation, and composable function transforms enable efficient computation graph optimization. Prior to mlx-snn, no SNN library existed for MLX.

Feature Comparison of SNN Software Libraries

Feature                mlx-snn   snnTorch   Norse     SpikingJelly   Lava
Backend                MLX       PyTorch    PyTorch   PyTorch        Custom
Neuron models          6         6          8         10+            4
Surrogate gradients    4         5          3         4              –
Spike encoding         4         4          2         3              –
Apple Silicon native   ✓         –          –         –              –
CUDA acceleration      –         ✓          ✓         ✓              –
Neuromorphic HW        –         –          –         ✓              ✓
Functional API         ✓         –          ✓         –              ✓
Learnable dynamics     ✓         ✓          –         –              –
Medical encoders       ✓         –          –         –              –

3 Architecture and Design

3.1 Design Principles

mlx-snn follows four core design principles:

  • MLX-native: All tensor operations use mlx.core. NumPy is used only for data I/O.
  • Explicit state: Neuron state is passed as a Python dictionary, making the library compatible with MLX's functional transforms and mx.compile.
  • snnTorch-compatible API: Class names, constructor arguments, and forward-pass signatures mirror snnTorch wherever possible.
  • Research-first: Every component can be subclassed, overridden, or composed.
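The explicit-state principle can be sketched in plain Python. This is a toy scalar neuron, not the library's mlx.core implementation; the function names and state-dict keys are illustrative:

```python
# Toy illustration of the explicit-state pattern: neuron state lives in
# a plain dict that the caller threads through each call, so the step
# function stays pure (no hidden mutation). Purity is what makes the
# design compatible with functional transforms and compilation.
def init_state():
    return {"mem": 0.0}

def step(x, state, beta=0.9, v_thr=1.0):
    mem = beta * state["mem"] + x          # leaky integration
    spk = 1.0 if mem >= v_thr else 0.0     # threshold crossing
    mem = mem - spk * v_thr                # subtract-reset on spike
    return spk, {"mem": mem}               # return a NEW state dict

state = init_state()
spk, state = step(0.5, state)              # spk == 0.0, mem == 0.5
```

Because each call returns a fresh state dict rather than mutating internal attributes, the same neuron object can be reused across independent sequences without hidden coupling.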

Listing 1 illustrates the API similarity between snnTorch and mlx-snn:

# --- snnTorch ---
import snntorch as snn
lif = snn.Leaky(beta=0.9)
mem = lif.init_leaky()
spk, mem = lif(x, mem)

# --- mlx-snn ---
import mlxsnn
lif = mlxsnn.Leaky(beta=0.9)
state = lif.init_state(B, F)
spk, state = lif(x, state)

3.2 Neuron Models

All neuron models inherit from SpikingNeuron, an abstract mlx.nn.Module subclass that provides the fire() and reset() methods. The fire() method applies the surrogate gradient function to the difference between membrane potential and threshold. The reset() method supports three mechanisms: subtract (subtract threshold), zero (reset to zero), and none (no reset, used for output layers).

  • Leaky Integrate-and-Fire (LIF): U[t + 1] = β · U[t] + X[t + 1] – S[t] · Vthr
  • Integrate-and-Fire (IF): A non-leaky variant with β = 1.
  • Izhikevich: Two-dimensional model with characteristic quadratic rise and recovery variable interaction.
  • Adaptive LIF (ALIF): Extends LIF with spike-frequency adaptation mechanism.
  • Synaptic: Two-state model with explicit synaptic current filtering.
  • Alpha: Dual-exponential synapse model for rise-then-decay post-synaptic current profile.
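As a sanity check on the LIF recurrence above, a few steps can be traced by hand in plain Python (constant input, subtract reset; this mirrors the equation, not the library's array implementation):

```python
# Trace U[t+1] = beta*U[t] + X[t+1] - S[t]*Vthr under constant input.
# With beta=0.9, x=0.3, Vthr=1.0 the membrane climbs 0.3, 0.57, 0.813
# and first crosses threshold on the fourth step, then repeats.
beta, v_thr, x = 0.9, 1.0, 0.3
mem, spikes = 0.0, []
for t in range(10):
    mem = beta * mem + x               # decay and integrate
    spk = 1.0 if mem >= v_thr else 0.0 # fire on threshold crossing
    mem -= spk * v_thr                 # subtract-reset on spike
    spikes.append(spk)
# spikes -> [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
```

Setting beta = 1 in the same loop yields the IF model's linear accumulation.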

3.3 Surrogate Gradients

Spike generation uses the Heaviside step function, which has zero gradient. Surrogate gradient methods replace the backward pass with a smooth approximation. MLX's current mx.custom_function API has shape inconsistencies, so mlx-snn adopts a straight-through estimator (STE) pattern using mx.stop_gradient:

output = stop_gradient(Θ(x) − σ(x)) + σ(x)

This evaluates to Θ(x) in the forward pass, while gradients flow only through the smooth surrogate σ(x) in the backward pass. Supported surrogate functions include fast sigmoid, arctan, straight-through, and custom user-defined functions.
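The straight-through pattern can be checked numerically in plain Python: stop-gradient is the identity in the forward pass, so the expression reduces exactly to the Heaviside output. The sigmoid form and scale k below are illustrative stand-ins for the library's surrogate:

```python
import math

def heaviside(x):
    return 1.0 if x >= 0.0 else 0.0

def sigma(x, k=25.0):
    # smooth sigmoid-like surrogate, used only for its gradient
    return 1.0 / (1.0 + math.exp(-k * x))

def ste_forward(x):
    # stop_gradient behaves as the identity in the forward pass, so the
    # surrogate cancels and the result equals the Heaviside step:
    stop_grad = lambda v: v            # forward-pass behavior only
    return stop_grad(heaviside(x) - sigma(x)) + sigma(x)

vals = [ste_forward(x) for x in (-0.5, -0.01, 0.0, 0.2)]
# vals matches [heaviside(x) for each x], i.e. [0.0, 0.0, 1.0, 1.0]
```

In the backward pass an autodiff engine treats the stop-gradient term as a constant, so only sigma contributes to the derivative.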

3.4 Spike Encoding

mlx-snn provides four spike encoding methods:

  • Rate coding: Poisson spike generation.
  • Latency coding: Time-to-first-spike encoding.
  • Delta modulation: Change-based encoding for temporal signals.
  • EEG encoder: Medical-signal-specific encoder for multi-channel EEG data.
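Rate coding is the simplest of the four. A minimal pure-Python sketch of the idea (Bernoulli sampling per timestep; the library's rate_encode operates on MLX arrays, and the function name and signature here are illustrative):

```python
import random

def rate_encode(values, num_steps, seed=0):
    # Bernoulli rate coding: at each timestep a unit emits a spike with
    # probability equal to its normalized intensity in [0, 1], so the
    # empirical firing rate approximates the input value.
    rng = random.Random(seed)
    return [[1.0 if rng.random() < v else 0.0 for v in values]
            for _ in range(num_steps)]

spikes = rate_encode([0.0, 0.5, 1.0], num_steps=200)
rates = [sum(unit) / len(spikes) for unit in zip(*spikes)]
# rates[0] == 0.0, rates[2] == 1.0, rates[1] near 0.5
```

Latency coding instead places a single early spike for strong inputs and a late one for weak inputs, trading spike count for timing precision.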

3.5 Training Pipeline

mlx-snn integrates with MLX's standard training patterns. The bptt_forward utility unrolls a model over T timesteps. Three loss functions are provided: Rate coding loss, Membrane loss, and MSE count loss.

import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim
import mlxsnn

class SpikingMLP(nn.Module):
    def __init__(self, num_steps=25, beta=0.9):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.lif1 = mlxsnn.Leaky(beta=beta)
        self.fc2 = nn.Linear(128, 10)
        self.lif2 = mlxsnn.Leaky(beta=beta,
                                reset_mechanism="none")
        self.num_steps = num_steps

    def __call__(self, spikes_in):
        s1 = self.lif1.init_state(
            spikes_in.shape[1], 128)
        s2 = self.lif2.init_state(
            spikes_in.shape[1], 10)
        for t in range(self.num_steps):
            X = self.fc1(spikes_in[t])
            spk, s1 = self.lif1(X, s1)
            X = self.fc2(spk)
            _, s2 = self.lif2(X, s2)
        return s2["mem"] # final membrane

model = SpikingMLP(num_steps=25, beta=0.9)
optimizer = optim.Adam(learning_rate=1e-3)

def loss_fn(model, spikes_in, targets):
    mem_out = model(spikes_in) # [batch, 10]
    return mx.mean(
        nn.losses.cross_entropy(mem_out, targets))

loss_and_grad = nn.value_and_grad(model, loss_fn)

for x_batch, y_batch in get_batches(x_train, y_train):  # any minibatch iterator
    spikes = mlxsnn.rate_encode(x_batch, num_steps=25)
    loss, grads = loss_and_grad(model, spikes, y_batch)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)

4 MLX-Specific Design Considerations

MLX's architecture provides key advantages:

  • Unified memory: CPU and GPU share the same physical memory, eliminating explicit transfers and bottlenecks for SNN workloads mixing data preprocessing (CPU) with neuron simulation (GPU).
  • Lazy evaluation: Operations build a computation graph executed only when `mx.eval()` is called. This is advantageous for SNNs, allowing the MLX runtime to optimize memory allocation and kernel scheduling for unrolled temporal loops.
  • Composable transforms: `mx.grad` computes gradients without framework-specific "retain graph" semantics. `mx.compile` can JIT-compile pure functions. The explicit state-dict design of mlx-snn neurons ensures compatibility.
  • Immutable arrays: MLX arrays do not support in-place operations. All neuron updates use functional assignments (e.g., `mem = beta * mem + x`), aligning well with SNN dynamics.

5 Experiments: MNIST Classification

We trained a two-layer feedforward SNN on MNIST (784 → h (LIF) → 10 (LIF, no reset)). Input images were rate-encoded into Poisson spike trains over T = 25 timesteps. The output layer accumulates membrane potential, with classification using the argmax of the final membrane state. We evaluated across five hyperparameter configurations and three backends: mlx-snn (MLX GPU), snnTorch (PyTorch MPS GPU), and snnTorch (PyTorch CPU). All used Adam optimizer and fast sigmoid surrogate gradient.

MNIST Classification Performance Comparison

Config  Hyperparameters                     mlx-snn (MLX)      snnTorch (MPS)     snnTorch (CPU)
                                            Acc.(%)  Time(s)   Acc.(%)  Time(s)   Acc.(%)  Time(s)
C1      β=0.85, h=256, lr=1e-3              97.28    4.0       98.00    8.8       98.01    12.8
C2      β=0.90, h=256, lr=1e-3              97.02    4.3       98.03    8.9       97.97    13.5
C3      β=0.90, h=256, lr=1e-3, B=256       96.91    2.4       98.03    4.8       98.17    16.7
C4      β=0.90, h=128, lr=1e-3              96.90    4.3       97.84    9.0       97.74    10.9
C5      β=0.95, h=128, lr=2e-3, 15 epochs   94.98    4.4       97.09    9.0       97.00    11.1

Peak GPU memory: 61–138 MB (mlx-snn) vs. 241–1453 MB (snnTorch MPS); not applicable for the CPU backend.

mlx-snn is consistently 2.0-2.5× faster per epoch than snnTorch on both MPS and CPU backends, while using 3.3–10.5× less GPU memory. The best mlx-snn configuration achieved 97.28% accuracy, within 0.7 points of snnTorch's best (98.03%). The accuracy gap is consistent across all configurations and is attributable to the STE-based surrogate gradient pattern.

2.5x Faster Training Speed on Apple Silicon
10x Less GPU Memory Usage

Figure 1 shows the MNIST training curves for configuration C5.

Figure 1: MNIST training curves (test accuracy and per-epoch training time) for mlx-snn (MLX GPU), snnTorch (MPS GPU), and snnTorch (CPU).

5.2 Neuron Dynamics Validation

Figure 2 shows membrane potential traces for all six neuron models under constant input current. Each model exhibits its characteristic dynamics: LIF shows exponential rise to threshold and reset; IF accumulates linearly; Izhikevich displays the distinctive quadratic rise and recovery variable interaction; ALIF shows increasing inter-spike intervals due to threshold adaptation; Synaptic shows smoothed dynamics from the synaptic current filter; and Alpha exhibits the rise-then-decay current profile.

Figure 2: Membrane potential traces for the LIF, IF, Izhikevich, ALIF, Synaptic, and Alpha neuron models over 200 timesteps of constant input current, showing characteristic spike events.

5.3 Surrogate Gradient Comparison

Table 3 compares three surrogate gradient functions on the MNIST task using a baseline configuration. Fast sigmoid and arctan achieve comparable accuracy (93.65% and 92.44%), consistent with findings that surrogate gradient learning is robust to the choice of surrogate function. The straight-through estimator (46.28%) underperforms due to its narrow gradient window at the default scale. Figure 3 visualizes the forward and backward passes for each surrogate.
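The scale-dependence is easy to see by evaluating the backward-pass shapes directly. These are standard textbook forms; the exact parameterizations in mlx-snn may differ:

```python
import math

def fast_sigmoid_grad(x, k=25.0):
    # derivative of the fast-sigmoid surrogate: 1 / (1 + k|x|)^2.
    # Small far from threshold, but never exactly zero.
    return 1.0 / (1.0 + k * abs(x)) ** 2

def arctan_grad(x, a=2.0):
    # derivative of an arctan-style surrogate with slope parameter a
    return (a / 2.0) / (1.0 + (math.pi * a * x / 2.0) ** 2)

def straight_through_grad(x, s=1.0):
    # boxcar window: gradient 1 inside |x| <= s/2, exactly 0 outside
    return 1.0 if abs(x) <= s / 2.0 else 0.0

far = 1.0   # membrane potential one full unit from threshold
grads = (fast_sigmoid_grad(far), arctan_grad(far), straight_through_grad(far))
# fast sigmoid and arctan stay nonzero; straight-through returns 0.0
```

Outside its window the straight-through estimator passes no gradient at all, so neurons whose membrane potential sits far from threshold stop learning, which is consistent with its 46.28% result at the default scale.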

Surrogate Gradient Function Comparison

Surrogate Function Test Accuracy (%) Training Time (s)
Fast Sigmoid (k = 25) 93.65 44.7
Arctan (a = 2) 92.44 43.8
Straight-Through (s = 1) 46.28 46.3
Figure 3: Forward pass (Heaviside step function) and backward pass (smooth surrogate gradient) for the fast sigmoid, arctan, and straight-through surrogate gradient functions.

6 Discussion and Future Work

Current limitations: mlx-snn is in active development (v0.2.1). Key limitations include: (1) `mx.compile` is not yet applied to the training loop due to Python control flow in temporal unrolling; (2) neuromorphic dataset loaders (N-MNIST, DVS-Gesture, SHD) are not yet implemented; (3) the library has been validated only on MNIST — larger-scale benchmarks are needed.

MLX framework maturity: We encountered a shape inconsistency in MLX's `mx.custom_function` VJP mechanism. Our STE workaround is functionally equivalent but represents a temporary solution. We expect this issue to be resolved as MLX matures.

Roadmap: Planned features for future versions include:

  • v0.3.0: Liquid State Machine (LSM) with configurable reservoir topology, excitatory/inhibitory balance, and EEG classification examples.
  • v0.4.0: `mx.compile`-optimized forward passes, neuromorphic dataset loaders, visualization utilities, and comprehensive benchmarks.
  • v1.0.0: Full API documentation, numerical validation against snnTorch reference outputs, and PyPI stable release.

Broader impact: By bringing SNN research to Apple Silicon, mlx-snn enables researchers using MacBook Pro or Mac Studio hardware to run SNN experiments without requiring NVIDIA GPUs or cloud infrastructure. The unified memory architecture is particularly advantageous for SNN workloads that require frequent state updates across timesteps. We hope mlx-snn lowers the barrier to entry for SNN research in the Apple ecosystem.

7 Conclusion

We have presented mlx-snn, the first spiking neural network library built natively on Apple's MLX framework. The library provides six neuron models, four surrogate gradient functions, and four spike encoding methods with an API designed for compatibility with snnTorch. Our experiments validate the library's correctness on MNIST classification, achieving up to 97.28% accuracy with 2.0-2.5× faster training and 3–10× lower GPU memory compared to snnTorch on the same M3 Max hardware. mlx-snn is open-source under the MIT license and available on PyPI (pip install mlx-snn).

Key SNN Development Workflow

SNN Development Workflow on MLX

Data Encoding
Define Neuron Models
Forward Pass (Simulation)
Surrogate Gradient (Backward Pass)
Parameter Update (Optimization)

Your Phased AI Implementation Roadmap

Our structured approach ensures a seamless integration of mlx-snn into your existing workflows, maximizing ROI from day one.

Phase 01: Discovery & Strategy

In-depth analysis of your current SNN research, identification of key integration points, and formulation of a tailored mlx-snn adoption strategy to leverage Apple Silicon.

Phase 02: Pilot Development & Benchmarking

Rapid prototyping of core SNN models using mlx-snn, followed by rigorous benchmarking against existing solutions to demonstrate performance and memory advantages.

Phase 03: Full-Scale Integration & Training

Seamless integration of mlx-snn into your development environment, including API adaptation and comprehensive training for your research and engineering teams.

Phase 04: Performance Optimization & Scaling

Fine-tuning mlx-snn implementations for maximum efficiency on your specific Apple Silicon hardware, ensuring optimal performance for large-scale SNN experiments.

Phase 05: Continuous Innovation & Support

Ongoing support, access to latest mlx-snn features, and expert consultation to keep your SNN research at the forefront of AI innovation.

Ready to Accelerate Your SNN Research?

Unlock the full potential of Apple Silicon for your spiking neural networks. Partner with us for a tailored implementation plan.
