Performance Benchmarks

Real numbers. No marketing fluff. See how MYRA compares to industry standards.

Benchmark Methodology

Hardware:

  • Linux 6.x kernel with io_uring support
  • Modern AMD or Intel CPU (exact model specified per test)
  • 32GB+ RAM, NVMe storage

JVM Configuration:

  • OpenJDK 24 EA / Oracle JDK 24
  • -XX:+UseZGC -Xmx4g
  • JMH 1.37 with proper warmup

All benchmarks are reproducible. Source code available in each repository's benchmarks/ folder.

Codec Performance

MyraCodec vs SBE, FlatBuffers, Kryo, and Avro

Decode Throughput (ops/sec)

Order book snapshot decoding - higher is better

Encode Throughput (ops/sec)

Order book snapshot encoding - higher is better

Codec Benchmark Summary

Codec         Decode (ops/s)   vs Myra Decode   Encode (ops/s)   vs Myra Encode
MyraCodec          2,721,551   -                     1,569,329   -
SBE                2,204,557   -19%                  2,618,446   +67%
FlatBuffers        1,451,581   -47%                    757,092   -52%
Kryo               1,016,779   -63%                    700,405   -55%
Avro                 359,322   -87%                    342,544   -78%

Key insight: MyraCodec decodes 23% faster than SBE (2.72M vs 2.20M ops/s). SBE currently encodes 67% faster than MyraCodec (2.62M vs 1.57M ops/s); encode optimization is in progress.
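The summary table expresses SBE's decode throughput relative to MyraCodec (-19%), while the key insight expresses MyraCodec relative to SBE (23%). Both come from the same two numbers with the baseline swapped. The sketch below reproduces that arithmetic; `RelativeThroughput` and `percentVs` are hypothetical names used only for illustration and are not part of any MYRA library.

```java
import java.util.Locale;

/** Hypothetical helper (not part of MYRA) that reproduces the table's relative percentages. */
final class RelativeThroughput {

    /** Percent difference of value relative to baseline: positive means value is higher. */
    static double percentVs(double value, double baseline) {
        return (value / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Throughput figures copied from the Codec Benchmark Summary (ops/s).
        double myraDecode = 2_721_551, sbeDecode = 2_204_557;
        double myraEncode = 1_569_329, sbeEncode = 2_618_446;

        // SBE relative to MyraCodec: the "vs Myra Decode" column.
        System.out.printf(Locale.ROOT, "SBE decode vs Myra: %+.0f%%%n", percentVs(sbeDecode, myraDecode));
        // MyraCodec relative to SBE: the "leads decode by" figure.
        System.out.printf(Locale.ROOT, "Myra decode vs SBE: %+.0f%%%n", percentVs(myraDecode, sbeDecode));
        // SBE relative to MyraCodec: the "vs Myra Encode" column.
        System.out.printf(Locale.ROOT, "SBE encode vs Myra: %+.0f%%%n", percentVs(sbeEncode, myraEncode));
    }
}
```

Because the two percentages use different baselines, they are not expected to match: a 19% deficit measured against the faster codec corresponds to a 23% lead measured against the slower one.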

Transport Latency

MyraTransport vs Netty and Java NIO

Ping-Pong Latency (4-byte payload)

Round-trip latency in microseconds - lower is better

Real-World Payload Latency

Larger message round-trip - lower is better

Transport Benchmark Summary

Implementation   Mean (μs)   P50 (μs)   P99 (μs)   P99.99 (μs)   vs Netty

Ping-Pong (4 bytes)
NIO                  22.11      21.06      34.18      1,505.10   40% faster
MYRA_SQPOLL          29.59      23.14      54.08      4,849.65   19% faster
MYRA                 33.20      30.02      48.45      2,629.63   9% faster
Netty                36.54      35.26      58.62         98.51   baseline

Real-World Payload
MYRA_TOKEN ⭐        27.93      26.27          -             -   39% faster
MYRA_SQPOLL          31.14      25.60          -             -   20% faster
MYRA                 34.99      32.26          -             -   10% faster
Netty                38.93      37.82          -             -   baseline

Key insight: MYRA_TOKEN mode (with io_uring token-based buffer selection) achieves 39% lower latency than Netty for real-world payloads.
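The P50, P99, and P99.99 columns are percentiles over the recorded round-trip samples. As a rough illustration of how such columns can be derived, the sketch below applies the nearest-rank method to a sorted array of latencies. This is an assumption about the general technique, not a description of how the MYRA benchmarks or JMH compute their percentiles; `LatencyPercentiles` is a hypothetical class and the sample data is synthetic.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical sketch: nearest-rank percentiles over recorded round-trip latencies. */
final class LatencyPercentiles {

    /** Nearest-rank percentile of an ascending-sorted array; p must be in (0, 100]. */
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Synthetic samples in nanoseconds: a tight cluster plus a few large outliers.
        long[] nanos = new long[100_000];
        for (int i = 0; i < nanos.length; i++) {
            nanos[i] = 20_000 + (i % 1_000) * 10;
        }
        for (int i = 0; i < 5; i++) {
            nanos[i] = 1_500_000 + i;   // rare stalls dominate the far tail
        }
        Arrays.sort(nanos);

        for (double p : new double[] {50, 99, 99.99}) {
            System.out.printf(Locale.ROOT, "P%s = %.2f us%n", p, percentile(nanos, p) / 1_000.0);
        }
    }
}
```

A handful of outliers barely moves the median but can dominate P99.99, which is one reason the far-tail column in the ping-pong table varies so much more between implementations than the mean does.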

Zero-GC Verification

Allocation analysis on hot paths

Verified Zero Allocations

MYRA components are tested with JMH's gc profiler and async-profiler to verify zero heap allocations on performance-critical paths.

  • Roray-FFM-Utils: 0 B/op (pool acquire/release cycle)
  • MyraCodec: 0 B/op (encode/decode with flyweights)
  • MyraTransport: 0 B/op (send/receive with registered buffers)
  • Netty (comparison): ~48 B/op (per-message allocations)

# Verify with JMH gc profiler
./gradlew :benchmarks:jmh -Pjmh.prof=gc

# Sample output
Benchmark                                      Mode  Cnt  Score  Units
MyraCodecBenchmark.decode                     thrpt   10   2.7M  ops/s
MyraCodecBenchmark.decode:gc.alloc.rate.norm  thrpt   10    ≈ 0   B/op

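For a quick check outside JMH, per-operation allocation can be approximated by reading the current thread's allocated-byte counter before and after a loop. The sketch below assumes a HotSpot-based JVM (JDK 14 or later), where the platform thread bean can be cast to `com.sun.management.ThreadMXBean` and thread allocation accounting is enabled; `AllocationProbe` is a hypothetical helper, not part of MYRA, and it is no substitute for the JMH gc profiler run shown above.

```java
import java.lang.management.ManagementFactory;
import java.util.Locale;

/** Hypothetical sketch: approximate bytes allocated per operation on the calling thread. */
final class AllocationProbe {

    // Written by the allocating example so escape analysis cannot remove the allocation.
    static volatile Object escape;

    /** Runs task ops times and returns the average bytes the current thread allocated per run. */
    static double bytesPerOp(Runnable task, int ops) {
        // Assumption: HotSpot exposes the com.sun.management extension of ThreadMXBean.
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long before = bean.getCurrentThreadAllocatedBytes();
        for (int i = 0; i < ops; i++) {
            task.run();
        }
        long after = bean.getCurrentThreadAllocatedBytes();
        return (after - before) / (double) ops;
    }

    public static void main(String[] args) {
        long[] counter = new long[1];   // allocated before measurement starts
        double allocating = bytesPerOp(() -> escape = new byte[1024], 10_000);
        double arithmetic = bytesPerOp(() -> counter[0]++, 10_000);
        System.out.printf(Locale.ROOT, "allocating task: %.1f B/op%n", allocating);
        System.out.printf(Locale.ROOT, "arithmetic task: %.1f B/op%n", arithmetic);
    }
}
```

The counter includes anything the thread allocates during the loop, so small non-zero readings on an otherwise allocation-free path can come from the runtime itself rather than from the code under test.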
Methodology

How we ensure accurate, reproducible results

Proper Warmup

All benchmarks include JIT warmup phases to ensure measurements reflect optimized code paths.
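JMH runs the warmup phase automatically. Purely to illustrate the idea, the hand-rolled sketch below runs a number of iterations whose timings are discarded and only then records measurements. It is a simplification under stated assumptions (single thread, no forking, no statistical analysis), and `WarmupSketch` with its iteration counts is hypothetical rather than taken from the MYRA benchmark suite.

```java
import java.util.Locale;
import java.util.function.LongSupplier;

/** Hypothetical sketch of warmup-then-measure; JMH does this far more rigorously. */
final class WarmupSketch {

    // Results are published here so the JIT cannot treat the measured work as dead code.
    static volatile long sink;

    /** Runs warmup iterations (timings discarded), then returns ns/op for each measured iteration. */
    static double[] measure(LongSupplier work, int warmupIters, int measuredIters, int opsPerIter) {
        for (int i = 0; i < warmupIters; i++) {
            runIteration(work, opsPerIter);   // gives the JIT time to compile the hot path
        }
        double[] nsPerOp = new double[measuredIters];
        for (int i = 0; i < measuredIters; i++) {
            nsPerOp[i] = runIteration(work, opsPerIter) / (double) opsPerIter;
        }
        return nsPerOp;
    }

    /** Times one iteration of ops calls and returns the elapsed nanoseconds. */
    private static long runIteration(LongSupplier work, int ops) {
        long acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            acc += work.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }

    public static void main(String[] args) {
        double[] results = measure(() -> Long.bitCount(System.nanoTime()), 5, 5, 1_000_000);
        for (int i = 0; i < results.length; i++) {
            System.out.printf(Locale.ROOT, "iteration %d: %.2f ns/op%n", i + 1, results[i]);
        }
    }
}
```

Without the discarded iterations, the first measurements would mix interpreted and partially compiled execution, which is the distortion the warmup phase is meant to avoid.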

Reproducible

All benchmark code is open source. Run the same tests on your hardware with our benchmark suite.

JMH-Based

Using Java Microbenchmark Harness for statistically rigorous measurements.

Run Your Own Benchmarks

All benchmarks are open source and easy to run

# Clone the transport repository
git clone https://github.com/mvp-express/myra-transport.git
cd myra-transport

# Run the benchmark suite
./gradlew :benchmarks:jmh

# Results will be in benchmarks/build/results/jmh/