Benchmark results for Span

Warpten edited this page Jun 7, 2018 · 6 revisions

All these tests ran on .NET Core 2.1.0 (RyuJIT x64, CoreCLR 4.6.26515.07)

Takeaways

  • FastStructure isn't so fast anymore: in the runs below it is about 3.6x to 20.6x slower than Span.
  • However, the assembly the JIT produces for Span code is very bloated (at least locally).

Span<T> VS ReadOnlySpan<T>

| Method | Mean | Error | StdDev | Scaled | ScaledSD |
| --- | ---: | ---: | ---: | ---: | ---: |
| ReadOnlySpan | 1.728 ns | 0.0507 ns | 0.0423 ns | 1.00 | 0.00 |
| Span | 1.836 ns | 0.0705 ns | 0.1235 ns | 1.06 | 0.07 |

// ReadOnlySpan
ReadOnlySpan<byte> smallData = stackalloc byte[20];
return MemoryMarshal.Read<int>(smallData);

// Span
Span<byte> smallData = stackalloc byte[20];
return MemoryMarshal.Read<int>(smallData);
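One likely reason the two results are so close: MemoryMarshal.Read<T> takes a ReadOnlySpan<byte>, so a Span<byte> argument is implicitly converted at the call and both benchmarks go through the same read. The sketch below illustrates this outside of BenchmarkDotNet. The class and method names are hypothetical, and the Fill(1) call is an addition (not part of the benchmark) so the result is defined and independent of byte order.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SpanVsReadOnlySpanDemo // hypothetical name
{
    public static int ReadFromSpan()
    {
        Span<byte> data = stackalloc byte[20];
        data.Fill(1); // assumption: not in the benchmark, gives a defined value
        // Span<byte> converts implicitly to ReadOnlySpan<byte> here.
        return MemoryMarshal.Read<int>(data);
    }

    public static int ReadFromReadOnlySpan()
    {
        Span<byte> buffer = stackalloc byte[20];
        buffer.Fill(1);
        ReadOnlySpan<byte> data = buffer; // read-only view, no copy
        return MemoryMarshal.Read<int>(data);
    }

    public static void Main()
    {
        // Both paths read four bytes of 0x01.
        Console.WriteLine(ReadFromSpan() == ReadFromReadOnlySpan()); // True
    }
}
```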

Comparison of Span<byte> and FastStructure.PtrToStructure<T>.

| Method | Mean | Error | StdDev | Scaled | ScaledSD |
| --- | ---: | ---: | ---: | ---: | ---: |
| Span | 0.3417 ns | 0.0410 ns | 0.0740 ns | 1.00 | 0.00 |
| FastStructure | 6.7561 ns | 0.1227 ns | 0.1088 ns | 20.58 | 3.87 |

Where _largeData.Length = 125000.

// Span
Span<byte> smallData = _largeData;
return MemoryMarshal.Read<int>(smallData);

// FastStructure
fixed (byte* buffer = _largeData)
    return FastStructure.PtrToStructure<int>(new IntPtr(buffer));

Smaller sizes don't cause issues.

stackalloc T[] VS new T[]

Allocation of 20 bytes. Returning MemoryMarshal.Read<int>(smallData) uses the buffer, which keeps the JIT from optimizing the allocation away.

| Method | Mean | Error | StdDev | Scaled | ScaledSD |
| --- | ---: | ---: | ---: | ---: | ---: |
| stackalloc | 1.767 ns | 0.0338 ns | 0.0300 ns | 1.00 | 0.00 |
| 'new byte[]' | 3.808 ns | 0.1120 ns | 0.0935 ns | 2.16 | 0.06 |
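The benchmark bodies for this comparison are not listed. A minimal sketch of what the two variants might look like, assuming the heap version mirrors the stackalloc one used elsewhere on this page, is shown below. The class and method names are hypothetical, and the Fill(1) calls are added only so the returned value is defined.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AllocationDemo // hypothetical name
{
    public static int ReadFromStack()
    {
        // Lives in the current stack frame; no GC allocation.
        Span<byte> smallData = stackalloc byte[20];
        smallData.Fill(1); // assumption: not in the benchmark
        return MemoryMarshal.Read<int>(smallData);
    }

    public static int ReadFromHeap()
    {
        // Allocates a managed array; the array converts implicitly to Span<byte>.
        Span<byte> smallData = new byte[20];
        smallData.Fill(1); // assumption: not in the benchmark
        return MemoryMarshal.Read<int>(smallData);
    }

    public static void Main()
    {
        // Same bytes, same result; only the allocation strategy differs.
        Console.WriteLine(ReadFromStack() == ReadFromHeap()); // True
    }
}
```

Using the value read from the buffer as the return value matters in both variants: it is what stops the allocation from being treated as dead code.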

Comparison of Span<byte> and FastStructure.PtrToStructure<T> with stackalloc.

| Method | Mean | Error | StdDev | Scaled | ScaledSD |
| --- | ---: | ---: | ---: | ---: | ---: |
| Span | 1.856 ns | 0.0602 ns | 0.0563 ns | 1.00 | 0.00 |
| FastStructure | 6.603 ns | 0.1597 ns | 0.1962 ns | 3.56 | 0.15 |

// Span
Span<byte> smallData = stackalloc byte[20];
return MemoryMarshal.Read<int>(smallData);

// FastStructure
var storage = stackalloc byte[20];
return FastStructure.PtrToStructure<int>(new IntPtr(storage));

MemoryMarshal.Read<T>(...) VS MemoryMarshal.Cast<byte, T>(...)[0] VS BinaryPrimitives

| Method | Mean | Error | StdDev | Scaled | ScaledSD |
| --- | ---: | ---: | ---: | ---: | ---: |
| MemoryMarshal.Read | 1.708 ns | 0.0576 ns | 0.0510 ns | 1.00 | 0.00 |
| MemoryMarshal.Cast | 1.917 ns | 0.0719 ns | 0.1201 ns | 1.12 | 0.08 |
| 'BinaryPrimitives.ReadInt32 (LE)' | 1.866 ns | 0.0687 ns | 0.0791 ns | 1.09 | 0.06 |

// MemoryMarshal.Read
Span<byte> smallData = stackalloc byte[20];
return MemoryMarshal.Read<int>(smallData);

// MemoryMarshal.Cast
ReadOnlySpan<byte> storage = stackalloc byte[20];
var asT = MemoryMarshal.Cast<byte, int>(storage);
return asT[0];

// BinaryPrimitives.ReadInt32 (LE)
ReadOnlySpan<byte> storage = stackalloc byte[20];
return BinaryPrimitives.ReadInt32LittleEndian(storage);
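The three calls are close in speed but not identical in meaning: MemoryMarshal.Read and MemoryMarshal.Cast reinterpret the bytes in the machine's native byte order, while BinaryPrimitives.ReadInt32LittleEndian always decodes little-endian. The sketch below makes that visible with a known byte pattern. The class, method, and helper names are hypothetical and not part of the benchmark.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

public static class ReadApiDemo // hypothetical name
{
    // Bytes 78 56 34 12 encode 0x12345678 in little-endian order.
    private static void FillSample(Span<byte> target)
    {
        target.Clear();
        target[0] = 0x78;
        target[1] = 0x56;
        target[2] = 0x34;
        target[3] = 0x12;
    }

    public static int ViaRead()
    {
        Span<byte> data = stackalloc byte[20];
        FillSample(data);
        return MemoryMarshal.Read<int>(data); // native byte order
    }

    public static int ViaCast()
    {
        Span<byte> data = stackalloc byte[20];
        FillSample(data);
        Span<int> asInts = MemoryMarshal.Cast<byte, int>(data); // 5 ints, no copy
        return asInts[0]; // native byte order
    }

    public static int ViaBinaryPrimitives()
    {
        Span<byte> data = stackalloc byte[20];
        FillSample(data);
        return BinaryPrimitives.ReadInt32LittleEndian(data); // always little-endian
    }

    public static void Main()
    {
        Console.WriteLine(ViaBinaryPrimitives() == 0x12345678); // True
        Console.WriteLine(ViaRead() == ViaCast());              // True
        // Native reads match the little-endian read only on little-endian hardware.
        Console.WriteLine((ViaRead() == 0x12345678) == BitConverter.IsLittleEndian); // True
    }
}
```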

Impact of Span<T>.Slice(int start, int length)

| Method | Mean | Error | StdDev | Scaled | ScaledSD |
| --- | ---: | ---: | ---: | ---: | ---: |
| Span | 1.786 ns | 0.0493 ns | 0.0437 ns | 1.00 | 0.00 |
| Span.Slice | 1.884 ns | 0.0704 ns | 0.0838 ns | 1.06 | 0.05 |

// Span
Span<byte> smallData = stackalloc byte[20];
return MemoryMarshal.Read<int>(smallData);

// Span.Slice
Span<byte> smallData = stackalloc byte[20];
return MemoryMarshal.Read<int>(smallData.Slice(4, 4));
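Note that the two benchmark bodies read different bytes: the first reads bytes 0..3, while Slice(4, 4) creates a view over bytes 4..7 without copying. A small sketch of that difference follows. The names are hypothetical, and the buffer contents are set explicitly here (the benchmark does not do this) so the two reads return distinguishable values.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SliceDemo // hypothetical name
{
    // Zeroes the buffer, then sets bytes 4..7 to 0x01.
    private static void Prepare(Span<byte> buffer)
    {
        buffer.Clear();
        buffer.Slice(4, 4).Fill(1);
    }

    public static int ReadAtStart()
    {
        Span<byte> smallData = stackalloc byte[20];
        Prepare(smallData);
        return MemoryMarshal.Read<int>(smallData); // bytes 0..3
    }

    public static int ReadAtOffset4()
    {
        Span<byte> smallData = stackalloc byte[20];
        Prepare(smallData);
        return MemoryMarshal.Read<int>(smallData.Slice(4, 4)); // bytes 4..7
    }

    public static void Main()
    {
        Console.WriteLine(ReadAtStart());                   // 0
        Console.WriteLine(ReadAtOffset4().ToString("X8"));  // 01010101
    }
}
```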

Span<T> VS C++-style unions

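Given:

[StructLayout(LayoutKind.Explicit)]
unsafe struct _union
{
    // All fields start at offset 0, so they share the same four bytes.
    [FieldOffset(0)] public int i;
    [FieldOffset(0)] public float f;
    [FieldOffset(0)] public fixed byte b[4];
    [FieldOffset(0)] public uint u;
}

// b and u are fields of the benchmark class; their declarations are not shown.
// The description says "stackalloc", but this benchmark reads from b via AsSpan().
[Benchmark(Description = "stackalloc", Baseline = true)]
public int span()
{
    var asSpan = b.AsSpan();
    return MemoryMarshal.Read<int>(asSpan);
}

[Benchmark(Description = "union")]
public int union__()
{
    return u.i;
}
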
| Method | Mean | Error | StdDev | Scaled | ScaledSD |
| --- | ---: | ---: | ---: | ---: | ---: |
| stackalloc | 0.9523 ns | 0.0619 ns | 0.1767 ns | 1.000 | 0.00 |
| union | 0.0067 ns | 0.0101 ns | 0.0233 ns | 0.007 | 0.03 |
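
To show what the union buys semantically, the sketch below reinterprets the bits of a float as an int in both ways: through an explicit-layout struct and through a span read. It is an illustration under assumptions, not the benchmark code: the struct and method names are hypothetical, and the fixed byte field is left out so the example compiles without unsafe code.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct FloatIntOverlay // hypothetical name; fixed byte field omitted
{
    [FieldOffset(0)] public int i;
    [FieldOffset(0)] public float f;
    [FieldOffset(0)] public uint u;
}

public static class UnionDemo // hypothetical name
{
    public static int BitsViaUnion(float value)
    {
        var overlay = new FloatIntOverlay();
        overlay.f = value; // write through one field
        return overlay.i;  // read the same four bytes through another
    }

    public static int BitsViaSpan(float value)
    {
        Span<float> storage = stackalloc float[1];
        storage[0] = value;
        // View the float's storage as bytes, then read those bytes as an int.
        return MemoryMarshal.Read<int>(MemoryMarshal.AsBytes(storage));
    }

    public static void Main()
    {
        // 1.0f has the IEEE 754 bit pattern 0x3F800000.
        Console.WriteLine(BitsViaUnion(1.0f) == 0x3F800000);        // True
        Console.WriteLine(BitsViaUnion(1.0f) == BitsViaSpan(1.0f)); // True
    }
}
```

The union read is a plain field load, which fits the near-zero time measured above, whereas the span path has to construct a span before it can read.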