Sub-Arrays with the ArraySegment Structure

Introduction

Sometimes we want to pass a subsection of an array to a method. It’s surprising that this functionality isn’t hanging off either the static Array class or an array instance.
Until recently, I thought there were only two solutions to this problem. Either create a new array containing just the elements you want exposed, or pass a start index and a length and hope the consumer honours them:

Copy array example

public void ArrayCopyExample()
{
    var sourceArray = new[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    var subArray = new int[5];
    Array.Copy(sourceArray, 4, subArray, 0, 5);
    WriteLine($"Source: {string.Join(", ", sourceArray)}");
    DoSomething(subArray);
}

private void DoSomething(int[] integers)
{
    WriteLine($"Integers: {string.Join(", ", integers)}");
}
Output:
Source: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Integers: 4, 5, 6, 7, 8

Passing the index and length example

public void PassingAnIndexExample()
{
    var sourceArray = new[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    WriteLine($"Original array: {string.Join(", ", sourceArray)}");
    DoSomething(sourceArray, 4, 5);
}

private void DoSomething(int[] integers, int index, int length)
{
    var subArray = integers.Skip(index).Take(length);
    WriteLine($"Integers: {string.Join(", ", subArray)}");
}
Output:
Original array: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Integers: 4, 5, 6, 7, 8

ArraySegment

.NET Framework 2.0 introduced the ArraySegment<T> structure. On the surface, it looks a lot like this structure is intended to be the solution to this problem. Unfortunately, it falls well short of being a genuine solution, as I’ll demonstrate here.
You can find the documentation on MSDN.
Here’s an example of how it’s used. I create three different ArraySegment structures over the same array. First I use string.Join() to enumerate each of them, then I write out the value at each ordinal position of middleSegment.
public void ArraySegmentBasicUsage()
{
    var array = "abcdef".ToCharArray();
    WriteLine($"Original array: {string.Join(", ", array)}");

    var headSegment = new ArraySegment<char>(array, 0, 3);
    var tailSegment = new ArraySegment<char>(array, 3, 3);
    var middleSegment = new ArraySegment<char>(array, 1, 4);

    WriteLine("Head:  {0}", string.Join(", ", headSegment));
    WriteLine("Tail:  {0}", string.Join(", ", tailSegment));
    WriteLine("Middle: {0}", string.Join(", ", middleSegment));

    WriteLine("Ordinarily access elements: ");
    
    for (var i = 0; i < middleSegment.Count; i++)
    {
        WriteLine("{0,4}:  {1}", i, middleSegment[i]);
    }
}
The output looks promising at first:
Original array: a, b, c, d, e, f
Head:  a, b, c
Tail:  d, e, f
Middle: b, c, d, e
Ordinarily access elements: 
   0:  b
   1:  c
   2:  d
   3:  e
Now let’s use it in the earlier example.
public void ArraySegmentExample()
{
    var sourceArray = new[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    WriteLine($"Source: {string.Join(", ", sourceArray)}");
    var arraySegment = new ArraySegment<int>(sourceArray, 4, 5);
    DoSomething(arraySegment);
}

private void DoSomething(ArraySegment<int> integers)
{
    WriteLine($"Integers:  {string.Join(", ", integers)}");
}
We can’t use the overload that accepts an array, because an ArraySegment isn’t an array. ArraySegment<T> does implement IEnumerable<T> though, so we could call ToArray() on it. Calling this method creates a new array containing a copy of the elements, which is exactly what we were trying to avoid.
We can prove that it is creating a new array using JetBrains’ dotMemory Unit:
 [Fact]
 public void ArraySegmentExample()
 {
     var sourceArray = new[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
     WriteLine($"Source: {string.Join(", ", sourceArray)}");
     var arraySegment = new ArraySegment<int>(sourceArray, 4, 5);
     var memoryCheckPoint = dotMemory.Check();
     var subArray = arraySegment.ToArray();
     dotMemory.Check(memory =>
         WriteLine(
             $"New integer arrays:  {memory.GetDifference(memoryCheckPoint).GetNewObjects(property => property.Type.Is<int[]>()).ObjectsCount}"));

     DoSomething(arraySegment);
 }
The output shows that we have a new array.
Source: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
New integer arrays:  1
Integers:  4, 5, 6, 7, 8
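Not only that, but the original array is still exposed via the ArraySegment.Array property, along with the offset and size used to create the ArraySegment (the Offset and Count properties). Nothing stops the called method from reading or writing elements outside the segment.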
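Here is a minimal sketch of that exposure, and of the copy made by ToArray(), that doesn't need a memory profiler. The class name ArraySegmentExposureDemo and the Misbehave method are hypothetical names made up for this illustration, and it writes with Console.WriteLine rather than the WriteLine helper used in the tests above.

```csharp
using System;
using System.Linq;

public static class ArraySegmentExposureDemo
{
    // Hypothetical consumer: it is only "meant" to see the segment's window.
    public static void Misbehave(ArraySegment<int> segment)
    {
        // Array is public, so the callee can reach outside the window
        // it was handed and modify the caller's array.
        segment.Array[0] = -1;
    }

    public static void Main()
    {
        var sourceArray = new[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        var segment = new ArraySegment<int>(sourceArray, 4, 5);

        // The segment wraps the original array; nothing is copied.
        Console.WriteLine(object.ReferenceEquals(segment.Array, sourceArray)); // True
        Console.WriteLine($"Offset: {segment.Offset}, Count: {segment.Count}"); // Offset: 4, Count: 5

        // ToArray() allocates a separate array, so a write to the copy
        // does not show up in the source.
        var copy = segment.ToArray();
        copy[0] = 100;
        Console.WriteLine(sourceArray[4]); // 4

        // The callee writes to index 0, which is outside the segment.
        Misbehave(segment);
        Console.WriteLine(sourceArray[0]); // -1
    }
}
```

So the segment itself is cheap, but it offers no protection: it is a view with the whole array still attached.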

It’s not that bad.

OK, I’m exaggerating a bit. If you have a massive array of value types then duplicating it en masse might be a problem, but if you are doing that then you should probably have a more bespoke solution anyway.
If, instead of passing an array about, we pass an IEnumerable<T>, then we’re in good shape. True, if we have a method that expects an array then we’re forced to use Array.Copy, but if we’re able to write the consuming method ourselves then passing the ArraySegment as an IEnumerable<T> copies no elements; it costs only a small allocation or two (the struct is boxed and an enumerator is created). In reality, the LINQ methods Skip() and Take() are more likely to be what you use (though they do make two more allocations).
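As a rough sketch of that approach, here is a consumer written against IEnumerable<int> and called both ways. The class name EnumerableConsumerDemo and the Describe method are hypothetical names for this illustration only; it returns a string and prints with Console.WriteLine rather than using the WriteLine helper from the earlier examples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableConsumerDemo
{
    // Hypothetical consumer: it only asks for a sequence, so it works with
    // an ArraySegment<int>, a LINQ query or a plain array alike.
    public static string Describe(IEnumerable<int> integers)
    {
        return "Integers: " + string.Join(", ", integers);
    }

    public static void Main()
    {
        var sourceArray = new[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

        // ArraySegment<int> implements IEnumerable<int>, so it can be passed
        // directly. The elements themselves are not copied.
        Console.WriteLine(Describe(new ArraySegment<int>(sourceArray, 4, 5)));

        // The LINQ equivalent: Skip() and Take() each wrap the sequence lazily.
        Console.WriteLine(Describe(sourceArray.Skip(4).Take(5)));
    }
}
```

Both calls print "Integers: 4, 5, 6, 7, 8". The consumer no longer cares how the window was produced, which is the main attraction of typing the parameter as IEnumerable<T>.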

Conclusion?

Use ArraySegment as an IEnumerable<T> and you won’t get bitten. Don’t use it if you don’t want the full array to be accessible to the called method.
Thanks for reading.
Give me a shout @BanksySan.
