Showing posts with label GC. Show all posts

.NET Memory - Heap Generations

Introduction

I last posted .NET Memory Management 101 where I talked about what value types were, what reference types were and showed that we can see them with a memory profiler like dotMemory.
This time we’ll look a bit more at the heap, specifically at generations.

Generations are used by the garbage collector to partition objects by how long it thinks they should live. It will then scan for objects to clean up more frequently in the sections that it thinks contain short-lived objects.

Microsoft have a good introduction in their Fundamentals of Garbage Collection article.

We’re going to see if we can see this in action again.

You can get the code for this example from my Git repository.

What are Generations?

According to the specifications there are three generations.

  1. Generation 0.
  2. Generation 1.
  3. Generation 2.

There’s also the Large Object Heap. This is, as you can guess, for storing large objects; specifically, objects of 85,000 bytes or more.
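As a rough sketch of how the Large Object Heap shows up from code, the snippet below asks the GC which generation it reports for a small array and for one comfortably over the threshold. The class and method names are made up for illustration. On Microsoft's runtimes large objects are reported as belonging to the highest generation, so I'd expect the large array to come back as generation 2; other runtimes may differ.

```csharp
using System;

public static class LargeObjectHeapSketch
{
    // Returns the generation the GC reports for a freshly allocated byte array.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        var buffer = new byte[lengthInBytes];
        return GC.GetGeneration(buffer);
    }

    public static void Run()
    {
        // A small array starts life in the ordinary heap.
        Console.WriteLine($"Small array: generation {GenerationOfNewArray(1000)}");

        // 100,000 bytes is comfortably over the large object threshold.
        Console.WriteLine($"Large array: generation {GenerationOfNewArray(100000)}");
    }
}
```

Calling LargeObjectHeapSketch.Run() from Main() prints both values so you can compare them on your own machine.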

When an object is created it is put in generation 0. Generation 0 is for the shortest-lived objects and gets the largest portion of the garbage collector’s attention. Objects are then promoted from generation 0 to generation 1 and generation 2 as the garbage collector deems appropriate.

The question now is “How does the garbage collector decide when to promote an object?”. That bit is actually rather simple. If the garbage collector scans the object and finds that it is not ready for collection then it will promote it up to the next generation. It’s that simple.

Seeing it in action

What we’re going to do is quite simple. We’re just going to create an object and watch it move up the generations as we call GC.Collect().

We have this very simple method that prints out the generation the object is in and then instructs the garbage collector to collect.

private static void GcCollect(object o)
{
    Console.WriteLine($"Object is in generation {GC.GetGeneration(o)}.");
    GC.Collect();
}

All we’re going to do is call this four times. We should see the object advance a generation until it gets to two and then stay there.

The full code is:

static void Main()
{
    var o = new object();

    var maxGeneration = GC.MaxGeneration;

    Console.WriteLine($"This environment has {maxGeneration} generations.");

    for (var i = 0; i < maxGeneration + 2; i++)
    {
        GcCollect(o);
    }
    Console.WriteLine("Finished...");
    Console.Read();
}

private static void GcCollect(object o)
{
    Console.WriteLine($"Object is in generation {GC.GetGeneration(o)}.");
    GC.Collect();
}

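Note that GC.MaxGeneration returns the highest generation number, not a count. Generations are numbered from zero, so a value of 2 means three generations (0, 1 and 2). Currently, I think all the frameworks have three generations. I might be wrong there, but there’s no harm in reading the value rather than hard-coding it, even if it is a demo.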

When we run the code we see the output we expect.

This environment has 2 generations.
Object is in generation 0.
Object is in generation 1.
Object is in generation 2.
Object is in generation 2.
Finished...

The object stayed in memory because I was using it in each call to the GcCollect(object) method. Once I am no longer using it to print out the message to the console, the garbage collector will note that there aren’t any references left and will collect it.

Weak References

We have a catch-22 here.

  1. We want to know if the object is garbage collected.
  2. To do this we need a reference to the object.
  3. If I have a reference to an object the garbage collector won’t collect it.
  4. GOTO 1

There is a solution though in the form of the WeakReference class.

A weak reference is a reference that the garbage collector should ignore when deciding whether an object is eligible for collection. We can use it here to hold a reference without preventing collection. We create a weak reference like so:

var weakReference = new WeakReference(o);

So, we’ll create one, then set o to null so that there are no ‘strong’ references left. When we run the GC it should then see the object as collectable.

We’ll add the following to the previous code:

var weak = new WeakReference(o);
o = null; // Drop the only strong reference.
GC.Collect();
Console.WriteLine($"Has o been garbage collected? {(weak.IsAlive ? "No" : "Yes")}");

The output confirms that the object is no longer alive.

Has o been garbage collected? Yes

If we had other strong references to the object then the GC would not have collected the object.
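Here is a minimal sketch of both cases side by side. The class and method names are my own, and the behaviour of the first method depends on the runtime and build settings (I create the object in a separate, non-inlined method so that no stray local keeps it alive), so treat it as an illustration rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class WeakReferenceSketch
{
    // Creating the object in its own method means no local variable
    // in the caller ever holds a strong reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference CreateWeak()
    {
        return new WeakReference(new object());
    }

    // No strong references remain, so the GC is free to collect the target.
    public static bool IsCollectedAfterGc()
    {
        var weak = CreateWeak();
        GC.Collect();
        return !weak.IsAlive;
    }

    // A strong reference is still in use after the collection,
    // so the target must survive it.
    public static bool StaysAliveWithStrongReference()
    {
        var strong = new object();
        var weak = new WeakReference(strong);
        GC.Collect();
        var alive = weak.IsAlive;
        GC.KeepAlive(strong); // Keeps 'strong' reachable up to this point.
        return alive;
    }
}
```

GC.KeepAlive() is there because an optimised build may otherwise decide that strong is dead before the collection runs.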

Thanks for reading.

Feel free to contact me @BanksySan!

.NET Memory Management 101

I watched an excellent video by Maarten Balliauw recently about dotMemory and the ClrMd framework and how .NET manages its memory.

I put up my crude notes in my previous post.

Here I thought I’d have a play and see what I can replicate myself. The source code is available in my Git repo.

Running it will give you four options for tests which we can use to see how objects and values are assigned in .NET.

Console menu

When you run a test there’ll be notifications of when to take a snapshot. Once you have, press Enter for the test to either continue or tell you it’s finished.

Reference Types & Value Types

The .NET framework has two main kinds of type, which it handles differently when it comes to memory management. These are:

  • Value type
  • Reference type

The key differences between the two from a coding perspective are:

                                            Value Type    Reference Type
Can be null?                                No            Yes
Create new instance on every method call    Yes           No

Consider the following:

using System;

object referenceType = new object();
DateTime valueType = new DateTime(1, 2, 3);

void AreTheyEqual(object o, DateTime d)
{
    // == compares the reference for object and the value for DateTime.
    Console.WriteLine("Reference types equal? {0}", referenceType == o);
    Console.WriteLine("Value types equal?     {0}", valueType == d);
    // ReferenceEquals asks whether both arguments are the very same object.
    Console.WriteLine("Reference types same?  {0}", Object.ReferenceEquals(o, referenceType));
    Console.WriteLine("Value types same?      {0}", Object.ReferenceEquals(d, valueType));
}

AreTheyEqual(referenceType, valueType);

Try it on dotFiddle.

We create one value type and one reference type. We pass these into the method which then tests if they are equal using == and whether they’re actually the same thing with Object.ReferenceEquals().

The output is:

Reference types equal? True
Value types equal?     True
Reference types same?  True
Value types same?      False

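You can see that, whilst both the reference type and the value type equate as expected, the value type inside the method is not the same entity as the one passed in. This is because it was copied, whereas with the reference type we just passed a reference (pointer) to the existing object. Strictly speaking, Object.ReferenceEquals() takes object parameters, so each value type argument is boxed into its own new object and the check returns False for any pair of value types.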
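Another way to see the copy, without relying on Object.ReferenceEquals(), is to change the argument inside a method and check whether the caller notices. This is only a sketch; ValueCounter, ReferenceCounter and CopySemanticsSketch are hypothetical names invented for the example.

```csharp
using System;

public struct ValueCounter { public int Count; }
public class ReferenceCounter { public int Count; }

public static class CopySemanticsSketch
{
    // The struct is copied on the way in, so only the copy is incremented.
    public static void Increment(ValueCounter counter) { counter.Count++; }

    // The class instance is shared, so the caller sees the change.
    public static void Increment(ReferenceCounter counter) { counter.Count++; }

    public static void Run()
    {
        var value = new ValueCounter();
        var reference = new ReferenceCounter();

        Increment(value);
        Increment(reference);

        Console.WriteLine($"Value type count:     {value.Count}");     // 0
        Console.WriteLine($"Reference type count: {reference.Count}"); // 1
    }
}
```

The value type still reads 0 in the caller because the method only ever touched its own copy.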

Stack and Heap

You will hear talk of the stack and the heap quite a bit. Technically, they’re just implementation details that are not guaranteed to stay the same (only the functional behaviour of the types is guaranteed).

“I find this characterization of a value type based on its implementation details rather than its observable characteristics to be both confusing and unfortunate. Surely the most relevant fact about value types is not the implementation detail of how they are allocated, but rather the by-design semantic meaning of “value type”, namely that they are always copied “by value”. If the relevant thing was their allocation details then we’d have called them “heap types” and “stack types”. But that’s not relevant most of the time. Most of the time the relevant thing is their copying and identity semantics.” - Eric Lippert

Anyhow, in this case we are looking under the hood of the memory management system (which is an implementation detail as well), so it makes sense for us to talk about stacks and heaps.

The Stack

The stack is where value types go!

Well, no, not exactly. It’s a good start though. The .NET primitives are all value types and, when they are local variables, they will sit on the stack along with references to objects in the heap.

Let’s have a look at the local primitives test. This simply creates a bunch of local integers (i0, i1, i2, … i59 in the listing below), each assigned a random value. An integer is a value type and, as I just said, a local value type will be pushed onto the stack.

The code is very simple:

public void Start()
{
    _presenter.PromptForSnapshot();

    var i0 = RANDOM.Next();
    var i1 = RANDOM.Next();
    var i2 = RANDOM.Next();
    var i3 = RANDOM.Next();
    var i4 = RANDOM.Next();
    var i5 = RANDOM.Next();
    var i6 = RANDOM.Next();

    // Removed for space

    var i59 = RANDOM.Next();

    _presenter.PromptForSnapshot();
}

If we run the test and connect a memory profiler like dotMemory then we can look at the heap before and after the variables are created. If you’re using dotMemory then you should have two snapshots taken and see something similar to this:

Primitives Memory Usage

You can see that there’s no interesting activity recorded in the heap at all. All we can see is basic operating traffic by the application itself. There were 382 objects before we created the integers and 381 after. This is, of course, as expected. All the integers were added to the stack so don’t show in the heap at all.
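If you don’t have a profiler to hand, a rough way to check the same thing from code is to ask the GC how many bytes the current thread has allocated before and after some work. This is a sketch under assumptions: AllocationSketch and its methods are names I’ve made up, and GC.GetAllocatedBytesForCurrentThread() is only available on newer runtimes (.NET Core 3.0 and later, and .NET Framework 4.8, as far as I know).

```csharp
using System;

public static class AllocationSketch
{
    // Bytes allocated on the managed heap by this thread while 'action' runs.
    public static long AllocatedBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Local integers only: nothing here needs the heap.
    public static int SumLocals()
    {
        int a = 1, b = 2, c = 3;
        return a + b + c;
    }

    // Storing an int in an object slot boxes it, which allocates on the heap.
    public static object[] BoxIntegers(int count)
    {
        var boxes = new object[count];
        for (var i = 0; i < count; i++)
        {
            boxes[i] = i;
        }
        return boxes;
    }

    public static void Run()
    {
        Console.WriteLine($"Locals: {AllocatedBytes(() => SumLocals())} bytes");
        Console.WriteLine($"Boxed:  {AllocatedBytes(() => BoxIntegers(100))} bytes");
    }
}
```

I’d expect the first figure to be zero or very close to it and the second to be a few thousand bytes, but the exact numbers depend on the runtime.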

Why not put values on the heap?

Well, the heap is quite slow by comparison. When we add something to the stack we just write it in the next slot; when we are done we just move the stack pointer back down. There are no clean-up costs at all: the memory is simply overwritten rather than reclaimed.

The Heap

All reference types end up in the heap, with one or more references to them on the stack. As I’m sure you’re well aware, strings are reference types despite the fact that, when coding, we don’t need to use the new keyword.

If you run the program again and this time select the Strings option then we’ll see how these get stored.

The console will output:

Creating strings
================

----------------------------------------------------------------
Take a snapshot of the memory and press 'Enter' when you're done
----------------------------------------------------------------
Now we are going to create a set of 100 unique strings.

----------------------------------------------------------------
Take a snapshot of the memory and press 'Enter' when you're done
----------------------------------------------------------------
Now, we'll call the garbage collector

----------------------------------------------------------------
Take a snapshot of the memory and press 'Enter' when you're done
----------------------------------------------------------------
All done, have a look at the snapshots.

Done...

If you’ve created the snapshots in dotMemory you’ll see:
strings dotMemory

I’ve named the snapshots, yours might just be called Snapshot #1, Snapshot #2 etc.

You can see straight away from the snapshot boxes that 104 objects were created between the time the first snapshot was taken and the second. We expect that 100 of these will be the 100 strings we created, 1 will be the array holding them and the rest will be ‘something else’. If we click on the compare we can see the details.

Compare before strings created to afterwards

If we filter on String then we can see that 102 new strings were created and 1 new string array. So far, so good. We can also see what the strings were.

String instances

There are a lot of 90 byte strings there; they’ll be the ones we created for this test. The other two are:

  • 124 bytes: "Now we are going to create a set of 100 unique strings."
  • 16 bytes: "D"

These are the text I write to the console and the format specifier used to turn the GUID into a string, respectively.

If we compare the second and third snapshots now, after GC.Collect() was called, we can see that the objects have all been destroyed:

Strings destroyed

Conclusion

Reference types always go on the heap, local value types go on the stack. There is more to it than this, for example:

  • What happens to value types in reference types?
  • What happens to reference types in value types?

Also, we’ve only shown that garbage collection cleans up unused references. The garbage collector does more than this too, but we’ll look at this in a future blog.

Hope you enjoyed this.


dotMemoryNotes

Exploring .NET’s Memory Management

A Trip Down Memory Lane

Hurriedly made notes from the “Exploring .NET’s Memory Management: A Trip Down Memory Lane” webcast.

GC

The garbage collector is part of the .NET runtime.
We compile to IL, the runtime compiles this to machine code. It handles type safety, security, threads and memory.

  • Virtually unlimited memory for our applications. You don’t have to consider memory allocation for the most part.
  • Big chunk of memory allocated when application starts. When we declare a variable it is added to this allocation.
  • The GC reclaims memory. Objects allocated in the managed heap. This makes it fast as it just adds a pointer.
  • .NET uses some unmanaged memory for itself.
  • GC releases objects no longer in use by examining application roots.
  • GC builds a graph of all the objects that are reachable from these roots. It takes time to scan these objects.
  • Generations
  • Large Object Heap.
  • The GC divides the heap into generations.
    • Gen-0: Short lived (e.g. local variable).
    • Gen-1: In-between.
    • Gen-2: Long-lived objects (e.g. Application’s main form).
  • When the GC scans an object and finds it’s still needed, it promotes the object to a higher generation.
  • Large Object Heap (LOH)
    • Special segment for large objects (85,000 bytes or more).
    • Collected only during full garbage collection.
    • Not compacted (by default)
    • Fragmentation can cause OutOfMemoryException.
  • When does GC run?
    • Out of memory condition.
    • After some ‘significant’ allocation.
    • Failure to allocate some native resource (internal to .NET).
    • Profiler - when triggered from the profiler API.
    • Forced - Call to System.GC.
    • Application moved to background.
  • GC pauses the running application to scan the heap.
  • Helping the GC. Use IDisposable & using.
  • Finalizers: BEWARE!! The object will move to the finalizer queue, which will always cause the object to be promoted a generation.
  • Weak references: Allow the GC to collect these objects.
    • Whenever the GC passes through it, it will be collected. Speeds up scan.
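The note above about helping the GC with IDisposable and using can be sketched roughly as follows. TrackedResource and DisposeSketch are hypothetical names; a real class would release a file handle, socket or similar in Dispose() rather than just setting a flag.

```csharp
using System;

public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        // Release unmanaged handles or other scarce resources here.
        IsDisposed = true;
    }
}

public static class DisposeSketch
{
    public static bool UseAndDispose()
    {
        var resource = new TrackedResource();

        using (resource)
        {
            // Work with the resource here.
        }

        // Dispose() has run by this point, even if the block had thrown.
        return resource.IsDisposed;
    }
}
```

The point is that clean-up happens deterministically at the end of the using block, rather than waiting for a finalizer and the promotion that comes with it.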

When is Memory Allocated?

  • Not for value types
    • Allocated on the stack.
  • Reference types.
    • new
    • “” (string.Empty)
  • Boxing.
  • Closures.
  • Param arrays.
  • … more.
  • How to see?
    • Resharper Heap Allocations Viewer plugin.
    • Roslyn’s Heap Allocation Analyzer.

GC

  • GC is optimised for high memory traffic in short lived objects.

Strings

  • Strings are objects.
    • Is actually a readonly collection of char.
  • String duplicates are normal.
    • .NET GC is fast.
    • They represent the balance of CPU vs memory.
    • Gen-0 is nothing, gen-2 might be bad.

How can we tell if a string is a duplicate and is in the Intern Pool?

var a = "Hello World!";
var b = "Hello World!";

// a == b is true
// Object.ReferenceEquals(a, b) is true
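To contrast the literals above with a string built at run time, here is a small sketch. StringInternSketch and BuildAtRunTime are names invented for the example; string.Intern() and Object.ReferenceEquals() are the real framework methods.

```csharp
using System;

public static class StringInternSketch
{
    // Builds a new string instance with the same characters as 'text'.
    public static string BuildAtRunTime(string text)
    {
        return new string(text.ToCharArray());
    }

    public static void Run()
    {
        var a = "Hello World!";
        var b = "Hello World!";
        var c = BuildAtRunTime(a);

        // Identical literals share one interned instance.
        Console.WriteLine(Object.ReferenceEquals(a, b)); // True

        // Same characters, but a separate object on the heap: a duplicate.
        Console.WriteLine(a == c);                       // True
        Console.WriteLine(Object.ReferenceEquals(a, c)); // False

        // string.Intern returns the pooled instance for those characters.
        Console.WriteLine(Object.ReferenceEquals(a, string.Intern(c)));
    }
}
```

I’d expect the last line to print True as well, since the literal should already be in the pool, but I’ve left that one for you to confirm on your own runtime.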

Heap

  • Pointer to Run Time Type Information (RTTI)

clrMD

  • Open crash dumps or monitor running processes.
