Adventures in .NET references

Weak referencing is a really useful feature for when you don’t mind if an object is collected, but might still want to access it again in the future. For those of you who aren’t familiar with the concept, I’ll describe it briefly. If you already know how it works, feel free to skip ahead.

.NET is a garbage collected runtime, meaning that objects you create on the heap (e.g. with new) are automatically cleaned up by the garbage collector (GC) when they are no longer being used. “Being used” here means reachable: an object stays alive for as long as at least one reference to it can be traced from a root, such as a local variable or a static field. Here’s an example:

// we create an object instance and assign it to the variable 'foo'
// the instance is now reachable through one reference
var foo = new object();

// now we assign the value of foo (the instance) to bar
// the instance is now reachable through two references
var bar = foo;

// now we set foo to null
// the instance is now reachable only through bar
foo = null;

When a variable goes out of scope it no longer acts as a root, so it stops keeping its referent alive. Once an object instance is no longer reachable through any reference, the GC is free to collect (and, if it has a finalizer, finalize) it. The GC does this in passes, using a generation-based model to periodically clean up unreachable objects. This means that an object may remain on the heap for some time after its last reference disappears. Incidentally, this is why SecureString exists - if you put sensitive data into a string object there is no guarantee when, or even if, that string will be erased from memory. Strings are also immutable in .NET, so you can’t manually overwrite them.

What I haven’t mentioned so far is that there are two types of reference - strong and weak. Everything above describes strong references. A weak reference is a special type of reference that still allows the GC to collect the object, but also still allows your code to access (and create strong references to) the object if the GC has not yet collected it. This is useful for caching, because the GC will automatically “evict” (collect) objects for you based on their age and on memory pressure.
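Here’s a minimal sketch of the weak reference pattern, using the non-generic System.WeakReference type (the variable names here are made up for illustration):

```csharp
using System;

var data = new byte[1024];          // 'data' holds a strong reference
var weak = new WeakReference(data); // 'weak' does not keep the array alive

// while a strong reference exists, the target is still reachable
Console.WriteLine(weak.IsAlive);    // prints "True"
Console.WriteLine(data.Length);     // keeps 'data' live past the check above

// to use a weak target safely, promote it to a strong reference first
var strong = (byte[])weak.Target;
if (strong != null)
{
    // the GC cannot collect the array while 'strong' is reachable
}

// once every strong reference is gone, a future GC pass may collect the
// array; weak.Target then returns null and weak.IsAlive returns false
```

Note that you should always null-test the promoted reference (or use the generic WeakReference&lt;T&gt;.TryGetTarget), rather than checking IsAlive and then reading Target as two separate steps - the object could be collected in between.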

Mixing weak references with lazy initialisation

In some cases you may not know whether a code path will need access to a particular object at all, or whether it will access it once or many times. If that object takes up a lot of heap memory it may be prohibitive to keep it around. You could handle this manually with a caching scheme, but mixing lazy initialisation with weak referencing handles the situation in a way that avoids the allocation entirely when the object is never used, and automatically manages caching of that object - based on memory pressure and age - via the GC.

I ran into this situation when I wanted to parse the PE headers and various structures of a lot of executable files, then run a battery of tests against each. Most tests only access a few different sections of the executable, and some tests do not run at all against some files (e.g. some tests only run on 64-bit binaries). The parsed data can take up quite a bit of memory - particularly import tables and disassembled code - but it’s not expensive to regenerate the data, so it makes sense to only initialise it when we need it, and get rid of it if we’re running short of memory. For the latter we can use weak referencing, but for the former we want lazy initialisation. Luckily both of these features are available in the .NET framework and are thread-safe by default.
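As a quick refresher, Lazy&lt;T&gt; on its own defers construction until the first access of Value, and its default LazyThreadSafetyMode.ExecutionAndPublication mode guarantees the factory delegate runs at most once even under concurrent access. A minimal sketch, with a throwaway list standing in for an expensive object:

```csharp
using System;
using System.Collections.Generic;

var lazy = new Lazy<List<int>>(() =>
{
    Console.WriteLine("constructing...");  // runs at most once
    return new List<int> { 1, 2, 3 };
});

Console.WriteLine(lazy.IsValueCreated);    // prints "False" - nothing built yet
var first = lazy.Value;                    // factory runs here
Console.WriteLine(lazy.IsValueCreated);    // prints "True"
var second = lazy.Value;                   // same instance, factory not re-run
Console.WriteLine(ReferenceEquals(first, second)); // prints "True"
```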

For convenience I created a helper class, WeakLazy&lt;T&gt;, that combines WeakReference with Lazy&lt;T&gt;:

public class WeakLazy<T> where T : class
{
    readonly Func<T> _constructor;
    readonly Lazy<WeakReference> _lazyRef;

    public WeakLazy(Func<T> constructor)
    {
        _constructor = constructor;
        _lazyRef = new Lazy<WeakReference>(() => new WeakReference(_constructor()));
    }

    public bool IsAlive
    {
        get
        {
            if (!_lazyRef.IsValueCreated)
                return false;
            return _lazyRef.Value.IsAlive;
        }
    }

    public T Value
    {
        get
        {
            // take a strong reference to the target, if it still exists
            T obj = (T)_lazyRef.Value.Target;

            // if the object is still alive, return it - holding 'obj'
            // keeps it alive from here on
            if (obj != null)
                return obj;

            // the object has been collected, so create it again
            // (two threads racing here may each construct an instance;
            // for a cache that's harmless - both results are valid)
            obj = _constructor();
            _lazyRef.Value.Target = obj;
            return obj;
        }
    }
}

This is fairly simple - when we access the Value property it initialises the object (this is the Lazy&lt;T&gt; functionality) and wraps it inside a WeakReference, so that WeakLazy itself never holds a strong reference to the object.

Here’s an example of how you might use it:

var peHeader = new WeakLazy<PEHeader>(() => new PEHeader(_file));

...

if (Is64bit)
{
    if (peHeader.Value.ImageBase < 0x100000000UL)
        Report.AddIssue(IssueMessages.MissingHiASLR, ...);
}

...

if (SomeOtherCondition)
{
    // some other access here
    if (peHeader.Value.??? ... )
        // ...?
}

In the first line we create a WeakLazy&lt;T&gt; wrapper around a PEHeader class, which represents the parsed PE headers (including the so-called optional header) from some input file. At this point there is no PEHeader instance, as its initialisation is lazy.

If the executable is 64-bit we check for HiASLR by validating that the base address is above the 4GB boundary. If the branch is taken we reference peHeader.Value, which triggers lazy instantiation of the PEHeader object via the lambda we passed on the first line.

Later we may access peHeader.Value again, at which point there are three possible cases. The first is that the original branch was not taken (not a 64-bit exe), so the PEHeader gets created for the first time. The second is that the original branch was taken and the underlying PEHeader object still exists, so we just access it. The third is that the original branch was taken, but a GC pass collected the object between the first and second accesses, so it gets recreated.

Unit testing WeakLazy

The above all looks correct, so let’s cover things off with some unit tests. The first couple of tests verify that lazy instantiation works as intended:

class TestObject
{
    public TestObject()
    {
        Bar = 123;
    }
    
    public void Foo() { }

    public int Bar { get; set; }
}

[TestMethod]
public void TestInstantiateViaMethod()
{
    var wl = new WeakLazy<TestObject>(() => new TestObject());
    Assert.IsFalse(wl.IsAlive);
    wl.Value.Foo();
    Assert.IsTrue(wl.IsAlive);
}

[TestMethod]
public void TestInstantiateViaProperty()
{
    var wl = new WeakLazy<TestObject>(() => new TestObject());
    Assert.IsFalse(wl.IsAlive);
    Assert.AreEqual(123, wl.Value.Bar);
    Assert.IsTrue(wl.IsAlive);
}

These tests pass without problems. Next we want to test that weak referencing works:

[TestMethod]
public void TestWeakReferenceFinalize()
{
    var wl = new WeakLazy<TestObject>(() => new TestObject());
    wl.Value.Foo();
    Assert.IsTrue(wl.IsAlive);

    const int BLOWN = 1024;
    int fuse = 0;
    while (wl.IsAlive)
    {
        GC.Collect();
        if (++fuse == BLOWN)
            Assert.Fail("GC did not clear object.");
    }
}

This test first instantiates the object, then forces garbage collection repeatedly (up to 1024 times) to make sure the object eventually gets collected. The test fails - the loop runs until the assertion failure is hit. Can you see why? Here’s a hint: this unit test fails when the program is built as Debug, but not as Release.

Compiler magic or deeper behaviour?

What you might assume is that the compiler captures the result of wl.get_Value() into a local variable, thus “trapping” a strong reference to the TestObject instance. If you take a look at the compiled IL, you’ll find that this isn’t the case at all - the generated code is essentially the same, barring some extra nops and unoptimised stloc/ldloc pairs in the Debug code. In fact I spent quite a bit of time thoroughly confused about what was happening.

My first guess was that a strong reference was being kept somewhere on the CLR’s evaluation stack, but Visual Studio doesn’t let you inspect the evaluation stack. I tried digging into this with mdbg but didn’t get much information out of that either. In the end I had to go hardcore and load up WinDbg.

It turns out that WinDbg has pretty solid support for .NET and CLR process internals via the SOS extension. This extension ships with WinDbg, but you have to load it manually with .loadby sos clr. Once this is done you can start using the CLR debugging features. I found this cheat sheet to be incredibly helpful.

First I manually modified my code to include some pauses - just some Console.ReadKey calls - then verified that my changes did not alter the behaviour I observed previously. After that I used !threads to find the correct managed thread and switch to it. From there I inspected the stack with !clrstack to verify that everything was as I expected, with no weird calls out to magic debugging methods or anything out of place. At that point it made sense to directly check what the GC was holding onto, using !gchandles:

          Handle  Type                 Object  Size    Data Type
000002d4945615e8  WeakShort  000002d496196da8    24    PolyutilsTests.WeakLazyTests+TestObject

...

              MT    Count    TotalSize  Class Name
00007ff85ddc6c20        1           24  PolyutilsTests.WeakLazyTests+TestObject

From this we can see that only a weak handle to the object exists, so nothing is keeping it alive. Yet, despite this, the Debug build of the program refuses to collect the object we hold a weak reference to, whereas the Release build gets rid of it without problems.

Debug vs. Release assemblies

At this point I was convinced that this was a CLR behaviour unique to debug builds of the application, but not one caused by the generated IL. Opening up the Debug and Release binaries in JustDecompile showed a difference in the flags set on the DebuggableAttribute applied to each assembly.

From the Debug assembly:

[assembly: Debuggable(DebuggableAttribute.DebuggingModes.Default | DebuggableAttribute.DebuggingModes.DisableOptimizations | DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints | DebuggableAttribute.DebuggingModes.EnableEditAndContinue)]

From the Release assembly:

[assembly: Debuggable(DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints)]

Using the Reflexil plugin, I modified the Debug assembly’s DebuggableAttribute to match the Release assembly’s, and re-ran the test harness. This time it completed just fine, confirming that this is a CLR behaviour directly related to debugging.

But which of these flags causes this difference in behaviour? For that I needed to go through and unset each flag, one by one, until the test passed. I immediately hit paydirt on my first try - removing the Default flag from the assembly made the test pass, even with the other options there. This doesn’t really make much sense to me, as the reference source says:

Default: Instructs the just-in-time (JIT) compiler to use its default behavior, which includes enabling optimizations, disabling Edit and Continue support, and using symbol store sequence points if present. In the .NET Framework version 2.0, JIT tracking information, the Microsoft intermediate language (MSIL) offset to the native-code offset within a method, is always generated.

The only behaviour I can see that is potentially relevant is further up in the same class:

/// <summary>Gets a value that indicates whether the runtime will track information during code generation for the debugger.</summary>
/// <returns>true if the runtime will track information during code generation for the debugger; otherwise, false.</returns>
/// <filterpriority>2</filterpriority>
public bool IsJITTrackingEnabled
{
    get
    {
        return (this.m_debuggingModes & DebuggableAttribute.DebuggingModes.Default) != DebuggableAttribute.DebuggingModes.None;
    }
}

I’m still not sure if JIT tracking is the cause or if it’s something else.

I found an issue on the CoreCLR project where they ran into the same problem I did, although it didn’t shed much light on the subject beyond informing me that the JIT can arbitrarily extend object lifetimes.
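That lifetime extension can be sketched as follows (a hedged illustration, not code from the article - MakeWeak is a made-up helper). Under an optimising JIT, a local becomes collectable after its last real use; with JIT tracking enabled, its lifetime can be stretched to the end of the method:

```csharp
using System;

static WeakReference MakeWeak()
{
    var tmp = new object();
    var weak = new WeakReference(tmp);
    Console.WriteLine(tmp.GetHashCode()); // last real use of 'tmp'

    // In a Release/optimised build, 'tmp' may be eligible for collection
    // from here onwards, even though the method has not yet returned.
    // With JIT tracking (Debug builds), the reference can be kept alive
    // until the method exits, so the weak target survives in the interim.
    return weak;
}

var w = MakeWeak();
// once MakeWeak has returned, only the weak handle remains, and a
// subsequent GC pass is free to reclaim the object
GC.Collect();
```

If you ever need the opposite guarantee - keeping an object alive up to a specific point regardless of optimisation - GC.KeepAlive(obj) extends a lifetime explicitly.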

Conclusion

Builds with the Default flag set on the assembly’s DebuggableAttribute seem to keep weakly referenced objects used by the currently executing method alive, as though a strong reference still existed. As for why, I’m not sure, but it might be due to JIT tracking being enabled.

Fixing this is easy - just move the object access into its own method, and mark that method with MethodImplOptions.NoInlining so that any hidden strong reference is confined to its own stack frame rather than being inlined into (and kept alive by) the calling method:

[TestMethod]
public void TestWeakReferenceFinalize()
{
    var wl = new WeakLazy<TestObject>(() => new TestObject());
    AccessTestObject(wl);
    Assert.IsTrue(wl.IsAlive);

    const int BLOWN = 1024;
    int fuse = 0;
    while (wl.IsAlive)
    {
        GC.Collect();
        if (++fuse == BLOWN)
            Assert.Fail("GC did not clear object.");
    }
}

[MethodImpl(MethodImplOptions.NoInlining)]
private void AccessTestObject(WeakLazy<TestObject> wl)
{
    wl.Value.Foo();
}

This causes the unit test to pass on both Debug and Release builds.