Assembly Loading: Dynamic Assembly Loading & Compression [3 of 3]

Assembly Loading (Example Application)

ilMerge is the simplest way to merge multiple libraries into a single executable or library, and it is appropriate for most use cases, though its utility does depend on your needs. That said, dynamic loading of assemblies is the next, and probably the most flexible, way of consolidating assemblies. But be warned: there are pros and cons to both methods. Dynamic loading is more involved and can be difficult to debug, but it has some real advantages (and disadvantages) over ilMerge. If simplicity is your goal, and ilMerge works for your application, then continue no further. If you are interested in learning a bit more about assembly loading and the flexible options dynamic assembly loading offers, then read on…

In Part 2: Assembly Loading: Combine Assemblies & Executables Using ilMerge, we covered the merging of all assemblies into a single executable (and exclusions). Dynamic assembly loading is a bit of a departure from assembly merging: the dependencies are located and loaded at run time, by code we write. This means we have the added flexibility of loading our assembly from the file system, a database, a base64 encoded field in an xml file, an assembly resource, etc. The assembly can even be compressed, encrypted, etc. This offers a great deal of flexibility; however, some load locations are more sane than others (loading an assembly from a possibly disconnected source, like an online resource, probably isn't advisable, but don't say I didn't warn you).
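To make the "bytes can come from anywhere" idea concrete, here is a minimal sketch. The class and method names (AssemblySources, FromFile, FromBase64) are hypothetical and are not part of the example project; the only runtime calls used are File.ReadAllBytes, Convert.FromBase64String and Assembly.Load(byte[]).

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper (not from the example project): each method produces the
// raw bytes of an assembly from a different kind of source.
public static class AssemblySources
{
    // Raw bytes from a file on disk.
    public static byte[] FromFile(string path)
    {
        return File.ReadAllBytes(path);
    }

    // Raw bytes from a base64 string, e.g. the text of an XML element.
    public static byte[] FromBase64(string encoded)
    {
        return Convert.FromBase64String(encoded);
    }

    // Whatever the source was, the runtime only needs the bytes.
    public static Assembly Load(byte[] rawAssembly)
    {
        return Assembly.Load(rawAssembly);
    }
}
```

Once the bytes are in hand, the rest of the load process is the same for every source, which is what makes this approach so flexible.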

Since we're consolidating the executable and its assemblies AND dynamically loading them from a location of our choosing, we can accomplish both goals by embedding the libraries we want to dynamically load into Public.Process.exe. Embedded resources are defined at compile time and can be read back out of the assembly through reflection once they are embedded.

To begin, inside the Public.Process project, create a directory called 'Resources' ('\Public.Process\Resources\'). This location will contain a copy of Public.Dependency.dll and Public.SubDependency.dll. We can start by simply copying the build output from the two referenced projects into this folder for the time being (later we can incorporate the build output from these projects into msbuild). Then mark the copied assemblies as embedded resources (select the assemblies, right click > Properties; in the Properties pane, set 'Build Action' to 'Embedded Resource').

Inside the solution explorer, Public.Process project should look like this:

Once the dependent project assemblies are marked as embedded, build the Public.Process project. After opening the resulting executable in ilSpy, you should see the embedded assemblies under the resources.

Public.Process.exe in ilSpy with Embedded Assemblies:

Notice, the embedded resources have the resource name modified (likely to avoid possible naming collisions) with the prefix 'Public.Process.Resources.' ([Default Namespace].[Resources]). From within ilSpy, you can export the embedded resources (if needed) and verify that the compiler embedded your assemblies correctly.
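If you'd rather check the embedded names from code than from ilSpy, a small sketch like the following lists them. ResourceLister is a hypothetical name used only for illustration; GetManifestResourceNames is the standard reflection call.

```csharp
using System;
using System.Reflection;

// Hypothetical helper (not from the example project) for inspecting the
// manifest resource names of an assembly.
public static class ResourceLister
{
    // Returns every manifest resource name embedded in the given assembly.
    public static string[] Names(Assembly assembly)
    {
        return assembly.GetManifestResourceNames();
    }

    // Writes the names to the console, one per line.
    public static void Print(Assembly assembly)
    {
        foreach (string name in Names(assembly))
        {
            Console.WriteLine(name);
        }
    }
}
```

Calling ResourceLister.Print(Assembly.GetExecutingAssembly()) from Public.Process should list names following the 'Public.Process.Resources.' convention described above, assuming the assemblies were embedded as described.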

At this point, we've successfully consolidated all our necessary resources into a single executable. Unfortunately, if we attempt to run Public.Process.exe (without Public.Dependency.dll and Public.SubDependency.dll in the same folder as Public.Process.exe), we get the following:

Fuslogvw.exe output:

Fuslogvw.exe Public.Dependency (binding failure):

Public.Dependency.dll was the first dependency to be called (also the first to bind) and failed to load because the probing path(s) were missing the assembly. We need a way to intercept the assembly load event and tell it to use the embedded resource after attempting the probing path(s).

Initialization of the AssemblyLoad class (AssemblyLoad.cs) and the event hook for AppDomain.CurrentDomain.AssemblyResolve:

    class Program
    {
        public static void Main(string[] args)
        {
            AssemblyLoad tAssemblyLoad = new AssemblyLoad(AssemblyLoad.LoadMethod.Resource); 
            AppDomain.CurrentDomain.AssemblyResolve += new ResolveEventHandler(tAssemblyLoad.CurrentDomain_AssemblyResolve);
            Run();
        }

        private static void Run()
        {
            Console.WriteLine(DependencyClass.GetStartupMessage());
            Console.WriteLine("Press Any Key To Exit...");
            Console.ReadKey();
        }
    }

The modified Main() method above hooks the AssemblyResolve event so that it calls the CurrentDomain_AssemblyResolve method in AssemblyLoad.cs (available in the example project). The constructor for AssemblyLoad(enum) optionally allows for different load methods. In this example, I've constructed AssemblyLoad(enum) with LoadMethod.Resource, to load the assemblies from the available resources.

AssemblyLoad.cs constructor:

        private readonly LoadMethod mLoadMethod;

        public AssemblyLoad(LoadMethod aLoadMethod)
        {
            mLoadMethod = aLoadMethod;
        }

Once the AssemblyLoad object is initialized in our Main() method, the AssemblyResolve event is hooked. When the application reaches a method that requires an external assembly which is not yet loaded (e.g. the Run() method), and the default probing fails, the runtime calls the ResolveEventHandler (e.g. the CurrentDomain_AssemblyResolve method inside AssemblyLoad.cs). Once invoked, CurrentDomain_AssemblyResolve first looks at the arguments being passed to determine which assembly the executing process (Public.Process.exe) is requesting.

     string[] tAssemblyName = aArguments.Name.Split(',');

We're only interested in the name of the requested assembly (not the version, etc) in the arguments being passed (i.e. 'Public.Dependency' of 'Public.Dependency, Version=1.0.0.0, Culture=neutral, PublicKeyToken=63e9b31d387f5c11'), so the argument is split by the comma delimiter. Once we know the name of the assembly the executing process is attempting to load, we look for it inside the manifest of the executing assembly.
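As a side note, splitting on the comma works fine, but the framework can also parse the display name for us. The sketch below shows both approaches side by side; RequestedName is a hypothetical class name, while AssemblyName and its Name property are part of System.Reflection.

```csharp
using System;
using System.Reflection;

// Hypothetical helper (not from the example project) comparing two ways of
// extracting the simple name from an assembly display name.
public static class RequestedName
{
    // The approach used in this article: everything before the first comma.
    public static string FromSplit(string fullName)
    {
        return fullName.Split(',')[0].Trim();
    }

    // Alternative: let AssemblyName parse the display name.
    public static string FromAssemblyName(string fullName)
    {
        return new AssemblyName(fullName).Name;
    }
}
```

Both return 'Public.Dependency' for the display name shown above. The AssemblyName route also exposes the version and culture if you ever need to match on more than the simple name.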

     Assembly tExecutingAssembly = Assembly.GetExecutingAssembly();
     string[] tAvailableResourcesFromManifest = tExecutingAssembly.GetManifestResourceNames();
     string tResourceAssemblyName = tExecutingAssembly.GetName().Name + ".Resources." + tAssemblyName[0] + ".dll";
     string tResourceSymbolName = tExecutingAssembly.GetName().Name + ".Resources." + tAssemblyName[0] + ".pdb";

     // determine if the resources are available
     string tAvailableAssembly = tAvailableResourcesFromManifest.FirstOrDefault(x => x == tResourceAssemblyName);
     string tAvailableSymbols = tAvailableResourcesFromManifest.FirstOrDefault(x => x == tResourceSymbolName);

Earlier, when opening the assembly with ilSpy, we observed how the resource names are modified when embedded; at this point we construct the names of the resources to match that naming convention. In this case, the name of the assembly containing the resources and '.Resources.' are prefixed, and the appropriate file extension for the library or symbol file is appended. After concatenating the expected resource name, the manifest of the executing assembly is checked for a matching resource (e.g. tAvailableResourcesFromManifest.FirstOrDefault(x => x == tResourceAssemblyName)).
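The name construction and lookup can be pulled into a couple of small functions, which makes the convention easy to test in isolation. This is only a sketch: ResourceNames, Build and Find are hypothetical names, and the '.Resources.' segment assumes the folder name used in this article.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not from the example project) that isolates the
// resource naming convention described above.
public static class ResourceNames
{
    // Builds "[host assembly].Resources.[requested assembly][extension]".
    public static string Build(string hostAssemblyName, string requestedName, string extension)
    {
        return hostAssemblyName + ".Resources." + requestedName + extension;
    }

    // Returns the matching manifest name, or null when it is not embedded.
    public static string Find(string[] manifestNames, string expectedName)
    {
        return manifestNames.FirstOrDefault(
            x => string.Equals(x, expectedName, StringComparison.Ordinal));
    }
}
```

For example, Build("Public.Process", "Public.Dependency", ".dll") yields 'Public.Process.Resources.Public.Dependency.dll', which is the name we saw in ilSpy.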

After determining if the manifest does indeed have the resource we're looking for, the following attempts to read out the resource:

     tAssemblyInputStream = tExecutingAssembly.GetManifestResourceStream(tAvailableAssembly);

and…

     tAssemblyArrayLength = (int)tAssemblyInputStream.Length;
     tAssemblyByteArray = new byte[tAssemblyArrayLength];
     tAssemblyInputStream.Read(tAssemblyByteArray, 0, tAssemblyArrayLength);

At this point, we have our assembly resource read into a byte array. A similar process is used to read out the symbol file.
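One thing to keep in mind: by contract, a single Stream.Read call is allowed to return fewer bytes than requested. That matters most if the stream is later swapped for something like a decompression stream, which may not report a Length at all. A minimal sketch of a more defensive read, using a hypothetical StreamReading helper that is not part of the example project:

```csharp
using System.IO;

// Hypothetical helper (not from the example project): reads a stream to the
// end without relying on Length or on a single Read call.
public static class StreamReading
{
    public static byte[] ReadAllBytes(Stream input)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            // CopyTo keeps reading until the source reports end of stream.
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

The same helper works unchanged for the assembly and the symbol file.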

     // load the assembly and the symbols if available
     if (tAssemblyInputStream != null && tSymbolInputStream != null)
          tLoadedAssembly = Assembly.Load(tAssemblyByteArray, tSymbolByteArray);
     else if (tAssemblyInputStream != null)
          tLoadedAssembly = Assembly.Load(tAssemblyByteArray);

The System.Reflection.Assembly.Load method returns a System.Reflection.Assembly object, which is the required return type for the AssemblyResolve event handler. The assembly, and optionally the symbols, are passed as byte arrays to Assembly.Load.

It's worth noting that during this custom assembly load process we're working with multiple streams. Commonly I'd wrap the streams in using statements to control their scope and disposal; however, since we're working with multiple streams (one of which, the symbol file, is optional), I've instead chosen to dispose of them in the finally block. Also, the error handling in this event is managed by the caller. If the resource isn't found, our method returns null, in which case the assembly binder will throw the exception that the dependency was not found. We could be more explicit about the reason why the resource wasn't found, but that might be specific to your application.
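For comparison, here is roughly what the using-statement variant could look like when the stream handling is pushed into a helper, so the optional symbol stream stops being awkward. This is a sketch under assumptions and not the AssemblyLoad.cs from the example project: EmbeddedResolver and ReadResource are hypothetical names, and the '.Resources.' prefix assumes the folder convention used in this article.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical alternative (not the example project's AssemblyLoad.cs): an
// AssemblyResolve handler that scopes each stream with a using statement.
public static class EmbeddedResolver
{
    public static Assembly Resolve(object sender, ResolveEventArgs args)
    {
        Assembly host = Assembly.GetExecutingAssembly();
        string prefix = host.GetName().Name + ".Resources." + new AssemblyName(args.Name).Name;

        byte[] rawAssembly = ReadResource(host, prefix + ".dll");
        if (rawAssembly == null)
        {
            // Not embedded: returning null lets the runtime report the bind failure.
            return null;
        }

        // The symbol file is optional; load it only when it was embedded.
        byte[] rawSymbols = ReadResource(host, prefix + ".pdb");
        return rawSymbols != null
            ? Assembly.Load(rawAssembly, rawSymbols)
            : Assembly.Load(rawAssembly);
    }

    // Returns the resource bytes, or null when no such resource exists.
    private static byte[] ReadResource(Assembly host, string resourceName)
    {
        using (Stream input = host.GetManifestResourceStream(resourceName))
        {
            if (input == null)
            {
                return null;
            }
            using (MemoryStream buffer = new MemoryStream())
            {
                input.CopyTo(buffer);
                return buffer.ToArray();
            }
        }
    }
}
```

It would be hooked the same way as before: AppDomain.CurrentDomain.AssemblyResolve += EmbeddedResolver.Resolve;. Whether this reads better than a single try/finally is largely a matter of taste.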

Using the Visual Studio debugger can be helpful in validating the assembly loading, but it's worth heading back to fuslogvw.exe and Process Explorer to see what was captured (if anything).

Dynamically loaded assembly, fuslogvw.exe output:

As you can see, the bind log indicates that the probing path(s) were examined, but the assembly (e.g. Public.Dependency.dll) was not found. That's because after it attempted to load from the default bind paths, it called the ResolveEventHandler which we defined in our application. In contrast, we can open up Process Explorer and take a look at the process:

Output of ProcessExplorer, with dependencies loaded:

Just like fuslogvw, there's no hint as to where the assemblies were loaded from, because they were loaded entirely from the executing process' resources; hence no file handle exists for either assembly. In this case, we can validate that the resources loaded correctly using the debugger or the Visual Studio output window:

The output window shows the successful load of the assemblies, Public.Dependency.dll, Public.SubDependency.dll, and the corresponding symbols. At this point, we've successfully embedded our assemblies into Public.Process.exe, and dynamically loaded them at runtime.
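Another way to confirm the result from inside the process is to enumerate the loaded assemblies and look at where each one came from. An assembly loaded from a byte array reports an empty Location, which matches what the tools showed us. A minimal sketch, with LoadedAssemblies as a hypothetical helper name:

```csharp
using System;
using System.Reflection;

// Hypothetical helper (not from the example project) that reports where each
// loaded assembly came from.
public static class LoadedAssemblies
{
    public static string Describe(Assembly assembly)
    {
        string name = assembly.GetName().Name;
        if (assembly.IsDynamic)
        {
            // Dynamic (emitted) assemblies have no file location to report.
            return name + " -> (dynamic assembly)";
        }

        string location = assembly.Location;
        return string.IsNullOrEmpty(location)
            ? name + " -> (no file on disk, typically loaded from a byte array)"
            : name + " -> " + location;
    }

    public static void PrintAll()
    {
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            Console.WriteLine(Describe(assembly));
        }
    }
}
```

Calling LoadedAssemblies.PrintAll() after Run() has touched DependencyClass should show the two dependencies without a file path, assuming they were resolved from the embedded resources.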

This gives us a starting point for your application. The sample project includes several different build configurations, which are as follows:

Debug: No modifications, default project compilation
Release: Merged project output using ilMerge, with examples of Public.SubDependency.dll being excluded (but copied to the output folder) (Part 2)
Dynamic: Builds the assemblies, copies them to the 'Resources' folder in Public.Process and embeds them as resources (Part 3)
Compression: Builds the assemblies, compresses them into the 'Resources' folder in Public.Process and embeds them as resources (Bonus)

The last build configuration demonstrates the flexibility of dynamic assembly loading. Included in the example project, the 'Compression' configuration makes use of the open source library DotNetZip (Ionic.Zip.Reduced.dll/pdb) and 7zip (for compressing the build output of Public.Dependency.dll and Public.SubDependency.dll from msbuild). The compression steps are included in the msbuild target file included in the project. The resources are decompressed at runtime when the AssemblyLoad constructor is supplied LoadMethod.CompressedAssembly:

From AssemblyLoad.cs (example project), extracting the compressed resource:

     tAssemblyInputStream = Compression.UnzipStream(tExecutingAssembly.GetManifestResourceStream(tAvailableAssembly));
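The example project's Compression.UnzipStream relies on DotNetZip and isn't reproduced here. To illustrate the idea without that dependency, the sketch below swaps in the framework's own System.IO.Compression.GZipStream, so it is a stand-in technique rather than the project's implementation. GzipSketch and its method names are hypothetical.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical stand-in (not the example project's Compression class): a gzip
// round trip using only the framework's GZipStream.
public static class GzipSketch
{
    // Compresses raw bytes, as a build step might before embedding.
    public static byte[] Compress(byte[] raw)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // The gzip stream must be closed before the bytes are complete.
            return output.ToArray();
        }
    }

    // Decompresses a stream (e.g. a manifest resource stream) into raw bytes.
    public static byte[] Decompress(Stream compressed)
    {
        using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

The decompressed bytes can then be handed to Assembly.Load exactly as in the uncompressed case.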

The 'Compression' build configuration uses PowerShell to compress the assembly (PowerShell is one of my favorite ways to supercharge MSBuild). In a later article, I'll cover the usage of PowerShell from inside MSBuild to satisfy your build / release management needs.