How to build a .NET obfuscator - Part II

This is a continuation of the series about writing obfuscators. You can read the first article here

We finished with string replacement and a primitive obfuscation runtime. Now it’s time to spice things up a bit. So far we have written relatively simple obfuscation techniques, which are quite trivial to undo. In this article I will explain how to transform control flow in a way that makes a reverse engineer’s life a bit harder.

Let’s start making the control flow harder to follow.

Simple condition modifications

The most basic way to confuse control flow is to inject fake logical calculations into existing conditional branches. For example, if we have code like this

if (x > 4)
{
    Console.WriteLine("This is under condition");
}

it can be transformed into either of the following

if (true && x > 4)
{
    Console.WriteLine("This is under condition");
}

if (false || x > 4)
{
    Console.WriteLine("This is under condition");
}

That looks silly, and written literally like this it would be trivial to undo. But instead of true and false you can inject more complicated expressions, for example Math.Log10(10.0) == 1.0 for true or Math.Log10(1) == 1.0 for false. If you are clever enough, you can even generate more complicated expressions as you go.
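As a rough illustration of "generating expressions as you go", here is a minimal sketch of a predicate pool. The OpaquePredicates class, its members and the extra expressions are hypothetical and not part of the article's code; it only produces C# source text, while a real obfuscator would emit the equivalent IL.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (an assumption for illustration): a pool of expressions
// whose value is known up front but is not written as a literal true/false.
static class OpaquePredicates
{
    // Each entry pairs the source text with a delegate that evaluates it.
    public static readonly List<(string Text, Func<bool> Eval)> AlwaysTrue = new()
    {
        ("Math.Log10(10.0) == 1.0", () => Math.Log10(10.0) == 1.0),
        ("Math.Abs(-3) == 3", () => Math.Abs(-3) == 3),
        ("Math.Max(2, 7) == 7", () => Math.Max(2, 7) == 7),
    };

    public static readonly List<(string Text, Func<bool> Eval)> AlwaysFalse = new()
    {
        ("Math.Log10(1) == 1.0", () => Math.Log10(1) == 1.0),
        ("Math.Log(1.0) == 1.0", () => Math.Log(1.0) == 1.0),
        ("Math.Sign(5) == -1", () => Math.Sign(5) == -1),
    };

    // "true && cond" and "false || cond" both leave the value of cond unchanged,
    // so either form can be prepended to an existing condition.
    public static string Wrap(string condition, Random random)
    {
        if (random.Next(2) == 0)
        {
            var truthy = AlwaysTrue[random.Next(AlwaysTrue.Count)];
            return $"{truthy.Text} && {condition}";
        }

        var falsy = AlwaysFalse[random.Next(AlwaysFalse.Count)];
        return $"{falsy.Text} || {condition}";
    }
}
```

Calling OpaquePredicates.Wrap("x > 4", new Random()) returns the original condition prefixed with one randomly chosen expression. Keeping an evaluating delegate next to each text makes it easy to verify that every entry in the pool really has the value it claims.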

So let’s try to inject fake conditions that do not affect branching.

Let’s look at the IL for the C# code presented earlier.

// if (x > 4)
IL_0000: ldarg.0
         ldc.i4.4
         ble.s IL_000e

// Console.WriteLine("This is under condition");
         ldstr "This is under condition"
         call void [System.Console]System.Console::WriteLine(string)
IL_000e:
// Some code after if

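Now let’s say we want to prepend 1.0 == Math.Log(1.0) || to the condition. Math.Log(1.0) is 0, so the comparison is always false and the original condition still decides the branch.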

// if (1.0 == Math.Log(1.0) || x > 4)
IL_0000: ldc.r8 1
         ldc.r8 1
         call float64 [System.Runtime]System.Math::Log(float64)
         beq.s IL_0039

         ldarg.0
         ldc.i4.4
         ble.s IL_0043

IL_0039:
// Console.WriteLine("This is under condition");
         ldstr "This is under condition"
         call void [System.Console]System.Console::WriteLine(string)
IL_0043:
// Some code after if

From the example you can see that we only need to inject 4 IL instructions to achieve this goal.

IL_0000: ldc.r8 1
         ldc.r8 1
         call float64 [System.Runtime]System.Math::Log(float64)
         beq.s IL_0039

The injection point is any conditional branch instruction, such as ble.s or bge.s, whose operands are loaded by ldarg, ldloc, or one of their variants in the two instructions right before it.

for (int i = 2; i < method.Body.Instructions.Count; i++)
{
    var instr = method.Body.Instructions[i];
    if (instr.IsConditionalBranch()
        && (method.Body.Instructions[i - 1].IsLdarg() || method.Body.Instructions[i - 1].IsLdloc()
        || method.Body.Instructions[i - 2].IsLdarg() || method.Body.Instructions[i - 2].IsLdloc()))
    {
        var nextInstruction = method.Body.Instructions[i + 1];
        // ldc.r8 1
        var const1 = new Instruction(
            OpCodes.Ldc_R8,
            1.0);
        method.Body.Instructions.Insert(i - 2, const1);
        // ldc.r8 1
        var const1_2 = new Instruction(
            OpCodes.Ldc_R8,
            1.0);
        method.Body.Instructions.Insert(i - 1, const1_2);
        // call Math::Log(double)
        var mathLog = new Instruction(
            OpCodes.Call,
            module.Import(typeof(Math).GetMethod("Log", [typeof(double)])));
        method.Body.Instructions.Insert(i, mathLog);
        // beq.s <first instruction of the branch body>
        var breqNext = new Instruction(
            OpCodes.Beq_S,
            nextInstruction);
        method.Body.Instructions.Insert(i + 1, breqNext);
        i = i + 4; // Skip the instructions we just added
    }
}

As you can see, this is a bit tedious and easy to get wrong, but at the same time it is the simplest thing you can do without involving the more complicated machinery that I will show now.

Dead code insertion

This is a pretty simple technique. The whole idea is the following: insert any valid IL sequence that never consumes anything from the stack except what it pushed itself, leaves the stack unchanged when it finishes, and has no external side effects. Examples of side effects are overflow, division by zero, or any other runtime exception. For additional confusion, we can add unused local variables and store intermediate results in them.

The most trivial example is to push a constant onto the stack and pop it right away. Again, this only demonstrates the technique; you can improve on it as much as you want.

var injectionPoint = random.Next(method.Body.Instructions.Count);

var const1 = new Instruction(
    OpCodes.Ldc_R8,
    1.0);
method.Body.Instructions.Insert(injectionPoint, const1);
var pop = new Instruction(OpCodes.Pop);
method.Body.Instructions.Insert(injectionPoint + 1, pop);
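The stack requirement for dead code (never pop a value you did not push, and leave the depth unchanged) can be checked mechanically before a sequence is injected. The sketch below is a toy model, not dnlib code: ToyInstruction and DeadCodeCheck are hypothetical names, and the pop/push counts are supplied by hand instead of being read from opcode metadata.

```csharp
using System;
using System.Collections.Generic;

// Toy model (an assumption for illustration): an instruction is described
// only by how many stack slots it pops and how many it pushes.
record ToyInstruction(string Name, int Pops, int Pushes);

static class DeadCodeCheck
{
    // A sequence is stack-neutral when it never pops a value it did not
    // push itself and leaves the stack depth unchanged at the end.
    public static bool IsStackNeutral(IEnumerable<ToyInstruction> sequence)
    {
        int depth = 0;
        foreach (var instr in sequence)
        {
            depth -= instr.Pops;
            if (depth < 0)
            {
                // The sequence would consume a value owned by the surrounding code.
                return false;
            }

            depth += instr.Pushes;
        }

        return depth == 0;
    }
}
```

With this model, the pair ldc.r8 (pops 0, pushes 1) followed by pop (pops 1, pushes 0) passes the check, while a lone pop or a lone ldc.r8 fails it. The check says nothing about side effects such as exceptions, which still have to be ruled out separately.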

A more interesting example is inserting a conditional jump to a random location, guarded by a condition constructed so that it never triggers. That complicates the deobfuscator’s basic block (BB) analysis and makes the logic harder to track. To implement this properly, we should introduce the concept of basic blocks.

Basic blocks

Let’s define what a basic block is. A basic block is a sequence of instructions that can be entered only via its first instruction and left only via its last instruction.

A basic block begins at

The function entry point
An instruction that is a jump target (including the targets of switch)
The instruction that follows a branch
The beginning of a protected block
An exception handler or finally handler

A basic block ends with

The end of a protected block
One of the instructions ret, br, bgt, ble, …, bXXX, switch

Let’s see what this looks like for a simple C# function

static void Worker(int x)
{
    if (x > 4)
    {
        Console.WriteLine("Hello, Conditions!");
    }
} 

The compiled IL of this function:

IL_0000: ldarg.0
IL_0001: ldc.i4.4
IL_0002: ble.s IL_000e

IL_0004: ldstr "Hello, Conditions!"
IL_0009: call void [System.Console]System.Console::WriteLine(string)

IL_000e: ret

and the basic blocks for the function look like this

flowchart TD
    A[IL_0000: ldarg.0
IL_0001: ldc.i4.4
IL_0002: ble.s IL_000e] --> B("IL_0004: ldstr #quot;Hello, Conditions!#quot;
IL_0009: call void [System.Console]System.Console::WriteLine(string)")
    B --> C(IL_000e: ret)
    A -.-> C

For the analysis, let’s use the following classes

class BasicBlock
{
    public List<Instruction> Instructions { get; set; } 
        = new List<Instruction>();
}

class FlowGraph
{
    public List<BasicBlock> BasicBlocks { get; set; } 
        = new List<BasicBlock>();

    public FlowGraph(MethodDef method)
    {
        // Some magic which I will show below.
    }
}

This is very bare-bones, but again, it illustrates the concept and is not a foolproof implementation.

The implementation can be split into 2 parts

Find the starts of basic blocks using a linear scan, checking br/ret/jump targets
Fill the basic blocks with instructions

Finding the starts of basic blocks is simple: walk through the list of instructions and record where each basic block starts.

// Note: switch, leave and throw are not handled in this simplified scan.
List<int> basicBlocksStart = new() { 0 };
for (int i = 0; i < method.Body.Instructions.Count; i++)
{
    var instr = method.Body.Instructions[i];
    if (instr.IsBr() || instr.IsConditionalBranch() || instr.OpCode == OpCodes.Ret)
    {
        // The target of a conditional or unconditional branch starts a block.
        if (instr.Operand is Instruction target)
        {
            var instructionIndex = method.Body.Instructions.IndexOf(target);
            basicBlocksStart.Add(instructionIndex);
        }

        // The instruction after a branch or ret starts a block as well.
        // It is not skipped, because it may itself be a branch.
        if (i + 1 < method.Body.Instructions.Count)
        {
            basicBlocksStart.Add(i + 1);
        }
    }
}

Next comes filling the basic blocks with instructions. Using the previously collected starts, copy everything from the start of one block up to the start of the next block into the current block.

basicBlocksStart = basicBlocksStart.Distinct().ToList();
basicBlocksStart.Sort();
for (int i = 0; i < basicBlocksStart.Count; i++)
{
    var block = new BasicBlock();
    var finish = i == basicBlocksStart.Count - 1 
        ? method.Body.Instructions.Count 
        : basicBlocksStart[i + 1];
    for (int j = basicBlocksStart[i]; j < finish; j++)
    {
        block.Instructions.Add(method.Body.Instructions[j]);
    }

    BasicBlocks.Add(block);
}
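To sanity-check the two steps without loading an assembly, the same scan can be run on a toy model of the Worker method shown earlier. This is only a sketch: ToyIl and ToyBlocks are hypothetical names, instructions are reduced to an opcode name plus an optional target index, and any instruction with a target is treated as a branch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy instruction (an assumption for illustration): Target is the index
// of the branch target inside the body, or -1 for non-branch instructions.
record ToyIl(string OpCode, int Target = -1);

static class ToyBlocks
{
    // The Worker method from the article, with ble.s pointing at ret (index 5).
    public static readonly ToyIl[] Worker =
    {
        new("ldarg.0"), new("ldc.i4.4"), new("ble.s", 5),
        new("ldstr"), new("call"), new("ret"),
    };

    // Step 1: collect the sorted start index of every basic block.
    public static List<int> FindStarts(IReadOnlyList<ToyIl> body)
    {
        var starts = new SortedSet<int> { 0 };
        for (int i = 0; i < body.Count; i++)
        {
            bool endsBlock = body[i].Target >= 0 || body[i].OpCode == "ret";
            if (!endsBlock) continue;
            if (body[i].Target >= 0) starts.Add(body[i].Target);
            if (i + 1 < body.Count) starts.Add(i + 1);
        }

        return starts.ToList();
    }

    // Step 2: slice the body into blocks between consecutive starts.
    public static List<List<ToyIl>> Split(IReadOnlyList<ToyIl> body)
    {
        var starts = FindStarts(body);
        var blocks = new List<List<ToyIl>>();
        for (int i = 0; i < starts.Count; i++)
        {
            int finish = i == starts.Count - 1 ? body.Count : starts[i + 1];
            blocks.Add(body.Skip(starts[i]).Take(finish - starts[i]).ToList());
        }

        return blocks;
    }
}
```

For ToyBlocks.Worker the starts come out as 0, 3 and 5, which gives three blocks of 3, 2 and 1 instructions and matches the flowchart above.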

Now we need a way to store our graph back into the method body

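public void Save(MethodDef method)
{
    method.Body.Instructions.Clear();
    foreach (var block in BasicBlocks)
    {
        foreach (var instr in block.Instructions)
        {
            method.Body.Instructions.Add(instr);
        }
    }

    // Inserted blocks can push short branches (like beq.s) out of range,
    // so widen every branch and let dnlib shrink the ones that still fit.
    method.Body.SimplifyBranches();
    method.Body.OptimizeBranches();
}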

This is super basic machinery, but you can use it in simple tools now.

The process of injecting a fake condition is the following

Construct the basic block graph
Pick a random basic block as the place for the injection
Pick a random basic block as the target of the fake jump
Inject the fake basic block

var flowGraph = new FlowGraph(method);
if (flowGraph.BasicBlocks.Count == 1)
    continue;
// ldc.r8 1
var const1 = new Instruction(
    OpCodes.Ldc_R8,
    1.0);
// ldc.r8 1
var const1_2 = new Instruction(
    OpCodes.Ldc_R8,
    1.0);
// call Math::Log(double)
var mathLog = new Instruction(
    OpCodes.Call,
    module.Import(typeof(Math).GetMethod("Log", [typeof(double)])));
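// Pick a random block to insert before and a random block as the fake jump target
var randomBB = Random.Shared.Next(flowGraph.BasicBlocks.Count - 1);
var randomTarget = Random.Shared.Next(flowGraph.BasicBlocks.Count - 1);
var fakeInstruction = flowGraph.BasicBlocks[randomTarget].Instructions[0];
// beq.s <first instruction of the target block>; never taken, since 1.0 != Math.Log(1.0)
var breqNext = new Instruction(
    OpCodes.Beq_S,
    fakeInstruction);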
flowGraph.BasicBlocks.Insert(randomBB, new BasicBlock()
{
    Instructions =
    {
        const1,
        const1_2,
        mathLog,
        breqNext
    }
});
flowGraph.Save(method);

As you can see, there is no instruction index manipulation like the one I did in the “simple” condition insertion.

Now let’s insert a dead code block right after the fake condition

// was before
flowGraph.BasicBlocks.Insert(randomBB, new BasicBlock()
{
    Instructions =
    {
        const1,
        const1_2,
        mathLog,
        breqNext
    }
});
// added now
flowGraph.BasicBlocks.Insert(randomBB + 1, new BasicBlock()
{
    Instructions =
    {
        new Instruction(OpCodes.Ldc_I4_0),
        new Instruction(OpCodes.Pop),
    }
});

// was before
flowGraph.Save(method);

That’s it. So basically, confusing control flow is now a matter of graph modifications.

An example of the output looks like this

static void Worker(int x)
{
    if (x > 4)
    {
        if (1.0 != Math.Log(1.0))
        {
            _ = 0;
        }
        Console.WriteLine("Hello, Conditions!");
    }
}

That’s it for today. Again, the final code can be found in the supplementary repo