Generating a .NET Emulator

Hello, LLM users. I want to build an IL/.NET VM emulator, for example for malware sandboxing. My plan is to use fuzzing as the general quality-control mechanism for the emulator’s execution.

I see the work as the following semi-automated processes:

Progress monitoring process
Implementation process
Implementation validation process
Bug fixing process
Manual recovery process

The overall process can be visualized as follows:

flowchart TD
    classDef llmTask fill:orange,stroke:#333,stroke-width:4px,color:#333;
    classDef llmGeneratedScript fill:lightGreen,stroke:#333,stroke-width:4px,color:black;
    classDef humanCode fill:green,stroke:#333,stroke-width:4px,color:white;
    START([Start])
    FINISH([End])

    %% General flow
    START --> Progress
    Progress -->|Implement new instruction| Impl 
    Progress -->|No unimplemented instructions| FINISH 
    Impl -->|Success| Validation
    Impl -->|Failed to implement| RecoveryProcess
    Validation -->|Errors found| BugFix
    BugFix -->|Success| Validation
    BugFix -->|Failed to implement| RecoveryProcess
    Validation -->|Success| Progress
    Validation -->|Failed to implement| RecoveryProcess

Progress Monitoring Process

I create a list of all IL instructions, either manually or with a simple LLM-generated script. This list serves as a manual checklist of what has been done. Besides the mnemonic, each entry carries a short generated description of what the instruction does; I don’t plan to include the full specification of the instruction’s behavior.
Once all tests and fuzzing pass for an instruction, I consider it implemented and mark it in the checklist.

Implementation Process

The LLM is prompted with the instruction to implement and a description of that instruction’s semantic behavior.
As a smoke test, the generated implementation is run against the test projects. If everything passes, the draft implementation is considered complete.
After this stage, a script must record the costs incurred in a financial audit file.
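
The cost-recording step could look roughly like the following Python sketch. The CSV columns, the US-dollar unit, and the function names `record_cost` and `total_cost` are assumptions I made up for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column layout for the audit file.
AUDIT_FIELDS = ["timestamp", "stage", "instruction", "cost_usd"]


def record_cost(audit_file: Path, stage: str, instruction: str,
                cost_usd: float) -> None:
    """Append one cost record, writing the header if the file is new."""
    is_new = not audit_file.exists()
    with audit_file.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=AUDIT_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "stage": stage,              # e.g. "implementation", "validation"
            "instruction": instruction,  # IL mnemonic being worked on
            "cost_usd": cost_usd,
        })


def total_cost(audit_file: Path) -> float:
    """Sum all recorded costs, e.g. to enforce a budget."""
    with audit_file.open(newline="") as f:
        return sum(float(row["cost_usd"]) for row in csv.DictReader(f))
```

The same `record_cost` call would be reused at the end of the validation and bug fixing stages, with a different `stage` value.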

Implementation Validation Process

I manually write a fuzzer that generates sequences of IL instructions drawn from the instructions that are already implemented or currently being implemented. The fuzzer runs each generated sequence on both the emulator and the standard .NET runtime and compares the results. Any difference is a failure: it is registered, and a reduced test case is built from the generated sequence and written to a file in some programming language.
If the draft implementation has errors and a test case file was generated, the test case is added to the emulator’s test project, and the bug fixing phase starts.
If the draft implementation has no errors, the instruction is considered successfully implemented, and the next instruction can be selected.
After this stage, a script must record the costs incurred in a financial audit file.
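
The compare-and-reduce idea can be sketched in Python as below. This is a toy under loud assumptions: a program is just a list of mnemonic strings, the two executors are passed in as plain callables standing in for the emulator and the .NET runtime, and the reducer is a simple greedy one-instruction-at-a-time removal rather than full delta debugging. A real fuzzer would also have to generate verifiable IL (balanced stack, valid operands) and treat exceptions as part of the compared result.

```python
import random
from typing import Callable, Sequence

Program = list[str]                     # toy model: a list of IL mnemonics
Executor = Callable[[Program], object]  # returns any comparable result


def generate_program(rng: random.Random, allowed: Sequence[str],
                     length: int) -> Program:
    """Pick instructions only from the implemented / in-progress set."""
    return [rng.choice(allowed) for _ in range(length)]


def differs(program: Program, emulator: Executor, reference: Executor) -> bool:
    """A mismatch between emulator and reference runtime is a failure."""
    return emulator(program) != reference(program)


def reduce_program(program: Program, emulator: Executor,
                   reference: Executor) -> Program:
    """Greedily drop single instructions while the mismatch persists."""
    current = list(program)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if differs(candidate, emulator, reference):
                current = candidate  # still failing, keep the smaller one
                changed = True
                break
    return current
```

The result is minimal only in the sense that removing any single remaining instruction makes the mismatch disappear; it is not guaranteed to be the globally smallest failing sequence.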

Bug Fixing Process

The LLM is instructed to fix all bugs revealed by running the emulator’s test project.
The process ends when all errors in the test project are fixed or after 10 fix attempts, whichever comes first.
After this stage, a script must record the costs incurred in a financial audit file.
The Implementation Validation Process must then be restarted.
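
A bounded fix loop along these lines might look like the Python sketch below. The callables `run_tests` and `ask_llm_to_fix` are hypothetical placeholders for running the emulator’s test project and invoking the LLM; the returned strings `"validate"` and `"recovery"` are also just illustrative labels for the next stage.

```python
from typing import Callable

MAX_FIX_ITERATIONS = 10  # iteration cap described in the text


def run_bug_fix_loop(run_tests: Callable[[], list[str]],
                     ask_llm_to_fix: Callable[[list[str]], None],
                     max_iterations: int = MAX_FIX_ITERATIONS) -> str:
    """Ask the LLM to fix failures until tests pass or the cap is hit.

    Returns "validate" if all tests pass, otherwise "recovery".
    """
    failures = run_tests()
    for _ in range(max_iterations):
        if not failures:
            return "validate"
        ask_llm_to_fix(failures)  # one fix attempt
        failures = run_tests()    # re-run the test project
    # Cap reached: decide based on the state after the last attempt.
    return "validate" if not failures else "recovery"
```

Keeping the cap as a parameter makes it easy to experiment with a smaller budget for cheap instructions.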

Manual Recovery Process

As a safeguard, the system should stop if a task runs for too long.
The system operator (me) must then investigate and, where possible, document in the tracker what went wrong and why.
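
One way to sketch this safeguard in Python is to run each automated step as a subprocess with a timeout and append a stub entry to the tracker when it is stopped. The tab-separated tracker file and the function name `run_with_timeout` are assumptions for illustration; logging non-zero exit codes as well is my own extension, matching the "Failed to implement" arrows in the diagrams.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def run_with_timeout(cmd: list[str], timeout_s: float, tracker: Path) -> bool:
    """Run one pipeline step; on timeout or failure, log it and stop.

    Returns True if the step finished successfully within the limit.
    """
    try:
        # subprocess.run kills the child process when the timeout expires.
        subprocess.run(cmd, timeout=timeout_s, check=True)
        return True
    except subprocess.TimeoutExpired:
        reason = f"timed out after {timeout_s}s"
    except subprocess.CalledProcessError as exc:
        reason = f"exited with code {exc.returncode}"

    # Leave a stub entry for the operator to complete by hand.
    stamp = datetime.now(timezone.utc).isoformat()
    with tracker.open("a") as f:
        f.write(f"{stamp}\t{' '.join(cmd)}\t{reason}\tTODO: operator notes\n")
    return False
```

The caller would halt the whole pipeline when this returns `False`, leaving the diagnosis to the operator.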

The entire process can be visualized as follows.

flowchart TD
    classDef llmTask fill:orange,stroke:#333,stroke-width:4px,color:#333;
    classDef llmGeneratedScript fill:lightGreen,stroke:#333,stroke-width:4px,color:black;
    classDef humanCode fill:green,stroke:#333,stroke-width:4px,color:white;
    START([Start])
    FINISH([End])

    %% Progress control
    subgraph Progress["Progress Monitoring Process"]
        P1["Generate a list of all IL instructions
(manually or via LLM script)"]:::llmGeneratedScript
        P2["For each instruction:
- mnemonic
- short description of behavior"]:::llmGeneratedScript
        P3{"All tests and fuzzing passed?"}
        P4["Mark instruction as implemented
in checklist"]

        P1 --> P2 --> P3
        P3 -->|Yes| P4
    end

    %% Implementation
    subgraph Impl["Implementation Process"]
        I1["Pass to LLM:
- instruction
- semantic description"]
        I2["Run smoke tests
on test projects"]:::llmTask
        I3{"Smoke tests successful?"}
        I4["Draft implementation complete"]
        I5["Record financial costs
in financial audit file"]:::llmGeneratedScript

        I1 --> I2 --> I3
        I3 -->|Yes| I4 --> I5
    end

    %% Validation
    subgraph Validation["Implementation Validation Process"]
        V1["Fuzzer generates IL instruction sequence"]:::humanCode
        V2["Compare execution:
- emulator
- .NET runtime"]:::humanCode
        V3{"Any differences?"}
        V4["Register error"]:::humanCode
        V5["Build minimized test case"]:::humanCode
        V6["Write test case to file"]:::humanCode
        V7{"Draft implementation has errors?"}
        V8["Add test case
to emulator test project"]:::llmTask
        V9["Instruction implementation successful"]
        V10["Select next instruction"]:::llmTask
        V11["Record financial costs
in financial audit file"]:::llmGeneratedScript

        V1 --> V2 --> V3

        V3 -->|Yes| V4 --> V5 --> V6 --> V7
        V3 -->|No| V7

        V7 -->|Yes| V8
        V7 -->|No| V9 --> V10 --> V11
    end

    %% Bug fix
    subgraph BugFix["Bug Fixing Process"]
        B1["LLM receives instruction
to fix found bugs"]
        B2["Run fix iteration cycle"]:::llmTask
        B3{"All errors fixed?"}
        B4{"Reached 10 iterations?"}
        B5["Complete fix process"]
        B6["Record financial costs
in financial audit file"]:::llmGeneratedScript
        B7["Return to implementation
validation process"]
        B8{"All errors fixed?"}
        
        B1 --> B2 --> B3
        B3 -->|Yes| B5
        B3 -->|No| B4
        B4 -->|No| B2
        B4 -->|Yes| B5

        B5 --> B8 -->|Yes| B6 --> B7
        B8 -->|No| RecoveryProcess[Recovery Process]
    end

    %% General flow
    START --> Progress
    Progress -->|Implement new instruction| Impl 
    Progress -->|No unimplemented instructions| FINISH 
    Impl -->|Success| Validation
    Impl -->|Failed to implement| RecoveryProcess
    Validation -->|Errors found| BugFix
    BugFix -->|Success| Validation
    BugFix -->|Failed to implement| RecoveryProcess
    Validation -->|Success| Progress
    Validation -->|Failed to implement| RecoveryProcess


I’m interested in what else could be added while keeping the system understandable and manageable with minimal LLM loops. Even with all of this, I believe that in complex cases the system will very likely break in unknown ways and require manual fixing.