Write-Up: Misc - MLsteal from Google CTF 2022

Challenge

You are given a compressed Docker file system, which contains everything needed to run the neural network language model.

The script to run everything is in the ctf subfolder and looks like this:

#!/usr/bin/python3

import sys
import gzip
import pickle
import numpy as np

# Import our fancy neural network
from mlsteal import sample


if __name__ == "__main__":
    # Load the model weights
    weights = np.frombuffer(gzip.open("weights.gz","rb").read(), dtype=np.float32)

    # Get model prompt from first argument
    if len(sys.argv) == 1:
        print("Please pass one argument containing a string to prompt the language model")
        exit(1)
    prompt = bytes(sys.argv[1],'ascii')
    x = np.array(list(prompt), dtype=np.uint64)

    # Run neural network on the prompt
    y = sample(x, weights)

    # Print to stdout
    print(b"".join(np.array(sample(x, weights),dtype=np.uint8)))

The other piece of information we have is that the phrase of interest in the training set begins with “Alice Bobson’s password is”.

Now, if we add to this the fact that we know the flag format, which should be CTF{[a-zA-Z0-9_\-]+}, we can start doing some magic:

First, prompt the neural network with the already known input, plus one extra possible character.

Our first input to test would be: “Alice Bobson’s password is CTF{a”
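The set of possible next characters follows directly from the flag format. Here is a minimal sketch of how the candidate prompts could be generated; the names FLAG_ALPHABET, candidate_prompts and is_complete_flag are my own and not part of the challenge. Note that the challenge script encodes the prompt as ASCII, so the sketch uses a plain ASCII apostrophe rather than the typographic one shown in this write-up.

```python
import re
import string

# Characters allowed inside the braces by the format CTF{[a-zA-Z0-9_\-]+}
FLAG_ALPHABET = string.ascii_letters + string.digits + "_-"
FLAG_RE = re.compile(r"CTF\{[a-zA-Z0-9_\-]+\}")


def candidate_prompts(known: str) -> list:
    """Return the known prefix extended by each possible next character."""
    return [known + ch for ch in FLAG_ALPHABET]


def is_complete_flag(text: str) -> bool:
    """Return True if the text contains a string matching the full flag format."""
    return FLAG_RE.search(text) is not None


if __name__ == "__main__":
    prompts = candidate_prompts("Alice Bobson's password is CTF{")
    print(len(prompts), "prompts, starting with:", prompts[0])
```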

This first prompt returned nothing interesting, but the next step is to iterate over all possible characters (and possibly to query each candidate multiple times, since the output can differ between runs).

This makes it very likely that at least a part of the flag gets returned by our neural network.
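The probing step could be scripted roughly as below. This is a sketch under several assumptions: the script path "./query.py" is a placeholder (the write-up does not name the file), the script is assumed to echo the prompt followed by the model's continuation, and its output is assumed to be the printed bytes literal that print() produces for a bytes object. The probe function takes the query function as a parameter so it can be tried against a stand-in model first.

```python
import ast
import string
import subprocess
import sys
from collections import Counter
from typing import Callable, Dict

ALPHABET = string.ascii_letters + string.digits + "_-"


def parse_output(stdout: str) -> str:
    """Turn the printed bytes literal (e.g. "b'abc'") back into text."""
    raw = ast.literal_eval(stdout.strip())
    return raw.decode("ascii", errors="replace")


def query_script(prompt: str, script_path: str = "./query.py") -> str:
    """Run the challenge script once. `script_path` is a hypothetical name."""
    proc = subprocess.run(
        [sys.executable, script_path, prompt],
        capture_output=True, text=True, check=True,
    )
    return parse_output(proc.stdout)


def probe(prefix: str, query: Callable[[str], str],
          tries: int = 3) -> Dict[str, Counter]:
    """Query every candidate next character `tries` times and tally the outputs."""
    results = {}
    for ch in ALPHABET:
        results[ch] = Counter(query(prefix + ch) for _ in range(tries))
    return results
```

Calling probe("Alice Bobson's password is CTF{", query_script) would then return, for each candidate character, a tally of what the model printed.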

I printed out the outputs for each given input with an automated script, but one could certainly automate the flag candidate collection process completely.
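One way the candidate collection could be automated is to pull the text after "CTF{" out of every output and count which tails recur once the guessed first character is ignored. The sketch below is not the script I actually used; the function names and the sample outputs in the usage example are illustrative.

```python
import re
from collections import Counter

# Captures the run of flag characters that follows "CTF{"
FRAGMENT_RE = re.compile(r"CTF\{([a-zA-Z0-9_\-]+)")


def collect_fragments(outputs) -> Counter:
    """Tally the flag fragments (text after 'CTF{') found in model outputs."""
    counts = Counter()
    for text in outputs:
        for fragment in FRAGMENT_RE.findall(text):
            counts[fragment] += 1
    return counts


def common_tails(counts: Counter, skip: int = 1) -> Counter:
    """Tally fragments without their first `skip` characters to expose shared tails."""
    tails = Counter()
    for fragment, n in counts.items():
        if len(fragment) > skip:
            tails[fragment[skip:]] += n
    return tails


if __name__ == "__main__":
    # Made-up outputs, only to show the shape of the data
    sample_outputs = [
        "Alice Bobson's password is CTF{a3m0r1z4t20n-",
        "Alice Bobson's password is CTF{b3m0r1z4t20n-",
        "Alice Bobson's password is CTF{cxyz",
    ]
    print(common_tails(collect_fragments(sample_outputs)).most_common(1))
```

A tail that keeps showing up regardless of the first character is a strong hint that the model is reciting memorized text.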

After one iteration of tests, a very common output was “Alice Bobson’s password is CTF{*3m0r1z4t20n-”, where * stands for many different characters.

This looked very much like leetspeak for “memorization”, so I ran the model a couple of times each with “Alice Bobson’s password is CTF{m” and “Alice Bobson’s password is CTF{M”. The first prompt produced the partial flag more often, so a lowercase m is more likely to be the actual first character of the flag.

From there, we simply repeat the process a couple of times until the model returns the full flag: CTF{m3m0r1z4t10n-1S-4LL-y0u-N33D}
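The repeated process could in principle be wrapped in a loop. The following is only a sketch of that idea, not the procedure I followed by hand: it assumes the model echoes the prompt before its continuation, and the voting rule (keep the continuation that is returned most often and is longer than the one character we supplied) is my own simplification. The `query` parameter is any function that maps a prompt to the model's output text.

```python
import re
import string
from collections import Counter
from typing import Callable

ALPHABET = string.ascii_letters + string.digits + "_-"
# Leading run of flag characters, optionally closed by "}"
TAIL_RE = re.compile(r"[a-zA-Z0-9_\-]+\}?")


def recover_flag(prefix: str, query: Callable[[str], str],
                 tries: int = 3, max_rounds: int = 32) -> str:
    """Greedily extend `prefix` with the most frequently returned continuation."""
    known = prefix
    for _round in range(max_rounds):
        votes = Counter()
        for ch in ALPHABET:
            for _try in range(tries):
                out = query(known + ch)
                if not out.startswith(known):
                    continue
                match = TAIL_RE.match(out[len(known):])
                # Only count outputs where the model added flag-like text
                # beyond the single character we supplied ourselves.
                if match and len(match.group(0)) > 1:
                    votes[match.group(0)] += 1
        if not votes:
            break  # nothing extended the prefix; inspect the outputs by hand
        known += votes.most_common(1)[0][0]
        if known.endswith("}"):
            break
    return known
```

One limitation of this sketch: if the only missing character is the closing brace, no candidate gathers votes and the loop stops early, so the result should still be checked against the flag format by hand.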