CTF Assembly Challenges


set register

Description: In this level, you will be working with registers. You will be asked to modify or read from registers.

Please set the following:
    rdi = 0x1337

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov rdi, 0x1337

Build and run:
    gcc -nostdlib -o program program.s
    /challenge/run /home/hacker/program

set multiple registers

Description: In this level, you will be working with registers. You will be asked to modify or read from registers. In this level, you will work with multiple registers.

Please set the following:
    rax = 0x1337
    r12 = 0xCAFED00D1337BEEF
    rsp = 0x31337

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov rax, 0x1337
        mov r12, 0xCAFED00D1337BEEF
        mov rsp, 0x31337

Build and run:
    gcc -nostdlib -o program program.s
    /challenge/run /home/hacker/program

add to register

In this level, you will be working with registers. You will be asked to modify or read from registers.

We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax.

Many instructions exist in x86 that allow you to perform all the normal math operations on registers and memory. For shorthand, when we say A += B, it really means A = A + B. Here are some useful instructions:
    add reg1, reg2   <=>  reg1 += reg2
    sub reg1, reg2   <=>  reg1 -= reg2
    imul reg1, reg2  <=>  reg1 *= reg2
div is more complicated, and we will discuss it later.

Note: the second operand (reg2) can be replaced by a constant or memory location.

Do the following:
    Add 0x331337 to rdi

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        add rdi, 0x331337

Build and run:
    gcc -nostdlib -o program program.s
    /challenge/run /home/hacker/program

linear equation registers

In this level, you will be working with registers. You will be asked to modify or read from registers.

We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax.

Using your new knowledge, please compute the following:
    f(x) = mx + b, where:
        m = rdi
        x = rsi
        b = rdx
Place the result into rax.

Note: There is an important difference between mul (unsigned multiply) and imul (signed multiply) in terms of which registers are used. Look at the documentation on these instructions to see the difference. In this case, you will want to use imul.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        imul rdi, rsi
        add rdi, rdx
        mov rax, rdi

integer division

Division in x86 is more special than in normal math. Math here is called integer math, meaning every value is a whole number. As an example: 10 / 3 = 3 in integer math. Why? Because 3.33 is rounded down to an integer.

The relevant instructions for this level are:
    mov rax, reg1
    div reg2

Note: div is a special instruction that can divide a 128-bit dividend by a 64-bit divisor while storing both the quotient and the remainder, using only one register as an operand.

How does this complex div instruction work and operate on a 128-bit dividend (which is twice as large as a register)? For the instruction div reg, the following happens:
    rax = rdx:rax / reg
    rdx = remainder

rdx:rax means that rdx will be the upper 64 bits of the 128-bit dividend and rax will be the lower 64 bits of the 128-bit dividend. You must be careful about what is in rdx and rax before you call div.

Please compute the following:
    speed = distance / time, where:
        distance = rdi
        time = rsi
        speed = rax

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        xor rdx, rdx    # clear the upper 64 bits of the dividend
        mov rax, rdi    # lower 64 bits of the dividend = distance
        div rsi         # rax = quotient, rdx = remainder
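The rdx:rax behaviour described above can be checked with a small model. This is only a sketch of the unsigned 64-bit div semantics in Python; the function name div64 is hypothetical and not part of the challenge.

```python
MASK64 = (1 << 64) - 1

def div64(rdx, rax, divisor):
    """Model of unsigned `div` with a 64-bit operand.

    Returns (quotient, remainder), i.e. the new (rax, rdx).
    """
    if divisor == 0:
        raise ZeroDivisionError("div by zero faults on real hardware")
    dividend = (rdx << 64) | rax              # rdx:rax as one 128-bit value
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        raise OverflowError("quotient does not fit in rax (faults on real hardware)")
    return quotient, remainder

# With rdx cleared, this is ordinary 64-bit integer division.
print(div64(0, 10, 3))   # (3, 1)
# A stale 1 in rdx makes the dividend 2**64 + 10 instead of 10.
print(div64(1, 10, 3))
```

The second call shows why the solution clears rdx first: any leftover value there becomes the upper half of the dividend.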

modulo-operation

Modulo in assembly is another interesting concept! x86 allows you to get the remainder after a div operation. For instance: 10 / 3 results in a remainder of 1. The remainder is the same as modulo, which is also called the "mod" operator. In most programming languages, we refer to mod with the symbol %.

Please compute the following:
    rdi % rsi
Place the value in rax.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        xor rdx, rdx
        mov rax, rdi
        div rsi
        mov rax, rdx

efficient-modulo

It turns out that using the div operator to compute the modulo operation is slow! We can use a math trick to optimize the modulo operator (%). Compilers use this trick a lot. If we have x % y, and y is a power of 2, such as 2^n, the result will be the lower n bits of x. Therefore, we can use the lower register byte access to efficiently implement modulo!

Using only the following instruction(s):
    mov

Please compute the following:
    rax = rdi % 256
    rbx = rsi % 65536

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov rax, 0      # writing al leaves the upper 56 bits of rax untouched, so clear them first
        mov al, dil     # low 8 bits of rdi  = rdi % 256
        mov rbx, 0      # same for the upper 48 bits of rbx
        mov bx, si      # low 16 bits of rsi = rsi % 65536
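The power-of-two trick above amounts to keeping the low n bits. A minimal Python sketch of that identity (mod_pow2 is a hypothetical helper name):

```python
def mod_pow2(x, n):
    """Return x % 2**n by masking off everything above the low n bits."""
    return x & ((1 << n) - 1)

x = 0x1234567890ABCDEF
print(hex(mod_pow2(x, 8)))    # low byte, same as x % 256
print(hex(mod_pow2(x, 16)))   # low word, same as x % 65536
```

Reading dil or si is the register-level equivalent of these masks for n = 8 and n = 16.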

byte-extraction

In this level, you will be working with bit logic and operations. This will involve heavy use of directly interacting with bits stored in a register or memory location. You will also likely need to make use of the logic instructions in x86: and, or, not, xor.

Shifting bits around in assembly is another interesting concept! x86 allows you to 'shift' bits around in a register. Take, for instance, al, the lowest 8 bits of rax. The value in al (in bits) is:
    al = 10001010
If we shift once to the left using the shl instruction:
    shl al, 1
The new value is:
    al = 00010100
Everything shifted to the left, and the highest bit fell off while a new 0 was added to the right side. You can use this to do special things to the bits you care about. Shifting has the nice side effect of doing quick multiplication (by 2) or division (by 2), and can also be used to compute modulo.

Here are the important instructions:
    shl reg1, reg2  <=>  Shift reg1 left by the amount in reg2
    shr reg1, reg2  <=>  Shift reg1 right by the amount in reg2
Note: 'reg2' can be replaced by a constant or memory location.

Using only the following instructions:
    mov, shr, shl

Please perform the following:
    Set rax to the 5th least significant byte of rdi.
For example:
    rdi = | B7 | B6 | B5 | B4 | B3 | B2 | B1 | B0 |
    Set rax to the value of B4

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov rax, rdi    # work on a full copy so all of rax is defined
        shl rax, 24     # push B7..B5 off the top; B4 is now the highest byte
        shr rax, 56     # bring B4 down to the lowest byte, zero-filling the rest
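To see why a left shift by 24 followed by a right shift by 56 isolates B4, the shifts can be modelled with explicit register widths. This is a sketch; shl, shr and extract_b4 are hypothetical helper names.

```python
def shl(value, count, width=64):
    """Logical left shift inside a register of `width` bits (high bits fall off)."""
    return (value << count) & ((1 << width) - 1)

def shr(value, count, width=64):
    """Logical right shift inside a register of `width` bits (zeros shift in)."""
    return (value & ((1 << width) - 1)) >> count

def extract_b4(rdi):
    """5th least significant byte of rdi, using only shifts."""
    return shr(shl(rdi, 24), 56)

# The al example from the text: 10001010 shifted left once in an 8-bit register.
print(format(shl(0b10001010, 1, width=8), "08b"))   # 00010100
# B7..B0 = 07 06 05 04 03 02 01 00, so B4 is 0x04.
print(hex(extract_b4(0x0706050403020100)))          # 0x4
```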

bitwise-and

Without using the following instructions:
    mov, xchg

Please perform the following:
    Set rax to the value of (rdi AND rsi)

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        and rdi, rsi
        xor rax, rax
        or rax, rdi

check-even

Using only the following instructions:
    and
    or
    xor

Implement the following logic:
    if x is even then
        y = 1
    else
        y = 0
Where:
    x = rdi
    y = rax

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        xor rax, rax
        or rax, rdi
        and rax, 1
        xor rax, 1
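The solution boils down to isolating the lowest bit and flipping it. A minimal Python sketch of the same logic (is_even_flag is a hypothetical name):

```python
def is_even_flag(x):
    """Return 1 if x is even, else 0, using only AND and XOR."""
    low_bit = x & 1       # 1 for odd values, 0 for even values
    return low_bit ^ 1    # flip it so even maps to 1

print([is_even_flag(v) for v in range(6)])   # [1, 0, 1, 0, 1, 0]
```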

memory-increment

Please perform the following:
    Place the value stored at 0x404000 into rax.
    Increment the value stored at the address 0x404000 by 0x1337.
Make sure the value in rax is the original value stored at 0x404000 and make sure that [0x404000] now has the incremented value.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov rax, [0x404000]
        add qword ptr [0x404000], 0x1337

memory-size-access

Recall the following: the breakdown of the names of memory sizes:
    Quad Word   = 8 bytes = 64 bits
    Double Word = 4 bytes = 32 bits
    Word        = 2 bytes = 16 bits
    Byte        = 1 byte  = 8 bits

In x86_64, you can access each of these sizes when dereferencing an address, just like using bigger or smaller register accesses:
    mov al, [address]   <=>  moves the least significant byte from address into al
    mov ax, [address]   <=>  moves the least significant word from address into ax
    mov eax, [address]  <=>  moves the least significant double word from address into eax
    mov rax, [address]  <=>  moves the full quad word from address into rax

The solution below reads the same address, 0x404000, at each size: a byte into al, a word into bx, a double word into ecx, and a quad word into rdx.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov al, [0x404000]
        mov bx, [0x404000]
        mov ecx, [0x404000]
        mov rdx, [0x404000]
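As a rough illustration of the four access sizes, the sketch below reads 1, 2, 4 and 8 bytes from the start of the same little-endian buffer. The sample value and the load helper are made up for the example.

```python
def load(memory, offset, size):
    """Read `size` bytes at `offset` and interpret them as a little-endian integer."""
    return int.from_bytes(memory[offset:offset + size], "little")

# Hypothetical quad word stored at the address being read.
memory = (0x1122334455667788).to_bytes(8, "little")

print(hex(load(memory, 0, 1)))   # byte        -> 0x88
print(hex(load(memory, 0, 2)))   # word        -> 0x7788
print(hex(load(memory, 0, 4)))   # double word -> 0x55667788
print(hex(load(memory, 0, 8)))   # quad word   -> 0x1122334455667788
```

Each smaller access sees the low-order part of the larger one because the least significant byte is stored first.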

little-endian-write

It is worth noting, as you may have noticed, that values are stored in reverse order of how we represent them. As an example, say:
    [0x1330] = 0x00000000deadc0de
If you examined how it actually looked in memory, you would see:
    [0x1330] = 0xde
    [0x1331] = 0xc0
    [0x1332] = 0xad
    [0x1333] = 0xde
    [0x1334] = 0x00
    [0x1335] = 0x00
    [0x1336] = 0x00
    [0x1337] = 0x00
This format of storing things in 'reverse' is intentional in x86, and it's called "Little Endian".

For this challenge, we will give you two addresses created dynamically each run. The first address will be placed in rdi. The second will be placed in rsi.

Using the earlier mentioned info, perform the following:
    Set [rdi] = 0xdeadbeef00001337
    Set [rsi] = 0xc0ffee0000

Hint: it may require some tricks to assign a big constant to a dereferenced register. Try setting a register to the constant value, then assigning that register to the dereferenced register.

SOLUTION:
    mov rax, 0xdeadbeef00001337
    mov [rdi], rax
    mov rax, 0xc0ffee0000
    mov [rsi], rax
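The byte order described above can be reproduced with Python's struct module, where "<Q" means a little-endian 64-bit value. This is a quick check of the layout, not part of the challenge; qword_bytes is a hypothetical helper.

```python
import struct

def qword_bytes(value):
    """Bytes of a 64-bit value in the order they appear in memory on x86."""
    return struct.pack("<Q", value)

print(qword_bytes(0x00000000DEADC0DE).hex())   # dec0adde00000000
print(qword_bytes(0xDEADBEEF00001337).hex())   # 37130000efbeadde
print(qword_bytes(0xC0FFEE0000).hex())         # 0000eeffc0000000
```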

memory-sum

In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use.

Recall that memory is stored linearly. What does that mean? Say we access the quad word at 0x1337:
    [0x1337] = 0x00000000deadbeef
The real way memory is laid out is byte by byte, little endian:
    [0x1337]     = 0xef
    [0x1337 + 1] = 0xbe
    [0x1337 + 2] = 0xad
    ...
    [0x1337 + 7] = 0x00
What does this do for us? Well, it means that we can access things next to each other using offsets, similar to what was shown above. Say you want the 5th byte from an address, you can access it like:
    mov al, [address+4]
Remember, offsets start at 0.

Perform the following:
    Load two consecutive quad words from the address stored in rdi.
    Calculate the sum of the previous steps' quad words.
    Store the sum at the address in rsi.

SOLUTION:
    mov rax, qword ptr [rdi]        # first quad word
    add rax, qword ptr [rdi + 8]    # plus the quad word right after it
    mov qword ptr [rsi], rax        # store (overwrite) the sum at [rsi]
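The load, add, store sequence can be sketched against a bytearray standing in for memory. The offsets used as rdi and rsi here are made-up values for illustration, and memory_sum is a hypothetical name.

```python
import struct

MASK64 = (1 << 64) - 1

def memory_sum(memory, rdi, rsi):
    """Add the two consecutive quad words at rdi and store the result at rsi."""
    first, second = struct.unpack_from("<QQ", memory, rdi)
    # A 64-bit add wraps around, so keep only the low 64 bits.
    struct.pack_into("<Q", memory, rsi, (first + second) & MASK64)

memory = bytearray(b"\xff" * 32)            # destination starts with junk in it
struct.pack_into("<QQ", memory, 0, 5, 7)    # two quad words at "address" 0
memory_sum(memory, rdi=0, rsi=16)
print(struct.unpack_from("<Q", memory, 16)[0])   # 12
```

The destination is overwritten rather than added to, so whatever was previously stored at [rsi] does not affect the result.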

stack-subtraction

Using these instructions, take the top value of the stack, subtract rdi from it, then put it back.

SOLUTION:
    pop rax
    sub rax, rdi
    push rax

swap-stack-values

Using only the following instructions:
    push
    pop
Swap values in rdi and rsi.

SOLUTION:
    push rsi
    push rdi
    pop rsi
    pop rdi

average-stack-values

Without using pop, please calculate the average of 4 consecutive quad words stored on the stack. Push the average on the stack.

SOLUTION:
    xor rax, rax
    add rax, qword ptr [rsp]
    add rax, qword ptr [rsp + 0x8]
    add rax, qword ptr [rsp + 0x10]
    add rax, qword ptr [rsp + 0x18]
    shr rax, 2
    push rax

absolute-jump

In x86, absolute jumps (jump to a specific address) are accomplished by first putting the target address in a register reg, then doing jmp reg. In this level, we will ask you to do an absolute jump.

Perform the following:
    Jump to the absolute address 0x403000.

SOLUTION:
    mov rax, 0x403000
    jmp rax

relative-jump

In this level, we will ask you to do a relative jump. You will need to fill space in your code with something to make this relative jump possible. We suggest using the nop instruction. It's 1 byte long and very predictable. In fact, the assembler that we're using has a handy .rept directive that you can use to repeat assembly instructions some number of times (see the GNU Assembler Manual).

Useful instructions for this level:
    jmp (reg1 | addr | offset)
    nop

Hint: For the relative jump, look up how to use labels in x86.

Using the above knowledge, perform the following:
    Make the first instruction in your code a jmp.
    Make that jmp a relative jump to 0x51 bytes from the current position.
    At the code location where the relative jump will redirect control flow, set rax to 0x1.

SOLUTION:
    jmp target
    .rept 0x51
    nop
    .endr
    target:
    mov rax, 0x1
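For the relative jump, it may help to see why 0x51 nops produce a displacement of exactly 0x51. The sketch below assumes the assembler picks the 2-byte short form of jmp (opcode 0xEB followed by an 8-bit displacement measured from the end of the jmp); the helper names are hypothetical.

```python
JMP_SHORT = 0xEB   # opcode of the short relative jmp
NOP = 0x90         # opcode of the 1-byte nop

def build_relative_jump(padding):
    """Machine code for: jmp over `padding` nops (short form only)."""
    if not 0 <= padding <= 0x7F:
        raise ValueError("short jmp only reaches 0..127 bytes forward")
    return bytes([JMP_SHORT, padding]) + bytes([NOP]) * padding

def jump_target(code, at=0):
    """Offset that the short jmp located at `at` transfers control to."""
    if code[at] != JMP_SHORT:
        raise ValueError("no short jmp at this offset")
    displacement = int.from_bytes(code[at + 1:at + 2], "little", signed=True)
    return at + 2 + displacement   # relative to the end of the 2-byte jmp

code = build_relative_jump(0x51)
print(hex(jump_target(code)), hex(len(code)))   # both 0x53: right after the nops
```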

jump-trampoline

Now, we will combine the two prior levels and perform the following:
    Create a two jump trampoline:
        Make the first instruction in your code a jmp.
        Make that jmp a relative jump to 0x51 bytes from its current position.
        At 0x51, write the following code:
            Place the top value on the stack into register rdi.
            jmp to the absolute address 0x403000.

SOLUTION:
    jmp target
    .rept 0x51
    nop
    .endr
    target:
    pop rdi
    mov rax, 0x403000
    jmp rax

conditional-jump

Using the above knowledge, implement the following:
    if [x] is 0x7f454c46:
        y = [x+4] + [x+8] + [x+12]
    else if [x] is 0x00005A4D:
        y = [x+4] - [x+8] - [x+12]
    else:
        y = [x+4] * [x+8] * [x+12]
Where:
    x = edi, y = eax.

Assume each dereferenced value is a signed dword. This means the values can start as a negative value at each memory position.

A valid solution will use the following at least once:
    jmp (any variant), cmp

SOLUTION:
    cmp dword ptr [edi], 0x7f454c46
    jnz not_equal
    mov eax, dword ptr [edi+4]
    add eax, dword ptr [edi+8]
    add eax, dword ptr [edi+12]
    jmp end
    not_equal:
    cmp dword ptr [edi], 0x00005A4D
    jnz fallback
    mov eax, dword ptr [edi+4]
    sub eax, dword ptr [edi+8]
    sub eax, dword ptr [edi+12]
    jmp end
    fallback:
    mov eax, dword ptr [edi+4]
    imul eax, dword ptr [edi+8]
    imul eax, dword ptr [edi+12]
    end:
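A reference model of the three branches is handy for producing expected values while testing. The sketch below treats the operands as signed dwords and wraps the result to 32 bits, as eax would; the function names are hypothetical.

```python
import struct

def to_signed32(value):
    """Wrap an integer to 32 bits and interpret it as a signed dword."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def conditional(memory, x):
    """Expected value of y for the four dwords starting at offset x."""
    magic, a, b, c = struct.unpack_from("<Iiii", memory, x)
    if magic == 0x7F454C46:
        y = a + b + c
    elif magic == 0x00005A4D:
        y = a - b - c
    else:
        y = a * b * c
    return to_signed32(y)

print(conditional(struct.pack("<Iiii", 0x7F454C46, 1, -2, 3), 0))   # 2
print(conditional(struct.pack("<Iiii", 0x00005A4D, 10, 3, 4), 0))   # 3
print(conditional(struct.pack("<Iiii", 0x12345678, 2, -3, 4), 0))   # -24
```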

indirect-jump

The last jump type is the indirect jump, often used for switch statements in the real world. Switch statements are a special case of if-statements that use only numbers to determine where the control flow will go. Here is an example:
    switch(number):
        0: jmp do_thing_0
        1: jmp do_thing_1
        2: jmp do_thing_2
        default: jmp do_default_thing
The switch in this example works on number, which can either be 0, 1, or 2. If number is not one of those numbers, the default triggers. You can consider this a reduced else-if type structure.

In x86, you are already used to using numbers, so it should be no surprise that you can make if statements based on something being an exact number. Additionally, if you know the range of the numbers, a switch statement works very well. Take, for instance, the existence of a jump table. A jump table is a contiguous section of memory that holds addresses of places to jump. In the above example, the jump table could look like:
    [0x1337]      = address of do_thing_0
    [0x1337+0x8]  = address of do_thing_1
    [0x1337+0x10] = address of do_thing_2
    [0x1337+0x18] = address of do_default_thing
Using the jump table, we can greatly reduce the amount of cmps we use. Now all we need to check is if number is greater than 2. If it is, always do:
    jmp [0x1337+0x18]
Otherwise:
    jmp [jump_table_address + number * 8]

Using the above knowledge, implement the following logic:
    if rdi is 0:
        jmp 0x40301e
    else if rdi is 1:
        jmp 0x4030da
    else if rdi is 2:
        jmp 0x4031d5
    else if rdi is 3:
        jmp 0x403268
    else:
        jmp 0x40332c

Please do the above with the following constraints:
    Assume rdi will NOT be negative.
    Use no more than 1 cmp instruction.
    Use no more than 3 jumps (of any variant).
    We will provide you with the number to 'switch' on in rdi.
    We will provide you with a jump table base address in rsi.

Here is an example table:
    [0x40427c] = 0x40301e (addrs will change)
    [0x404284] = 0x4030da
    [0x40428c] = 0x4031d5
    [0x404294] = 0x403268
    [0x40429c] = 0x40332c

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        cmp rdi, 3
        jg greater_than_3
        jmp qword ptr [rsi + rdi*8]
    greater_than_3:
        jmp qword ptr [rsi + 32]
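The table lookup can be modelled as indexing an array of 8-byte entries, with one comparison that redirects every out-of-range value to the final (default) slot. This sketch reuses the example addresses from the table above; build_table and switch_target are hypothetical names.

```python
import struct

def build_table(addresses):
    """Lay the addresses out as consecutive little-endian quad words."""
    return struct.pack("<%dQ" % len(addresses), *addresses)

def switch_target(table, rdi, last_case=3):
    """Address the indirect jmp would load for a given non-negative rdi."""
    if rdi > last_case:                  # the single comparison
        offset = (last_case + 1) * 8     # default entry, [rsi + 32] here
    else:
        offset = rdi * 8                 # [rsi + rdi*8]
    return struct.unpack_from("<Q", table, offset)[0]

table = build_table([0x40301E, 0x4030DA, 0x4031D5, 0x403268, 0x40332C])
print([hex(switch_target(table, n)) for n in (0, 1, 2, 3, 4, 99)])
```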

average-loop

In a previous level, you computed the average of 4 integer quad words, which was a fixed amount of things to compute. But how do you work with sizes you get when the program is running?

In most programming languages, a structure exists called the for-loop, which allows you to execute a set of instructions for a bounded amount of times. The bounded amount can be either known before or during the program's run, with "during" meaning the value is given to you dynamically.

As an example, a for-loop can be used to compute the sum of the numbers 1 to n:
    sum = 0
    i = 1
    while i <= n:
        sum += i
        i += 1

Please compute the average of n consecutive quad words, where:
    rdi = memory address of the 1st quad word
    rsi = n (amount to loop for)
    rax = average computed

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov rax, 0
        mov rcx, 0
    loop_start:
        cmp rcx, rsi
        jge finish
        add rax, qword ptr [rdi + 8*rcx]
        inc rcx
        jmp loop_start
    finish:
        xor rdx, rdx
        div rsi
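The loop structure of the solution maps onto a plain while loop. The following sketch mirrors it, with a bytes object standing in for memory and the register names kept as variable names; average_qwords is a hypothetical name.

```python
import struct

MASK64 = (1 << 64) - 1

def average_qwords(memory, rdi, rsi):
    """Average of rsi consecutive unsigned quad words starting at offset rdi."""
    rax = 0
    rcx = 0
    while rcx < rsi:                                   # cmp rcx, rsi / jge finish
        value = struct.unpack_from("<Q", memory, rdi + 8 * rcx)[0]
        rax = (rax + value) & MASK64                   # 64-bit add wraps around
        rcx += 1
    return rax // rsi                                  # unsigned div, quotient only

memory = struct.pack("<4Q", 10, 20, 30, 40)
print(average_qwords(memory, 0, 4))   # 25
```

As in the assembly, n = 0 is not handled: the final division would fault.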

count-non-zero

Count the consecutive non-zero bytes in a contiguous region of memory, where:
    rdi = memory address of the 1st byte
    rax = number of consecutive non-zero bytes
Additionally, if rdi = 0, then set rax = 0 (we will check)!

An example test-case, let:
    rdi = 0x1000
    [0x1000] = 0x41
    [0x1001] = 0x42
    [0x1002] = 0x43
    [0x1003] = 0x00
Then: rax = 3 should be set.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov rax, 0
        test rdi, rdi
        je finish
    count_bytes:
        # this would work as well: cmp byte ptr [rdi + rax], 0
        test byte ptr [rdi + rax], 0xFF
        je finish
        inc rax
        jmp count_bytes
    finish:
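A small model of the counting loop, including the rdi = 0 special case, can serve as a reference when testing. In this sketch the buffer index plays the role of the address, so the data is deliberately placed away from index 0; count_nonzero is a hypothetical name.

```python
def count_nonzero(memory, rdi):
    """Number of consecutive non-zero bytes starting at index rdi (0 if rdi is 0)."""
    rax = 0
    if rdi == 0:                      # null pointer check
        return rax
    while memory[rdi + rax] != 0:     # stop at the first zero byte
        rax += 1
    return rax

memory = bytes(16) + b"\x41\x42\x43\x00"
print(count_nonzero(memory, 16))   # 3
print(count_nonzero(memory, 0))    # 0
```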

socket

In this challenge, you’ll begin your journey into networking by creating a socket using the socket syscall. A socket is the basic building block for network communication; it serves as an endpoint for sending and receiving data. When you invoke socket, you provide three key arguments: the domain (for example, AF_INET for IPv4), the type (such as SOCK_STREAM for TCP), and the protocol (usually set to 0 to choose the default). Mastering this syscall is important because it lays the foundation for all subsequent network interactions.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov eax, 41     # socket
        mov edi, 2      # domain = AF_INET
        mov esi, 1      # type = SOCK_STREAM
        mov edx, 0      # protocol = default
        syscall
        mov eax, 60     # exit
        xor rdi, rdi    # status 0
        syscall

bind

After creating a socket, the next step is to assign it a network identity. In this challenge, you will use the bind syscall to connect your socket to a specific IP address and port number. The call requires you to provide the socket file descriptor, a pointer to a struct sockaddr (specifically a struct sockaddr_in for IPv4 that holds fields like the address family, port, and IP address), and the size of that structure. Binding is essential because it ensures your server listens on a known address, making it reachable by clients.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov eax, 41                     # socket(AF_INET, SOCK_STREAM, 0)
        mov edi, 2
        mov esi, 1
        mov edx, 0
        syscall
        sub rsp, 16                     # room for a 16-byte struct sockaddr_in
        xor rcx, rcx
        mov qword ptr [rsp+8], rcx      # padding = 0
        mov dword ptr [rsp+4], 0        # sin_addr = 0.0.0.0
        mov word ptr [rsp+2], 0x5000    # sin_port = 80 in network byte order
        mov word ptr [rsp], 2           # sin_family = AF_INET
        mov edi, eax                    # socket fd returned by socket
        mov eax, 49                     # bind
        mov rsi, rsp                    # pointer to the sockaddr_in
        mov edx, 16                     # size of the structure
        syscall
        mov eax, 60                     # exit(0)
        xor rdi, rdi
        syscall
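The 16 bytes that the solution writes on the stack can be cross-checked by building the same struct sockaddr_in layout in Python. This is a sketch assuming the usual Linux layout (2-byte family in host order, 2-byte port in network order, 4-byte address, 8 bytes of padding); the sockaddr_in helper is hypothetical.

```python
import socket
import struct

def sockaddr_in(port, ip="0.0.0.0"):
    """Raw 16-byte struct sockaddr_in as laid out on little-endian Linux."""
    return (
        struct.pack("<H", int(socket.AF_INET))   # sin_family, host byte order
        + struct.pack("!H", port)                # sin_port, network (big-endian) order
        + socket.inet_aton(ip)                   # sin_addr, 4 bytes
        + bytes(8)                               # zero padding up to 16 bytes
    )

raw = sockaddr_in(80)
print(raw.hex())
# The same bytes the assembly stores: word 2, word 0x5000, dword 0, qword 0.
print(raw == struct.pack("<HHIQ", 2, 0x5000, 0, 0))
```

Storing the word 0x5000 on a little-endian machine puts the bytes 00 50 in memory, which is port 80 when read in network byte order.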

listen

With your socket bound to an address, you now need to prepare it to accept incoming connections. The listen syscall transforms your socket into a passive one that awaits client connection requests. It requires the socket’s file descriptor and a backlog parameter, which sets the maximum number of queued connections. This step is vital because without marking the socket as listening, your server wouldn’t be able to receive any connection attempts.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov eax, 41                     # socket(AF_INET, SOCK_STREAM, 0)
        mov edi, 2
        mov esi, 1
        mov edx, 0
        syscall
        sub rsp, 16                     # struct sockaddr_in on the stack
        xor rcx, rcx
        mov qword ptr [rsp+8], rcx
        mov dword ptr [rsp+4], 0
        mov word ptr [rsp+2], 0x5000
        mov word ptr [rsp], 2
        mov edi, eax
        mov eax, 49                     # bind(fd, &addr, 16)
        mov rsi, rsp
        mov edx, 16
        syscall
        mov eax, 50                     # listen; rdi still holds the socket fd
        mov esi, 0                      # backlog = 0
        syscall
        mov eax, 60                     # exit(0)
        xor rdi, rdi
        syscall

accept

Once your socket is listening, it’s time to actively accept incoming connections. In this challenge, you will use the accept syscall, which waits for a client to connect. When a connection is established, it returns a new socket file descriptor dedicated to communication with that client and fills in a provided address structure (such as a struct sockaddr_in) with the client’s details. This process is a critical step in transforming your server from a passive listener into an active communicator.

SOLUTION:
    .intel_syntax noprefix
    .global _start
    _start:
        mov eax, 41                     # socket(AF_INET, SOCK_STREAM, 0)
        mov edi, 2
        mov esi, 1
        mov edx, 0
        syscall
        sub rsp, 16                     # struct sockaddr_in on the stack
        xor rcx, rcx
        mov qword ptr [rsp+8], rcx
        mov dword ptr [rsp+4], 0
        mov word ptr [rsp+2], 0x5000
        mov word ptr [rsp], 2
        mov edi, eax
        mov eax, 49                     # bind(fd, &addr, 16)
        mov rsi, rsp
        mov edx, 16
        syscall
        mov eax, 50                     # listen(fd, 0)
        mov esi, 0
        syscall
        mov eax, 43                     # accept(fd, NULL, NULL)
        xor rsi, rsi                    # no address structure requested
        xor edx, edx
        syscall
        mov eax, 60                     # exit(0)
        xor rdi, rdi
        syscall
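For comparison, the same socket, bind, listen, accept sequence can be written with Python's socket module, which wraps the same syscalls. This is only a sketch: the function names are hypothetical, and the port is a parameter (0 lets the kernel pick a free one) because binding port 80 normally needs elevated privileges.

```python
import socket

def make_listener(host="0.0.0.0", port=0, backlog=0):
    """socket() + bind() + listen(), returning the listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen(backlog)
    return listener

def accept_one(listener):
    """accept(): block until one client connects, return (connection, peer address)."""
    connection, peer = listener.accept()
    return connection, peer

if __name__ == "__main__":
    listener = make_listener(port=0, backlog=1)
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

Unlike the assembly, which passes NULL for the address arguments, Python's accept always returns the peer address alongside the new connection.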