CTF Assembly Challenges
set register
Description:
In this level, you will be working with registers. You will be asked to modify or read from registers.
In this level, you will work with registers! Please set the following:
rdi = 0x1337
Solution:
.intel_syntax noprefix
.global _start
_start:
mov rdi, 0x1337
Build and run:
gcc -nostdlib -o program program.s
/challenge/run /home/hacker/program
set multiple registers
Description:
In this level, you will be working with registers. You will be asked to modify or read from registers.
In this level, you will work with multiple registers. Please set the following:
rax = 0x1337
r12 = 0xCAFED00D1337BEEF
rsp = 0x31337
Solution:
.intel_syntax noprefix
.global _start
_start:
mov rax, 0x1337
mov r12, 0xCAFED00D1337BEEF
mov rsp, 0x31337
Build and run:
gcc -nostdlib -o program program.s
/challenge/run /home/hacker/program
add to register
In this level, you will be working with registers. You will be asked to modify or read from registers.
We will set some values in memory dynamically before each run. On each run, the values will change. This means you will need to perform some formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax.
Many instructions exist in x86 that allow you to perform all the normal math operations on registers and memory.
For shorthand, when we say A += B, it really means A = A + B.
Here are some useful instructions:
add reg1, reg2 <=> reg1 += reg2
sub reg1, reg2 <=> reg1 -= reg2
imul reg1, reg2 <=> reg1 *= reg2
div is more complicated, and we will discuss it later. Note: the second operand (reg2) can be replaced by a constant or a memory location.
Do the following:
Add 0x331337 to rdi
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
add rdi, 0x331337
Build and run:
gcc -nostdlib -o program program.s
/challenge/run /home/hacker/program
linear equation registers
In this level, you will be working with registers. You will be asked to modify or read from registers.
We will now set some values in memory dynamically before each run. On each run, the values will change. This means you will need to do some type of formulaic operation with registers. We will tell you which registers are set beforehand and where you should put the result. In most cases, it's rax.
Using your new knowledge, please compute the following:
f(x) = mx + b, where:
m = rdi
x = rsi
b = rdx
Place the result into rax.
Note: There is an important difference between mul (unsigned multiply) and imul (signed multiply) in terms of which registers are used. Look at the documentation on these instructions to see the difference.
In this case, you will want to use imul.
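As a rough illustration outside the challenge itself, the Python sketch below models what the two-operand imul and add do to 64-bit registers: both keep only the low 64 bits of the result. The helper name linear and the sample values are assumptions made for this example.

```python
MASK64 = (1 << 64) - 1  # a 64-bit register keeps only the low 64 bits

def linear(m, x, b):
    """Model `imul rdi, rsi` followed by `add rdi, rdx` on 64-bit registers."""
    product = (m * x) & MASK64      # two-operand imul keeps the low 64 bits
    return (product + b) & MASK64   # add wraps at 64 bits as well

print(hex(linear(3, 4, 5)))  # 3*4 + 5 = 17 -> 0x11
```

If the true result does not fit in 64 bits, the register holds the wrapped value; linear(1 << 63, 2, 1) gives 1 in this model.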
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
imul rdi, rsi
add rdi, rdx
mov rax, rdi
integer division
Division in x86 behaves differently from ordinary arithmetic. x86 performs integer math, meaning every value is a whole number.
As an example: 10 / 3 = 3 in integer math.
Why?
Because the fractional part of 3.33 is discarded, leaving the integer 3.
The relevant instructions for this level are:
mov rax, reg1
div reg2
Note: div is a special instruction that can divide a 128-bit dividend by a 64-bit divisor while storing both the quotient and the remainder, using only one register as an operand.
How does this complex div instruction work and operate on a 128-bit dividend (which is twice as large as a register)?
For the instruction div reg, the following happens:
rax = rdx:rax / reg
rdx = remainder
rdx:rax means that rdx will be the upper 64-bits of the 128-bit dividend and rax will be the lower 64-bits of the 128-bit dividend.
You must be careful about what is in rdx and rax before you call div.
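To see why the contents of rdx matter, here is a small Python model of unsigned div. It is only a sketch: the function name div and its argument order are made up for this example, and the overflow check stands in for the divide error the CPU raises when the quotient does not fit in rax.

```python
MASK64 = (1 << 64) - 1

def div(rdx, rax, divisor):
    """Model unsigned `div reg`: returns (new rax, new rdx) = (quotient, remainder)."""
    dividend = (rdx << 64) | rax                 # rdx:rax forms the 128-bit dividend
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        # stand-in for the CPU's divide error when the quotient overflows rax
        raise OverflowError("quotient does not fit in 64 bits")
    return quotient, remainder

print(div(0, 10, 3))        # (3, 1): rdx was cleared first
print(div(1, 10, 3)[0])     # a stale 1 in rdx gives a very different quotient
```

With rdx left at 1, the dividend is 2^64 + 10 rather than 10, which is why clearing rdx before div is the usual first step.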
Please compute the following:
speed = distance / time, where:
distance = rdi
time = rsi
speed = rax
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
xor rdx, rdx
mov rax, rdi
div rsi
modulo-operation
Modulo in assembly is another interesting concept!
x86 allows you to get the remainder after a div operation.
For instance: 10 / 3 results in a remainder of 1.
The remainder is the same as modulo, which is also called the "mod" operator.
In most programming languages, we refer to mod with the symbol %.
Please compute the following: rdi % rsi
Place the value in rax.
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
xor rdx, rdx
mov rax, rdi
div rsi
mov rax, rdx
efficient-modulo
It turns out that using the div operator to compute the modulo operation is slow!
We can use a math trick to optimize the modulo operator (%). Compilers use this trick a lot.
If we have x % y, and y is a power of 2, such as 2^n, the result will be the lower n bits of x.
Therefore, we can use the lower register byte access to efficiently implement modulo!
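The power-of-two trick is easy to check outside of assembly. The Python sketch below keeps the low n bits with a mask and compares against the % operator; mod_pow2 and the sample value are assumptions for illustration only.

```python
def mod_pow2(x, n):
    """Compute x % 2**n by keeping only the low n bits of x."""
    return x & ((1 << n) - 1)

x = 0x1122334455667788
print(hex(mod_pow2(x, 8)))    # 0x88, the same as x % 256
print(hex(mod_pow2(x, 16)))   # 0x7788, the same as x % 65536
```

Reading an 8-bit or 16-bit sub-register does the same masking for free, for n = 8 and n = 16.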
Using only the following instruction(s):
mov
Please compute the following:
rax = rdi % 256
rbx = rsi % 65536
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
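mov eax, 0      # writing eax clears all 64 bits of rax
mov al, dil     # rax = low 8 bits of rdi = rdi % 256
mov ebx, 0      # writing ebx clears all 64 bits of rbx
mov bx, si      # rbx = low 16 bits of rsi = rsi % 65536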
byte-extraction
In this level, you will be working with bit logic and operations. This will involve heavy use of directly interacting with bits stored in a register or memory location. You will also likely need to make use of the logic instructions in x86: and, or, not, xor.
Shifting bits around in assembly is another interesting concept!
x86 allows you to 'shift' bits around in a register.
Take, for instance, al, the lowest 8 bits of rax.
The value in al (in bits) is:
al = 10001010
If we shift once to the left using the shl instruction:
shl al, 1
The new value is:
al = 00010100
Everything shifted to the left, and the highest bit fell off while a new 0 was added to the right side.
You can use this to do special things to the bits you care about.
Shifting has the nice side effect of doing quick multiplication (by 2) or division (by 2), and can also be used to compute modulo.
Here are the important instructions:
shl reg1, count <=> Shift reg1 left by count bits
shr reg1, count <=> Shift reg1 right by count bits
Note: count must be a constant or the cl register; reg1 can also be a memory location.
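The al example above can be reproduced with a small Python model of an 8-bit register. This is a sketch for shift counts below 8; the helper names shl8 and shr8 are invented for the example.

```python
def shl8(value, count):
    """Shift an 8-bit value left; bits pushed past bit 7 are lost."""
    return (value << count) & 0xFF

def shr8(value, count):
    """Logical right shift of an 8-bit value; zeros enter from the left."""
    return (value & 0xFF) >> count

al = 0b10001010
print(format(shl8(al, 1), "08b"))  # 00010100
print(format(shr8(al, 1), "08b"))  # 01000101
```

Shifting left by 1 doubles the value (modulo 256 here), and shifting right by 1 halves it, discarding the remainder.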
Using only the following instructions:
mov, shr, shl
Please perform the following: Set rax to the 5th least significant byte of rdi.
For example:
rdi = | B7 | B6 | B5 | B4 | B3 | B2 | B1 | B0 |
Set rax to the value of B4
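One way to picture the shift-only approach is the Python sketch below: shift left until the wanted byte is the top byte of the 64-bit value, then shift right by 56 to bring it down. The function extract_byte and the sample value are assumptions for illustration.

```python
MASK64 = (1 << 64) - 1

def extract_byte(value, index):
    """Isolate byte `index` (0 = least significant) of a 64-bit value using shifts only."""
    left = 8 * (7 - index)              # number of bits to push off the top
    value = (value << left) & MASK64    # shl: the register keeps only 64 bits
    return value >> 56                  # shr: bring the wanted byte down to bits 0-7

rdi = 0xB7B6B5B4B3B2B1B0
print(hex(extract_byte(rdi, 4)))  # 0xb4
```

For B4 the left shift is 24 bits, which discards B7, B6, and B5.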
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
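shl rdi, 24     # discard B7, B6, B5: B4 is now the top byte
shr rdi, 56     # move B4 down to the lowest byte, zero-filling above it
mov rax, rdi    # copy the whole register so the upper bits of rax are zero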
bitwise-and
Without using the following instructions: mov, xchg, please perform the following:
Set rax to the value of (rdi AND rsi)
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
and rdi, rsi
xor rax, rax
or rax, rdi
check-even
Using only the following instructions:
and
or
xor
Implement the following logic:
if x is even then
    y = 1
else
    y = 0
Where:
x = rdi
y = rax
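The logic rests on the lowest bit: it is 0 for even numbers and 1 for odd ones, so isolating it and flipping it yields y. A short Python sketch of that idea (the name is_even_flag is made up for the example):

```python
def is_even_flag(x):
    """Return 1 if x is even and 0 if x is odd, using only AND and XOR."""
    low_bit = x & 1       # 1 for odd, 0 for even
    return low_bit ^ 1    # flip it: 1 for even, 0 for odd

print(is_even_flag(4), is_even_flag(7))  # 1 0
```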
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
xor rax, rax
or rax, rdi
and rax, 1
xor rax, 1
memory-increment
Please perform the following:
Place the value stored at 0x404000 into rax.
Increment the value stored at the address 0x404000 by 0x1337.
Make sure the value in rax is the original value stored at 0x404000 and make sure that [0x404000] now has the incremented value.
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
mov rax, [0x404000]
add qword ptr [0x404000], 0x1337
memory-size-access
Recall the following:
The breakdown of the names of memory sizes:
Quad Word = 8 Bytes = 64 bits
Double Word = 4 bytes = 32 bits
Word = 2 bytes = 16 bits
Byte = 1 byte = 8 bits
In x86_64, you can access each of these sizes when dereferencing an address, just like using bigger or smaller register accesses:
mov al, [address] <=> moves the least significant byte from address to rax
mov ax, [address] <=> moves the least significant word from address to rax
mov eax, [address] <=> moves the least significant double word from address to rax
mov rax, [address] <=> moves the full quad word from address to rax
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
mov al, [0x404000]
mov bx, [0x404000]
mov ecx, [0x404000]
mov rdx, [0x404000]
little-endian-write
It is worth noting, as you may have noticed, that values are stored in reverse order of how we represent them.
As an example, say:
[0x1330] = 0x00000000deadc0de
If you examined how it actually looked in memory, you would see:
[0x1330] = 0xde
[0x1331] = 0xc0
[0x1332] = 0xad
[0x1333] = 0xde
[0x1334] = 0x00
[0x1335] = 0x00
[0x1336] = 0x00
[0x1337] = 0x00
This format of storing things in 'reverse' is intentional in x86, and it's called "Little Endian".
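The byte order in the example can be reproduced in Python with int.to_bytes. The sketch below is only an illustration; little_endian_bytes is a hypothetical helper and 0x1330 is the example address from above.

```python
def little_endian_bytes(value, size=8):
    """Return the bytes of `value` in the order x86 stores them in memory."""
    return value.to_bytes(size, "little")

raw = little_endian_bytes(0x00000000deadc0de)
for offset, byte in enumerate(raw):
    print(f"[{0x1330 + offset:#x}] = {byte:#04x}")  # first line: [0x1330] = 0xde

# reading the bytes back in the same order recovers the original value
assert int.from_bytes(raw, "little") == 0xdeadc0de
```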
For this challenge, we will give you two addresses created dynamically each run.
The first address will be placed in rdi. The second will be placed in rsi.
Using the earlier mentioned info, perform the following:
Set [rdi] = 0xdeadbeef00001337
Set [rsi] = 0xc0ffee0000
Hint: it may require some tricks to assign a big constant to a dereferenced register. Try setting a register to the constant value, then assigning that register to the dereferenced register.
SOLUTION:
mov rax, 0xdeadbeef00001337
mov [rdi], rax
mov rax, 0xc0ffee0000
mov [rsi], rax
memory-sum
In this level, you will be working with memory. This will require you to read or write to things stored linearly in memory. If you are confused, go look at the linear addressing module in 'ike. You may also be asked to dereference things, possibly multiple times, to things we dynamically put in memory for your use.
Recall that memory is stored linearly.
What does that mean?
Say we access the quad word at 0x1337:
[0x1337] = 0x00000000deadbeef
The real way memory is laid out is byte by byte, little endian:
[0x1337] = 0xef
[0x1337 + 1] = 0xbe
[0x1337 + 2] = 0xad
...
[0x1337 + 7] = 0x00
What does this do for us?
Well, it means that we can access things next to each other using offsets, similar to what was shown above.
Say you want the 5th byte from an address, you can access it like:
mov al, [address+4]
Remember, offsets start at 0.
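Offsets can be tried out on a Python bytes object standing in for memory. In this sketch the buffer contents and the helpers read_byte and read_qword are assumptions; offset 0 plays the role of the base address.

```python
import struct

# two consecutive little-endian quad words, laid out as they would be in memory
memory = struct.pack("<QQ", 0xdeadbeef, 0x1337)

def read_byte(buf, offset):
    """Like `mov al, [address + offset]`."""
    return buf[offset]

def read_qword(buf, offset):
    """Like `mov rax, [address + offset]`."""
    return struct.unpack_from("<Q", buf, offset)[0]

print(hex(read_byte(memory, 0)))    # 0xef, the 1st byte
print(hex(read_byte(memory, 4)))    # 0x0, the 5th byte
print(hex(read_qword(memory, 8)))   # 0x1337, the second quad word
```

The second quad word starts 8 bytes after the first, so two neighbouring quad words live at offsets 0 and 8.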
Perform the following:
Load two consecutive quad words from the address stored in rdi.
Calculate the sum of the previous steps' quad words.
Store the sum at the address in rsi.
SOLUTION:
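mov rax, qword ptr [rdi]      # first quad word
mov rbx, qword ptr [rdi+8]    # second quad word
add rax, rbx                  # rax = sum of the two
mov qword ptr [rsi], rax      # store the sum, overwriting whatever was at [rsi]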
stack-subtraction
Using pop, sub, and push, take the top value of the stack, subtract rdi from it, then put it back.
SOLUTION:
pop rax
sub rax, rdi
push rax
swap-stack-values
Using only the following instructions:
push
pop
Swap values in rdi and rsi.
SOLUTION:
push rsi
push rdi
pop rsi
pop rdi
average-stack-values
Without using pop, please calculate the average of 4 consecutive quad words stored on the stack. Push the average on the stack.
SOLUTION:
xor rax, rax
add rax, qword ptr [rsp]
add rax, qword ptr [rsp + 0x8]
add rax, qword ptr [rsp + 0x10]
add rax, qword ptr [rsp + 0x18]
shr rax, 2
push rax
absolute-jump
In x86, absolute jumps (jump to a specific address) are accomplished by first putting the target address in a register reg, then doing jmp reg.
In this level, we will ask you to do an absolute jump. Perform the following: Jump to the absolute address 0x403000.
SOLUTION:
mov rax, 0x403000
jmp rax
relative-jump
In this level, we will ask you to do a relative jump. You will need to fill space in your code with something to make this relative jump possible. We suggest using the nop instruction. It's 1 byte long and very predictable.
In fact, the assembler that we're using has a handy .rept directive that you can use to repeat assembly instructions some number of times (see the GNU Assembler Manual).
Useful instructions for this level:
jmp (reg1 | addr | offset)
nop
Hint: For the relative jump, look up how to use labels in x86.
Using the above knowledge, perform the following:
Make the first instruction in your code a jmp.
Make that jmp a relative jump to 0x51 bytes from the current position.
At the code location where the relative jump will redirect control flow, set rax to 0x1.
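To make the byte arithmetic concrete, the Python sketch below lays out a short jmp followed by nop padding and works out where the jump lands. It assumes the 2-byte short form (opcode 0xEB plus an 8-bit offset), in which the offset is counted from the end of the jmp instruction; relative_jump is a made-up helper.

```python
JMP_SHORT = 0xEB  # opcode of `jmp rel8`
NOP = 0x90        # opcode of `nop`

def relative_jump(distance):
    """Build `jmp +distance` plus `distance` nops; return (code bytes, target offset)."""
    if not 0 <= distance <= 0x7F:
        raise ValueError("a forward rel8 jump reaches at most 0x7f bytes")
    code = bytes([JMP_SHORT, distance]) + bytes([NOP]) * distance
    target = 2 + distance  # rel8 is measured from the end of the 2-byte jmp
    return code, target

code, target = relative_jump(0x51)
print(code[:2].hex(), hex(target), len(code))  # eb51 0x53 83
```

The target offset equals the length of the padded code, so the next instruction placed after the nops is exactly where control arrives.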
SOLUTION:
jmp target
.rept 0x51
nop
.endr
target:
mov rax, 0x1
jump-trampoline
Now, we will combine the two prior levels and perform the following:
Create a two-jump trampoline:
Make the first instruction in your code a jmp.
Make that jmp a relative jump to 0x51 bytes from its current position.
At 0x51, write the following code:
Place the top value on the stack into register rdi.
jmp to the absolute address 0x403000.
SOLUTION:
jmp target
.rept 0x51
nop
.endr
target:
pop rdi
mov rax, 0x403000
jmp rax
conditional-jump
Using the above knowledge, implement the following:
if [x] is 0x7f454c46:
    y = [x+4] + [x+8] + [x+12]
else if [x] is 0x00005A4D:
    y = [x+4] - [x+8] - [x+12]
else:
    y = [x+4] * [x+8] * [x+12]
Where:
x = edi, y = eax.
Assume each dereferenced value is a signed dword. This means the values can start as a negative value at each memory position.
A valid solution will use the following at least once:
jmp (any variant), cmp
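Before writing the branches, it can help to have a reference model to compare against. The Python sketch below is one such model, assuming a 16-byte buffer that holds the tag followed by three signed dwords; compute and to_s32 are hypothetical names, and to_s32 mimics the wraparound of a 32-bit register.

```python
import struct

def to_s32(value):
    """Wrap a Python int to a signed 32-bit value, as eax would hold it."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def compute(buf):
    """Reference model of the three-way branch over [x], [x+4], [x+8], [x+12]."""
    tag, a, b, c = struct.unpack_from("<I3i", buf, 0)
    if tag == 0x7f454c46:
        result = a + b + c
    elif tag == 0x00005A4D:
        result = a - b - c
    else:
        result = a * b * c
    return to_s32(result)

buf = struct.pack("<I3i", 0x00005A4D, 10, 3, -2)
print(compute(buf))  # 10 - 3 - (-2) = 9
```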
SOLUTION:
cmp dword ptr [edi], 0x7f454c46
jnz not_equal
mov eax, dword ptr [edi+4]
add eax, dword ptr [edi+8]
add eax, dword ptr [edi+12]
jmp end
not_equal:
cmp dword ptr [edi], 0x00005A4D
jnz fallback
mov eax, dword ptr [edi+4]
sub eax, dword ptr [edi+8]
sub eax, dword ptr [edi+12]
jmp end
fallback:
mov eax, dword ptr [edi+4]
imul eax, dword ptr [edi+8]
imul eax, dword ptr [edi+12]
end:
indirect-jump
The last jump type is the indirect jump, often used for switch statements in the real world. Switch statements are a special case of if-statements that use only numbers to determine where the control flow will go.
Here is an example:
switch(number):
    0: jmp do_thing_0
    1: jmp do_thing_1
    2: jmp do_thing_2
    default: jmp do_default_thing
The switch in this example works on number, which can either be 0, 1, or 2. If number is not one of those numbers, the default triggers. You can consider this a reduced else-if type structure. In x86, you are already used to using numbers, so it should be no surprise that you can make if statements based on something being an exact number. Additionally, if you know the range of the numbers, a switch statement works very well.
Take, for instance, the existence of a jump table. A jump table is a contiguous section of memory that holds addresses of places to jump.
In the above example, the jump table could look like:
[0x1337] = address of do_thing_0
[0x1337+0x8] = address of do_thing_1
[0x1337+0x10] = address of do_thing_2
[0x1337+0x18] = address of do_default_thing
Using the jump table, we can greatly reduce the number of cmp instructions we use. Now all we need to check is if number is greater than 2. If it is, always do:
jmp [0x1337+0x18]
Otherwise:
jmp [jump_table_address + number * 8]
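The same dispatch can be sketched in Python with a list standing in for the jump table. This is an illustration only: dispatch and slot_address are invented names, the last list entry is assumed to be the default target, the addresses are borrowed from the example table further down, and number is assumed to be non-negative.

```python
def slot_address(base, number, entry_size=8):
    """Address of table slot `number`: base + number * 8 for 8-byte entries."""
    return base + number * entry_size

def dispatch(table, number):
    """Pick a jump target with a single range check; the last entry is the default."""
    default_index = len(table) - 1
    if number >= default_index:      # the only comparison needed
        return table[default_index]
    return table[number]

table = [0x40301e, 0x4030da, 0x4031d5, 0x403268, 0x40332c]
print(hex(dispatch(table, 2)))          # 0x4031d5
print(hex(dispatch(table, 99)))         # 0x40332c, the default
print(hex(slot_address(0x40427c, 3)))   # 0x404294
```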
Using the above knowledge, implement the following logic:
if rdi is 0:
    jmp 0x40301e
else if rdi is 1:
    jmp 0x4030da
else if rdi is 2:
    jmp 0x4031d5
else if rdi is 3:
    jmp 0x403268
else:
    jmp 0x40332c
Please do the above with the following constraints:
Assume rdi will NOT be negative.
Use no more than 1 cmp instruction.
Use no more than 3 jumps (of any variant).
We will provide you with the number to 'switch' on in rdi.
We will provide you with a jump table base address in rsi.
Here is an example table:
[0x40427c] = 0x40301e (addrs will change)
[0x404284] = 0x4030da
[0x40428c] = 0x4031d5
[0x404294] = 0x403268
[0x40429c] = 0x40332c
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
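cmp rdi, 3                      # the only cmp: is the index past the last case?
jg greater_than_3               # yes: take the default entry
jmp qword ptr [rsi + rdi*8]     # no: jump through table slot rdi
greater_than_3:
jmp qword ptr [rsi + 32]        # slot 4 (4 * 8 bytes) holds the default target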
average-loop
In a previous level, you computed the average of 4 quad words, a fixed number of values. But how do you handle a count that is only known while the program is running?
In most programming languages, a structure exists called the for-loop, which allows you to execute a set of instructions a bounded number of times. The bound can be known either before the program runs or during the run, with "during" meaning the value is given to you dynamically.
As an example, a for-loop can be used to compute the sum of the numbers 1 to n:
sum = 0
i = 1
while i <= n:
    sum += i
    i += 1
Please compute the average of n consecutive quad words, where:
rdi = memory address of the 1st quad word
rsi = n (amount to loop for)
rax = average computed
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
mov rax, 0
mov rcx, 0
loop_start:
cmp rcx, rsi
jge finish
add rax, qword ptr [rdi + 8*rcx]
inc rcx
jmp loop_start
finish:
xor rdx, rdx
div rsi
count-non-zero
Count the consecutive non-zero bytes in a contiguous region of memory, where:
rdi = memory address of the 1st byte
rax = number of consecutive non-zero bytes
Additionally, if rdi = 0, then set rax = 0 (we will check)!
An example test-case, let:
rdi = 0x1000
[0x1000] = 0x41
[0x1001] = 0x42
[0x1002] = 0x43
[0x1003] = 0x00
Then: rax = 3 should be set.
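The expected behaviour can be modelled in a few lines of Python, with a dictionary standing in for memory. This is a sketch: count_nonzero and the memory dictionary are assumptions, and the values are the ones from the test case above.

```python
def count_nonzero(memory, address):
    """Count bytes starting at `address` until the first zero byte; 0 for a null address."""
    if address == 0:
        return 0
    count = 0
    while memory[address + count] != 0:
        count += 1
    return count

memory = {0x1000: 0x41, 0x1001: 0x42, 0x1002: 0x43, 0x1003: 0x00}
print(count_nonzero(memory, 0x1000))  # 3
```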
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
mov rax, 0
test rdi, rdi
je finish
count_bytes:
test byte ptr [rdi + rax], 0xFF    # this would work as well: cmp byte ptr [rdi + rax], 0
je finish
inc rax
jmp count_bytes
finish:
socket
In this challenge, you’ll begin your journey into networking by creating a socket using the socket syscall.
A socket is the basic building block for network communication; it serves as an endpoint for sending and receiving data.
When you invoke socket, you provide three key arguments: the domain (for example, AF_INET for IPv4), the type (such as SOCK_STREAM for TCP),
and the protocol (usually set to 0 to choose the default). Mastering this syscall is important because it lays the foundation for all
subsequent network interactions.
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
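mov eax, 41     # syscall number for socket
mov edi, 2      # domain: AF_INET
mov esi, 1      # type: SOCK_STREAM
mov edx, 0      # protocol: 0 selects the default
syscall         # rax = new socket file descriptor
mov eax, 60     # syscall number for exit
xor rdi, rdi    # exit status 0
syscall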
bind
After creating a socket, the next step is to assign it a network identity. In this challenge, you will use the bind syscall to connect your socket to a specific IP address and port number. The call requires you to provide the socket file descriptor, a pointer to a struct sockaddr (specifically a struct sockaddr_in for IPv4 that holds fields like the address family, port, and IP address), and the size of that structure. Binding is essential because it ensures your server listens on a known address, making it reachable by clients.
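The 16-byte struct sockaddr_in can be built by hand to see what ends up in memory. The Python sketch below assumes Linux on little-endian x86-64, where AF_INET is 2; the function sockaddr_in and the choice of port 80 on 0.0.0.0 are assumptions for illustration.

```python
import socket
import struct

AF_INET = 2  # value on Linux; assumed here rather than read from the system

def sockaddr_in(port, ip="0.0.0.0"):
    """Lay out a struct sockaddr_in the way it sits in memory on x86-64 Linux."""
    sin_family = struct.pack("<H", AF_INET)   # 2 bytes, host (little-endian) order
    sin_port = struct.pack(">H", port)        # 2 bytes, network (big-endian) order
    sin_addr = socket.inet_aton(ip)           # 4 bytes of IPv4 address
    return sin_family + sin_port + sin_addr + bytes(8)  # 8 bytes of zero padding

addr = sockaddr_in(80)
print(len(addr), addr[:4].hex())                   # 16 02000050
print(hex(int.from_bytes(addr[2:4], "little")))    # 0x5000
```

The port bytes 00 50 read back as 0x5000 when treated as a little-endian word, which is the constant an x86 program has to store to get port 80 in network byte order.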
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
mov eax, 41
mov edi, 2
mov esi, 1
mov edx, 0
syscall
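sub rsp, 16                     # reserve 16 bytes for struct sockaddr_in
xor rcx, rcx
mov qword ptr [rsp+8], rcx      # zero the 8 padding bytes
mov dword ptr [rsp+4], 0        # sin_addr = 0.0.0.0
mov word ptr [rsp+2], 0x5000    # sin_port = 80 in network byte order (bytes 00 50)
mov word ptr [rsp], 2           # sin_family = AF_INET
mov edi, eax                    # socket file descriptor returned by socket
mov eax, 49                     # syscall number for bind
mov rsi, rsp                    # pointer to the sockaddr_in
mov edx, 16                     # size of the structure
syscall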
mov eax, 60
xor rdi, rdi
syscall
listen
With your socket bound to an address, you now need to prepare it to accept incoming connections. The listen syscall transforms your socket into a passive one that awaits client connection requests. It requires the socket’s file descriptor and a backlog parameter, which sets the maximum number of queued connections. This step is vital because without marking the socket as listening, your server wouldn’t be able to receive any connection attempts.
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
mov eax, 41
mov edi, 2
mov esi, 1
mov edx, 0
syscall
sub rsp, 16
xor rcx, rcx
mov qword ptr [rsp+8], rcx
mov dword ptr [rsp+4], 0
mov word ptr [rsp+2], 0x5000
mov word ptr [rsp], 2
mov edi, eax
mov eax, 49
mov rsi, rsp
mov edx, 16
syscall
mov eax, 50     # syscall number for listen (43 is accept)
xor esi, esi    # backlog = 0; rdi still holds the socket file descriptor
syscall
mov eax, 60
xor rdi, rdi
syscall
accept
Once your socket is listening, it’s time to actively accept incoming connections. In this challenge, you will use the accept syscall, which waits for a client to connect. When a connection is established, it returns a new socket file descriptor dedicated to communication with that client and fills in a provided address structure (such as a struct sockaddr_in) with the client’s details. This process is a critical step in transforming your server from a passive listener into an active communicator.
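The socket, bind, listen, accept sequence can be tried from Python, whose socket module wraps the same syscalls. The sketch below is an assumption-laden illustration: it binds to the loopback address with port 0 so the kernel picks a free port, and it creates its own client so that accept has something to return. The name accept_one is made up.

```python
import socket

def accept_one():
    """Run socket -> bind -> listen -> accept once on loopback and report what came back."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    server.bind(("127.0.0.1", 0))   # port 0: let the kernel choose a free port
    server.listen(1)                # mark the socket as passive
    client = socket.create_connection(server.getsockname())
    conn, peer = server.accept()    # new descriptor for this client, plus its address
    result = (conn.fileno() != server.fileno(), peer == client.getsockname())
    for sock in (conn, client, server):
        sock.close()
    return result

print(accept_one())  # (True, True)
```

The first value shows that accept returns a descriptor distinct from the listening socket; the second shows that the address it fills in is the client's own address and port.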
SOLUTION:
.intel_syntax noprefix
.global _start
_start:
mov eax, 41
mov edi, 2
mov esi, 1
mov edx, 0
syscall
sub rsp, 16
xor rcx, rcx
mov qword ptr [rsp+8], rcx
mov dword ptr [rsp+4], 0
mov word ptr [rsp+2], 0x5000
mov word ptr [rsp], 2
mov edi, eax
mov eax, 49
mov rsi, rsp
mov edx, 16
syscall
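mov eax, 50     # syscall number for listen
mov esi, 0      # backlog = 0
syscall
mov eax, 43     # syscall number for accept
xor rsi, rsi    # addr = NULL: client address not requested
xor edx, edx    # addrlen = NULL
syscall         # blocks until a client connects; rax = new connection fd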
mov eax, 60
xor rdi, rdi
syscall