This post is an archive for my personal solutions to the ten labs for the CMU 15213 course. You can access the self-study labs at the CSAPP official website.
Lab 1: Data Lab
This lab instructs you to write functions using bit-wise operations.
bixXor
Implements xor
using only ~
and
&
. The key is to notice that
x | y == ~(~x & ~y)
.
1 | int bitXor(int x, int y) { |
tmin
Returns the minimum two's complement integer.
1 | int tmin(void) { |
isTmax
Checks if the input is the maximum two's complement integer. Notice
that the maximum integer has a leading 0
and all remaining
bits are 1
.
1 | int isTmax(int x) { |
Another way to think the maximum integer is, it's the only number
that equals to 0
when adding
(1 << 31) + 1
. So we can also use the following
simpler code:
1 | int isTmax(int x) { |
allOddBits
Checks if all odd index bits are set to 1. You can create a mask to detect the pattern.
1 | int allOddBits(int x) { |
negate
Negates the input.
1 | int negate(int x) { |
isAsciiDigit
Checks if the input is an ASCII digit, e.g., hex representation
between 0x30
and 0x39
. We can divide the
pattern into the parts. The first part is 0x30
, and the
second is 0x0?
where ?
can be 0
to 9
. When examining the second part, the least significant
byte either does not exceed 1000
(0x8
), or is
exactly 0x8
or 0x9
.
1 | int isAsciiDigit(int x) { |
conditional
Implements the conditional operator x ? x : y
. Just use
mask to achieve it.
1 | int conditional(int x, int y, int z) { |
isLessOrEqual
Checks if the first argument is less or equal than the second
argument. You can use -x == ~x + 1
and examine the sign
bit.
1 | int isLessOrEqual(int x, int y) { |
logicalNeg
Implements the !
operator. That's, for any non-zero
number, it returns 0
and for zero it returns
1
. If the input is negative (the sign bit is
1
), or the input added with -1
overflows then
the input is non-zero.
1 | int logicalNeg(int x) { |
howManyBits
Returns the minimum number of bits required to represent the input
integer. Note that a positive integer implies a leading 0
and a negative integer implies a leading 1
. Therefore in
either case, we must at least have a sign bit. We can safely invert all
bits of a negative integer and from the most significant bit to the
least significant bit find the index at which the bit is
1
.
The trick here is to binary-search such index. To do this, we must
create several masks, shown as in the following code. For example,
Mask_16 = 0xFFFF0000
detects whether any of the most
significant 16 bits is 1
. If so, it means the result is
at least 16. Then we can right shift the number 16 bits to
recursively examine the most significant 16 bits by applying
Mask_8 = 0xFF00
.
1 | int howManyBits(int x) { |
floatScale2
Returns 2*f
where the input is interpreted as a
floating-point number. You should be careful about the border cases. If
the input number is NaN or infinity (i.e. the exponent is all ones), the
function returns the input as required. If the input is denormalized
(i.e. the exponent is all zeros), the input will be shifted left 1 bit.
If it's normalized (i.e. the exponent is neither all zeros nor all
zeros), we just add one to the exponent.
1 | unsigned floatScale2(unsigned uf) { |
floatFloat2Int
Returns bit-level equivalent of (int) x
where
x
is interpreted as a floating number. Be aware of the
border cases: (1) if the input is zero, the function also returns zero;
(2) if the input is NaN or infinity, it returns 0x80000000u
as required; (3) if the input is denormalized, it returns zero; (4) if
the input is normalized, determine the result by checking the
exponent.
1 | int floatFloat2Int(unsigned uf) { |
floatPower2
Returns 2.0^x
for any given integer x
. The
smallest non-zero floating (denormalized) number is
2^(-126+23)=2^(-149)
. So if the input is smaller than
-149
, the functions should return 0
. The
largest denormalized floating number in the form of 2.0^x
is x^(-127)
. In the range of -149
and
-127
, we just left shift one by x+149
. The
smallest normalized number (also, in the form of 2.0^x
) is
2^(-126)
and the largest is
2^(254-127)=2^(127)
. In this case, we left shift
x+127
by 23
to occupy the exponent part. If
x
exceeds 127
, the function returns
+INF
.
1 | unsigned floatPower2(int x) { |
Lab 2: Bomb Lab
The answers to the six phases are respectively:
1 | Border relations with Canada have never been better. |
If we deciphered the secret phase, the answers will be (adding string
DrEvil
after the third line):
1 | Border relations with Canada have never been better. |
Note that some of the answers are not unique. Let's quickly go through these phases.
Phase 1
Find the string at address 0x402400
. You can use
x/s 402400
to extract it!
Phase 2
The key to solve this phase is know the functionality of each register:
%rdi
: the first argument.%rsi
: the second argument.%rdx
: the third argument.%rcx
: the forth argument.%r8
: the fifth argument.%r9
: the sixth argument.
More arguments are stored in stack. Know at which positions these arguments are stored in stack and you'll get the answer!
Phase 3
This function reads two integers and switches to the address accroding to the first integer. If the destination address has a value same as the second integer, the phase passes.
This phase has multiple answers. You can choose one of the followings:
0 207
1 311
2 707
3 256
4 389
5 206
6 682
7 327
Phase 4
The key is to decode func4(x, s, e)
and understand it
returns some numbers of steps to find x
between
[s, e]
using binary search. s=0
and
e=14
. The function returns 0
if it finds
x
in the very first step. So x=7
.
Phase 5
This phase executes the following steps:
- The answer string length is 6.
- Reads a string from
0x4024b0
and compares it to the string at0x40245e
, which isflyers
. - The string at
0x4024b0
ismaduiersnfotvbylSo you think you can stop the bomb with ctrl-c, do you?
. - The indices respectively for
f
,l
,y
,e
,r
,s
at0x4024b0
is 9, 15, 14, 5, 6, 7. The hex representations are9
,f
,e
,5
,6
,7
. - Find in the ASCII table those characters whose low hex
representation is
9
,f
,e
,5
,6
and7
. So there are multiple possible answers.
Phase 6
This is the most difficult phase (including the secret one), in terms of it's length of assembly... Be patient and take a careful look at the assembly. You'll find this phase does the following things:
- It reads six integers and stores them in stack from the bottom up.
- Each integer is different and all are smaller or equal than 6.
- Subtracts each integer from 7. For example, 6 turns into 1.
- According to the result integer, this function jumps to the target
address from
0x6032d0
and reads the value. This value is also an address. The results are stored from%rsp+32
to%rsp+72
, amounting to totally eight addresses. - Sets
*(*(%rsp+t)+8)=*(%rsp+t+8)
, wheret=32, 3a, 42, ..., 6a
. Fort=72
, the target value is zero. - It actually implements a linked list, likely implemented as follows:
1
2
3
4struct node {
int value;
node* next;
}; - This linked list has a list of non-ascending values.
- Find the indices in a non-ascending order and restore the input indices.
Secret Phase
Disassembling the phase_defused
function and you will
find a secret secret_phase
function. This is the hidden
phase of this lab. The secret_phase
function does:
- Receives an integer less or equal than 1001.
- Calls function
fun7(int* s, int x)
and returns 2. - The
fun7(int* s, int x)
function receives a pointers
to a binary search tree node and finds valuex
in the tree. - Your goal is to find in this tree a value that returns 2, according to the calculation rules.
This binary search tree node looks like this:
1 | 36 |
Lab 3: Attack Lab
This lab asks you to construct exploits to have the function output
specific strings. Before start, make sure to take a glance at the handout, which has
detailed instructions for this lab. Remember to use -Og
to
compile the .s
file and use objdump -d
to
generate byte codes.
Phase 1
In the getbuf()
function, %rsp
has an
address of 0x5561dc78
and the return address is located at
%rsp+0x28
, which is 0x5561dca0
. To call the
touch1()
function, you should set the return address as the
address of touch1()
, that is, 0x4017c0
.
The input to ctarget
is thus:
1 | 31 32 33 34 35 36 37 38 |
The first 40 bytes are arbitrary except \n
.
Phase 2
In this phase, you should put 0x591997fa
into
%rdi
and jump to 0x4017ec
to execute function
touch2()
.
The first step is to write a .s
file:
1 | movq $0x59b997fa, %rdi |
Then generate its byte codes:
1 | 48 c7 c7 fa 97 b9 59 // movq $0x59b997fa, %rdi |
Write the input exploit file:
1 | 31 32 33 34 35 36 37 38 |
Phase 3
Note that in this phase, the content in the stack may be overwritten
by function hexmatch()
. So you must write the content in
some high location of the stack.
The assembly code is:
1 | movq $0x5561dca8, %rdi |
The generated byte codes are:
1 | 48 c7 c7 a8 dc 61 55 // movq $0x5561dca8, %rdi |
The input exploit is:
1 | 48 c7 c7 a8 dc 61 55 68 // Address 0x5561dc78 |
Phase 4
In this phase, you should construct two gadgets, one for moving data
0x59b997fa
into some register like %rax
, and
the other one for moving %rax
to %rdi
as the
argument into the function. Last, you should return to function
touch2()
.
You can find a function which contains the desired byte codes and
returns to that location. The first gadget uses the function
addval_219
and the second function is
setval_426
. You can choose other functions if
applicable.
1 | 00 00 00 00 00 00 00 00 |
Here is a full list of decoded byte codes for all functions in
farm.c
. Use Ctrl+F to search the byte patterns you
want.
1 | // 0x40199a: b8 fb 78 90 90 c3 |
Phase 5
Here is how you should think to solve this phase:
- Function
touch3(const char*)
receives an address as the argument. So you should prepare an address value in%rdi
. - This address value should come from
%rsp
. An intuitive way is usemov %rsp, %rax
. - However, the cached
%rsp
value is not allowed to store the string value because it will be immediately transmitted toPC
as the address of the next instruction. It must be a valid address. farm.c
provides a functionadd_xy(long, long)
. It can be leveraged to offset the value of%rsp
.- The returned value of
add_xy(long, long)
is the address where the string is actually stored. It lies at the highest position of our input.
1 | 00 00 00 00 00 00 00 00 |