Proj 12: Intro to 64-bit Assembler (15 pts.)

What You Need

Purpose

To learn the basics of 64-bit assembly programming by writing several simple programs.

Downloading a Kali 64-bit VM

If you are in S214 in the Spring 2017 semester, there are already 64-bit Kali VMs on the computers.

If you need to download one, go to

https://www.offensive-security.com/kali-linux-vmware-arm-image-download/

Download the latest "Kali Linux 64 bit VM". When I did it (March, 2017) it was version 2016.2.

Unzip the .7z file and run it in VMware Player or Fusion.

Installing Yasm

Yasm is a rewrite of the NASM assembler, and we use it to assemble every program in this project. Execute these commands in a Terminal to install it.
cd /tmp
wget http://http.us.debian.org/debian/pool/main/y/yasm/yasm_1.2.0-2_amd64.deb
dpkg -i yasm_1.2.0-2_amd64.deb

Understanding Syscall 1: Write

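From the Linux Syscall Table, this call is specified as:

    rax = 1 (sys_write)
    rdi = unsigned int fd
    rsi = const char *buf
    rdx = size_t count

So to write text to the console, we must do these things:

    Put 1 in rax to select the write syscall
    Put 1 in rdi, the file descriptor of stdout
    Put the address of the string in rsi
    Put the length of the string in bytes in rdx
    Execute the syscall instruction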
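For comparison, here is a minimal Python sketch of the same call. It is not part of the project. It assumes a Linux system, where Python's os.write hands a file descriptor and a buffer to the write system call and returns the number of bytes written. The helper name write_text is made up for this illustration.

```python
import os


def write_text(fd: int, text: bytes) -> int:
    """Call write(fd, buf, count) and return the number of bytes written."""
    # os.write passes the descriptor, the buffer, and the buffer length to
    # the kernel: the same three arguments the assembly program places in
    # rdi, rsi, and rdx.
    return os.write(fd, text)


if __name__ == "__main__":
    count = write_text(1, b"ABCDEFGH\n")   # fd 1 is stdout
    print("bytes written:", count)
```

The return value plays the role of rax after the syscall instruction: the kernel reports how many bytes it actually wrote.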

Program 1: ABC

This is the simplest program I know in Assembler. It uses syscall once to print the letters 'ABCDEFGH'.

In a Terminal window, execute this command:

nano abc1.asm
Enter this code in the editor.
section .text
    global _start

    _start:
        mov  rax, 0x4142434445464748    ; 'ABCDEFGH'
        push rax
        mov  rdx, 0x8     ; length of string is 8 bytes
        mov  rsi, rsp     ; Address of string is RSP because string is on the stack
        mov  rax, 0x1     ; syscall 1 is write
        mov  rdi, 0x1     ; stdout has a file descriptor of 1
        syscall           ; make the system call

Save the file with Ctrl+X, Y, Enter.

Execute these commands to assemble, link, and run the program:

yasm -f elf64 abc1.asm

ld -o abc1.out abc1.o

./abc1.out

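The program prints out the letters in reverse order ("HGFEDCBA"), and then crashes with a "Segmentation fault" message, as shown below. The crash happens because nothing stops the processor after the syscall, so it keeps executing whatever bytes follow the program in memory.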

Program 2: ABC & Exit

The program crashes at the end rather than exiting normally. To fix that, we need to add a second syscall, to "exit".

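The Linux Syscall Table specifies the "exit" call as:

    rax = 60 (0x3c, sys_exit)
    rdi = int error_code

So to exit, we must do these things:

    Put 0x3c in rax to select the exit syscall
    Execute the syscall instruction

The programs below do not set rdi before exiting, so the exit status is whatever value rdi already holds.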

In a Terminal window, execute these commands:
cp abc1.asm abc2.asm

nano abc2.asm

Add these lines at the bottom of the program:
        mov  rax, 0x3c    ; syscall 3c is exit
        syscall           ; make the system call

Save the file with Ctrl+X, Y, Enter.

Execute these commands to assemble, link, and run the program:

yasm -f elf64 abc2.asm

ld -o abc2.out abc2.o

./abc2.out

The program prints out the letters in reverse order, and then exits normally, as shown below.
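If you want to confirm that one program crashed and the other exited, you can look at how each process ended. The Python sketch below is an optional aside, not a required step. It assumes a POSIX system, where subprocess reports a process killed by signal N as return code -N. The helper names describe_exit and run_and_describe are made up for this example, and the child process in the demo is a stand-in.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status " + str(returncode)


def run_and_describe(command: list) -> str:
    """Run a command and describe how it ended."""
    result = subprocess.run(command)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Stand-in child process. After building the programs, replace it with
    # ["./abc1.out"] or ["./abc2.out"].
    child = [sys.executable, "-c", "import os; os._exit(0)"]
    print(run_and_describe(child))
```

With ["./abc1.out"] this should report SIGSEGV. With ["./abc2.out"] it should report an exit status, most likely 1, because rdi still holds the file descriptor from the write call.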

Program 3: ABC in Order

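Let's fix our program to print the letters in order. The x86-64 processor is little-endian: when a 64-bit register is pushed, its least significant byte lands at the lowest address, and write outputs bytes starting from the lowest address. So all we need to do is specify the hexadecimal ASCII codes in reverse order.

In a Terminal window, execute these commands: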
cp abc2.asm abc3.asm

nano abc3.asm

The 5th line of the program is:
        mov  rax, 0x4142434445464748    ; 'ABCDEFGH'
Change it to:
        mov  rax, 0x4847464544434241    ; 'HGFEDCBA', stored in memory as 'ABCDEFGH'

Save the file with Ctrl+X, Y, Enter.

Execute these commands to assemble, link, and run the program:

yasm -f elf64 abc3.asm

ld -o abc3.out abc3.o

./abc3.out

The program prints out the letters in the correct order, and then exits normally, as shown below.
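The byte order can also be checked without assembling anything. This Python sketch is an illustration only, and stored_bytes is a made-up helper. It packs each 64-bit value the way an x86-64 machine stores it, lowest byte first, which is the order in which write prints the bytes.

```python
import struct


def stored_bytes(value: int) -> bytes:
    """Return the 8 bytes a 64-bit value occupies in memory on x86-64."""
    # "<Q" means little-endian unsigned 64-bit: lowest byte first.
    return struct.pack("<Q", value)


if __name__ == "__main__":
    print(stored_bytes(0x4142434445464748))   # Programs 1 and 2: b'HGFEDCBA'
    print(stored_bytes(0x4847464544434241))   # Program 3: b'ABCDEFGH'
```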

Saving the Screen Image

Make sure you can see the "ABCDEFGH" message, and that there is no error message.

Save a whole-desktop image with a filename of "Proj 12a from YOUR NAME".

Program 4: "Hello World" Using a .data Section

This program stores a string in the .data section and references it.

In a Terminal window, execute this command:

nano hello.asm
Enter this code in the editor.
section .data
    string1 db  "Hello World!",10   ; '10' at end is line feed

section .text
    global _start

    _start:
        mov  rdx, 0xd               ; length of string is 13 bytes
        mov  rsi, dword string1     ; set rsi to pointer to string
        mov  rax, 0x1               ; syscall 1 is write
        mov  rdi, 0x1               ; stdout has a file descriptor of 1
        syscall                     ; make the system call

        mov  rax, 0x3c              ; syscall 3c is exit
        syscall                     ; make the system call

Save the file with Ctrl+X, Y, Enter.

Execute these commands to assemble, link, and run the program:

yasm -f elf64 hello.asm

ld -o hello.out hello.o

./hello.out

The program prints out the message, as shown below.
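The length placed in rdx has to match the string exactly: too small cuts the message short, and too large prints whatever bytes follow it in memory. A quick way to double-check the constant is to count the bytes in Python. This is only a convenience sketch, and length_operand is a made-up helper.

```python
MESSAGE = b"Hello World!" + bytes([10])   # 10 is the line feed after the text


def length_operand(data: bytes) -> str:
    """Return the byte count as the hex literal to load into rdx."""
    return hex(len(data))


if __name__ == "__main__":
    print(len(MESSAGE), length_operand(MESSAGE))   # 13 0xd
```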

Program 5: "Echo" Using a .data Section

This program uses syscall 0 to read text from stdin and prints it out.

In a Terminal window, execute this command:

nano read.asm
Enter this code in the editor.
section .data
    string1 db  "AAAABBBBCCX"       ; Reserve 11 bytes; only the first 10 are used

section .text
    global _start

    _start:
        mov  rdx, 0xa               ; length of string is 10 bytes
        mov  rsi, dword string1     ; set rsi to pointer to string
        mov  rax, 0x0               ; syscall 0 is read
        mov  rdi, 0x0               ; stdin has a file descriptor of 0
        syscall                     ; make the system call

        mov  rdx, 0xa               ; length of string is 10 bytes
        mov  rsi, dword string1     ; set rsi to pointer to string
        mov  rax, 0x1               ; syscall 1 is write
        mov  rdi, 0x1               ; stdout has a file descriptor of 1
        syscall                     ; make the system call

        mov  rax, 0x3c              ; syscall 3c is exit
        syscall                     ; make the system call

Save the file with Ctrl+X, Y, Enter.

Execute these commands to assemble, link, and run the program:

yasm -f elf64 read.asm

ld -o read.out read.o

./read.out

The program waits for input. Type APPLE and press Enter.

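The program prints out "APPLE", followed by some extra characters ("BBCC"), as shown below. They appear because the write call always outputs 10 bytes: the 6 bytes that were read ("APPLE" plus the line feed), followed by 4 bytes left over from the original buffer contents.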
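The extra characters can be reproduced with a small Python model of the buffer. This is a sketch, not the program itself: the function names are made up, and the second function models a fix that the project does not implement, namely writing only as many bytes as read reported (on Linux, read returns that count in rax).

```python
BUFFER_SIZE = 10  # the byte count passed to both read and write


def echo_fixed_length(initial: bytes, typed: bytes) -> bytes:
    """Model Program 5: always write BUFFER_SIZE bytes."""
    buffer = bytearray(initial)
    received = typed[:BUFFER_SIZE]          # read copies at most 10 bytes
    buffer[:len(received)] = received       # overwrite the start of the buffer
    return bytes(buffer[:BUFFER_SIZE])


def echo_read_length(initial: bytes, typed: bytes) -> bytes:
    """Model a corrected version: write only the bytes that were read."""
    buffer = bytearray(initial)
    received = typed[:BUFFER_SIZE]
    buffer[:len(received)] = received
    return bytes(buffer[:len(received)])


if __name__ == "__main__":
    print(echo_fixed_length(b"AAAABBBBCCX", b"APPLE\n"))   # b'APPLE\nBBCC'
    print(echo_read_length(b"AAAABBBBCCX", b"APPLE\n"))    # b'APPLE\n'
```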

If we were programming students, the next step would be to clean this thing up and get rid of the extra characters, make it calculate the string length automatically, etc.

But we have a different goal--to criticize code and its exploitable weaknesses--so we'll move on to other things.

Program 6: Sloppy Caesar Cipher

This program is like "Echo" but it increments each byte of the input before printing it out. It doesn't work correctly for "Z" or "z", which should wrap around to "A" or "a", and has other flaws.
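To make the "Z" problem concrete, here is a Python sketch that compares a plain byte increment with a Caesar shift that wraps around. Both function names are made up for this illustration, and the wrapping version is not something the project builds.

```python
def increment_bytes(text: bytes) -> bytes:
    """Add 1 to every byte with no alphabet wraparound."""
    # This per-byte model ignores carries between bytes, which only
    # matter in the 64-bit add if a byte is 0xff.
    return bytes((b + 1) % 256 for b in text)


def caesar_shift_one(text: bytes) -> bytes:
    """Shift letters by one place, wrapping Z to A and z to a."""
    out = bytearray()
    for b in text:
        if ord("A") <= b <= ord("Z"):
            out.append((b - ord("A") + 1) % 26 + ord("A"))
        elif ord("a") <= b <= ord("z"):
            out.append((b - ord("a") + 1) % 26 + ord("a"))
        else:
            out.append(b)   # leave non-letters alone
    return bytes(out)


if __name__ == "__main__":
    print(increment_bytes(b"XYZ"))    # b'YZ['
    print(caesar_shift_one(b"XYZ"))   # b'YZA'
```

The plain increment also changes non-letters, so the line feed at the end of the input (byte 0x0a) becomes 0x0b.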

In a Terminal window, execute this command:

nano caesar.asm
Enter this code in the editor.
section .data
    string1 db  "AAAABBBB"           ; Reserve space for 8 characters

section .text
    global _start

    _start:
        mov  rdx, 0x8                ; length of string is 8 bytes
        mov  rsi, dword string1      ; set rsi to pointer to string
        mov  rax, 0x0                ; syscall 0 is read
        mov  rdi, 0x0                ; stdin has a file descriptor of 0
        syscall                      ; make the system call

        mov  rbx, dword string1      ; set rbx to pointer to string
        mov  rcx, [rbx]              ; Load 8 bytes of the string into rcx
        add  rcx, 0x0101010101010101 ; Add 1 to each byte, not fixing rollover
        mov  [rbx], rcx              ; Store the modified bytes back in the string

        mov  rdx, 0x8                ; length of string is 8 bytes
        mov  rsi, dword string1      ; set rsi to pointer to string
        mov  rax, 0x1                ; syscall 1 is write
        mov  rdi, 0x1                ; stdout has a file descriptor of 1
        syscall                      ; make the system call

        mov  rax, 0x3c               ; syscall 3c is exit
        syscall                      ; make the system call

Save the file with Ctrl+X, Y, Enter.

Execute these commands to assemble, link, and run the program:

yasm -f elf64 caesar.asm

ld -o caesar.out caesar.o

./caesar.out

The assembler prints a warning saying a value is too large to fit into a 32-bit field, but the program still assembles.

The program waits for input. Type HELLO and press Enter.

The program prints out "IFMMO", followed by some extra characters, as shown below.

The program encrypted the first 4 letters, but not the "O".

Let's see what the assembler actually did.

Examining Assembly Code with Objdump

Execute this command:
objdump -d caesar.out
As shown below, the assembler changed the "add" instruction to one that only adds a 32-bit value to rcx, not the 64-bit value we wanted.

Consulting the Intel 64 and IA-32 Architectures Software Developer's Manual, I found these ADD instructions:

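That's hard to understand, but it means we can do a 64-bit add, just not with a 64-bit immediate value. ADD accepts an immediate of at most 32 bits, which is sign-extended to 64 bits before the addition. MOV, on the other hand, can load a full 64-bit immediate into a register, so we need to put the value in a register and add that.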
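The effect of the shortened immediate can be modeled in a few lines of Python. This is a sketch under stated assumptions: the assembler kept only the low 32 bits of the constant (which matches the "IFMMO" output), the buffer holds "HELLO", a line feed, and the two leftover "B" bytes, and the function names are made up for this illustration.

```python
MASK64 = (1 << 64) - 1


def sign_extend_imm32(imm: int) -> int:
    """Keep the low 32 bits and sign-extend them to 64 bits."""
    imm &= 0xFFFFFFFF
    if imm & 0x80000000:
        imm -= 1 << 32
    return imm & MASK64


def add_to_string(text: bytes, addend: int) -> bytes:
    """Load 8 bytes little-endian, add, and store the result back."""
    value = int.from_bytes(text, "little")
    return ((value + addend) & MASK64).to_bytes(8, "little")


if __name__ == "__main__":
    buffer = b"HELLO\nBB"   # buffer contents after typing HELLO and Enter
    truncated = sign_extend_imm32(0x0101010101010101)
    print(hex(truncated))                                 # 0x1010101
    print(add_to_string(buffer, truncated))               # b'IFMMO\nBB'
    print(add_to_string(buffer, 0x0101010101010101))      # b'IFMMP\x0bCC'
```

The last line models a full 64-bit add from a register: every byte is incremented, including the line feed and the leftover "B" bytes.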

Program 7: Improved Caesar Cipher

In a Terminal window, execute this command:
cp caesar.asm caesar2.asm

nano caesar2.asm

In the editor, change this line:
        add  rcx, 0x0101010101010101  ; Add 1 to each byte, not fixing rollover
To this:
        mov  r8, 0x0101010101010101  ; Put value in r8
        add  rcx, r8                 ; Add using registers

Save the file with Ctrl+X, Y, Enter.

Execute these commands to assemble, link, and run the program:

yasm -f elf64 caesar2.asm

ld -o caesar2.out caesar2.o

./caesar2.out

There's a warning message saying a value is too large to fit into a 32-bit field, but the program compiles.

The program waits for input. Type HELLO and press Enter.

The program prints out "IFMMP", as it should!

Saving the Screen Image

Make sure you can see the "IFMMP" output.

Save a whole-desktop image with a filename of "Proj 12b from YOUR NAME".

Turning In Your Project

Email the images to cnit.127sam@gmail.com with a subject of "Project 12 from YOUR NAME".


Sources

Linux Syscall Table

Basics & Sample Programs

Intel 64 and IA-32 Architectures Software Developer's Manual



Text corrections 11-10-18