Machine Language - x86 Assembly

Class: CSCE-312


Notes:

Assembly Instructions

The mov Instruction:

How to tell assembly syntax apart?

Instruction suffix letters (operand size)

In AT&T/GAS (GNU Assembler) syntax, x86 instructions typically include a suffix letter that specifies the operand size:

So for example:

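Note: If the suffix is omitted, the assembler can usually infer the operand size from a register operand (mov %ebx, %eax is assembled as movl). When no operand is a register, as in mov $1, (%ebx), the size is ambiguous and the suffix is required.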
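The suffix letters map directly to operand widths. A minimal C sketch of that mapping, assuming the usual AT&T letters (b, w, l, q); the helper name suffix_size is hypothetical and not part of any assembler API:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: operand size in bytes for an AT&T suffix letter,
   or 0 if the letter is not one of b/w/l/q. */
size_t suffix_size(char suffix) {
    switch (suffix) {
    case 'b': return sizeof(int8_t);   /* byte, as in movb */
    case 'w': return sizeof(int16_t);  /* word, as in movw */
    case 'l': return sizeof(int32_t);  /* long, as in movl */
    case 'q': return sizeof(int64_t);  /* quad, as in movq */
    default:  return 0;                /* not a size suffix */
    }
}
```

So movl moves 4 bytes and movq moves 8, which is why movq pairs with 64-bit registers such as %rax.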

Data Movement Instructions (AT&T)

Example:

mov %ebx, %eax      # eax = ebx
push %eax           # push eax onto stack
pop %ecx            # pop into ecx
lea 8(%ebp), %eax   # eax = ebp + 8 (address only; no memory is read)

Arithmetic Instructions (AT&T)

Example:

add %ebx, %eax      # eax += ebx
sub $1, %ecx        # ecx -= 1
imul %ecx, %eax     # eax *= ecx

Logical & Bitwise (AT&T)

Example:

and %ebx, %eax      # eax &= ebx
xor %eax, %eax      # eax = 0
shl $2, %ecx        # ecx <<= 2

Control Flow (AT&T)

Example:

cmp %ebx, %eax      # set flags from eax - ebx
je equal_case       # jump if eax == ebx
jne not_equal_case  # jump if eax != ebx

Special (AT&T)

Code in assembly from Scratch!

This is a Linux x86-64 assembly program that prints “Hello, World!” and exits.

Intel Syntax

hello.asm

section .data
		text db "Hello, World!",10

section .text
		global _start

_start: 
	mov rax, 1
	mov rdi, 1
	mov rsi, text
	mov rdx, 14
	syscall
	
	mov rax, 60
	mov rdi, 0
	syscall

AT&T Syntax

hello.s

	.data
text:
	.string "Hello, World!\n"
	
	.text
	.global _start

_start: 
	mov $1, %rax         # syscall number: write
	mov $1, %rdi         # file descriptor: stdout
	mov $text, %rsi      # pointer to string
	mov $14, %rdx        # string length
	syscall
	
	mov $60, %rax        # syscall number: exit
	mov $0, %rdi         # exit code
	syscall

How to install and run Assembly?

What is NASM, GAS, Linker?

How to Install and Run (Linux)

Syscall

What is a syscall?

How syscalls work (x86-64 Linux)

Common syscall numbers (x86-64 Linux)

The full list is in /usr/include/x86_64-linux-gnu/asm/unistd_64.h (the path on Debian and Ubuntu systems).
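A few entries from that list, written as a small lookup sketch. The numbers are the x86-64 Linux values for read, write, open, close, and exit; the function name syscall_name is hypothetical and exists only for illustration:

```c
#include <stddef.h>

/* Hypothetical lookup: name for a handful of x86-64 Linux syscall numbers,
   or NULL if the number is not in this short table. */
const char *syscall_name(long nr) {
    switch (nr) {
    case 0:  return "read";
    case 1:  return "write";
    case 2:  return "open";
    case 3:  return "close";
    case 60: return "exit";
    default: return NULL;   /* not listed here; see the header file */
    }
}
```

The value placed in %rax before the syscall instruction is one of these numbers.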

Let's take another look at the hello.s code!

Section of hello.s

	mov $1, %rax         # syscall 1 = write
	mov $1, %rdi         # fd = 1 (stdout)
	mov $text, %rsi      # pointer to string
	mov $14, %rdx        # length
	syscall              # => write(1, text, 14)
	
	mov $60, %rax        # syscall 60 = exit
	mov $0, %rdi         # exit code 0
	syscall              # => exit(0)

So literally:
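In C terms, the first syscall is roughly the call below. This is a sketch that assumes the libc wrapper write stands in for the raw syscall; the function name say_hello is hypothetical.

```c
#include <unistd.h>

static const char text[] = "Hello, World!\n";

/* Same request as the first syscall in hello.s: write(1, text, 14).
   Returns the number of bytes written, or -1 on error. */
long say_hello(void) {
    /* rdi = 1 (stdout), rsi = text, rdx = 14 */
    return (long)write(1, text, 14);
}
```

The second syscall corresponds to _exit(0): the value in %rdi becomes the process exit status.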

Examples

Jump Example (Intel)

section .text
global _start

_start:
		mov eax, 3
		mov ebx, 2
		cmp eax, ebx
		jl lesser
		jmp end
		
lesser:
		mov ecx, 1

end: 
		mov rax, 60
		mov rdi, 0
		syscall

Step by step:

Jump Example (AT&T - GNU Assembler)

	.text
	.global _start
	
_start:
	mov $3, %eax        # eax = 3
	mov $2, %ebx        # ebx = 2
	cmp %ebx, %eax      # compare eax - ebx
	jl lesser           # jump if less (eax < ebx)
	jmp end

lesser:
	mov $1, %ecx        # ecx = 1
	
end: 
	mov $60, %rax       # syscall: exit
	mov $0, %rdi        # exit code 0
	syscall
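The same control flow in C, as a rough sketch. The function name jump_example is hypothetical, and 0 is used as a stand-in for "ecx was never written", which the assembly does not define:

```c
/* Hypothetical C rendering of the jump example.
   Returns the final value of ecx (0 means the mov was skipped). */
int jump_example(int eax, int ebx) {
    int ecx = 0;        /* assumption: placeholder for an unwritten ecx */
    if (eax < ebx) {    /* cmp %ebx, %eax ; jl lesser */
        ecx = 1;        /* lesser: mov $1, %ecx */
    }
    return ecx;         /* end: */
}
```

With eax = 3 and ebx = 2, the comparison is false, so jl is not taken and execution reaches jmp end without touching ecx.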

More Examples

This is a slightly larger program that asks the user for their name and prints a greeting.

Explanation

Code (AT&T syntax):

	.data
text1:
	.string "What is your name? "
text3:
	.string "Hello, "
	
	.bss
	.lcomm name, 16        # reserve 16 bytes
	
	.text
	.global _start

_start:
	call _printText1
	call _getName
	call _printText2
	call _printName
	
	mov $60, %rax          # syscall: exit
	mov $0, %rdi
	syscall

_getName:
	mov $0, %rax           # syscall: read
	mov $0, %rdi           # fd = stdin
	mov $name, %rsi        # buffer
	mov $16, %rdx          # size
	syscall
	ret

_printText1:
	mov $1, %rax           # syscall: write
	mov $1, %rdi           # fd = stdout
	mov $text1, %rsi
	mov $19, %rdx
	syscall
	ret

_printText2:
	mov $1, %rax
	mov $1, %rdi
	mov $text3, %rsi
	mov $7, %rdx
	syscall
	ret
	
_printName:
	mov $1, %rax
	mov $1, %rdi
	mov $name, %rsi
	mov $16, %rdx          # writes the whole 16-byte buffer, including unused zero bytes
	syscall
	ret
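A C sketch of the same flow, assuming the libc wrappers read and write stand in for syscalls 0 and 1. Unlike _printName above, it writes back only the bytes that read returned. The function name greet and its file-descriptor parameters are hypothetical; they are there so the sketch can be exercised with pipes as well as a terminal.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: prompt on out_fd, read up to 16 bytes of name from
   in_fd, then print the greeting. Returns the number of name bytes read,
   or -1 if any read or write fails. */
long greet(int in_fd, int out_fd) {
    char name[16];                               /* like .lcomm name, 16 */

    if (write(out_fd, "What is your name? ", 19) != 19)
        return -1;

    ssize_t n = read(in_fd, name, sizeof name);  /* _getName */
    if (n < 0)
        return -1;

    if (write(out_fd, "Hello, ", 7) != 7)        /* _printText2 */
        return -1;
    if (write(out_fd, name, (size_t)n) != n)     /* only the bytes read */
        return -1;

    return (long)n;
}
```

Calling greet(0, 1) uses stdin and stdout, matching the file descriptors in the assembly version.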

Loop example in x86 assembly

Code (AT&T syntax)

	.data
num1:
	.long 3
num2:
	.long 6
num3:
	.long 8
	
	.text
	.global _start

_start:
	mov num1(%rip), %eax     # eax = num1 (3)
	mov num2(%rip), %ebx     # ebx = num2 (6)

_startloop:
	cmp %ebx, %eax           # compare eax - ebx
	jae _exit                # jump if eax >= ebx (unsigned)
	inc %eax                 # eax++
	jmp _startloop

_exit:
	mov num3(%rip), %eax     # eax = num3 (8); overwritten by the next mov
	
	mov $60, %rax            # syscall: exit
	mov $0, %rdi
	syscall

Explanation:
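The loop can be read as the C sketch below. Because jae is an unsigned comparison, the sketch uses unsigned values; the function name loop_example is hypothetical.

```c
/* Hypothetical C rendering of the _startloop loop.
   Returns the value eax holds when control reaches _exit. */
unsigned loop_example(unsigned eax, unsigned ebx) {
    while (eax < ebx) {   /* cmp %ebx, %eax ; jae _exit leaves when eax >= ebx */
        eax++;            /* inc %eax */
    }
    return eax;
}
```

With num1 = 3 and num2 = 6, the body runs three times and eax reaches 6 before the jump to _exit.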

Lab Notes

The CMP Instruction and CPU Flags
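cmp subtracts one operand from the other, discards the result, and keeps only the flags. A minimal C model of the four flags that the conditional jumps read, assuming 32-bit operands; the names struct flags and cmp32 are hypothetical, and the parity and auxiliary flags are left out:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the flags after a 32-bit "cmp b, a" (AT&T order),
   which computes a - b and throws the difference away. */
struct flags { bool zf, sf, cf, of; };

struct flags cmp32(uint32_t a, uint32_t b) {
    uint32_t r = a - b;                       /* wraps modulo 2^32 */
    struct flags f;
    f.zf = (r == 0);                          /* zero flag: a == b */
    f.sf = (r >> 31) != 0;                    /* sign flag: top bit of a - b */
    f.cf = (a < b);                           /* carry flag: unsigned borrow */
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;  /* overflow flag: signed overflow */
    return f;
}
```

For the jump example (a = 3, b = 2) all four flags are clear. For a = 2, b = 3 the difference wraps around, so both sf and cf are set.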

Conditional Jumps

Signed vs. Unsigned Jumps
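The same bit pattern can compare differently depending on the jump used: jl, jle, jg, and jge treat the operands as signed, while jb, jbe, ja, and jae treat them as unsigned. A small C sketch of that difference; the function names are hypothetical, and the cast to int32_t assumes the usual two's-complement conversion:

```c
#include <stdbool.h>
#include <stdint.h>

/* Would jl be taken after "cmp b, a"? True when a < b as signed values. */
bool jl_taken(uint32_t a, uint32_t b) {
    return (int32_t)a < (int32_t)b;   /* assumption: two's-complement cast */
}

/* Would jb be taken after "cmp b, a"? True when a < b as unsigned values. */
bool jb_taken(uint32_t a, uint32_t b) {
    return a < b;
}
```

With a = 0xFFFFFFFF and b = 1, jl is taken because a reads as -1, but jb is not taken because a reads as 4294967295.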

Unconditional Jumps (JMP) and Program Flow

Example code (loop)

C code (loop example):

int bump(int num1, int num2) {
    for (int i = 0; num1 < num2; i++) {
        num1 = num1 + 1;
    }
    return num1;
}

Assembly version:

    .globl  bump
bump:
    # args: %edi = num1, %esi = num2
    xorl    %eax, %eax        # i = 0 (we’ll use %eax as the loop counter)
    movl    %edi, %edx        # edx = num1
    movl    %esi, %ecx        # ecx = num2

.Lloop:
    cmpl    %ecx, %edx        # compare num1 (edx) with num2 (ecx)
    jge     .Ldone            # if num1 >= num2, break
    addl    $1, %edx          # num1 = num1 + 1
    incl    %eax              # i++
    jmp     .Lloop

.Ldone:
    movl    %edx, %eax        # return num1
    ret

Some useful commands