Assembly – Memory Management


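The kernel provides the sys_brk() system call for allocating memory that does not have to be moved later. The call allocates memory immediately after the application image by setting the highest available address of the data section (the program break).

This system call takes one parameter, the new highest memory address, which is passed in the EBX register. Calling it with EBX set to zero returns the current highest address without changing it.

When called directly through int 80h, sys_brk() returns the new highest address in EAX on success. On failure it leaves the break unchanged and returns the current highest address, so an error is detected by comparing the returned value with the requested one. (It is the C library wrapper brk() that returns -1 on error.) The following example demonstrates dynamic memory allocation.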

Example:

The following program allocates 16 KB of memory using the sys_brk() system call:

section	.text
    global _start         ;must be declared for using gcc
_start:	;tell linker entry point

	mov	eax, 45		;sys_brk
	xor	ebx, ebx
	int	80h

	add	eax, 16384	;number of bytes to be reserved
	mov	ebx, eax
	mov	eax, 45		;sys_brk
	int	80h
	cmp	eax, ebx	;on failure the kernel returns the old, lower address
	jb	exit		;exit, if the requested address was not granted
	mov	edi, eax	;EDI = highest available address
	sub	edi, 4		;pointing to the last DWORD  
	mov	ecx, 4096	;number of DWORDs allocated
	xor	eax, eax	;clear eax
	std			;backward
	rep	stosd		;repeat for entire allocated area
	cld			;put DF flag to normal state

	mov	eax, 4
	mov	ebx, 1
	mov	ecx, msg
	mov	edx, len
	int	80h		;print a message
exit:
	mov	eax, 1
	xor	ebx, ebx
	int	80h
section	.data
msg    	db	"Allocated 16 kb of memory!", 10
len     equ	$ - msg

When the above code is compiled and executed, it produces the following result:

Allocated 16 kb of memory!

Assembly – File Management


The system considers any input or output data as a stream of bytes. There are three standard file streams:

  • Standard input (stdin)
  • Standard output (stdout)
  • Standard error (stderr)

File Descriptor

A file descriptor is a small non-negative integer that the kernel assigns to an open file as its identifier. When a new file is created or an existing file is opened, the returned file descriptor is used for all further access to the file.

The file descriptors of the standard file streams stdin, stdout and stderr are 0, 1 and 2, respectively.

File Pointer

A file pointer specifies the location, in bytes, of the next read or write operation in the file. Each file is treated as a sequence of bytes, and each open file has a file pointer that holds an offset in bytes relative to the beginning of the file. When a file is opened, the file pointer is set to zero.

File Handling System Calls

The following table briefly describes the system calls related to file handling:

%eax  Name       %ebx            %ecx          %edx
2     sys_fork   struct pt_regs  -             -
3     sys_read   unsigned int    char *        size_t
4     sys_write  unsigned int    const char *  size_t
5     sys_open   const char *    int           int
6     sys_close  unsigned int    -             -
8     sys_creat  const char *    int           -
19    sys_lseek  unsigned int    off_t         unsigned int

The steps for using these system calls are the same as discussed earlier:

  • Put the system call number in the EAX register.
  • Store the arguments to the system call in the registers EBX, ECX, etc.
  • Call the relevant interrupt (80h).
  • The result is usually returned in the EAX register.
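
The four steps above can be sketched as a short program that writes to standard output and then inspects the result in EAX. This is only a minimal sketch, assuming a 32-bit Linux target; the names text, text_len, write_failed and quit are made up for the illustration. On Linux a failed call returns a negative error code, so a sign test is used here.

```nasm
section .text
    global _start
_start:
    mov  eax, 4          ; step 1: system call number (sys_write)
    mov  ebx, 1          ; step 2: first argument, file descriptor (stdout)
    mov  ecx, text       ;         second argument, buffer address
    mov  edx, text_len   ;         third argument, number of bytes
    int  0x80            ; step 3: call the kernel
    test eax, eax        ; step 4: the result is in EAX
    js   write_failed    ; a negative value is an error code
    xor  ebx, ebx        ; exit status 0 on success
    jmp  quit
write_failed:
    mov  ebx, 1          ; exit status 1 on failure
quit:
    mov  eax, 1          ; sys_exit
    int  0x80

section .data
text     db 'system call ok', 10
text_len equ $ - text
```

One way to try it, assuming NASM and GNU ld are installed, is to assemble with nasm -f elf32 and link with ld -m elf_i386.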

Creating and Opening a File

For creating and opening a file, perform the following tasks:

  • Put the system call sys_creat() number 8, in the EAX register
  • Put the filename in the EBX register
  • Put the file permissions in the ECX register

On success, the system call returns the file descriptor of the created file in the EAX register. On error, EAX holds a negative error code.

Opening an Existing File

For opening an existing file, perform the following tasks:

  • Put the system call sys_open() number 5, in the EAX register
  • Put the filename in the EBX register
  • Put the file access mode in the ECX register
  • Put the file permissions in the EDX register (they are used only when the call creates the file)

On success, the system call returns the file descriptor of the opened file in the EAX register. On error, EAX holds a negative error code.

The most commonly used file access modes are read-only (0), write-only (1), and read-write (2).

Reading from a File

For reading from a file, perform the following tasks:

  • Put the system call sys_read() number 3, in the EAX register
  • Put the file descriptor in the EBX register
  • Put the pointer to the input buffer in the ECX register
  • Put the buffer size, i.e., the number of bytes to read, in the EDX register

On success, the system call returns the number of bytes read in the EAX register. On error, EAX holds a negative error code.

Writing to a File

For writing to a file, perform the following tasks:

  • Put the system call sys_write() number 4, in the EAX register
  • Put the file descriptor in the EBX register
  • Put the pointer to the output buffer in the ECX register
  • Put the buffer size, i.e., the number of bytes to write, in the EDX register

On success, the system call returns the actual number of bytes written in the EAX register. On error, EAX holds a negative error code.

Closing a File

For closing a file, perform the following tasks:

  • Put the system call sys_close() number 6, in the EAX register
  • Put the file descriptor in the EBX register

The system call returns 0 in the EAX register on success. On error, EAX holds a negative error code.

Updating a File

For updating a file, perform the following tasks:

  • Put the system call sys_lseek() number 19, in the EAX register
  • Put the file descriptor in the EBX register
  • Put the offset value in the ECX register
  • Put the reference position for the offset in the EDX register

The reference position could be:

  • Beginning of file – value 0
  • Current position – value 1
  • End of file – value 2

On success, the system call returns the resulting offset, measured from the beginning of the file, in the EAX register. On error, EAX holds a negative error code.
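
As an illustration of sys_lseek(), the sketch below appends text to an existing file by moving the file pointer to the end before writing. It is a hedged example rather than a complete program: it assumes that a file named myfile.txt already exists in the current directory, and the names fd, extra and extra_len are invented for this example.

```nasm
section .text
    global _start
_start:
    mov  eax, 5            ; sys_open
    mov  ebx, file_name
    mov  ecx, 1            ; write-only access
    xor  edx, edx          ; permissions are not used: no file is created
    int  0x80
    test eax, eax
    js   quit              ; negative value: the open failed
    mov  [fd], eax

    mov  eax, 19           ; sys_lseek
    mov  ebx, [fd]
    xor  ecx, ecx          ; offset 0 ...
    mov  edx, 2            ; ... relative to the end of the file
    int  0x80              ; EAX = new offset, i.e. the file size

    mov  eax, 4            ; sys_write, continues at the new offset
    mov  ebx, [fd]
    mov  ecx, extra
    mov  edx, extra_len
    int  0x80

    mov  eax, 6            ; sys_close
    mov  ebx, [fd]
    int  0x80
quit:
    mov  eax, 1            ; sys_exit
    xor  ebx, ebx
    int  0x80

section .data
file_name db 'myfile.txt', 0   ; path must be NUL-terminated
extra     db ' (appended)'
extra_len equ $ - extra

section .bss
fd  resd 1                     ; a file descriptor needs a full doubleword
```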

Example:

The following program creates and opens a file named myfile.txt, and writes a text ‘Welcome to Tutorials Point’ in this file. Next, the program reads from the file and stores the data into a buffer named info. Lastly, it displays the text as stored in info.

section	.text
   global _start         ;must be declared for using gcc
_start:   ;tell linker entry point
;create the file
    mov  eax, 8
    mov  ebx, file_name
    mov  ecx, 0o777     ;octal: read, write and execute by all
    int  0x80           ;call kernel
    mov [fd_out], eax
    
; write into the file
    mov	edx,len         ;number of bytes
    mov	ecx, msg        ;message to write
    mov	ebx, [fd_out]   ;file descriptor 
    mov	eax,4           ;system call number (sys_write)
    int	0x80            ;call kernel
	
    ; close the file
    mov eax, 6
    mov ebx, [fd_out]
    int  0x80
    
; write the message indicating end of file write
    mov eax, 4
    mov ebx, 1
    mov ecx, msg_done
    mov edx, len_done
    int  0x80
    
;open the file for reading
    mov eax, 5
    mov ebx, file_name
    mov ecx, 0          ;for read only access
    mov edx, 0o777      ;octal: read, write and execute by all
    int  0x80
    mov  [fd_in], eax
    
;read from file
    mov eax, 3
    mov ebx, [fd_in]
    mov ecx, info
    mov edx, 26
    int 0x80
    
; close the file
    mov eax, 6
    mov ebx, [fd_in]
    int  0x80
    
; print the info 
    mov eax, 4
    mov ebx, 1
    mov ecx, info
    mov edx, 26
    int 0x80
       
    mov	eax,1           ;system call number (sys_exit)
    xor	ebx,ebx         ;exit status 0
    int	0x80            ;call kernel

section	.data
file_name db 'myfile.txt', 0    ;the kernel expects a NUL-terminated name
msg db 'Welcome to Tutorials Point'
len equ  $-msg
msg_done db 'Written to file', 0xa
len_done equ $-msg_done

section .bss
fd_out resd 1                   ;a doubleword, because EAX is stored here
fd_in  resd 1
info resb  26

When the above code is compiled and executed, it produces the following result:

Written to file
Welcome to Tutorials Point

Assembly – Macros


Writing a macro is another way of ensuring modular programming in assembly language.

  • A macro is a sequence of instructions that is assigned a name and can be used anywhere in the program.
  • In NASM, macros are defined with the %macro and %endmacro directives.
  • The macro begins with the %macro directive and ends with the %endmacro directive.

The syntax for a macro definition:

%macro macro_name  number_of_params
<macro body>
%endmacro

Where number_of_params specifies the number of parameters and macro_name specifies the name of the macro.

The macro is invoked by using the macro name along with the necessary parameters. When you need to use some sequence of instructions many times in a program, you can put those instructions in a macro and use it instead of writing the instructions all the time.

For example, a very common need for programs is to write a string of characters on the screen. For displaying a string of characters, you need the following sequence of instructions:

mov	edx,len	    ;message length
mov	ecx,msg	    ;message to write
mov	ebx,1       ;file descriptor (stdout)
mov	eax,4       ;system call number (sys_write)
int	0x80        ;call kernel

As we have seen, some instructions, such as IMUL, IDIV and INT, need some of their operands in particular registers and also return values in specific registers. If the program is already using those registers for important data, that data should be saved on the stack and restored after the instruction has executed.

In the above example of displaying a character string, the registers EAX, EBX, ECX and EDX are used by the INT 80H call. So each time you display something on the screen, you need to save these registers on the stack, invoke INT 80H, and then restore the original register values from the stack. It can therefore be useful to write two macros for saving and restoring the registers.
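
A pair of such macros could look like the following sketch. The macro names save_regs and restore_regs are hypothetical, chosen only for this illustration; both take zero parameters, and restore_regs pops the registers in the reverse order of the pushes.

```nasm
; Hypothetical helper macros: save and restore the registers used by sys_write
%macro save_regs 0
    push eax
    push ebx
    push ecx
    push edx
%endmacro

%macro restore_regs 0     ; pops in the reverse order of save_regs
    pop  edx
    pop  ecx
    pop  ebx
    pop  eax
%endmacro

section .text
    global _start
_start:
    mov  eax, 7           ; a value we want to keep across the write
    save_regs
    mov  eax, 4           ; sys_write
    mov  ebx, 1           ; stdout
    mov  ecx, msg
    mov  edx, len
    int  0x80
    restore_regs          ; EAX holds 7 again
    mov  ebx, eax         ; use the preserved value as the exit status
    mov  eax, 1           ; sys_exit
    int  0x80

section .data
msg db 'Registers saved and restored', 0xA
len equ $ - msg
```

After the program runs, the shell's exit status (echo $?) should be 7, which shows that EAX survived the system call.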

Example:

The following example shows defining and using macros:

; A macro with two parameters
; Implements the write system call
   %macro write_string 2 
      mov   eax, 4
      mov   ebx, 1
      mov   ecx, %1
      mov   edx, %2
      int   80h
   %endmacro
 
section	.text
    global _start            ;must be declared for using gcc
_start:    ;tell linker entry point
	write_string msg1, len1               
	write_string msg2, len2    
	write_string msg3, len3   
	mov eax,1          ;system call number (sys_exit)
	int 0x80           ;call kernel

section	.data
msg1 db	'Hello, programmers!',0xA,0xD 	
len1 equ $ - msg1			
msg2 db 'Welcome to the world of,', 0xA,0xD 
len2 equ $- msg2 
msg3 db 'Linux assembly programming! '
len3 equ $- msg3

When the above code is compiled and executed, it produces the following result:

Hello, programmers!
Welcome to the world of,
Linux assembly programming!

Assembly – Recursion


A recursive procedure is one that calls itself. There are two kinds of recursion: direct and indirect. In direct recursion, the procedure calls itself and in indirect recursion, the first procedure calls a second procedure, which in turn calls the first procedure.

Recursion could be observed in numerous mathematical algorithms. For example, consider the case of calculating the factorial of a number. Factorial of a number is given by the equation:

Fact (n) = n * Fact (n-1) for n > 0, with Fact (0) = 1

For example: factorial of 5 is 1 x 2 x 3 x 4 x 5 = 5 x factorial of 4 and this can be a good example of showing a recursive procedure. Every recursive algorithm must have an ending condition, i.e., the recursive calling of the program should be stopped when a condition is fulfilled. In the case of factorial algorithm, the end condition is reached when n is 0.

The following program shows how factorial n is implemented in assembly language. To keep the program simple, we will calculate factorial 3.

section	.text
    global _start         ;must be declared for using gcc
_start:    ;tell linker entry point

    mov bx, 3       ;for calculating factorial 3
    call  proc_fact
    add   ax, 30h
    mov  [fact], al ;store only the ASCII digit, fact is one byte
    
    mov	  edx,len   ;message length
    mov	  ecx,msg   ;message to write
    mov	  ebx,1     ;file descriptor (stdout)
    mov	  eax,4     ;system call number (sys_write)
    int	  0x80      ;call kernel

    mov   edx,1     ;message length
    mov	  ecx,fact  ;message to write
    mov	  ebx,1     ;file descriptor (stdout)
    mov	  eax,4     ;system call number (sys_write)
    int	  0x80      ;call kernel
    
    mov	  eax,1     ;system call number (sys_exit)
    int	  0x80      ;call kernel
proc_fact:
    cmp   bl, 1
    jg    do_calculation
    mov   ax, 1
    ret
do_calculation:
    dec   bl
    call  proc_fact
    inc   bl
    mul   bl        ;ax = al * bl
    ret

section	.data
msg db 'Factorial 3 is:',0xa	
len equ $ - msg			

section .bss
fact resb 1

When the above code is compiled and executed, it produces the following result:

Factorial 3 is:
6

Assembly – Procedures


Procedures or subroutines are very important in assembly language, as assembly language programs tend to be large. A procedure is identified by a name. The name is followed by the body of the procedure, which performs a well-defined job. The end of the procedure is indicated by a return statement.

Syntax:

Following is the syntax to define a procedure:

proc_name:
   procedure body
   ...
   ret

The procedure is called from another procedure by using the CALL instruction. The CALL instruction takes the name of the called procedure as its argument, as shown below:

CALL proc_name

The called procedure returns the control to the calling procedure by using the RET instruction.

Example:

Let us write a very simple procedure named sum that adds the values stored in the ECX and EDX registers and returns the sum in the EAX register:

section	.text
    global _start         ;must be declared for using gcc
_start:	;tell linker entry point
	mov	ecx,'4'
	sub     ecx, '0'
	mov 	edx, '5'
	sub     edx, '0'
	call    sum     ;call sum procedure
	mov 	[res], al	;store only the ASCII digit, res is one byte
	mov	ecx, msg	
	mov	edx, len
	mov	ebx,1	;file descriptor (stdout)
	mov	eax,4	;system call number (sys_write)
	int	0x80	;call kernel
	mov	ecx, res
	mov	edx, 1
	mov	ebx, 1	;file descriptor (stdout)
	mov	eax, 4	;system call number (sys_write)
	int	0x80	;call kernel
	mov	eax,1	;system call number (sys_exit)
	int	0x80	;call kernel
sum:
   mov     eax, ecx
   add     eax, edx
   add     eax, '0'
   ret
section .data
msg db "The sum is:", 0xA,0xD 
len equ $- msg   
segment .bss
res resb 1

When the above code is compiled and executed, it produces the following result:

The sum is:
9

Stack Data Structure

A stack is an array-like data structure in the memory in which data can be stored and removed from a location called the ‘top’ of the stack. The data that needs to be stored is ‘pushed’ into the stack and data to be retrieved is ‘popped’ out from the stack. Stack is a LIFO data structure, i.e., the data stored first is retrieved last.

Assembly language provides two instructions for stack operations: PUSH and POP. These instructions have syntaxes like:

PUSH    operand
POP     address/register

The memory reserved in the stack segment is used for implementing the stack. The registers SS and ESP (or SP) are used for this purpose. The top of the stack, i.e., the last data item pushed, is addressed by SS:ESP, where the SS register points to the beginning of the stack segment and ESP (or SP) gives the offset into that segment.

The stack implementation has the following characteristics:

  • Only words or doublewords could be saved into the stack, not a byte.
  • The stack grows in the reverse direction, i.e., toward the lower memory address.
  • The top of the stack points to the last item inserted in the stack; it points to the lower byte of the last word inserted.
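
These characteristics can be observed with a small sketch, assuming 32-bit code, that pushes two doublewords and measures how far ESP has moved. The use of ESI to remember the old stack pointer is just a choice made for this example.

```nasm
section .text
    global _start
_start:
    mov  esi, esp        ; remember the current top of the stack
    push dword 1         ; ESP decreases by 4
    push dword 2         ; ESP decreases by 4 again
    mov  ebx, esi
    sub  ebx, esp        ; EBX = old ESP - new ESP = 8
    pop  eax             ; EAX = 2, the item pushed last comes off first
    pop  ecx             ; ECX = 1
    mov  eax, 1          ; sys_exit, exit status taken from EBX
    int  0x80
```

The exit status is 8, the number of bytes by which the stack grew toward lower addresses.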

As discussed earlier, register values can be saved on the stack before the registers are used for another purpose. It can be done in the following way:

; Save the AX and BX registers in the stack
PUSH    AX
PUSH    BX
; Use the registers for other purpose
MOV	AX, [VALUE1]
MOV 	BX, [VALUE2]
...
MOV 	[VALUE1], AX
MOV	[VALUE2], BX
; Restore the original values, in the reverse order of the pushes
POP	BX
POP	AX

Example:

The following program displays the entire ASCII character set. The main program calls a procedure named display, which displays the ASCII character set.

section	.text
    global _start         ;must be declared for using gcc
_start:	;tell linker entry point
	call    display
	mov	eax,1	;system call number (sys_exit)
	int	0x80	;call kernel
display:
	mov    ecx, 256
next:
	push    ecx
	mov     eax, 4
	mov     ebx, 1
	mov     ecx, achar
	mov     edx, 1
	int     80h
	pop     ecx	
	mov	dx, [achar]
	cmp	byte [achar], 0dh
	inc	byte [achar]
	loop    next
	ret
section .data
achar db '0'  

When the above code is compiled and executed, it produces the following result:

0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}
...
...

Assembly – Arrays


We have already discussed that the data definition directives to the assembler are used for allocating storage for variables. The variable could also be initialized with some specific value. The initialized value could be specified in hexadecimal, decimal or binary form.

For example, we can define a word variable months in any of the following ways:

MONTHS	DW	12
MONTHS	DW	0CH
MONTHS	DW	1100B

The data definition directives can also be used for defining a one-dimensional array. Let us define a one-dimensional array of numbers.

NUMBERS	DW  34,  45,  56,  67,  75, 89

The above definition declares an array of six words each initialized with the numbers 34, 45, 56, 67, 75, 89. This allocates 2×6 = 12 bytes of consecutive memory space. The symbolic address of the first number will be NUMBERS and that of the second number will be NUMBERS + 2 and so on.
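
The address arithmetic described above can be sketched with a scaled index: element i of a word array lives at the array address plus 2*i. The following example sums the six values; the names numbers, count and next_word are chosen for this sketch, and the program reports success through its exit status instead of printing.

```nasm
section .text
    global _start
_start:
    xor   eax, eax                      ; EAX accumulates the sum
    xor   esi, esi                      ; ESI is the element index
next_word:
    movzx edx, word [numbers + esi*2]   ; element i is at numbers + 2*i
    add   eax, edx
    inc   esi
    cmp   esi, count
    jb    next_word

    sub   eax, 366                      ; 34+45+56+67+75+89 = 366
    mov   ebx, eax                      ; exit status 0 if the sum matched
    mov   eax, 1                        ; sys_exit
    int   0x80

section .data
numbers dw 34, 45, 56, 67, 75, 89
count   equ ($ - numbers) / 2           ; number of word-sized elements
```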

Let us take up another example. You can define an array named inventory of size 8, and initialize all the values with zero, as:

INVENTORY   DW  0
            DW  0
            DW  0
            DW  0
            DW  0
            DW  0
            DW  0
            DW  0

Which can be abbreviated as:

INVENTORY   DW  0, 0 , 0 , 0 , 0 , 0 , 0 , 0

The TIMES directive can also be used for multiple initializations to the same value. Using TIMES, the INVENTORY array can be defined as:

INVENTORY TIMES 8 DW 0

Example:

The following example demonstrates the above concepts by defining a 3-element array x, which stores three values: 2, 4 and 3. It adds the values in the array and displays the sum 9:

section	.text
    global _start	;must be declared for linker (ld)
_start:	
 		
      mov  eax,3      ;number of bytes to be summed 
      mov  ebx,0      ;EBX will store the sum
      mov  ecx, x     ;ECX will point to the current element to be summed
top:  add  bl, [ecx]  ;add one byte-sized element to the sum
      add  ecx,1      ;move pointer to next element
      dec  eax        ;decrement counter
      jnz  top        ;if counter not 0, then loop again
done: 
      add   bl, '0'
      mov  [sum], bl  ;done, store the ASCII digit in "sum"
display:
      mov  edx,1      ;message length
      mov  ecx, sum   ;message to write
      mov  ebx, 1     ;file descriptor (stdout)
      mov  eax, 4     ;system call number (sys_write)
      int  0x80       ;call kernel
      mov  eax, 1     ;system call number (sys_exit)
      int  0x80       ;call kernel

section	.data
global x
x:    
      db  2
      db  4
      db  3
sum: 
      db  0

When the above code is compiled and executed, it produces the following result:

9

Assembly – String Processing


We have already used variable-length strings in our previous examples. Variable-length strings can have as many characters as required. Generally, the length of a string is specified in one of two ways:

  • Explicitly storing string length
  • Using a sentinel character

We can store the string length explicitly by using the $ location counter symbol that represents the current value of the location counter. In the following example:

msg  db  'Hello, world!',0xa ;our dear string
len  equ  $ - msg            ;length of our dear string

$ points to the byte after the last character of the string variable msg. Therefore, $-msg gives the length of the string. We can also write the length as a constant, counting the 13 characters of the text plus the newline:

msg db 'Hello, world!',0xa ;our dear string
len equ 14                 ;length of our dear string

Alternatively, you can store strings with a trailing sentinel character to delimit a string instead of storing the string length explicitly. The sentinel character should be a special character that does not appear within a string.

For example:

message DB 'I am loving it!', 0
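
With a sentinel, the length has to be found at run time by scanning for the terminating byte. The following sketch, which assumes a zero byte as the sentinel, counts the characters of the string above in EDX and then passes that count to sys_write. The label names count_char and counted are illustrative.

```nasm
section .text
    global _start
_start:
    mov  esi, message          ; ESI holds the start of the string
    xor  edx, edx              ; EDX counts the characters
count_char:
    cmp  byte [esi + edx], 0   ; reached the sentinel?
    je   counted
    inc  edx
    jmp  count_char
counted:
    mov  eax, 4                ; sys_write
    mov  ebx, 1                ; stdout
    mov  ecx, message
    int  0x80                  ; EDX already holds the length
    mov  eax, 1                ; sys_exit
    xor  ebx, ebx
    int  0x80

section .data
message db 'I am loving it!', 0
```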

String Instructions

Each string instruction may require a source operand, a destination operand or both. For 32-bit segments, string instructions use ESI and EDI registers to point to the source and destination operands, respectively.

For 16-bit segments, however, the SI and the DI registers are used to point to the source and destination, respectively.

There are five basic instructions for processing strings. They are:

  • MOVS – This instruction moves 1 Byte, Word or Doubleword of data from one memory location to another.
  • LODS – This instruction loads from memory. If the operand is of one byte, it is loaded into the AL register, if the operand is one word, it is loaded into the AX register and a doubleword is loaded into the EAX register.
  • STOS – This instruction stores data from register (AL, AX, or EAX) to memory.
  • CMPS – This instruction compares two data items in memory. Data could be of a byte size, word or doubleword.
  • SCAS – This instruction compares the contents of a register (AL, AX or EAX) with the contents of an item in memory.

Each of the above instructions has a byte, word and doubleword version, and string instructions can be repeated by using a repetition prefix.

These instructions use the ES:DI and DS:SI pairs of registers, where the DI and SI registers contain valid offset addresses that refer to bytes stored in memory. SI is normally associated with DS (data segment) and DI is always associated with ES (extra segment).

The DS:SI (or ESI) and ES:DI (or EDI) registers point to the source and destination operands, respectively. The source operand is assumed to be at DS:SI (or ESI) and the destination operand at ES:DI (or EDI) in memory.

For 16-bit addresses, the SI and DI registers are used, and for 32-bit addresses, the ESI and EDI registers are used.

The following table provides various versions of string instructions and the assumed space of the operands.

Basic Instruction  Operands at    Byte Operation  Word Operation  Double word Operation
MOVS               ES:DI, DS:SI   MOVSB           MOVSW           MOVSD
LODS               AX, DS:SI      LODSB           LODSW           LODSD
STOS               ES:DI, AX      STOSB           STOSW           STOSD
CMPS               DS:SI, ES:DI   CMPSB           CMPSW           CMPSD
SCAS               ES:DI, AX      SCASB           SCASW           SCASD

Repetition Prefixes

The REP prefix, when placed before a string instruction, for example REP MOVSB, causes the instruction to be repeated based on a counter held in the CX register (ECX in 32-bit code). If the counter is already zero, the instruction is not executed at all. Otherwise REP executes the instruction, decreases the counter by 1, and repeats until the counter is zero.

The Direction Flag (DF) determines the direction of the operation.

  • Use CLD (Clear Direction Flag, DF = 0) to make the operation left to right.
  • Use STD (Set Direction Flag, DF = 1) to make the operation right to left.

The REP prefix also has the following variations:

  • REP: it is the unconditional repeat. It repeats the operation until CX is zero.
  • REPE or REPZ: It is conditional repeat. It repeats the operation while the zero flag indicates equal/zero. It stops when the ZF indicates not equal/zero or when CX is zero.
  • REPNE or REPNZ: It is also conditional repeat. It repeats the operation while the zero flag indicates not equal/zero. It stops when the ZF indicates equal/zero or when CX is decremented to zero.
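
As a hedged illustration of the unconditional REP prefix, the sketch below copies a string with REP MOVSB and prints the copy. It assumes 32-bit Linux, where DS and ES refer to the same flat address space, and the buffer names src and dst are invented for the example.

```nasm
section .text
    global _start
_start:
    cld                    ; DF = 0: ESI and EDI move toward higher addresses
    mov  esi, src          ; source operand (DS:ESI)
    mov  edi, dst          ; destination operand (ES:EDI)
    mov  ecx, src_len      ; number of bytes to copy
    rep  movsb             ; copy ECX bytes; ECX is 0 afterwards

    mov  eax, 4            ; sys_write
    mov  ebx, 1            ; stdout
    mov  ecx, dst          ; print the copy, not the original
    mov  edx, src_len
    int  0x80
    mov  eax, 1            ; sys_exit
    xor  ebx, ebx
    int  0x80

section .data
src     db 'copied with rep movsb', 0xA
src_len equ $ - src

section .bss
dst resb 64                ; large enough for the source string
```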

Assembly – Numbers


Numerical data is generally represented in binary system. Arithmetic instructions operate on binary data. When numbers are displayed on screen or entered from keyboard, they are in ASCII form.

So far, we have converted this input data in ASCII form to binary for arithmetic calculations and converted the result back to ASCII. The following code shows this:

section	.text
    global _start         ;must be declared for using gcc
_start:	;tell linker entry point
	mov	eax,'3'
	sub     eax, '0'
	mov 	ebx, '4'
	sub     ebx, '0'
	add 	eax, ebx
	add	eax, '0'
	mov 	[sum], al	;store only the ASCII digit, sum is one byte
	mov	ecx,msg	
	mov	edx, len
	mov	ebx,1	;file descriptor (stdout)
	mov	eax,4	;system call number (sys_write)
	int	0x80	;call kernel
	mov	ecx,sum
	mov	edx, 1
	mov	ebx,1	;file descriptor (stdout)
	mov	eax,4	;system call number (sys_write)
	int	0x80	;call kernel
	mov	eax,1	;system call number (sys_exit)
	int	0x80	;call kernel
section .data
msg db "The sum is:", 0xA,0xD 
len equ $ - msg   
segment .bss
sum resb 1

When the above code is compiled and executed, it produces the following result:

The sum is:
7

Such conversions, however, have an overhead, and assembly language programming allows processing numbers in a more efficient way, in the binary form. Decimal numbers can be represented in two forms:

  • ASCII form
  • BCD or Binary Coded Decimal form

ASCII Representation

In ASCII representation, decimal numbers are stored as string of ASCII characters. For example, the decimal value 1234 is stored as:

31	32	33	34H

Where, 31H is ASCII value for 1, 32H is ASCII value for 2, and so on. There are the following four instructions for processing numbers in ASCII representation:

  • AAA – ASCII Adjust After Addition
  • AAS – ASCII Adjust After Subtraction
  • AAM – ASCII Adjust After Multiplication
  • AAD – ASCII Adjust Before Division

These instructions do not take any operands and assume the required operand to be in the AL register.

The following example uses the AAS instruction to demonstrate the concept:

section	.text
    global _start         ;must be declared for using gcc
_start:	;tell linker entry point
	sub     ah, ah
	mov     al, '9'
	sub     al, '3'
	aas
	or      al, 30h
	mov     [res], al	;store only the ASCII digit, res is one byte
	
	mov	edx,len	;message length
	mov	ecx,msg	;message to write
	mov	ebx,1	;file descriptor (stdout)
	mov	eax,4	;system call number (sys_write)
	int	0x80	;call kernel
	
	mov	edx,1	;message length
	mov	ecx,res	;message to write
	mov	ebx,1	;file descriptor (stdout)
	mov	eax,4	;system call number (sys_write)
	int	0x80	;call kernel
	mov	eax,1	;system call number (sys_exit)
	int	0x80	;call kernel

section	.data
msg db 'The Result is:',0xa	
len equ $ - msg			
section .bss
res resb 1  

When the above code is compiled and executed, it produces the following result:

The Result is:
6

BCD Representation

There are two types of BCD representation:

  • Unpacked BCD representation
  • Packed BCD representation

In unpacked BCD representation, each byte stores the binary equivalent of a decimal digit. For example, the number 1234 is stored as:

01	02	03	04H

There are two instructions for processing these numbers:

  • AAM – ASCII Adjust After Multiplication
  • AAD – ASCII Adjust Before Division

The four ASCII adjust instructions, AAA, AAS, AAM and AAD, can also be used with unpacked BCD representation.

In packed BCD representation, each digit is stored using four bits. Two decimal digits are packed into a byte. For example, the number 1234 is stored as:

12	34H

There are two instructions for processing these numbers:

  • DAA – Decimal Adjust After Addition
  • DAS – Decimal Adjust After Subtraction

There is no support for multiplication and division in packed BCD representation.
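
The effect of DAA can be seen in a small sketch that adds the packed BCD values 38 and 45. The plain binary sum 7DH is not a valid BCD number, and DAA corrects it to 83H. The rest of the program only splits the two digits for display; the buffer name digits is an assumption of this example.

```nasm
section .text
    global _start
_start:
    mov  al, 0x38          ; packed BCD 38
    add  al, 0x45          ; binary sum is 0x7D, not a valid BCD value
    daa                    ; adjust: AL = 0x83, i.e. packed BCD 83

    mov  ah, al
    shr  ah, 4             ; AH = high digit (8)
    and  al, 0x0F          ; AL = low digit (3)
    add  ax, 0x3030        ; convert both digits to ASCII
    mov  [digits], ah      ; tens digit first
    mov  [digits + 1], al

    mov  eax, 4            ; sys_write
    mov  ebx, 1            ; stdout
    mov  ecx, digits
    mov  edx, 3            ; two digits plus the newline
    int  0x80
    mov  eax, 1            ; sys_exit
    xor  ebx, ebx
    int  0x80

section .data
digits db '00', 0xA
```

The program prints 83.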

Example:

The following program adds up two 5-digit decimal numbers and displays the sum. It uses the above concepts:

section	.text
    global _start         ;must be declared for using gcc

_start:	;tell linker entry point

	mov     esi, 4  ;pointing to the rightmost digit
	mov     ecx, 5  ;num of digits
	clc
add_loop:  
	mov 	al, [num1 + esi]
	adc 	al, [num2 + esi]
	aaa
	pushf
	or 	al, 30h
	popf
	mov	[sum + esi], al
	dec	esi
	loop	add_loop
	mov	edx,len	;message length
	mov	ecx,msg	;message to write
	mov	ebx,1	;file descriptor (stdout)
	mov	eax,4	;system call number (sys_write)
	int	0x80	;call kernel
	
	mov	edx,5	;message length
	mov	ecx,sum	;message to write
	mov	ebx,1	;file descriptor (stdout)
	mov	eax,4	;system call number (sys_write)
	int	0x80	;call kernel

	mov	eax,1	;system call number (sys_exit)
	int	0x80	;call kernel

section	.data
msg db 'The Sum is:',0xa	
len equ $ - msg			
num1 db '12345'
num2 db '23456'
sum db '     '

When the above code is compiled and executed, it produces the following result:

The Sum is:
35801

Assembly – Loops


Jump instructions can be used for implementing loops. For example, the following code snippet uses the conditional jump JNZ to execute the loop body 10 times.

MOV	CL, 10
L1:
<LOOP-BODY>
DEC	CL
JNZ	L1

The processor instruction set, however, includes a group of loop instructions for implementing iteration. The basic LOOP instruction has the following syntax:

LOOP 	label

Here, label is the target label that identifies the target instruction, as in the jump instructions. The LOOP instruction assumes that the ECX register contains the loop count. Each time LOOP executes, it decrements ECX and jumps to the target label if the result is not zero; when ECX reaches zero, execution continues with the instruction after LOOP.

The above code snippet could be written as:

mov ECX,10
l1:
<loop body>
loop l1
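
Because LOOP decrements ECX before testing it, a loop entered with ECX equal to zero does not stop immediately: the counter wraps around and the body runs about four billion times. When the count comes from data rather than a constant, a JECXZ guard in front of the loop avoids this. The sketch below is one way to write it; the variable name count and the labels are illustrative.

```nasm
section .text
    global _start
_start:
    mov   ecx, [count]     ; loop count taken from memory, may be zero
    xor   ebx, ebx         ; EBX counts the executed iterations
    jecxz finished         ; skip the loop entirely when ECX = 0
repeat_body:
    inc   ebx              ; the loop body
    loop  repeat_body      ; ECX = ECX - 1, jump while ECX is not zero
finished:
    mov   eax, 1           ; sys_exit, status = number of iterations
    int   0x80

section .data
count dd 5
```

With count set to 5 the exit status is 5; with count set to 0 the guard skips the loop and the exit status is 0.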

Example:

The following program prints the number 1 to 9 on the screen:

section	.text
    global _start         ;must be declared for using gcc
_start:	                ;tell linker entry point
	mov ecx,9           ;loop count: the nine digits 1 to 9
	mov eax, '1'
	
l1:
	mov [num], eax
	mov eax, 4
	mov ebx, 1
	push ecx            ;save the loop counter, ECX is needed for sys_write
	mov ecx, num        
        mov edx, 1        
        int 0x80
	mov eax, [num]
	sub eax, '0'
	inc eax
	add eax, '0'
	pop ecx             ;restore the loop counter
	loop l1
	mov eax,1       ;system call number (sys_exit)
	int 0x80        ;call kernel
section	.bss
num resd 1              ;a doubleword, because EAX is stored here

When the above code is compiled and executed, it produces the following result:

123456789

Assembly – Conditions


Conditional execution in assembly language is accomplished by several looping and branching instructions. These instructions can change the flow of control in a program. Conditional execution is observed in two scenarios:

  1. Unconditional jump: This is performed by the JMP instruction. Conditional execution often involves a transfer of control to the address of an instruction that does not follow the currently executing instruction. Transfer of control may be forward to execute a new set of instructions or backward to re-execute the same steps.
  2. Conditional jump: This is performed by a set of jump instructions j<condition>, depending upon the condition. The conditional instructions transfer the control by breaking the sequential flow, and they do it by changing the offset value in the instruction pointer (IP).

Let us discuss the CMP instruction before discussing the conditional instructions.

The CMP Instruction

The CMP instruction compares two operands. It is generally used in conditional execution. This instruction basically subtracts one operand from the other for comparing whether the operands are equal or not. It does not disturb the destination or source operands. It is used along with the conditional jump instruction for decision making.

Syntax

CMP destination, source

CMP compares two numeric data fields. The destination operand could be either in register or in memory. The source operand could be a constant (immediate) data, register or memory.

Example:

CMP DX,	00  ; Compare the DX value with zero
JE  L7      ; If yes, then jump to label L7
.
.
L7: ...  

CMP is often used for comparing whether a counter value has reached the number of times a loop needs to be run. Consider the following typical condition:

INC	EDX
CMP	EDX, 10	; Compares whether the counter has reached 10
JLE	LP1     ; If it is less than or equal to 10, then jump to LP1

Unconditional Jump

As mentioned earlier, this is performed by the JMP instruction. Conditional execution often involves a transfer of control to the address of an instruction that does not follow the currently executing instruction. Transfer of control may be forward to execute a new set of instructions or backward to re-execute the same steps.

Syntax:

The JMP instruction provides a label name where the flow of control is transferred immediately. The syntax of the JMP instruction is:

JMP	label

Example:

The following code snippet illustrates the JMP instruction:

MOV  AX, 00    ; Initializing AX to 0
MOV  BX, 00    ; Initializing BX to 0
MOV  CX, 01    ; Initializing CX to 1
L20:
ADD  AX, 01    ; Increment AX
ADD  BX, AX    ; Add AX to BX
SHL  CX, 1     ; shift left CX, this in turn doubles the CX value
JMP  L20       ; repeats the statements

Conditional Jump

If some specified condition is satisfied in conditional jump, the control flow is transferred to a target instruction. There are numerous conditional jump instructions depending upon the condition and data.

Following are the conditional jump instructions used on signed data in arithmetic operations:

Instruction  Description                           Flags tested
JE/JZ        Jump Equal or Jump Zero               ZF
JNE/JNZ      Jump Not Equal or Jump Not Zero       ZF
JG/JNLE      Jump Greater or Jump Not Less/Equal   OF, SF, ZF
JGE/JNL      Jump Greater/Equal or Jump Not Less   OF, SF
JL/JNGE      Jump Less or Jump Not Greater/Equal   OF, SF
JLE/JNG      Jump Less/Equal or Jump Not Greater   OF, SF, ZF

Following are the conditional jump instructions used on unsigned data in logical operations:

Instruction  Description                          Flags tested
JE/JZ        Jump Equal or Jump Zero              ZF
JNE/JNZ      Jump Not Equal or Jump Not Zero      ZF
JA/JNBE      Jump Above or Jump Not Below/Equal   CF, ZF
JAE/JNB      Jump Above/Equal or Jump Not Below   CF
JB/JNAE      Jump Below or Jump Not Above/Equal   CF
JBE/JNA      Jump Below/Equal or Jump Not Above   CF, ZF

The following conditional jump instructions have special uses and check the value of flags:

Instruction  Description                           Flags tested
JCXZ         Jump if CX is Zero                    none
JC           Jump If Carry                         CF
JNC          Jump If No Carry                      CF
JO           Jump If Overflow                      OF
JNO          Jump If No Overflow                   OF
JP/JPE       Jump Parity or Jump Parity Even       PF
JNP/JPO      Jump No Parity or Jump Parity Odd     PF
JS           Jump Sign (negative value)            SF
JNS          Jump No Sign (positive value)         SF

The J<condition> instructions use the same syntax as JMP: the mnemonic is followed by the target label. For example:

CMP	AL, BL
JE	EQUAL
CMP	AL, BH
JE	EQUAL
CMP	AL, CL
JE	EQUAL
NON_EQUAL: ...
EQUAL: ...

Example:

The following program displays the largest of three variables. The variables are double-digit variables. The three variables num1, num2 and num3 have values 47, 22 and 31, respectively:

section	.text
    global _start         ;must be declared for using gcc

_start:	;tell linker entry point
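	mov   ecx, [num1]
	bswap ecx                ;move the first digit into the most significant byte
	mov   eax, [num2]
	bswap eax
      	cmp   ecx, eax
      	jg    check_third_num
      	mov   ecx, eax           ;num2 is the larger of the first two
   check_third_num:
	mov   eax, [num3]
	bswap eax
      	cmp   ecx, eax
      	jg    _exit
      	mov   ecx, eax           ;num3 is the largest
   _exit:
	bswap ecx                ;restore the original byte order
        mov   [largest], cx      ;store the two ASCII digits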
        mov   ecx,msg
        mov   edx, len
        mov   ebx,1	;file descriptor (stdout)
        mov   eax,4	;system call number (sys_write)
        int   0x80	;call kernel
        mov   ecx,largest
        mov   edx, 2
        mov   ebx,1	;file descriptor (stdout)
        mov   eax,4	;system call number (sys_write)
        int   0x80	;call kernel
    
        mov   eax, 1
        int   80h

section	.data
    msg db "The largest digit is: ", 0xA,0xD 
    len equ $- msg 
    num1 dd '47'
    num2 dd '22'
    num3 dd '31'

segment .bss
   largest resb 2  

When the above code is compiled and executed, it produces the following result:

The largest digit is: 
47