The sample project titled LED Flash, provided by Kanda, is as follows, with the Port B pins connected to an LED segment display. The assembly code used is:

    ; Comments are prefixed with ';', '//', or wrapped '/*...*/'
    ; Spaces between variables are ignored

    ; Include directive similar to C
    ; This library for ATmega16
    .include "m16def.inc"

    ; Label registers for easy reference using define
    .def TEMP=r16     ; temporary (scratch) register
    .def ON=r23       ; store value for LED on
    .def OFF=r24      ; store value for LED off
    .def coarse=r17   ; delay subroutine, has largest effect
    .def medium=r18   ; delay subroutine, has medium effect
    .def fine=r19     ; delay subroutine, has smallest effect

    .cseg             ; indicates this is code and to be stored in Flash
    .org 0            ; switch to address 0x00 (reset address)...
        rjmp INIT     ; ...and add jump instruction to label INIT
    .org 0x60         ; switch to address 0x60 (padding for interrupt vectors)...
    INIT:             ; ...and add location of label INIT

    ; Up to this point is typically the same assembly boilerplate
    ; Now we define the stack space by populating SPH (Stack Pointer High byte)
    ; and SPL with the high and low address boundaries respectively.
    ; Since we cannot write a constant directly to Special Function Registers (SFR),
    ; and LDI does not work on the bottom 16 registers (R0 to R15), we first load
    ; the required value into a register between R16 and R31 inclusive.
    ;
    ; The end of SRAM is defined in the header file as RAMEND,
    ; which corresponds to 0x045f for ATmega16. This is a 16-bit
    ; address, so we write both the high and low bytes, where
    ;   LOW(0x045f) == 0x5f and HIGH(0x045f) == 0x04

        ; Set Stack Pointer to top of SRAM
        ldi TEMP, LOW(RAMEND)   ; (load intermediate value)
        out SPL, TEMP           ; (write to SFR)
        ldi TEMP, HIGH(RAMEND)
        out SPH, TEMP

        ; Set all pins in Port B as output (instead of input)
        ldi TEMP, 0xff
        out DDRB, TEMP          ; data direction register for Port B

        ; Load 1s into all pins in Port B
        ; Dependent on electronic wiring - here it switches LEDs off
        out PORTB, TEMP

        ; Store off and on values for LED
        ldi OFF, 0xff
        ldi ON, 0x00

    MAIN:
        rcall DELAY             ; do (relative) call to subroutine
        out PORTB, ON           ; switch LED ON
        rcall DELAY
        out PORTB, OFF          ; switch LED OFF
        rjmp MAIN               ; repeat main loop

    DELAY:
        ldi coarse, 0x07
    delay1:
        ldi medium, 0xff
    delay2:
        ldi fine, 0xff
    delay3:
        dec fine
        brne delay3             ; falls through only when the previous decrement reaches zero
        dec medium
        brne delay2
        dec coarse
        brne delay1
        ret

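To get a feel for how long the DELAY subroutine above takes, its nested loops can be replayed instruction by instruction. The following is a rough Python model, not simulator output. It assumes the usual AVR timings of 1 cycle for LDI and DEC, 1 or 2 cycles for BRNE (not taken or taken), 3 cycles for RCALL and 4 cycles for RET on a device with a 16-bit program counter. The function name `delay_cycles` is made up for this sketch.

```python
def delay_cycles(coarse=0x07, medium=0xFF, fine=0xFF):
    """Count CPU cycles for one `rcall DELAY` ... `ret` round trip.

    Assumed timings: LDI 1, DEC 1, BRNE 2 if taken else 1, RCALL 3, RET 4.
    """
    cycles = 3 + 1                      # rcall DELAY, then ldi coarse
    c = coarse
    while True:
        cycles += 1                     # delay1: ldi medium
        m = medium
        while True:
            cycles += 1                 # delay2: ldi fine
            f = fine
            while True:
                f = (f - 1) & 0xFF      # delay3: dec fine
                cycles += 1
                if f == 0:
                    cycles += 1         # brne delay3 falls through
                    break
                cycles += 2             # brne delay3 taken
            m = (m - 1) & 0xFF          # dec medium
            cycles += 1
            if m == 0:
                cycles += 1             # brne delay2 falls through
                break
            cycles += 2                 # brne delay2 taken
        c = (c - 1) & 0xFF              # dec coarse
        cycles += 1
        if c == 0:
            cycles += 1                 # brne delay1 falls through
            break
        cycles += 2                     # brne delay1 taken
    return cycles + 4                   # ret


total = delay_cycles()
print(total)                            # cycles per call with the values in the listing
print(total / 1_000_000)                # seconds, ASSUMING a 1 MHz clock (not stated in the source)
```

Under these assumptions the values in the listing come to 1,370,908 cycles per call, which would be about 1.4 s at an assumed 1 MHz clock. Lowering `coarse` by one removes far more cycles than lowering `medium` or `fine` by one, which matches the "largest effect" comments on the register definitions.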
Note a couple of additional comments:
- ORG is an abbreviation for origin. It sets the assembler's location counter so that absolute addresses can be defined, e.g. to place interrupt vectors at fixed addresses, to introduce padding, or to generate a specific alignment.
- RCALL is a one-word (2-byte) call instruction, as opposed to CALL, which takes two words (4 bytes). RCALL can only reach targets within about ±2K words of the call site. The efficiency gain from using RCALL is minimal when running code in chips with larger address spaces, e.g. ATmega128. avr-gcc itself may not have implemented this either.
- BRNE checks the Z flag in the AVR status register (page 9) and branches while the flag is clear. The Z flag is set when an arithmetic or logic operation, including DEC, produces a zero result.

The corresponding Flash program written to the chip is:

    prog 0x0000  5f c0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0010  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0020  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0030  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0040  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0050  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0060  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0070  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0080  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0090  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x00A0  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x00B0  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x00C0  0f e5 0d bf 04 e0 0e bf 0f ef 07 bb 08 bb 8f ef
    prog 0x00D0  70 e0 04 d0 78 bb 02 d0 88 bb fb cf 17 e0 2f ef
    prog 0x00E0  3f ef 3a 95 f1 f7 2a 95 d9 f7 1a 95 c1 f7 08 95
    prog 0x00F0  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0100  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0110  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0120  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0130  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    prog 0x0140  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    ...
    prog 0x1FF0  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

while the data memory (registers, I/O space and SRAM) is initially the following (courtesy of the Microchip Studio ATmega16 simulator):

    data 0x0000  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    data 0x0010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    data 0x0020  00 f8 fe ff 00 00 00 00 00 00 00 20 00 00 00 00
    data 0x0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    data 0x0040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    data 0x0050  00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
    data 0x0060  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    data 0x0070  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    data 0x0080  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ...
    data 0x045F  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Breaking down the non-zero data memory values first:
| Address | Name | Value | Remark | 
|---|---|---|---|
| 0x0021 | TWSR | 0xf8 | In datasheet §20.7.5, indicates no relevant state information available, because the device is not using the Two-wire Serial Interface (TWI) for serial transfer. See datasheet §20 for details on what TWI does. | 
| 0x0022 | TWAR | 0xfe | Default value. Indicates slave address of current TWI unit. | 
| 0x0023 | TWDR | 0xff | Default value. Used for TWI, which is not used. | 
| 0x002b | UCSRA | 0x20 | Default value. Bit 5 (UDRE) is set, meaning the USART transmit buffer is empty. For USART status, see datasheet §19. | 
| 0x0054 | MCUCSR | 0x01 | Bit 0 (PORF) is the power-on reset flag. | 
In conclusion, nothing really fancy, all initialization values.
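The non-zero entries in the table can be picked out of the dump mechanically. The sketch below copies four rows from the data dump above and takes the register names from the table. It assumes the usual AVR layout in which I/O register `n` sits at data address `n + 0x20`. The names `DUMP`, `NAMES` and `nonzero_bytes` are made up for this example.

```python
# Rows copied from the data memory dump above (0x0020 to 0x005F is the I/O space).
DUMP = """\
data 0x0020 00 f8 fe ff 00 00 00 00 00 00 00 20 00 00 00 00
data 0x0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
data 0x0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
data 0x0050 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
"""

# Register names as listed in the table above.
NAMES = {0x21: "TWSR", 0x22: "TWAR", 0x23: "TWDR", 0x2B: "UCSRA", 0x54: "MCUCSR"}


def nonzero_bytes(dump):
    """Return {data_address: value} for every non-zero byte in the dump text."""
    found = {}
    for line in dump.splitlines():
        parts = line.split()
        base = int(parts[1], 16)            # address of the first byte in the row
        for offset, text in enumerate(parts[2:]):
            value = int(text, 16)
            if value:
                found[base + offset] = value
    return found


for address, value in sorted(nonzero_bytes(DUMP).items()):
    io_address = address - 0x20             # address as used by IN/OUT (assumed mapping)
    name = NAMES.get(address, "?")
    print(f"data 0x{address:04x}  io 0x{io_address:02x}  {name:<7} 0x{value:02x}")
```

The I/O column is the address that IN and OUT would use for the same register, which is why the disassembly later refers to 0x3d rather than 0x5d for SPL.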
The flash program looks to be a little fancier. The two bytes at the reset address (0x0000) read 5f c0, i.e. 0x5fc0 in dump order, and the rest of the program begins at byte address 0x00c0.
The first instruction is rjmp INIT. Playing around with the .org 0x60 directive indicates two things:
- 0x60 corresponds to the position of the instruction in the program, counted in words. Since instructions are 16 bits wide, this corresponds to a byte address of 0xc0 (0x60 * 2), where the first instruction after the jump resides.
- The instruction word for rjmp INIT is 0xc05f and not 0x5fc0. RJMP encodes as 0xCkkk, and k = 0x05f = 0x60 - 1 is the offset relative to the next instruction.

This means the memory is little-endian. Given the word size of 2 bytes, the flash memory is thus better represented as:

    prog 0x0000  c05f ffff ffff ffff ffff ffff ffff ffff
    prog 0x0010  ffff ffff ffff ffff ffff ffff ffff ffff
    ...
    prog 0x00B0  ffff ffff ffff ffff ffff ffff ffff ffff
    prog 0x00C0  e50f bf0d e004 bf0e ef0f bb07 bb08 ef8f
    prog 0x00D0  e070 d004 bb78 d002 bb88 cffb e017 ef2f
    prog 0x00E0  ef3f 953a f7f1 952a f7d9 951a f7c1 9508
    prog 0x00F0  ffff ffff ffff ffff ffff ffff ffff ffff
    ...

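The conversion from the byte dump to the word dump above can be reproduced in a few lines. This is a minimal sketch: the helper names `bytes_to_words` and `word_address` are made up, and the input strings are rows copied from the byte dump.

```python
def bytes_to_words(hex_bytes):
    """Pair up bytes as little-endian 16-bit words (low byte comes first)."""
    raw = bytes.fromhex(hex_bytes)
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def word_address(byte_address):
    """Flash is addressed in 16-bit words by the CPU, so halve the byte address."""
    return byte_address // 2


reset_vector = bytes_to_words("5f c0")
row_00c0 = bytes_to_words("0f e5 0d bf 04 e0 0e bf 0f ef 07 bb 08 bb 8f ef")

print([f"{w:04x}" for w in reset_vector])   # word at the reset address
print([f"{w:04x}" for w in row_00c0])       # words starting at byte address 0x00c0
print(hex(word_address(0x00C0)))            # word address matching .org 0x60
```

Note that the address column of the word dump still counts bytes, so row `0x00C0` starts at word address 0x60.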
Breaking down the flash memory, with address in units of words:
| Address | Value | Remark | 
|---|---|---|
| 0x60 | 0xe50f | LDI. Opcode: 0xe. Loads value 0x5f to register 0x0 (+16), i.e. R16. | 
| 0x61 | 0xbf0d | OUT. Opcode: 0b10111. Copies value from register 0b10000 (R16) to I/O port 0b111101 (0x3d, in I/O-only address space), i.e. SPL. | 
| 0x62 | 0xe004 | Same LDI instruction, load 0x04. | 
| 0x63 | 0xbf0e | OUT, R16 to 0x3e (SPH). | 
| 0x64 | 0xef0f | Same LDI instruction, load 0xff. | 
| 0x65 | 0xbb07 | OUT, R16 to 0x17 (DDRB). | 
| 0x66 | 0xbb08 | OUT, R16 to 0x18 (PORTB). | 
| 0x67 | 0xef8f | LDI, load 0xff to R24 (OFF). | 
| 0x68 | 0xe070 | LDI, load 0x00 to R23 (ON). | 
| 0x69 | 0xd004 | RCALL with relative offset 0x04, i.e. a call to 0x69 + 1 + 0x04 = 0x6e (DELAY). | 
| 0x6a | 0xbb78 | OUT, R23 to 0x18 (PORTB). | 
| ... | ... | ... | 
| 0x71 | 0x953a | DEC R19 (fine). | 
| ... | ... | ... | 
| 0x77 | 0x9508 | RET instruction. | 
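The bit-field reading used in the table can be written out as a small decoder. This sketch covers only the seven instructions that appear in this program, with bit layouts taken from the AVR instruction set encodings (LDI `1110 KKKK dddd KKKK`, OUT `1011 1AAr rrrr AAAA`, RJMP `1100 k...`, RCALL `1101 k...`, DEC `1001 010d dddd 1010`, BRNE `1111 01kk kkkk k001`, RET `0x9508`). The function names and output format are made up for illustration.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def decode(word, pc):
    """Decode one 16-bit instruction word located at word address `pc`."""
    if (word & 0xF000) == 0xE000:                       # LDI Rd, K (d = 16..31)
        k = ((word >> 4) & 0xF0) | (word & 0x0F)
        d = 16 + ((word >> 4) & 0x0F)
        return f"ldi r{d}, 0x{k:02x}"
    if (word & 0xF800) == 0xB800:                       # OUT A, Rr
        a = ((word >> 5) & 0x30) | (word & 0x0F)
        r = (word >> 4) & 0x1F
        return f"out 0x{a:02x}, r{r}"
    if (word & 0xF000) == 0xC000:                       # RJMP k (12-bit signed)
        return f"rjmp 0x{pc + 1 + sign_extend(word & 0x0FFF, 12):02x}"
    if (word & 0xF000) == 0xD000:                       # RCALL k (12-bit signed)
        return f"rcall 0x{pc + 1 + sign_extend(word & 0x0FFF, 12):02x}"
    if (word & 0xFE0F) == 0x940A:                       # DEC Rd
        return f"dec r{(word >> 4) & 0x1F}"
    if (word & 0xFC07) == 0xF401:                       # BRNE k (7-bit signed)
        return f"brne 0x{pc + 1 + sign_extend((word >> 3) & 0x7F, 7):02x}"
    if word == 0x9508:                                  # RET
        return "ret"
    return f".dw 0x{word:04x}"                          # anything else: raw data


# Words from the flash dump, starting at word address 0x60.
PROGRAM = [0xE50F, 0xBF0D, 0xE004, 0xBF0E, 0xEF0F, 0xBB07, 0xBB08, 0xEF8F,
           0xE070, 0xD004, 0xBB78, 0xD002, 0xBB88, 0xCFFB, 0xE017, 0xEF2F,
           0xEF3F, 0x953A, 0xF7F1, 0x952A, 0xF7D9, 0x951A, 0xF7C1, 0x9508]

for offset, word in enumerate(PROGRAM):
    pc = 0x60 + offset
    print(f"0x{pc:02x}  0x{word:04x}  {decode(word, pc)}")
```

Branch and call targets are printed as absolute word addresses (PC + 1 + offset), so they can be compared directly with the Address column of the table.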
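At the RCALL instruction, the stack pointer changes from 0x045f to 0x045d (shown as 5d 04, low byte first), since a 2-byte return address is pushed, while the stack has value 20 6a. The program counter is at 0x69 for the RCALL, so the 0x6a is the low byte of the return address (0x69 + 1). There is also a stray 0x20 appearing in the stack, not sure why.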
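For comparison, here is a minimal model of what RCALL is expected to do to the stack. It assumes a 16-bit program counter, a post-decrementing stack pointer, and that the low byte of the return address is pushed first so that it ends up at the higher address. The names `rcall_push` and `memory` are made up for this sketch, which models the documented behaviour rather than the simulator.

```python
RAMEND = 0x045F     # top of SRAM on ATmega16, as set up in INIT


def rcall_push(sp, pc, memory):
    """Push the return address (pc + 1) and return the new stack pointer."""
    return_address = pc + 1
    memory[sp] = return_address & 0xFF          # low byte goes in first
    sp -= 1
    memory[sp] = (return_address >> 8) & 0xFF   # high byte lands one address lower
    sp -= 1
    return sp


memory = {}
sp = rcall_push(RAMEND, 0x69, memory)

print(f"SP = 0x{sp:04x}")
for address in sorted(memory):
    print(f"0x{address:04x}: {memory[address]:02x}")
```

This model gives 00 6a at 0x045e and 0x045f, so it accounts for the 0x6a and the new stack pointer value but not for the observed 0x20.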