The address 0x000000 is special because it is the starting address: after
power-up, a reset or a watchdog reset, the instruction word at address 0x000000 is
the first one to be executed.
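A minimal sketch of how a program can make use of this starting address. The label Main is hypothetical and only serves as an illustration.

```asm
.cseg
.org 0x000000 ; Starting address after power-up or reset
rjmp Main ; First instruction executed: jump to the main program
Main: ; Hypothetical label for the main program
nop ; Program code starts here
Loop:
rjmp Loop ; Endless loop
```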
Assembling a source code writes all executable instructions and tables to the code
segment (CSEG) by default. After switching the assembler either to the SRAM segment with
the directive .DSEG or to the EEPROM segment with the directive .ESEG,
the directive .CSEG returns to the code segment.
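Switching between the three segments might look like the following sketch. The labels sBuffer and eValue, the buffer size and the EEPROM value are made up for illustration.

```asm
.dseg ; Switch to the SRAM segment
sBuffer: ; Hypothetical label for a buffer in SRAM
.byte 16 ; Reserve 16 bytes, no content is generated
.eseg ; Switch to the EEPROM segment
eValue: ; Hypothetical label in EEPROM
.db 0x55 ; One byte of EEPROM content
.cseg ; Return to the code segment
nop ; This instruction goes to flash again
```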
All code that has been assembled is written by the assembler to the .hex file.
Its content can be written to the flash memory with the programmer soft- and hardware.
Instructions such as NOP write 16-bit words to the .hex file. Words with
16 bits can be written with the directive .DW 16-bit-value at any location within
Bytes can be assembled and written to the .hex file with the .DB 8-bit-value.
When a .DB line holds one single 8-bit value, the most significant byte (MSB) of the
word is always set to zero. When a .DB line holds two 8-bit constants, as in
.DB 8-bit-value-1, 8-bit-value-2, the first value is written to the LSB and the
second value to the MSB at the current location.
The assembler source code on the left has been assembled; the results
can be viewed in the assembler listing on the right.
- The NOP in line 20 of the source code, which is a valid instruction,
has been translated to an address of 0x000000 and an executable hex code of
0x0000. That hex code will be written to the hex file.
- The ADD R16,R16 in line 21 has been translated to the executable hex
code 0x0F00 at address 0x000001.
- The following line has not been translated, because it is only a label and
meaningful for the assembler only.
- The RJMP loop has been translated to 0xCFFF at address 0x000002, an
executable that jumps back to where it just came from (an endless loop).
- The first .DB 1 has been translated to 0x0001 at address 0x000003, but
the assembler complains with a warning that the number of bytes in the .DB
line is odd and that it has added a 0x00 as MSB at address 0x000003.
- The second .DB 1,2 has been translated to 0x0201 at address 0x000004.
Note that the first byte 0x01 is now the LSB of the resulting word in flash
while the second byte 0x02 is the MSB of that word.
- The third .DB 1,2,3 has been split into two words: 1 and 2 go to
the first word, 3 goes to the second word, and the assembler warns again.
- The line .DB "A text string" is translated to seven words
in a row: as can be seen from the second character in the string, a blank or
0x20, every second character goes to the MSB and every first character to the
LSB. Again, the assembler complains that the number of bytes in the line is odd.
- There are no such complaints for the lines with .DW: all words fit into the 16
bits of a flash memory location. The last entry, .DW Loop, inserts the address
of the label Loop: into the flash memory at address 0x000014. We
will later read that location to jump to such a label.
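The listing itself is not reproduced here, so the following sketch shows source lines that would produce the words described in the list above. The words in the comments were worked out by hand from the rules in this section. The value 0x1234 is made up, and because this sketch has fewer .DW lines than the original, .DW Loop ends up at 0x00000F rather than 0x000014.

```asm
.cseg
.org 0x000000
nop ; 0x000000: 0x0000
add R16,R16 ; 0x000001: 0x0F00
Loop: ; Label only, produces no code
rjmp Loop ; 0x000002: 0xCFFF, jumps to itself
.db 1 ; 0x000003: 0x0001, warning: odd number of bytes
.db 1,2 ; 0x000004: 0x0201
.db 1,2,3 ; 0x000005: 0x0201, 0x000006: 0x0003, warning
; The next line fills 0x000007 to 0x00000D with the seven words
; 0x2041, 0x6574, 0x7478, 0x7320, 0x7274, 0x6E69, 0x0067 (warning again)
.db "A text string"
.dw 0x1234 ; 0x00000E: 0x1234 (made-up value)
.dw Loop ; 0x00000F: 0x0002, the address of the label Loop
```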
The instruction LPM or Load from Program Memory reads one byte from the
flash. It takes the flash address from the register pair Z (ZH:ZL = R31:R30)
and transfers the result to register R0.
But each address in flash memory holds two bytes, an LSB and an MSB. Which of
the two bytes is to be read, and how can the second byte at that same
address be reached?
The trick is to shift the flash address left by one bit and to
add a zero or a one to the right of the address, in bit 0 of Z. A zero to the
right addresses the LSB, a one the MSB.
The trick has one disadvantage: bit 15 of the flash address cannot be
used. So it is better to place lengthy tables with thousands of values into the
lower half of a flash that is 64k words large.
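A worked example with concrete numbers may help. The sketch assumes a table at the word address 0x0123 and a device whose flash is large enough for that address; the label Table and the value 0xABCD are made up.

```asm
ldi ZH,High(2*Table) ; 2 * 0x0123 = 0x0246, so ZH = 0x02
ldi ZL,Low(2*Table) ; ZL = 0x46, bit 0 = 0 selects the LSB
lpm ; R0 = 0xCD, the LSB of the table word
ldi ZL,Low(2*Table+1) ; ZL = 0x47, bit 0 = 1 selects the MSB
lpm ; R0 = 0xAB, the MSB of the table word
Loop:
rjmp Loop ; Do not run into the table
.org 0x0123 ; Assumed table location (word address)
Table:
.dw 0xABCD ; One 16-bit word: LSB = 0xCD, MSB = 0xAB
```

The byte addresses 0x0246 and 0x0247 both belong to the word address 0x0123; only bit 0 of Z decides which half of the word is read.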
The following formulations in assembler are all equivalent: each of them sets Z
to access first the LSB and then the MSB of the first word of the byte table.
Of course, you do not have to define an extra constant named
FlashAddr; you can use the label ByteTable: directly as the
address in the LDI instructions. All the +0 and |0 in the
formulations are also superfluous because they have no effect.
.equ FlashAddr = ByteTable ; Set the flash address
; Formulation 1
ldi ZH,High(FlashAddr+FlashAddr+0) ; Access the LSB: upper byte of Z
ldi ZL,Low(FlashAddr+FlashAddr+0) ; ditto, lower byte of Z
ldi ZH,High(FlashAddr+FlashAddr+1) ; Access the MSB: upper byte of Z
ldi ZL,Low(FlashAddr+FlashAddr+1) ; ditto, lower byte of Z
; Formulation 2
ldi ZH,High(2*FlashAddr+0) ; Access the LSB: upper byte of Z
ldi ZL,Low(2*FlashAddr+0) ; ditto, lower byte of Z
ldi ZH,High(2*FlashAddr+1) ; Access the MSB: upper byte of Z
ldi ZL,Low(2*FlashAddr+1) ; ditto, lower byte of Z
; Formulation 3
ldi ZH,High((FlashAddr<<1)|0) ; Access the LSB: upper byte of Z
ldi ZL,Low((FlashAddr<<1)|0) ; ditto, lower byte of Z
ldi ZH,High((FlashAddr<<1)|1) ; Access the MSB: upper byte of Z
ldi ZL,Low((FlashAddr<<1)|1) ; ditto, lower byte of Z
ByteTable: ; The byte table that the formulations above point to
.db "This is a text string"
So, whatever you prefer, it is all the same. The result is always
in R0, as the simulated instruction shows. What the simulation
also shows is that one access to the flash memory costs three
cycles (the LDI take only one cycle each). Flash memory is therefore
a little bit slower than SRAM and much slower than registers.
ATMEL later added the option to use any register as the target.
These instructions are written as lpm register,Z,
where register can be any of the 32 registers.
A little later, the auto-increment was implemented as well. It
increases the address in Z after the load has been performed. The
effect is that the two instructions lpm and adiw ZL,1
are replaced by the single instruction lpm register,Z+. Note that
this additional step does not increase the access time.
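The three forms described so far can be put side by side. This is only a sketch: it assumes a device that supports the extended LPM forms, and the label Table, the table content and the target registers are chosen arbitrarily.

```asm
ldi ZH,High(2*Table) ; Point Z to the first byte of the table
ldi ZL,Low(2*Table)
lpm ; Classic form: R0 = 1, Z unchanged
lpm R16,Z ; Any target register: R16 = 1, Z unchanged
lpm R17,Z+ ; With auto-increment: R17 = 1, Z points to the second byte
lpm R18,Z+ ; R18 = 2, Z points to the third byte
Loop:
rjmp Loop ; Do not run into the table
Table:
.db 1,2,3,4
```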
The opposite, an auto-decrement to read tables from the end down
to the beginning, was not implemented for LPM. Unlike the SRAM
instructions LD and ST, which offer a pre-decrement form, LPM only
knows the three forms lpm, lpm register,Z and lpm register,Z+.
Reading a table backwards therefore still takes the two
instructions sbiw ZL,1 and lpm.
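Reading a table from its end down to its beginning can be sketched with the pair sbiw ZL,1 and lpm, which works on every AVR that has both instructions. The labels Table, TableEnd and ReadBack, the table content and the counter register R17 are assumptions made for this illustration.

```asm
ldi ZH,High(2*TableEnd) ; Point Z to the first byte behind the table
ldi ZL,Low(2*TableEnd)
ldi R17,4 ; Number of bytes to read
ReadBack:
sbiw ZL,1 ; Step back one byte first ...
lpm ; ... then load it to R0 (4, 3, 2, 1 in turn)
dec R17 ; Count down
brne ReadBack ; Repeat until all bytes are read
Loop:
rjmp Loop ; Do not run into the table
Table:
.db 1,2,3,4
TableEnd:
```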
The first example uses LPM to copy a null-terminated text from
flash memory to SRAM.
The whole operation takes 259 µs, which matches 259 clock cycles at a clock rate of 1 MHz.
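; Prepare data segment labels
.dseg
sTarget: .byte 32 ; Target buffer (label name and size assumed)
.cseg
; Point Z to flash in memory
ldi ZH,High(2*Text) ; The label Text is assumed
ldi ZL,Low(2*Text)
; Point X to SRAM target location
ldi XH,High(sTarget)
ldi XL,Low(sTarget)
CopyText:
lpm R16,Z+ ; Load from program memory
st X+,R16 ; Store in SRAM
tst R16 ; Null termination?
brne CopyText ; No, go on
; Don't run into table
Loop: rjmp Loop ; Endless loop (label name assumed)
; Prepare the text in flash memory
Text:
.db "This text to be copied to SRAM.",0x00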
The second example is a little bit academic. Suppose that your program
needs to react to an event with ten different subroutines of
unequal length: some short ones, some long ones. That can be the
case if the user presses one out of ten keys. You could now check
whether the event number was zero, one, two, and so on up to nine, and
call the matching subroutine in each case.
It is faster and more elegant to
- place these ten subroutine addresses into a table,
- calculate the table address from the given number,
- read the table entry with LPM, and
- call the subroutine with ICALL.
What you win here is flexibility: you can extend or reduce
the number of subroutines, place them at any
address you like, etc.
This is the source code.
The simulation has been started with select=0. The table address of
this selection has been calculated in Z by adding the left-shifted
select to the table's starting address, so that Z points to
the LSB of the first table entry.
; Prepare data segment labels
; Point Z to flash in memory
; Icall part
.equ select = 0
ldi R16,select ; Load selected routine number here
lsl R16 ; Multiply by two
ldi ZH,High(2*JmpTable) ; Point Z to table
ldi ZL,Low(2*JmpTable)
add ZL,R16 ; Add the doubled selection number
ldi R16,0 ; Add carry, if any
adc ZH,R16
lpm R16,Z+ ; Read LSB
lpm ZH,Z ; Read MSB
mov ZL,R16 ; Copy LSB to ZL
icall ; Call the routine in Z
Loop: rjmp Loop ; Endless loop (assumed), do not run into the routine
; Routine 0
Routine0:
ret ; Placeholder body (assumed)
; Jump table
JmpTable:
.dw Routine0
; Add additional routines here
0x0033 is the first entry.
Now the jump address has been read from the table and prepared for an
ICALL in Z.
The ICALL has called Routine0.
For every possible value of select, this mechanism jumps to the correct routine.
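A complete sketch with three routines could look as follows. It assumes a device with more than 256 bytes of SRAM whose include file defines RAMEND, SPH and SPL. The routine bodies, the label Loop and the use of R17 are placeholders that are not taken from the original source.

```asm
.equ select = 2 ; Routine number to call (0..2)
ldi R16,High(RAMEND) ; Init the stack, ICALL needs it
out SPH,R16
ldi R16,Low(RAMEND)
out SPL,R16
ldi R16,select ; Load selected routine number
lsl R16 ; Two bytes per table entry
ldi ZH,High(2*JmpTable) ; Point Z to the table
ldi ZL,Low(2*JmpTable)
add ZL,R16 ; Add the doubled selection number
ldi R16,0 ; LDI leaves the carry flag untouched
adc ZH,R16 ; Add carry, if any
lpm R16,Z+ ; Read LSB of the routine address
lpm ZH,Z ; Read MSB
mov ZL,R16 ; Z now holds the routine address
icall ; Call the routine in Z
Loop:
rjmp Loop ; Do not run into the routines
Routine0:
ldi R17,0 ; Placeholder body
ret
Routine1:
ldi R17,1 ; Placeholder body
ret
Routine2:
ldi R17,2 ; Placeholder body
ret
JmpTable:
.dw Routine0,Routine1,Routine2
```

Adding a fourth routine only requires a new label with its code and one more entry in the .dw line; the calling code stays unchanged.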
LPM and its more modern variations offer a wide variety of opportunities
to handle texts and to access smaller or larger tables in the large flash
memory. Effective programming very often involves such loads from program memory.