Executing instructions in AVRs
In AVRs, program instructions are stored as 16-bit words in the flash memory.
The programming device writes them there.
1.1 The program counter
The program counter (PC) starts at address 0x0000: on power-up and on every
hardware reset this address is written to the PC. The instruction word
located there is read, decoded and finally executed.
After each executed instruction the program counter advances by one,
and the instruction located at the next address is executed. The exception
is a jump instruction, which changes the program counter to point to a
different location in memory. Instruction words are then taken from there,
and the program counter advances from that address onwards.
What if the instructions run out because no more instructions have been
programmed? If the AVR gets into that situation (which should be thoroughly
avoided), it reads 0xFFFF from the unprogrammed flash location. This is
an undefined instruction word, and the AVR does nothing at all. It just
advances and reads the next word.
And what happens when the end of the program memory is reached? Then the
AVR simply continues at address 0x0000. That is similar to a reset, but not
exactly what happens during a reset: the registers are not cleared and
the port registers are not set to their default values. This
wrap-around is exactly what happens in an unprogrammed AVR: it starts
all over again and again, because the complete memory is filled with 0xFFFF.
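A common way to keep the program counter from ever running into unprogrammed flash is to end the program with a jump to itself. The following is a minimal sketch, assuming an assembler with avrasm-style syntax; the label names Start and Loop are made up for this illustration:

```asm
.org 0x0000      ; place the first instruction at the reset address
Start:
    ldi R16,0x55 ; some work done once after reset
Loop:
    rjmp Loop    ; jump to itself forever, the PC never advances past this address
```

With that, the 0xFFFF words behind the last programmed instruction are never read.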
1.2 The ALU
Many of the AVR's instructions perform arithmetic or logic operations.
This is done by the arithmetic logic unit (ALU) of the AVR: it takes
up to two values on its inputs 1 and 2 and can add or subtract them
or combine them with AND, OR or EXOR. Which operation is performed
is derived from the instruction word, see below for details.
The ALU writes the result of the operation e. g. to a target
register. Events such as an overflow (carry, C) or a zero
result (zero, Z) are called flags and are copied to the status
register SREG. There, these flags can be read and used by subsequent
instructions, e. g. for conditional jumps.
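How a flag set by the ALU steers a later instruction can be sketched as follows. The numbers and the label name Overflow are chosen for illustration only:

```asm
    ldi R16,200   ; first number
    ldi R17,100   ; second number
    add R16,R17   ; 200 + 100 = 300 does not fit into 8 bits:
                  ; R16 holds 300 - 256 = 44, C is set, Z is cleared
    brcs Overflow ; reads the C flag in SREG: jumps, because C is set
    nop           ; only reached when C is cleared
Overflow:
```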
1.3 Instruction decoding
Decoding means selecting one of the more than 100 different operations
that AVRs can perform, for example the LDI r,c instruction.
It copies the 8-bit constant c, which is located within
the instruction word in bits 0 to 3 and 8 to 11, to input 1 of the
arithmetic logic unit (ALU) and writes it, on the next edge
of the clock signal, to the register r. The register number is located
within the instruction word in the four bits 4 to 7; the missing fifth
bit is always set high.
The operation LDI is encoded in the four most significant bits
12 to 15 of the instruction word: it is 0b1110. Decoding these bits
tells the ALU to copy the byte on input 1 to the register.
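The bit layout can be checked with the assembler itself. The following sketch assumes the layout given in the instruction set manual for LDI (operation code in bits 12 to 15, upper half of the constant in bits 8 to 11, register number minus 16 in bits 4 to 7, lower half of the constant in bits 0 to 3) and an avrasm-style .dw directive that places a raw 16-bit word into flash:

```asm
    ldi R16,0x55 ; 0b1110_0101_0000_0101 = 0xE505
    .dw 0xE505   ; the same instruction word, written as raw data
    ldi R17,0xAA ; 0b1110_1010_0001_1010 = 0xEA1A
```

The first two lines should produce identical words in flash, which can be seen in the assembler listing.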
As the encoded register number has only four bits in the instruction
word, only 16 of the 32 registers can be addressed. The decision to
set the fifth bit high and to address the registers R16 to R31 with
those four bits is justified because the upper half of the registers
also holds the 16-bit pointer registers X, Y and Z, and LDI
is often used to manipulate these pointers.
Other instructions that work with an 8-bit constant are, like LDI,
limited to R16 and above:
- SUBI r,c, which subtracts c from the register content,
- ANDI r,c, which ANDs the register content bit by bit with the
constant, and
- ORI r,c, which ORs the register content bit by bit with the
constant.
All of them write the result back to the respective upper register r.
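Because these instructions only reach R16 to R31, a constant for one of the lower registers has to take a detour via an upper register. A minimal sketch, with register choices and values made up for illustration (MOV, which copies one register to another, is not discussed in the text above):

```asm
    ldi R16,0x55  ; constants can only be loaded into R16..R31
    mov R0,R16    ; MOV can address all 32 registers: R0 = 0x55
    subi R16,0x05 ; 0x55 - 0x05 = 0x50
    andi R16,0xF0 ; 0x50 AND 0xF0 = 0x50
    ori R16,0x0F  ; 0x50 OR 0x0F = 0x5F
```

Writing ldi R0,0x55 instead should be rejected by the assembler, as R0 cannot be encoded in the four register bits.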
But this missing fifth bit is a rather rare construction; most
instructions can address all 32 registers. E. g. the instructions
- SUB rx,ry, which subtracts register ry's content from that
of register rx and places the result in register rx,
- AND rx,ry, which ANDs both registers, or
- OR rx,ry, which ORs them,
can address all 32 registers, both as source and as target.
All bit combinations that an instruction word can consist of are
documented in the "avr-instruction-set-manual" for all
instructions under Description and 16-bit Opcode. The
manual can be downloaded from Microchip's website.
Execution of instructions in AVRs is designed so that the next
instruction is already fetched and decoded while the previous
instruction is being executed. When execution finishes, the next
instruction can be executed immediately, without first having to decode it.
This so-called pre-fetch only fails if an instruction alters
the program counter: then the already decoded next instruction is
discarded, and the instruction at the new address has to be fetched and
decoded in an extra cycle. This is the case for a jump JMP address or
a relative jump RJMP label, and for a conditional jump BRxC label
or BRxS label if its condition is true and it alters the program
counter. RJMP and the conditional jumps need two clock cycles instead
of one, in the conditional case only if the condition is true. JMP,
whose address occupies a second instruction word, needs three.
As an example, here is the binary AND of two 8-bit numbers in the registers
R16 and R17. They are loaded with the binary values 0b01010101 (0x55) and
0b10101010 (0xAA). The assembler source code for that is:
ldi R16,0x55 ; R16 to hexadecimal 55
ldi R17,0xAA ; R17 to hexadecimal AA
and R16,R17 ; AND R16 with R17, result to register R16
The simulation is made with
avr_sim.
Displayed to the left are the register contents after the two load
instructions have been executed: both registers are set to the
desired values. To the right, the AND instruction has been executed
as well: the ALU has written the result, zero, to R16.
Because the result of the last instruction is zero, the Z flag in the
status register SREG is one. Further instructions can make use of that.
All three instructions together take exactly 3 µs at a clock rate of
1 MHz, as each instruction takes 1 µs. Without pre-fetch
6 µs would be consumed, doubling the time needed.
If a conditional jump follows the AND and only jumps if the result is
zero, the source code would be:
ldi R16,0x55 ; R16 to hexadecimal 55
ldi R17,0xAA ; R17 to hexadecimal AA
and R16,R17 ; AND, result to register R16
breq Label ; If zero (equal) jump to label
nop ; Here when not jumped
Label: ; Here when jumped and not jumped
Now the execution of the complete code has taken 5 µs:
the executed jump instruction BREQ Label has taken two
microseconds, because the pre-fetch failed due to the alteration of the
program counter by the executed jump.
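The same cycle counting applies to loops. The following sketch counts a register down to zero; DEC (decrement, one clock cycle) and the label name Loop do not appear in the text above and are used here for illustration:

```asm
    ldi R16,3  ; 1 cycle
Loop:
    dec R16    ; 1 cycle each, executed three times: 3 cycles
    brne Loop  ; jumps twice (2 cycles each), falls through once (1 cycle): 5 cycles
               ; total: 1 + 3 + 5 = 9 cycles, or 9 µs at 1 MHz
```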
To understand how AVRs process instructions, the ADD rx,ry
instruction is shown here in detail.
The 16-bit instruction word holds the binary value 0b000011 in its bits
10 to 15. That signals to the ALU that the two register contents on its
inputs IN1 and IN2 have to be added.
Bits 4 to 8 determine which of the 32 registers (rx) is routed to IN1.
Similarly, bits 0 to 3 plus bit 9 determine the source (ry) for IN2.
Each of the two values is selected by a 1-out-of-32 multiplexer (MUX).
The ALU adds the two values, and the result in OUT is written to the
register that was selected for IN1.
The ALU further determines whether, during adding,
- an overflow occurred because the result is larger than decimal 255 or
hexadecimal 0xFF, and sets or clears the C flag (carry) in the
status register accordingly,
- the result is zero, which is the case if both registers were zero
before adding them or if the sum is exactly 256 decimal, and sets or
clears the Z flag (zero) accordingly,
- similarly, the flags N, V, S and H have to be set or cleared, depending
on the result.
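The case of a sum of exactly 256 can be sketched like this, with values picked for illustration:

```asm
    ldi R16,0x80 ; 128
    ldi R17,0x80 ; 128
    add R16,R17  ; 128 + 128 = 256: R16 holds 0x00,
                 ; C is set (result above 255) and Z is set (register is zero)
```

Both flags are set at the same time here, so a following BREQ as well as a following BRCS would jump.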
All instructions that combine two registers, such as e. g.
- the addition with carry ADC, marked 0b000111,
- the subtraction without carry SUB, marked 0b000110, and
with carry SBC, marked 0b000010,
- the comparison without carry CP, marked 0b000101, and with
carry CPC, marked 0b000001, or
- binary AND, mnemonic AND, marked 0b001000, and binary OR,
mnemonic OR, marked 0b001010,
are encoded in the instruction word in the same way; only the highest
six bits differ.
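That only the highest six bits differ can be seen when the same register pair is used with different instructions. The words below are worked out by hand from the layout in the instruction set manual (six operation bits, then bit 4 of ry, then the five bits of rx, then bits 0 to 3 of ry) and should be checked against the assembler listing:

```asm
    add R16,R17 ; 0b000011_1_10000_0001 = 0x0F01
    adc R16,R17 ; 0b000111_1_10000_0001 = 0x1F01
    sub R16,R17 ; 0b000110_1_10000_0001 = 0x1B01
    and R16,R17 ; 0b001000_1_10000_0001 = 0x2301
    or  R16,R17 ; 0b001010_1_10000_0001 = 0x2B01
```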
As five bits are available in the instruction word for selecting
each of the two registers, each of the 32 registers can serve as
source as well as target. Both sources can even be the same,
e. g. adding a register to itself, which multiplies
its content by two. This has a funny effect: when ANDing or ORing
a register with itself, the result is zero only if the register was zero
from the beginning. ATMEL has given an AND of a register with itself
a new mnemonic named TST r, which sets the zero flag if the
register r is zero. ORing the register with itself would do the same,
only the binary code changes slightly.
The instruction ADC r,r has a similar effect: all bits in
register r are shifted one position to the left, the lowest bit 0
receives the carry flag, and the highest bit 7 is shifted to the
carry. Even though this is not strictly necessary, the instruction was
given a mnemonic of its own named ROL r, for rotate r left.
Another such case is EOR r,r, which clears all
one-bits in register r, because EXORing a one with a one yields
zero. This also got its own mnemonic: the pseudo instruction CLR r,
which results in the same instruction word as EOR r,r.
That is how you increase the sheer number of different instructions
without having to change the ALU's capabilities.
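These pairs can be written side by side. The sketch below relies on the equivalences described above plus LSL r (shift left), which the instruction set manual lists as the mnemonic for adding a register to itself; the instruction words in the comments are worked out by hand and should be checked against the assembler listing:

```asm
    tst R16     ; same word as the next line
    and R16,R16 ; 0x2300
    rol R16     ; same word as the next line
    adc R16,R16 ; 0x1F00
    lsl R16     ; same word as the next line
    add R16,R16 ; 0x0F00
    clr R16     ; same word as the next line
    eor R16,R16 ; 0x2700
```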
But even with that: all described operations require only one single
clock cycle (and not more, as in PICs). No longer-lasting SRAM
accesses are necessary, everything is available in the 32 fast-acting
registers.
As an example for the execution of instructions in AVRs we look at
a 16-bit adder with overflow recognition. The source code for that:
; Defining the two numbers to be added (no instruction words, assembler directives)
.equ n16bitZahl1 = 12345 ; Define the first 16-bit number, no clock cycle
.equ n16bitZahl2 = 45678 ; Define the second 16-bit number, no clock cycle
ldi R16,Low(n16bitZahl1) ; The LSB of number 1 to R16, one clock cycle
ldi R17,High(n16bitZahl1) ; The respective MSB to R17, one clock cycle
ldi R18,Low(n16bitZahl2) ; The LSB of number 2 to R18, one clock cycle
ldi R19,High(n16bitZahl2) ; The respective MSB to R19, one clock cycle
add R16,R18 ; Add the two LSB, one clock cycle
adc R17,R19 ; And add the two MSBs plus the previous carry, one clock cycle
brcc Ready ; If no overflow occurs: jump over the next instructions, two cycles when jumping
ldi R16,0xFF ; Set result to maximum 16-bit value, LSB, one clock cycle
ldi R17,0xFF ; The same for the MSB, one clock cycle
Ready:
On the left, the four load instructions have been executed and the
numbers are in the four registers, which took four clock cycles or
4 µs at 1 MHz. To the right, the addition has been
executed. After the addition, the carry flag is not set, because
the result is smaller than 65,536. The conditional jump has been
executed, and the result is available within 8 µs.
If an overflow is provoked, e. g. by defining the first 16-bit
number as 23,456, the result looks different:
Now, the first number in R17:R16 is 0x5BA0 (to the left). The result
(to the right) is different, and the carry flag has been set. Both
registers now have to be set to 0xFF.
Here, the result has been corrected to not exceed the 16-bit range.
Now, 9 µs are needed. Instead of the two clock cycles
for the jump in the previous addition, the jump that was not executed
has now taken only one clock cycle, but the two load instructions of
0xFF require two additional cycles: the first one compensates for the
shorter jump and the second one adds one clock cycle.
All timing questions can be resolved with the methods shown here;
they can be analyzed using simulation and are completely transparent.
The execution process is simple to understand, there are no unclear
features or properties here.
©2019 by
http://avr-asm-tutorial.net