The source code of the delay loop is like this:

```
.equ c = 12345 ; A constant for counting down
;
.def rC1 = R16 ; Three registers to count down
.def rC2 = R17 ; The second one
.def rC3 = R18 ; The third one
;
Main: ; The main code starts here
sbi DDRB,PORTB0 ; Make PB0 an output
RestartCount: ; The counter loop starts here
ldi rC1,Byte1(c-1) ; Load LSB of the counter value
ldi rC2,Byte2(c-1) ; Load middle of the counter value
ldi rC3,Byte3(c-1) ; Load MSB of the counter value
CountDown:
subi rC1,1 ; Sets the carry flag if previously zero
brcc CountDown ; If not yet carry continue counting
subi rC2,1 ; Downcount the middle byte
brcc CountDown ; If not yet carry count on
subi rC3,1 ; Downcount the MSB byte
brcc CountDown ; If not yet carry count on
sbi PINB,PINB0 ; Toggle PB0 state
rjmp RestartCount ; Start new count cycle
```

The SUBI instruction sets the carry flag when the subtraction
underflows from zero to 0xFF. This is the signal to subtract
one from the next higher byte. The down-count ends when all
three bytes have underflowed to 0xFF, so one more pass is
executed than with a test for zero. This extra pass is
subtracted from the counter value: the LDI instructions load c-1 instead of c.

Note that toggling a pin's output port bit by writing a one to its PIN bit is not implemented in older AVR devices, but the ATtiny13 and many others support it.
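To see the extra pass, the three-byte countdown can be replayed on a PC. The following Python sketch is only a model of the loop and not part of the assembler source; the function name `count_passes` is made up for this illustration. It mimics SUBI by treating a byte that was zero before the subtraction as a borrow:

```python
def count_passes(start: int) -> int:
    """Count how often the first SUBI runs for a 24-bit start value."""
    regs = [start & 0xFF, (start >> 8) & 0xFF, (start >> 16) & 0xFF]
    passes = 0
    while True:
        passes += 1
        for i in range(3):
            carry = regs[i] == 0             # SUBI sets carry on 0 -> 0xFF
            regs[i] = (regs[i] - 1) & 0xFF
            if not carry:
                break                        # BRCC taken: back to CountDown
        else:
            return passes                    # all three bytes underflowed


c = 1000
print(count_passes(c - 1))  # 1000
print(count_passes(c))      # 1001
```

Loading c-1 therefore gives exactly c passes through the first SUBI.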

The source line column of the list contains all instructions that the loop consists of. The next column describes how often each instruction is executed within the loops, depending on the constant c.

Note that the divisions by 256 and by 65,536 are integer divisions: only the integer part of the result is used. The fraction is discarded and is not used for rounding up.

The complete formula for the clock cycles can then easily be derived by summing up the individual contributions.
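Summing up the contributions can be cross-checked with a small Python model. This is a sketch under assumptions: LDI and SUBI take one clock cycle, BRCC takes two when it jumps and one when it falls through, SBI and RJMP take two each, and the one-time `sbi DDRB,PORTB0` is left out. The function names are hypothetical, and the constant term is what this model yields, not a figure taken from the text.

```python
def simulate_cycles(c: int) -> int:
    """Step through one RestartCount pass and add up the clock cycles."""
    n = c - 1                                # the LDIs load c-1
    regs = [n & 0xFF, (n >> 8) & 0xFF, (n >> 16) & 0xFF]
    cycles = 3                               # three LDI, one cycle each
    done = False
    while not done:
        for i in range(3):
            carry = regs[i] == 0             # borrow if the byte was zero
            regs[i] = (regs[i] - 1) & 0xFF
            cycles += 1                      # SUBI
            if not carry:
                cycles += 2                  # BRCC taken
                break
            cycles += 1                      # BRCC not taken
        else:
            done = True                      # third byte underflowed
    return cycles + 2 + 2                    # SBI and RJMP


def summed_cycles(c: int) -> int:
    """Same total from execution counts, using integer division."""
    n = c - 1
    n8, n16 = n // 256, n // 65536
    return 3 + (3 * n + 2 * n8 + 2 * n16 + 6) + 4


print(simulate_cycles(12345), summed_cycles(12345))
```

Both functions should print the same number. The simulation is only practical for small values of c, the summed version works for any 24-bit value.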

If you want to calculate this exactly in an assembler source file, you have to make sure that the small fractions at the end are not lost. So multiply both the dividend and the divisor by a large enough number, e.g. 0x100000000, before dividing 2 by 256 and 2 by 65536. The exact formulation then is:

```
.equ cM=0x100000000
.equ cCalc =(cM*(cc-7))/(3*cM+2*cM/256+2*cM/65536)
```
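The effect of the scaling can be tried out with Python integers, which behave like the assembler's integer math when `//` is used. This is a sketch only: the value of `cc` is an assumed example, and the three function names are invented for the comparison. `Fraction` serves as an exact reference.

```python
from fractions import Fraction

C_M = 0x100000000        # scale factor cM from the listing
cc = 1_200_000           # assumed example: desired number of clock cycles


def c_naive(cc: int) -> int:
    # Without scaling, 2/256 and 2/65536 are both 0 in integer math
    return (cc - 7) // (3 + 2 // 256 + 2 // 65536)


def c_scaled(cc: int) -> int:
    # Same expression as cCalc in the listing
    return (C_M * (cc - 7)) // (3 * C_M + 2 * C_M // 256 + 2 * C_M // 65536)


def c_exact(cc: int) -> Fraction:
    # Exact rational reference value
    return Fraction(cc - 7) / (3 + Fraction(2, 256) + Fraction(2, 65536))


print(c_naive(cc), c_scaled(cc), float(c_exact(cc)))
```

The unscaled version simply divides by 3 and comes out too large, while the scaled version matches the integer part of the exact value.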

That ensures that the small fractions are not lost when the derived number is calculated.

The counting loop with 64 bits looks like this:

```
Restart:
ldi rCnt0,Byte1(cCnt) ; +1 = 1
ldi rCnt1,Byte2(cCnt) ; +1 = 2
ldi rCnt2,Byte3(cCnt) ; +1 = 3
ldi rCnt3,Byte4(cCnt) ; +1 = 4
ldi rCnt4,Byte1(cCnt/65536/65536) ; +1 = 5
ldi rCnt5,Byte2(cCnt/65536/65536) ; +1 = 6
ldi rCnt6,Byte3(cCnt/65536/65536) ; +1 = 7
ldi rCnt7,Byte4(cCnt/65536/65536) ; +1 = 8
Count:
subi rCnt0,1 ; Downcount rCnt0
brcc Count ; First inner loop
subi rCnt1,1 ; Downcount rCnt1
brcc Count ; First outer loop
subi rCnt2,1 ; Downcount rCnt2
brcc Count ; Second outer loop
subi rCnt3,1 ; Downcount rCnt3
brcc Count ; Third outer loop
subi rCnt4,1 ; Downcount rCnt4
brcc Count ; Fourth outer loop
subi rCnt5,1 ; Downcount rCnt5
brcc Count ; Fifth outer loop
subi rCnt6,1 ; Downcount rCnt6
brcc Count ; Sixth outer loop
subi rCnt7,1 ; Downcount rCnt7
brcc Count ; Seventh outer loop
sbi pIn,bIn ; Ignite, +2 = 10
rjmp Restart ; Restart, +2 = 12
```

The calculation is also relatively simple. The inner loop in the
loop section is executed cCnt + 1 times. Each execution
consumes three clock cycles (one for the SUBI, two for the BRCC).
Whenever rCnt0 underflows, the execution needs only two clock
cycles because the jump back is not executed. The BRCC of the
inner loop is therefore executed
- "cCnt - cCnt / 256" times with two clock cycles, plus
- "cCnt / 256 + 1" times with one clock cycle
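How the cCnt + 1 executions of the first BRCC split into jumps and fall-throughs can be counted by brute force. The Python sketch below is a model, not assembler code, and `first_brcc_split` is a hypothetical helper name. It only looks at the low byte, because that alone decides whether the first SUBI sets the carry flag:

```python
def first_brcc_split(c_cnt: int) -> tuple:
    """Return (taken, not_taken) for the BRCC that follows SUBI rCnt0."""
    taken = not_taken = 0
    value = c_cnt
    while value >= 0:
        if value & 0xFF == 0:    # rCnt0 is zero: SUBI sets the carry flag
            not_taken += 1       # BRCC falls through, one clock cycle
        else:
            taken += 1           # BRCC jumps back, two clock cycles
        value -= 1
    return taken, not_taken


print(first_brcc_split(1000))    # (997, 4)
```

For cCnt = 1000 the low byte is zero at the counter values 768, 512, 256 and 0, which gives the four fall-throughs.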

The next loop is executed "(cCnt / 256) + 1" times, and each following loop 256 times less often. This yields the following table:

Loop | Executions | Abbreviation |
---|---|---|
1 | cCnt + 1 | c + 1 |
2 | cCnt / 256 + 1 | c8 |
3 | cCnt / 65,536 + 1 | c16 |
4 | cCnt / 16,777,216 + 1 | c24 |
5 | cCnt / 4,294,967,296 + 1 | c32 |
6 | cCnt / 1,099,511,627,776 + 1 | c40 |
7 | cCnt / 281,474,976,710,656 + 1 | c48 |
8 | cCnt / 72,057,594,037,927,936 + 1 | c56 |
Last | cCnt / 18,446,744,073,709,551,616 + 1 | c64 |

Please note that the divisions are in integer mode with decimal fraction ignored (rounded down).
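As a worked example of these integer divisions, the following Python sketch lists the execution counts of the eight SUBI instructions for a given counter value. The helper name `stage_executions` and the example value are assumptions made for illustration:

```python
def stage_executions(c_cnt: int) -> list:
    """Executions of the SUBI in each of the eight loops (integer division)."""
    return [c_cnt // 256**k + 1 for k in range(8)]


print(stage_executions(1_000_000))
# [1000001, 3907, 16, 1, 1, 1, 1, 1]
```

With a counter value this small, only the first three loops run more than once; all higher loops are passed exactly one time.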

This yields the following rows of clock cycles.

Code line | Executions with one clock cycle | Executions with two clock cycles | Total clocks |
---|---|---|---|
; Loading | - | - | 8 |
subi rCnt0,1 | c+1 | - | c + 1 |
brcc Count | c8 | c+1-c8 | c8 + 2*(c+1) - 2*c8 |
subi rCnt1,1 | c8 | - | c8 |
brcc Count | c16 | c8-c16 | c16 + 2*c8 - 2*c16 |
subi rCnt2,1 | c16 | - | c16 |
brcc Count | c24 | c16-c24 | c24 + 2*c16 - 2*c24 |
subi rCnt3,1 | c24 | - | c24 |
brcc Count | c32 | c24-c32 | c32 + 2*c24 - 2*c32 |
subi rCnt4,1 | c32 | - | c32 |
brcc Count | c40 | c32-c40 | c40 + 2*c32 - 2*c40 |
subi rCnt5,1 | c40 | - | c40 |
brcc Count | c48 | c40-c48 | c48 + 2*c40 - 2*c48 |
subi rCnt6,1 | c48 | - | c48 |
brcc Count | c56 | c48-c56 | c56 + 2*c48 - 2*c56 |
subi rCnt7,1 | c56 | - | c56 |
brcc Count | c64 | c56-c64 | c64 + 2*c56 - 2*c64 |
sbi pIn,bIn | - | 1 | 2 |
rjmp Restart | - | 1 | 2 |
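The sum of all rows can be checked against a step-by-step replay of the loop. The Python sketch below is a model under assumptions: every BRCC runs once after its SUBI, LDI and SUBI take one clock cycle, BRCC takes one or two, SBI and RJMP take two each. The names `total_cycles` and `simulate` are hypothetical, and the resulting constant comes from this model, not from the text.

```python
def total_cycles(c_cnt: int, stages: int = 8) -> int:
    """Clock cycles from Restart to Restart, summed loop by loop.

    Assumes c_cnt < 256**stages.
    """
    execs = [c_cnt // 256**k + 1 for k in range(stages)]
    execs.append(1)                       # the last fall-through happens once
    cycles = stages                       # one LDI per register
    for k in range(stages):
        not_taken = execs[k + 1]          # BRCC falls through, 1 cycle
        taken = execs[k] - not_taken      # BRCC jumps back, 2 cycles
        cycles += execs[k] + not_taken + 2 * taken
    return cycles + 2 + 2                 # SBI and RJMP


def simulate(c_cnt: int, stages: int = 8) -> int:
    """Replay the loop byte by byte; only practical for small c_cnt."""
    regs = [(c_cnt >> (8 * k)) & 0xFF for k in range(stages)]
    cycles = stages
    while True:
        for k in range(stages):
            borrow = regs[k] == 0
            regs[k] = (regs[k] - 1) & 0xFF
            cycles += 1                   # SUBI
            if not borrow:
                cycles += 2               # BRCC taken
                break
            cycles += 1                   # BRCC not taken
        else:
            return cycles + 4             # SBI and RJMP


print(total_cycles(39_999), simulate(39_999))
```

Both numbers should agree. `total_cycles` also works for counter values that would take far too long to replay.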

If all instruction cycles are added together, the following formula describes the total clock cycles:

Because converting e.g. years into clock cycles is not that simple, I added the following lines to the code:

```
; **********************************
; A D J U S T A B L E C O N S T
; **********************************
;
; Compose the duration of counting
.equ cCntYears = 0
.equ cCntMonthes = 0
.equ cCntDays = 0
.equ cCntHours = 0
.equ cCntMinutes = 0
.equ cCntSeconds = 0
.equ cCntMilliseconds = 100
.equ cCntMicroseconds = 0
;
; The clock frequency
.equ Clock = 1200000 ; of the ATtiny13
;
; **********************************
; F I X & D E R I V. C O N S T
; **********************************
;
.equ cCntSec = cCntSeconds+60*cCntMinutes+3600*cCntHours+86400*cCntDays+2629800*cCntMonthes+31557600*cCntYears
.equ cCntUSec = 1000*cCntMilliseconds+cCntMicroseconds
.equ cCnt = (cCntSec * Clock + Clock * cCntUSec / 1000000 - 70) / 3
```

Editing times is comfortable with this: the assembler does all the
conversion work. If you want to have the first pulse one hour after
the operating voltage has been applied, just set cCntHours to one.
The code of the 64-bit loop is here.
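To check what the assembler will calculate, the `.equ` lines can be mirrored in Python. This is a sketch, not a replacement for the assembler source: the function name `c_cnt_from_duration` is invented, and all factors, including the `- 70`, are copied unchanged from the listing above.

```python
CLOCK = 1_200_000   # Hz, default clock of the ATtiny13 as in the listing


def c_cnt_from_duration(years=0, months=0, days=0, hours=0, minutes=0,
                        seconds=0, milliseconds=0, microseconds=0,
                        clock=CLOCK) -> int:
    """Mirror cCntSec, cCntUSec and cCnt with integer math."""
    sec = (seconds + 60 * minutes + 3600 * hours + 86400 * days
           + 2629800 * months + 31557600 * years)
    usec = 1000 * milliseconds + microseconds
    # // is integer division, like the assembler's /
    return (sec * clock + clock * usec // 1_000_000 - 70) // 3


print(c_cnt_from_duration(milliseconds=100))   # 39976
print(c_cnt_from_duration(hours=1))            # 1439999976
```

Note that even for 100 ms the intermediate product `clock * usec` is 120,000,000,000, which no longer fits into 32 bits.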

You see that optimization in assembler provides many opportunities. Here we just replaced a DEC with a SUBI instruction and a BRNE with a BRCC instruction, and we got a short and simple piece of code that is also easy to understand and to calculate.

But please note that the calculation of very long time periods can exceed the limits of assemblers that only work with 32-bit integers. Assembling then ends with an overflow message. gavrasm and avr_sim work with INT64, so you can handle time loops that are thousands of years long without any problems.

```
ldi rCnt,LoopRepetitions
Counterloop:
subi rCnt,1
brcc Counterloop
nop
```
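What the trailing NOP does to the timing can be checked with a small cycle model. The Python sketch below is an approximation of the short loop above, assuming that SUBI and NOP take one clock cycle and that BRCC takes two when it jumps and one when it falls through. The function name `loop_cycles` is hypothetical, and the LDI is not counted.

```python
def loop_cycles(repetitions: int, with_nop: bool) -> int:
    """Clock cycles spent in the SUBI/BRCC loop for an 8-bit counter."""
    cycles = 0
    value = repetitions & 0xFF
    while True:
        borrow = value == 0
        value = (value - 1) & 0xFF
        cycles += 1                # SUBI
        if not borrow:
            cycles += 2            # BRCC taken
            continue
        cycles += 1                # BRCC not taken
        if with_nop:
            cycles += 1            # NOP pads the exit pass to three cycles
        return cycles


print(loop_cycles(10, with_nop=True))    # 33
print(loop_cycles(10, with_nop=False))   # 32
```

With the NOP the total is a plain multiple of three, 3 * (repetitions + 1), because the exit pass costs as much as every other pass.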

Each loop execution now needs exactly 3 clock cycles. If you
combine eight such loops, each additional loop execution needs
three clock cycles. The number of clock cycles is then:

This can be coded in assembler much more simply than in the case above. The source code here does this and demonstrates such a counting loop. Additionally, this source code allows the calculation of very long times (> 10 years) to be compatible with 64-bit integer handling. But: calculation of 256

©2009-2020 by http://www.avr-asm-tutorial.net