AVR timing imprecise

Quote from steinm on February 1, 2024, 6:40 pm
This article explains it in more detail:
https://nerdralph.blogspot.com/2020/04/measuring-avr-interrupt-latency.html
ATtiny85 uses RJMP, but ATmega8 uses JMP. RJMP takes only 2 cycles, and the PC has already been pushed onto the stack before it executes. The RJMP just jumps to the start of the ISR. Hence, the first instruction of the ISR starts 6 cycles later.
Actually, you don't even need an RJMP if you have just one interrupt. You could just as well place the ISR code where the interrupt vector is located, overwriting the interrupt vectors that follow it.



Quote from steinm on February 2, 2024, 11:01 am
I haven't tried "m_retCycles = 4;" yet, but from looking at the code, I'm confident that it works.
I also wrote a tiny assembly program which measures the response time. As expected, it misses 2 cycles. The assembly program is attached. It has a define to switch between putting an RJMP or the ISR code itself into the interrupt vector table. It's perfectly fine to place the ISR code in flash memory where the RJMP is usually placed, as long as the interrupt vectors after the used interrupt aren't used. In the example, the ISR code itself takes up 4 bytes. The interrupt vector table reserves just 2 bytes for each interrupt. Hence, the code will overwrite the next vector in the table. But who cares, it's not needed anyway.
"ATmega8 also uses RJMP."
isn't quite right, because it's not up to the type of controller but rather a decision of the person writing the code. There is no need for a JMP on ATtiny because RJMP can reach any address in flash. That's why there are only two bytes per vector and you don't even have a JMP instruction on ATtiny. Controllers with larger memory will need the JMP instruction and consequently 4 bytes (two words) per vector. But you could still use RJMP on those devices.
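The slot arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `overwritten_vectors` and its parameters are made up for this example, and the vector index used is arbitrary rather than taken from any datasheet.

```python
def overwritten_vectors(vector_index, isr_bytes, vector_bytes=2):
    """Return the indices of later vector slots clobbered by inline ISR code.

    vector_index: slot where the ISR code starts (hypothetical numbering)
    isr_bytes:    size of the ISR code placed directly in the table
    vector_bytes: size of one slot (2 for an RJMP slot, 4 for a JMP slot)
    """
    # Number of slots the ISR code occupies, rounded up.
    slots_used = -(-isr_bytes // vector_bytes)
    # Every occupied slot after the first belongs to another vector.
    return list(range(vector_index + 1, vector_index + slots_used))


# A 4-byte ISR in a table with 2-byte slots spills into exactly one later slot.
print(overwritten_vectors(5, 4))
# The same ISR fits into a single 4-byte (JMP-sized) slot.
print(overwritten_vectors(5, 4, vector_bytes=4))
```

As long as none of the returned vectors is enabled, the overlap is harmless, which is the point made in the post.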

Quote from steinm on February 2, 2024, 1:41 pm
Some final observations on real hardware, which behaves as expected.
If the ISR code is placed right at the position of the interrupt vector (no RJMP), then timer0 counts up to 4 (sometimes 5). The instruction that reads the counter value into a register does not add another cycle. It seems the timer value is read early during that instruction, before it is incremented. There may also be an explanation for why the counter sometimes counts up to five. The main routine is made up of NOP, OUT and RJMP. NOP and OUT take one cycle, RJMP takes two cycles. If the interrupt triggers early during the RJMP, that instruction is completed first, which adds another cycle before the ISR is executed. Indeed, the 5-cycle delay happens less often if I place more NOPs into the main routine, increasing the probability that one of them is interrupted. On the other hand, adding 2-cycle instructions will more often result in a delay of 5 cycles.
I suppose this cannot be handled by the simulation.
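The 4-versus-5 reasoning above can be checked with a small Python model. It is a rough sketch under two assumptions that are not from the measurements: the interrupt flag is equally likely to be set on any cycle of the main loop, and the latency is the 4-cycle response plus whatever remains of the instruction in progress. The name `latency_histogram` is hypothetical.

```python
from collections import Counter

RESPONSE_CYCLES = 4  # response time reported in the post (ISR placed at the vector)


def latency_histogram(loop_cycles):
    """Count how often each latency occurs over all cycles of the main loop.

    loop_cycles: cycle count of each instruction in the loop, in order.
    """
    hist = Counter()
    for cycles in loop_cycles:
        for offset in range(cycles):
            # Cycles still needed to finish the interrupted instruction.
            remaining = cycles - 1 - offset
            hist[RESPONSE_CYCLES + remaining] += 1
    return dict(hist)


print(latency_histogram([1, 1, 1, 2]))        # NOP, NOP, OUT, RJMP
print(latency_histogram([1, 1, 1, 1, 1, 2]))  # two extra NOPs
print(latency_histogram([1, 1, 2, 1, 2]))     # an extra 2-cycle instruction
```

In this model extra NOPs lower the share of 5s and extra 2-cycle instructions raise it, which matches the hardware observation.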

Quote from KerimF on February 2, 2024, 2:34 pm
Quote from steinm on February 2, 2024, 11:01 am
"ATmega8 also uses RJMP."
isn't quite right...
Sorry for not being clearer: ATmega8 (8K bytes = 4K words of flash memory) has RJMP only.
Looking at its datasheet now, it seems that JMP and CALL were included by mistake in earlier versions:
Changes from Rev. 2486K-08/03 to Rev. 2486L-10/03
3. Removed instructions CALL and JMP from the datasheet.

Quote from arcachofo on February 2, 2024, 3:16 pm
Quote from KerimF on February 2, 2024, 2:34 pm
Looking at its datasheet now, it seems that JMP and CALL were included by mistake in earlier versions:
Changes from Rev. 2486K-08/03 to Rev. 2486L-10/03
3. Removed instructions CALL and JMP from the datasheet.
Thanks for the information, I will check which version I'm using.
Quote from steinm on February 2, 2024, 1:41 pm
Some final observations on real hardware, which behaves as expected.
If the ISR code is placed right at the position of the interrupt vector (no RJMP), then timer0 counts up to 4 (sometimes 5). The instruction that reads the counter value into a register does not add another cycle. It seems the timer value is read early during that instruction, before it is incremented. There may also be an explanation for why the counter sometimes counts up to five. The main routine is made up of NOP, OUT and RJMP. NOP and OUT take one cycle, RJMP takes two cycles. If the interrupt triggers early during the RJMP, that instruction is completed first, which adds another cycle before the ISR is executed. Indeed, the 5-cycle delay happens less often if I place more NOPs into the main routine, increasing the probability that one of them is interrupted. On the other hand, adding 2-cycle instructions will more often result in a delay of 5 cycles.
I suppose this cannot be handled by the simulation.
I guess the simulation should handle this situation.
And thanks for the asm code, I will do some tests myself.

Quote from arcachofo on February 2, 2024, 4:03 pm
About the occasional 5 cycles to jump to the ISR:
If I add 3 NOPs in the main loop instead of 2, then I get many 5s in cnt in the simulation.
So I think this is handled correctly in the simulation.

Quote from steinm on February 2, 2024, 6:29 pm
Adding more NOPs should not give you more 5s in cnt, but adding more 2-cycle instructions like 'ld temp1,X' should. If I look at it with a logic analyser, one can even tell from the number of 5s and 4s in cnt how many 1-cycle and 2-cycle instructions are in main. That is very close (if not identical) to the real hardware.

Quote from arcachofo on February 2, 2024, 6:54 pm
"Adding more NOPs should not give you more 5s in cnt, but adding more 2-cycle instructions like 'ld temp1,X' should."
As I see it, it is not about adding 1-cycle or 2-cycle instructions; it is more about the TIMER0 overflow triggering while a multi-cycle instruction is being executed right before the jump to the ISR.
And the last instruction in the loop, RJMP, takes 2 cycles.
So, depending on the number of cycles in the loop, the TIMER0 overflow will land on that RJMP more or less often (or never).
For example, if the number of cycles in the loop is a divisor of 256, then the TIMER0 overflow will always happen at the same instruction, which may or may not be that RJMP.
You can adjust it to always hit that RJMP and always get a 5 just by adding NOPs, for example:
sei ;turn interrupts on
nop
nop
nop
nop
LOOP:
nop
nop
nop
nop
nop
out PORTB,cnt
rjmp LOOP
But if the number of cycles in the loop is not a divisor of 256, then the TIMER0 overflow will happen at different places in the loop.
Then it all depends on how often that RJMP is hit.
For example, using 3 NOPs in the loop instead of 2, you get a 5 every 3 interrupts.
If you add more 2-cycle instructions, there are more chances to hit one, but as long as there is at least one 2-cycle instruction there is some chance of hitting it, and as seen above you can adjust it to always hit it.
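The phase argument above can be illustrated with a short Python sketch. It is a simplification, not a model of the simulator: it assumes the main loop advances by a fixed `step` of 256 cycles between overflows and ignores the cycles spent inside the ISR itself, which on real hardware would presumably shift the phase. The function name `latencies` and its parameters are hypothetical.

```python
def latencies(loop_cycles, start_phase=0, step=256, count=6, response=4):
    """List the interrupt latency seen at each of `count` consecutive overflows.

    loop_cycles: cycle count of each instruction in the main loop, in order
    start_phase: loop cycle on which the first overflow lands (assumed)
    step:        main-loop cycles between overflows (assumed constant)
    """
    # For every cycle of the loop, how many cycles the current instruction
    # still needs after this one (0 for 1-cycle instructions).
    remaining = []
    for cycles in loop_cycles:
        remaining.extend(range(cycles - 1, -1, -1))
    length = len(remaining)
    return [response + remaining[(start_phase + i * step) % length]
            for i in range(count)]


# 5 NOPs + OUT + RJMP = 8 cycles, a divisor of 256: the same cycle is hit
# every time, so the result depends only on the alignment.
print(latencies([1, 1, 1, 1, 1, 1, 2], start_phase=6))
print(latencies([1, 1, 1, 1, 1, 1, 2], start_phase=0))
# 3 NOPs + OUT + RJMP = 6 cycles: the hit position rotates through the loop.
print(latencies([1, 1, 1, 1, 2]))
```

With the 8-cycle loop the latency is always 5 or always 4 depending on where the first overflow lands, and with the 6-cycle loop a 5 shows up once every three interrupts in this model.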