Discussion:
FSR differences between ARMv4, ARMv5 and ARMv6?
Marc Singer
2008-12-11 18:24:07 UTC
In tracking an unhandled fault on the MX31 (ARMv6) I found that some
of the FSR encodings changed for ARMv6. For example,

DFSR    ARMv4 (TRM*)     ARMv5 (ARM926EJ-S)  ARMv6 (ARM1136J-S)
~~~~    ~~~~~~~~~~~~     ~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~
b00010  External abort   Undefined           Cache maintenance
        on linefetch                         operation fault
        (section)
b00110  External abort   Undefined           Access flag fault
        on linefetch                         on page
        (page)

* TRM refers to the ARM Technical Reference Manual as published by
Addison Wesley.

It appears that the important faults (those with handlers other than
do_bad) are compatible between architectures. So, perhaps this is
merely an issue of making the descriptions either more precise per
architecture, or more vague so that the fault isn't misrepresented.


-------------------------------------------------------------------
List admin: http://lists.arm.linux.org.uk/mailman/listinfo/linux-arm-kernel
FAQ: http://www.arm.linux.org.uk/mailinglists/faq.php
Etiquette: http://www.arm.linux.org.uk/mailinglists/etiquette.php
Russell King - ARM Linux
2008-12-11 19:12:50 UTC
I don't know where you're getting the information for ARMv6 from, but
it is incorrect.

00010 in the ARM1136 TRM and ARM ARM is "Instruction debug event".
00110 in the ARM1136 TRM and ARM ARM is not a defined bit pattern.

Marc Singer
2008-12-11 19:57:17 UTC
Post by Russell King - ARM Linux
I don't know where you're getting the information for ARMv6 from, but
it is incorrect.
00010 in the ARM1136 TRM and ARM ARM is "Instruction debug event".
00110 in the ARM1136 TRM and ARM ARM is not a defined bit pattern.
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211j/DDI0211J_arm1136_r1p5_trm.pdf

Table 3-62 on p3-84. Perhaps I am misinterpreting the table.

Russell King - ARM Linux
2008-12-11 20:36:49 UTC
Marc Singer
2008-12-11 20:57:05 UTC
Post by Russell King - ARM Linux
Post by Marc Singer
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211j/DDI0211J_arm1136_r1p5_trm.pdf
Table 3-62 on p3-84. Perhaps I am misinterpreting the table.
Oh hell, the ever changing face of ARM Ltd documentation. The version I
was looking at was DDI0211D.
00010 "Debug event fault"
00110 is access fault
Now, access fault is a new thing in later ARM architectures which, with the
AFE control register bit enabled, will produce faults if AP[0] is zero.
See page 6-40. We don't enable the AFE bit though.
Well, that's bad news for me. I am seeing this fault on the instruction after

cpsie i

when unlocking an ALSA stream lock. I verified that the instruction
after cpsie is irrelevant by inserting a NOP. I'll double-check that
AFE is, indeed, unset. After that, I'm not sure what to do to find
the cause of the fault. Perhaps there is something misconfigured in
the VIC (iMX31 implementation).
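
To double-check the AFE setting, the CP15 control register (c1) value can be dumped and the bit tested. A minimal sketch in C follows; the helper name is hypothetical, and the bit position (29) is an assumption taken from memory of the ARM control register layout, so it should be checked against the TRM for the core revision in use. The register value is passed in as an argument so the snippet does not depend on inline assembly.

```c
#include <stdint.h>

/* Assumed position of the AFE (access flag enable) bit in the CP15 c1
 * control register; verify against the TRM for the core revision. */
#define CR_AFE_BIT 29u

/* Returns 1 if the AFE bit is set in the given control register value. */
static inline int cr_afe_enabled(uint32_t cr)
{
    return (int)((cr >> CR_AFE_BIT) & 1u);
}
```

A value with only bit 29 set (0x20000000) reports the bit as enabled; any value with bit 29 clear reports it as disabled.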

Laurent Desnogues
2008-12-11 20:02:11 UTC
On Thu, Dec 11, 2008 at 8:12 PM, Russell King - ARM Linux
Post by Russell King - ARM Linux
I don't know where you're getting the information for ARMv6 from, but
it is incorrect.
00010 in the ARM1136 TRM and ARM ARM is "Instruction debug event".
00110 in the ARM1136 TRM and ARM ARM is not a defined bit pattern.
00110 is indeed Access Flag fault on Page for both IFSR and DFSR.
00100 is Cache maintenance operation fault for DFSR.

Ref:
IFSR: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0211j/Bhcbcbbb.html
DFSR:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0211j/Bhcediaa.html
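
As an illustration of the more precise per-architecture descriptions asked about at the top of the thread, the two ARM1136 encodings given above could be held in a small lookup table. This is only a sketch in C with hypothetical names; it covers just these two entries and does not claim to match the kernel's own fault tables.

```c
#include <stddef.h>

struct dfsr_name {
    unsigned status;      /* status value as written in the TRM tables */
    const char *name;
};

/* Only the two ARM1136 encodings quoted above; not a complete table. */
static const struct dfsr_name arm1136_dfsr_names[] = {
    { 0x04, "cache maintenance operation fault" },  /* 0b00100 */
    { 0x06, "access flag fault on page" },          /* 0b00110 */
};

/* Returns a description for a status value, or a placeholder string
 * for anything this sketch does not cover. */
static inline const char *arm1136_dfsr_name(unsigned status)
{
    size_t i;
    size_t n = sizeof(arm1136_dfsr_names) / sizeof(arm1136_dfsr_names[0]);

    for (i = 0; i < n; i++)
        if (arm1136_dfsr_names[i].status == status)
            return arm1136_dfsr_names[i].name;
    return "not covered by this sketch";
}
```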

HTH,

Laurent

Marc Singer
2008-12-11 19:59:33 UTC
Post by Russell King - ARM Linux
I don't know where you're getting the information for ARMv6 from, but
it is incorrect.
00010 in the ARM1136 TRM and ARM ARM is "Instruction debug event".
00110 in the ARM1136 TRM and ARM ARM is not a defined bit pattern.
As for the premise that 0b00110 is undefined, I am seeing an abort
with this encoding. If it's undefined, I'm not sure how the kernel
would generate such an abort.

Unhandled fault: external abort on linefetch (806) at 0x00000000
Internal error: : 806 [#1] PREEMPT


Laurent Desnogues
2008-12-11 20:46:20 UTC
Post by Marc Singer
As for the premise that 0b00110 is undefined, I am seeing an abort
with this encoding. If it's undefined, I'm not sure how the kernel
would generate such an abort.
Unhandled fault: external abort on linefetch (806) at 0x00000000
Internal error: : 806 [#1] PREEMPT
External abort should be 10110, where the most significant bit is
encoded in bit 10 of DFSR and the 4 least significant bits in
bits[3:0]. Such aborts are imprecise, which means it's almost
impossible to know why they arose, making them unrecoverable. An
example of how it can happen is when you have a memory area mapped
as cacheable; if your cache does write allocation and the line is
evicted (which can happen long after you wrote the data) and some
external chip doesn't accept the write of the evicted data, you
will get an imprecise data abort.
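
The decoding described above can be sketched in C as follows. The helper names are hypothetical and do not come from the kernel sources; the bit positions are the ones given in this thread (status in bit 10 plus bits[3:0], read/write flag in bit 11).

```c
#include <stdint.h>

/* 5-bit fault status: FSR bit 10 is the top bit, FSR bits [3:0] the rest. */
static inline unsigned fsr_status(uint32_t fsr)
{
    return (unsigned)((((fsr >> 10) & 1u) << 4) | (fsr & 0xfu));
}

/* Bit 11 of the DFSR: 1 means the aborted access was a write. */
static inline int fsr_is_write(uint32_t fsr)
{
    return (int)((fsr >> 11) & 1u);
}
```

For the logged value 0x806, fsr_status() gives 0x06 (0b00110) and fsr_is_write() gives 1. A value with bit 10 also set, such as 0xc06, gives 0x16 (0b10110).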


Laurent

Marc Singer
2008-12-11 21:06:36 UTC
Post by Laurent Desnogues
Post by Marc Singer
As for the premise that 0b00110 is undefined, I am seeing an abort
with this encoding. If it's undefined, I'm not sure how the kernel
would generate such an abort.
Unhandled fault: external abort on linefetch (806) at 0x00000000
Internal error: : 806 [#1] PREEMPT
External abort should be 10110, where the most significant bit is
encoded in bit 10 of DFSR and the 4 least significant bits in
bits[3:0]. Such aborts are imprecise, which means it's almost
impossible to know why they arose, making them unrecoverable. An
example of how it can happen is when you have a memory area mapped
as cacheable; if your cache does write allocation and the line is
evicted (which can happen long after you wrote the data) and some
external chip doesn't accept the write of the evicted data, you
will get an imprecise data abort.
Odd. The FSR is really 0x806 which means that it is not a 0b10110
abort, but an abort with a status of 0b00110 and a RW flag of 1. I've
read through the code and I believe that the 0x806 is the raw FSR
register value.

As for your explanation of the cause, it looks like something along
these lines. I've seen the fault occur in two places, either just
after a 'cpsie i' or during a 'pld'. The latter case occurred in the
copy_from_user routine which is odder still. I'm still analyzing the
code involved, which is extensive, to see where an improperly mmapped
device could be creeping in.

Russell King - ARM Linux
2008-12-11 21:08:33 UTC
Post by Marc Singer
As for your explanation of the cause, it looks like something along
these lines. I've seen the fault occur in two places, either just
after a 'cpsie i' or during a 'pld'. The latter case occurred in the
copy_from_user routine which is odder still. I'm still analyzing the
code involved, which is extensive, to see where an improperly mmapped
device could be creeping in.
There have been cases where random CPU behaviour has resulted from
noisy power supplies to the CPU. Is this proven hardware?

Marc Singer
2008-12-11 21:15:46 UTC
Post by Russell King - ARM Linux
Post by Marc Singer
As for your explanation of the cause, it looks like something along
these lines. I've seen the fault occur in two places, either just
after a 'cpsie i' or during a 'pld'. The latter case occurred in the
copy_from_user routine which is odder still. I'm still analyzing the
code involved, which is extensive, to see where an improperly mmapped
device could be creeping in.
There have been cases where random CPU behaviour has resulted from
noisy power supplies to the CPU. Is this proven hardware?
Custom board. I'll ask the EE to check the power supply.

Considering the consistency of the fault behavior, I'm pessimistic
that I'm witnessing the result of noisy power. I'll check anyway.

Nicolas Pitre
2008-12-11 21:45:52 UTC
Post by Marc Singer
Post by Russell King - ARM Linux
Post by Marc Singer
As for your explanation of the cause, it looks like something along
these lines. I've seen the fault occur in two places, either just
after a 'cpsie i' or during a 'pld'. The latter case occurred in the
copy_from_user routine which is odder still. I'm still analyzing the
code involved, which is extensive, to see where an improperly mmapped
device could be creeping in.
There have been cases where random CPU behaviour has resulted from
noisy power supplies to the CPU. Is this proven hardware?
Custom board. I'll ask the EE to check the power supply.
Considering the consistency of the fault behavior, I'm pessimistic
that I'm witnessing the result of noisy power. I'll check anyway.
One thing you could try is turning all caches off. It'll be slow as
hell, but that would give you a data point if the fault doesn't occur
anymore.


Nicolas

Marc Singer
2008-12-11 23:35:50 UTC
Post by Nicolas Pitre
One thing you could try is turning all caches off. It'll be slow as
hell, but that would give you a data point if the fault doesn't occur
anymore.
Fantastic suggestion. I'm still waiting, about 30 minutes, for the
first kernel message to print. I may have to let it run overnight.

Russell King - ARM Linux
2008-12-11 23:55:00 UTC
Post by Marc Singer
Post by Nicolas Pitre
One thing you could try is turning all caches off. It'll be slow as
hell, but that would give you a data point if the fault doesn't occur
anymore.
Fantastic suggestion. I'm still waiting, about 30 minutes, for the
first kernel message to print. I may have to let it run overnight.
It won't be that slow. When I was originally developing the ARMv6 in
1999/2000, I was booting it on a 233MHz laptop running under the
ARMulator. It took about 5 minutes to boot.

It is entirely possible that the old trick of "turn off the caches" will
crash the kernel with later architectures - the various atomic operations
are defined as "unpredictable" against anything but "normal memory" -
which includes using them on "strongly ordered" memory. Strongly ordered
memory is what you get if you turn the caches completely off.

Marc Singer
2008-12-12 00:53:04 UTC
The data abort handler for v6 performs

bic r1, r1, #1 << 11 | 1 << 10 @ clear bits 11 and 10 of FSR

right after reading FSR. In the case of the MX31 implementation of
v6, I don't believe that this is necessary since the architecture
passes bit 10 as a significant part of the fault status. The 'RW' bit
(bit 11), as well, is implemented by the architecture, so it ought to
be OK to let it pass through, no?

In any case, when I remove this line the fault FSR changes to 0xc06
"imprecise external abort" which makes a heck of a lot more sense.

Russell, should this line be removed? Perhaps we should not clear bit
10 at a minimum?

Russell King - ARM Linux
2008-12-12 08:15:01 UTC
Post by Marc Singer
The data abort handler for v6 performs
bic r1, r1, #1 << 11 | 1 << 10 @ clear bits 11 and 10 of FSR
right after reading FSR. In the case of the MX31 implementation of
v6, I don't believe that this is necessary since the architecture
passes bit 10 as a significant part of the fault status. The 'RW' bit
(bit 11), as well, is implemented by the architecture, so it ought to
be OK to let it pass through, no?
In any case, when I remove this line the fault FSR changes to 0xc06
"imprecise external abort" which makes a heck of a lot more sense.
Russell, should this line be removed? Perhaps we should not clear bit
10 at a minimum?
No idea. George introduced this code, see below (stupidly long comment
line got truncated).

George, please comment.


commit 3a1e501511a1e2c665c566939047794dcf86466b
Author: George G. Davis <***@com.rmk.(none)>
Date: Fri Apr 29 22:08:33 2005 +0100

[PATCH] ARM: 2655/1: ARM1136 SWP instruction abort handler fix

Patch from George G. Davis

As noted in http://www.arm.com/linux/patch-2.6.9-arm1.gz, the "Faulty SWP i

Signed-off-by: George G. Davis
Signed-off-by: Russell King <rmk+***@arm.linux.org.uk>

diff --git a/arch/arm/mm/abort-ev6.S b/arch/arm/mm/abort-ev6.S
index 38b2cbb..8f76f3d 100644
--- a/arch/arm/mm/abort-ev6.S
+++ b/arch/arm/mm/abort-ev6.S
@@ -1,5 +1,6 @@
 #include <linux/linkage.h>
 #include <asm/assembler.h>
+#include "abort-macro.S"
 /*
  * Function: v6_early_abort
  *
@@ -13,11 +14,26 @@
  *         : sp = pointer to registers
  *
  * Purpose : obtain information about current aborted instruction.
+ * Note: we read user space. This means we might cause a data
+ * abort here if the I-TLB and D-TLB aren't seeing the same
+ * picture. Unfortunately, this does happen. We live with it.
  */
         .align  5
 ENTRY(v6_early_abort)
         mrc     p15, 0, r1, c5, c0, 0           @ get FSR
         mrc     p15, 0, r0, c6, c0, 0           @ get FAR
+/*
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * The test below covers all the write situations, including Java bytecodes
+ */
+        bic     r1, r1, #1 << 11 | 1 << 10      @ clear bits 11 and 10 of FSR
+        tst     r3, #PSR_J_BIT                  @ Java?
+        movne   pc, lr
+        do_thumb_abort
+        ldreq   r3, [r2]                        @ read aborted ARM instruction
+        do_ldrd_abort
+        tst     r3, #1 << 20                    @ L = 0 -> write
+        orreq   r1, r1, #1 << 11                @ yes.
         mov     pc, lr
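
Read as C, the ARM-state path of the patched handler amounts to roughly the following. This is a model for discussion, not kernel code: the function and macro names are hypothetical, and the Java, Thumb and LDRD special cases handled by the assembly are deliberately left out.

```c
#include <stdint.h>

#define FSR_BIT10  (1u << 10)
#define FSR_WRITE  (1u << 11)
#define INSN_L_BIT (1u << 20)   /* L = 0 is treated as a write by the handler */

/* Model of the ARM-state path: clear FSR bits 11 and 10, then set
 * bit 11 again when the aborted instruction has L = 0. */
static inline uint32_t v6_early_abort_fsr_model(uint32_t fsr, uint32_t insn)
{
    fsr &= ~(FSR_WRITE | FSR_BIT10);
    if (!(insn & INSN_L_BIT))
        fsr |= FSR_WRITE;
    return fsr;
}
```

With a raw FSR of 0xc06 and an instruction word whose L bit is clear (for example a plain STR, 0xe5810000), the model returns 0x806: bit 10 is lost and bit 11 is set again from the instruction's L bit. With the L bit set (for example LDR, 0xe5910000) it returns 0x006.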



Laurent Desnogues
2008-12-12 09:07:48 UTC
On Fri, Dec 12, 2008 at 9:15 AM, Russell King - ARM Linux
Post by Russell King - ARM Linux
Post by Marc Singer
The data abort handler for v6 performs
bic r1, r1, #1 << 11 | 1 << 10 @ clear bits 11 and 10 of FSR
right after reading FSR. In the case of the MX31 implementation of
v6, I don't believe that this is necessary since the architecture
passes bit 10 as a significant part of the fault status. The 'RW' bit
(bit 11), as well, is implemented by the architecture, so it ought to
be OK to let it pass through, no?
In any case, when I remove this line the fault FSR changes to 0xc06
"imprecise external abort" which makes a heck of a lot more sense.
Russell, should this line be removed? Perhaps we should not clear bit
10 at a minimum?
No idea. George introduced this code, see below (stupidly long comment
line got truncated).
George, please comment.
I am not George and I don't know where this code is used, but
here are my 2 Euro cents anyway :)

If we enter v6_early_abort with an Imprecise External Abort then
the FAR value should not be used since it's not updated by
the hardware.

Ref: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211j/Bhcdiiih.html

As I previously wrote, if we get an Imprecise External Abort,
I'm afraid there's not much we can do to recover...

The comments in that routine say:

* Params : r2 = address of aborted instruction
-> This can be any address and totally unrelated to the abort
if that is an Imprecise External Abort
* Returns : r0 = address of abort
-> This will not contain any meaningful address if that is an
Imprecise External Abort

Again I don't know in what context this routine is used, so this
may not be applicable.

HTH,

Laurent

Catalin Marinas
2008-12-12 15:07:04 UTC
Post by Russell King - ARM Linux
No idea. George introduced this code, see below (stupidly long comment
line got truncated).
[...]
Post by Russell King - ARM Linux
+/*
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * The test below covers all the write situations, including Java bytecodes
+ */
+ movne pc, lr
+ do_thumb_abort
+ do_ldrd_abort
mov pc, lr
It seems that this is a workaround for erratum 326103, "FSR write bit
incorrect on a SWP to read-only memory", affecting ARM1136 (fixed in
r1p0):

The FSR bit[11] value indicates, after an abort, if the
instruction was performing a read or a write. A typical use of
this is to detect writes to read-only memory, to implement a
copy-on-write scheme. Because of this errata the FSR bit
indicates a read on a SWP that aborts because of a TLB generated
abort (even though the abort arose as a result of a write to
read-only memory).

Are there any r0p2 ARM1136 around? We could make this conditional (as I
have in my stable tree, the errata workarounds have to be explicitly
enabled).
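
A conditional workaround along those lines might look like the sketch below, written in C for readability. Everything here is an assumption for illustration: the function names, the run-time `enabled` flag standing in for a build-time option or a CPU revision check, and the choice to detect SWP from its instruction encoding. It only sets bit 11 for an aborting SWP and leaves every other FSR bit alone.

```c
#include <stdint.h>

#define FSR_WRITE (1u << 11)

/* SWP/SWPB encoding: cond 0001 0B00 Rn Rd 0000 1001 Rm.
 * The mask ignores the condition field, the B bit and the registers. */
static inline int insn_is_swp(uint32_t insn)
{
    return (insn & 0x0fb00ff0u) == 0x01000090u;
}

/* Hypothetical variant of the workaround: only when enabled, and only
 * for SWP, force the write bit. All other FSR bits are untouched. */
static inline uint32_t swp_erratum_fixup(uint32_t fsr, uint32_t insn, int enabled)
{
    if (enabled && insn_is_swp(insn))
        fsr |= FSR_WRITE;
    return fsr;
}
```

For example, with the workaround enabled, an FSR of 0x005 and the instruction word 0xe1020091 (SWP r0, r1, [r2]) yield 0x805, while a non-SWP instruction leaves the FSR exactly as the hardware reported it.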
--
Catalin


Russell King - ARM Linux
2008-12-12 16:31:25 UTC
Post by Catalin Marinas
Post by Russell King - ARM Linux
No idea. George introduced this code, see below (stupidly long comment
line got truncated).
[...]
Post by Russell King - ARM Linux
+/*
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * The test below covers all the write situations, including Java bytecodes
+ */
+ movne pc, lr
+ do_thumb_abort
+ do_ldrd_abort
mov pc, lr
It seems that this is a workaround for erratum 326103, "FSR write bit
incorrect on a SWP to read-only memory", affecting ARM1136 (fixed in
Look closer - not only is it fiddling with bit 11, but also bit 10.
To me that looks like a bug, but I want to get a comment on it from
the original author.


Marc Singer
2008-12-12 17:21:39 UTC
Post by Russell King - ARM Linux
Post by Catalin Marinas
Post by Russell King - ARM Linux
No idea. George introduced this code, see below (stupidly long comment
line got truncated).
[...]
Post by Russell King - ARM Linux
+/*
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * The test below covers all the write situations, including Java bytecodes
+ */
+ movne pc, lr
+ do_thumb_abort
+ do_ldrd_abort
mov pc, lr
It seems that this is a workaround for erratum 326103, "FSR write bit
incorrect on a SWP to read-only memory", affecting ARM1136 (fixed in
Look closer - not only is it fiddling with bit 11, but also bit 10.
To me that looks like a bug, but I want to get a comment on it from
the original author.
A google search isn't bringing up the errata for me. References?

The question is, indeed, whether or not we need to clear bit 10.
IMHO, it is hard to believe that it should always be cleared.

Mind you, things are not so peachy as I'd like. When I remove the
clearing of bit 10, my USB interface stops working. I'm looking into
it.

Marc Singer
2008-12-12 19:42:58 UTC
ARM doesn't seem to freely publish the Errata documents, though they are
available to customers via the usual data channels (whatever they are, I
think connect.arm.com). Anyway, I copied it below (from a PDF file) FYI.
326103: FSR write bit incorrect on a SWP to read-only memory
Status
Affects: product ARM1136J-S, ARM1136JF-S.
Fault status: Cat 2, Present in: r0p2, Fixed in r1p0.
Description
[deletia]

In the erratum, there is no mention of bit 10 being cleared. It would
appear that the presence of this operation has nothing to do with the
SWP instruction.

The abort-ev5tj.S implementation also clears bit 10, but there is no
indication as to why. abort-ev5t.S does not clear bit 10. Both of
the abort-ev4 implementations clear the bit.

I don't have the official ARMv4 TRM from ARM, so I don't know if
either of those architectures define bit 10. The ARM Architecture
Reference Manual does not mention it.

My hunch is that clearing this bit was a matter of hygiene on the v4
and v5 implementations and perhaps the v6 version is a misguided copy
of the v5tj code.

BTW, the problem I was having with USB was a red herring. Leaving
bit 10 intact does not prevent proper USB driver operation.


Russell King - ARM Linux
2008-12-12 19:55:35 UTC
Post by Marc Singer
My hunch is that clearing this bit was a matter of hygiene on the v4
and v5 implementations and perhaps the v6 version is a misguided copy
of the v5tj code.
Yes, for v4, bit 10 and 11 were not defined. I can't remember the
exact situation for v5, since Xscale has bit 10 and was Intel's own
v5 implementation.

The addition of bit 10 was made an architecture requirement, along
with the new R/W bit at bit 11.

I suspect that you are right that the v6 workaround from George is a
cut-n-paste from the v5tj code.

Since George hasn't replied yet to settle this matter, does anyone know
if George is still at Montavista? To which the question is: why has
George been dropped from the CC list?

Catalin Marinas
2008-12-15 15:24:06 UTC
Post by Russell King - ARM Linux
Post by Marc Singer
My hunch is that clearing this bit was a matter of hygiene on the v4
and v5 implementations and perhaps the v6 version is a misguided copy
of the v5tj code.
Yes, for v4, bit 10 and 11 were not defined. I can't remember the
exact situation for v5, since Xscale has bit 10 and was Intel's own
v5 implementation.
The addition of bit 10 was made an architecture requirement, along
with the new R/W bit at bit 11.
I suspect that you are right that the v6 workaround from George is a
cut-n-paste from the v5tj code.
Possibly, I have the same impression.
Post by Russell King - ARM Linux
Since George hasn't replied yet to settle this matter, does anyone know
if George is still at Montavista? To which the question is: why has
George been dropped from the CC list?
Last time I saw an e-mail from him (about a week ago), he was still at
MontaVista. I re-cc'ed him.
--
Catalin


George G. Davis
2009-02-24 04:33:18 UTC
Post by Catalin Marinas
Post by Russell King - ARM Linux
Post by Marc Singer
My hunch is that clearing this bit was a matter of hygiene on the v4
and v5 implementations and perhaps the v6 version is a misguided copy
of the v5tj code.
Yes, for v4, bits 10 and 11 were not defined. I can't remember the
exact situation for v5, since XScale has bit 10 and was Intel's own
v5 implementation.
The addition of bit 10 was made an architecture requirement, along
with the new R/W bit at bit 11.
I suspect that you are right that the v6 workaround from George is a
cut-n-paste from the v5tj code.
Possibly, I have the same impression.
Post by Russell King - ARM Linux
Since George hasn't replied yet to settle this matter, does anyone know
if George is still at Montavista? To which the question is: why has
George been dropped from the CC list?
Last time I saw an e-mail from him (about a week ago), he was still at
MontaVista. I re-cc'ed him.
Sorry guys, I missed this when it was initially posted due to a
"regional" power outage for several days. Then life just got in
the way of following up...

--
Regards,
George
Post by Catalin Marinas
--
Catalin
George G. Davis
2009-02-24 04:30:10 UTC
Permalink
Post by Russell King - ARM Linux
Post by Catalin Marinas
Post by Russell King - ARM Linux
No idea. George introduced this code, see below (stupidly long comment
line got truncated).
[...]
Post by Russell King - ARM Linux
+/*
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * The test below covers all the write situations, including Java bytecodes
+ */
+ movne pc, lr
+ do_thumb_abort
+ do_ldrd_abort
mov pc, lr
It seems that this is a workaround for erratum 326103, "FSR write bit
incorrect on a SWP to read-only memory", affecting ARM1136 (fixed in
Look closer - not only is it fiddling with bit 11, but also bit 10.
To me that looks like a bug, but I want to get a comment on it from
the original author.
As noted in the (full) commit header, I'm *_not_* the original author,
I merely forward ported an http://www.arm.com/linux/patch-2.6.9-arm1.gz
erratum fix which worked around the "Faulty SWP instruction on 1136".

--
Regards,
George

Woodruff, Richard
2008-12-12 16:49:22 UTC
Permalink
Sent: Friday, December 12, 2008 9:07 AM
It seems that this is a workaround for erratum 326103, "FSR write bit
incorrect on a SWP to read-only memory", affecting ARM1136 (fixed in
The FSR bit[11] value indicates, after an abort, if the
instruction was performing a read or a write. A typical use of
this is to detect writes to read-only memory, to implement a
copy-on-write scheme. Because of this errata the FSR bit
indicates a read on a SWP that aborts because of a TLB generated
abort (even though the abort arose
as a result of a write to read-only memory).
Are there any r0p2 ARM1136 around? We could make this conditional (as I
have in my stable tree, the errata workarounds have to be explicitly
enabled).
There are tens of millions of them in circulation running Linux. OMAP2
variants ramped to production using r0pX versions. There are still
active community boards (N810 for example) which use new kernels and
want this fix.

In 2004 I found and reported this issue when running LTP on an early
FPGA (the abort01 test failed).

To reproduce, revert the workaround and build the test with
arm-linux-gcc -static -g -lpthread -o bug abort01.c.

Regards,
Richard W.
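
To make the role of FSR[11] concrete, here is a minimal C sketch of how a
fault handler might use the bit. It is not kernel code: the enum, the
function name and the mapping_is_cow flag are hypothetical, and only the
meaning of bit 11 (set when the aborting access was a write) comes from
the erratum text quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FSR_WRITE (1u << 11)	/* DFSR[11]: aborting access was a write */

/* Hypothetical outcomes for a permission fault on a read-only page. */
enum fault_action {
	ACTION_COPY_ON_WRITE,	/* break COW and retry the access */
	ACTION_TREAT_AS_READ	/* handled as a read fault; cannot resolve a write */
};

/*
 * mapping_is_cow is a hypothetical flag: true when the page belongs to a
 * private writable mapping that is currently mapped read-only.
 */
static enum fault_action classify_permission_fault(uint32_t fsr,
						   bool mapping_is_cow)
{
	bool is_write = (fsr & FSR_WRITE) != 0;

	if (is_write && mapping_is_cow)
		return ACTION_COPY_ON_WRITE;
	return ACTION_TREAT_AS_READ;
}
```

With a correct FSR (for example 0x80f, a page permission fault with bit 11
set) the handler breaks copy-on-write. With the erratum, a faulting SWP
reports 0x00f instead, so the same fault is treated as a read, which is
consistent with the SEGVs or hangs described in the commit message.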


Catalin Marinas
2008-12-12 17:07:54 UTC
Permalink
Post by Woodruff, Richard
Sent: Friday, December 12, 2008 9:07 AM
It seems that this is a workaround for erratum 326103, "FSR write bit
incorrect on a SWP to read-only memory", affecting ARM1136 (fixed in
The FSR bit[11] value indicates, after an abort, if the
instruction was performing a read or a write. A typical use of
this is to detect writes to read-only memory, to implement a
copy-on-write scheme. Because of this errata the FSR bit
indicates a read on a SWP that aborts because of a TLB generated
abort (even though the abort arose
as a result of a write to read-only memory).
Are there any r0p2 ARM1136 around? We could make this conditional (as I
have in my stable tree, the errata workarounds have to be explicitly
enabled).
There are tens of millions of them in circulation running Linux. OMAP2
variants ramped to production using r0pX versions. There are still
active community boards (N810 for example) which use new kernels and
want this fix.
OK.

I think this code could also depend on ARM_OABI_COMPAT since EABI
binaries tend to use the kernel helpers without the SWP instruction (the
big exception here is Debian Lenny which still uses an old glibc with a
few SWP instructions).
--
Catalin


Russell King - ARM Linux
2008-12-12 19:10:15 UTC
Permalink
Post by Catalin Marinas
Post by Woodruff, Richard
There are tens of millions of them in circulation running Linux. OMAP2
variants ramped to production using r0pX versions. There are still
active community boards (N810 for example) which use new kernels and
want this fix.
OK.
I think this code could also depend on ARM_OABI_COMPAT since EABI
binaries tend to use the kernel helpers without the SWP instruction (the
big exception here is Debian Lenny which still uses an old glibc with a
few SWP instructions).
No. The *real* question which needs answering is:

Why the fsck is this workaround masking bit 10?

Not "do we need this workaround". So stop debating about making the
workaround conditional. The workaround needs looking into as it stands.
Whether the workaround is conditional is a totally separate issue
unrelated to what this thread is about.

Woodruff, Richard
2008-12-12 16:57:40 UTC
Permalink
Post by Woodruff, Richard
To reproduce revert workaround and use arm-linux-gcc -static -g -lpthread -o
bug abort01.c.
George G. Davis
2009-02-24 03:56:40 UTC
Permalink
Apologies for the delayed reply. Between an ice storm knocking me off
line for several days when this was initially posted, work, family
and life in general, I'm finally following up on this...
Post by Russell King - ARM Linux
Post by Marc Singer
The data abort handler for v6 performs
bic r1, r1, #1 << 11 | 1 << 10 @ clear bits 11 and 10 of FSR
right after reading FSR. In the case of the MX31 implementation of
v6, I don't believe that this is necessary since the architecture
passes bit 10 as a significant part of the fault status. The 'RW' bit
(bit 11), as well, is implemented by the architecture, so it ought to
be OK to let it pass through, no?
In any case, when I remove this line the fault FSR changes to 0xc06
"imprecise external abort" which makes a heck of a lot more sense.
Russell, should this line be removed? Perhaps we should not clear bit
10 at a minimum?
No idea. George introduced this code, see below (stupidly long comment
line got truncated).
Full commit header follows:

commit 3a1e501511a1e2c665c566939047794dcf86466b
Author: George G. Davis <***@com.rmk.(none)>
Date: Fri Apr 29 22:08:33 2005 +0100

[PATCH] ARM: 2655/1: ARM1136 SWP instruction abort handler fix

Patch from George G. Davis

As noted in http://www.arm.com/linux/patch-2.6.9-arm1.gz, the
"Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR." So the
v6_early_abort handler does not report the correct rd/wr direction for
the SWP instruction which may result in SEGVS or hangs. In order to work
around this problem, this patch merely updates the fix contained in the
ARM Ltd. patch to use the macroised abort handler fixups.

Signed-off-by: George G. Davis
Post by Russell King - ARM Linux
George, please comment.
So, "As noted in http://www.arm.com/linux/patch-2.6.9-arm1.gz":

filterdiff -z -i "*/abort-ev6.S" /home/ftp/mirror/www.arm.com/linux/patch-2.6.9-arm1.gz
--- a/arch/arm/mm/abort-ev6.S 2004-09-23 11:33:52.000000000 +0100
+++ b/arch/arm/mm/abort-ev6.S 2004-10-20 15:37:35.000000000 +0100
@@ -18,6 +18,21 @@
 ENTRY(v6_early_abort)
 	mrc	p15, 0, r1, c5, c0, 0		@ get FSR
 	mrc	p15, 0, r0, c6, c0, 0		@ get FAR
+/*
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * The test below covers all the write situations, including Java bytecodes
+ */
+#if 1
+	bic	r1, r1, #1 << 11 | 1 << 10	@ clear bits 11 and 10 of FSR
+	tst	r3, #PSR_J_BIT			@ Java?
+	movne	pc, lr
+	tst	r3, #PSR_T_BIT			@ Thumb?
+	ldrneh	r3, [r2]			@ read aborted thumb instruction
+	ldreq	r3, [r2]			@ read aborted ARM instruction
+	movne	r3, r3, lsl #(21 - 12)		@ move thumb bit 11 to ARM bit 20
+	tst	r3, #1 << 20			@ L = 0 -> write
+	orreq	r1, r1, #1 << 11		@ yes.
+#endif
 	mov	pc, lr
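
For readers who do not read ARM assembly daily, the bit 11 reconstruction
in the hunk above can be modelled roughly in C as follows. This is a
sketch under assumptions, not the kernel's code: the function name and
parameters are made up for illustration (psr stands for the saved PSR the
hunk tests in r3, insn for the instruction it loads through r2), and the
clearing of bit 10 is deliberately left out so that only the read/write
logic is shown.

```c
#include <assert.h>
#include <stdint.h>

#define PSR_T_BIT	(1u << 5)	/* CPSR Thumb state */
#define PSR_J_BIT	(1u << 24)	/* CPSR Jazelle (Java) state */
#define FSR_WRITE	(1u << 11)	/* DFSR[11]: write access */
#define ARM_L_BIT	(1u << 20)	/* ARM load/store bit: 0 = store */

/* Hypothetical C model of the hunk's fixup of FSR bit 11. */
static uint32_t fixup_fsr_write_bit(uint32_t fsr, uint32_t psr, uint32_t insn)
{
	fsr &= ~FSR_WRITE;		/* start from "read" */

	if (psr & PSR_J_BIT)
		return fsr;		/* Java state: return early, as movne pc, lr does */

	if (psr & PSR_T_BIT)
		insn <<= (21 - 12);	/* move Thumb bit 11 to ARM bit 20 */

	if (!(insn & ARM_L_BIT))
		fsr |= FSR_WRITE;	/* L = 0 -> write */

	return fsr;
}
```

As an example, the ARM encoding of "swp r0, r1, [r2]" is 0xE1020091, which
has bit 20 clear, so the model reports a write even when the hardware left
bit 11 at zero. The shift amount 21 - 12 is 9, the same value as 20 - 11.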



// CUT "incomplete/partial commit header" HERE
Post by Russell King - ARM Linux
diff --git a/arch/arm/mm/abort-ev6.S b/arch/arm/mm/abort-ev6.S
index 38b2cbb..8f76f3d 100644
--- a/arch/arm/mm/abort-ev6.S
+++ b/arch/arm/mm/abort-ev6.S
@@ -1,5 +1,6 @@
#include <linux/linkage.h>
#include <asm/assembler.h>
+#include "abort-macro.S"
/*
* Function: v6_early_abort
*
@@ -13,11 +14,26 @@
* : sp = pointer to registers
*
* Purpose : obtain information about current aborted instruction.
+ * Note: we read user space. This means we might cause a data
+ * abort here if the I-TLB and D-TLB aren't seeing the same
+ * picture. Unfortunately, this does happen. We live with it.
*/
.align 5
ENTRY(v6_early_abort)
+/*
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * The test below covers all the write situations, including Java bytecodes
+ */
+ movne pc, lr
+ do_thumb_abort
+ do_ldrd_abort
mov pc, lr
So, *_all_* I did here was to forward port "the ARM Ltd. patch to use the
macroised abort handler fixups" in v2.6.12-rc4. Guilty as charged.

--
Regards,
George

Jamie Lokier
2009-02-24 21:17:04 UTC
Permalink
Imho, "lsl #(20 - 11)" would be clearer here given the comment :-)

-- Jamie

George G. Davis
2009-02-25 15:24:32 UTC
Permalink
Hi,
Perhaps you didn't notice that the code snippet below was quoted
Post by Jamie Lokier
Imho, "lsl #(20 - 11)" would be clearer here given the comment :-)
The patch-2.6.9-arm1.gz patch hunk was posted to show the source
of the code for compare/contrast with what went upstream.

--
Regards,
George
Post by Jamie Lokier
-- Jamie
Jamie Lokier
2009-02-25 15:41:53 UTC
Permalink
Post by George G. Davis
The patch-2.6.9-arm1.gz patch hunk was posted to show the source
of the code for compare/contrast with what went upstream.
Oh! I didn't notice it was posted only to explain, not as a proposed
patch. Thanks for pointing it out.

-- Jamie

Catalin Marinas
2009-03-02 11:23:06 UTC
Permalink
Post by Russell King - ARM Linux
As noted in http://www.arm.com/linux/patch-2.6.9-arm1.gz, the
"Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR." So the
v6_early_abort handler does not report the correct rd/wr direction for
the SWP instruction which may result in SEGVS or hangs. In order to work
around this problem, this patch merely updates the fix contained in the
ARM Ltd. patch to use the macroised abort handler fixups.
Signed-off-by: George G. Davis
Post by Russell King - ARM Linux
George, please comment.
Ok, so we need Catalin to look at this.
I don't remember the history exactly but I'll have a look at this later
today.
--
Catalin


Russell King - ARM Linux
2009-03-01 17:41:53 UTC
Permalink
Post by George G. Davis
Apologies for the delayed reply. Between an ice storm knocking me off
line for several days when this was initially posted, work, family
and life in general, I'm finally following up on this...
Post by Russell King - ARM Linux
Post by Marc Singer
The data abort handler for v6 performs
bic r1, r1, #1 << 11 | 1 << 10 @ clear bits 11 and 10 of FSR
right after reading FSR. In the case of the MX31 implementation of
v6, I don't believe that this is necessary since the architecture
passes bit 10 as a significant part of the fault status. The 'RW' bit
(bit 11), as well, is implemented by the architecture, so it ought to
be OK to let it pass through, no?
In any case, when I remove this line the fault FSR changes to 0xc06
"imprecise external abort" which makes a heck of a lot more sense.
Russell, should this line be removed? Perhaps we should not clear bit
10 at a minimum?
No idea. George introduced this code, see below (stupidly long comment
line got truncated).
commit 3a1e501511a1e2c665c566939047794dcf86466b
Date: Fri Apr 29 22:08:33 2005 +0100
[PATCH] ARM: 2655/1: ARM1136 SWP instruction abort handler fix
Patch from George G. Davis
As noted in http://www.arm.com/linux/patch-2.6.9-arm1.gz, the
"Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR." So the
v6_early_abort handler does not report the correct rd/wr direction for
the SWP instruction which may result in SEGVS or hangs. In order to work
around this problem, this patch merely updates the fix contained in the
ARM Ltd. patch to use the macroised abort handler fixups.
Signed-off-by: George G. Davis
Post by Russell King - ARM Linux
George, please comment.
Ok, so we need Catalin to look at this.

Catalin Marinas
2009-03-03 13:33:26 UTC
Permalink
Post by Russell King - ARM Linux
commit 3a1e501511a1e2c665c566939047794dcf86466b
Date: Fri Apr 29 22:08:33 2005 +0100
[PATCH] ARM: 2655/1: ARM1136 SWP instruction abort handler fix
Patch from George G. Davis
As noted in http://www.arm.com/linux/patch-2.6.9-arm1.gz, the
"Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR." So the
v6_early_abort handler does not report the correct rd/wr direction for
the SWP instruction which may result in SEGVS or hangs. In order to work
around this problem, this patch merely updates the fix contained in the
ARM Ltd. patch to use the macroised abort handler fixups.
Signed-off-by: George G. Davis
Post by Russell King - ARM Linux
George, please comment.
Ok, so we need Catalin to look at this.
I couldn't find any reason why bit 10 of DFSR should be cleared on this
kernel. I think the patch was first created on the 2.4.x kernel (ARM
internal v6 support) and just copied to the 2.6.x one. Handling of
imprecise aborts (FSR[10]) was added by Russell, I think during the
2.5.x series (2002), but not in 2.4.

Anyway, here's a patch to correct this:


Do not clear bit 10 of DFSR during abort handling on ARMv6

From: Catalin Marinas <***@arm.com>

Because of an ARM1136 erratum (326103), the current v6_early_abort
function needs to set the correct FSR[11] value, which determines whether
the data abort was caused by a read or a write. For legacy reasons (bit 10
not handled by software), bit 10 was also cleared, masking out imprecise
aborts on ARMv6 CPUs. This patch removes the clearing of bit 10 of FSR.

Signed-off-by: Catalin Marinas <***@arm.com>
---
arch/arm/mm/abort-ev6.S | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/abort-ev6.S b/arch/arm/mm/abort-ev6.S
index 8a7f65b..c7b258d 100644
--- a/arch/arm/mm/abort-ev6.S
+++ b/arch/arm/mm/abort-ev6.S
@@ -28,10 +28,10 @@ ENTRY(v6_early_abort)
 	mrc	p15, 0, r1, c5, c0, 0		@ get FSR
 	mrc	p15, 0, r0, c6, c0, 0		@ get FAR
 /*
- * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR.
+ * Faulty SWP instruction on 1136 doesn't set bit 11 in DFSR (erratum 326103).
  * The test below covers all the write situations, including Java bytecodes
  */
-	bic	r1, r1, #1 << 11 | 1 << 10	@ clear bits 11 and 10 of FSR
+	bic	r1, r1, #1 << 11		@ clear bit 11 of FSR
 	tst	r3, #PSR_J_BIT			@ Java?
 	movne	pc, lr
 	do_thumb_abort
--
Catalin
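
The effect of the one-line change can be checked with a small C sketch
using the FSR value Marc reported earlier in the thread (0xc06). The
helper names are hypothetical, and fault_status() assumes the ARMv6 layout
in which the fault status is FS[3:0] from bits 3..0 with FS[4] taken from
bit 10; treat it as an illustration rather than the kernel's own decoding.

```c
#include <assert.h>
#include <stdint.h>

#define FSR_BIT10	(1u << 10)	/* FS[4], distinguishes imprecise aborts */
#define FSR_BIT11	(1u << 11)	/* read/write bit */

/* Mask applied before the patch: clears bits 11 and 10. */
static uint32_t mask_old(uint32_t fsr)
{
	return fsr & ~(FSR_BIT11 | FSR_BIT10);
}

/* Mask applied after the patch: clears bit 11 only. */
static uint32_t mask_new(uint32_t fsr)
{
	return fsr & ~FSR_BIT11;
}

/* Assumed 5-bit fault status: bits 3..0 plus bit 10 moved down to bit 4. */
static unsigned int fault_status(uint32_t fsr)
{
	return (fsr & 0xf) | ((fsr & FSR_BIT10) >> 6);
}
```

For 0xc06 the old mask leaves 0x006, fault status b00110, which is the
"access flag fault on page" entry from the table at the top of the thread.
The new mask leaves 0x406, fault status b10110, so the imprecise external
abort is no longer misreported.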

