Discussion:
[RFC PATCH 0/2] ARM: Fix unparseable signal frame with CONFIG_IWMMXT
Dave Martin
2017-06-21 15:46:01 UTC
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
signal frame can be left partially uninitialised in such a way
that userspace cannot parse uc_regspace[] safely. In particular,
this means that the VFP registers cannot be located reliably in the
signal frame when a multi_v7_defconfig kernel is run on the
majority of platforms.

I don't know whether any userspace has implemented any sort of
workaround for this, but the ABI by itself is insufficient anyway.

This series attempts to omit the spurious iWMMXt record when
appropriate.

Not extensively tested, and the ABI impact is unknown for now.

Dave Martin (2):
ARM: iwmmxt: Add missing __user annotations to sigframe accessors
ARM: signal: Remove unparseable iwmmxt_sigframe from uc_regspace[]

arch/arm/include/asm/ucontext.h | 20 ----------------
arch/arm/kernel/signal.c | 52 +++++++++++++++++++++++++++--------------
2 files changed, 35 insertions(+), 37 deletions(-)
--
2.1.4
Dave Martin
2017-06-21 15:46:02 UTC
preserve_iwmmxt_context() and restore_iwmmxt_context() lack __user
annotations on their arguments pointing to the user signal frame.

There does not appear to be a bug here, but this omission is
inconsistent with the crunch and vfp sigframe access functions.

This patch adds the annotations, for consistency.

Signed-off-by: Dave Martin <***@arm.com>
---
arch/arm/kernel/signal.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 7b8f214..8f06480 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -59,7 +59,7 @@ static int restore_crunch_context(struct crunch_sigframe __user *frame)

#ifdef CONFIG_IWMMXT

-static int preserve_iwmmxt_context(struct iwmmxt_sigframe *frame)
+static int preserve_iwmmxt_context(struct iwmmxt_sigframe __user *frame)
{
char kbuf[sizeof(*frame) + 8];
struct iwmmxt_sigframe *kframe;
@@ -72,7 +72,7 @@ static int preserve_iwmmxt_context(struct iwmmxt_sigframe *frame)
return __copy_to_user(frame, kframe, sizeof(*frame));
}

-static int restore_iwmmxt_context(struct iwmmxt_sigframe *frame)
+static int restore_iwmmxt_context(struct iwmmxt_sigframe __user *frame)
{
char kbuf[sizeof(*frame) + 8];
struct iwmmxt_sigframe *kframe;
--
2.1.4
Dave Martin
2017-06-21 15:46:03 UTC
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
signal frame can be left partially uninitialised in such a way
that userspace cannot parse uc_regspace[] safely. In particular,
this means that the VFP registers cannot be located reliably in the
signal frame when a multi_v7_defconfig kernel is run on the
majority of platforms.

The cause is that the uc_regspace[] is laid out statically based on
the kernel config, but the decision of whether to save/restore the
iWMMXt registers must be a runtime decision. There is no obvious
way to pad the hole left when the iWMMXt registers are not saved,
because there is no dummy record type that we can rely on userspace
to ignore, and no clear semantics for what an iwmmxt_sigframe
record is supposed to mean if the hardware doesn't support iWMMXt.

One option would be to write the magic and size for
iwmmxt_sigframe, and leave the body uninitialised or fill it with
zeros or deadc0de. But this may confuse userspace if it is taken
as evidence that iWMMXt is present.

However, there seems to be a clear design intention that the
records in uc_regspace[] be parsed as a tagged list, with the
parser sequentially examining the magic number in each record and
using the size field to step to the next record until a record with
null magic is found.

So, instead of placing each record at a fixed offset in
uc_regspace[], this patch only advances the offset for a record
that is actually populated. Later records following an unpopulated
record will shift to lower offsets in uc_regspace[] as a result.

This change causes the fixed-layout definition of struct
aux_sigframe to become useless. Since it is not present in a uapi
header, this patch simply removes the definition.

These changes are not expected to break ABI except for VFP-aware
software that has been explicitly hacked to work around this issue
on CONFIG_IWMMXT=y kernels, which is unlikely to be a common case
and would obviously violate the design intent of the arm signal
frame.

There is no clear solution that definitely does not break ABI.

Reported-by: Edmund Grimley-Evans <Edmund.Grimley-***@arm.com>
Signed-off-by: Dave Martin <***@arm.com>
---
arch/arm/include/asm/ucontext.h | 20 ----------------
arch/arm/kernel/signal.c | 52 +++++++++++++++++++++++++++--------------
2 files changed, 35 insertions(+), 37 deletions(-)

diff --git a/arch/arm/include/asm/ucontext.h b/arch/arm/include/asm/ucontext.h
index 14749ae..664b611 100644
--- a/arch/arm/include/asm/ucontext.h
+++ b/arch/arm/include/asm/ucontext.h
@@ -77,26 +77,6 @@ struct vfp_sigframe

#endif /* CONFIG_VFP */

-/*
- * Auxiliary signal frame. This saves stuff like FP state.
- * The layout of this structure is not part of the user ABI,
- * because the config options aren't. uc_regspace is really
- * one of these.
- */
-struct aux_sigframe {
-#ifdef CONFIG_CRUNCH
- struct crunch_sigframe crunch;
-#endif
-#ifdef CONFIG_IWMMXT
- struct iwmmxt_sigframe iwmmxt;
-#endif
-#ifdef CONFIG_VFP
- struct vfp_sigframe vfp;
-#endif
- /* Something that isn't a valid magic number for any coprocessor. */
- unsigned long end_magic;
-} __attribute__((__aligned__(8)));
-
#endif

#endif /* !_ASMARM_UCONTEXT_H */
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 8f06480..024d9aa 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -27,8 +27,10 @@ extern const unsigned long sigreturn_codes[7];
static unsigned long signal_return_offset;

#ifdef CONFIG_CRUNCH
-static int preserve_crunch_context(struct crunch_sigframe __user *frame)
+static int preserve_crunch_context(char __user **auxp)
{
+ struct crunch_sigframe __user *frame =
+ (struct crunch_sigframe __user *)*auxp;
char kbuf[sizeof(*frame) + 8];
struct crunch_sigframe *kframe;

@@ -36,12 +38,15 @@ static int preserve_crunch_context(struct crunch_sigframe __user *frame)
kframe = (struct crunch_sigframe *)((unsigned long)(kbuf + 8) & ~7);
kframe->magic = CRUNCH_MAGIC;
kframe->size = CRUNCH_STORAGE_SIZE;
+ *auxp += CRUNCH_STORAGE_SIZE;
crunch_task_copy(current_thread_info(), &kframe->storage);
return __copy_to_user(frame, kframe, sizeof(*frame));
}

-static int restore_crunch_context(struct crunch_sigframe __user *frame)
+static int restore_crunch_context(char __user **auxp)
{
+ struct crunch_sigframe __user *frame =
+ (struct crunch_sigframe __user *)*auxp;
char kbuf[sizeof(*frame) + 8];
struct crunch_sigframe *kframe;

@@ -52,6 +57,7 @@ static int restore_crunch_context(struct crunch_sigframe __user *frame)
if (kframe->magic != CRUNCH_MAGIC ||
kframe->size != CRUNCH_STORAGE_SIZE)
return -1;
+ *auxp += CRUNCH_STORAGE_SIZE;
crunch_task_restore(current_thread_info(), &kframe->storage);
return 0;
}
@@ -59,8 +65,10 @@ static int restore_crunch_context(struct crunch_sigframe __user *frame)

#ifdef CONFIG_IWMMXT

-static int preserve_iwmmxt_context(struct iwmmxt_sigframe __user *frame)
+static int preserve_iwmmxt_context(char __user **auxp)
{
+ struct iwmmxt_sigframe __user *frame =
+ (struct iwmmxt_sigframe __user *)*auxp;
char kbuf[sizeof(*frame) + 8];
struct iwmmxt_sigframe *kframe;

@@ -68,12 +76,15 @@ static int preserve_iwmmxt_context(struct iwmmxt_sigframe __user *frame)
kframe = (struct iwmmxt_sigframe *)((unsigned long)(kbuf + 8) & ~7);
kframe->magic = IWMMXT_MAGIC;
kframe->size = IWMMXT_STORAGE_SIZE;
+ *auxp += IWMMXT_STORAGE_SIZE;
iwmmxt_task_copy(current_thread_info(), &kframe->storage);
return __copy_to_user(frame, kframe, sizeof(*frame));
}

-static int restore_iwmmxt_context(struct iwmmxt_sigframe __user *frame)
+static int restore_iwmmxt_context(char __user **auxp)
{
+ struct iwmmxt_sigframe __user *frame =
+ (struct iwmmxt_sigframe __user *)*auxp;
char kbuf[sizeof(*frame) + 8];
struct iwmmxt_sigframe *kframe;

@@ -84,6 +95,7 @@ static int restore_iwmmxt_context(struct iwmmxt_sigframe __user *frame)
if (kframe->magic != IWMMXT_MAGIC ||
kframe->size != IWMMXT_STORAGE_SIZE)
return -1;
+ *auxp += IWMMXT_STORAGE_SIZE;
iwmmxt_task_restore(current_thread_info(), &kframe->storage);
return 0;
}
@@ -92,14 +104,17 @@ static int restore_iwmmxt_context(struct iwmmxt_sigframe __user *frame)

#ifdef CONFIG_VFP

-static int preserve_vfp_context(struct vfp_sigframe __user *frame)
+static int preserve_vfp_context(char __user **auxp)
{
+ struct vfp_sigframe __user *frame =
+ (struct vfp_sigframe __user *)*auxp;
const unsigned long magic = VFP_MAGIC;
const unsigned long size = VFP_STORAGE_SIZE;
int err = 0;

__put_user_error(magic, &frame->magic, err);
__put_user_error(size, &frame->size, err);
+ *auxp += size;

if (err)
return -EFAULT;
@@ -107,8 +122,10 @@ static int preserve_vfp_context(struct vfp_sigframe __user *frame)
return vfp_preserve_user_clear_hwstate(&frame->ufp, &frame->ufp_exc);
}

-static int restore_vfp_context(struct vfp_sigframe __user *frame)
+static int restore_vfp_context(char __user **auxp)
{
+ struct vfp_sigframe __user *frame =
+ (struct vfp_sigframe __user *)*auxp;
unsigned long magic;
unsigned long size;
int err = 0;
@@ -121,6 +138,7 @@ static int restore_vfp_context(struct vfp_sigframe __user *frame)
if (magic != VFP_MAGIC || size != VFP_STORAGE_SIZE)
return -EINVAL;

+ *auxp += size;
return vfp_restore_user_hwstate(&frame->ufp, &frame->ufp_exc);
}

@@ -141,7 +159,7 @@ struct rt_sigframe {

static int restore_sigframe(struct pt_regs *regs, struct sigframe __user *sf)
{
- struct aux_sigframe __user *aux;
+ char __user *aux;
sigset_t set;
int err;

@@ -169,18 +187,18 @@ static int restore_sigframe(struct pt_regs *regs, struct sigframe __user *sf)

err |= !valid_user_regs(regs);

- aux = (struct aux_sigframe __user *) sf->uc.uc_regspace;
+ aux = (char __user *) sf->uc.uc_regspace;
#ifdef CONFIG_CRUNCH
if (err == 0)
- err |= restore_crunch_context(&aux->crunch);
+ err |= restore_crunch_context(&aux);
#endif
#ifdef CONFIG_IWMMXT
if (err == 0 && test_thread_flag(TIF_USING_IWMMXT))
- err |= restore_iwmmxt_context(&aux->iwmmxt);
+ err |= restore_iwmmxt_context(&aux);
#endif
#ifdef CONFIG_VFP
if (err == 0)
- err |= restore_vfp_context(&aux->vfp);
+ err |= restore_vfp_context(&aux);
#endif

return err;
@@ -252,7 +270,7 @@ asmlinkage int sys_rt_sigreturn(struct pt_regs *regs)
static int
setup_sigframe(struct sigframe __user *sf, struct pt_regs *regs, sigset_t *set)
{
- struct aux_sigframe __user *aux;
+ char __user *aux;
int err = 0;

__put_user_error(regs->ARM_r0, &sf->uc.uc_mcontext.arm_r0, err);
@@ -280,20 +298,20 @@ setup_sigframe(struct sigframe __user *sf, struct pt_regs *regs, sigset_t *set)

err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));

- aux = (struct aux_sigframe __user *) sf->uc.uc_regspace;
+ aux = (char __user *) sf->uc.uc_regspace;
#ifdef CONFIG_CRUNCH
if (err == 0)
- err |= preserve_crunch_context(&aux->crunch);
+ err |= preserve_crunch_context(&aux);
#endif
#ifdef CONFIG_IWMMXT
if (err == 0 && test_thread_flag(TIF_USING_IWMMXT))
- err |= preserve_iwmmxt_context(&aux->iwmmxt);
+ err |= preserve_iwmmxt_context(&aux);
#endif
#ifdef CONFIG_VFP
if (err == 0)
- err |= preserve_vfp_context(&aux->vfp);
+ err |= preserve_vfp_context(&aux);
#endif
- __put_user_error(0, &aux->end_magic, err);
+ __put_user_error(0, (unsigned long __user *)aux, err);

return err;
}
--
2.1.4
Russell King - ARM Linux
2017-06-26 10:13:04 UTC
Post by Dave Martin
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
signal frame can be left partially uninitialised in such a way
that userspace cannot parse uc_regspace[] safely. In particular,
this means that the VFP registers cannot be located reliably in the
signal frame when a multi_v7_defconfig kernel is run on the
majority of platforms.
I don't know whether any userspace has implemented any sort of
workaround for this, but the ABI by itself is insufficient anyway.
This series attempts to omit the spurious iWMMXt record when
appropriate.
Not extensively tested, and the ABI impact is unknown for now.
Hmm, I would actually suggest that we poke in a correct size for the
missing iWMMXt record, and an invalid magic number as the "simple"
solution for this - that doesn't make any layout changes to the
data structures, and is probably the safest solution for backporting.

Going forward, I think something along the lines of your proposal is
okay.
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Dave Martin
2017-06-26 13:32:56 UTC
Post by Russell King - ARM Linux
Post by Dave Martin
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
signal frame can be left partially uninitialised in such a way
that userspace cannot parse uc_regspace[] safely. In particular,
this means that the VFP registers cannot be located reliably in the
signal frame when a multi_v7_defconfig kernel is run on the
majority of platforms.
I don't know whether any userspace has implemented any sort of
workaround for this, but the ABI by itself is insufficient anyway.
This series attempts to omit the spurious iWMMXt record when
appropriate.
Not extensively tested, and the ABI impact is unknown for now.
Hmm, I would actually suggest that we poke in a correct size for the
missing iWMMXt record, and an invalid magic number as the "simple"
solution for this - that doesn't make any layout changes to the
data structures, and is probably the safest solution for backporting.
This avoids altering the sigframe layout at all in this case, which
feels less disruptive, but overall I'm not sure it's lower-risk.

I'm concerned that there are some userspace sigframe parsers out there
that work only by accident, especially given that the kernel sigreturn
implementation is the primary example and that doesn't need to be fully
robust (since the kernel lays out the sigframe itself during signal
delivery).
Post by Russell King - ARM Linux
Going forward, I think something along the lines of your proposal is
okay.
I'm happy to do either, or propose one approach for stable and the other
for mainline, but it's hard to know which is least likely to break
userspace, or exactly what the ABI is.

Cheers
---Dave
Russell King - ARM Linux
2017-06-26 14:40:01 UTC
Post by Dave Martin
Post by Russell King - ARM Linux
Hmm, I would actually suggest that we poke in a correct size for the
missing iWMMXt record, and an invalid magic number as the "simple"
solution for this - that doesn't make any layout changes to the
data structures, and is probably the safest solution for backporting.
This avoids altering the sigframe layout at all in this case, which
feels less disruptive, but overall I'm not sure it's lower-risk.
I'm concerned that there are some userspace sigframe parsers out there
that work only by accident, especially given that the kernel sigreturn
implementation is the primary example and that doesn't need to be fully
robust (since the kernel lays out the sigframe itself during signal
delivery).
I'd hope that the kernel implementation is not used as an example - it
most certainly is not an example, as it does no parsing of the data
structures. As the kernel is responsible for creating the layout, it
expects the exact same layout coming back in, and any deviation from
that results in the task being forcefully exited.

Userspace doesn't have the luxury of prior knowledge of the layout -
it doesn't know how the kernel is configured. It can't assume (eg)
that VFP will be at 0xa0 bytes in if IWMMXT but not CRUNCH is enabled.

Basically, the layout that the kernel creates is entirely dependent on
the kernel configuration, and any scheme that replicates what the kernel
is doing in the restore paths is doomed to failure. (However, that's
not to say userspace isn't, but if it is, userspace breaks if the kernel
configuration is changed. I don't regard that as a kernel-induced
userspace regression though - it's a bit like expecting EABI userspace
to work with OABI-only supporting kernel.)

Now, the possibilities for userspace to parse the "broken" kernel layout
for VFP information are:

1. To use a fixed offset from the start (which means it breaks if the
kernel is reconfigured.)

2. userspace checks several fixed offsets for the VFP identifier (at 0,
0xa0, 0xc0 or 0x160). That's risky if the other state happens to
contain a word that looks like the VFP identifier.
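
To make option 2 concrete, here is a minimal userspace sketch of the
"probe several fixed offsets" approach. The magic value is illustrative
(check the kernel's asm/ucontext.h for the real one), and the names
probe_vfp_offset() and load_u32() are made up for this example. The
candidate offsets are the ones listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_VFP_MAGIC 0x56465001u	/* illustrative; see asm/ucontext.h */

/* Candidate offsets for the VFP record, as listed in option 2. */
static const size_t vfp_candidates[] = { 0, 0xa0, 0xc0, 0x160 };

/* Read a 32-bit word without assuming alignment. */
static uint32_t load_u32(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Return the first candidate offset whose word equals the VFP magic,
 * or -1 if none does.  Nothing here proves that a match is a real
 * record header rather than stale or unrelated data.
 */
static long probe_vfp_offset(const unsigned char *regspace, size_t len)
{
	size_t i;

	assert(regspace != NULL);
	for (i = 0; i < sizeof(vfp_candidates) / sizeof(vfp_candidates[0]); i++) {
		size_t off = vfp_candidates[i];

		if (off + sizeof(uint32_t) > len)
			continue;	/* candidate lies outside the buffer */
		if (load_u32(regspace + off) == EX_VFP_MAGIC)
			return (long)off;
	}
	return -1;
}
```

The weakness is visible in the loop: the first matching word wins, so a
stale word at a lower offset that happens to equal the magic hides the
real record.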

If userspace is using the proper method that the original code intended,
userspace would hit uninitialised memory for the iWMMXT block identifier
and size (they'd see stale data on the stack) and if they interpret the
"size" field to try and skip over it, they could end up anywhere in
memory space.

Fixing it using your approach would mean that the VFP block ends up at
a variable location depending on whether the iWMMXT state was saved -
which certainly breaks (1) in a way that does not depend on kernel
configuration. (2) survives as they'd find the identifier whatever
happens.
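
As a rough userspace model of that approach (not the kernel code
itself), the cursor-advancing layout can be sketched as follows. The
iWMMXt size is inferred from the 0xa0 offset mentioned above; the VFP
record size and the names emit_record() and build_compact() are
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative values; check asm/ucontext.h for the real definitions. */
#define EX_IWMMXT_MAGIC	0x12ef842au
#define EX_IWMMXT_SIZE	0xa0u
#define EX_VFP_MAGIC	0x56465001u
#define EX_VFP_SIZE	0x120u		/* hypothetical record size */

/* Write a record header (magic, size) at *cursor and step past the record. */
static void emit_record(unsigned char **cursor, uint32_t magic, uint32_t size)
{
	assert(cursor != NULL && *cursor != NULL);
	memcpy(*cursor, &magic, sizeof(magic));
	memcpy(*cursor + sizeof(magic), &size, sizeof(size));
	*cursor += size;	/* only populated records move the cursor */
}

/*
 * Lay the records out back to back and return the offset at which the
 * VFP record was placed.  buf must have room for both records plus the
 * terminating zero word.
 */
static size_t build_compact(unsigned char *buf, int using_iwmmxt)
{
	unsigned char *cursor = buf;
	uint32_t end_magic = 0;
	size_t vfp_off;

	if (using_iwmmxt)
		emit_record(&cursor, EX_IWMMXT_MAGIC, EX_IWMMXT_SIZE);

	vfp_off = (size_t)(cursor - buf);
	emit_record(&cursor, EX_VFP_MAGIC, EX_VFP_SIZE);

	memcpy(cursor, &end_magic, sizeof(end_magic));	/* list terminator */
	return vfp_off;
}
```

In this model build_compact() places the VFP record at offset 0 when
iWMMXt state is not saved and at 0xa0 when it is, which is the runtime
variability that breaks a fixed-offset reader.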

My proposal solves all three cases, because userspace ends up with a
correct size for what is an unknown block of code, and doesn't involve
moving anything around. It shouldn't break the "correct" parsing that
userspace should be doing either, because it should skip over the
unknown block. The only case that it would break is if the identifier
were to somehow match, and I think the chances of that are very slim.
So I believe (without evidence to the contrary) that this would be the
lowest risk.
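
A minimal sketch of what that could look like, written as a standalone
userspace model rather than kernel code. The dummy tag value and the
names fill_iwmmxt_slot() and next_record_offset() are hypothetical, and
zeroing the unused body is one possible choice that this sketch makes,
not something the proposal specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_IWMMXT_MAGIC	0x12ef842au	/* illustrative; see asm/ucontext.h */
#define EX_DUMMY_MAGIC	0xb0d9ed01u	/* hypothetical "unknown block" tag */
#define EX_IWMMXT_SIZE	0xa0u		/* slot size, from the 0xa0 offset above */

/*
 * Fill in the header of the iWMMXt slot.  The slot always has the same
 * size; only the tag says whether it carries real register state.
 */
static void fill_iwmmxt_slot(unsigned char *slot, int using_iwmmxt)
{
	uint32_t magic = using_iwmmxt ? EX_IWMMXT_MAGIC : EX_DUMMY_MAGIC;
	uint32_t size = EX_IWMMXT_SIZE;

	assert(slot != NULL);
	memcpy(slot, &magic, sizeof(magic));
	memcpy(slot + sizeof(magic), &size, sizeof(size));
	if (!using_iwmmxt)
		memset(slot + 8, 0, EX_IWMMXT_SIZE - 8);	/* no stale stack data */
}

/* Distance from this record's header to the next one, per its size field. */
static size_t next_record_offset(const unsigned char *slot)
{
	uint32_t size;

	memcpy(&size, slot + sizeof(uint32_t), sizeof(size));
	return size;
}
```

Either way the next record starts 0xa0 bytes after this one, so a
list-walking parser and a fixed-offset reader both keep working.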
Post by Dave Martin
Post by Russell King - ARM Linux
Going forward, I think something along the lines of your proposal is
okay.
I'm happy to do either, or propose one approach for stable and the other
for mainline, but it's hard to know which is least likely to break
userspace, or exactly what the ABI is.
The intended ABI is a tagged list, where the list headers are made up
of an identifier and a size (where the size gives the offset from the
start of this block to the next - iow, from the address of the
identifier.) The aux_sigframe structure is there as a convenience to
the kernel (which is why it's not in uapi/).
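
For illustration, a userspace walker for such a tagged list might look
roughly like the sketch below. The struct and function names are
hypothetical and the magic value is illustrative; the bounds checks are
an addition of this sketch, since a parser cannot trust the size field
blindly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header shared by every record: an identifier followed by a size.
 * The size is the distance from this header to the next one. */
struct rec_hdr {
	uint32_t magic;
	uint32_t size;
};

#define EX_VFP_MAGIC 0x56465001u	/* illustrative; see asm/ucontext.h */

/*
 * Walk the tagged list in buf[0..len) and return the offset of the
 * first record whose magic equals `wanted`, or -1 if the terminator
 * (magic 0) or a malformed record is reached first.
 */
static long find_record(const unsigned char *buf, size_t len, uint32_t wanted)
{
	size_t off = 0;

	assert(buf != NULL);
	while (len - off >= sizeof(struct rec_hdr)) {
		struct rec_hdr h;

		memcpy(&h, buf + off, sizeof(h));	/* avoid unaligned access */
		if (h.magic == 0)
			return -1;			/* end of list */
		if (h.magic == wanted)
			return (long)off;
		/* A record must cover its own header and stay inside the buffer. */
		if (h.size < sizeof(h) || h.size > len - off)
			return -1;
		off += h.size;				/* skip unknown record */
	}
	return -1;
}
```

A caller would pass the start of uc_regspace[] and its length, then
interpret the record found at the returned offset.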

The interface was created by Daniel Jacobowitz from Codesourcery, and I
believe Daniel was working on the userspace side at the same time, so I
would hope that the userspace side does proper parsing - except for one
issue - when Daniel was working on it, we weren't saving VFP state
across signal handlers.

The issue that you've found looks to have been there since the original
design back in 2006.
Dave Martin
2017-06-26 16:36:39 UTC
Post by Russell King - ARM Linux
Post by Dave Martin
Post by Russell King - ARM Linux
Hmm, I would actually suggest that we poke in a correct size for the
missing iWMMXt record, and an invalid magic number as the "simple"
solution for this - that doesn't make any layout changes to the
data structures, and is probably the safest solution for backporting.
This avoids altering the sigframe layout at all in this case, which
feels less disruptive, but overall I'm not sure it's lower-risk.
I'm concerned that there are some userspace sigframe parsers out there
that work only by accident, especially given that the kernel sigreturn
implementation is the primary example and that doesn't need to be fully
robust (since the kernel lays out the sigframe itself during signal
delivery).
I'd hope that the kernel implementation is not used as an example - it
most certainly is not an example, as it does no parsing of the data
structures. As the kernel is responsible for creating the layout, it
expects the exact same layout coming back in, and any deviation from
that results in the task being forcefully exited.
Unfortunately, things that are not intended as examples do still get
used. We can argue that's the userspace folks' fault, but it still
creates de facto ABI...
Post by Russell King - ARM Linux
Userspace doesn't have the luxury of prior knowledge of the layout -
it doesn't know how the kernel is configured. It can't assume (eg)
that VFP will be at 0xa0 bytes in if IWMMXT but not CRUNCH is enabled.
Agreed
Post by Russell King - ARM Linux
Basically, the layout that the kernel creates is entirely dependent on
the kernel configuration, and any scheme that replicates what the kernel
is doing in the restore paths is doomed to failure. (However, that's
not to say userspace isn't, but if it is, userspace breaks if the kernel
configuration is changed. I don't regard that as a kernel-induced
userspace regression though - it's a bit like expecting EABI userspace
to work with OABI-only supporting kernel.)
I'm actually a little confused by, say,

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/arm/setcontext.S;h=db6aebfbd4d360e3b7ba525cf2e483f8e3ddfc0d;hb=HEAD

Assuming I'm looking in the right place here, glibc effectively uses its
own private format for uc_regspace -- maybe there is kernel history
here I'm not aware of, or maybe it's not even trying to be compatible.


Also, libunwind does not appear to attempt to parse uc_regspace:

git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/arm/Gstep.c;h=37e6c12f115173ebbc9ebcf511c53fd7c0a7d9a1;hb=HEAD


I've not fully understood the gdb code, but there is a comment in
arm-linux-tdep.c that suggests that uc_regspace is not processed (nor do
I see any other mention of uc_regspace or things like VFP_MAGIC):

https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/arm-linux-tdep.c;h=95c52608adbb1ff92a9ddb203835d5a1102339bd;hb=HEAD

/* The VFP or iWMMXt registers may be saved on the stack, but there's
no reliable way to restore them (yet). */


Do you know of any userspace parser of uc_regspace?

All I have so far is this, from the reporter of the bug:

https://github.com/DynamoRIO/dynamorio/commit/0b75c635033d01ab04f955f5affe14a3ced9ab56
Post by Russell King - ARM Linux
Now, the possibilities for userspace to parse the "broken" kernel layout
1. To use a fixed offset from the start (which means it breaks if the
kernel is reconfigured.)
2. userspace checks several fixed offsets for the VFP identifier (at 0,
0xa0, 0xc0 or 0x160). That's risky if the other state happens to
contain a word that looks like the VFP identifier.
If userspace is using the proper method that the original code intended,
userspace would hit uninitialised memory for the iWMMXT block identifier
and size (they'd see stale data on the stack) and if they interpret the
"size" field to try and skip over it, they could end up anywhere in
memory space.
Fixing it using your approach would mean that the VFP block ends up at
a variable location depending on whether the iWMMXT state was saved -
which certainly breaks (1) in a way that does not depend on kernel
configuration. (2) survives as they'd find the identifier whatever
happens.
My proposal solves all three cases, because userspace ends up with a
correct size for what is an unknown block of code, and doesn't involve
moving anything around. It shouldn't break the "correct" parsing that
userspace should be doing either, because it should skip over the
unknown block. The only case that it would break is if the identifier
were to somehow match, and I think the chances of that are very slim.
So I believe (without evidence to the contrary) that this would be the
lowest risk.
This seems reasonable.

Although I'm concerned that parsers may give up as soon as they see an
unknown block, that is at least a clean failure, and is better than
wandering off into the weeds. I haven't seen evidence of such a parser
existing, so far -- there will be ones I don't know about.
Post by Russell King - ARM Linux
Post by Dave Martin
Post by Russell King - ARM Linux
Going forward, I think something along the lines of your proposal is
okay.
I'm happy to do either, or propose one approach for stable and the other
for mainline, but it's hard to know which is least likely to break
userspace, or exactly what the ABI is.
The intended ABI is a tagged list, where the list headers are made up
of an identifier and a size (where the size gives the offset from the
start of this block to the next - iow, from the address of the
identifier.) The aux_sigframe structure is there as a convenience to
the kernel (which is why it's not in uapi/).
The interface was created by Daniel Jacobowitz from Codesourcery, and I
believe Daniel was working on the userspace side at the same time, so I
would hope that the userspace side does proper parsing - except for one
issue - when Daniel was working on it, we weren't saving VFP state
across signal handlers.
The issue that you've found looks to have been there since the original
design back in 2006.
Looks like it.

I'll update the patch to preserve the iWMMXt block on signal delivery
but define a new dummy tag for this case.


Should we enforce the same on sigreturn, or be more tolerant?

One issue is that I believe there is software out there (ab)using
sigreturn to do an atomic siglongjmp type operation.

There is some merit to this, since the effect cannot be achieved 100%
safely in any other way. However, it may require the caller to
manufacture a sigframe from scratch. If so, it may be natural to
omit the IWMMXT block (and indeed the VFP block, if the caller
doesn't care what's in the VFP registers at the destination).

The DynamoRIO example above takes a signal to generate a "template"
sigframe, which is then modified to produce the desired result.
Putting aside the issue of whether this is an abuse of sigreturn
or not (and the question of why they are doing it at all), this
seems a reasonable approach -- which they also apparently use for
x86. So their sigframe will contain the dummy iWMMXt block, but
it will have a valid tag if we patch the kernel to write one.

Other projects may not be so lucky if they don't use a delivered
signal as a template in this way.

Cheers
---Dave
Russell King - ARM Linux
2017-06-26 18:12:32 UTC
Post by Dave Martin
Post by Russell King - ARM Linux
I'd hope that the kernel implementation is not used as an example - it
most certainly is not an example, as it does no parsing of the data
structures. As the kernel is responsible for creating the layout, it
expects the exact same layout coming back in, and any deviation from
that results in the task being forcefully exited.
Unfortunately, things that are not intended as examples do still get
used. We can argue that's the userspace folks' fault, but it still
creates de facto ABI...
Given that the contents of the structure depend on kernel configuration
symbols, it's impossible for userspace to use it unless they also have
some kind of static configuration as well.
Post by Dave Martin
Post by Russell King - ARM Linux
Basically, the layout that the kernel creates is entirely dependent on
the kernel configuration, and any scheme that replicates what the kernel
is doing in the restore paths is doomed to failure. (However, that's
not to say userspace isn't, but if it is, userspace breaks if the kernel
configuration is changed. I don't regard that as a kernel-induced
userspace regression though - it's a bit like expecting EABI userspace
to work with OABI-only supporting kernel.)
The kernel gained the tagged-list approach in 2006, and didn't start
preserving the VFP state until 2010.
Post by Dave Martin
I'm actually a little confused by, say,
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/arm/setcontext.S;h=db6aebfbd4d360e3b7ba525cf2e483f8e3ddfc0d;hb=HEAD
Assuming I'm looking in the right place here, glibc effectively uses its
own private format for uc_regspace -- maybe there is kernel history
here I'm not aware of, or maybe it's not even trying to be compatible.
It looks to me like glibc is expecting:

- If the HWCAP includes VFP
- 64 bytes of d8-d15 registers
- fpscr
- If the HWCAP includes iWMMXT
- 48 bytes of iWMMXT state

The kernel has never used that (partial!) format - note that it seems
to omit d0-d7 from the context.

Given that setcontext()'s man page says:

The function setcontext() restores the user context pointed at by ucp.
A successful call does not return. The context should have been
obtained by a call of getcontext(), or makecontext(3), or passed as
third argument to a signal handler.

it seems that for this to work in the signal handler case, there would
have to be some kind of translation from the kernel format to glibc's
format when calling into the signal handler - maybe there is... but
what you point out is definitely incompatible with the kernel today,
and has always been incompatible.

If there's no translation going on, then this has never worked, and so
there's no possibility of a regression!
Post by Dave Martin
git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/arm/Gstep.c;h=37e6c12f115173ebbc9ebcf511c53fd7c0a7d9a1;hb=HEAD
Yea, it's just looking at the integer register set.
Post by Dave Martin
I've not fully understood the gdb code, but there is a comment in
arm-linux-tdep.c that suggests that uc_regspace is not processed (nor do
I see any other mention of uc_regspace or things like VFP_MAGIC):
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/arm-linux-tdep.c;h=95c52608adbb1ff92a9ddb203835d5a1102339bd;hb=HEAD
/* The VFP or iWMMXt registers may be saved on the stack, but there's
no reliable way to restore them (yet). */
It sounds like no one implemented the userspace side of this then!
Post by Dave Martin
Do you know of any userspace parser of uc_regspace?
https://github.com/DynamoRIO/dynamorio/commit/0b75c635033d01ab04f955f5affe14a3ced9ab56
Hmm, well, it seems like they're the first to test this feature, which
is pretty sad.
Post by Dave Martin
Should we enforce the same on sigreturn, or be more tolerant?
I've been thinking about that, and haven't come to a decision. There
is the matter that more complex parsing is harder to be correct (think
about out of bounds 'size' values, although that can be mitigated by
ensuring that size is numerically correct for the magic ID - but then
what if we have a wrong ID, or the size is incorrect for the magic ID?)
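
One way to picture the "size must be numerically correct for the magic
ID" check is the sketch below. It is a userspace model only: the table
contents are illustrative (the sizes are inferred from the offsets
quoted earlier in the thread), the name record_is_acceptable() is made
up, and rejecting unknown IDs outright is just one of the possible
answers to the question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A record type the parser knows, with the only size it accepts for it. */
struct known_record {
	uint32_t magic;
	uint32_t size;
};

/* Illustrative table; check asm/ucontext.h for the real definitions. */
static const struct known_record known_records[] = {
	{ 0x5065cf03u, 0xc0u },	/* Crunch */
	{ 0x12ef842au, 0xa0u },	/* iWMMXt */
};

/*
 * Return 1 if the record header at `rec` names a known type, carries
 * exactly the size expected for that type, and fits in the `remaining`
 * bytes of the buffer; return 0 otherwise.
 */
static int record_is_acceptable(const unsigned char *rec, size_t remaining)
{
	uint32_t magic, size;
	size_t i;

	assert(rec != NULL);
	if (remaining < 2 * sizeof(uint32_t))
		return 0;			/* no room for a header */
	memcpy(&magic, rec, sizeof(magic));
	memcpy(&size, rec + sizeof(magic), sizeof(size));

	for (i = 0; i < sizeof(known_records) / sizeof(known_records[0]); i++) {
		if (known_records[i].magic != magic)
			continue;
		return known_records[i].size == size && size <= remaining;
	}
	return 0;				/* unknown ID: rejected here */
}
```

Because the size is compared against a fixed table entry, an
out-of-bounds size can never be used to step the parser.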
Post by Dave Martin
There is some merit to this, since the effect cannot be achieved 100%
safely in any other way. However, it may require the caller to
manufacture a sigframe from scratch. If so, it may be natural to
omit the IWMMXT block (and indeed the VFP block, if the caller
doesn't care what's in the VFP registers at the destination).
As you can see, the kernel hasn't really catered for manufactured
sigframes - it expects to see the same sigframe that it wrote out.
Whether that's reasonable or not, I'm not sure, but no one's
complained about it yet!
Post by Dave Martin
The DynamoRIO example above takes a signal to generate a "template"
sigframe, which is then modified to produce the desired result.
Putting aside the issue of whether this is an abuse of sigreturn
or not (and the question of why they are doing it at all), this
seems a reasonable approach -- which they also apparently use for
x86. So their sigframe will contain the dummy iWMMXt block, but
it will have a valid tag if we patch the kernel to write one.
Bear in mind that parsing the data in uc_regspace is going to be
hardware specific; it's hard to do it in a generic way. Debuggers
necessarily have to know the intricate hardware details of the
system they're running on, so it's reasonable for them to poke about
in that area. I'm not so sure about generic applications though.

Anyway, I don't have time this evening to continue this reply... so
I'll send it anyway. :)
Dave Martin
2017-06-27 17:15:31 UTC
Permalink
Post by Russell King - ARM Linux
Post by Dave Martin
Post by Russell King - ARM Linux
I'd hope that the kernel implementation is not used as an example - it
most certainly is not an example, as it does no parsing of the data
structures. As the kernel is responsible for creating the layout, it
expects the exact same layout coming back in, and any deviation from
that results in the task being forcefully exited.
Unfortunately, things that are not intended as examples do still get
used. We can argue that's the userspace folks' fault, but it still
creates de facto ABI...
Given that the contents of the structure depend on kernel configuration
symbols, it's impossible for userspace to use it unless they also have
some kind of static configuration as well.
Agreed
Post by Russell King - ARM Linux
Post by Dave Martin
Post by Russell King - ARM Linux
Basically, the layout that the kernel creates is entirely dependent on
the kernel configuration, and any scheme that replicates what the kernel
is doing in the restore paths is doomed to failure. (However, that's
not to say userspace isn't, but if it is, userspace breaks if the kernel
configuration is changed. I don't regard that as a kernel-induced
userspace regression though - it's a bit like expecting EABI userspace
to work with OABI-only supporting kernel.)
The kernel gained the tagged-list approach in 2006, and didn't start
preserving the VFP state until 2010.
Post by Dave Martin
I'm actually a little confused by, say,
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/arm/setcontext.S;h=db6aebfbd4d360e3b7ba525cf2e483f8e3ddfc0d;hb=HEAD
Assuming I'm looking in the right place here, glibc effectively uses its
own private format for uc_regspace -- maybe there is kernel history
here I'm not aware of, or maybe it's not even trying to be compatible.
- If the HWCAP includes VFP
- 64 bytes of d8-d15 registers
- fpscr
- If the HWCAP includes iWMMXT
- 48 bytes of iWMMXT state
The kernel has never used that (partial!) format - note that it seems
to omit d0-d7 from the context.
The function setcontext() restores the user context pointed at by ucp.
A successful call does not return. The context should have been
obtained by a call of getcontext(), or makecontext(3), or passed as
third argument to a signal handler.
it seems that for this to work in the signal handler case, there would
have to be some kind of translation from the kernel format to glibc's
format when calling into the signal handler - maybe there is... but
what you point out is definitely incompatible with the kernel today,
and has always been incompatible.
If there's no translation going on, then this has never worked, and so
there's no possibility of a regression!
Yes. Sadly, there's no indication of whether the incompatibility is
intentional or not.
Post by Russell King - ARM Linux
Post by Dave Martin
git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/arm/Gstep.c;h=37e6c12f115173ebbc9ebcf511c53fd7c0a7d9a1;hb=HEAD
Yea, it's just looking at the integer register set.
Post by Dave Martin
I've not fully understood the gdb code, but there is a comment in
arm-linux-tdep.c that suggests that uc_regspace is not processed (nor do
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/arm-linux-tdep.c;h=95c52608adbb1ff92a9ddb203835d5a1102339bd;hb=HEAD
/* The VFP or iWMMXt registers may be saved on the stack, but there's
no reliable way to restore them (yet). */
It sounds like no one implemented the userspace side of this then!
Post by Dave Martin
Do you know of any userspace parser of uc_regspace?
https://github.com/DynamoRIO/dynamorio/commit/0b75c635033d01ab04f955f5affe14a3ced9ab56
Hmm, well, it seems like they're the first to test this feature, which
is pretty sad.
Hmm indeed
Post by Russell King - ARM Linux
Post by Dave Martin
Should we enforce the same on sigreturn, or be more tolerant?
I've been thinking about that, and haven't come to a decision. There
is the matter that more complex parsing is harder to get correct (think
about out of bounds 'size' values, although that can be mitigated by
ensuring that size is numerically correct for the magic ID - but then
what if we have a wrong ID, or the size is incorrect for the magic ID?)
Post by Dave Martin
There is some merit to this, since the effect cannot be achieved 100%
safely in any other way. However, it may require the caller to
manufacture a sigframe from scratch. If so, it may be natural to
omit the IWMMXT block (and indeed the VFP block, if the caller
doesn't care what's in the VFP registers at the destination).
As you can see, the kernel hasn't really catered for manufactured
sigframes - it expects to see the same sigframe that it wrote out.
Whether that's reasonable or not, I'm not sure, but no one's
complained about it yet!
Post by Dave Martin
The DynamoRIO example above takes a signal to generate a "template"
sigframe, which is then modified to produce the desired result.
Putting aside the issue of whether this is an abuse of sigreturn
or not (and the question of why they are doing it at all), this
seems a reasonable approach -- which they also apparently use for
x86. So their sigframe will contain the dummy iWMMXt block, but
it will have a valid tag if we patch the kernel to write one.
Bear in mind that parsing the data in uc_regspace is going to be
hardware specific; it's hard to do it in a generic way. Debuggers
necessarily have to know the intricate hardware details of the
system they're running on, so it's reasonable for them to poke about
in that area. I'm not so sure about generic applications though.
Anyway, I don't have time this evening to continue this reply... so
I'll send it anyway. :)
There's certainly a limit to the portability that userspace can expect
here. Returning from a signal is portable; poking about inside
mcontext_t is not, though we should aim for least surprise.


For the RFC v2 I just posted, I've aimed for a halfway house where
the code is kept as simple as possible without mandating the
iWMMXt dummy block to be present on non-iWMMXt hardware.

If present, the block must have the same location and size as
the iwmmxt_sigframe would have. This should avoid the possibility
of any runtime overrun when attempting to skip blocks.
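
To illustrate the overrun concern, here is a rough sketch of how a
userspace parser might walk a tagged list of { magic, size } blocks
while refusing anything it cannot validate. It is not the kernel's
code. It assumes each block starts with a 32-bit magic word followed
by a 32-bit size in bytes that includes this header, and that a zero
magic word terminates the list. The names ex_find_block, ex_known,
EX_MAGIC_A and EX_MAGIC_B and all tag and size values are made up for
the example; real tags and sizes would have to come from the kernel
headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags and sizes, for illustration only. */
#define EX_MAGIC_A 0x11111111u
#define EX_SIZE_A  16u
#define EX_MAGIC_B 0x22222222u
#define EX_SIZE_B  24u

struct ex_known_block {
	uint32_t magic;
	uint32_t size;	/* bytes, including the 8-byte header */
};

static const struct ex_known_block ex_known[] = {
	{ EX_MAGIC_A, EX_SIZE_A },
	{ EX_MAGIC_B, EX_SIZE_B },
};

/*
 * Return the byte offset of the block tagged 'wanted', or -1 if the
 * list ends first or any block on the way fails validation.
 */
static long ex_find_block(const uint32_t *space, size_t space_bytes,
			  uint32_t wanted)
{
	size_t off = 0;

	assert(space != NULL);

	while (space_bytes - off >= sizeof(uint32_t)) {
		uint32_t magic = space[off / 4];
		uint32_t size;
		size_t i;
		int known = 0;

		if (magic == 0)
			return -1;	/* terminator: not found */
		if (space_bytes - off < 2 * sizeof(uint32_t))
			return -1;	/* no room for a size word */
		size = space[off / 4 + 1];

		/* Only skip blocks whose size matches their tag. */
		for (i = 0; i < sizeof(ex_known) / sizeof(ex_known[0]); i++) {
			if (ex_known[i].magic == magic) {
				known = (ex_known[i].size == size);
				break;
			}
		}
		if (!known)
			return -1;	/* unknown tag or wrong size */
		if (size > space_bytes - off)
			return -1;	/* block would overrun the buffer */
		if (magic == wanted)
			return (long)off;
		off += size;
	}
	return -1;
}
```

Because a block is only skipped when its size matches the size
expected for its tag and fits in the remaining space, an
uninitialised or bogus block makes the walk stop instead of jumping
to an arbitrary offset.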

Cheers
---Dave
Russell King - ARM Linux
2017-07-19 09:28:27 UTC
Permalink
As a user (I recently implemented a work-around for this problem in DynamoRIO: https://github.com/DynamoRIO/dynamorio/commit/0b75c635033d01ab04f955f5affe14a3ced9ab56) I'm happy with any of the proposed solutions, provided that in the "invalid magic number" case user space is not required to regenerate the bogus block with the same invalid magic number when invoking sigreturn.
I have to ask why a bug report was never sent to kernel people on this.
If no one tells people that software is buggy, that software stands
little chance of being fixed!
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Dave Martin
2017-07-19 10:40:08 UTC
Permalink
Post by Russell King - ARM Linux
As a user (I recently implemented a work-around for this problem in DynamoRIO: https://github.com/DynamoRIO/dynamorio/commit/0b75c635033d01ab04f955f5affe14a3ced9ab56) I'm happy with any of the proposed solutions, provided that in the "invalid magic number" case user space is not required to regenerate the bogus block with the same invalid magic number when invoking sigreturn.
I have to ask why a bug report was never sent to kernel people on this.
If no one tells people that software is buggy, that software stands
little chance of being fixed!
Partly because I fielded it before it got raised on the list.

However, it would have been good to have more discussion on the list and
to get the DynamoRIO guys commenting on there -- since they are the ones
actually trying to use this feature.

I'll try to encourage that more in the future...

Cheers
---Dave
