Discussion:
[arm/hw_breakpoint] Detecting that hardware watchpoint support is unavailable
Pavel Labath
2017-07-07 12:37:53 UTC
Permalink
Hello Pratyush, Will,

I have an arm(32) chip, which has hardware debug support disabled, as
far as I can tell (DBGDSCR.MDBGen reads as zero even after explicitly
setting in).

The current behavior of the kernel on these chips is to return a
non-zero number for the "number of supported watchpoint resources"
when queried with ptrace and then fail at runtime (ENODEV) when the
tracer attempts to set this breakpoint.

This does not seem like a particularly nice api, and I believe it
would be better to just return zero for the watchpoint count
initially, as we already know that setting the watchpoints will fail.

arch_hw_breakpoint_init() zeroes out the core_num_wrps when it detects
that debug support is not present, however when we return resource
info (ptrace_get_hbp_resource_info), we go through
hw_breakpoint_slots(), which does not read these.

I think having ptrace_get_hbp_resource_info() read core_num_wrps would
be more consistent because that's how we read maximum watchpoint
length (arch_get_max_wp_len), which means we (correctly ?) return 0
when hardware debug is not supported. In fact, that is how I presently
detect that hardware debug is not supported (reading max_wp_len), but
at this moment that seems more like an accident than a deliberate
interface. Having num_wrps, num_brps zeroed out as well would make it
clear that this is how it's intended to be used.

So, what do you think about a patch like that?

regards,
pavel
Will Deacon
2017-07-07 12:43:32 UTC
Permalink
Post by Pavel Labath
Hello Pratyush, Will,
I have an arm(32) chip, which has hardware debug support disabled, as
far as I can tell (DBGDSCR.MDBGen reads as zero even after explicitly
setting in).
The current behavior of the kernel on these chips is to return a
non-zero number for the "number of supported watchpoint resources"
when queried with ptrace and then fail at runtime (ENODEV) when the
tracer attempts to set this breakpoint.
This does not seem like a particularly nice api, and I believe it
would be better to just return zero for the watchpoint count
initially, as we already know that setting the watchpoints will fail.
arch_hw_breakpoint_init() zeroes out the core_num_wrps when it detects
that debug support is not present, however when we return resource
info (ptrace_get_hbp_resource_info), we go through
hw_breakpoint_slots(), which does not read these.
I think having ptrace_get_hbp_resource_info() read core_num_wrps would
be more consistent because that's how we read maximum watchpoint
length (arch_get_max_wp_len), which means we (correctly ?) return 0
when hardware debug is not supported. In fact, that is how I presently
detect that hardware debug is not supported (reading max_wp_len), but
at this moment that seems more like an accident than a deliberate
interface. Having num_wrps, num_brps zeroed out as well would make it
clear that this is how it's intended to be used.
So, what do you think about a patch like that?
I don't want to expose hw_breakpoint internals to ptrace as long as we
go through the internal perf interface. The two options really are:

1. Rip out the perf crap and integrate ptrace more tightly with
hw_breakpoint.

2. Fix hw_breakpoint_slots(...) to do what you want, although I seem
to recall a chicken-and-egg problem there where perf insists on calling
that function (hw_breakpoint_slots) early.

Will
Pavel Labath
2017-07-07 13:11:49 UTC
Permalink
Post by Will Deacon
Post by Pavel Labath
Hello Pratyush, Will,
I have an arm(32) chip, which has hardware debug support disabled, as
far as I can tell (DBGDSCR.MDBGen reads as zero even after explicitly
setting in).
The current behavior of the kernel on these chips is to return a
non-zero number for the "number of supported watchpoint resources"
when queried with ptrace and then fail at runtime (ENODEV) when the
tracer attempts to set this breakpoint.
This does not seem like a particularly nice api, and I believe it
would be better to just return zero for the watchpoint count
initially, as we already know that setting the watchpoints will fail.
arch_hw_breakpoint_init() zeroes out the core_num_wrps when it detects
that debug support is not present, however when we return resource
info (ptrace_get_hbp_resource_info), we go through
hw_breakpoint_slots(), which does not read these.
I think having ptrace_get_hbp_resource_info() read core_num_wrps would
be more consistent because that's how we read maximum watchpoint
length (arch_get_max_wp_len), which means we (correctly ?) return 0
when hardware debug is not supported. In fact, that is how I presently
detect that hardware debug is not supported (reading max_wp_len), but
at this moment that seems more like an accident than a deliberate
interface. Having num_wrps, num_brps zeroed out as well would make it
clear that this is how it's intended to be used.
So, what do you think about a patch like that?
Fair enough. I am going to read this as "good idea in general, but I
object to the implementation".
Post by Will Deacon
I don't want to expose hw_breakpoint internals to ptrace as long as we
I have to point out that the two of them are already connected via the
arch_get_debug_arch() and arch_get_max_wp_len() functions, both of
which are solely called from ptrace_get_hbp_resource_info(). However,
I understand that we don't want to add more stuff there if this is not
the direction we want to be going in.
Post by Will Deacon
1. Rip out the perf crap and integrate ptrace more tightly with
hw_breakpoint.
I am afraid this may be a bit above my skill level here. I am fairly
new to the kernel end of the ptrace interface. Could you elaborate on
what exactly would this entail? E.g. can this be done solely within
arm code, or would it require changing general kernel interfaces (I
fear yes).
Post by Will Deacon
2. Fix hw_breakpoint_slots(...) to do what you want, although I seem
to recall a chicken-and-egg problem there where perf insists on calling
that function (hw_breakpoint_slots) early.
Yeah, there is a call in init_hw_breakpoint(), which looks fairly hard to avoid.
Will Deacon
2017-07-18 13:54:19 UTC
Permalink
Post by Pavel Labath
Post by Will Deacon
Post by Pavel Labath
Hello Pratyush, Will,
I have an arm(32) chip, which has hardware debug support disabled, as
far as I can tell (DBGDSCR.MDBGen reads as zero even after explicitly
setting in).
The current behavior of the kernel on these chips is to return a
non-zero number for the "number of supported watchpoint resources"
when queried with ptrace and then fail at runtime (ENODEV) when the
tracer attempts to set this breakpoint.
This does not seem like a particularly nice api, and I believe it
would be better to just return zero for the watchpoint count
initially, as we already know that setting the watchpoints will fail.
arch_hw_breakpoint_init() zeroes out the core_num_wrps when it detects
that debug support is not present, however when we return resource
info (ptrace_get_hbp_resource_info), we go through
hw_breakpoint_slots(), which does not read these.
I think having ptrace_get_hbp_resource_info() read core_num_wrps would
be more consistent because that's how we read maximum watchpoint
length (arch_get_max_wp_len), which means we (correctly ?) return 0
when hardware debug is not supported. In fact, that is how I presently
detect that hardware debug is not supported (reading max_wp_len), but
at this moment that seems more like an accident than a deliberate
interface. Having num_wrps, num_brps zeroed out as well would make it
clear that this is how it's intended to be used.
So, what do you think about a patch like that?
Fair enough. I am going to read this as "good idea in general, but I
object to the implementation".
Post by Will Deacon
I don't want to expose hw_breakpoint internals to ptrace as long as we
I have to point out that the two of them are already connected via the
arch_get_debug_arch() and arch_get_max_wp_len() functions, both of
which are solely called from ptrace_get_hbp_resource_info(). However,
I understand that we don't want to add more stuff there if this is not
the direction we want to be going in.
Post by Will Deacon
1. Rip out the perf crap and integrate ptrace more tightly with
hw_breakpoint.
I am afraid this may be a bit above my skill level here. I am fairly
new to the kernel end of the ptrace interface. Could you elaborate on
what exactly would this entail? E.g. can this be done solely within
arm code, or would it require changing general kernel interfaces (I
fear yes).
I think it would be do-able within the arm code, although abstracting the
common parts may be beneficial. The controversial part is that I would be in
favour of dropping our hw_breakpoint support because I think it's generally
pretty unusable outside of GDB, and GDB interacts with it only indirectly
via ptrace.

If it turns out that somebody is actually using the perf interface to
hw_breakpoint on arm64, then we could perhaps implement that internally
on top of ptrace, instead of the current way round.

Will

Loading...