WIP: Add initial support for "live" debugging by prakashsurya · Pull Request #34 · crash-python/crash-python

prakashsurya · May 9, 2019

This change adds a new "crash-kcore.sh" script that enables the use of
this repository for debugging a "live" system via the "/proc/kcore"
interface, rather than reading from a kernel crash dump.

Current functionality includes:

Ability to print global variables
Ability to run existing crash-python commands
Ability to print backtraces with "bt"

Caveats:

Thread information is read once at startup and never updated. As a
result, the backtrace listed for a thread may not reflect the current
state of the system; it reflects the state of the system during
crash-python initialization.
We cannot (as far as I know) completely disable the caching done by
GDB for the "core" target. Thus, when printing small amounts of data
repeatedly (e.g. calling "p jiffies_64" repeatedly), the value shown
may not reflect the current state of the system; it reflects the
value when it was first read and cached.

Co-authored-by: Serapheim Dimitropoulos serapheim@delphix.com
Co-authored-by: Tom Caputi tcaputi@datto.com

prakashsurya · May 9, 2019

Here are some examples:

# We use Ubuntu, which uses the SLUB allocator instead of SLAB. The kmem command
# causes crash-python initialization to fail there, so I just remove it.
$ rm crash/commands/kmem.py

$ sudo PYTHONPATH="/usr/local/lib/python3.6/site-packages" ./crash-kcore.sh
...
(gdb) bt
#2  0xffffffffacba1523 in default_idle_call () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:98
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=0x0 <__UNIQUE_ID_license151>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffac2d4af2 in cpuidle_idle_call () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:156
#4  0xffffffffac2d4af2 in do_idle () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:246
#5  0xffffffffac2d4d53 in cpu_startup_entry (state=CPUHP_ONLINE) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:351
#6  0xffffffffacb93c9e in rest_init () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:436
#7  0xffffffffad8a40d6 in start_kernel () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:716
#8  0xffffffffad8a34d2 in x86_64_start_reservations (real_mode_data=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:388
#9  0xffffffffad8a3548 in x86_64_start_kernel (real_mode_data=0x8a000 <error: Cannot access memory at address 0x8a000>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:369
#10 0xffffffffac2000d5 in  () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head_64.S:239

(gdb) thread 123
[Switching to thread 123 (LWP 585)]
#0  context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
2831    /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c: No such file or directory.
(gdb) bt
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffacba0b41 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffffacba0b63 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffffac4c73bc in ep_poll (ep=0xffff9c2fef1a8480, events=<optimized out>, maxevents=<optimized out>, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:1809
#6  0xffffffffac4c8ef6 in SYSC_epoll_wait (timeout=<optimized out>, maxevents=<optimized out>, events=<optimized out>, epfd=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2183
#7  0xffffffffac4c8ef6 in SyS_epoll_wait (epfd=<optimized out>, events=140724575256240, maxevents=31, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2148
#8  0xffffffffac203ae3 in do_syscall_64 (regs=<unavailable>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/common.c:287
#9  0xffffffffacc00081 in entry_SYSCALL_64 () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:237

(gdb) p init_uts_ns
$1 = {
  kref = {
    refcount = {
      refs = {
        counter = 6
      }
    }
  },
  name = {
    sysname = "Linux", '\000' <repeats 59 times>,
    nodename = "ps-trunk.dcenter", '\000' <repeats 48 times>,
    release = "4.15.0-48-generic", '\000' <repeats 47 times>,
    version = "#51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019", '\000' <repeats 22 times>,
    machine = "x86_64", '\000' <repeats 58 times>,
    domainname = "(none)", '\000' <repeats 58 times>
  },
  user_ns = 0xffffffffad652f80 <init_user_ns>,
  ucounts = 0x0 <__UNIQUE_ID_license151>,
  ns = {
    stashed = {
      counter = 0
    },
    ops = 0xffffffffad02d400 <utsns_operations>,
    inum = 4026531838
  }
}

(gdb) p spa_namespace_avl
$2 = {
  avl_root = 0xffff9c2fe76bc108,
  avl_compar = 0xffffffffc0520680 <spa_name_compare>,
  avl_offset = 264,
  avl_numnodes = 1,
  avl_size = 8944
}

(gdb) thread apply all bt
...
Thread 4 (LWP 2):
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffac2b04b8 in kthreadd (unused=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/kthread.c:569
#4  0xffffffffacc00205 in ret_from_fork () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:406

Thread 3 (LWP 1):
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffacba0b41 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffffacba0b63 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffffac4c73bc in ep_poll (ep=0xffff9c2fe4445480, events=<optimized out>, maxevents=<optimized out>, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:1809
#6  0xffffffffac4c8ef6 in SYSC_epoll_wait (timeout=<optimized out>, maxevents=<optimized out>, events=<optimized out>, epfd=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2183
#7  0xffffffffac4c8ef6 in SyS_epoll_wait (epfd=<optimized out>, events=140723516086640, maxevents=80, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2148
#8  0xffffffffac203ae3 in do_syscall_64 (regs=<unavailable>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/common.c:287
#9  0xffffffffacc00081 in entry_SYSCALL_64 () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:237

Thread 2 (process 1):
#2  0xffffffffacb9c862 in schedule_idle () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3475
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=0x0 <__UNIQUE_ID_license151>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffac2d4acd in do_idle () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:269
#4  0xffffffffac2d4d53 in cpu_startup_entry (state=CPUHP_ONLINE) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:351
#5  0xffffffffacb93c9e in rest_init () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:436
#6  0xffffffffad8a40d6 in start_kernel () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:716
#7  0xffffffffad8a34d2 in x86_64_start_reservations (real_mode_data=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:388
#8  0xffffffffad8a3548 in x86_64_start_kernel (real_mode_data=0x8a000 <error: Cannot access memory at address 0x8a000>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:369
#9  0xffffffffac2000d5 in  () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head_64.S:239

prakashsurya · May 9, 2019

Also worth noting: sometimes bt won't work for a thread. I don't yet know why, but it might be because we don't populate the registers for the "active" thread (see changes to kernel.py). E.g.

(gdb) thread 346
[Switching to thread 346 (LWP 27300)]
#0  <unavailable> in ?? ()
(gdb) bt
Traceback (most recent call last):
  File "./build/lib/crash/arch/__init__.py", line 51, in __next__
    pc = frame.inferior_frame().pc()
gdb.error: PC not available  346  LWP 32529 "crash-kcore.sh" context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.

#0  <unavailable> in ?? ()
Backtrace stopped: not enough registers or memory available to unwind further

EDIT: I have more evidence to support the idea that this error is for the currently "active" task. If I run info threads it shows me that thread 346 is the thread for my shell session on the system that's running the crash-kcore.sh script:

  346  LWP 32529 "crash-kcore.sh" context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.

prakashsurya · May 9, 2019

Additionally, if I revert commit bd9d86a, it'll (IMO) more cleanly print the backtraces. I don't know exactly what the intended behavior of that commit was, but reordering the first two frames of a scheduled thread was confusing to me. E.g.

(gdb) thread 123
[Switching to thread 123 (LWP 585)]
#0  context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
2831    /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c: No such file or directory.
(gdb) bt
#0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
#1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
#3  0xffffffffacba0b41 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffffacba0b63 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffffac4c73bc in ep_poll (ep=0xffff9c2fef1a8480, events=<optimized out>, maxevents=<optimized out>, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:1809
#6  0xffffffffac4c8ef6 in SYSC_epoll_wait (timeout=<optimized out>, maxevents=<optimized out>, events=<optimized out>, epfd=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2183
#7  0xffffffffac4c8ef6 in SyS_epoll_wait (epfd=<optimized out>, events=140724575256240, maxevents=31, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2148
#8  0xffffffffac203ae3 in do_syscall_64 (regs=<unavailable>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/common.c:287
#9  0xffffffffacc00081 in entry_SYSCALL_64 () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:237

jeffmahoney · May 9, 2019

The goal was to remove context_switch and __schedule from the trace since it's just noise. Since it still appears in 'info threads' it's of dubious value and we can probably just drop it.

prakashsurya · May 9, 2019

It looks like that command isn't quite right when running on a live system with our new crash-kcore.sh script:

(gdb) info threads
  Id   Target Id                  Frame
  2    process 1 "swapper/0"      0x0000000000000000 in __UNIQUE_ID_license151 ()
  3    LWP 1 "systemd"            context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
  4    LWP 2 "kthreadd"           context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
  5    LWP 4 "kworker/0:0H"       context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
  6    LWP 6 "mm_percpu_wq"       context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
...

When I run ./crash.sh and use a kdump file, I get this:

py-crash> info threads
  Id   Target Id                  Frame
  1    pid 0 "swapper/0"          context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  2    pid 1 "systemd"            context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  3    pid 2 "kthreadd"           context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  4    pid 3 "kworker/0:0"        context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  5    pid 4 "kworker/0:0H"       context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
...

I'm guessing this is because I couldn't figure out how to override the fetch_registers function of the "core" target.

jeffmahoney · May 10, 2019

Yeah, that's exactly it. Hooking in at setup_tasks() means that you'll cache whatever registers just happen to be there when it's run. The Python target already has the ability to just pass operations to an underlying target if it doesn't implement them itself. I'll look into what's required to stack on top of core.

We're also going to run into trouble pretty quickly with "ps" since it iterates that same task list. As tasks start up and exit, it's going to get out-of-sync pretty quickly. The whole thing is set up for the gdb inferior to not be executing. I'll have to look into that too.

Also, I'd prefer to keep just one startup script. Since "/proc/kcore" is a special name it should be easy enough to key off it. I have something worked up locally that I'm testing with. I've also changed how I load the debuginfo a bit since you grabbed a repo snapshot.

prakashsurya · May 10, 2019

@jeffmahoney First off, since I didn't say it before, I want to thank you for all of your work on this (including the "libkdumpfile" and "gdb-python" stuff, and making it all open and available)! It's a really awesome foundation for Linux kernel debugging, and opens the door to a lot of cool stuff with a relatively minimal amount of work.

Some coworkers and I have spent the past week digging into all of this, and we're really impressed. If you're open to it, I think we'd like to discuss what we've done, and perhaps how we could help push the project forward (e.g. perhaps help get the "gdb-python" changes upstreamed).

> The Python target already has the ability to just pass operations to an underlying target if it doesn't implement them itself. I'll look into what's required to stack on top of core.

Yeah, I saw how fetch_registers gets overridden for the "kdumpfile" target. I tried to inspect gdb.current_target() when I used the "core" target, but I didn't see any way to override it in the same way. It seemed like that target method was only available when using the "kdumpfile" target. I welcome any ideas/improvements to get that working.

One potential idea I had to get around this is to create a new "kcorefile" Python target, similar to the "kdumpfile" target. I think this might be possible, since we could likely read "/proc/kcore" using "libbfd", but we'd effectively have to re-implement the "core" target (right?) in Python (and create Python bindings for "libbfd"), which seems a bit silly.

> We're also going to run into trouble pretty quickly with "ps" since it iterates that same task list. As tasks start up and exit, it's going to get out-of-sync pretty quickly. The whole thing is set up for the gdb inferior to not be executing. I'll have to look into that too.

Yeah, I agree. I don't have a good idea of how to address that, but I welcome any ideas/improvements you might have.

> Also, I'd prefer to keep just one startup script. Since "/proc/kcore" is a special name it should be easy enough to key off it.

I agree. I was thinking about updating crash.sh to use "/proc/kcore" automatically if a dumpfile was not specified (similar to how crash(8) works when a dumpfile isn't specified), but I just hadn't gotten around to that yet. Alternatively we could allow "/proc/kcore" to be specified as the dumpfile, and then key off of that (which sounds more like what you were thinking?). I'm open to making these changes; it might just have to wait until I have more time to work on this next week.
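
Both options come down to one small decision at startup. A minimal sketch, assuming a hypothetical helper named `select_target` (it is not part of crash-python, and the "live"/"dump" labels are only illustrative):

```python
from typing import Optional, Tuple

KCORE_PATH = "/proc/kcore"


def select_target(dumpfile: Optional[str]) -> Tuple[str, str]:
    """Return (mode, path) for a single startup script.

    Hypothetical helper: falls back to /proc/kcore when no dump file is
    given, and treats an explicit /proc/kcore argument as a live session.
    """
    path = dumpfile if dumpfile else KCORE_PATH
    mode = "live" if path == KCORE_PATH else "dump"
    return mode, path
```

With this shape, `select_target(None)` and `select_target("/proc/kcore")` both choose the live path, while any other file name is treated as a crash dump.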

Thanks for giving this a look so quickly. I don't really consider this "done", but I did want to open the PR to start the discussion and get some feedback on it. Even with the problems that still need to be worked through, this is SOOOO much better and more powerful/extensible/etc than crash(8), so thanks again! :)

jeffmahoney · May 10, 2019

If you have GDB developers on staff, that would certainly help. We have Tom de Vries on our toolchain team and he's offered to help when I think the patches are ready to go as well.

I looked into what it would take to allow a small target on top of regular "core" and it's actually not too difficult. GDB already allows stacking targets -- like core or kdumpfile is stacked on exec. Core is a process stratum target and I've written py-target to be at the thread stratum, which is above it. This means that we can stack a Python target over "core" as long as I remove some of the assumptions that I made initially -- like wanting to clear away other targets. I've got a mostly working kcore Python target now but I need to sort out why I'm hitting assertions in GDB when doing "info threads."
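
To make the stacking idea concrete, here is a toy model in plain Python. It is only a conceptual sketch: the `Target` class, the `call` method, and the operation names are assumptions for illustration and do not correspond to GDB's actual target API or to py-target.

```python
class Target:
    """Toy model of a stacked target; not GDB's real target interface."""

    def __init__(self, name, beneath=None, **ops):
        self.name = name
        self.beneath = beneath  # the target stacked directly below this one
        self.ops = ops          # operations this layer implements itself

    def call(self, op, *args):
        # Walk down the stack until some layer implements the operation.
        layer = self
        while layer is not None:
            if op in layer.ops:
                return layer.ops[op](*args)
            layer = layer.beneath
        raise NotImplementedError(op)


# exec at the bottom, core above it, and a thin Python layer on top that
# only supplies registers and leaves memory reads to core.
exec_target = Target("exec")
core = Target("core", beneath=exec_target,
              read_memory=lambda addr, length: b"\x00" * length)
py_target = Target("python", beneath=core,
                   fetch_registers=lambda thread: {"rip": 0xFFFFFFFF81000000})
```

Here `py_target.call("read_memory", 0, 4)` falls through to the core layer, while `py_target.call("fetch_registers", 1)` is answered by the top layer.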

You're right that this PR isn't ready to land but it's been a good start for some discussion when the project really lacks any other public forum. :)

prakashsurya · May 10, 2019

Unfortunately we don’t have any GDB developers, but I think we’re open to learning what we need to know to get things moving forward (and avoid maintaining a patch stack to GDB indefinitely). So, if you need some extra help and are open to giving us some direction, we can devote some time to this.

For a little context, we’re transitioning from an illumos based product to Linux, and the ability to efficiently do live and post-mortem kernel debugging is relatively high on our list of priorities; and all other tools we’ve looked at for Linux are lacking compared to what we have on illumos.

prakashsurya · May 21, 2019

@jeffmahoney I've been working with @sdimitro, and if you have some time, we'd like to get your feedback on our latest commit to this PR. This adds a new dependency on drgn; I think it's a reasonable dependency in that it solves some of the issues of using the "core" target (more details in the commit message).

@osandov since this change is adding a dependency on your (awesome) "drgn" project, I wanted to loop you in too.

We still don't have a solid plan as to how to solve the problem of "stale" thread information when running on a live kernel, though.. so if you have some ideas about how to solve that issue, we'd love to hear them. We have some ideas (e.g. refreshing the thread list whenever "bt" is called), but we haven't tried any of them out yet, so we're not sure if they'll be viable yet.
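
The "refresh on bt" idea could be sketched roughly as follows. This is an untested design sketch, not code from the PR: `TaskCache`, `loader`, and `backtrace_target` are hypothetical names, and the loader stands in for whatever walks the kernel's task list.

```python
from typing import Callable, Dict


class TaskCache:
    """Hypothetical cache that reloads task information on demand."""

    def __init__(self, loader: Callable[[], Dict[int, str]]):
        self._loader = loader          # reads the current task list
        self._tasks: Dict[int, str] = {}

    def refresh(self) -> None:
        # Replace the cached view wholesale with a fresh snapshot.
        self._tasks = dict(self._loader())

    def backtrace_target(self, pid: int) -> str:
        # Refresh before every lookup so exited tasks disappear and new
        # ones show up, at the cost of re-walking the list each time.
        self.refresh()
        return self._tasks[pid]
```

A snapshot taken this way can still go stale while it is being used, so this narrows the window rather than closing it.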

With these two changes, I think crash-kcore.sh for /proc/kcore is at parity with crash.sh for crash dumps...

Previously, without "drgn", the value printed here wouldn't change (i.e. caching issue appears solved):

$ sudo ./crash-kcore.sh
...
(gdb) p jiffies
$1 = 4311762114
(gdb) p jiffies
$2 = 4311762272
(gdb) p jiffies
$3 = 4311762415

Also, previously, the output here would complain about not being able to obtain register values (see my prior comment):

(gdb) info threads
  Id   Target Id                  Frame
  1    pid 0 "swapper/0"          context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  2    pid 1 "systemd"            context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  3    pid 2 "kthreadd"           context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  4    pid 4 "kworker/0:0H"       context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  5    pid 6 "mm_percpu_wq"       context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  6    pid 7 "ksoftirqd/0"        context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
...

Printing stacks and symbols still works too (as it did before with the "core" target):

(gdb) thread 123
[Switching to thread 123 (pid 596)]
#0  context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
2831    /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c: No such file or directory.

(gdb) bt
#0  0xffffffff9099d011 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
#1  0xffffffff9099d011 in __schedule (preempt=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:3404
#2  0xffffffff9099d64c in schedule () at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:3448
#3  0xffffffff909a1be1 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffff909a1c03 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffff90290a85 in poll_schedule_timeout (pwq=0xffffaa22410a7a70, state=<optimized out>, expires=<optimized out>, slack=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:243
#6  0xffffffff902914a2 in do_select (n=<optimized out>, fds=<optimized out>, end_time=0x0 <irq_stack_union>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:580
#7  0xffffffff90292237 in core_sys_select (n=<optimized out>, inp=<unavailable>, outp=<optimized out>, exp=<optimized out>, end_time=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:654
#8  0xffffffff90292437 in SYSC_select (tvp=<optimized out>, exp=<optimized out>, outp=<optimized out>, inp=<optimized out>, n=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:695
#9  0xffffffff90292437 in SyS_select (n=5, inp=140734162616720, outp=0, exp=<optimized out>, tvp=<unavailable>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:677
#10 0xffffffff90003af3 in do_syscall_64 (regs=<unavailable>) at /build/linux-3btXxq/linux-4.15.0/arch/x86/entry/common.c:290
#11 0xffffffff90a00081 in entry_SYSCALL_64 () at /build/linux-3btXxq/linux-4.15.0/arch/x86/entry/entry_64.S:237

(gdb) p init_uts_ns
$2 = {
  kref = {
    refcount = {
      refs = {
        counter = 6
      }
    }
  },
  name = {
    sysname = "Linux", '\000' <repeats 59 times>,
    nodename = "ps-trunk.dcenter", '\000' <repeats 48 times>,
    release = "4.15.0-50-generic", '\000' <repeats 47 times>,
    version = "#54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019", '\000' <repeats 22 times>,
    machine = "x86_64", '\000' <repeats 58 times>,
    domainname = "(none)", '\000' <repeats 58 times>
  },
  user_ns = 0xffffffff91452f80 <init_user_ns>,
  ucounts = 0x0 <irq_stack_union>,
  ns = {
    stashed = {
      counter = 0
    },
    ops = 0xffffffff90e2d460 <utsns_operations>,
    inum = 4026531838
  }
}

(gdb) p spa_namespace_avl
$1 = {
  avl_root = 0xffff8b6ce4750108,
  avl_compar = 0xffffffffc0532be0 <spa_name_compare>,
  avl_offset = 264,
  avl_numnodes = 1,
  avl_size = 8944
}

This commit adds knowledge of the task flags for newer releases. In Linux v3.14, several elements were removed from task_state_array. In Linux v4.4, TASK_PARKED was renumbered to be in task_state_array. This commit handles the right things and will complain if the flags change again. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Kernel 4.2 introduced TASK_NOLOAD, which when combined with TASK_UNINTERRUPTIBLE, produced TASK_IDLE. This mask is used for kernel threads, so without support for the flags, `ps' shows ?? for kernel threads. Signed-off-by: Jeff Mahoney <jeffm@suse.com>
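
As a small illustration of the mask described above, assuming the numeric values found in include/linux/sched.h for kernels of this era (verify them against the kernel being debugged); `is_idle_state` is a hypothetical helper, not the project's code:

```python
# Assumed values; check include/linux/sched.h for the target kernel.
TASK_UNINTERRUPTIBLE = 0x0002
TASK_NOLOAD = 0x0400
TASK_IDLE = TASK_UNINTERRUPTIBLE | TASK_NOLOAD


def is_idle_state(state: int) -> bool:
    """True when every TASK_IDLE bit is set in the task state."""
    return (state & TASK_IDLE) == TASK_IDLE
```

Testing for the combined mask is what lets a tool report idle kernel threads separately from plain uninterruptible sleep.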

It's more user-friendly to be able to locate whether a command is present alphabetically. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

The use of anonymous structures and unions means that things like:
    struct foo { struct { int x; }; };
    if 'x' in cls.foo_type: # will evaluate false
when foo.x works fine in C code. In order to make these less painful for subsystem modules, we add a struct_has_member helper that does the right thing to resolve the member.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
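
One way such a lookup could work is to descend into unnamed members recursively. The sketch below is not the project's implementation; it relies only on the documented behaviour that gdb.Type.fields() returns field objects with `name` and `type` attributes, where anonymous members have no name. `FakeType` and `FakeField` are stand-ins so the example runs outside GDB.

```python
def struct_has_member(struct_type, name):
    """Return True if `name` is reachable, descending into anonymous members.

    Works on anything exposing fields() whose items have .name and .type.
    """
    for field in struct_type.fields():
        if field.name == name:
            return True
        # Anonymous struct/union members carry no name; look inside them.
        if not field.name and struct_has_member(field.type, name):
            return True
    return False


# Stand-ins for gdb.Type / gdb.Field (hypothetical, for illustration only).
class FakeType:
    def __init__(self, *fields):
        self._fields = list(fields)

    def fields(self):
        return self._fields


class FakeField:
    def __init__(self, name, type_=None):
        self.name = name
        self.type = type_ if type_ is not None else FakeType()


# Models: struct foo { struct { int x; }; int y; };
foo = FakeType(FakeField(None, FakeType(FakeField("x"))), FakeField("y"))
```

With this model, a plain membership test on the top-level field names would miss `x`, while the recursive helper finds it.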

Kernel v4.19 moved most of mm_struct into an anonymous sub-structure. Even though C code can access members directly, the gdb type infrastructure reflects the actual type layout. This means that things like "if 'rss_stat' in cls.mm_struct_type" will return false even if the member is present. To cope with this, use struct_has_member instead, which does the right things when detecting whether a struct member is present. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

With upcoming file system subsystem modules, we'll want a common way to handle UUID decoding. XFS uses uuid_t while btrfs uses an array of u8. This introduces helpers into crash.util:
- decode_uuid -- decodes the byte array
- decode_uuid_t -- decodes the uuid_t
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
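
For the byte-array case, decoding can lean on Python's standard uuid module. A minimal sketch, assuming the 16 bytes have already been read out of the dump as integers; `decode_uuid_bytes` is a hypothetical name and not the helper added by this commit:

```python
import uuid
from typing import Iterable


def decode_uuid_bytes(raw: Iterable[int]) -> uuid.UUID:
    """Build a UUID from 16 byte values (e.g. a u8[16] read from memory)."""
    data = bytes(int(b) for b in raw)
    if len(data) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(data))
    return uuid.UUID(bytes=data)
```

The result prints in the usual 8-4-4-4-12 hexadecimal form via `str()`.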

The decode_flags helper takes a gdb.Value representing an integer and a dictionary of int -> str that maps the powers of 2 to flag names and produces a human-readable string describing the flags. If no name is found FLAG_$number is used instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com>
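
The behaviour described above can be sketched on plain integers. This is not the crash.util implementation: the function name, the "|" separator, and the choice to use the bit's value (rather than its index) in the FLAG_ fallback are all assumptions.

```python
from typing import Dict


def decode_flags_sketch(value: int, names: Dict[int, str]) -> str:
    """Render set bits as names, falling back to FLAG_<bit value>."""
    parts = []
    bit = 1
    while bit <= value:
        if value & bit:
            parts.append(names.get(bit, "FLAG_%d" % bit))
        bit <<= 1
    return "|".join(parts)
```

For example, with `{1: "RDONLY", 2: "NOSUID"}` a value of 5 has one known bit and one unknown bit, so the unknown one gets a generated name.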

Internally, gdb treats the type loaded from a typed symbol and a type symbol differently and wants to do the full type comparison dance. If we use the typed symbol directly, we can use a pointer comparison. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

The list_empty method returns a boolean indicating whether a list_head describes an empty list. Signed-off-by: Jeff Mahoney <jeffm@suse.com>
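
In the kernel, an empty list is one whose head points back at itself. A self-contained model of that check, using a stand-in `ListHead` class rather than gdb.Value objects (the class and `list_add` are illustrative assumptions):

```python
class ListHead:
    """Minimal stand-in for the kernel's struct list_head."""

    def __init__(self):
        # An initialized, empty list points at itself in both directions.
        self.next = self
        self.prev = self


def list_add(new: ListHead, head: ListHead) -> None:
    """Insert `new` right after `head`."""
    new.next, new.prev = head.next, head
    head.next.prev = new
    head.next = new


def list_empty(head: ListHead) -> bool:
    """True when the head's next pointer refers back to the head itself."""
    return head.next is head
```

When working on a dump, the same comparison would be made between the address of the list_head and the value of its next pointer.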

The cycle tests aren't passing exact_cycles=True and will loop forever. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

The tests that load files (or targets) need to tear them down so subsequent tests don't get tripped up by them. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Now that we have DelayedAttributes everywhere, the setup code can be converted to use it. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This commit adds baseline ppc64 support. It should be enough to populate the thread list but this is an old commit that needs refreshing. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This commit adds some typical helpers for bitmaps:
- find_first_set_bit
- find_next_set_bit
- find_last_set_bit
- find_first_zero_bit
- find_next_zero_bit
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
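
To show what these helpers compute, here is a sketch over non-negative Python integers. The signatures, the 0-based bit indices, and the -1 "not found" return value are assumptions; the helpers in the commit operate on kernel bitmaps and may follow different conventions.

```python
def find_first_set_bit(value: int) -> int:
    """Index of the lowest set bit, or -1 if no bit is set."""
    if value == 0:
        return -1
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1


def find_last_set_bit(value: int) -> int:
    """Index of the highest set bit, or -1 if no bit is set."""
    return value.bit_length() - 1


def find_next_set_bit(value: int, start: int) -> int:
    """Index of the lowest set bit at or above `start`, or -1."""
    return find_first_set_bit(value >> start << start)


def find_first_zero_bit(value: int) -> int:
    """Index of the lowest clear bit."""
    return find_first_set_bit(~value)


def find_next_zero_bit(value: int, start: int) -> int:
    """Index of the lowest clear bit at or above `start`."""
    return find_first_set_bit(~value >> start << start)
```

Because Python integers are unbounded, the zero-bit searches always find a bit; a fixed-size bitmap would need an explicit size limit.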

When I rebased crash-python-gdb to an 8.3 prerelease, I found that targets have been converted to C++. That necessitated a rewrite of much of the target code, and I cleaned up some rough edges. With the new target, we load the vmcore using a simple 'target kdumpfile /path/to/vmcore' command that can be used entirely outside of the crash semantic code. This means we can debug the target more easily and use it in the testing code without having to parse everything for every test. This commit converts crash to use the new target but doesn't exploit it for testing yet. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This module contains for_each_module and a new for_each_module_section. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

The "right" thing to do is for .data..percpu to be loaded at offset 0. Unfortunately, that only works when debuginfo is embedded in the binary. When separate debuginfo is used, section offsets can't be specified and gdb interprets an offset of 0 to mean "immediately after the preceding section." In order to make the rest of the percpu code sane, we'll let gdb make the same assumption when embedded debuginfo is used. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Kernels prior to 2.6.30 didn't have dynamic percpu ranges. The test cases have also not been extended to cover the dynamic ranges. This commit catches DelayedAttributeError so the test cases can pass. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

One of the things that has had a serious negative effect on the quality of crash-python is the inability to do automated testing on a broad variety of kernels and vmcores. Often, we have to run our tests by hand. More often, we get reports from new users that something broke unexpectedly. This commit adds real unit testing against real kernels and vmcores. It includes a few test cases that will need to be extended further. The policy moving forward will be that new features will require a matching test case that passes across a variety of kernels. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

With Linux v5.1-rc1, knode_class was moved from struct device to struct device_private. This commit updates for_each_class_device to use the implementation that matches the kernel. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

These two helpers will take a generic superblock or inode and determine whether it belongs to the given file system. It's a naive implementation that uses a string comparison. This is intentional so the comparison can be made without symbol resolution that may require module loading. Signed-off-by: Jeff Mahoney <jeffm@suse.com>
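The string-comparison approach described above can be sketched in plain Python. This is only a toy model: the classes stand in for the kernel's `struct file_system_type`, `struct super_block`, and `struct inode`, and the helper names `super_is_fstype` and `inode_is_fstype` are hypothetical, not the names used in crash-python. The point is that matching on the type's name needs no symbol lookup, so it works even when the file system's module symbols are not loaded.

```python
from dataclasses import dataclass


@dataclass
class FileSystemType:
    # Stands in for struct file_system_type; only the name is modeled.
    name: str


@dataclass
class SuperBlock:
    # Stands in for struct super_block; s_type points at the fs type.
    s_type: FileSystemType


@dataclass
class Inode:
    # Stands in for struct inode; i_sb points at the owning superblock.
    i_sb: SuperBlock


def super_is_fstype(sb: SuperBlock, name: str) -> bool:
    """Hypothetical helper: compare the fs type name as a plain string."""
    return sb.s_type.name == name


def inode_is_fstype(inode: Inode, name: str) -> bool:
    """Hypothetical helper: an inode belongs to whatever its superblock does."""
    return super_is_fstype(inode.i_sb, name)


if __name__ == "__main__":
    sb = SuperBlock(FileSystemType("btrfs"))
    print(super_is_fstype(sb, "btrfs"), inode_is_fstype(Inode(sb), "xfs"))
```

The trade-off is that a name match is weaker than comparing against the address of the file system's own type object, which is why the commit message calls the implementation naive.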

The helper routines can be passed bad pointers, so document that each can raise gdb.NotAvailableError. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This commit adds documentation for the mount API and makes private some methods/functions that are meant to be internal. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

…uids

This adds helpers to:
- export the fsid and metadata uuid from btrfs file systems
- test whether a generic super block belongs to btrfs
- test whether a generic inode belongs to btrfs

We also document the APIs of existing helpers.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This adds a `btrfs' command to display some details of btrfs file systems. Included subcommands are:
- 'list' -- list all mounted btrfs file systems, including device and uuid.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This commit adds a basic xfs file system module. Included are:
- Python variables for flags
- Mappings from flags to flag names
- Decoding for xfs_bufs and inodes
- Helpers for mount flags, superblock version, and uuid
- AIL iterators including item decoding

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This adds an `xfs' command to display some details of xfs file systems. Included subcommands are:
- 'list' -- list all mounted xfs file systems, including device and uuid
- 'show' -- show details of a single xfs file system
- 'dump-ail' -- dump contents of the AIL for one file system
- 'dump-buft' -- dump contents of the bt_delwrite_queue for one file system

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This adds a helper to pass back the requests in flight for a particular queue (block single-queue only). Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This commit adds a basic `lsmod' command. By default, it will display the module name, core address, size, and users of it. With the -p option, it will display the percpu base and size. With -p <n>, it will display the percpu base for the given CPU number. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

The arch-specific part of get_stack_pointer just needs to interpret the arch's thread_struct. Pass it that and avoid confusion. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Every consumer of a task shouldn't need to drill down into the structure just to get the task name, pid, etc. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This is a big commit that pulls the formatting of the output out of the command. The idea is that we can implement this more cleanly by adding methods to the formatting class. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Kernel v5.1-rc1 moved the compressed config data into .rodata using asm .globl variables to mark the bounds. This commit updates crash.cache.syscache to handle the new variables and cleans up the code a bit. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

The test output is littered with 'broken link' reports during tests that are specifically testing that behavior. We can tidy up a bit by adding a print_broken_links option that defaults to True but can be set to False by the test cases. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Despite being called 'get_typed_pointer', we were dereferencing the pointer before returning it. Also, we were refusing to take the address of a value that wasn't already the type we were targeting, which is silly since that would just return the object back. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This commit adds API documentation and static typing hints to tasks. There are some minor code changes to make mypy happy with the result. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This commit is the leftover bits for typing and documentation. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

This change adds a new "crash-kcore.sh" script that enables the use of this repository for debugging a "live" system via the "/proc/kcore" interface, rather than reading from a kernel crash dump.

Current functionality includes:
* Ability to print global variables
* Ability to run existing crash-python commands
* Ability to print backtraces with "bt"

Caveats:
* Thread information is read once at startup and never updated. As a result, the backtraces listed for threads may not reflect the current state of the system; they reflect the state of the system during crash-python initialization.
* We cannot (as far as I know) completely disable the caching done by GDB for the "core" target. Thus, when printing small amounts of data repeatedly (e.g. calling "p jiffies_64" repeatedly), the value shown may not reflect the current state of the system; it will reflect the value when it was first read and cached.

Co-authored-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Co-authored-by: Tom Caputi <tcaputi@datto.com>

This adds a new "kcore" GDB target which uses "drgn" as the backend for reading from "/proc/kcore". It is very similar to the existing "kdump" target, which uses "kdumpfile" for reading from a crash dump.

The benefit of using this new "kcore" target as opposed to GDB's existing "core" target is twofold:

1. The caching done by the "core" target is not what we want when inspecting a live, running kernel. By moving to a new Python-based target, we avoid all of this; each read of memory from GDB will call into our target, so we have much more control.
2. The "core" target is unable to properly fetch registers when it's used with "/proc/kcore". By using a Python target, we can override the "fetch_registers" function and do the right thing for fetching registers of kernel threads. Additionally, the code to do this is the same for "/proc/kcore" and a kernel crash dump, so the same code can be used for both the new "kcore" and existing "kdump" targets.

Unfortunately, this new "kcore" target does not solve the issue of thread information being read and cached during startup, meaning we still do not have "live" thread information when using this new target.

See also: https://github.com/osandov/drgn

Co-authored-by: Prakash Surya <prakash.surya@delphix.com>
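The difference between a caching read path and one that calls into the backend on every read can be illustrated with a small stand-alone model. Nothing here uses the GDB or drgn APIs: `FakeLiveMemory`, `CachingReader`, and `PassThroughReader` are hypothetical names invented for this sketch, and the fake memory simply returns an increasing counter, loosely imitating a value such as jiffies_64 that changes between reads.

```python
from typing import Callable, Dict, Tuple

ReadFn = Callable[[int, int], bytes]


class FakeLiveMemory:
    """Hypothetical backend whose contents change on every read."""

    def __init__(self) -> None:
        self.ticks = 0

    def read(self, addr: int, length: int) -> bytes:
        # Each read observes a newer value, like a running kernel counter.
        self.ticks += 1
        return self.ticks.to_bytes(length, "little")


class CachingReader:
    """Remembers the first result for each (address, length) pair."""

    def __init__(self, read_fn: ReadFn) -> None:
        self._read = read_fn
        self._cache: Dict[Tuple[int, int], bytes] = {}

    def read(self, addr: int, length: int) -> bytes:
        key = (addr, length)
        if key not in self._cache:
            self._cache[key] = self._read(addr, length)
        return self._cache[key]


class PassThroughReader:
    """Forwards every read to the backend, so values are never stale."""

    def __init__(self, read_fn: ReadFn) -> None:
        self._read = read_fn

    def read(self, addr: int, length: int) -> bytes:
        return self._read(addr, length)


if __name__ == "__main__":
    cached = CachingReader(FakeLiveMemory().read)
    live = PassThroughReader(FakeLiveMemory().read)
    for reader in (cached, live):
        values = [int.from_bytes(reader.read(0x1000, 8), "little") for _ in range(3)]
        print(type(reader).__name__, values)
```

With the caching reader the same stale value is returned on every call, which mirrors the "p jiffies_64" caveat noted for the "core" target; the pass-through reader sees a fresh value each time.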

jeffmahoney force-pushed the next branch from 967c378 to 65bb722 on May 21, 2019 03:29

prakashsurya force-pushed the next-kcore branch from 4c6c1d7 to 2029c57 on May 21, 2019 15:44

jeffmahoney and others added 18 commits May 21, 2019 17:38

crash.commands.help: sort commands in help output

68f6630

It's more user-friendly to list commands alphabetically, so a user can quickly tell whether a command is present. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.types.list: add list_empty

736fdf7

The list_empty method returns a boolean indicating whether a list_head describes an empty list. Signed-off-by: Jeff Mahoney <jeffm@suse.com>
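As a rough illustration of the emptiness check, here is a toy model in plain Python. It does not use gdb.Value objects or crash-python's own code: the `ListHead` class and `list_add` function are stand-ins written for this sketch, following the kernel convention that an initialized `list_head` points at itself and is empty while its `next` pointer still refers back to the head.

```python
class ListHead:
    """Toy stand-in for the kernel's struct list_head (not a gdb.Value)."""

    def __init__(self) -> None:
        # Mirror INIT_LIST_HEAD: a fresh head points at itself.
        self.next: "ListHead" = self
        self.prev: "ListHead" = self


def list_add(new: ListHead, head: ListHead) -> None:
    """Insert `new` immediately after `head`."""
    new.next = head.next
    new.prev = head
    head.next.prev = new
    head.next = new


def list_empty(head: ListHead) -> bool:
    """A list is empty when the head's next pointer is the head itself."""
    return head.next is head


if __name__ == "__main__":
    head = ListHead()
    print(list_empty(head))
    list_add(ListHead(), head)
    print(list_empty(head))
```

In a debugger the same check would compare addresses read from the dump rather than Python object identity, and would have to cope with unreadable or corrupted pointers.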

crash.types.list: fix cycle tests

df92870

The cycle tests aren't passing exact_cycles=True and will loop forever. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

tests: clear file during test teardown

87aec81

The tests that load files (or targets) need to tear them down so subsequent tests don't get tripped up by them. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.kernel: convert setup to use DelayedSymvals

ff7ef81

Now that we have DelayedAttributes everywhere, the setup code can be converted to use it. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.arch: add baseline ppc64 support

a132e76

This commit adds baseline ppc64 support. It should be enough to populate the thread list but this is an old commit that needs refreshing. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.types.bitmap: add find first/last/next helpers

bbe3d6f

This commit adds some typical helpers for bitmaps:
- find_first_set_bit
- find_next_set_bit
- find_last_set_bit
- find_first_zero_bit
- find_next_zero_bit

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
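To show what helpers of this kind compute, here is a simplified sketch over a plain Python list of integers. It is not the crash-python implementation, which operates on values read from the dump. The assumptions here are mine: each list element is one 64-bit word, bit 0 is the least significant bit of word 0, positions are 0-based, and `None` means no matching bit was found. The real helpers may use a different return convention.

```python
from typing import List, Optional

BITS_PER_WORD = 64  # assumption: each list element models a 64-bit unsigned long


def _test_bit(bitmap: List[int], bit: int) -> bool:
    """Return True if the given 0-based bit position is set."""
    return bool((bitmap[bit // BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1)


def _find_next(bitmap: List[int], start: int, want_set: bool) -> Optional[int]:
    """Scan upward from `start` for the first bit in the wanted state."""
    for bit in range(max(start, 0), len(bitmap) * BITS_PER_WORD):
        if _test_bit(bitmap, bit) == want_set:
            return bit
    return None


def find_first_set_bit(bitmap: List[int]) -> Optional[int]:
    return _find_next(bitmap, 0, True)


def find_next_set_bit(bitmap: List[int], start: int) -> Optional[int]:
    return _find_next(bitmap, start, True)


def find_first_zero_bit(bitmap: List[int]) -> Optional[int]:
    return _find_next(bitmap, 0, False)


def find_next_zero_bit(bitmap: List[int], start: int) -> Optional[int]:
    return _find_next(bitmap, start, False)


def find_last_set_bit(bitmap: List[int]) -> Optional[int]:
    """Scan downward from the highest position for a set bit."""
    for bit in range(len(bitmap) * BITS_PER_WORD - 1, -1, -1):
        if _test_bit(bitmap, bit):
            return bit
    return None


if __name__ == "__main__":
    bitmap = [0b1010, 0b1]  # bits 1, 3, and 64 are set
    print(find_first_set_bit(bitmap), find_last_set_bit(bitmap))
```

A bit-at-a-time scan keeps the sketch short; a production version would normally skip whole words that are all zeros or all ones.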

crash.types.module: create module for modules

afdd5b2

This module contains for_each_module and a new for_each_module_section. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

jeffmahoney added 19 commits May 21, 2019 17:42

crash.subsystem.filesystem: document gdb.NotAvailableError

bfd2628

The helper routines can be passed bad pointers, so document that each can raise gdb.NotAvailableError. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.subsytem.filesystem.mount: add API documentation

c5691bf

This commit adds documentation for the mount API and makes private some methods/functions that are meant to be internal. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.commands.btrfs: add basic btrfs command

7e077de

This adds a `btrfs' command to display some details of btrfs file systems. Included subcommands are:
- 'list' -- list all mounted btrfs file systems, including device and uuid.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.subsystem.storage.blocksq: add per-queue requests_in_flight call

5483d9f

This adds a helper to pass back the requests in flight for a particular queue (block single-queue only). Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.types.task: get_stack_pointer should take thread_struct

e7789d4

The arch-specific part of get_stack_pointer just needs to interpret the arch's thread_struct. Pass it that and avoid confusion. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.types.task: add accessor-helpers

1d22d73

Every consumer of a task shouldn't need to drill down into the structure just to get the task name, pid, etc. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.commands.ps: factor out formatting

af12734

This is a big commit that pulls the formatting of the output out of the command. The idea is that we can implement this more cleanly by adding methods to the formatting class. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash.types.task: add documentation and static typing hints

0f70ab1

This commit adds API documentation and static typing hints to tasks. There are some minor code changes to make mypy happy with the result. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

crash: more documentation and typing

d6bc98b

This commit is the leftover bits for typing and documentation. Signed-off-by: Jeff Mahoney <jeffm@suse.com>

jeffmahoney force-pushed the next branch from 65bb722 to d6bc98b on May 21, 2019 21:46

Prakash Surya and others added 2 commits May 22, 2019 09:24

prakashsurya force-pushed the next-kcore branch from 2029c57 to 14529d1 on May 22, 2019 16:25

jeffmahoney force-pushed the next branch 7 times, most recently from 89aca9c to a1171f9 on May 23, 2019 21:05

prakashsurya wants to merge 75 commits into crash-python:next from prakashsurya:next-kcore


3 participants
