WIP: Add initial support for "live" debugging#34

Open
prakashsurya wants to merge 75 commits into crash-python:next from prakashsurya:next-kcore

Conversation

@prakashsurya

This change adds a new "crash-kcore.sh" script that enables the use of
this repository for debugging a "live" system via the "/proc/kcore"
interface, rather than reading from a kernel crash dump.

Current functionality includes:

  • Ability to print global variables
  • Ability to run existing crash-python commands
  • Ability to print backtraces with "bt"

Caveats:

  • Thread information is read once at startup, and never updated. As a
    result, backtraces listed for a thread may not reflect the current
    state of the system; they reflect the state of the system during
    crash-python initialization.

  • We cannot (as far as I know) completely disable the caching done by
    GDB for the "core" target. Thus, when printing small amounts of data
    repeatedly (e.g. calling "p jiffies_64" repeatedly), the value shown
    may not reflect the current state of the system; it'll reflect the
    value from when it was first read and cached.

Co-authored-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Co-authored-by: Tom Caputi <tcaputi@datto.com>
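
The caching caveat in the description above can be illustrated without GDB. This is only a minimal Python sketch of the general problem, not GDB's actual caching code: `LiveMemory`, `CachingReader`, and the address are hypothetical names made up for illustration. It shows why a reader that caches by address keeps returning the first value it saw while the underlying counter moves on.

```python
class LiveMemory:
    """Hypothetical stand-in for a live source such as /proc/kcore."""

    def __init__(self):
        self._jiffies = 1000

    def read(self, address):
        # On a live system, each read may observe a newer value.
        self._jiffies += 1
        return self._jiffies


class CachingReader:
    """Caches reads by address, which is the behavior the caveat describes."""

    def __init__(self, memory):
        self._memory = memory
        self._cache = {}

    def read(self, address):
        if address not in self._cache:
            self._cache[address] = self._memory.read(address)
        return self._cache[address]


memory = LiveMemory()
cached = CachingReader(memory)
JIFFIES_ADDR = 0x1000  # made-up address, for illustration only

first = cached.read(JIFFIES_ADDR)
second = cached.read(JIFFIES_ADDR)  # served from the cache, so stale
fresh = memory.read(JIFFIES_ADDR)   # bypasses the cache, so newer
```

Here `first` and `second` are identical even though the simulated counter has advanced, which mirrors what repeated "p jiffies_64" shows with the "core" target.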

@prakashsurya
Author

Here are some examples:

# We use Ubuntu, which uses SLUB instead of SLAB. The kmem command then
# causes crash-python to fail to initialize, so I just remove it.
$ rm crash/commands/kmem.py

$ sudo PYTHONPATH="/usr/local/lib/python3.6/site-packages" ./crash-kcore.sh
...
(gdb) bt
#2  0xffffffffacba1523 in default_idle_call () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:98
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=0x0 <__UNIQUE_ID_license151>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffac2d4af2 in cpuidle_idle_call () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:156
#4  0xffffffffac2d4af2 in do_idle () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:246
#5  0xffffffffac2d4d53 in cpu_startup_entry (state=CPUHP_ONLINE) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:351
#6  0xffffffffacb93c9e in rest_init () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:436
#7  0xffffffffad8a40d6 in start_kernel () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:716
#8  0xffffffffad8a34d2 in x86_64_start_reservations (real_mode_data=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:388
#9  0xffffffffad8a3548 in x86_64_start_kernel (real_mode_data=0x8a000 <error: Cannot access memory at address 0x8a000>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:369
#10 0xffffffffac2000d5 in  () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head_64.S:239

(gdb) thread 123
[Switching to thread 123 (LWP 585)]
#0  context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
2831    /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c: No such file or directory.
(gdb) bt
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffacba0b41 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffffacba0b63 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffffac4c73bc in ep_poll (ep=0xffff9c2fef1a8480, events=<optimized out>, maxevents=<optimized out>, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:1809
#6  0xffffffffac4c8ef6 in SYSC_epoll_wait (timeout=<optimized out>, maxevents=<optimized out>, events=<optimized out>, epfd=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2183
#7  0xffffffffac4c8ef6 in SyS_epoll_wait (epfd=<optimized out>, events=140724575256240, maxevents=31, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2148
#8  0xffffffffac203ae3 in do_syscall_64 (regs=<unavailable>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/common.c:287
#9  0xffffffffacc00081 in entry_SYSCALL_64 () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:237

(gdb) p init_uts_ns
$1 = {
  kref = {
    refcount = {
      refs = {
        counter = 6
      }
    }
  },
  name = {
    sysname = "Linux", '\000' <repeats 59 times>,
    nodename = "ps-trunk.dcenter", '\000' <repeats 48 times>,
    release = "4.15.0-48-generic", '\000' <repeats 47 times>,
    version = "#51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019", '\000' <repeats 22 times>,
    machine = "x86_64", '\000' <repeats 58 times>,
    domainname = "(none)", '\000' <repeats 58 times>
  },
  user_ns = 0xffffffffad652f80 <init_user_ns>,
  ucounts = 0x0 <__UNIQUE_ID_license151>,
  ns = {
    stashed = {
      counter = 0
    },
    ops = 0xffffffffad02d400 <utsns_operations>,
    inum = 4026531838
  }
}

(gdb) p spa_namespace_avl
$2 = {
  avl_root = 0xffff9c2fe76bc108,
  avl_compar = 0xffffffffc0520680 <spa_name_compare>,
  avl_offset = 264,
  avl_numnodes = 1,
  avl_size = 8944
}

(gdb) thread apply all bt
...
Thread 4 (LWP 2):
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffac2b04b8 in kthreadd (unused=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/kthread.c:569
#4  0xffffffffacc00205 in ret_from_fork () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:406

Thread 3 (LWP 1):
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffacba0b41 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffffacba0b63 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffffac4c73bc in ep_poll (ep=0xffff9c2fe4445480, events=<optimized out>, maxevents=<optimized out>, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:1809
#6  0xffffffffac4c8ef6 in SYSC_epoll_wait (timeout=<optimized out>, maxevents=<optimized out>, events=<optimized out>, epfd=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2183
#7  0xffffffffac4c8ef6 in SyS_epoll_wait (epfd=<optimized out>, events=140723516086640, maxevents=80, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2148
#8  0xffffffffac203ae3 in do_syscall_64 (regs=<unavailable>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/common.c:287
#9  0xffffffffacc00081 in entry_SYSCALL_64 () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:237

Thread 2 (process 1):
#2  0xffffffffacb9c862 in schedule_idle () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3475
    #0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=0x0 <__UNIQUE_ID_license151>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
    #1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#3  0xffffffffac2d4acd in do_idle () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:269
#4  0xffffffffac2d4d53 in cpu_startup_entry (state=CPUHP_ONLINE) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/idle.c:351
#5  0xffffffffacb93c9e in rest_init () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:436
#6  0xffffffffad8a40d6 in start_kernel () at /build/linux-fkZVDM/linux-4.15.0/init/main.c:716
#7  0xffffffffad8a34d2 in x86_64_start_reservations (real_mode_data=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:388
#8  0xffffffffad8a3548 in x86_64_start_kernel (real_mode_data=0x8a000 <error: Cannot access memory at address 0x8a000>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head64.c:369
#9  0xffffffffac2000d5 in  () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/kernel/head_64.S:239

@prakashsurya
Author

prakashsurya commented May 9, 2019

Also worth noting: sometimes bt won't work for a thread. I don't yet know why, but it might be due to us not populating the registers for the "active" thread (see changes to kernel.py). E.g.

(gdb) thread 346
[Switching to thread 346 (LWP 27300)]
#0  <unavailable> in ?? ()
(gdb) bt
Traceback (most recent call last):
  File "./build/lib/crash/arch/__init__.py", line 51, in __next__
    pc = frame.inferior_frame().pc()
gdb.error: PC not available  346  LWP 32529 "crash-kcore.sh" context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.

#0  <unavailable> in ?? ()
Backtrace stopped: not enough registers or memory available to unwind further

EDIT: I have more evidence to support the idea that this error is for the currently "active" task. If I run info threads it shows me that thread 346 is the thread for my shell session on the system that's running the crash-kcore.sh script:

  346  LWP 32529 "crash-kcore.sh" context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.

@prakashsurya
Author

Additionally, if I revert commit bd9d86a, it'll (IMO) more cleanly print the backtraces. I don't know exactly what the intended behavior of that commit was, but reordering the first two frames of a scheduled thread was confusing to me. E.g.

(gdb) thread 123
[Switching to thread 123 (LWP 585)]
#0  context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
2831    /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c: No such file or directory.
(gdb) bt
#0  0xffffffffacb9bf71 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831
#1  0xffffffffacb9bf71 in __schedule (preempt=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3404
#2  0xffffffffacb9c5ac in schedule () at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:3448
#3  0xffffffffacba0b41 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffffacba0b63 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffffac4c73bc in ep_poll (ep=0xffff9c2fef1a8480, events=<optimized out>, maxevents=<optimized out>, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:1809
#6  0xffffffffac4c8ef6 in SYSC_epoll_wait (timeout=<optimized out>, maxevents=<optimized out>, events=<optimized out>, epfd=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2183
#7  0xffffffffac4c8ef6 in SyS_epoll_wait (epfd=<optimized out>, events=140724575256240, maxevents=31, timeout=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/fs/eventpoll.c:2148
#8  0xffffffffac203ae3 in do_syscall_64 (regs=<unavailable>) at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/common.c:287
#9  0xffffffffacc00081 in entry_SYSCALL_64 () at /build/linux-fkZVDM/linux-4.15.0/arch/x86/entry/entry_64.S:237

@jeffmahoney
Member

The goal was to remove context_switch and __schedule from the trace since it's just noise. Since it still appears in 'info threads' it's of dubious value and we can probably just drop it.

@prakashsurya
Author

It looks like that command isn't quite right when running on a live system with our new crash-kcore.sh script:

(gdb) info threads                                                                                                                                        
  Id   Target Id                  Frame                                                                                                                   
  2    process 1 "swapper/0"      0x0000000000000000 in __UNIQUE_ID_license151 ()                                                                         
  3    LWP 1 "systemd"            context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831                                                      
  4    LWP 2 "kthreadd"           context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831                                                      
  5    LWP 4 "kworker/0:0H"       context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
prev=<unavailable>, rq=<optimized out>) at /build/linux-fkZVDM/linux-4.15.0/kernel/sched/core.c:2831                                                      
  6    LWP 6 "mm_percpu_wq"       context_switch (rf=<optimized out>, next=<optimized out>, warning: Couldn't find general-purpose registers in core file.
...

When I run ./crash.sh and use a kdump file, I get this:

py-crash> info threads                                                                                                                                                                          
  Id   Target Id                  Frame                                                                                                                                                         
  1    pid 0 "swapper/0"          context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  2    pid 1 "systemd"            context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  3    pid 2 "kthreadd"           context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  4    pid 3 "kworker/0:0"        context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
  5    pid 4 "kworker/0:0H"       context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-6ZmFRN/linux-4.15.0/kernel/sched/core.c:2831
...

I'm guessing this is because I couldn't figure out how to override the fetch_registers function of the "core" target.

@jeffmahoney
Member

Yeah, that's exactly it. Hooking in at setup_tasks() means that you'll cache whatever registers just happen to be there when it's run. The Python target already has the ability to just pass operations to an underlying target if it doesn't implement them itself. I'll look into what's required to stack on top of core.

We're also going to run into trouble pretty quickly with "ps" since it iterates that same task list. As tasks start up and exit, it's going to get out-of-sync pretty quickly. The whole thing is set up for the gdb inferior to not be executing. I'll have to look into that too.

Also, I'd prefer to keep just one startup script. Since "/proc/kcore" is a special name it should be easy enough to key off it. I have something worked up locally that I'm testing with. I've also changed how I load the debuginfo a bit since you grabbed a repo snapshot.

@prakashsurya
Author

prakashsurya commented May 10, 2019

@jeffmahoney First off, since I didn't say it before, I want to thank you for all of your work on this (including the "libkdumpfile" and "gdb-python" stuff, and making it all open and available)! It's a really awesome foundation for Linux kernel debugging, and opens the door to a lot of cool stuff with a relatively minimal amount of work.

Some coworkers and I have spent the past week digging into all of this, and we're really impressed. If you're open to it, I think we'd like to discuss what we've done, and perhaps how we could help push the project forward (e.g. perhaps help get the "gdb-python" changes upstreamed).

> The Python target already has the ability to just pass operations to an underlying target if it doesn't implement them itself. I'll look into what's required to stack on top of core.

Yeah, I saw how fetch_registers gets overridden for the "kdumpfile" target. I tried to inspect gdb.current_target() when I used the "core" target, but I didn't see any way to override it in the same way. It seemed like that target method was only available when using the "kdumpfile" target. I welcome any ideas/improvements to get that working.

One potential idea I had to get around this is to create a new "kcorefile" Python target, similar to the "kdumpfile" target. I think this might be possible, since we could likely read "/proc/kcore" using "libbfd", but we'd effectively have to re-implement the "core" target (right?) in Python (and create Python bindings for "libbfd"), which seems a bit silly.

> We're also going to run into trouble pretty quickly with "ps" since it iterates that same task list. As tasks start up and exit, it's going to get out-of-sync pretty quickly. The whole thing is set up for the gdb inferior to not be executing. I'll have to look into that too.

Yea, I agree. I don't have a good idea how to address that, but I welcome any ideas/improvements you might have.

> Also, I'd prefer to keep just one startup script. Since "/proc/kcore" is a special name it should be easy enough to key off it.

I agree. I was thinking about updating crash.sh to use "/proc/kcore" automatically if a dumpfile was not specified (similar to how crash(8) works when a dumpfile isn't specified), but I just hadn't gotten around to that yet. Alternatively we could allow "/proc/kcore" to be specified as the dumpfile, and then key off of that (which sounds more like what you were thinking?). I'm open to making these changes; it might just have to wait until I have more time to work on this next week.
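
To make the two options concrete, here is a rough Python sketch of how a single startup path could pick the target. It is not code from this PR: `gdb_target_command` and `KCORE_PATH` are hypothetical names, and the sketch only builds the command string rather than talking to GDB. It assumes "target kdumpfile" for crash dumps and GDB's stock "target core" for "/proc/kcore".

```python
import posixpath

KCORE_PATH = "/proc/kcore"


def gdb_target_command(dumpfile=None):
    """Return the GDB command that would open the requested memory image.

    With no dumpfile, fall back to the live kernel, similar to crash(8).
    """
    path = posixpath.normpath(dumpfile) if dumpfile else KCORE_PATH
    if path == KCORE_PATH:
        # Live system: key off the special name and use the "core" target.
        return "target core " + path
    # Anything else is treated as a crash dump for the kdumpfile target.
    return "target kdumpfile " + path


live_default = gdb_target_command()
live_explicit = gdb_target_command("/proc//kcore")
dump = gdb_target_command("/var/crash/vmcore")
```

Both behaviors fit in the same function: omitting the dumpfile and naming "/proc/kcore" explicitly end up on the same branch.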

Thanks for giving this a look so quickly. I don't really consider this "done", but I did want to open the PR to start the discussion and get some feedback on it. Even with the problems that still need to be worked through, this is SOOOO much better and more powerful/extensible/etc than crash(8), so thanks again! :)

@jeffmahoney
Member

If you have GDB developers on staff, that would certainly help. We have Tom de Vries on our toolchain team and he's offered to help when I think the patches are ready to go as well.

I looked into what it would take to allow a small target on top of regular "core" and it's actually not too difficult. GDB already allows stacking targets -- like core or kdumpfile is stacked on exec. Core is a process strata target and I've written py-target to be at the thread strata, which is above it. This means that we can stack a python target over "core" as long as I remove some of the assumptions that I made initially -- like wanting to clear away other targets. I've got a mostly working kcore python target now but I need to sort out why I'm hitting assertions in GDB when doing "info threads."
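
For readers unfamiliar with target stacking, the idea can be modeled in a few lines of Python. This is purely an illustrative sketch, not GDB's or py-target's code: the class names, the numeric strata, and the method names are all made up here. The point is only that an operation the top target does not implement falls through to the target beneath it.

```python
class Target:
    """Hypothetical model of one layer in a GDB-style target stack."""

    def __init__(self, name, stratum, beneath=None):
        self.name = name
        self.stratum = stratum
        self.beneath = beneath

    def __getattr__(self, op):
        # Only reached when this layer does not define `op`:
        # pass the operation to the next target down the stack.
        beneath = self.__dict__.get("beneath")
        if beneath is None:
            raise AttributeError(op)
        return getattr(beneath, op)


class ExecTarget(Target):
    def __init__(self):
        super().__init__("exec", stratum=1)


class CoreTarget(Target):
    """Process-stratum layer: knows how to read memory."""

    def __init__(self, beneath):
        super().__init__("core", stratum=2, beneath=beneath)

    def read_memory(self, address, length):
        return "core:read {} bytes at {:#x}".format(length, address)

    def fetch_registers(self, thread):
        return "core:no registers for thread {}".format(thread)


class PythonTarget(Target):
    """Thread-stratum layer: overrides only register fetching."""

    def __init__(self, beneath):
        super().__init__("python", stratum=3, beneath=beneath)

    def fetch_registers(self, thread):
        return "python:registers for thread {}".format(thread)


top = PythonTarget(CoreTarget(ExecTarget()))
regs = top.fetch_registers(123)   # handled by the Python layer
mem = top.read_memory(0x1000, 8)  # falls through to the core layer
```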

You're right that this PR isn't ready to land but it's been a good start for some discussion when the project really lacks any other public forum. :)

@prakashsurya
Author

Unfortunately we don’t have any GDB developers, but I think we’re open to learning what we need to know to get things moving forward (and avoid maintaining a patch stack to GDB indefinitely). So, if you need some extra help and are open to giving us some direction, we can devote some time to this.

For a little context, we’re transitioning from an illumos-based product to Linux, and the ability to efficiently do live and post-mortem kernel debugging is relatively high on our list of priorities; all other tools we’ve looked at for Linux are lacking compared to what we have on illumos.

@prakashsurya
Author

prakashsurya commented May 21, 2019

@jeffmahoney I've been working with @sdimitro, and if you have some time, we'd like to get your feedback on our latest commit to this PR. This adds a new dependency on drgn; I think it's a reasonable dependency in that it solves some of the issues of using the "core" target (more details in the commit message).

@osandov since this change is adding a dependency on your (awesome) "drgn" project, I wanted to loop you in too.

We still don't have a solid plan for solving the problem of "stale" thread information when running on a live kernel, though, so if you have some ideas about how to solve that issue, we'd love to hear them. We have some ideas (e.g. refreshing the thread list whenever "bt" is called), but we haven't tried any of them out yet, so we're not sure whether they'll be viable.
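
As a starting point for discussion only, a refresh step could begin by comparing the task list cached at startup with a freshly walked one. This Python sketch is untested against a real kernel and is not part of the PR; `diff_tasks` is a hypothetical helper and the pid lists are sample values. It also ignores pid reuse, which a real implementation would have to handle (e.g. by comparing task_struct addresses as well).

```python
def diff_tasks(cached_pids, current_pids):
    """Return (started, exited) pids between two snapshots of the task list."""
    cached = set(cached_pids)
    current = set(current_pids)
    started = sorted(current - cached)  # need new gdb threads
    exited = sorted(cached - current)   # stale gdb threads to drop
    return started, exited


# Sample snapshots: one task exited and two started since initialization.
started, exited = diff_tasks([1, 2, 4, 585], [1, 2, 4, 596, 32529])
```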

With these two changes, I think crash-kcore.sh for /proc/kcore is at parity with crash.sh for crash dumps...

Previously, without "drgn", the value printed here wouldn't change (i.e. the caching issue appears solved):

$ sudo ./crash-kcore.sh
...
(gdb) p jiffies
$1 = 4311762114
(gdb) p jiffies
$2 = 4311762272
(gdb) p jiffies
$3 = 4311762415

Also, previously, the output here would complain about not being able to obtain register values (see my prior comment):

(gdb) info threads                                                                                                                                                                              
  Id   Target Id                  Frame                                                                                                                                                         
  1    pid 0 "swapper/0"          context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  2    pid 1 "systemd"            context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  3    pid 2 "kthreadd"           context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  4    pid 4 "kworker/0:0H"       context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  5    pid 6 "mm_percpu_wq"       context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
  6    pid 7 "ksoftirqd/0"        context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
...

Printing stacks and symbols still works too (as it did before with the "core" target):

(gdb) thread 123
[Switching to thread 123 (pid 596)]
#0  context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
2831    /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c: No such file or directory.

(gdb) bt
#0  0xffffffff9099d011 in context_switch (rf=<optimized out>, next=<optimized out>, prev=<unavailable>, rq=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:2831
#1  0xffffffff9099d011 in __schedule (preempt=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:3404
#2  0xffffffff9099d64c in schedule () at /build/linux-3btXxq/linux-4.15.0/kernel/sched/core.c:3448
#3  0xffffffff909a1be1 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/time/hrtimer.c:1702
#4  0xffffffff909a1c03 in schedule_hrtimeout_range (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/kernel/time/hrtimer.c:1759
#5  0xffffffff90290a85 in poll_schedule_timeout (pwq=0xffffaa22410a7a70, state=<optimized out>, expires=<optimized out>, slack=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:243
#6  0xffffffff902914a2 in do_select (n=<optimized out>, fds=<optimized out>, end_time=0x0 <irq_stack_union>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:580
#7  0xffffffff90292237 in core_sys_select (n=<optimized out>, inp=<unavailable>, outp=<optimized out>, exp=<optimized out>, end_time=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:654
#8  0xffffffff90292437 in SYSC_select (tvp=<optimized out>, exp=<optimized out>, outp=<optimized out>, inp=<optimized out>, n=<optimized out>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:695
#9  0xffffffff90292437 in SyS_select (n=5, inp=140734162616720, outp=0, exp=<optimized out>, tvp=<unavailable>) at /build/linux-3btXxq/linux-4.15.0/fs/select.c:677
#10 0xffffffff90003af3 in do_syscall_64 (regs=<unavailable>) at /build/linux-3btXxq/linux-4.15.0/arch/x86/entry/common.c:290
#11 0xffffffff90a00081 in entry_SYSCALL_64 () at /build/linux-3btXxq/linux-4.15.0/arch/x86/entry/entry_64.S:237

(gdb) p init_uts_ns
$2 = {
  kref = {
    refcount = {
      refs = {
        counter = 6
      }
    }
  },
  name = {
    sysname = "Linux", '\000' <repeats 59 times>,
    nodename = "ps-trunk.dcenter", '\000' <repeats 48 times>,
    release = "4.15.0-50-generic", '\000' <repeats 47 times>,
    version = "#54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019", '\000' <repeats 22 times>,
    machine = "x86_64", '\000' <repeats 58 times>,
    domainname = "(none)", '\000' <repeats 58 times>
  },
  user_ns = 0xffffffff91452f80 <init_user_ns>,
  ucounts = 0x0 <irq_stack_union>,
  ns = {
    stashed = {
      counter = 0
    },
    ops = 0xffffffff90e2d460 <utsns_operations>,
    inum = 4026531838
  }
}

(gdb) p spa_namespace_avl
$1 = {
  avl_root = 0xffff8b6ce4750108,
  avl_compar = 0xffffffffc0532be0 <spa_name_compare>,
  avl_offset = 264,
  avl_numnodes = 1,
  avl_size = 8944
}

jeffmahoney and others added 18 commits May 21, 2019 17:38
This commit adds knowledge of the task flags for newer releases.

In Linux v3.14, several elements were removed from task_state_array.
In Linux v4.4, TASK_PARKED was renumbered to be in task_state_array.

This commit handles the right things and will complain if the flags
change again.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Kernel 4.2 introduced TASK_NOLOAD, which when combined with
TASK_UNINTERRUPTIBLE, produced TASK_IDLE.  This mask is used for
kernel threads, so without support for the flags, `ps' shows ?? for
kernel threads.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
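
The mask relationship described in the commit above can be written out as follows. This is a sketch, not the crash-python code: the numeric values are what I understand the kernel headers to define and should be treated as assumptions, and `is_idle` is a hypothetical helper.

```python
# Assumed values; a real tool should read them from the target kernel.
TASK_UNINTERRUPTIBLE = 0x0002
TASK_NOLOAD = 0x0400

# TASK_IDLE is the combination of the two flags, not a separate bit.
TASK_IDLE = TASK_UNINTERRUPTIBLE | TASK_NOLOAD


def is_idle(state):
    """True only when both bits of the TASK_IDLE mask are set."""
    return (state & TASK_IDLE) == TASK_IDLE
```

A state with only TASK_UNINTERRUPTIBLE set is not idle under this check, which is why a decoder that does not know TASK_NOLOAD cannot label kernel threads correctly.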
It's more user-friendly to be able to locate whether a command is
present alphabetically.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
The use of anonymous structures and unions means that things like:

struct foo {
	struct {
		int x;
	};
};

if 'x' in cls.foo_type:
	# will evaluate false

when foo.x works fine in C code.  In order to make these less painful
for subsystem modules, we add a struct_has_member helper that does
the right thing to resolve the member.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
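
The lookup described in the commit above can be sketched without GDB. The following is not the actual struct_has_member implementation: `FakeType` and `FakeField` are hypothetical stand-ins that mimic only the `fields()`, `name`, and `type` parts of a gdb type, with `name` set to None for an anonymous member. The helper recurses into anonymous members, which a flat membership test does not.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeField:
    name: Optional[str]               # None models an anonymous member
    type: Optional["FakeType"] = None


@dataclass
class FakeType:
    members: List[FakeField] = field(default_factory=list)

    def fields(self):
        return self.members


def struct_has_member(struct_type, name):
    """Return True if `name` is reachable the way C code would reach it."""
    for member in struct_type.fields():
        if member.name == name:
            return True
        # Anonymous struct/union: its members are visible from the parent.
        if member.name is None and member.type is not None:
            if struct_has_member(member.type, name):
                return True
    return False


# Models: struct foo { struct { int x; }; };
foo_type = FakeType([FakeField(None, FakeType([FakeField("x")]))])
flat_names = [member.name for member in foo_type.fields()]
```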
Kernel v4.19 moved most of mm_struct into an anonymous sub-structure.

Even though C code can access members directly, the gdb type
infrastructure reflects the actual type layout.  This means that things like
"if 'rss_stat' in cls.mm_struct_type" will return false even if the
member is present.

To cope with this, use struct_has_member instead, which does the right
things when detecting whether a struct member is present.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
With upcoming file system subsystem modules, we'll want a common way to
handle UUID decoding.  XFS uses uuid_t while btrfs uses an array of u8.

This introduces helpers into crash.util:
- decode_uuid   -- decodes the byte array
- decode_uuid_t -- decodes the uuid_t

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
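
For the byte-array case in the commit above, decoding can lean on Python's standard uuid module. This is a sketch under assumptions rather than the crash.util code: the real helper receives a gdb value, whereas this version takes any iterable of 16 integers.

```python
import uuid


def decode_uuid(raw):
    """Decode 16 raw bytes (e.g. a u8[16] array) into a uuid.UUID."""
    data = bytes(raw)
    if len(data) != 16:
        raise ValueError("a UUID is 16 bytes, got {}".format(len(data)))
    return uuid.UUID(bytes=data)


example = decode_uuid(range(16))
```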
The decode_flags helper takes a gdb.Value representing an integer
and a dictionary of int -> str that maps the powers of 2 to flag
names and produces a human-readable string describing the flags.  If
no name is found FLAG_$number is used instead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
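
A minimal sketch of the decoding described in the commit above, assuming a plain Python int in place of the gdb.Value. The "|" separator is an assumption, and so is the reading of "$number" as the bit value (it could equally be the bit position); neither is stated in the commit message.

```python
def decode_flags(value, names, separator="|"):
    """Render the set bits of `value` using a {power_of_two: name} mapping."""
    value = int(value)
    if value < 0:
        raise ValueError("flags must be non-negative")
    flags = []
    bit = 0
    while value >> bit:
        mask = 1 << bit
        if value & mask:
            # Fall back to FLAG_<bit value> when the bit has no name.
            flags.append(names.get(mask, "FLAG_{}".format(mask)))
        bit += 1
    return separator.join(flags)


sample_names = {1: "A", 2: "B", 8: "D"}  # made-up flag names
decoded = decode_flags(0b1111, sample_names)
```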
Internally, gdb treats the type loaded from a typed symbol and a type
symbol differently and wants to do the full type comparison dance.

If we use the typed symbol directly, we can use a pointer comparison.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
The list_empty method returns a boolean indicating whether a list_head
describes an empty list.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
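
The emptiness test in the commit above rests on the kernel convention that an empty list_head points at itself. The sketch below models that in plain Python; `ListHead` and `list_add` are hypothetical stand-ins, and the real method works on a gdb value rather than Python objects.

```python
class ListHead:
    """Hypothetical Python model of the kernel's struct list_head."""

    def __init__(self):
        # An initialized, empty list points at itself in both directions.
        self.next = self
        self.prev = self


def list_add(new, head):
    """Insert `new` right after `head`."""
    new.next = head.next
    new.prev = head
    head.next.prev = new
    head.next = new


def list_empty(head):
    """A list is empty when the head's next pointer refers back to the head."""
    return head.next is head


head = ListHead()
empty_before = list_empty(head)
list_add(ListHead(), head)
empty_after = list_empty(head)
```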
The cycle tests aren't passing exact_cycles=True and will loop forever.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
The tests that load files (or targets) need to tear them down so subsequent
tests don't get tripped up by them.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Now that we have DelayedAttributes everywhere, the setup code
can be converted to use it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit adds baseline ppc64 support.  It should be enough to populate
the thread list but this is an old commit that needs refreshing.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit adds some typical helpers for bitmaps:
- find_first_set_bit
- find_next_set_bit
- find_last_set_bit
- find_first_zero_bit
- find_next_zero_bit

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
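The search semantics of these helpers can be sketched in plain Python over a
list of integer words.  The conventions here are assumptions chosen for the
illustration and may differ from the commit: words are 64 bits wide, bit
positions are zero-based, and None is returned when no matching bit exists.
The real helpers operate on gdb.Value arrays.

```python
from typing import Optional, Sequence

BITS_PER_WORD = 64  # assumption: 64-bit unsigned long


def _test_bit(words: Sequence[int], pos: int) -> bool:
    """Return True if bit pos is set in the bitmap."""
    return bool((words[pos // BITS_PER_WORD] >> (pos % BITS_PER_WORD)) & 1)


def _find(words: Sequence[int], start: int, want: bool) -> Optional[int]:
    """Return the first position >= start whose bit equals want."""
    for pos in range(max(start, 0), len(words) * BITS_PER_WORD):
        if _test_bit(words, pos) == want:
            return pos
    return None


def find_first_set_bit(words: Sequence[int]) -> Optional[int]:
    return _find(words, 0, True)


def find_next_set_bit(words: Sequence[int], start: int) -> Optional[int]:
    return _find(words, start, True)


def find_first_zero_bit(words: Sequence[int]) -> Optional[int]:
    return _find(words, 0, False)


def find_next_zero_bit(words: Sequence[int], start: int) -> Optional[int]:
    return _find(words, start, False)


def find_last_set_bit(words: Sequence[int]) -> Optional[int]:
    """Scan downward from the highest position for a set bit."""
    for pos in range(len(words) * BITS_PER_WORD - 1, -1, -1):
        if _test_bit(words, pos):
            return pos
    return None
```

For the two-word bitmap [0, 0b1010], the first set bit is at position 65 and
the last at position 67.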
When I rebased crash-python-gdb to an 8.3 prerelease, I found that
targets have been converted to C++.  That necessitated a rewrite of
much of the target code, and I cleaned up some rough edges.

With the new target, we load the vmcore using a simple
'target kdumpfile /path/to/vmcore' command that can be used entirely
outside of the crash semantic code.  This means we can debug the
target more easily and use it in the testing code without having to
parse everything for every test.

This commit converts crash to use the new target but doesn't exploit it
for testing yet.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This module contains for_each_module and a new
for_each_module_section.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
The "right" thing to do is for .data..percpu to be loaded at offset 0.
Unfortunately, that only works when debuginfo is embedded in the binary.
When separate debuginfo is used, section offsets can't be specified and
gdb interprets an offset of 0 to mean "immediately after the preceding
section."  In order to make the rest of the percpu code sane, we'll
let gdb make the same assumption when embedded debuginfo is used.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Kernels prior to 2.6.30 didn't have dynamic percpu ranges.  The test cases
have also not been extended to cover the dynamic ranges.  This commit
catches DelayedAttributeError so the test cases can pass.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
One of the things that has had a serious negative effect on the quality
of crash-python is the inability to do automated testing on a broad
variety of kernels and vmcores.  Often, we have to run our tests
by hand.  More often, we get reports from new users that something
broke unexpectedly.

This commit adds real unit testing against real kernels and vmcores.
It includes a few test cases that will need to be extended further.

The policy moving forward will be that new features will require
a matching test case that passes across a variety of kernels.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
With Linux v5.1-rc1, knode_class was moved from struct device to
struct device_private.  This commit updates for_each_class_device
to use the implementation that matches the kernel.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
These two helpers will take a generic superblock or inode and
determine whether it belongs to the given file system.

It's a naive implementation that uses a string comparison.  This is
intentional so the comparison can be made without symbol
resolution that may require module loading.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
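The string-comparison approach described above can be sketched with small
stand-in classes.  Everything here is illustrative: the dataclasses mimic only
the kernel fields involved (s_type->name and i_sb), and the function names are
hypothetical rather than taken from the commit.  The point is that comparing
the type's name needs no symbol lookup for the file system's own module.

```python
from dataclasses import dataclass


@dataclass
class FileSystemType:
    name: str  # stands in for struct file_system_type.name


@dataclass
class SuperBlock:
    s_type: FileSystemType  # stands in for struct super_block.s_type


@dataclass
class Inode:
    i_sb: SuperBlock  # stands in for struct inode.i_sb


def is_fstype_super(sb: SuperBlock, name: str) -> bool:
    """Compare the superblock's file system type name against name."""
    return sb.s_type.name == name


def is_fstype_inode(inode: Inode, name: str) -> bool:
    """An inode belongs to a file system if its superblock does."""
    return is_fstype_super(inode.i_sb, name)
```

For an inode whose superblock has type name "btrfs",
is_fstype_inode(inode, "btrfs") is True and is_fstype_inode(inode, "xfs") is
False.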
The helper routines can be passed bad pointers, so document
that each can raise gdb.NotAvailableError.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit adds documentation for the mount API and makes
private some methods/functions that are meant to be internal.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
…uids

This adds helpers to:
- export the fsid and metadata uuid from btrfs file systems
- test whether a generic super block belongs to btrfs
- test whether a generic inode belongs to btrfs

We also document the APIs of existing helpers.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This adds a `btrfs' command to display some details of btrfs file systems.

Included subcommands are:
- 'list' -- list all mounted btrfs file systems, including device and uuid.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit adds a basic xfs file system module.

Included are:
- Python variables for flags
- Mappings from flags to flag names
- Decoding for xfs_bufs and inodes
- Helpers for mount flags, superblock version, and uuid
- AIL iterators including item decoding

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This adds an `xfs' command to display some details of xfs file systems.

Included subcommands are:
- 'list' -- list all mounted xfs file systems, including device and uuid
- 'show' -- show details of a single xfs file system
- 'dump-ail' -- dump contents of the AIL for one file system
- 'dump-buft' -- dump contents of the bt_delwrite_queue for one file system

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This adds a helper to pass back the requests in flight for a particular
queue (block single-queue only).

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit adds a basic `lsmod' command.

By default, it will display each module's name, core address, size, and
users.

With the -p option, it will display the percpu base and size.  With -p <n>,
it will display the percpu base for the given CPU number.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
The arch-specific part of get_stack_pointer just needs to interpret
the arch's thread_struct.  Pass it that and avoid confusion.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Every consumer of a task shouldn't need to drill down into the structure
just to get the task name, pid, etc.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This is a big commit that pulls the formatting of the output
out of the command.  The idea is that we can implement this more cleanly
by adding methods to the formatting class.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Kernel v5.1-rc1 moved the compressed config data into .rodata
using asm .globl variables to mark the bounds.

This commit updates crash.cache.syscache to handle the new variables
and cleans up the code a bit.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
The test output is littered with 'broken link' reports during tests
that are specifically testing that behavior.  We can tidy up a bit
by adding a print_broken_links option that defaults to True but
can be set to False by the test cases.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Despite being called 'get_typed_pointer', we were dereferencing the
pointer before returning it.  Also, we were refusing to take the address
of a value that wasn't already the type we were targeting, which is
silly since that would just return the object back.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit adds API documentation and static typing hints to tasks.  There
are some minor code changes to make mypy happy with the result.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
This commit is the leftover bits for typing and documentation.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Prakash Surya and others added 2 commits May 22, 2019 09:24
This change adds a new "crash-kcore.sh" script that enables the use of
this repository for debugging a "live" system via the "/proc/kcore"
interface, rather than reading from a kernel crash dump.

Current functionality includes:

 * Ability to print global variables
 * Ability to run existing crash-python commands
 * Ability to print backtraces with "bt"

Caveats:

 * Thread information is read once at startup, and never updated. As a
   result, when listing the backtraces for threads, they may not reflect
   the current state of the system; they'll reflect the state of the
   system during crash-python initialization.

 * We cannot (as far as I know) completely disable the caching done by
   GDB for the "core" target. Thus, when printing small amounts of data
   repeatedly (e.g. calling "p jiffies_64" repeatedly), the value shown
   may not reflect the current state of the system; it'll reflect the
   value when it was first read and cached.

Co-authored-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Co-authored-by: Tom Caputi <tcaputi@datto.com>
This adds a new "kcore" GDB target which uses "drgn" as the backend for
reading from "/proc/kcore"; very similar to the existing "kdump" target
which uses "kdumpfile" for reading from a crash dump.

The benefit of using this new "kcore" target as opposed to GDB's
existing "core" target is twofold:

 1. The caching done by the "core" target is not what we want when
    inspecting a live, running kernel. By moving to a new Python-based
    target, we avoid all of this; each read of memory from GDB will call
    into our target, so we have much more control.

 2. The "core" target is unable to properly fetch registers when it's
    used with "/proc/kcore". By using a Python target, we can override
    the "fetch_registers" function, and do the right thing for fetching
    registers of kernel threads. Additionally, the code to do this is
    the same for "/proc/kcore" and a kernel crash dump, so the same code
    can be used for both the new "kcore" and existing "kdump" targets.

Unfortunately though, this new "kcore" target does not solve the issue
of thread information being read and cached during startup, meaning we
still do not have "live" thread information when using this new target.

See also: https://github.com/osandov/drgn

Co-authored-by: Prakash Surya <prakash.surya@delphix.com>
@jeffmahoney jeffmahoney force-pushed the next branch 7 times, most recently from 89aca9c to a1171f9 Compare May 23, 2019 21:05