Discussion:
debug problems on ppc 83xx target due to changed struct task_struct
Holger Brunck
2016-08-12 14:50:38 UTC
Permalink
Hi all,
we are currently trying to update from kernel 3.10 to latest stable kernel 4.4
and we encounter the following problem.

When I try to debug our multithreaded userspace application with gdb I get
stuck when trying to single step code.

Due to the fact that our ppc 8360 target is mainlined (kmeter1) I was able to
bisect the kernel to check when this problem was introduced. It turns out to be
introduced with commit:

commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a
Author: Dave Hansen <***@sr71.net>
Date: Fri Jul 17 12:28:11 2015 +0200

x86/fpu, sched: Dynamically allocate 'struct fpu'

This patch moves the definition of "struct thread_struct thread;" from the
middle of the "struct task_struct" to the end (include/linux/sched.h). When I
move back the definition to the original place and recompile, everything works
fine. I double-checked this with the latest kernel 4.7 and I have the same
behaviour.

I also tried a different toolchain (I got poky/jethro with gcc 5.2) and tried a
eldk 5.4 with gcc 4.7, as I am not sure if this is really a kernel problem or
related to a gdb or gcc. But I still have the same behaviour with a different
toolchain.

Does anyone have an idea why the change in sched.h break my debug usecase?
Anyone out here who is debugging ppc83xx targets flawlessly with a recent kernel?

Any help is apreciated.

Best regards
Holger Brunck
Dave Hansen
2016-08-12 15:14:46 UTC
Permalink
Post by Holger Brunck
When I try to debug our multithreaded userspace application with gdb I get
stuck when trying to single step code.
Can you clarify "stuck"? Like the instructions don't advance? Have you
been able to find a root cause for this?
Post by Holger Brunck
Does anyone have an idea why the change in sched.h break my debug
usecase? Anyone out here who is debugging ppc83xx targets flawlessly
with a recent kernel?
Thanks for going to the trouble of bisecting this, btw!

I'd _suspect_ something very specific to your platform since this
doesn't appear to affect even other ppc variants.

I wonder if making it cross a page boundary from some other structure
causes this, or moving it relative to something else. Could you try
moving it to a few more places, or padding it by, say PAGE_SIZE on
either side makes a difference?

Is there some assembly involved in your single-stepping, or some other
code that assumes relative offsets between two pieces of 'task_struct'?
Holger Brunck
2016-08-12 15:47:37 UTC
Permalink
Thanks for the quick answer!
Post by Dave Hansen
Post by Holger Brunck
When I try to debug our multithreaded userspace application with gdb I get
stuck when trying to single step code.
Can you clarify "stuck"? Like the instructions don't advance? Have you
been able to find a root cause for this?
the behaviour is slightly different on the kernel versions. So my setup is a
remote debug session via gdbserver.

After connecting to the gdbserver I set a break point and start to run my
program. When hitting the breakpoint I try to single step. With stuck I mean
that the connection to the gdbserver is broken and I can't control my debug
session anymore while the application is not continuing.

On Kernel 4.2 I got additionally the following dump in my serial terminal:

------------[ cut here ]------------
WARNING: at
/opt/keymile/ws_root/git_repositories/prod/keyne/plat/kernel/gpl/kernel/sched/core.c:1975
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.2.0-00003-g0478a57 #10
task: c04213d0 ti: c0434000 task.ti: c0434000
NIP: c003c4ac LR: c005d7f0 CTR: c005d7c8
REGS: c0435ce0 TRAP: 0700 Not tainted (4.2.0-00003-g0478a57)
MSR: 00021032 <ME,IR,DR,RI> CR: 22044228 XER: 20000000

GPR00: c005dfd8 c0435d90 c04213d0 cfba7a70 c042624c 00000000 00000001 00000000
GPR08: 00000001 00000001 00000007 ffffffff 42044228 eec349c0 00000000 00000000
GPR16: 0fe75f34 c0434000 0000000a c005d7c8 00000001 0000000a c0430000 c042624c
GPR24: 7ffc66b5 7ffc66b5 00000001 c0434000 0000000a c0426240 cfb81e90 c04261e0
NIP [c003c4ac] wake_up_process+0x10/0x20
LR [c005d7f0] hrtimer_wakeup+0x28/0x44
Call Trace:
[c0435d90] [c0426240] 0xc0426240 (unreliable)
[c0435da0] [c005dfd8] __hrtimer_run_queues.constprop.7+0x114/0x214
[c0435df0] [c005e334] hrtimer_interrupt+0xb8/0x29c
[c0435e40] [c0009c80] __timer_interrupt+0xb8/0x1c4
[c0435e60] [c000a03c] timer_interrupt+0x8c/0xb8
[c0435e90] [c000ece4] ret_from_except+0x0/0x14
--- interrupt: 901 at arch_cpu_idle+0x24/0x6c
LR = arch_cpu_idle+0x24/0x6c
[c0435f50] [c0434000] 0xc0434000 (unreliable)
[c0435f60] [c0044cc0] cpu_startup_entry+0x138/0x1cc
[c0435fb0] [c03fdde0] start_kernel+0x32c/0x340
[c0435ff0] [00003438] 0x3438


This trace is missing when I try the same with latest kernel 4.7. But the
behaviour is similar. The board is still reachable via telnet but I need to kill
the gdbserver session manually to get control over the initial serial terminal
again. When I move the mentioned line of code everything works fine.
Post by Dave Hansen
Post by Holger Brunck
Does anyone have an idea why the change in sched.h break my debug
usecase? Anyone out here who is debugging ppc83xx targets flawlessly
with a recent kernel?
Thanks for going to the trouble of bisecting this, btw!
I'd _suspect_ something very specific to your platform since this
doesn't appear to affect even other ppc variants.
yeah I also think this. I did the same test on an embedded ARM target and it
works fine, so it seems to be somehow related to ppc 83xx which is a 32-bit
target. And what we also need is multithreading and/or c++ code. I did check
with some simple code and single stepping works fine.

It might also be that your code change simply exposes an error in the gdb/g++
environment.
Post by Dave Hansen
I wonder if making it cross a page boundary from some other structure
causes this, or moving it relative to something else. Could you try
moving it to a few more places, or padding it by, say PAGE_SIZE on
either side makes a difference?
yes I can do some more tests at the beginning of the next week. Moving this
definition within the structure is a good idea.
Post by Dave Hansen
Is there some assembly involved in your single-stepping, or some other
code that assumes relative offsets between two pieces of 'task_struct'?
no. At least not in the code we have written. Not sure what the related g++
libraries are doing.

Regards
Holger
Dave Hansen
2016-08-12 16:09:04 UTC
Permalink
Post by Holger Brunck
Post by Dave Hansen
Post by Holger Brunck
When I try to debug our multithreaded userspace application with gdb I get
stuck when trying to single step code.
Can you clarify "stuck"? Like the instructions don't advance? Have you
been able to find a root cause for this?
the behaviour is slightly different on the kernel versions. So my setup is a
remote debug session via gdbserver.
After connecting to the gdbserver I set a break point and start to run my
program. When hitting the breakpoint I try to single step. With stuck I mean
that the connection to the gdbserver is broken and I can't control my debug
session anymore while the application is not continuing.
Could you try debugging locally with gdb? It would be nice to take all
the stuff involved with remote debugging out of the picture.

Have you tried turning on a bunch of kernel debugging (SLAB/SLUB
debugging, pagealloc debug, lockdep, etc...)? If something is getting
corrupted, those tend to catch it.
Post by Holger Brunck
------------[ cut here ]------------
WARNING: at
/opt/keymile/ws_root/git_repositories/prod/keyne/plat/kernel/gpl/kernel/sched/core.c:1975
CPU: 0 PID: 0 Comm: swapper Not tainted 4.2.0-00003-g0478a57 #10
task: c04213d0 ti: c0434000 task.ti: c0434000
NIP: c003c4ac LR: c005d7f0 CTR: c005d7c8
REGS: c0435ce0 TRAP: 0700 Not tainted (4.2.0-00003-g0478a57)
MSR: 00021032 <ME,IR,DR,RI> CR: 22044228 XER: 20000000
GPR00: c005dfd8 c0435d90 c04213d0 cfba7a70 c042624c 00000000 00000001 00000000
GPR08: 00000001 00000001 00000007 ffffffff 42044228 eec349c0 00000000 00000000
GPR16: 0fe75f34 c0434000 0000000a c005d7c8 00000001 0000000a c0430000 c042624c
GPR24: 7ffc66b5 7ffc66b5 00000001 c0434000 0000000a c0426240 cfb81e90 c04261e0
NIP [c003c4ac] wake_up_process+0x10/0x20
LR [c005d7f0] hrtimer_wakeup+0x28/0x44
[c0435d90] [c0426240] 0xc0426240 (unreliable)
[c0435da0] [c005dfd8] __hrtimer_run_queues.constprop.7+0x114/0x214
[c0435df0] [c005e334] hrtimer_interrupt+0xb8/0x29c
[c0435e40] [c0009c80] __timer_interrupt+0xb8/0x1c4
[c0435e60] [c000a03c] timer_interrupt+0x8c/0xb8
[c0435e90] [c000ece4] ret_from_except+0x0/0x14
--- interrupt: 901 at arch_cpu_idle+0x24/0x6c
LR = arch_cpu_idle+0x24/0x6c
[c0435f50] [c0434000] 0xc0434000 (unreliable)
[c0435f60] [c0044cc0] cpu_startup_entry+0x138/0x1cc
[c0435fb0] [c03fdde0] start_kernel+0x32c/0x340
[c0435ff0] [00003438] 0x3438
This trace is missing when I try the same with latest kernel 4.7. But the
behaviour is similar. The board is still reachable via telnet but I need to kill
the gdbserver session manually to get control over the initial serial terminal
again. When I move the mentioned line of code everything works fine.
I think that warning was just a false positive. It got removed:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=119d6f6a3be8b

Is the process still alive at the point that the remote debugger stops
responding? What is it doing at that point?
Holger Brunck
2016-08-15 14:35:47 UTC
Permalink
Post by Dave Hansen
Post by Holger Brunck
Post by Dave Hansen
Post by Holger Brunck
When I try to debug our multithreaded userspace application with gdb I get
stuck when trying to single step code.
Can you clarify "stuck"? Like the instructions don't advance? Have you
been able to find a root cause for this?
the behaviour is slightly different on the kernel versions. So my setup is a
remote debug session via gdbserver.
After connecting to the gdbserver I set a break point and start to run my
program. When hitting the breakpoint I try to single step. With stuck I mean
that the connection to the gdbserver is broken and I can't control my debug
session anymore while the application is not continuing.
Could you try debugging locally with gdb? It would be nice to take all
the stuff involved with remote debugging out of the picture.
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.

I was also investigating where I have to move the line in the struct task_struct
and it turns out to be like this (diff to 4.7 kernel):

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;

+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;

@@ -1919,7 +1921,6 @@ struct task_struct {
struct task_struct *oom_reaper_list;
#endif
/* CPU-specific state of this task */
- struct thread_struct thread;
/*

So it's in the area where some signal information are stored, which makes sense
because this is highly used in case of gdb debugging.
Post by Dave Hansen
Have you tried turning on a bunch of kernel debugging (SLAB/SLUB
debugging, pagealloc debug, lockdep, etc...)? If something is getting
corrupted, those tend to catch it.
I switched on some memory debugging features but didn't get suspicious output.
To make the situation even more weird after enabling FTRACE in the kernel to
trace some signal code the error disappeared.
Post by Dave Hansen
Is the process still alive at the point that the remote debugger stops
responding? What is it doing at that point?
the process is still alive. The state of the process, it's threads and the
gdbserver is like this:

Bad case after a single step:
73 73 TS - 0 19 0 0.3 S sigsuspend gdbserver
74 74 TS - 0 19 0 0.0 tl+ ptrace_stop infra_pbec83xx_
74 77 IDL 0 - 19 0 0.0 tl+ ptrace_stop TR_Task
74 78 IDL 0 - 19 0 0.0 tl+ ptrace_stop TR_Timeout
74 79 TS - 0 19 0 0.0 tl+ poll_schedule_ timed_msg
74 80 IDL 0 - 19 0 0.0 tl+ ptrace_stop stimuli
74 81 TS - -5 24 0 0.0 t<l+ ptrace_stop timer0Dflt
74 82 TS - -19 38 0 0.0 t<l+ futex_wait_que timerUpd0
74 83 TS - -19 38 0 0.0 t<l+ timerfd_read timerClk
74 84 TS - -19 38 0 0.0 t<l+ ptrace_stop b/beatWDogRefr

Good case after a single step:
76 76 TS - 0 19 0 4.0 S poll_schedule_ gdbserver
77 77 TS - 0 19 0 0.0 tl ptrace_stop infra_pbec83xx_
77 84 IDL 0 - 19 0 0.0 tl ptrace_stop TR_Task
77 85 IDL 0 - 19 0 0.0 tl ptrace_stop TR_Timeout
77 86 TS - 0 19 0 0.0 tl ptrace_stop timed_msg
77 87 IDL 0 - 19 0 0.0 tl ptrace_stop stimuli
77 88 TS - -5 24 0 0.0 t<l ptrace_stop timer0Dflt
77 89 TS - -19 38 0 0.0 t<l ptrace_stop timerUpd0
77 90 TS - -19 38 0 0.0 t<l ptrace_stop timerClk
77 91 TS - -19 38 0 0.0 t<l ptrace_stop b/beatWDogRefr

So in the error case only some threads are at ptrace_stop, while all of them
should be after a single step with the gdb. So it's somewhere in the signal
handling between kernel and gdbserver.

Best regards
Holger Brunck
Dave Hansen
2016-08-15 16:19:58 UTC
Permalink
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!

So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?

I went grepping around for a bit.

Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
(thread struct) math happening in here, which confuses me:

.globl ret_from_debug_exc
ret_from_debug_exc:
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)

But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.

My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
I don't see any references to 'blocked' or 'real_blocked' in assembly
though. You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
christophe leroy
2016-08-16 17:27:35 UTC
Permalink
Post by Dave Hansen
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
.globl ret_from_debug_exc
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
I don't see any references to 'blocked' or 'real_blocked' in assembly
though. You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
Just to let you know, I'm not sure it is the same issue, but I also get
my 8xx target stuck when I try to use gdbserver.

If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.

If I debug a big app, it gets stuck soon after the start of debug: I set
a bpoint at main(), do a 'continue', get breaked at main(), do some
steps with 'next' then it gets stuck.

I have tried moving the struct thread_struct thread but it has no impact.

I have tried to run gdbserver with 'perf top' running on another
console, and then 'perf top' stops updating itself. The quick keys are
still working, but the data is not updated anymore and the stats on top
all goes to 0 (# irq/sec etc...)

Christophe

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus
Dave Hansen
2016-08-16 17:36:15 UTC
Permalink
Post by christophe leroy
If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.
This one is a pretty common symptom when the kernel crashes holding a
per-process lock of some kind, probably most commonly mmap_sem.

Do you have a serial console? Can you do a sysrq-t on it to get stack
dumps of the hung processes and ps?

It's time to turn on all the debugging options we can find. Can you
build a kernel with all this stuff enabled (=y on all of these):

CONFIG_SLUB_DEBUG
CONFIG_DEBUG_VM
CONFIG_DEBUG_PAGEALLOC
CONFIG_PAGE_POISONING
CONFIG_SLUB_DEBUG_ON
CONFIG_KASAN
CONFIG_*LOCKUP_DETECTOR
CONFIG_DETECT_HUNG_TASK
CONFIG_SCHED_DEBUG
CONFIG_DEBUG_PREEMPT
CONFIG_MAGIC_SYSRQ

And all of the spinlock/lockdep stuff enabled too:

CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_STACKTRACE=y
CONFIG_DEBUG_LIST=y

Let's hope that these turn something up.
Christophe Leroy
2016-08-17 08:22:06 UTC
Permalink
Post by Dave Hansen
Post by christophe leroy
If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.
This one is a pretty common symptom when the kernel crashes holding a
per-process lock of some kind, probably most commonly mmap_sem.
Do you have a serial console? Can you do a sysrq-t on it to get stack
dumps of the hung processes and ps?
It's time to turn on all the debugging options we can find. Can you
CONFIG_SLUB_DEBUG
CONFIG_DEBUG_VM
CONFIG_DEBUG_PAGEALLOC
CONFIG_PAGE_POISONING
CONFIG_SLUB_DEBUG_ON
CONFIG_KASAN
CONFIG_*LOCKUP_DETECTOR
CONFIG_DETECT_HUNG_TASK
CONFIG_SCHED_DEBUG
CONFIG_DEBUG_PREEMPT
CONFIG_MAGIC_SYSRQ
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_STACKTRACE=y
CONFIG_DEBUG_LIST=y
Let's hope that these turn something up.
Looks like all those options don't turn anything up.

See below what I get with Sysrq. It looks like the time is freezed:
* Log date remains the same for each line
* Timers dump is identical the two times
* The running task is stuck in udelay()

[ 415.307239] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c)
show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f)
kill-all-tasks(i) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o)
show-registers(p) show-all-timers(q) sync(s) show-task-states(t)
unmount(u) show-blocked-tasks(w)

[ 415.307239] sysrq: SysRq : Show Regs
[ 415.307239] CPU: 0 PID: 281 Comm: sh Tainted: G W
4.4.14-s3k-dev-g08fc2be-svn #110
[ 415.307239] task: c620d380 ti: c7fee000 task.ti: c6214000
[ 415.307239] NIP: c0007e20 LR: c020f8a8 CTR: c020f874
[ 415.307239] REGS: c62159c0 TRAP: 0501 Tainted: G W
(4.4.14-s3k-dev-g08fc2be-svn)
[ 415.307239] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 24022884 XER: 00000000
[ 415.307239]
GPR00: c020f8a0 c6215a70 c620d380 00000008 c601ba18 c020f8a0 00000000
0000036a
GPR08: c620d7b0 00000000 cc38f55d 40324184 84022882
[ 415.307239] NIP [c0007e20] udelay+0x30/0x3c
[ 415.307239] LR [c020f8a8] nand_wait+0x34/0x12c
[ 415.307239] Call Trace:
[ 415.307239] [c6215a70] [c020f8a0] nand_wait+0x2c/0x12c (unreliable)
[ 415.307239] [c6215a90] [c020e120] nand_write_page+0xb8/0x1a0
[ 415.307239] [c6215ac0] [c020f248] nand_do_write_ops+0x1ec/0x44c
[ 415.307239] [c6215b30] [c020f5ec] nand_write+0x68/0x94
[ 415.307239] [c6215b70] [c02210e0] ubi_io_write+0x108/0x61c
[ 415.307239] [c6215bb0] [c021e10c] ubi_eba_write_leb+0x88/0x7b4
[ 415.307239] [c6215c20] [c0163bfc] ubifs_leb_write+0xac/0x144
[ 415.307239] [c6215c50] [c0164740] ubifs_wbuf_sync_nolock+0xfc/0x33c
[ 415.307239] [c6215c70] [c0158b7c] ubifs_jnl_write_inode+0xb0/0x208
[ 415.307239] [c6215cb0] [c015f764] ubifs_write_inode+0xc4/0x14c
[ 415.307239] [c6215cd0] [c015bd68] ubifs_writepage+0x1ac/0x22c
[ 415.307239] [c6215d00] [c009a428] __writepage+0x24/0x7c
[ 415.307239] [c6215d10] [c009c6e4] write_cache_pages+0x1d0/0x380
[ 415.307239] [c6215db0] [c008fe6c] __filemap_fdatawrite_range+0x64/0x88
[ 415.307239] [c6215de0] [c0090644] filemap_write_and_wait_range+0x38/0x94
[ 415.307239] [c6215e00] [c015ac3c] ubifs_fsync+0x50/0xd4
[ 415.307239] [c6215e20] [c0093bb8] generic_file_write_iter+0x200/0x2b4
[ 415.307239] [c6215e50] [c015c294] ubifs_write_iter+0x104/0x1a8
[ 415.307239] [c6215ea0] [c00cba6c] __vfs_write+0xc8/0x14c
[ 415.307239] [c6215ef0] [c00cbc80] vfs_write+0xa4/0x178
[ 415.307239] [c6215f10] [c00cbe64] SyS_write+0x4c/0xc4
[ 415.307239] [c6215f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] Instruction dump:
[ 415.307239] 3d20c05c 812975ac 7c6349d6 7d4c42e6 7d2c42e6 7d2a4850
7f834840 409d0018
[ 415.307239] 7c210b78 7d2c42e6 7d2a4850 7f834840 <419dfff0> 7c421378
4e800020 7c0802a6


[ 415.307239] sysrq: SysRq : Show Blocked State
[ 415.307239] task PC stack pid father
[ 415.307239] Sched Debug Version: v0.11, 4.4.14-s3k-dev-g08fc2be-svn #110
[ 415.307239] ktime : 415174.352520
[ 415.307239] sched_clk : 415307.239936
[ 415.307239] cpu_clk : 415307.239936
[ 415.307239] jiffies : 11517
[ 415.307239]
[ 415.307239] sysctl_sched
[ 415.307239] .sysctl_sched_latency : 6.000000
[ 415.307239] .sysctl_sched_min_granularity : 0.750000
[ 415.307239] .sysctl_sched_wakeup_granularity : 1.000000
[ 415.307239] .sysctl_sched_child_runs_first : 0
[ 415.307239] .sysctl_sched_features : 44859
[ 415.307239] .sysctl_sched_tunable_scaling : 1 (logaritmic)
[ 415.307239]
[ 415.307239] cpu#0
[ 415.307239] .nr_running : 2
[ 415.307239] .load : 2048
[ 415.307239] .nr_switches : 26656
[ 415.307239] .nr_load_updates : 0
[ 415.307239] .nr_uninterruptible : 0
[ 415.307239] .next_balance : 0.000000
[ 415.307239] .curr->pid : 281
[ 415.307239] .clock : 415307.239936
[ 415.307239] .clock_task : 415307.239936
[ 415.307239] .cpu_load[0] : 0
[ 415.307239] .cpu_load[1] : 0
[ 415.307239] .cpu_load[2] : 0
[ 415.307239] .cpu_load[3] : 0
[ 415.307239] .cpu_load[4] : 0
[ 415.307239]
[ 415.307239] cfs_rq[0]:
[ 415.307239] .exec_clock : 0.000000
[ 415.307239] .MIN_vruntime : 42490.860493
[ 415.307239] .min_vruntime : 42493.860493
[ 415.307239] .max_vruntime : 42490.860493
[ 415.307239] .spread : 0.000000
[ 415.307239] .spread0 : 0.000000
[ 415.307239] .nr_spread_over : 0
[ 415.307239] .nr_running : 2
[ 415.307239] .load : 2048
[ 415.307239]
[ 415.307239] rt_rq[0]:
[ 415.307239] .rt_nr_running : 0
[ 415.307239] .rt_throttled : 0
[ 415.307239] .rt_time : 0.000000
[ 415.307239] .rt_runtime : 950.000000
[ 415.307239]
[ 415.307239] dl_rq[0]:
[ 415.307239] .dl_nr_running : 0
[ 415.307239]
[ 415.307239] runnable tasks:
[ 415.307239] task PID tree-key switches prio
wait-time sum-exec sum-sleep
[ 415.307239]
----------------------------------------------------------------------------------------------------------
[ 415.307239] init 1 41218.246221 324 120
0.000000 7091.870720 0.000000
[ 415.307239] kthreadd 2 10041.075333 53 120
0.000000 39.044608 0.000000
[ 415.307239] ksoftirqd/0 3 42471.449421 525 120
0.000000 174.042368 0.000000
[ 415.307239] kworker/0:0 4 393.682966 10 120
0.000000 3.465472 0.000000
[ 415.307239] kworker/0:0H 5 542.229904 3 100
0.000000 0.373504 0.000000
[ 415.307239] kworker/u2:0 6 42490.860493 482 120
0.000000 682.979200 0.000000
[ 415.307239] rcu_preempt 7 42471.239885 7355 120
0.000000 2497.164544 0.000000
[ 415.307239] rcu_sched 8 16.696155 2 120
0.000000 0.232192 0.000000
[ 415.307239] rcu_bh 9 18.788058 2 120
0.000000 0.225792 0.000000
[ 415.307239] kdevtmpfs 10 9906.339623 53 120
0.000000 49.822080 0.000000
[ 415.307239] perf 11 209.966618 2 100
0.000000 0.418688 0.000000
[ 415.307239] khungtaskd 12 42271.067789 43 120
0.000000 9.401344 0.000000
[ 415.307239] writeback 13 393.600057 2 100
0.000000 0.301184 0.000000
[ 415.307239] kworker/0:1 14 42462.051405 3062 120
0.000000 1814.468864 0.000000
[ 415.307239] crypto 15 395.547701 2 100
0.000000 0.310016 0.000000
[ 415.307239] rpciod 16 547.595271 3 100
0.000000 0.346752 0.000000
[ 415.307239] kswapd0 17 1141.966520 3 120
0.000000 0.758656 0.000000
[ 415.307239] fsnotify_mark 18 1289.797239 2 120
0.000000 1.068032 0.000000
[ 415.307239] nfsiod 19 1294.974059 2 100
0.000000 0.303360 0.000000
[ 415.307239] bond0 37 4873.191411 2 100
0.000000 0.320512 0.000000
[ 415.307239] ipv6_addrconf 49 5541.654525 2 100
0.000000 0.303232 0.000000
[ 415.307239] deferwq 50 5664.433381 2 100
0.000000 0.299264 0.000000
[ 415.307239] kworker/u2:1 51 29504.315231 5 120
0.000000 0.803840 0.000000
[ 415.307239] spi32766 52 24174.314766 243 120
0.000000 44.892928 0.000000
[ 415.307239] ubi_bgt0d 63 10399.804677 11 120
0.000000 25.593344 0.000000
[ 415.307239] ubifs_bgt0_0 65 10052.319044 2 120
0.000000 7.036288 0.000000
[ 415.307239] udhcpc 171 19428.265114 1 120
0.000000 5.979520 0.000000
[ 415.307239] watchdog 188 0.654209 842 0
0.000000 3819.744512 0.000000
[ 415.307239] ntpd 208 42384.913677 419 120
0.000000 645.997952 0.000000
[ 415.307239] inetd 221 21775.101204 5 120
0.000000 39.310848 0.000000
[ 415.307239] crond 234 40244.762133 14 120
0.000000 49.742208 0.000000
[ 415.307239] snmpd 250 42386.978381 1732 120
0.000000 12158.752128 0.000000
[ 415.307239] login 263 28214.829541 46 120
0.000000 420.666240 0.000000
[ 415.307239] R sh 281 42490.860493 174 120
0.000000 538.372864 0.000000
[ 415.307239]


[ 415.307239] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c)
show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f)
kill-all-tasks(i) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o)
show-registers(p) show-all-timers(q) sync(s) show-task-states(t)
unmount(u) show-blocked-tasks(w)
[ 415.307239] sysrq: SysRq : Show clockevent devices & pending hrtimers
(no others)
[ 415.307239] Timer List Version: v0.8
[ 415.307239] HRTIMER_MAX_CLOCK_BASES: 4
[ 415.307239] now at 415174352520 nsecs
[ 415.307239]
[ 415.307239] cpu: 0
[ 415.307239] clock 0:
[ 415.307239] .base: c05a6a00
[ 415.307239] .index: 0
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get
[ 415.307239] .offset: 0 nsecs
[ 415.307239] active timers:
[ 415.307239] #0: tick_cpu_sched, tick_sched_timer, S:01
[ 415.307239] # expires at 415180000000-415180000000 nsecs [in 5647480
to 5647480 nsecs]
[ 415.307239] #1: <c6485ea8>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 415316748939-415316748939 nsecs [in
142396419 to 142396419 nsecs]
[ 415.307239] #2: <c6481a28>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 415325581283-415326561308 nsecs [in
151228763 to 152208788 nsecs]
[ 415.307239] #3: <c61fa178>, it_real_fn, S:01
[ 415.307239] # expires at 415635620681-415635620681 nsecs [in
461268161 to 461268161 nsecs]
[ 415.307239] #4: <c622dea8>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 426702410278-426702460278 nsecs [in
11528057758 to 11528107758 nsecs]
[ 415.307239] #5: <c6487a28>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 129624716464923-129624816464923 nsecs [in
129209542112403 to 129209642112403 nsecs]
[ 415.307239] clock 1:
[ 415.307239] .base: c05a6a20
[ 415.307239] .index: 1
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get_real
[ 415.307239] .offset: 1471420915047270622 nsecs
[ 415.307239] active timers:
[ 415.307239] clock 2:
[ 415.307239] .base: c05a6a40
[ 415.307239] .index: 2
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get_boottime
[ 415.307239] .offset: 0 nsecs
[ 415.307239] active timers:
[ 415.307239] clock 3:
[ 415.307239] .base: c05a6a60
[ 415.307239] .index: 3
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get_clocktai
[ 415.307239] .offset: 1471420915047270622 nsecs
[ 415.307239] active timers:
[ 415.307239] .expires_next : 415180000000 nsecs
[ 415.307239] .hres_active : 1
[ 415.307239] .nr_events : 43162
[ 415.307239] .nr_retries : 20
[ 415.307239] .nr_hangs : 0
[ 415.307239] .max_hang_time : 0
[ 415.307239] .nohz_mode : 0
[ 415.307239] .last_tick : 0 nsecs
[ 415.307239] .tick_stopped : 0
[ 415.307239] .idle_jiffies : 0
[ 415.307239] .idle_calls : 0
[ 415.307239] .idle_sleeps : 0
[ 415.307239] .idle_entrytime : 0 nsecs
[ 415.307239] .idle_waketime : 0 nsecs
[ 415.307239] .idle_exittime : 0 nsecs
[ 415.307239] .idle_sleeptime : 0 nsecs
[ 415.307239] .iowait_sleeptime: 0 nsecs
[ 415.307239] .last_jiffies : 0
[ 415.307239] .next_timer : 0
[ 415.307239] .idle_expires : 0 nsecs
[ 415.307239] jiffies: 11517
[ 415.307239]
[ 415.307239] Tick Device: mode: 1
[ 415.307239] Per CPU device: 0
[ 415.307239] Clock Event Device: decrementer
[ 415.307239] max_delta_ns: 260301049532
[ 415.307239] min_delta_ns: 1000
[ 415.307239] mult: 35433480
[ 415.307239] shift: 32
[ 415.307239] mode: 3
[ 415.307239] next_event: 415180000000 nsecs
[ 415.307239] set_next_event: decrementer_set_next_event
[ 415.307239] shutdown: decrementer_shutdown
[ 415.307239] resume: decrementer_shutdown
[ 415.307239] event_handler: hrtimer_interrupt
[ 415.307239] retries: 0
[ 415.307239]


[ 415.307239] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c)
show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f)
kill-all-tasks(i) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o)
show-registers(p) show-all-timers(q) sync(s) show-task-states(t)
unmount(u) show-blocked-tasks(w)


[ 415.307239] sysrq: SysRq : Show State
[ 415.307239] task PC stack pid father
[ 415.307239] init S 0fd6f410 5244 1 0 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c6091dc0] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c6091e80] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c6091eb0] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6091ec0] [c0018fb0] do_wait+0x1b0/0x1fc
[ 415.307239] [c6091ef0] [c0019cf4] SyS_wait4+0x74/0xd8
[ 415.307239] [c6091f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] kthreadd S 00000000 6492 2 0 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6093e00] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c6093ec0] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6093ef0] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6093f00] [c00354d0] kthreadd+0x138/0x148
[ 415.307239] [c6093f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] ksoftirqd/0 S 00000000 5644 3 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6095d70] [c0048b8c] trace_hardirqs_on_caller+0x9c/0x1a0
(unreliable)
[ 415.307239] [c6095e30] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6095e60] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6095e70] [c00387e8] smpboot_thread_fn+0x284/0x2ec
[ 415.307239] [c6095eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6095f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] kworker/0:0 S 00000000 6748 4 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6097d60] [0000000c] 0xc (unreliable)
[ 415.307239] [c6097e20] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6097e50] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6097e60] [c002ef60] worker_thread+0x1b0/0x474
[ 415.307239] [c6097eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6097f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] kworker/0:0H S 00000000 7404 5 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6099d60] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c6099e20] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6099e50] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6099e60] [c002ef60] worker_thread+0x1b0/0x474
[ 415.307239] [c6099eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6099f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] kworker/u2:0 R running 5324 6 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c609bd60] [c96e0000] 0xc96e0000 (unreliable)
[ 415.307239] [c609be20] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c609be50] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c609be60] [c002ef60] worker_thread+0x1b0/0x474
[ 415.307239] [c609beb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c609bf40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] rcu_preempt S 00000000 6540 7 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c609dd30] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c609ddf0] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c609de20] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c609de30] [c005df00] rcu_gp_kthread+0xa4/0x8d0
[ 415.307239] [c609deb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c609df40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] rcu_sched S 00000000 7356 8 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c609fd30] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c609fdf0] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c609fe20] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c609fe30] [c005df00] rcu_gp_kthread+0xa4/0x8d0
[ 415.307239] [c609feb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c609ff40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] rcu_bh S 00000000 7356 9 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c60b1d30] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c60b1df0] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c60b1e20] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c60b1e30] [c005df00] rcu_gp_kthread+0xa4/0x8d0
[ 415.307239] [c60b1eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c60b1f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] kdevtmpfs S 00000000 6348 10 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c60b3d10] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c60b3dd0] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c60b3e00] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c60b3e10] [c01f4304] devtmpfsd+0x2e0/0x2e4
[ 415.307239] [c60b3eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c60b3f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] perf S 00000000 7388 11 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c60b5d50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c60b5e10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c60b5e40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c60b5e50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c60b5eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c60b5f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] khungtaskd S 00000000 7324 12 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c60b9dd0] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c60b9e00] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c60b9e10] [c042f2a0] schedule_timeout+0xfc/0x16c
[ 415.307239] [c60b9e60] [c00788ac] watchdog+0x70/0x498
[ 415.307239] [c60b9eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c60b9f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] writeback S 00000000 7388 13 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c60bbd50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c60bbe10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c60bbe40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c60bbe50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c60bbeb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c60bbf40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] kworker/0:1 S 00000000 6332 14 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c60bdd60] [c004a590] __lock_acquire+0x314/0x1a30
(unreliable)
[ 415.307239] [c60bde20] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c60bde50] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c60bde60] [c002ef60] worker_thread+0x1b0/0x474
[ 415.307239] [c60bdeb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c60bdf40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] crypto S 00000000 7388 15 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c60bfd50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c60bfe10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c60bfe40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c60bfe50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c60bfeb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c60bff40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] rpciod S 00000000 7388 16 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6115d50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c6115e10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6115e40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6115e50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c6115eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6115f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] kswapd0 S 00000000 7068 17 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6117d10] [22000082] 0x22000082 (unreliable)
[ 415.307239] [c6117dd0] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6117e00] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6117e10] [c00a3eb4] kswapd+0x628/0x638
[ 415.307239] [c6117eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6117f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] fsnotify_mark S 00000000 6620 18 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6119d60] [00009032] 0x9032 (unreliable)
[ 415.307239] [c6119e20] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6119e50] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6119e60] [c00ffbb8] fsnotify_mark_destroy+0xfc/0x12c
[ 415.307239] [c6119eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6119f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] nfsiod S 00000000 7388 19 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c611bd50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c611be10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c611be40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c611be50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c611beb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c611bf40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] bond0 S 00000000 7388 37 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6225d50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c6225e10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6225e40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6225e50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c6225eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6225f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] ipv6_addrconf S 00000000 7388 49 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6211d50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c6211e10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6211e40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6211e50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c6211eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6211f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] deferwq S 00000000 7388 50 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c621fd50] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c621fe10] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c621fe40] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c621fe50] [c002f458] rescuer_thread+0x234/0x2c8
[ 415.307239] [c621feb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c621ff40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] kworker/u2:1 S 00000000 7404 51 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c6213d60] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c6213e20] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c6213e50] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6213e60] [c002ef60] worker_thread+0x1b0/0x474
[ 415.307239] [c6213eb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c6213f40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] spi32766 S 00000000 7404 52 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c621bd80] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c621be40] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c621be70] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c621be80] [c0035618] kthread_worker_fn+0xc8/0x124
[ 415.307239] [c621beb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c621bf40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] ubi_bgt0d S 00000000 6156 63 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c622fd70] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c622fe30] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c622fe60] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c622fe70] [c0223eac] ubi_thread+0xc8/0x188
[ 415.307239] [c622feb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c622ff40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] ubifs_bgt0_0 S 00000000 7180 65 2 0x00000800
[ 415.307239] Call Trace:
[ 415.307239] [c622bd80] [c042ff6c]
_raw_spin_unlock_irqrestore+0x48/0x98 (unreliable)
[ 415.307239] [c622be40] [c042b100] __schedule+0x350/0x660
[ 415.307239] [c622be70] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c622be80] [c016fa68] ubifs_bg_thread+0x120/0x19c
[ 415.307239] [c622beb0] [c0034de8] kthread+0xc8/0xdc
[ 415.307239] [c622bf40] [c000bd38] ret_from_kernel_thread+0x5c/0x64
[ 415.307239] udhcpc S 0fd9efa8 6268 171 1 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c64879e0] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c6487a10] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6487a20] [c042f5f4]
schedule_hrtimeout_range_clock+0xc4/0x14c
[ 415.307239] [c6487a90] [c00de47c] poll_schedule_timeout+0x44/0x7c
[ 415.307239] [c6487aa0] [c00deb7c] do_select+0x5ec/0x710
[ 415.307239] [c6487dc0] [c00deecc] core_sys_select+0x22c/0x4cc
[ 415.307239] [c6487f10] [c00df214] SyS_select+0xa8/0x124
[ 415.307239] [c6487f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] watchdog S 0fedb94c 5532 188 1 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c6485d80] [c05a4140] pidmap_lock+0x0/0x20 (unreliable)
[ 415.307239] [c6485e40] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c6485e70] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6485e80] [c042f424] do_nanosleep+0x74/0xe0
[ 415.307239] [c6485ea0] [c0063fbc] hrtimer_nanosleep+0xb0/0x144
[ 415.307239] [c6485f10] [c0064104] SyS_nanosleep+0xb4/0xe0
[ 415.307239] [c6485f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] ntpd S 0fe32fa8 5452 208 1 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c6223920] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c62239e0] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c6223a10] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6223a20] [c042f674]
schedule_hrtimeout_range_clock+0x144/0x14c
[ 415.307239] [c6223a90] [c00de47c] poll_schedule_timeout+0x44/0x7c
[ 415.307239] [c6223aa0] [c00deb7c] do_select+0x5ec/0x710
[ 415.307239] [c6223dc0] [c00deecc] core_sys_select+0x22c/0x4cc
[ 415.307239] [c6223f10] [c00df214] SyS_select+0xa8/0x124
[ 415.307239] [c6223f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] inetd S 0fd9efa8 6172 221 1 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c6483920] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c64839e0] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c6483a10] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6483a20] [c042f674]
schedule_hrtimeout_range_clock+0x144/0x14c
[ 415.307239] [c6483a90] [c00de47c] poll_schedule_timeout+0x44/0x7c
[ 415.307239] [c6483aa0] [c00deb7c] do_select+0x5ec/0x710
[ 415.307239] [c6483dc0] [c00deecc] core_sys_select+0x22c/0x4cc
[ 415.307239] [c6483f10] [c00df214] SyS_select+0xa8/0x124
[ 415.307239] [c6483f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] crond S 0fd6f94c 6156 234 1 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c622dd80] [c004a590] __lock_acquire+0x314/0x1a30
(unreliable)
[ 415.307239] [c622de40] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c622de70] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c622de80] [c042f424] do_nanosleep+0x74/0xe0
[ 415.307239] [c622dea0] [c0063fbc] hrtimer_nanosleep+0xb0/0x144
[ 415.307239] [c622df10] [c0064104] SyS_nanosleep+0xb4/0xe0
[ 415.307239] [c622df40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] snmpd S 0fa3afa8 5516 250 1 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c6481920] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c64819e0] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c6481a10] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c6481a20] [c042f5f4]
schedule_hrtimeout_range_clock+0xc4/0x14c
[ 415.307239] [c6481a90] [c00de47c] poll_schedule_timeout+0x44/0x7c
[ 415.307239] [c6481aa0] [c00deb7c] do_select+0x5ec/0x710
[ 415.307239] [c6481dc0] [c00deecc] core_sys_select+0x22c/0x4cc
[ 415.307239] [c6481f10] [c00df214] SyS_select+0xa8/0x124
[ 415.307239] [c6481f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] login S 0fd6f410 5404 263 1 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c611ddc0] [c05dd3b0] list_entries+0x0/0x120000 (unreliable)
[ 415.307239] [c611de80] [c042b07c] __schedule+0x2cc/0x660
[ 415.307239] [c611deb0] [c042b4a0] schedule+0x2c/0x54
[ 415.307239] [c611dec0] [c0018fb0] do_wait+0x1b0/0x1fc
[ 415.307239] [c611def0] [c0019cf4] SyS_wait4+0x74/0xd8
[ 415.307239] [c611df40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] sh R running 5580 281 263 0x00000000
[ 415.307239] Call Trace:
[ 415.307239] [c7fefe60] [c003da18] show_state_filter+0xc8/0x1c8
(unreliable)
[ 415.307239] [c7fefe90] [c01d901c] sysrq_handle_showstate+0x14/0x24
[ 415.307239] [c7fefea0] [c01d94cc] __handle_sysrq+0x180/0x268
[ 415.307239] [c7fefed0] [c01e2980] cpm_uart_int_rx+0x1f8/0x33c
[ 415.307239] [c7feff20] [c01e2e70] cpm_uart_int+0x104/0x134
[ 415.307239] [c7feff40] [c0054598] handle_irq_event_percpu+0x5c/0x158
[ 415.307239] [c7feff70] [c00546d8] handle_irq_event+0x44/0x78
[ 415.307239] [c7feff90] [c00579b8] handle_fasteoi_irq+0xd4/0x210
[ 415.307239] [c7feffb0] [c0053e48] generic_handle_irq+0x30/0x4c
[ 415.307239] [c7feffc0] [c00132dc] cpm_cascade+0x28/0x48
[ 415.307239] [c7feffd0] [c0053e48] generic_handle_irq+0x30/0x4c
[ 415.307239] [c7feffe0] [c0003ea0] __do_irq+0x3c/0x88
[ 415.307239] [c7fefff0] [c000b394] call_do_irq+0x24/0x3c
[ 415.307239] [c6215980] [c0003f74] do_IRQ+0x88/0xe0
[ 415.307239] [c62159b0] [c000c314] ret_from_except+0x0/0x14
[ 415.307239] [c6215a70] [c020f8a0] nand_wait+0x2c/0x12c
[ 415.307239] [c6215a90] [c020e120] nand_write_page+0xb8/0x1a0
[ 415.307239] [c6215ac0] [c020f248] nand_do_write_ops+0x1ec/0x44c
[ 415.307239] [c6215b30] [c020f5ec] nand_write+0x68/0x94
[ 415.307239] [c6215b70] [c02210e0] ubi_io_write+0x108/0x61c
[ 415.307239] [c6215bb0] [c021e10c] ubi_eba_write_leb+0x88/0x7b4
[ 415.307239] [c6215c20] [c0163bfc] ubifs_leb_write+0xac/0x144
[ 415.307239] [c6215c50] [c0164740] ubifs_wbuf_sync_nolock+0xfc/0x33c
[ 415.307239] [c6215c70] [c0158b7c] ubifs_jnl_write_inode+0xb0/0x208
[ 415.307239] [c6215cb0] [c015f764] ubifs_write_inode+0xc4/0x14c
[ 415.307239] [c6215cd0] [c015bd68] ubifs_writepage+0x1ac/0x22c
[ 415.307239] [c6215d00] [c009a428] __writepage+0x24/0x7c
[ 415.307239] [c6215d10] [c009c6e4] write_cache_pages+0x1d0/0x380
[ 415.307239] [c6215db0] [c008fe6c] __filemap_fdatawrite_range+0x64/0x88
[ 415.307239] [c6215de0] [c0090644] filemap_write_and_wait_range+0x38/0x94
[ 415.307239] [c6215e00] [c015ac3c] ubifs_fsync+0x50/0xd4
[ 415.307239] [c6215e20] [c0093bb8] generic_file_write_iter+0x200/0x2b4
[ 415.307239] [c6215e50] [c015c294] ubifs_write_iter+0x104/0x1a8
[ 415.307239] [c6215ea0] [c00cba6c] __vfs_write+0xc8/0x14c
[ 415.307239] [c6215ef0] [c00cbc80] vfs_write+0xa4/0x178
[ 415.307239] [c6215f10] [c00cbe64] SyS_write+0x4c/0xc4
[ 415.307239] [c6215f40] [c000bbd4] ret_from_syscall+0x0/0x38
[ 415.307239] Sched Debug Version: v0.11, 4.4.14-s3k-dev-g08fc2be-svn #110
[ 415.307239] ktime : 415174.352520
[ 415.307239] sched_clk : 415307.239936
[ 415.307239] cpu_clk : 415307.239936
[ 415.307239] jiffies : 11517
[ 415.307239]
[ 415.307239] sysctl_sched
[ 415.307239] .sysctl_sched_latency : 6.000000
[ 415.307239] .sysctl_sched_min_granularity : 0.750000
[ 415.307239] .sysctl_sched_wakeup_granularity : 1.000000
[ 415.307239] .sysctl_sched_child_runs_first : 0
[ 415.307239] .sysctl_sched_features : 44859
[ 415.307239] .sysctl_sched_tunable_scaling : 1 (logaritmic)
[ 415.307239]
[ 415.307239] cpu#0
[ 415.307239] .nr_running : 2
[ 415.307239] .load : 2048
[ 415.307239] .nr_switches : 26656
[ 415.307239] .nr_load_updates : 0
[ 415.307239] .nr_uninterruptible : 0
[ 415.307239] .next_balance : 0.000000
[ 415.307239] .curr->pid : 281
[ 415.307239] .clock : 415307.239936
[ 415.307239] .clock_task : 415307.239936
[ 415.307239] .cpu_load[0] : 0
[ 415.307239] .cpu_load[1] : 0
[ 415.307239] .cpu_load[2] : 0
[ 415.307239] .cpu_load[3] : 0
[ 415.307239] .cpu_load[4] : 0
[ 415.307239]
[ 415.307239] cfs_rq[0]:
[ 415.307239] .exec_clock : 0.000000
[ 415.307239] .MIN_vruntime : 42490.860493
[ 415.307239] .min_vruntime : 42493.860493
[ 415.307239] .max_vruntime : 42490.860493
[ 415.307239] .spread : 0.000000
[ 415.307239] .spread0 : 0.000000
[ 415.307239] .nr_spread_over : 0
[ 415.307239] .nr_running : 2
[ 415.307239] .load : 2048
[ 415.307239]
[ 415.307239] rt_rq[0]:
[ 415.307239] .rt_nr_running : 0
[ 415.307239] .rt_throttled : 0
[ 415.307239] .rt_time : 0.000000
[ 415.307239] .rt_runtime : 950.000000
[ 415.307239]
[ 415.307239] dl_rq[0]:
[ 415.307239] .dl_nr_running : 0
[ 415.307239]
[ 415.307239] runnable tasks:
[ 415.307239] task PID tree-key switches prio
wait-time sum-exec sum-sleep
[ 415.307239]
----------------------------------------------------------------------------------------------------------
[ 415.307239] init 1 41218.246221 324 120
0.000000 7091.870720 0.000000
[ 415.307239] kthreadd 2 10041.075333 53 120
0.000000 39.044608 0.000000
[ 415.307239] ksoftirqd/0 3 42471.449421 525 120
0.000000 174.042368 0.000000
[ 415.307239] kworker/0:0 4 393.682966 10 120
0.000000 3.465472 0.000000
[ 415.307239] kworker/0:0H 5 542.229904 3 100
0.000000 0.373504 0.000000
[ 415.307239] kworker/u2:0 6 42490.860493 482 120
0.000000 682.979200 0.000000
[ 415.307239] rcu_preempt 7 42471.239885 7355 120
0.000000 2497.164544 0.000000
[ 415.307239] rcu_sched 8 16.696155 2 120
0.000000 0.232192 0.000000
[ 415.307239] rcu_bh 9 18.788058 2 120
0.000000 0.225792 0.000000
[ 415.307239] kdevtmpfs 10 9906.339623 53 120
0.000000 49.822080 0.000000
[ 415.307239] perf 11 209.966618 2 100
0.000000 0.418688 0.000000
[ 415.307239] khungtaskd 12 42271.067789 43 120
0.000000 9.401344 0.000000
[ 415.307239] writeback 13 393.600057 2 100
0.000000 0.301184 0.000000
[ 415.307239] kworker/0:1 14 42462.051405 3062 120
0.000000 1814.468864 0.000000
[ 415.307239] crypto 15 395.547701 2 100
0.000000 0.310016 0.000000
[ 415.307239] rpciod 16 547.595271 3 100
0.000000 0.346752 0.000000
[ 415.307239] kswapd0 17 1141.966520 3 120
0.000000 0.758656 0.000000
[ 415.307239] fsnotify_mark 18 1289.797239 2 120
0.000000 1.068032 0.000000
[ 415.307239] nfsiod 19 1294.974059 2 100
0.000000 0.303360 0.000000
[ 415.307239] bond0 37 4873.191411 2 100
0.000000 0.320512 0.000000
[ 415.307239] ipv6_addrconf 49 5541.654525 2 100
0.000000 0.303232 0.000000
[ 415.307239] deferwq 50 5664.433381 2 100
0.000000 0.299264 0.000000
[ 415.307239] kworker/u2:1 51 29504.315231 5 120
0.000000 0.803840 0.000000
[ 415.307239] spi32766 52 24174.314766 243 120
0.000000 44.892928 0.000000
[ 415.307239] ubi_bgt0d 63 10399.804677 11 120
0.000000 25.593344 0.000000
[ 415.307239] ubifs_bgt0_0 65 10052.319044 2 120
0.000000 7.036288 0.000000
[ 415.307239] udhcpc 171 19428.265114 1 120
0.000000 5.979520 0.000000
[ 415.307239] watchdog 188 0.654209 842 0
0.000000 3819.744512 0.000000
[ 415.307239] ntpd 208 42384.913677 419 120
0.000000 645.997952 0.000000
[ 415.307239] inetd 221 21775.101204 5 120
0.000000 39.310848 0.000000
[ 415.307239] crond 234 40244.762133 14 120
0.000000 49.742208 0.000000
[ 415.307239] snmpd 250 42386.978381 1732 120
0.000000 12158.752128 0.000000
[ 415.307239] login 263 28214.829541 46 120
0.000000 420.666240 0.000000
[ 415.307239] R sh 281 42490.860493 174 120
0.000000 538.372864 0.000000
[ 415.307239]
[ 415.307239]
[ 415.307239] Showing all locks held in the system:
[ 415.307239] 7 locks held by sh/281:
[ 415.307239] #0: (sb_writers#5){.+.+.+}, at: [<c00ce934>]
__sb_start_write+0xac/0xc4
[ 415.307239] #1: (&ui->ui_mutex){+.+...}, at: [<c015f704>]
ubifs_write_inode+0x64/0x14c
[ 415.307239] #2: (&c->commit_sem){++++..}, at: [<c0157bac>]
make_reservation+0x5c/0x408
[ 415.307239] #3: (&wbuf->io_mutex/1){+.+...}, at: [<c0157bd4>]
make_reservation+0x84/0x408
[ 415.307239] #4: (&le->mutex){++++..}, at: [<c021d8d0>]
leb_write_lock+0x24/0x38
[ 415.307239] #5: (rcu_read_lock){......}, at: [<c01d934c>]
__handle_sysrq+0x0/0x268
[ 415.307239] #6: (tasklist_lock){.+.?..}, at: [<c004a10c>]
debug_show_all_locks+0x3c/0x1ac
[ 415.307239]
[ 415.307239] =============================================
[ 415.307239]
[ 415.307239] Showing busy workqueues and worker pools:
[ 415.307239] workqueue events_unbound: flags=0x2
[ 415.307239] pwq 2: cpus=0 flags=0x4 nice=0 active=1/512
[ 415.307239] pending: flush_to_ldisc


[ 415.307239] sysrq: SysRq : Show clockevent devices & pending hrtimers
(no others)
[ 415.307239] Timer List Version: v0.8
[ 415.307239] HRTIMER_MAX_CLOCK_BASES: 4
[ 415.307239] now at 415174352520 nsecs
[ 415.307239]
[ 415.307239] cpu: 0
[ 415.307239] clock 0:
[ 415.307239] .base: c05a6a00
[ 415.307239] .index: 0
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get
[ 415.307239] .offset: 0 nsecs
[ 415.307239] active timers:
[ 415.307239] #0: tick_cpu_sched, tick_sched_timer, S:01
[ 415.307239] # expires at 415180000000-415180000000 nsecs [in 5647480
to 5647480 nsecs]
[ 415.307239] #1: <c6485ea8>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 415316748939-415316748939 nsecs [in
142396419 to 142396419 nsecs]
[ 415.307239] #2: <c6481a28>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 415325581283-415326561308 nsecs [in
151228763 to 152208788 nsecs]
[ 415.307239] #3: <c61fa178>, it_real_fn, S:01
[ 415.307239] # expires at 415635620681-415635620681 nsecs [in
461268161 to 461268161 nsecs]
[ 415.307239] #4: <c622dea8>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 426702410278-426702460278 nsecs [in
11528057758 to 11528107758 nsecs]
[ 415.307239] #5: <c6487a28>, hrtimer_wakeup, S:01
[ 415.307239] # expires at 129624716464923-129624816464923 nsecs [in
129209542112403 to 129209642112403 nsecs]
[ 415.307239] clock 1:
[ 415.307239] .base: c05a6a20
[ 415.307239] .index: 1
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get_real
[ 415.307239] .offset: 1471420915047270622 nsecs
[ 415.307239] active timers:
[ 415.307239] clock 2:
[ 415.307239] .base: c05a6a40
[ 415.307239] .index: 2
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get_boottime
[ 415.307239] .offset: 0 nsecs
[ 415.307239] active timers:
[ 415.307239] clock 3:
[ 415.307239] .base: c05a6a60
[ 415.307239] .index: 3
[ 415.307239] .resolution: 1 nsecs
[ 415.307239] .get_time: ktime_get_clocktai
[ 415.307239] .offset: 1471420915047270622 nsecs
[ 415.307239] active timers:
[ 415.307239] .expires_next : 415180000000 nsecs
[ 415.307239] .hres_active : 1
[ 415.307239] .nr_events : 43162
[ 415.307239] .nr_retries : 20
[ 415.307239] .nr_hangs : 0
[ 415.307239] .max_hang_time : 0
[ 415.307239] .nohz_mode : 0
[ 415.307239] .last_tick : 0 nsecs
[ 415.307239] .tick_stopped : 0
[ 415.307239] .idle_jiffies : 0
[ 415.307239] .idle_calls : 0
[ 415.307239] .idle_sleeps : 0
[ 415.307239] .idle_entrytime : 0 nsecs
[ 415.307239] .idle_waketime : 0 nsecs
[ 415.307239] .idle_exittime : 0 nsecs
[ 415.307239] .idle_sleeptime : 0 nsecs
[ 415.307239] .iowait_sleeptime: 0 nsecs
[ 415.307239] .last_jiffies : 0
[ 415.307239] .next_timer : 0
[ 415.307239] .idle_expires : 0 nsecs
[ 415.307239] jiffies: 11517
[ 415.307239]
[ 415.307239] Tick Device: mode: 1
[ 415.307239] Per CPU device: 0
[ 415.307239] Clock Event Device: decrementer
[ 415.307239] max_delta_ns: 260301049532
[ 415.307239] min_delta_ns: 1000
[ 415.307239] mult: 35433480
[ 415.307239] shift: 32
[ 415.307239] mode: 3
[ 415.307239] next_event: 415180000000 nsecs
[ 415.307239] set_next_event: decrementer_set_next_event
[ 415.307239] shutdown: decrementer_shutdown
[ 415.307239] resume: decrementer_shutdown
[ 415.307239] event_handler: hrtimer_interrupt
[ 415.307239] retries: 0
[ 415.307239]
Holger Brunck
2016-08-17 15:27:15 UTC
Permalink
Post by christophe leroy
Post by Dave Hansen
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
.globl ret_from_debug_exc
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
I don't see any references to 'blocked' or 'real_blocked' in assembly
though. You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
Just to let you know, I'm not sure it is the same issue, but I also get
my 8xx target stuck when I try to use gdbserver.
If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.
If I debug a big app, it gets stuck soon after the start of debug: I set
a bpoint at main(), do a 'continue', get breaked at main(), do some
steps with 'next' then it gets stuck.
I have tried moving the struct thread_struct thread but it has no impact.
that sounds a bit different to what I see. Is your program also mutli-threaded?

Maybe you could try with the program I use to reproduce the error:

--- snip -----
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void * th_1_func()
{
while (1) {
sleep(2);
printf("Hello from thread function 1)\n");
}
}

int main() {
int err;
pthread_t th_1, th_2, th_3;

err = pthread_create(&th_1, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_2, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_3, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
while(1) {}
return 0;
}
--- snap ---

Then copy it to your target and start it with the gdbserver. If you let it run
from your host with gdb and try to stop it e.g in the sleep call and then try to
single step it you might see the error. But as I said in this thread the
behaviour might be different depending on your kernel configuration as I
encountered different behaviour when enabling FTRACE or SCHED_STAT.

Best regards
Holger
Christophe Leroy
2016-08-18 08:23:12 UTC
Permalink
Post by Holger Brunck
Post by christophe leroy
Post by Dave Hansen
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
.globl ret_from_debug_exc
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
I don't see any references to 'blocked' or 'real_blocked' in assembly
though. You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
Just to let you know, I'm not sure it is the same issue, but I also get
my 8xx target stuck when I try to use gdbserver.
If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.
If I debug a big app, it gets stuck soon after the start of debug: I set
a bpoint at main(), do a 'continue', get breaked at main(), do some
steps with 'next' then it gets stuck.
I have tried moving the struct thread_struct thread but it has no impact.
that sounds a bit different to what I see. Is your program also mutli-threaded?
No my program is a simple app doing a few printf("Hello World !"); and
nothing more.

I have now identified the issue, it is most likely specific to the 8xx:
when entering single step exception, the 8xx asserts the internal Freeze
which stops the decrementer and the timebase. In order to clear the
internal Freeze, the ICR SPR has to be read.

I'll now be able to check with your program and see how it behaves.

Christophe
Christophe Leroy
2016-08-19 11:03:38 UTC
Permalink
Post by Holger Brunck
Post by christophe leroy
Post by Dave Hansen
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
.globl ret_from_debug_exc
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
I don't see any references to 'blocked' or 'real_blocked' in assembly
though. You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
Just to let you know, I'm not sure it is the same issue, but I also get
my 8xx target stuck when I try to use gdbserver.
If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.
If I debug a big app, it gets stuck soon after the start of debug: I set
a bpoint at main(), do a 'continue', get breaked at main(), do some
steps with 'next' then it gets stuck.
I have tried moving the struct thread_struct thread but it has no impact.
that sounds a bit different to what I see. Is your program also mutli-threaded?
--- snip -----
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
void * th_1_func()
{
while (1) {
sleep(2);
printf("Hello from thread function 1)\n");
}
}
int main() {
int err;
pthread_t th_1, th_2, th_3;
err = pthread_create(&th_1, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_2, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_3, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
while(1) {}
return 0;
}
--- snap ---
Then copy it to your target and start it with the gdbserver. If you let it run
from your host with gdb and try to stop it e.g in the sleep call and then try to
single step it you might see the error. But as I said in this thread the
behaviour might be different depending on your kernel configuration as I
encountered different behaviour when enabling FTRACE or SCHED_STAT.
Best regards
Holger
Hi

I just tried it on an 885 and on an 8323, it work properly on both targets.

You can see below the Debug Option that are active on my 8323 target.

Christophe

#
# Kernel hacking
#

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# CONFIG_DYNAMIC_DEBUG is not set

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_REDUCED=y
# CONFIG_DEBUG_INFO_SPLIT is not set
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_GDB_SCRIPTS is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=1024
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
# CONFIG_DEBUG_SECTION_MISMATCH is not set
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_DEBUG_KERNEL=y

#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_MEMORY_INIT is not set
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_SHIRQ is not set

#
# Debug Lockups and Hangs
#
# CONFIG_LOCKUP_DETECTOR is not set
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_SCHED_DEBUG=y
# CONFIG_SCHED_INFO is not set
# CONFIG_SCHEDSTATS is not set
# CONFIG_SCHED_STACK_END_CHECK is not set
# CONFIG_DEBUG_TIMEKEEPING is not set
# CONFIG_TIMER_STATS is not set
CONFIG_DEBUG_PREEMPT=y

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_ATOMIC_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
# CONFIG_STACKTRACE is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_PI_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set

#
# RCU Debugging
#
# CONFIG_PROVE_RCU is not set
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_TORTURE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
# CONFIG_FUNCTION_TRACER is not set
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_SCHED_TRACER is not set
# CONFIG_ENABLE_DEFAULT_TRACERS is not set
# CONFIG_FTRACE_SYSCALLS is not set
# CONFIG_TRACER_SNAPSHOT is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
# CONFIG_STACK_TRACER is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_UPROBE_EVENT is not set
# CONFIG_PROBE_EVENTS is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
# CONFIG_RING_BUFFER_BENCHMARK is not set
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
CONFIG_TRACING_EVENTS_GPIO=y

#
# Runtime Testing
#
# CONFIG_LKDTM is not set
# CONFIG_TEST_LIST_SORT is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_ATOMIC64_SELFTEST is not set
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
# CONFIG_TEST_KSTRTOX is not set
# CONFIG_TEST_PRINTF is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_TEST_FIRMWARE is not set
# CONFIG_TEST_UDELAY is not set
# CONFIG_MEMTEST is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
# CONFIG_PPC_DISABLE_WERROR is not set
CONFIG_PPC_WERROR=y
# CONFIG_STRICT_MM_TYPECHECKS is not set
CONFIG_PRINT_STACK_DEPTH=64
# CONFIG_PPC_EMULATED_STATS is not set
# CONFIG_CODE_PATCHING_SELFTEST is not set
# CONFIG_FTR_FIXUP_SELFTEST is not set
# CONFIG_MSI_BITMAP_SELFTEST is not set
# CONFIG_XMON is not set
CONFIG_BDI_SWITCH=y
# CONFIG_BOOTX_TEXT is not set
# CONFIG_PPC_EARLY_DEBUG is not set
CONFIG_STRICT_DEVMEM=y
Holger Brunck
2016-08-19 11:14:07 UTC
Permalink
Post by Christophe Leroy
Post by Holger Brunck
Post by christophe leroy
Post by Dave Hansen
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
.globl ret_from_debug_exc
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
I don't see any references to 'blocked' or 'real_blocked' in assembly
though. You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
Just to let you know, I'm not sure it is the same issue, but I also get
my 8xx target stuck when I try to use gdbserver.
If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.
If I debug a big app, it gets stuck soon after the start of debug: I set
a bpoint at main(), do a 'continue', get breaked at main(), do some
steps with 'next' then it gets stuck.
I have tried moving the struct thread_struct thread but it has no impact.
that sounds a bit different to what I see. Is your program also mutli-threaded?
--- snip -----
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
void * th_1_func()
{
while (1) {
sleep(2);
printf("Hello from thread function 1)\n");
}
}
int main() {
int err;
pthread_t th_1, th_2, th_3;
err = pthread_create(&th_1, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_2, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_3, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
while(1) {}
return 0;
}
--- snap ---
Then copy it to your target and start it with the gdbserver. If you let it run
from your host with gdb and try to stop it e.g in the sleep call and then try to
single step it you might see the error. But as I said in this thread the
behaviour might be different depending on your kernel configuration as I
encountered different behaviour when enabling FTRACE or SCHED_STAT.
Best regards
Holger
Hi
I just tried it on an 885 and on an 8323, it work properly on both targets.
You can see below the Debug Option that are active on my 8323 target.
thanks for trying it.

Could you completely disable FTRACE? As it also works on my side when I have
FTRACE enabled.

Best regards
Holger
Christophe Leroy
2016-08-19 13:44:18 UTC
Permalink
Post by Holger Brunck
Post by Christophe Leroy
Post by Holger Brunck
Post by christophe leroy
Post by Dave Hansen
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
.globl ret_from_debug_exc
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
I don't see any references to 'blocked' or 'real_blocked' in assembly
though. You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
Just to let you know, I'm not sure it is the same issue, but I also get
my 8xx target stuck when I try to use gdbserver.
If I debug a very small app, it gets stuck quickly after the app has
stopped: indeed, the console seems ok but as soon as I try to execute
something simple, like a ps or top, it get stuck. The target still
responds to pings, but nothing else.
If I debug a big app, it gets stuck soon after the start of debug: I set
a bpoint at main(), do a 'continue', get breaked at main(), do some
steps with 'next' then it gets stuck.
I have tried moving the struct thread_struct thread but it has no impact.
that sounds a bit different to what I see. Is your program also mutli-threaded?
--- snip -----
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
void * th_1_func()
{
while (1) {
sleep(2);
printf("Hello from thread function 1)\n");
}
}
int main() {
int err;
pthread_t th_1, th_2, th_3;
err = pthread_create(&th_1, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_2, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
err = pthread_create(&th_3, NULL, th_1_func, NULL);
if (err != 0)
printf("pthread_create\n");
while(1) {}
return 0;
}
--- snap ---
Then copy it to your target and start it with the gdbserver. If you let it run
from your host with gdb and try to stop it e.g in the sleep call and then try to
single step it you might see the error. But as I said in this thread the
behaviour might be different depending on your kernel configuration as I
encountered different behaviour when enabling FTRACE or SCHED_STAT.
Best regards
Holger
Hi
I just tried it on an 885 and on an 8323, it work properly on both targets.
You can see below the Debug Option that are active on my 8323 target.
thanks for trying it.
Could you completely disable FTRACE? As it also works on my side when I have
FTRACE enabled.
Best regards
Holger
I have now disabled completly FTRACE, the behaviour is still OK.

Christophe
Holger Brunck
2016-08-19 16:26:57 UTC
Permalink
Post by Christophe Leroy
Post by Holger Brunck
Post by Christophe Leroy
I just tried it on an 885 and on an 8323, it work properly on both targets.
You can see below the Debug Option that are active on my 8323 target.
thanks for trying it.
Could you completely disable FTRACE? As it also works on my side when I have
FTRACE enabled.
I have now disabled completly FTRACE, the behaviour is still OK.
I double-checked it on my 8321 board and on this board it's also not
reproducible with my test program. Also on a board with QorIQ P2041 it's not
reproducible so easy, but we see the error with our application code. Easy
reproducible with the mainline kernel for me it's only on my MPC8360 board.

Best regards
Holger

Benjamin Herrenschmidt
2016-08-16 22:13:52 UTC
Permalink
Post by Dave Hansen
 
Wow, thanks for all the debugging here!
Yup, thanks, that's really odd... I wonder if one of those
structures is accessed beyond it's boundary, either the sigset
or the thread struct, causing corruption of neighbouring fields
in task struct...

Can you try adding a little canary on both sides (make it not-so-little
maybe a few words) which you initialize to a known pattern and check
every now and then ?
Post by Dave Hansen
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc.  Seems awfully weird.  Have you checked with
any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored?  Is it on-stack on ppc32 or
something?
No it's allocated normally.
Post by Dave Hansen
 The thread_info is,
Yes, thread_info is at the bottom of stack
Post by Dave Hansen
I assume, but I see some THREAD_INFO vs. THREAD
 
        .globl  ret_from_debug_exc
        mfspr   r9,SPRN_SPRG_THREAD
        lwz     r10,SAVED_KSP_LIMIT(r1)
        stw     r10,KSP_LIMIT(r9)
        lwz     r9,THREAD_INFO-THREAD(r9)
This calculates the offset between the thread struct and the pointer
to thread info inside task struct and loads that pointer into r9
Post by Dave Hansen
        CURRENT_THREAD_INFO(r10, r1)
        lwz     r10,TI_PREEMPT(r10)
        stw     r10,TI_PREEMPT(r9)
        RESTORE_xSRR(SRR0,SRR1);
        RESTORE_xSRR(CSRR0,CSRR1);
        RESTORE_MMU_REGS;
        RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
Basically the above code transfers TI_PREEMPT from the "current"
thread info which I believe would be on some exception/interrupt
stack into the current task thread info.
Post by Dave Hansen
But, I'm really at a loss to explain this.  It still seems like a
deeply
ppc-specific issue.  We can obviously work around it with an #ifdef
for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
My suspicion is that there's a bug in the 32-bit ppc assembly
somewhere.
 I don't see any references to 'blocked' or 'real_blocked' in
assembly
though.  You could add a bunch of padding instead of moving the
thread_struct and see if that does anything, but that's really a stab in
the dark.
Holger Brunck
2016-08-17 15:05:34 UTC
Permalink
Post by Benjamin Herrenschmidt
Post by Dave Hansen
Wow, thanks for all the debugging here!
Yup, thanks, that's really odd... I wonder if one of those
structures is accessed beyond it's boundary, either the sigset
or the thread struct, causing corruption of neighbouring fields
in task struct...
Can you try adding a little canary on both sides (make it not-so-little
maybe a few words) which you initialize to a known pattern and check
every now and then ?
I added a dummy char buffer like this:

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,11 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;

+ // struct thread_struct thread; // does work
sigset_t blocked, real_blocked;
+
+ struct thread_struct thread; // does work if dummy has 5 bytes
+ char dummy[5]; // if we use 4 bytes it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;

@@ -1919,7 +1923,6 @@ struct task_struct {
struct task_struct *oom_reaper_list;
#endif
/* CPU-specific state of this task */
- struct thread_struct thread;
/*
* WARNING: on x86, 'thread_struct' contains a variable-sized
* structure. It *MUST* be at the end of 'task_struct'.

If I use 4 bytes the error is present if I add 5 bytes it runs fine. For both
cases I added a printout into the sched_debug.c code to the general scheduler
statistics and the content of the buffer is always zero and does not change. So
at least no one is writing non-zero to the buffer. Where gets the task_struct
initialized? Then I could double check with different values.

Just to let you know in rare case I get a kernel crash (my_trace are some
printouts in arch/powerpc/signal_32.c and arch/powerpc/kernel/signal.c) :

my_trace: handle_signal32
my_trace: save_user_regs
my_trace: copy_fpr_to_user
my_trace: sys_sigreturn
my_trace: restore_user_regs
my_trace: copy_fpr_from_user
my_trace: do_signal: no signal to deliver
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc01dd2a4
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT mpc83xx-km-platform
Modules linked in:
CPU: 0 PID: 65 Comm: TR_Task Not tainted 4.7.0-00271-g76ef984-dirty #77
task: cfbab5f0 ti: cfb94000 task.ti: cfb94000
NIP: c01dd2a4 LR: c003d0fc CTR: c003ddc0
REGS: cfb95bf0 TRAP: 0300 Not tainted (4.7.0-00271-g76ef984-dirty)
MSR: 00001032 <ME,IR,DR,RI> CR: 84022282 XER: 20000000
DAR: 00000000 DSISR: 20000000
GPR00: c003df58 cfb95ca0 cfbab5f0 cfbab138 cfb7f708 00000000 00000001 00000000
GPR08: 00000000 cfb9ea18 00000000 13d50b30 84022282 1006ac08 00000000 0fff0018
GPR16: 0fcc02a8 b7d3b4c0 10068c70 10068c70 0fe1a91c 0fcc22f8 00000000 cfb94000
GPR24: 00000000 ffffffff cfb94000 c044ea40 cfbab130 cfbab138 cfb7f6e0 cfbab130
NIP [c01dd2a4] rb_erase+0x1d0/0x3e4
LR [c003d0fc] set_next_entity+0x7c/0xc8
Call Trace:
[cfb95ca0] [84022282] 0x84022282 (unreliable)
[cfb95cc0] [c003df58] pick_next_task_fair+0x198/0x1e8
[cfb95cf0] [c03666f4] __schedule+0xd8/0x4d8
[cfb95d40] [c0366b30] schedule+0x3c/0xac
[cfb95d60] [c006f96c] futex_wait_queue_me+0xd4/0x164
[cfb95d80] [c007098c] futex_wait+0xfc/0x268
[cfb95e50] [c0072500] do_futex+0x138/0xb34
[cfb95ee0] [c0072f60] SyS_futex+0x64/0x1d0
[cfb95f40] [c000e788] ret_from_syscall+0x0/0x38
--- interrupt: c01 at 0xfca0db4
LR = 0xfca0d90
Instruction dump:
912a0000 81490000 71470001 418200d4 5548003b 418200b0 7d274b78 7d094378
81490004 7f8a3840 409eff60 81490008 <810a0000> 71060001 40820040 80ea0004
---[ end trace e7b4a1ae0909a358 ]---

note: TR_Task[65] exited with preempt_count 2

So I also see a race condition in rare cases when I trigger the error, while
most of the time the kernel continues and the threads are in a state which are
confusing the gdbserver. All these test are done with a simple C program which
runs three threads in a while loop.

Best regards
Holger Brunck
Holger Brunck
2016-08-17 14:59:45 UTC
Permalink
Post by Dave Hansen
Post by Holger Brunck
I tried this but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
target which is also 32-bit and it ends up with the same behaviour.
I was also investigating where I have to move the line in the struct task_struct
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
struct signal_struct *signal;
struct sighand_struct *sighand;
+ // struct thread_struct thread; // until here everything is fine
sigset_t blocked, real_blocked;
+ struct thread_struct thread; // from here it's broken
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
Wow, thanks for all the debugging here!
So, we know it has to do with signals, thread_info, and probably only
affects 32-bit powerpc. Seems awfully weird. Have you checked with any
of the 64-bit powerpc guys to see if they have any ideas?
I went grepping around for a bit.
Where is the task_struct stored? Is it on-stack on ppc32 or something?
The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
.globl ret_from_debug_exc
mfspr r9,SPRN_SPRG_THREAD
lwz r10,SAVED_KSP_LIMIT(r1)
stw r10,KSP_LIMIT(r9)
lwz r9,THREAD_INFO-THREAD(r9)
CURRENT_THREAD_INFO(r10, r1)
lwz r10,TI_PREEMPT(r10)
stw r10,TI_PREEMPT(r9)
RESTORE_xSRR(SRR0,SRR1);
RESTORE_xSRR(CSRR0,CSRR1);
RESTORE_MMU_REGS;
RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
yeah but here you are in arch/powerpc/kernel/head_booke.h and IIUIC my
architecture uses arch/powerpc/kernel/head_32.S

Some small updates from my side. I was able to simplify the setup. I don't need
my (quite complex) application. I now have a small C - program which starts
three threads simply running a while loop and doing printouts every now and
then. And I still can reproduce the error. So we simply need threads a gdbserver
session, ppc32 and a single step while the threads are already running.

I added some debug prints within the kernel common signal code and the specific
powerpc code and enabled the debug trace from the gdbserver. What I see is that
gdbserver sends for each thread a SIGSTOP to the kernel and waits for a
response. The kernel does receive all the signals but only respond to some of
them in the error case. Which then matches with my "ps" output as I see that
some threads are not in the state pthread_stop and then the gdbserver gets
suspended. I think the interesting part is in arch/powerpc/kernel/signal_32.c
with it's function handle_signal32. For the threads successfully stopped this
function is called once. If the kernel receives two SIGSTOP before calling the
function we end up in the error case. Now my question does anyone know if this
function should handle several pending signals at once if present or will it be
called once per signal?
Post by Dave Hansen
But, I'm really at a loss to explain this. It still seems like a deeply
ppc-specific issue. We can obviously work around it with an #ifdef for
your platform, but that's awfully hackish and hides the real bug,
whatever it is.
what I could do is to reuse the define CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and
if it's not set I use the thread_struct at it's old position. I know it would
only mask the error and I guess it's not acceptable for mainline but as my time
is limited I could live with such an OOT patch for my board.

Best regards
Holger
Loading...