author Linus Torvalds <torvalds@linux-foundation.org> 2014-12-11 20:30:55 -0500
committer Linus Torvalds <torvalds@linux-foundation.org> 2014-12-11 20:30:55 -0500
commit 27afc5dbda52ee3dbcd0bda7375c917c6936b470
tree 47591400f85590d48fa71bbfa50e0707e20e4bd0 /Documentation
parent 70e71ca0af244f48a5dcf56dc435243792e3a495
parent 351997810131565fe62aec2c366deccbf6bda3f4

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:
 "The most notable change for this pull request is the ftrace rework
  from Heiko. It brings a small performance improvement and the ground
  work to support a new gcc option to replace the mcount blocks with a
  single nop.

  Two new s390 specific system calls are added to emulate user space
  mmio for PCI, an artifact of the how PCI memory is accessed.

  Two patches for the memory management with changes to common code.
  For KVM mm_forbids_zeropage is added which disables the empty zero
  page for an mm that is used by a KVM process. And an optimization,
  pmdp_get_and_clear_full is added analog to ptep_get_and_clear_full.

  Some micro optimization for the cmpxchg and the spinlock code.

  And as usual bug fixes and cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits)
  s390/cputime: fix 31-bit compile
  s390/scm_block: make the number of reqs per HW req configurable
  s390/scm_block: handle multiple requests in one HW request
  s390/scm_block: allocate aidaw pages only when necessary
  s390/scm_block: use mempool to manage aidaw requests
  s390/eadm: change timeout value
  s390/mm: fix memory leak of ptlock in pmd_free_tlb
  s390: use local symbol names in entry[64].S
  s390/ptrace: always include vector registers in core files
  s390/simd: clear vector register pointer on fork/clone
  s390: translate cputime magic constants to macros
  s390/idle: convert open coded idle time seqcount
  s390/idle: add missing irq off lockdep annotation
  s390/debug: avoid function call for debug_sprintf_*
  s390/kprobes: fix instruction copy for out of line execution
  s390: remove diag 44 calls from cpu_relax()
  s390/dasd: retry partition detection
  s390/dasd: fix list corruption for sleep_on requests
  s390/dasd: fix infinite term I/O loop
  s390/dasd: remove unused code
  ...

Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/s390/Debugging390.txt 462
1 files changed, 61 insertions, 401 deletions
diff --git a/Documentation/s390/Debugging390.txt b/Documentation/s390/Debugging390.txt
index 462321c1aeea..08911b5c6b0e 100644
--- a/Documentation/s390/Debugging390.txt
+++ b/Documentation/s390/Debugging390.txt
@@ -26,11 +26,6 @@ The Linux for s/390 & z/Architecture Kernel Task Structure
 Register Usage & Stackframes on Linux for s/390 & z/Architecture
 A sample program with comments
 Compiling programs for debugging on Linux for s/390 & z/Architecture
-Figuring out gcc compile errors
-Debugging Tools
-objdump
-strace
-Performance Debugging
 Debugging under VM
 s/390 & z/Architecture IO Overview
 Debugging IO on s/390 & z/Architecture under VM
@@ -114,28 +109,25 @@ s/390 z/Architecture
 
 16-17 16-17 Address Space Control
 
- 00 Primary Space Mode when DAT on
- The linux kernel currently runs in this mode, CR1 is affiliated with
- this mode & points to the primary segment table origin etc.
-
- 01 Access register mode this mode is used in functions to
- copy data between kernel & user space.
-
- 10 Secondary space mode not used in linux however CR7 the
- register affiliated with this mode is & this & normally
- CR13=CR7 to allow us to copy data between kernel & user space.
- We do this as follows:
- We set ar2 to 0 to designate its
- affiliated gpr ( gpr2 )to point to primary=kernel space.
- We set ar4 to 1 to designate its
- affiliated gpr ( gpr4 ) to point to secondary=home=user space
- & then essentially do a memcopy(gpr2,gpr4,size) to
- copy data between the address spaces, the reason we use home space for the
- kernel & don't keep secondary space free is that code will not run in
- secondary space.
-
- 11 Home Space Mode all user programs run in this mode.
- it is affiliated with CR13.
+ 00 Primary Space Mode:
+ The register CR1 contains the primary address-space control ele-
+ ment (PASCE), which points to the primary space region/segment
+ table origin.
+
+ 01 Access register mode
+
+ 10 Secondary Space Mode:
+ The register CR7 contains the secondary address-space control
+ element (SASCE), which points to the secondary space region or
+ segment table origin.
+
+ 11 Home Space Mode:
+ The register CR13 contains the home space address-space control
+ element (HASCE), which points to the home space region/segment
+ table origin.
+
+ See "Address Spaces on Linux for s/390 & z/Architecture" below
+ for more information about address space usage in Linux.
 
 18-19 18-19 Condition codes (CC)
 
@@ -249,9 +241,9 @@ currently 4TB of physical memory currently on z/Architecture.
 Address Spaces on Linux for s/390 & z/Architecture
 ==================================================
 
-Our addressing scheme is as follows
-
+Our addressing scheme is basically as follows:
 
+ Primary Space Home Space
 Himem 0x7fffffff 2GB on s/390 ***************** ****************
 currently 0x3ffffffffff (2^42)-1 * User Stack * * *
 on z/Architecture. ***************** * *
@@ -264,9 +256,46 @@ on z/Architecture. ***************** * *
  * Sections * * *
 0x00000000 ***************** ****************
 
-This also means that we need to look at the PSW problem state bit
-or the addressing mode to decide whether we are looking at
-user or kernel space.
+This also means that we need to look at the PSW problem state bit and the
+addressing mode to decide whether we are looking at user or kernel space.
+
+User space runs in primary address mode (or access register mode within
+the vdso code).
+
+The kernel usually also runs in home space mode, however when accessing
+user space the kernel switches to primary or secondary address mode if
+the mvcos instruction is not available or if a compare-and-swap (futex)
+instruction on a user space address is performed.
+
+When also looking at the ASCE control registers, this means:
+
+User space:
+- runs in primary or access register mode
+- cr1 contains the user asce
+- cr7 contains the user asce
+- cr13 contains the kernel asce
+
+Kernel space:
+- runs in home space mode
+- cr1 contains the user or kernel asce
+ -> the kernel asce is loaded when a uaccess requires primary or
+ secondary address mode
+- cr7 contains the user or kernel asce, (changed with set_fs())
+- cr13 contains the kernel asce
+
+In case of uaccess the kernel changes to:
+- primary space mode in case of a uaccess (copy_to_user) and uses
+ e.g. the mvcp instruction to access user space. However the kernel
+ will stay in home space mode if the mvcos instruction is available
+- secondary space mode in case of futex atomic operations, so that the
+ instructions come from primary address space and data from secondary
+ space
+
+In case of KVM, the kernel runs in home space mode, but cr1 gets switched
+to contain the gmap asce before the SIE instruction gets executed. When
+the SIE instruction is finished, cr1 will be switched back to contain the
+user asce.
+
 
 Virtual Addresses on s/390 & z/Architecture
 ===========================================
@@ -706,376 +735,7 @@ Debugging with optimisation has since much improved after fixing
 some bugs, please make sure you are using gdb-5.0 or later developed
 after Nov'2000.
 
-Figuring out gcc compile errors
-===============================
-If you are getting a lot of syntax errors compiling a program & the problem
-isn't blatantly obvious from the source.
-It often helps to just preprocess the file, this is done with the -E
-option in gcc.
-What this does is that it runs through the very first phase of compilation
-( compilation in gcc is done in several stages & gcc calls many programs to
-achieve its end result ) with the -E option gcc just calls the gcc preprocessor (cpp).
-The c preprocessor does the following, it joins all the files #included together
-recursively ( #include files can #include other files ) & also the c file you wish to compile.
-It puts a fully qualified path of the #included files in a comment & it
-does macro expansion.
-This is useful for debugging because
-1) You can double check whether the files you expect to be included are the ones
-that are being included ( e.g. double check that you aren't going to the i386 asm directory ).
-2) Check that macro definitions aren't clashing with typedefs,
-3) Check that definitions aren't being used before they are being included.
-4) Helps put the line emitting the error under the microscope if it contains macros.
-
-For convenience the Linux kernel's makefile will do preprocessing automatically for you
-by suffixing the file you want built with .i ( instead of .o )
-
-e.g.
-from the linux directory type
-make arch/s390/kernel/signal.i
-this will build
-
-s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer
--fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce -E arch/s390/kernel/signal.c
-> arch/s390/kernel/signal.i
-
-Now look at signal.i you should see something like.
-
-
-# 1 "/home1/barrow/linux/include/asm/types.h" 1
-typedef unsigned short umode_t;
-typedef __signed__ char __s8;
-typedef unsigned char __u8;
-typedef __signed__ short __s16;
-typedef unsigned short __u16;
-
-If instead you are getting errors further down e.g.
-unknown instruction:2515 "move.l" or better still unknown instruction:2515
-"Fixme not implemented yet, call Martin" you are probably are attempting to compile some code
-meant for another architecture or code that is simply not implemented, with a fixme statement
-stuck into the inline assembly code so that the author of the file now knows he has work to do.
-To look at the assembly emitted by gcc just before it is about to call gas ( the gnu assembler )
-use the -S option.
-Again for your convenience the Linux kernel's Makefile will hold your hand &
-do all this donkey work for you also by building the file with the .s suffix.
-e.g.
-from the Linux directory type
-make arch/s390/kernel/signal.s
-
-s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer
--fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce -S arch/s390/kernel/signal.c
--o arch/s390/kernel/signal.s
-
-
-This will output something like, ( please note the constant pool & the useful comments
-in the prologue to give you a hand at interpreting it ).
-
-.LC54:
- .string "misaligned (__u16 *) in __xchg\n"
-.LC57:
- .string "misaligned (__u32 *) in __xchg\n"
-.L$PG1: # Pool sys_sigsuspend
-.LC192:
- .long -262401
-.LC193:
- .long -1
-.LC194:
- .long schedule-.L$PG1
-.LC195:
- .long do_signal-.L$PG1
- .align 4
-.globl sys_sigsuspend
- .type sys_sigsuspend,@function
-sys_sigsuspend:
-# leaf function 0
-# automatics 16
-# outgoing args 0
-# need frame pointer 0
-# call alloca 0
-# has varargs 0
-# incoming args (stack) 0
-# function length 168
- STM 8,15,32(15)
- LR 0,15
- AHI 15,-112
- BASR 13,0
-.L$CO1: AHI 13,.L$PG1-.L$CO1
- ST 0,0(15)
- LR 8,2
- N 5,.LC192-.L$PG1(13)
-
-Adding -g to the above output makes the output even more useful
-e.g. typing
-make CC:="s390-gcc -g" kernel/sched.s
-
-which compiles.
-s390-gcc -g -D__KERNEL__ -I/home/barrow/linux-2.3/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -fno-strength-reduce -S kernel/sched.c -o kernel/sched.s
-
-also outputs stabs ( debugger ) info, from this info you can find out the
-offsets & sizes of various elements in structures.
-e.g. the stab for the structure
-struct rlimit {
- unsigned long rlim_cur;
- unsigned long rlim_max;
-};
-is
-.stabs "rlimit:T(151,2)=s8rlim_cur:(0,5),0,32;rlim_max:(0,5),32,32;;",128,0,0,0
-from this stab you can see that
-rlimit_cur starts at bit offset 0 & is 32 bits in size
-rlimit_max starts at bit offset 32 & is 32 bits in size.
-
-
-Debugging Tools:
-================
-
-objdump
-=======
-This is a tool with many options the most useful being ( if compiled with -g).
-objdump --source <victim program or object file> > <victims debug listing >
-
-
-The whole kernel can be compiled like this ( Doing this will make a 17MB kernel
-& a 200 MB listing ) however you have to strip it before building the image
-using the strip command to make it a more reasonable size to boot it.
-
-A source/assembly mixed dump of the kernel can be done with the line
-objdump --source vmlinux > vmlinux.lst
-Also, if the file isn't compiled -g, this will output as much debugging information
-as it can (e.g. function names). This is very slow as it spends lots
-of time searching for debugging info. The following self explanatory line should be used
-instead if the code isn't compiled -g, as it is much faster:
-objdump --disassemble-all --syms vmlinux > vmlinux.lst
-
-As hard drive space is valuable most of us use the following approach.
-1) Look at the emitted psw on the console to find the crash address in the kernel.
-2) Look at the file System.map ( in the linux directory ) produced when building
-the kernel to find the closest address less than the current PSW to find the
-offending function.
-3) use grep or similar to search the source tree looking for the source file
- with this function if you don't know where it is.
-4) rebuild this object file with -g on, as an example suppose the file was
-( /arch/s390/kernel/signal.o )
-5) Assuming the file with the erroneous function is signal.c Move to the base of the
-Linux source tree.
-6) rm /arch/s390/kernel/signal.o
-7) make /arch/s390/kernel/signal.o
-8) watch the gcc command line emitted
-9) type it in again or alternatively cut & paste it on the console adding the -g option.
-10) objdump --source arch/s390/kernel/signal.o > signal.lst
-This will output the source & the assembly intermixed, as the snippet below shows
-This will unfortunately output addresses which aren't the same
-as the kernel ones you should be able to get around the mental arithmetic
-by playing with the --adjust-vma parameter to objdump.
-
-
-
-
-static inline void spin_lock(spinlock_t *lp)
-{
- a0: 18 34 lr %r3,%r4
- a2: a7 3a 03 bc ahi %r3,956
- __asm__ __volatile(" lhi 1,-1\n"
- a6: a7 18 ff ff lhi %r1,-1
- aa: 1f 00 slr %r0,%r0
- ac: ba 01 30 00 cs %r0,%r1,0(%r3)
- b0: a7 44 ff fd jm aa <sys_sigsuspend+0x2e>
- saveset = current->blocked;
- b4: d2 07 f0 68 mvc 104(8,%r15),972(%r4)
- b8: 43 cc
- return (set->sig[0] & mask) != 0;
-}
-
-6) If debugging under VM go down to that section in the document for more info.
-
-
-I now have a tool which takes the pain out of --adjust-vma
-& you are able to do something like
-make /arch/s390/kernel/traps.lst
-& it automatically generates the correctly relocated entries for
-the text segment in traps.lst.
-This tool is now standard in linux distro's in scripts/makelst
-
-strace:
--------
-Q. What is it ?
-A. It is a tool for intercepting calls to the kernel & logging them
-to a file & on the screen.
-
-Q. What use is it ?
-A. You can use it to find out what files a particular program opens.
-
-
 
-Example 1
----------
-If you wanted to know does ping work but didn't have the source
-strace ping -c 1 127.0.0.1
-& then look at the man pages for each of the syscalls below,
-( In fact this is sometimes easier than looking at some spaghetti
-source which conditionally compiles for several architectures ).
-Not everything that it throws out needs to make sense immediately.
-
-Just looking quickly you can see that it is making up a RAW socket
-for the ICMP protocol.
-Doing an alarm(10) for a 10 second timeout
-& doing a gettimeofday call before & after each read to see
-how long the replies took, & writing some text to stdout so the user
-has an idea what is going on.
-
-socket(PF_INET, SOCK_RAW, IPPROTO_ICMP) = 3
-getuid() = 0
-setuid(0) = 0
-stat("/usr/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory)
-stat("/usr/share/locale/libc/C", 0xbffff134) = -1 ENOENT (No such file or directory)
-stat("/usr/local/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory)
-getpid() = 353
-setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
-setsockopt(3, SOL_SOCKET, SO_RCVBUF, [49152], 4) = 0
-fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(3, 1), ...}) = 0
-mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40008000
-ioctl(1, TCGETS, {B9600 opost isig icanon echo ...}) = 0
-write(1, "PING 127.0.0.1 (127.0.0.1): 56 d"..., 42PING 127.0.0.1 (127.0.0.1): 56 data bytes
-) = 42
-sigaction(SIGINT, {0x8049ba0, [], SA_RESTART}, {SIG_DFL}) = 0
-sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {SIG_DFL}) = 0
-gettimeofday({948904719, 138951}, NULL) = 0
-sendto(3, "\10\0D\201a\1\0\0\17#\2178\307\36"..., 64, 0, {sin_family=AF_INET,
-sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 64
-sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0
-sigaction(SIGALRM, {0x8049ba0, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0
-alarm(10) = 0
-recvfrom(3, "E\0\0T\0005\0\0@\1|r\177\0\0\1\177"..., 192, 0,
-{sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84
-gettimeofday({948904719, 160224}, NULL) = 0
-recvfrom(3, "E\0\0T\0006\0\0\377\1\275p\177\0"..., 192, 0,
-{sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84
-gettimeofday({948904719, 166952}, NULL) = 0
-write(1, "64 bytes from 127.0.0.1: icmp_se"...,
-5764 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=28.0 ms
-
-Example 2
----------
-strace passwd 2>&1 | grep open
-produces the following output
-open("/etc/ld.so.cache", O_RDONLY) = 3
-open("/opt/kde/lib/libc.so.5", O_RDONLY) = -1 ENOENT (No such file or directory)
-open("/lib/libc.so.5", O_RDONLY) = 3
-open("/dev", O_RDONLY) = 3
-open("/var/run/utmp", O_RDONLY) = 3
-open("/etc/passwd", O_RDONLY) = 3
-open("/etc/shadow", O_RDONLY) = 3
-open("/etc/login.defs", O_RDONLY) = 4
-open("/dev/tty", O_RDONLY) = 4
-
-The 2>&1 is done to redirect stderr to stdout & grep is then filtering this input
-through the pipe for each line containing the string open.
-
-
-Example 3
----------
-Getting sophisticated
-telnetd crashes & I don't know why
-
-Steps
------
-1) Replace the following line in /etc/inetd.conf
-telnet stream tcp nowait root /usr/sbin/in.telnetd -h
-with
-telnet stream tcp nowait root /blah
-
-2) Create the file /blah with the following contents to start tracing telnetd
-#!/bin/bash
-/usr/bin/strace -o/t1 -f /usr/sbin/in.telnetd -h
-3) chmod 700 /blah to make it executable only to root
-4)
-killall -HUP inetd
-or ps aux | grep inetd
-get inetd's process id
-& kill -HUP inetd to restart it.
-
-Important options
------------------
--o is used to tell strace to output to a file in our case t1 in the root directory
--f is to follow children i.e.
-e.g in our case above telnetd will start the login process & subsequently a shell like bash.
-You will be able to tell which is which from the process ID's listed on the left hand side
-of the strace output.
--p<pid> will tell strace to attach to a running process, yup this can be done provided
- it isn't being traced or debugged already & you have enough privileges,
-the reason 2 processes cannot trace or debug the same program is that strace
-becomes the parent process of the one being debugged & processes ( unlike people )
-can have only one parent.
-
-
-However the file /t1 will get big quite quickly
-to test it telnet 127.0.0.1
-
-now look at what files in.telnetd execve'd
-413 execve("/usr/sbin/in.telnetd", ["/usr/sbin/in.telnetd", "-h"], [/* 17 vars */]) = 0
-414 execve("/bin/login", ["/bin/login", "-h", "localhost", "-p"], [/* 2 vars */]) = 0
-
-Whey it worked!.
-
-
-Other hints:
-------------
-If the program is not very interactive ( i.e. not much keyboard input )
-& is crashing in one architecture but not in another you can do
-an strace of both programs under as identical a scenario as you can
-on both architectures outputting to a file then.
-do a diff of the two traces using the diff program
-i.e.
-diff output1 output2
-& maybe you'll be able to see where the call paths differed, this
-is possibly near the cause of the crash.
-
-More info
----------
-Look at man pages for strace & the various syscalls
-e.g. man strace, man alarm, man socket.
-
-
-Performance Debugging
-=====================
-gcc is capable of compiling in profiling code just add the -p option
-to the CFLAGS, this obviously affects program size & performance.
-This can be used by the gprof gnu profiling tool or the
-gcov the gnu code coverage tool ( code coverage is a means of testing
-code quality by checking if all the code in an executable in exercised by
-a tester ).
-
-
-Using top to find out where processes are sleeping in the kernel
-----------------------------------------------------------------
-To do this copy the System.map from the root directory where
-the linux kernel was built to the /boot directory on your
-linux machine.
-Start top
-Now type fU<return>
-You should see a new field called WCHAN which
-tells you where each process is sleeping here is a typical output.
-
- 6:59pm up 41 min, 1 user, load average: 0.00, 0.00, 0.00
-28 processes: 27 sleeping, 1 running, 0 zombie, 0 stopped
-CPU states: 0.0% user, 0.1% system, 0.0% nice, 99.8% idle
-Mem: 254900K av, 45976K used, 208924K free, 0K shrd, 28636K buff
-Swap: 0K av, 0K used, 0K free 8620K cached
-
- PID USER PRI NI SIZE RSS SHARE WCHAN STAT LIB %CPU %MEM TIME COMMAND
- 750 root 12 0 848 848 700 do_select S 0 0.1 0.3 0:00 in.telnetd
- 767 root 16 0 1140 1140 964 R 0 0.1 0.4 0:00 top
- 1 root 8 0 212 212 180 do_select S 0 0.0 0.0 0:00 init
- 2 root 9 0 0 0 0 down_inte SW 0 0.0 0.0 0:00 kmcheck
-
-The time command
-----------------
-Another related command is the time command which gives you an indication
-of where a process is spending the majority of its time.
-e.g.
-time ping -c 5 nc
-outputs
-real 0m4.054s
-user 0m0.010s
-sys 0m0.010s
 
 Debugging under VM
 ==================