Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/DocBook/debugobjects.tmpl | 2
-rw-r--r-- Documentation/accounting/getdelays.c | 3
-rw-r--r-- Documentation/atomic_ops.txt | 4
-rw-r--r-- Documentation/cdrom/packet-writing.txt | 2
-rw-r--r-- Documentation/cpu-freq/cpu-drivers.txt | 2
-rw-r--r-- Documentation/cpu-freq/governors.txt | 26
-rw-r--r-- Documentation/cpu-freq/user-guide.txt | 1
-rw-r--r-- Documentation/driver-model/device.txt | 32
-rw-r--r-- Documentation/dvb/get_dvb_firmware | 8
-rw-r--r-- Documentation/fault-injection/fault-injection.txt | 70
-rw-r--r-- Documentation/fb/vesafb.txt | 2
-rw-r--r-- Documentation/filesystems/Locking | 2
-rw-r--r-- Documentation/filesystems/proc.txt | 15
-rw-r--r-- Documentation/filesystems/vfat.txt | 5
-rw-r--r-- Documentation/firmware_class/README | 3
-rw-r--r-- Documentation/hwmon/f71882fg | 12
-rw-r--r-- Documentation/hwmon/ibmaem | 2
-rw-r--r-- Documentation/hwmon/sysfs-interface | 19
-rw-r--r-- Documentation/hwmon/tmp401 | 42
-rw-r--r-- Documentation/hwmon/w83627ehf | 11
-rw-r--r-- Documentation/i2c/busses/i2c-viapro | 4
-rw-r--r-- Documentation/kernel-parameters.txt | 4
-rw-r--r-- Documentation/kmemcheck.txt | 773
-rw-r--r-- Documentation/kprobes.txt | 6
-rw-r--r-- Documentation/scsi/scsi_fc_transport.txt | 14
-rw-r--r-- Documentation/scsi/scsi_mid_low_api.txt | 5
-rw-r--r-- Documentation/sysctl/vm.txt | 23
-rw-r--r-- Documentation/trace/ftrace.txt | 233
-rw-r--r-- Documentation/trace/mmiotrace.txt | 26
-rw-r--r-- Documentation/video4linux/CARDLIST.cx23885 | 5
-rw-r--r-- Documentation/video4linux/CARDLIST.cx88 | 2
-rw-r--r-- Documentation/video4linux/CARDLIST.em28xx | 6
-rw-r--r-- Documentation/video4linux/CARDLIST.saa7134 | 22
-rw-r--r-- Documentation/video4linux/CARDLIST.tuner | 2
-rw-r--r-- Documentation/video4linux/gspca.txt | 12
-rw-r--r-- Documentation/video4linux/pxa_camera.txt | 49
-rw-r--r-- Documentation/video4linux/v4l2-framework.txt | 5
-rw-r--r-- Documentation/vm/Makefile | 2
-rw-r--r-- Documentation/vm/balance | 18
-rw-r--r-- Documentation/vm/page-types.c | 698
-rw-r--r-- Documentation/vm/pagemap.txt | 68
41 files changed, 2028 insertions, 212 deletions
diff --git a/Documentation/DocBook/debugobjects.tmpl b/Documentation/DocBook/debugobjects.tmpl
index 7f5f218015fe..08ff908aa7a2 100644
--- a/Documentation/DocBook/debugobjects.tmpl
+++ b/Documentation/DocBook/debugobjects.tmpl
@@ -106,7 +106,7 @@
       number of errors are printk'ed including a full stack trace.
     </para>
     <para>
-      The statistics are available via debugfs/debug_objects/stats.
+      The statistics are available via /sys/kernel/debug/debug_objects/stats.
       They provide information about the number of warnings and the
       number of successful fixups along with information about the
       usage of the internal tracking objects and the state of the
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c
index 7ea231172c85..aa73e72fd793 100644
--- a/Documentation/accounting/getdelays.c
+++ b/Documentation/accounting/getdelays.c
@@ -246,7 +246,8 @@ void print_ioacct(struct taskstats *t)
 
 int main(int argc, char *argv[])
 {
-	int c, rc, rep_len, aggr_len, len2, cmd_type;
+	int c, rc, rep_len, aggr_len, len2;
+	int cmd_type = TASKSTATS_CMD_ATTR_UNSPEC;
 	__u16 id;
 	__u32 mypid;
 
diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index 4ef245010457..396bec3b74ed 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -229,10 +229,10 @@ kernel. It is the use of atomic counters to implement reference
 counting, and it works such that once the counter falls to zero it can
 be guaranteed that no other entity can be accessing the object:
 
-static void obj_list_add(struct obj *obj)
+static void obj_list_add(struct obj *obj, struct list_head *head)
 {
 	obj->active = 1;
-	list_add(&obj->list);
+	list_add(&obj->list, head);
 }
 
 static void obj_list_del(struct obj *obj)
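The atomic_ops.txt hunk above gives obj_list_add() an explicit list head, because list_add() needs to know which list the entry joins. A minimal userspace sketch of that shape follows. The struct list_head, init_list_head() and list_add() here are simplified stand-ins written for illustration, not the kernel's <linux/list.h>, and struct obj is reduced to the two fields the hunk touches.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty circular list is a head that points at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'entry' directly after 'head'; both arguments are required. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Reduced version of the documentation's example object. */
struct obj {
	int active;
	struct list_head list;
};

/* Same shape as the corrected helper in the hunk: the head is passed in. */
static void obj_list_add(struct obj *obj, struct list_head *head)
{
	obj->active = 1;
	list_add(&obj->list, head);
}
```

A caller would initialise a head once with init_list_head() and then pass it to every obj_list_add() call; the most recently added object ends up first in the list.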
diff --git a/Documentation/cdrom/packet-writing.txt b/Documentation/cdrom/packet-writing.txt
index cf1f8126991c..1c407778c8b2 100644
--- a/Documentation/cdrom/packet-writing.txt
+++ b/Documentation/cdrom/packet-writing.txt
@@ -117,7 +117,7 @@ Using the pktcdvd debugfs interface
 
 To read pktcdvd device infos in human readable form, do:
 
-	# cat /debug/pktcdvd/pktcdvd[0-7]/info
+	# cat /sys/kernel/debug/pktcdvd/pktcdvd[0-7]/info
 
 For a description of the debugfs interface look into the file:
 
diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt
index 43c743903dd7..75a58d14d3cf 100644
--- a/Documentation/cpu-freq/cpu-drivers.txt
+++ b/Documentation/cpu-freq/cpu-drivers.txt
@@ -155,7 +155,7 @@ actual frequency must be determined using the following rules:
 - if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal
   target_freq. ("H for highest, but no higher than")
 
-Here again the frequency table helper might assist you - see section 3
+Here again the frequency table helper might assist you - see section 2
 for details.
 
 
diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt
index ce73f3eb5ddb..aed082f49d09 100644
--- a/Documentation/cpu-freq/governors.txt
+++ b/Documentation/cpu-freq/governors.txt
@@ -119,10 +119,6 @@ want the kernel to look at the CPU usage and to make decisions on
 what to do about the frequency. Typically this is set to values of
 around '10000' or more. It's default value is (cmp. with users-guide.txt):
 transition_latency * 1000
-The lowest value you can set is:
-transition_latency * 100 or it may get restricted to a value where it
-makes not sense for the kernel anymore to poll that often which depends
-on your HZ config variable (HZ=1000: max=20000us, HZ=250: max=5000).
 Be aware that transition latency is in ns and sampling_rate is in us, so you
 get the same sysfs value by default.
 Sampling rate should always get adjusted considering the transition latency
@@ -131,14 +127,20 @@ in the bash (as said, 1000 is default), do:
 echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) \
     >ondemand/sampling_rate
 
-show_sampling_rate_(min|max): THIS INTERFACE IS DEPRECATED, DON'T USE IT.
-You can use wider ranges now and the general
-cpuinfo_transition_latency variable (cmp. with user-guide.txt) can be
-used to obtain exactly the same info:
-show_sampling_rate_min = transtition_latency * 500 / 1000
-show_sampling_rate_max = transtition_latency * 500000 / 1000
-(divided by 1000 is to illustrate that sampling rate is in us and
-transition latency is exported ns).
+show_sampling_rate_min:
+The sampling rate is limited by the HW transition latency:
+transition_latency * 100
+Or by kernel restrictions:
+If CONFIG_NO_HZ is set, the limit is 10ms fixed.
+If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the
+limits depend on the CONFIG_HZ option:
+HZ=1000: min=20000us (20ms)
+HZ=250: min=80000us (80ms)
+HZ=100: min=200000us (200ms)
+The highest value of kernel and HW latency restrictions is shown and
+used as the minimum sampling rate.
+
+show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT.
 
 up_threshold: defines what the average CPU usage between the samplings
 of 'sampling_rate' needs to be for the kernel to make a decision on
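The new show_sampling_rate_min text in the governors.txt hunk above describes the minimum as the larger of a hardware limit and a kernel limit. The sketch below works through that arithmetic under two assumptions that are not stated in the text: the hardware limit is converted from ns to us by dividing by 1000 (mirroring the remark that the latency is in ns and the sampling rate in us), and the three HZ examples are generalised as "20 timer ticks", which reproduces the listed values. The function names are hypothetical and this is not the governor's actual code.

```c
#include <stdint.h>

/*
 * Kernel-side lower bound in microseconds (hypothetical helper).
 * no_hz != 0 models CONFIG_NO_HZ being in effect: fixed 10 ms.
 * Otherwise the documented values all equal 20 ticks:
 * HZ=1000 -> 20000 us, HZ=250 -> 80000 us, HZ=100 -> 200000 us.
 * hz must be non-zero.
 */
static uint64_t kernel_min_sampling_rate_us(int no_hz, unsigned int hz)
{
	if (no_hz)
		return 10000;
	return 20ULL * 1000000ULL / hz;
}

/*
 * Hardware-side lower bound: transition_latency * 100, with the latency
 * given in ns and the result expressed in us (assumed unit handling).
 */
static uint64_t hw_min_sampling_rate_us(uint64_t transition_latency_ns)
{
	return transition_latency_ns * 100 / 1000;
}

/* The larger of the two restrictions is the effective minimum. */
static uint64_t min_sampling_rate_us(uint64_t transition_latency_ns,
				     int no_hz, unsigned int hz)
{
	uint64_t hw = hw_min_sampling_rate_us(transition_latency_ns);
	uint64_t kernel = kernel_min_sampling_rate_us(no_hz, hz);

	return hw > kernel ? hw : kernel;
}
```

With a 10000 ns transition latency and HZ=1000 without NO_HZ, the hardware limit works out to 1000 us, so the 20000 us kernel limit applies.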
diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt
index 75f41193f3e1..5d5f5fadd1c2 100644
--- a/Documentation/cpu-freq/user-guide.txt
+++ b/Documentation/cpu-freq/user-guide.txt
@@ -31,7 +31,6 @@ Contents:
 
 3. How to change the CPU cpufreq policy and/or speed
 3.1 Preferred interface: sysfs
-3.2 Deprecated interfaces
 
 
 
diff --git a/Documentation/driver-model/device.txt b/Documentation/driver-model/device.txt
index a7cbfff40d07..a124f3126b0d 100644
--- a/Documentation/driver-model/device.txt
+++ b/Documentation/driver-model/device.txt
@@ -162,3 +162,35 @@ device_remove_file(dev,&dev_attr_power);
 
 The file name will be 'power' with a mode of 0644 (-rw-r--r--).
 
+Word of warning: While the kernel allows device_create_file() and
+device_remove_file() to be called on a device at any time, userspace has
+strict expectations on when attributes get created. When a new device is
+registered in the kernel, a uevent is generated to notify userspace (like
+udev) that a new device is available. If attributes are added after the
+device is registered, then userspace won't get notified and userspace will
+not know about the new attributes.
+
+This is important for device driver that need to publish additional
+attributes for a device at driver probe time. If the device driver simply
+calls device_create_file() on the device structure passed to it, then
+userspace will never be notified of the new attributes. Instead, it should
+probably use class_create() and class->dev_attrs to set up a list of
+desired attributes in the modules_init function, and then in the .probe()
+hook, and then use device_create() to create a new device as a child
+of the probed device. The new device will generate a new uevent and
+properly advertise the new attributes to userspace.
+
+For example, if a driver wanted to add the following attributes:
+struct device_attribute mydriver_attribs[] = {
+	__ATTR(port_count, 0444, port_count_show),
+	__ATTR(serial_number, 0444, serial_number_show),
+	NULL
+};
+
+Then in the module init function is would do:
+	mydriver_class = class_create(THIS_MODULE, "my_attrs");
+	mydriver_class.dev_attr = mydriver_attribs;
+
+And assuming 'dev' is the struct device passed into the probe hook, the driver
+probe function would do something like:
+	create_device(&mydriver_class, dev, chrdev, &private_data, "my_name");
diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware
index 2f21ecd4c205..a52adfc9a57f 100644
--- a/Documentation/dvb/get_dvb_firmware
+++ b/Documentation/dvb/get_dvb_firmware
@@ -112,7 +112,7 @@ sub tda10045 {
 
 sub tda10046 {
     my $sourcefile = "TT_PCI_2.19h_28_11_2006.zip";
-    my $url = "http://technotrend-online.com/download/software/219/$sourcefile";
+    my $url = "http://www.tt-download.com/download/updates/219/$sourcefile";
     my $hash = "6a7e1e2f2644b162ff0502367553c72d";
     my $outfile = "dvb-fe-tda10046.fw";
     my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
@@ -129,8 +129,8 @@ sub tda10046 {
 }
 
 sub tda10046lifeview {
-    my $sourcefile = "Drv_2.11.02.zip";
-    my $url = "http://www.lifeview.com.tw/drivers/pci_card/FlyDVB-T/$sourcefile";
+    my $sourcefile = "7%5Cdrv_2.11.02.zip";
+    my $url = "http://www.lifeview.hk/dbimages/document/$sourcefile";
     my $hash = "1ea24dee4eea8fe971686981f34fd2e0";
     my $outfile = "dvb-fe-tda10046.fw";
     my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
@@ -317,7 +317,7 @@ sub nxt2002 {
 
 sub nxt2004 {
     my $sourcefile = "AVerTVHD_MCE_A180_Drv_v1.2.2.16.zip";
-    my $url = "http://www.aver.com/support/Drivers/$sourcefile";
+    my $url = "http://www.avermedia-usa.com/support/Drivers/$sourcefile";
     my $hash = "111cb885b1e009188346d72acfed024c";
     my $outfile = "dvb-fe-nxt2004.fw";
     my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index 4bc374a14345..079305640790 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -29,16 +29,16 @@ o debugfs entries
 fault-inject-debugfs kernel module provides some debugfs entries for runtime
 configuration of fault-injection capabilities.
 
-- /debug/fail*/probability:
+- /sys/kernel/debug/fail*/probability:
 
 	likelihood of failure injection, in percent.
 	Format: <percent>
 
 	Note that one-failure-per-hundred is a very high error rate
 	for some testcases. Consider setting probability=100 and configure
-	/debug/fail*/interval for such testcases.
+	/sys/kernel/debug/fail*/interval for such testcases.
 
-- /debug/fail*/interval:
+- /sys/kernel/debug/fail*/interval:
 
 	specifies the interval between failures, for calls to
 	should_fail() that pass all the other tests.
@@ -46,18 +46,18 @@ configuration of fault-injection capabilities.
 	Note that if you enable this, by setting interval>1, you will
 	probably want to set probability=100.
 
-- /debug/fail*/times:
+- /sys/kernel/debug/fail*/times:
 
 	specifies how many times failures may happen at most.
 	A value of -1 means "no limit".
 
-- /debug/fail*/space:
+- /sys/kernel/debug/fail*/space:
 
 	specifies an initial resource "budget", decremented by "size"
 	on each call to should_fail(,size). Failure injection is
 	suppressed until "space" reaches zero.
 
-- /debug/fail*/verbose
+- /sys/kernel/debug/fail*/verbose
 
 	Format: { 0 | 1 | 2 }
 	specifies the verbosity of the messages when failure is
@@ -65,17 +65,17 @@ configuration of fault-injection capabilities.
 	log line per failure; '2' will print a call trace too -- useful
 	to debug the problems revealed by fault injection.
 
-- /debug/fail*/task-filter:
+- /sys/kernel/debug/fail*/task-filter:
 
 	Format: { 'Y' | 'N' }
 	A value of 'N' disables filtering by process (default).
 	Any positive value limits failures to only processes indicated by
 	/proc/<pid>/make-it-fail==1.
 
-- /debug/fail*/require-start:
-- /debug/fail*/require-end:
-- /debug/fail*/reject-start:
-- /debug/fail*/reject-end:
+- /sys/kernel/debug/fail*/require-start:
+- /sys/kernel/debug/fail*/require-end:
+- /sys/kernel/debug/fail*/reject-start:
+- /sys/kernel/debug/fail*/reject-end:
 
 	specifies the range of virtual addresses tested during
 	stacktrace walking. Failure is injected only if some caller
@@ -84,26 +84,26 @@ configuration of fault-injection capabilities.
 	Default required range is [0,ULONG_MAX) (whole of virtual address space).
 	Default rejected range is [0,0).
 
-- /debug/fail*/stacktrace-depth:
+- /sys/kernel/debug/fail*/stacktrace-depth:
 
 	specifies the maximum stacktrace depth walked during search
 	for a caller within [require-start,require-end) OR
 	[reject-start,reject-end).
 
-- /debug/fail_page_alloc/ignore-gfp-highmem:
+- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:
 
 	Format: { 'Y' | 'N' }
 	default is 'N', setting it to 'Y' won't inject failures into
 	highmem/user allocations.
 
-- /debug/failslab/ignore-gfp-wait:
-- /debug/fail_page_alloc/ignore-gfp-wait:
+- /sys/kernel/debug/failslab/ignore-gfp-wait:
+- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
 
 	Format: { 'Y' | 'N' }
 	default is 'N', setting it to 'Y' will inject failures
 	only into non-sleep allocations (GFP_ATOMIC allocations).
 
-- /debug/fail_page_alloc/min-order:
+- /sys/kernel/debug/fail_page_alloc/min-order:
 
 	specifies the minimum page allocation order to be injected
 	failures.
@@ -166,13 +166,13 @@ o Inject slab allocation failures into module init/exit code
 #!/bin/bash
 
 FAILTYPE=failslab
-echo Y > /debug/$FAILTYPE/task-filter
-echo 10 > /debug/$FAILTYPE/probability
-echo 100 > /debug/$FAILTYPE/interval
-echo -1 > /debug/$FAILTYPE/times
-echo 0 > /debug/$FAILTYPE/space
-echo 2 > /debug/$FAILTYPE/verbose
-echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
+echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
+echo 10 > /sys/kernel/debug/$FAILTYPE/probability
+echo 100 > /sys/kernel/debug/$FAILTYPE/interval
+echo -1 > /sys/kernel/debug/$FAILTYPE/times
+echo 0 > /sys/kernel/debug/$FAILTYPE/space
+echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
+echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
 
 faulty_system()
 {
@@ -217,20 +217,20 @@ then
 	exit 1
 fi
 
-cat /sys/module/$module/sections/.text > /debug/$FAILTYPE/require-start
-cat /sys/module/$module/sections/.data > /debug/$FAILTYPE/require-end
+cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
+cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end
 
-echo N > /debug/$FAILTYPE/task-filter
-echo 10 > /debug/$FAILTYPE/probability
-echo 100 > /debug/$FAILTYPE/interval
-echo -1 > /debug/$FAILTYPE/times
-echo 0 > /debug/$FAILTYPE/space
-echo 2 > /debug/$FAILTYPE/verbose
-echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
-echo 1 > /debug/$FAILTYPE/ignore-gfp-highmem
-echo 10 > /debug/$FAILTYPE/stacktrace-depth
+echo N > /sys/kernel/debug/$FAILTYPE/task-filter
+echo 10 > /sys/kernel/debug/$FAILTYPE/probability
+echo 100 > /sys/kernel/debug/$FAILTYPE/interval
+echo -1 > /sys/kernel/debug/$FAILTYPE/times
+echo 0 > /sys/kernel/debug/$FAILTYPE/space
+echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
+echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
+echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
+echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
 
-trap "echo 0 > /debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
+trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
 
 echo "Injecting errors into the module $module... (interrupt to stop)"
 sleep 1000000
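The fault-injection.txt hunks above move every knob from /debug/... to /sys/kernel/debug/.... A program that sets these knobs can keep that prefix in one place instead of repeating it, as the shell scripts in the hunks do. The sketch below assumes debugfs is mounted at /sys/kernel/debug, as the updated text implies; fail_knob_path() and fail_knob_write() are hypothetical helper names, not part of any kernel or libc interface.

```c
#include <stdio.h>

/*
 * Build "/sys/kernel/debug/<failtype>/<knob>" into buf.
 * Returns 0 on success, -1 if the path does not fit in len bytes.
 */
static int fail_knob_path(char *buf, size_t len,
			  const char *failtype, const char *knob)
{
	int n = snprintf(buf, len, "/sys/kernel/debug/%s/%s", failtype, knob);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a value string to one knob, e.g. ("failslab", "probability", "10").
 * Returns 0 on success, -1 on any error (path too long, open or write failure).
 */
static int fail_knob_write(const char *failtype, const char *knob,
			   const char *value)
{
	char path[256];
	FILE *f;
	int rc;

	if (fail_knob_path(path, sizeof(path), failtype, knob) != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	rc = (fputs(value, f) < 0) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

Writing to the real knobs needs root and a kernel built with fault injection; on other systems fail_knob_write() simply returns -1.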
diff --git a/Documentation/fb/vesafb.txt b/Documentation/fb/vesafb.txt
index ee277dd204b0..950d5a658cb3 100644
--- a/Documentation/fb/vesafb.txt
+++ b/Documentation/fb/vesafb.txt
@@ -95,7 +95,7 @@ There is no way to change the vesafb video mode and/or timings after
 booting linux. If you are not happy with the 60 Hz refresh rate, you
 have these options:
 
- * configure and load the DOS-Tools for your the graphics board (if
+ * configure and load the DOS-Tools for the graphics board (if
    available) and boot linux with loadlin.
  * use a native driver (matroxfb/atyfb) instead if vesafb. If none
    is available, write a new one!
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 3120f8dd2c31..229d7b7c50a3 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -187,7 +187,7 @@ readpages: no
 write_begin:		no	locks the page		yes
 write_end:		no	yes, unlocks		yes
 perform_write:		no	n/a			yes
-bmap:			yes
+bmap:			no
 invalidatepage:		no	yes
 releasepage:		no	yes
 direct_IO:		no
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index cd8717a36271..ebff3c10a07f 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1003,11 +1003,13 @@ CHAPTER 3: PER-PROCESS PARAMETERS
 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
 ------------------------------------------------------
 
-This file can be used to adjust the score used to select which processes
-should be killed in an out-of-memory situation. Giving it a high score will
-increase the likelihood of this process being killed by the oom-killer. Valid
-values are in the range -16 to +15, plus the special value -17, which disables
-oom-killing altogether for this process.
+This file can be used to adjust the score used to select which processes should
+be killed in an out-of-memory situation. The oom_adj value is a characteristic
+of the task's mm, so all threads that share an mm with pid will have the same
+oom_adj value. A high value will increase the likelihood of this process being
+killed by the oom-killer. Valid values are in the range -16 to +15 as
+explained below and a special value of -17, which disables oom-killing
+altogether for threads sharing pid's mm.
 
 The process to be killed in an out-of-memory situation is selected among all others
 based on its badness score. This value equals the original memory size of the process
@@ -1021,6 +1023,9 @@ the parent's score if they do not share the same memory. Thus forking servers
 are the prime candidates to be killed. Having only one 'hungry' child will make
 parent less preferable than the child.
 
+/proc/<pid>/oom_adj cannot be changed for kthreads since they are immune from
+oom-killing already.
+
 /proc/<pid>/oom_score shows process' current badness score.
 
 The following heuristics are then applied:
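The proc.txt hunk above states the accepted oom_adj values: -16 to +15, plus the special value -17 that disables oom-killing. A tool that writes /proc/<pid>/oom_adj could check a value against that rule first. The sketch below encodes only what the text says; the enum and function names are hypothetical and chosen for this illustration.

```c
#include <stdbool.h>

/* Limits as described in the text; the names are illustrative only. */
enum {
	OOM_ADJ_DISABLE = -17,	/* special value: never oom-kill */
	OOM_ADJ_MIN = -16,
	OOM_ADJ_MAX = 15
};

/* True if 'value' is something the documentation says oom_adj accepts. */
static bool oom_adj_is_valid(int value)
{
	return value == OOM_ADJ_DISABLE ||
	       (value >= OOM_ADJ_MIN && value <= OOM_ADJ_MAX);
}

/* True if 'value' switches oom-killing off for threads sharing the mm. */
static bool oom_adj_disables_killing(int value)
{
	return value == OOM_ADJ_DISABLE;
}
```

Because the value belongs to the mm, a caller should expect one write to affect every thread that shares it, and should expect the write to fail for kthreads.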
diff --git a/Documentation/filesystems/vfat.txt b/Documentation/filesystems/vfat.txt
index 5147be5e13cd..b58b84b50fa2 100644
--- a/Documentation/filesystems/vfat.txt
+++ b/Documentation/filesystems/vfat.txt
@@ -132,6 +132,11 @@ rodir -- FAT has the ATTR_RO (read-only) attribute. On Windows,
                  If you want to use ATTR_RO as read-only flag even for
                  the directory, set this option.
 
+errors=panic|continue|remount-ro
+              -- specify FAT behavior on critical errors: panic, continue
+                 without doing anything or remount the partition in
+                 read-only mode (default behavior).
+
 <bool>: 0,1,yes,no,true,false
 
 TODO
diff --git a/Documentation/firmware_class/README b/Documentation/firmware_class/README
index c3480aa66ba8..7eceaff63f5f 100644
--- a/Documentation/firmware_class/README
+++ b/Documentation/firmware_class/README
@@ -77,7 +77,8 @@
    seconds for the whole load operation.
 
  - request_firmware_nowait() is also provided for convenience in
-   non-user contexts.
+   user contexts to request firmware asynchronously, but can't be called
+   in atomic contexts.
 
 
  about in-kernel persistence:
diff --git a/Documentation/hwmon/f71882fg b/Documentation/hwmon/f71882fg
index a8321267b5b6..bee4c30bc1e2 100644
--- a/Documentation/hwmon/f71882fg
+++ b/Documentation/hwmon/f71882fg
@@ -2,14 +2,18 @@ Kernel driver f71882fg
 ======================
 
 Supported chips:
-  * Fintek F71882FG and F71883FG
-    Prefix: 'f71882fg'
+  * Fintek F71858FG
+    Prefix: 'f71858fg'
     Addresses scanned: none, address read from Super I/O config space
     Datasheet: Available from the Fintek website
   * Fintek F71862FG and F71863FG
     Prefix: 'f71862fg'
     Addresses scanned: none, address read from Super I/O config space
     Datasheet: Available from the Fintek website
+  * Fintek F71882FG and F71883FG
+    Prefix: 'f71882fg'
+    Addresses scanned: none, address read from Super I/O config space
+    Datasheet: Available from the Fintek website
   * Fintek F8000
     Prefix: 'f8000'
     Addresses scanned: none, address read from Super I/O config space
@@ -66,13 +70,13 @@ printed when loading the driver.
 
 Three different fan control modes are supported; the mode number is written
 to the pwm#_enable file. Note that not all modes are supported on all
-chips, and some modes may only be available in RPM / PWM mode on the F8000.
+chips, and some modes may only be available in RPM / PWM mode.
 Writing an unsupported mode will result in an invalid parameter error.
 
 * 1: Manual mode
   You ask for a specific PWM duty cycle / DC voltage or a specific % of
   fan#_full_speed by writing to the pwm# file. This mode is only
-  available on the F8000 if the fan channel is in RPM mode.
+  available on the F71858FG / F8000 if the fan channel is in RPM mode.
 
 * 2: Normal auto mode
   You can define a number of temperature/fan speed trip points, which % the
diff --git a/Documentation/hwmon/ibmaem b/Documentation/hwmon/ibmaem
index e98bdfea3467..1e0d59e000b4 100644
--- a/Documentation/hwmon/ibmaem
+++ b/Documentation/hwmon/ibmaem
@@ -7,7 +7,7 @@ henceforth as AEM.
 Supported systems:
   * Any recent IBM System X server with AEM support.
     This includes the x3350, x3550, x3650, x3655, x3755, x3850 M2,
-    x3950 M2, and certain HS2x/LS2x/QS2x blades. The IPMI host interface
+    x3950 M2, and certain HC10/HS2x/LS2x/QS2x blades. The IPMI host interface
     driver ("ipmi-si") needs to be loaded for this driver to do anything.
     Prefix: 'ibmaem'
     Datasheet: Not available
diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface
index 004ee161721e..dcbd502c8792 100644
--- a/Documentation/hwmon/sysfs-interface
+++ b/Documentation/hwmon/sysfs-interface
@@ -70,6 +70,7 @@ are interpreted as 0! For more on how written strings are interpreted see the
 [0-*]	denotes any positive number starting from 0
 [1-*]	denotes any positive number starting from 1
 RO	read only value
+WO	write only value
 RW	read/write value
 
 Read/write values may be read-only for some chips, depending on the
@@ -295,6 +296,24 @@ temp[1-*]_label Suggested temperature channel label.
 		user-space.
 		RO
 
+temp[1-*]_lowest
+		Historical minimum temperature
+		Unit: millidegree Celsius
+		RO
+
+temp[1-*]_highest
+		Historical maximum temperature
+		Unit: millidegree Celsius
+		RO
+
+temp[1-*]_reset_history
+		Reset temp_lowest and temp_highest
+		WO
+
+temp_reset_history
+		Reset temp_lowest and temp_highest for all sensors
+		WO
+
 Some chips measure temperature using external thermistors and an ADC, and
 report the temperature measurement as a voltage. Converting this voltage
 back to a temperature (or the other way around for limits) requires
diff --git a/Documentation/hwmon/tmp401 b/Documentation/hwmon/tmp401
new file mode 100644
index 000000000000..9fc447249212
--- /dev/null
+++ b/Documentation/hwmon/tmp401
@@ -0,0 +1,42 @@
1Kernel driver tmp401
2====================
3
4Supported chips:
5 * Texas Instruments TMP401
6 Prefix: 'tmp401'
7 Addresses scanned: I2C 0x4c
8 Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp401.html
9 * Texas Instruments TMP411
10 Prefix: 'tmp411'
11 Addresses scanned: I2C 0x4c
12 Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp411.html
13
14Authors:
15 Hans de Goede <hdegoede@redhat.com>
16 Andre Prendel <andre.prendel@gmx.de>
17
18Description
19-----------
20
21This driver implements support for Texas Instruments TMP401 and
22TMP411 chips. These chips implement one remote and one local
23temperature sensor. Temperature is measured in degrees
24Celsius. Resolution of the remote sensor is 0.0625 degree. Local
25sensor resolution can be set to 0.5, 0.25, 0.125 or 0.0625 degree (not
26supported by the driver so far, so using the default resolution of 0.5
27degree).
28
29The driver provides the common sysfs-interface for temperatures (see
30/Documentation/hwmon/sysfs-interface under Temperatures).
31
32The TMP411 chip is compatible with TMP401. It provides some additional
33features.
34
35* Minimum and Maximum temperature measured since power-on, chip-reset
36
37 Exported via sysfs attributes tempX_lowest and tempX_highest.
38
39* Reset of historical minimum/maximum temperature measurements
40
41 Exported via sysfs attribute temp_reset_history. Writing 1 to this
42 file triggers a reset.
diff --git a/Documentation/hwmon/w83627ehf b/Documentation/hwmon/w83627ehf
index b6eb59384bb3..02b74899edaf 100644
--- a/Documentation/hwmon/w83627ehf
+++ b/Documentation/hwmon/w83627ehf
@@ -12,6 +12,10 @@ Supported chips:
12 Addresses scanned: ISA address retrieved from Super I/O registers 12 Addresses scanned: ISA address retrieved from Super I/O registers
13 Datasheet: 13 Datasheet:
14 http://www.nuvoton.com.tw/NR/rdonlyres/7885623D-A487-4CF9-A47F-30C5F73D6FE6/0/W83627DHG.pdf 14 http://www.nuvoton.com.tw/NR/rdonlyres/7885623D-A487-4CF9-A47F-30C5F73D6FE6/0/W83627DHG.pdf
15 * Winbond W83627DHG-P
16 Prefix: 'w83627dhg'
17 Addresses scanned: ISA address retrieved from Super I/O registers
18 Datasheet: not available
15 * Winbond W83667HG 19 * Winbond W83667HG
16 Prefix: 'w83667hg' 20 Prefix: 'w83667hg'
17 Addresses scanned: ISA address retrieved from Super I/O registers 21 Addresses scanned: ISA address retrieved from Super I/O registers
@@ -28,8 +32,8 @@ Description
28----------- 32-----------
29 33
30This driver implements support for the Winbond W83627EHF, W83627EHG, 34This driver implements support for the Winbond W83627EHF, W83627EHG,
31W83627DHG and W83667HG super I/O chips. We will refer to them collectively 35W83627DHG, W83627DHG-P and W83667HG super I/O chips. We will refer to them
32as Winbond chips. 36collectively as Winbond chips.
33 37
34The chips implement three temperature sensors, five fan rotation 38The chips implement three temperature sensors, five fan rotation
35speed sensors, ten analog voltage sensors (only nine for the 627DHG), one 39speed sensors, ten analog voltage sensors (only nine for the 627DHG), one
@@ -135,3 +139,6 @@ done in the driver for all register addresses.
135The DHG also supports PECI, where the DHG queries Intel CPU temperatures, and 139The DHG also supports PECI, where the DHG queries Intel CPU temperatures, and
136the ICH8 southbridge gets that data via PECI from the DHG, so that the 140the ICH8 southbridge gets that data via PECI from the DHG, so that the
137southbridge drives the fans. And the DHG supports SST, a one-wire serial bus. 141southbridge drives the fans. And the DHG supports SST, a one-wire serial bus.
142
143The DHG-P has an additional automatic fan speed control mode named Smart Fan
144(TM) III+. This mode is not yet supported by the driver.
diff --git a/Documentation/i2c/busses/i2c-viapro b/Documentation/i2c/busses/i2c-viapro
index 22efedf60c87..2e758b0e9456 100644
--- a/Documentation/i2c/busses/i2c-viapro
+++ b/Documentation/i2c/busses/i2c-viapro
@@ -19,6 +19,9 @@ Supported adapters:
19 * VIA Technologies, Inc. VX800/VX820 19 * VIA Technologies, Inc. VX800/VX820
20 Datasheet: available on http://linux.via.com.tw 20 Datasheet: available on http://linux.via.com.tw
21 21
22 * VIA Technologies, Inc. VX855/VX875
23 Datasheet: Availability unknown
24
22Authors: 25Authors:
23 Kyösti Mälkki <kmalkki@cc.hut.fi>, 26 Kyösti Mälkki <kmalkki@cc.hut.fi>,
24 Mark D. Studebaker <mdsxyz123@yahoo.com>, 27 Mark D. Studebaker <mdsxyz123@yahoo.com>,
@@ -53,6 +56,7 @@ Your lspci -n listing must show one of these :
53 device 1106:3287 (VT8251) 56 device 1106:3287 (VT8251)
54 device 1106:8324 (CX700) 57 device 1106:8324 (CX700)
55 device 1106:8353 (VX800/VX820) 58 device 1106:8353 (VX800/VX820)
59 device 1106:8409 (VX855/VX875)
56 60
57If none of these show up, you should look in the BIOS for settings like 61If none of these show up, you should look in the BIOS for settings like
58enable ACPI / SMBus or even USB. 62enable ACPI / SMBus or even USB.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 4426e0b81a11..96230a638e1b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -546,6 +546,10 @@ and is between 256 and 4096 characters. It is defined in the file
546 console=brl,ttyS0 546 console=brl,ttyS0
547 For now, only VisioBraille is supported. 547 For now, only VisioBraille is supported.
548 548
549 consoleblank= [KNL] The console blank (screen saver) timeout in
550 seconds. Defaults to 10*60 = 10mins. A value of 0
551 disables the blank timer.
552
549 coredump_filter= 553 coredump_filter=
550 [KNL] Change the default value for 554 [KNL] Change the default value for
551 /proc/<pid>/coredump_filter. 555 /proc/<pid>/coredump_filter.
diff --git a/Documentation/kmemcheck.txt b/Documentation/kmemcheck.txt
new file mode 100644
index 000000000000..363044609dad
--- /dev/null
+++ b/Documentation/kmemcheck.txt
@@ -0,0 +1,773 @@
1GETTING STARTED WITH KMEMCHECK
2==============================
3
4Vegard Nossum <vegardno@ifi.uio.no>
5
6
7Contents
8========
90. Introduction
101. Downloading
112. Configuring and compiling
123. How to use
133.1. Booting
143.2. Run-time enable/disable
153.3. Debugging
163.4. Annotating false positives
174. Reporting errors
185. Technical description
19
20
210. Introduction
22===============
23
24kmemcheck is a debugging feature for the Linux Kernel. More specifically, it
25is a dynamic checker that detects and warns about some uses of uninitialized
26memory.
27
28Userspace programmers might be familiar with Valgrind's memcheck. The main
29difference between memcheck and kmemcheck is that memcheck works for userspace
30programs only, and kmemcheck works for the kernel only. The implementations
31are of course vastly different. Because of this, kmemcheck is not as accurate
32as memcheck, but it turns out to be good enough in practice to discover real
33programmer errors that the compiler is not able to find through static
34analysis.
35
36Enabling kmemcheck on a kernel will probably slow it down to the extent that
37the machine will not be usable for normal workloads such as an
38interactive desktop. kmemcheck will also cause the kernel to use about twice
39as much memory as normal. For this reason, kmemcheck is strictly a debugging
40feature.
41
42
431. Downloading
44==============
45
46kmemcheck can only be downloaded using git. If you want to write patches
47against the current code, you should use the kmemcheck development branch of
48the tip tree. It is also possible to use the linux-next tree, which also
49includes the latest version of kmemcheck.
50
51Assuming that you've already cloned the linux-2.6.git repository, all you
52have to do is add the -tip tree as a remote, like this:
53
54 $ git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git
55
56To actually download the tree, fetch the remote:
57
58 $ git fetch tip
59
60And to check out a new local branch with the kmemcheck code:
61
62 $ git checkout -b kmemcheck tip/kmemcheck
63
64General instructions for the -tip tree can be found here:
65http://people.redhat.com/mingo/tip.git/readme.txt
66
67
682. Configuring and compiling
69============================
70
71kmemcheck only works for the x86 (both 32- and 64-bit) platform. A number of
72configuration variables must have specific settings in order for the kmemcheck
73menu to even appear in "menuconfig". These are:
74
75 o CONFIG_CC_OPTIMIZE_FOR_SIZE=n
76
77 This option is located under "General setup" / "Optimize for size".
78
79 Without this, gcc will use certain optimizations that usually lead to
80 false positive warnings from kmemcheck. An example of this is a 16-bit
81 field in a struct, where gcc may load 32 bits, then discard the upper
82 16 bits. kmemcheck sees only the 32-bit load, and may trigger a
83 warning for the upper 16 bits (if they're uninitialized).
84
85 o CONFIG_SLAB=y or CONFIG_SLUB=y
86
87 This option is located under "General setup" / "Choose SLAB
88 allocator".
89
90 o CONFIG_FUNCTION_TRACER=n
91
92 This option is located under "Kernel hacking" / "Tracers" / "Kernel
93 Function Tracer"
94
95 When function tracing is compiled in, gcc emits a call to another
96 function at the beginning of every function. This means that when the
97 page fault handler is called, the ftrace framework will be called
98 before kmemcheck has had a chance to handle the fault. If ftrace then
99 modifies memory that was tracked by kmemcheck, the result is an
100 endless recursive page fault.
101
102 o CONFIG_DEBUG_PAGEALLOC=n
103
104 This option is located under "Kernel hacking" / "Debug page memory
105 allocations".
106
107In addition, I highly recommend turning on CONFIG_DEBUG_INFO=y. This is also
108located under "Kernel hacking". With this, you will be able to get line number
109information from the kmemcheck warnings, which is extremely valuable in
110debugging a problem. This option is not mandatory, however, because it slows
111down the compilation process and produces a much bigger kernel image.
112
113Now the kmemcheck menu should be visible (under "Kernel hacking" / "kmemcheck:
114trap use of uninitialized memory"). Here follows a description of the
115kmemcheck configuration variables:
116
117 o CONFIG_KMEMCHECK
118
119 This must be enabled in order to use kmemcheck at all...
120
121 o CONFIG_KMEMCHECK_[DISABLED | ENABLED | ONESHOT]_BY_DEFAULT
122
123 This option controls the status of kmemcheck at boot-time. "Enabled"
124 will enable kmemcheck right from the start, "disabled" will boot the
125 kernel as normal (but with the kmemcheck code compiled in, so it can
126 be enabled at run-time after the kernel has booted), and "one-shot" is
127 a special mode which will turn kmemcheck off automatically after
128 detecting the first use of uninitialized memory.
129
130 If you are using kmemcheck to actively debug a problem, then you
131 probably want to choose "enabled" here.
132
133 The one-shot mode is mostly useful in automated test setups because it
134 can prevent floods of warnings and increase the chances of the machine
135 surviving in case something is really wrong. In other cases, the one-
136 shot mode could actually be counter-productive because it would turn
137 itself off at the very first error -- in the case of a false positive
138 too -- and this would get in the way of debugging the specific
139 problem you were interested in.
140
141 If you would like to use your kernel as normal, but with a chance to
142 enable kmemcheck in case of some problem, it might be a good idea to
143 choose "disabled" here. When kmemcheck is disabled, most of the run-
144 time overhead is not incurred, and the kernel will be almost as fast
145 as normal.
146
147 o CONFIG_KMEMCHECK_QUEUE_SIZE
148
149 Select the maximum number of error reports to store in an internal
150 (fixed-size) buffer. Since errors can occur virtually anywhere and in
151 any context, we need a temporary storage area which is guaranteed not
152 to generate any other page faults when accessed. The queue will be
153 emptied as soon as a tasklet may be scheduled. If the queue is full,
154 new error reports will be lost.
155
156 The default value of 64 is probably fine. If some code produces more
157 than 64 errors within an irqs-off section, then the code is likely to
158 produce many, many more, too, and these additional reports seldom give
159 any more information (the first report is usually the most valuable
160 anyway).
161
162 This number might have to be adjusted if you are not using serial
163 console or similar to capture the kernel log. If you are using the
164 "dmesg" command to save the log, then getting a lot of kmemcheck
165 warnings might overflow the kernel log itself, and the earlier reports
166 will get lost in that way instead. Try setting this to 10 or so on
167 such a setup.
168
169 o CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT
170
171 Select the number of shadow bytes to save along with each entry of the
172 error-report queue. These bytes indicate what parts of an allocation
173 are initialized, uninitialized, etc. and will be displayed when an
174 error is detected to help the debugging of a particular problem.
175
176 The number entered here is actually the logarithm of the number of
177 bytes that will be saved. So if you pick for example 5 here, kmemcheck
178 will save 2^5 = 32 bytes.
179
180 The default value should be fine for debugging most problems. It also
181 fits nicely within 80 columns.
182
183 o CONFIG_KMEMCHECK_PARTIAL_OK
184
185 This option (when enabled) works around certain GCC optimizations that
186 produce 32-bit reads from 16-bit variables where the upper 16 bits are
187 thrown away afterwards.
188
189 The default value (enabled) is recommended. This may of course hide
190 some real errors, but disabling it would probably produce a lot of
191 false positives.
192
193 o CONFIG_KMEMCHECK_BITOPS_OK
194
195 This option silences warnings that would be generated for bit-field
196 accesses where not all the bits are initialized at the same time. This
197 may also hide some real bugs.
198
199 This option is probably obsolete, or it should be replaced with
200 the kmemcheck-/bitfield-annotations for the code in question. The
201 default value is therefore fine.
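
The arithmetic behind CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT can be checked with a
short sketch. It assumes the report layout used in the sample report of this
document (two hex digits per byte in the memory dump, and a space plus one
status letter per byte in the shadow line); the function names are made up
for illustration and are not part of kmemcheck.

```python
def shadow_copy_bytes(shift):
    """Number of shadow bytes saved per report entry: 2**shift."""
    if shift < 0:
        raise ValueError("shift must be non-negative")
    return 1 << shift


def report_line_widths(shift):
    """Return (hex dump width, shadow line width) in characters.

    Assumed layout: "80" style hex pairs for the dump, " i" style
    pairs (space + status letter) for the shadow bytemap.
    """
    n = shadow_copy_bytes(shift)
    return 2 * n, 2 * n


def fits_in_columns(shift, columns=80):
    """True if both report lines fit within the given terminal width."""
    return max(report_line_widths(shift)) <= columns


if __name__ == "__main__":
    for shift in (4, 5, 6):
        print(shift, shadow_copy_bytes(shift), fits_in_columns(shift))
```

With a shift of 5 this gives 32 bytes and two 64-character lines, which is
consistent with the remark about fitting within 80 columns; a shift of 6
would no longer fit.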
202
203Now compile the kernel as usual.
204
205
2063. How to use
207=============
208
2093.1. Booting
210============
211
212First some information about the command-line options. There is only one
213option specific to kmemcheck, and this is called "kmemcheck". It can be used
214to override the default mode as chosen by the CONFIG_KMEMCHECK_*_BY_DEFAULT
215option. Its possible settings are:
216
217 o kmemcheck=0 (disabled)
218 o kmemcheck=1 (enabled)
219 o kmemcheck=2 (one-shot mode)
220
221If SLUB debugging has been enabled in the kernel, it may take precedence over
222kmemcheck in such a way that the slab caches which are under SLUB debugging
223will not be tracked by kmemcheck. In order to ensure that this doesn't happen
224(even though it shouldn't by default), use SLUB's boot option "slub_debug",
225like this: slub_debug=-
226
227In fact, this option may also be used for fine-grained control over SLUB vs.
228kmemcheck. For example, if the command line includes "kmemcheck=1
229slub_debug=,dentry", then SLUB debugging will be used only for the "dentry"
230slab cache, and with kmemcheck tracking all the other caches. This is advanced
231usage, however, and is not generally recommended.
232
233
2343.2. Run-time enable/disable
235============================
236
237When the kernel has booted, it is possible to enable or disable kmemcheck at
238run-time. WARNING: This feature is still experimental and may cause false
239positive warnings to appear. Therefore, try not to use this. If you find that
240it doesn't work properly (e.g. you see an unreasonable amount of warnings), I
241will be happy to take bug reports.
242
243Use the file /proc/sys/kernel/kmemcheck for this purpose, e.g.:
244
245 $ echo 0 > /proc/sys/kernel/kmemcheck # disables kmemcheck
246
247The numbers are the same as for the kmemcheck= command-line option.
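
For scripted test setups, the same switch can be driven from a small helper.
This is only a sketch: the mode numbers and the file path are the ones given
in this document, the helper names are hypothetical, and writing to the file
requires root on a kernel that was built with kmemcheck.

```python
from pathlib import Path

# Mode numbers as documented for kmemcheck= and the sysctl file.
KMEMCHECK_MODES = {"disabled": 0, "enabled": 1, "one-shot": 2}

# Path from the text above; it only exists on a kmemcheck-enabled kernel.
SYSCTL_PATH = Path("/proc/sys/kernel/kmemcheck")


def set_kmemcheck_mode(mode, path=SYSCTL_PATH):
    """Write the number for `mode` to the sysctl file and return it."""
    value = KMEMCHECK_MODES[mode]  # KeyError for unknown mode names
    Path(path).write_text("%d\n" % value)
    return value


def get_kmemcheck_mode(path=SYSCTL_PATH):
    """Read the sysctl file and translate the number back to a mode name."""
    value = int(Path(path).read_text().strip())
    for name, number in KMEMCHECK_MODES.items():
        if number == value:
            return name
    raise ValueError("unexpected kmemcheck mode: %d" % value)
```

The path is a parameter so that the helper can be exercised against an
ordinary file before it is pointed at the real sysctl entry.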
248
249
2503.3. Debugging
251==============
252
253A typical report will look something like this:
254
255WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
25680000000000000000000000000000000000000000088ffff0000000000000000
257 i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
258 ^
259
260Pid: 1856, comm: ntpdate Not tainted 2.6.29-rc5 #264 945P-A
261RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
262RSP: 0018:ffff88003cdf7d98 EFLAGS: 00210002
263RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
264RDX: ffff88003e5d6018 RSI: ffff88003e5d6024 RDI: ffff88003cdf7e84
265RBP: ffff88003cdf7db8 R08: ffff88003e5d6000 R09: 0000000000000000
266R10: 0000000000000080 R11: 0000000000000000 R12: 000000000000000e
267R13: ffff88003cdf7e78 R14: ffff88003d530710 R15: ffff88003d5a98c8
268FS: 0000000000000000(0000) GS:ffff880001982000(0063) knlGS:00000
269CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
270CR2: ffff88003f806ea0 CR3: 000000003c036000 CR4: 00000000000006a0
271DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
272DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
273 [<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
274 [<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
275 [<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
276 [<ffffffff8100c7b5>] int_signal+0x12/0x17
277 [<ffffffffffffffff>] 0xffffffffffffffff
278
279The single most valuable piece of information in this report is the RIP (or EIP
280on 32-bit) value. This will help us pinpoint exactly which instruction caused
281the warning.
282
283If your kernel was compiled with CONFIG_DEBUG_INFO=y, then all we have to do
284is give this address to the addr2line program, like this:
285
286 $ addr2line -e vmlinux -i ffffffff8104ede8
287 arch/x86/include/asm/string_64.h:12
288 include/asm-generic/siginfo.h:287
289 kernel/signal.c:380
290 kernel/signal.c:410
291
292The "-e vmlinux" tells addr2line which file to look in. IMPORTANT: This must
293be the vmlinux of the kernel that produced the warning in the first place! If
294not, the line number information will almost certainly be wrong.
295
296The "-i" tells addr2line to also print the line numbers of inlined functions.
297In this case, the flag was very important, because otherwise, it would only
298have printed the first line, which is just a call to memcpy(), which could be
299called from a thousand places in the kernel, and is therefore not very useful.
300These inlined functions would not show up in the stack trace above, simply
301because the kernel doesn't load the extra debugging information. This
302technique can of course be used with ordinary kernel oopses as well.
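
When many reports have to be processed, pulling the addresses out by hand gets
tedious. The following sketch extracts the RIP address and the stack trace
addresses from a saved report and builds the addr2line command line shown
above. The regular expressions are assumptions based on the sample report in
this document (the EIP form on 32-bit is assumed to look the same), and the
function names are hypothetical.

```python
import re

# "RIP: 0010:[<ffffffff8104ede8>] ..." (EIP on 32-bit, assumed same layout)
RIP_RE = re.compile(r"^[RE]IP: [0-9a-f]{4}:\[<([0-9a-f]+)>\]")
# "[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170"
FRAME_RE = re.compile(r"^\[<([0-9a-f]+)>\] \S+\+0x[0-9a-f]+/0x[0-9a-f]+")


def faulting_address(report):
    """Return the address from the RIP/EIP line of a kmemcheck report."""
    for line in report.splitlines():
        match = RIP_RE.match(line.strip())
        if match:
            return match.group(1)
    raise ValueError("no RIP/EIP line found in report")


def stack_addresses(report):
    """Return the return addresses listed in the stack trace, in order."""
    found = []
    for line in report.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            found.append(match.group(1))
    return found


def addr2line_command(addresses, vmlinux="vmlinux"):
    """Build the argument list for 'addr2line -e vmlinux -i ...'."""
    return ["addr2line", "-e", vmlinux, "-i"] + list(addresses)
```

The resulting list can be handed to subprocess.run(); as noted above, the
vmlinux passed in must be the one that produced the warning.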
303
304In this case, it's the caller of memcpy() that is interesting, and it can be
305found in include/asm-generic/siginfo.h, line 287:
306
307281 static inline void copy_siginfo(struct siginfo *to, struct siginfo *from)
308282 {
309283 if (from->si_code < 0)
310284 memcpy(to, from, sizeof(*to));
311285 else
312286 /* _sigchld is currently the largest know union member */
313287 memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld));
314288 }
315
316Since this was a read (kmemcheck usually warns about reads only, though it can
317warn about writes to unallocated or freed memory as well), it was probably the
318"from" argument which contained some uninitialized bytes. Following the chain
319of calls, we move upwards to see where "from" was allocated or initialized,
320kernel/signal.c, line 380:
321
322359 static void collect_signal(int sig, struct sigpending *list, siginfo_t *info)
323360 {
324...
325367 list_for_each_entry(q, &list->list, list) {
326368 if (q->info.si_signo == sig) {
327369 if (first)
328370 goto still_pending;
329371 first = q;
330...
331377 if (first) {
332378 still_pending:
333379 list_del_init(&first->list);
334380 copy_siginfo(info, &first->info);
335381 __sigqueue_free(first);
336...
337392 }
338393 }
339
340Here, it is &first->info that is being passed on to copy_siginfo(). The
341variable "first" was found on a list -- passed in as the second argument to
342collect_signal(). We continue our journey through the stack, to figure out
343where the item on "list" was allocated or initialized. We move to line 410:
344
345395 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
346396 siginfo_t *info)
347397 {
348...
349410 collect_signal(sig, pending, info);
350...
351414 }
352
353Now we need to follow the "pending" pointer, since that is being passed on to
354collect_signal() as "list". At this point, we've run out of lines from the
355"addr2line" output. Not to worry, we just paste the next addresses from the
356kmemcheck stack dump, i.e.:
357
358 [<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
359 [<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
360 [<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
361 [<ffffffff8100c7b5>] int_signal+0x12/0x17
362
363 $ addr2line -e vmlinux -i ffffffff8104f04e ffffffff81050bd8 \
364 ffffffff8100b87d ffffffff8100c7b5
365 kernel/signal.c:446
366 kernel/signal.c:1806
367 arch/x86/kernel/signal.c:805
368 arch/x86/kernel/signal.c:871
369 arch/x86/kernel/entry_64.S:694
370
371Remember that since these addresses were found on the stack and not as the
372RIP value, they actually point to the _next_ instruction (they are return
373addresses). This becomes obvious when we look at the code for line 446:
374
375422 int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
376423 {
377...
378431 signr = __dequeue_signal(&tsk->signal->shared_pending,
379432 mask, info);
380433 /*
381434 * itimer signal ?
382435 *
383436 * itimers are process shared and we restart periodic
384437 * itimers in the signal delivery path to prevent DoS
385438 * attacks in the high resolution timer case. This is
386439 * compliant with the old way of self restarting
387440 * itimers, as the SIGALRM is a legacy signal and only
388441 * queued once. Changing the restart behaviour to
389442 * restart the timer in the signal dequeue path is
390443 * reducing the timer noise on heavy loaded !highres
391444 * systems too.
392445 */
393446 if (unlikely(signr == SIGALRM)) {
394...
395489 }
396
397So instead of looking at 446, we should be looking at 431, which is the line
398that executes just before 446. Here we see that what we are looking for is
399&tsk->signal->shared_pending.
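
A common trick for avoiding this manual step (it is not what was done in the
walkthrough above) is to subtract one from each return address before handing
it to addr2line. The adjusted address then falls inside the call instruction
itself, so the lookup tends to resolve to the line of the call rather than
the line after it. A minimal sketch, with a hypothetical helper name:

```python
def call_site_addresses(return_addresses):
    """Map return addresses (hex strings) to addresses inside the call.

    Subtracting 1 is enough: the result does not point at the start of
    the call instruction, but it lies within it, which is all that a
    line number lookup needs.
    """
    adjusted = []
    for addr in return_addresses:
        value = int(addr, 16)
        if value == 0:
            raise ValueError("cannot adjust a null return address")
        adjusted.append(format(value - 1, "x"))
    return adjusted


if __name__ == "__main__":
    # Return addresses taken from the stack dump above.
    trace = ["ffffffff8104f04e", "ffffffff81050bd8"]
    print(" ".join(call_site_addresses(trace)))
```

This must only be applied to addresses from the stack trace; the RIP value
already points at the faulting instruction and should be used unchanged.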
400
401Our next task is now to figure out which function puts items on this
402"shared_pending" list. A crude but efficient tool is git grep:
403
404 $ git grep -n 'shared_pending' kernel/
405 ...
406 kernel/signal.c:828: pending = group ? &t->signal->shared_pending : &t->pending;
407 kernel/signal.c:1339: pending = group ? &t->signal->shared_pending : &t->pending;
408 ...
409
410There were more results, but none of them were related to list operations,
411and these were the only assignments. We inspect the line numbers more closely
412and find that this is indeed where items are being added to the list:
413
414816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
415817 int group)
416818 {
417...
418828 pending = group ? &t->signal->shared_pending : &t->pending;
419...
420851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
421852 (is_si_special(info) ||
422853 info->si_code >= 0)));
423854 if (q) {
424855 list_add_tail(&q->list, &pending->list);
425...
426890 }
427
428and:
429
4301309 int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
4311310 {
432....
4331339 pending = group ? &t->signal->shared_pending : &t->pending;
4341340 list_add_tail(&q->list, &pending->list);
435....
4361347 }
437
438In the first case, the list element we are looking for, "q", is being returned
439from the function __sigqueue_alloc(), which looks like an allocation function.
440Let's take a look at it:
441
442187 static struct sigqueue *__sigqueue_alloc(struct task_struct *t, gfp_t flags,
443188 int override_rlimit)
444189 {
445190 struct sigqueue *q = NULL;
446191 struct user_struct *user;
447192
448193 /*
449194 * We won't get problems with the target's UID changing under us
450195 * because changing it requires RCU be used, and if t != current, the
451196 * caller must be holding the RCU readlock (by way of a spinlock) and
452197 * we use RCU protection here
453198 */
454199 user = get_uid(__task_cred(t)->user);
455200 atomic_inc(&user->sigpending);
456201 if (override_rlimit ||
457202 atomic_read(&user->sigpending) <=
458203 t->signal->rlim[RLIMIT_SIGPENDING].rlim_cur)
459204 q = kmem_cache_alloc(sigqueue_cachep, flags);
460205 if (unlikely(q == NULL)) {
461206 atomic_dec(&user->sigpending);
462207 free_uid(user);
463208 } else {
464209 INIT_LIST_HEAD(&q->list);
465210 q->flags = 0;
466211 q->user = user;
467212 }
468213
469214 return q;
470215 }
471
472We see that this function initializes q->list, q->flags, and q->user. It seems
473that now is the time to look at the definition of "struct sigqueue", e.g.:
474
47514 struct sigqueue {
47615 struct list_head list;
47716 int flags;
47817 siginfo_t info;
47918 struct user_struct *user;
48019 };
481
482And, you might remember, it was a memcpy() on &first->info that caused the
483warning, so this makes perfect sense. It also seems reasonable to assume that
484it is the caller of __sigqueue_alloc() that has the responsibility of filling
485out (initializing) this member.
486
487But just which fields of the struct were uninitialized? Let's look at
488kmemcheck's report again:
489
490WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
49180000000000000000000000000000000000000000088ffff0000000000000000
492 i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
493 ^
494
495These first two lines are the memory dump of the memory object itself, and the
496shadow bytemap, respectively. The memory object itself is in this case
497&first->info. Just beware that the start of this dump is NOT the start of the
498object itself! The position of the caret (^) corresponds with the address of
499the read (ffff88003e4a2024).
500
501The shadow bytemap dump legend is as follows:
502
503 i - initialized
504 u - uninitialized
505 a - unallocated (memory has been allocated by the slab layer, but has not
506 yet been handed off to anybody)
507 f - freed (memory has been allocated by the slab layer, but has been freed
508 by the previous owner)
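
To make the shadow line easier to read, it can be collapsed into runs of
identical states. This sketch uses only the legend above; the function name
and the (offset, length, state) output format are choices made for this
example, and offsets are relative to the start of the dump, not to the start
of the object.

```python
from itertools import groupby

# Legend as given above.
LEGEND = {
    "i": "initialized",
    "u": "uninitialized",
    "a": "unallocated",
    "f": "freed",
}


def shadow_runs(shadow_line):
    """Collapse a shadow bytemap line into (offset, length, state) runs."""
    letters = shadow_line.split()
    runs = []
    offset = 0
    for letter, group in groupby(letters):
        length = len(list(group))
        runs.append((offset, length, LEGEND[letter]))  # KeyError if unknown
        offset += length
    return runs


if __name__ == "__main__":
    line = " i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u"
    for run in shadow_runs(line):
        print(run)
```

For the report above this yields four runs: 4 initialized bytes, 4
uninitialized, 8 initialized, and 16 uninitialized.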
509
510In order to figure out where (relative to the start of the object) the
511uninitialized memory was located, we have to look at the disassembly. For
512that, we'll need the RIP address again:
513
514RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
515
516 $ objdump -d --no-show-raw-insn vmlinux | grep -C 8 ffffffff8104ede8:
517 ffffffff8104edc8: mov %r8,0x8(%r8)
518 ffffffff8104edcc: test %r10d,%r10d
519 ffffffff8104edcf: js ffffffff8104ee88 <__dequeue_signal+0x168>
520 ffffffff8104edd5: mov %rax,%rdx
521 ffffffff8104edd8: mov $0xc,%ecx
522 ffffffff8104eddd: mov %r13,%rdi
523 ffffffff8104ede0: mov $0x30,%eax
524 ffffffff8104ede5: mov %rdx,%rsi
525 ffffffff8104ede8: rep movsl %ds:(%rsi),%es:(%rdi)
526 ffffffff8104edea: test $0x2,%al
527 ffffffff8104edec: je ffffffff8104edf0 <__dequeue_signal+0xd0>
528 ffffffff8104edee: movsw %ds:(%rsi),%es:(%rdi)
529 ffffffff8104edf0: test $0x1,%al
530 ffffffff8104edf2: je ffffffff8104edf5 <__dequeue_signal+0xd5>
531 ffffffff8104edf4: movsb %ds:(%rsi),%es:(%rdi)
532 ffffffff8104edf5: mov %r8,%rdi
533 ffffffff8104edf8: callq ffffffff8104de60 <__sigqueue_free>
534
535As expected, it's the "rep movsl" instruction from the memcpy() that causes
536the warning. We know that REP MOVSL uses the register RCX to count
537the number of remaining iterations. By taking a look at the register dump
538again (from the kmemcheck report), we can figure out how many bytes were left
539to copy:
540
541RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
542
543By looking at the disassembly, we also see that %ecx is being loaded with the
544value $0xc just before (ffffffff8104edd8), so we are very lucky. Keep in mind
545that this is the number of iterations, not bytes. And since this is a "long"
546operation, we need to multiply by 4 to get the number of bytes. So this means
547that the uninitialized value was encountered at 4 * (0xc - 0x9) = 12 bytes
548from the start of the object.
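
The same calculation, written out as a small sketch. The function name is
hypothetical; the inputs are the value loaded into %ecx before the copy and
the RCX value from the register dump.

```python
def rep_movs_offset(initial_count, remaining_count, element_size=4):
    """Byte offset reached by an interrupted REP MOVS instruction.

    initial_count   -- iteration count loaded before the instruction
    remaining_count -- count register value at the time of the fault
    element_size    -- bytes per iteration (4 for MOVSL)
    """
    if not 0 <= remaining_count <= initial_count:
        raise ValueError("remaining count must be within 0..initial count")
    return element_size * (initial_count - remaining_count)


if __name__ == "__main__":
    # %ecx was loaded with 0xc; the register dump shows RCX = 0x9.
    print(rep_movs_offset(0xC, 0x9))
```

Note that this only works when the initial count is a known constant, as it
is here; if the count had come from a register that was later overwritten, we
would have had to find the starting address some other way.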
549
550We can now try to figure out which field of the "struct siginfo" was not
551initialized. This is the beginning of the struct:
552
55340 typedef struct siginfo {
55441 int si_signo;
55542 int si_errno;
55643 int si_code;
55744
55845 union {
559..
56092 } _sifields;
56193 } siginfo_t;
562
563On 64-bit, an int is 4 bytes long, so the three int members end at offset 12
564and it must be the union member that has not been initialized. We can verify
564this using gdb:
565
566 $ gdb vmlinux
567 ...
568 (gdb) p &((struct siginfo *) 0)->_sifields
569 $1 = (union {...} *) 0x10
570
571Actually, it seems that the union member is located at offset 0x10 -- which
572means that gcc has inserted 4 bytes of padding between the members si_code
573and _sifields. We can now get a fuller picture of the memory dump:
574
575 _----------------------------=> si_code
576 / _--------------------=> (padding)
577 | / _------------=> _sifields(._kill._pid)
578 | | / _----=> _sifields(._kill._uid)
579 | | | /
580-------|-------|-------|-------|
58180000000000000000000000000000000000000000088ffff0000000000000000
582 i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
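
The padding can also be reproduced in userspace without a vmlinux at hand.
The sketch below uses Python's ctypes, which follows the platform C ABI for
structure layout. The struct is a heavily simplified stand-in for the real
siginfo: only two union members are modelled, which is enough because the
pointer member is what gives the union its 8-byte alignment on 64-bit. All
class and function names are made up for this example.

```python
import ctypes


class Kill(ctypes.Structure):
    # Stand-in for _sifields._kill (pid and uid).
    _fields_ = [("_pid", ctypes.c_int), ("_uid", ctypes.c_uint)]


class SigFault(ctypes.Structure):
    # Stand-in for a union member holding a pointer.
    _fields_ = [("_addr", ctypes.c_void_p)]


class SiFields(ctypes.Union):
    _fields_ = [("_kill", Kill), ("_sigfault", SigFault)]


class SigInfo(ctypes.Structure):
    _fields_ = [
        ("si_signo", ctypes.c_int),
        ("si_errno", ctypes.c_int),
        ("si_code", ctypes.c_int),
        ("_sifields", SiFields),
    ]


def sifields_offset():
    """Offset of the union within the simplified struct."""
    return SigInfo._sifields.offset


def padding_before_sifields():
    """Bytes of compiler padding between si_code and the union."""
    end_of_si_code = SigInfo.si_code.offset + SigInfo.si_code.size
    return sifields_offset() - end_of_si_code


if __name__ == "__main__":
    print(hex(sifields_offset()), padding_before_sifields())
```

On a 64-bit machine the union should land at the same offset that gdb
reported; on 32-bit, where pointers only need 4-byte alignment, no padding is
expected.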
583
584This allows us to realize another important fact: si_code contains the value
5850x80. Remember that x86 is little endian, so the first 4 bytes "80000000" are
586really the number 0x00000080. With a bit of research, we find that this is
587actually the constant SI_KERNEL defined in include/asm-generic/siginfo.h:
588
589144 #define SI_KERNEL 0x80 /* sent by the kernel from somewhere */
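
The byte-order conversion is easy to get wrong by hand, so here is a small
sketch of it. The helper name is hypothetical, and SI_KERNEL is simply
restated from the #define above.

```python
SI_KERNEL = 0x80  # restated from include/asm-generic/siginfo.h as quoted above


def le_u32(hex_dump, byte_offset=0):
    """Decode 4 bytes of a kmemcheck hex dump as a little-endian integer."""
    raw = bytes.fromhex(hex_dump)[byte_offset:byte_offset + 4]
    if len(raw) != 4:
        raise ValueError("need 4 bytes at offset %d" % byte_offset)
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    # The first 4 bytes of the memory dump in the report.
    si_code = le_u32("80000000")
    print(hex(si_code), si_code == SI_KERNEL)
```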
590
591This macro is used in exactly one place in the x86 kernel: In send_signal()
592in kernel/signal.c:
593
594816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
595817 int group)
596818 {
597...
598828 pending = group ? &t->signal->shared_pending : &t->pending;
599...
600851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
601852 (is_si_special(info) ||
602853 info->si_code >= 0)));
603854 if (q) {
604855 list_add_tail(&q->list, &pending->list);
605856 switch ((unsigned long) info) {
606...
607865 case (unsigned long) SEND_SIG_PRIV:
608866 q->info.si_signo = sig;
609867 q->info.si_errno = 0;
610868 q->info.si_code = SI_KERNEL;
611869 q->info.si_pid = 0;
612870 q->info.si_uid = 0;
613871 break;
614...
615890 }
616
617Not only does this match with the .si_code member, it also matches the place
618we found earlier when looking for where siginfo_t objects are enqueued on the
619"shared_pending" list.
620
621So to sum up: It seems that it is the padding introduced by the compiler
622between two struct fields that is uninitialized, and this gets reported when
623we do a memcpy() on the struct. This means that we have identified a false
624positive warning.
625
626Normally, kmemcheck will not report uninitialized accesses in memcpy() calls
627when both the source and destination addresses are tracked. (Instead, we copy
628the shadow bytemap as well). In this case, the destination address clearly
629was not tracked. We can dig a little deeper into the stack trace from above:
630
631 arch/x86/kernel/signal.c:805
632 arch/x86/kernel/signal.c:871
633 arch/x86/kernel/entry_64.S:694
634
635And we clearly see that the destination siginfo object is located on the
636stack:
637
638782 static void do_signal(struct pt_regs *regs)
639783 {
640784 struct k_sigaction ka;
641785 siginfo_t info;
642...
643804 signr = get_signal_to_deliver(&info, &ka, regs, NULL);
644...
645854 }
646
647And this &info is what eventually gets passed to copy_siginfo() as the
648destination argument.
649
650Now, even though we didn't find an actual error here, the example is still a
651good one, because it shows how one would go about finding out what the
652report was about.
653
654
6553.4. Annotating false positives
656===============================
657
658There are a few different ways to make annotations in the source code that
659will keep kmemcheck from checking and reporting certain allocations. Here
660they are:
661
662 o __GFP_NOTRACK_FALSE_POSITIVE
663
664 This flag can be passed to kmalloc() or kmem_cache_alloc() (therefore
665 also to other functions that end up calling one of these) to indicate
666 that the allocation should not be tracked because it would lead to
667 a false positive report. This is a "big hammer" way of silencing
668 kmemcheck; after all, even if the false positive pertains to
669 a particular field in a struct, for example, we will now lose the
670 ability to find (real) errors in other parts of the same struct.
671
672 Example:
673
674 /* No warnings will ever trigger on accessing any part of x */
675 x = kmalloc(sizeof *x, GFP_KERNEL | __GFP_NOTRACK_FALSE_POSITIVE);
676
677 o kmemcheck_bitfield_begin(name)/kmemcheck_bitfield_end(name) and
678 kmemcheck_annotate_bitfield(ptr, name)
679
680 The first two of these three macros can be used inside struct
681 definitions to signal, respectively, the beginning and end of a
682 bitfield. Additionally, this will assign the bitfield a name, which
683 is given as an argument to the macros.
684
685 Having used these markers, one can later use
686 kmemcheck_annotate_bitfield() at the point of allocation, to indicate
687 which parts of the allocation are part of a bitfield.
688
689 Example:
690
691 struct foo {
692 int x;
693
694 kmemcheck_bitfield_begin(flags);
695 int flag_a:1;
696 int flag_b:1;
697 kmemcheck_bitfield_end(flags);
698
699 int y;
700 };
701
702 struct foo *x = kmalloc(sizeof *x, GFP_KERNEL);
703
704 /* No warnings will trigger on accessing the bitfield of x */
705 kmemcheck_annotate_bitfield(x, flags);
706
707 Note that kmemcheck_annotate_bitfield() can be used even before the
708 return value of kmalloc() is checked -- in other words, passing NULL
709 as the first argument is legal (and will do nothing).
710
711
7124. Reporting errors
713===================
714
715As we have seen, kmemcheck will produce false positive reports. Therefore, it
716is not very wise to blindly post kmemcheck warnings to mailing lists and
717maintainers. Instead, I encourage maintainers and developers to find errors
718in their own code. If you get a warning, you can try to work around it, try
719to figure out if it's a real error or not, or simply ignore it. Most
720developers know their own code and will quickly and efficiently determine the
721root cause of a kmemcheck report. This is therefore also the most efficient
722way to work with kmemcheck.
723
724That said, we (the kmemcheck maintainers) will always be on the lookout for
725false positives that we can annotate and silence. So whatever you find,
726please drop us a note privately! Kernel configs and steps to reproduce (if
727available) are of course a great help too.
728
729Happy hacking!
730
731
7325. Technical description
733========================
734
735kmemcheck works by marking memory pages non-present. This means that whenever
736somebody attempts to access the page, a page fault is generated. The page
737fault handler notices that the page was in fact only hidden, and so it calls
738on the kmemcheck code to make further investigations.
739
740When the investigations are completed, kmemcheck "shows" the page by marking
741it present (as it would be under normal circumstances). This way, the
742interrupted code can continue as usual.
743
744But after the instruction has been executed, we should hide the page again, so
745that we can catch the next access too! Now kmemcheck makes use of a debugging
746feature of the processor, namely single-stepping. When the processor has
747finished the one instruction that generated the memory access, a debug
748exception is raised. From here, we simply hide the page again and continue
749execution, this time with the single-stepping feature turned off.
750
751kmemcheck requires some assistance from the memory allocator in order to work.
752The memory allocator needs to
753
754 1. Tell kmemcheck about newly allocated pages and pages that are about to
755 be freed. This allows kmemcheck to set up and tear down the shadow memory
756 for the pages in question. The shadow memory stores the status of each
757 byte in the allocation proper, e.g. whether it is initialized or
758 uninitialized.
759
760 2. Tell kmemcheck which parts of memory should be marked uninitialized.
761 There are actually a few more states, such as "not yet allocated" and
762 "recently freed".
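
The shadow-memory bookkeeping described in the two points above can be sketched in user space as follows. This is a toy model under stated assumptions, not kmemcheck's implementation: the state names, struct toy_region and all toy_*() helpers are invented for illustration, and there is no page-fault or single-step machinery here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Per-byte states; the names are illustrative, not kmemcheck's identifiers. */
enum shadow_state {
	SHADOW_UNALLOCATED = 0,
	SHADOW_UNINITIALIZED,
	SHADOW_INITIALIZED,
	SHADOW_FREED,
};

#define TOY_SIZE 64

struct toy_region {
	unsigned char data[TOY_SIZE];
	unsigned char shadow[TOY_SIZE];	/* one state per data byte */
};

/* Allocation: every byte becomes "uninitialized". */
static void toy_alloc(struct toy_region *r)
{
	memset(r->shadow, SHADOW_UNINITIALIZED, TOY_SIZE);
}

/* Free: every byte becomes "recently freed". */
static void toy_free(struct toy_region *r)
{
	memset(r->shadow, SHADOW_FREED, TOY_SIZE);
}

/* A write initializes the bytes it touches (caller keeps off + len <= TOY_SIZE). */
static void toy_write(struct toy_region *r, size_t off, const void *src, size_t len)
{
	memcpy(r->data + off, src, len);
	memset(r->shadow + off, SHADOW_INITIALIZED, len);
}

/* Returns 1 if every byte read is initialized, 0 if the read would be reported. */
static int toy_read_ok(const struct toy_region *r, size_t off, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (r->shadow[off + i] != SHADOW_INITIALIZED)
			return 0;
	return 1;
}

/* Tracked-to-tracked copy: carry the shadow along instead of reporting. */
static void toy_copy(struct toy_region *dst, const struct toy_region *src)
{
	memcpy(dst->data, src->data, TOY_SIZE);
	memcpy(dst->shadow, src->shadow, TOY_SIZE);
}
```

In this model, a read of freshly allocated bytes fails toy_read_ok() until toy_write() has covered them, and toy_copy() mirrors the memcpy() behaviour mentioned in section 3: when both ends are tracked, the shadow is copied and no report is made at copy time.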
763
764If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
765memory that can take page faults because of kmemcheck.
766
767If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
768request memory with the __GFP_NOTRACK or __GFP_NOTRACK_FALSE_POSITIVE flags.
769This does not prevent the page faults from occurring; it only marks the
770object in question as initialized, so that no warnings will ever be
771produced for this object.
772
773Currently, the SLAB and SLUB allocators are supported by kmemcheck.
diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 1e7a769a10f9..053037a1fe6d 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -507,9 +507,9 @@ http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115)
507Appendix A: The kprobes debugfs interface 507Appendix A: The kprobes debugfs interface
508 508
509With recent kernels (> 2.6.20) the list of registered kprobes is visible 509With recent kernels (> 2.6.20) the list of registered kprobes is visible
510under the /debug/kprobes/ directory (assuming debugfs is mounted at /debug). 510under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at /sys/kernel/debug).
511 511
512/debug/kprobes/list: Lists all registered probes on the system 512/sys/kernel/debug/kprobes/list: Lists all registered probes on the system
513 513
514c015d71a k vfs_read+0x0 514c015d71a k vfs_read+0x0
515c011a316 j do_fork+0x0 515c011a316 j do_fork+0x0
@@ -525,7 +525,7 @@ virtual addresses that correspond to modules that've been unloaded),
525such probes are marked with [GONE]. If the probe is temporarily disabled, 525such probes are marked with [GONE]. If the probe is temporarily disabled,
526such probes are marked with [DISABLED]. 526such probes are marked with [DISABLED].
527 527
528/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly. 528/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
529 529
530Provides a knob to globally and forcibly turn registered kprobes ON or OFF. 530Provides a knob to globally and forcibly turn registered kprobes ON or OFF.
531By default, all kprobes are enabled. By echoing "0" to this file, all 531By default, all kprobes are enabled. By echoing "0" to this file, all
diff --git a/Documentation/scsi/scsi_fc_transport.txt b/Documentation/scsi/scsi_fc_transport.txt
index e5b071d46619..d7f181701dc2 100644
--- a/Documentation/scsi/scsi_fc_transport.txt
+++ b/Documentation/scsi/scsi_fc_transport.txt
@@ -1,10 +1,11 @@
1 SCSI FC Transport 1 SCSI FC Transport
2 ============================================= 2 =============================================
3 3
4Date: 4/12/2007 4Date: 11/18/2008
5Kernel Revisions for features: 5Kernel Revisions for features:
6 rports : <<TBS>> 6 rports : <<TBS>>
7 vports : 2.6.22 (? TBD) 7 vports : 2.6.22
8 bsg support : 2.6.30 (?TBD?)
8 9
9 10
10Introduction 11Introduction
@@ -15,6 +16,7 @@ The FC transport can be found at:
15 drivers/scsi/scsi_transport_fc.c 16 drivers/scsi/scsi_transport_fc.c
16 include/scsi/scsi_transport_fc.h 17 include/scsi/scsi_transport_fc.h
17 include/scsi/scsi_netlink_fc.h 18 include/scsi/scsi_netlink_fc.h
19 include/scsi/scsi_bsg_fc.h
18 20
19This file is found at Documentation/scsi/scsi_fc_transport.txt 21This file is found at Documentation/scsi/scsi_fc_transport.txt
20 22
@@ -472,6 +474,14 @@ int
472fc_vport_terminate(struct fc_vport *vport) 474fc_vport_terminate(struct fc_vport *vport)
473 475
474 476
477FC BSG support (CT & ELS passthru, and more)
478========================================================================
479<< To Be Supplied >>
480
481
482
483
484
475Credits 485Credits
476======= 486=======
477The following people have contributed to this document: 487The following people have contributed to this document:
diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt
index a6d5354639b2..de67229251d8 100644
--- a/Documentation/scsi/scsi_mid_low_api.txt
+++ b/Documentation/scsi/scsi_mid_low_api.txt
@@ -1271,6 +1271,11 @@ of interest:
1271 hostdata[0] - area reserved for LLD at end of struct Scsi_Host. Size 1271 hostdata[0] - area reserved for LLD at end of struct Scsi_Host. Size
1272 is set by the second argument (named 'xtr_bytes') to 1272 is set by the second argument (named 'xtr_bytes') to
1273 scsi_host_alloc() or scsi_register(). 1273 scsi_host_alloc() or scsi_register().
1274 vendor_id - a unique value that identifies the vendor supplying
1275 the LLD for the Scsi_Host. Used most often in validating
1276 vendor-specific message requests. Value consists of an
1277 identifier type and a vendor-specific value.
1278 See scsi_netlink.h for a description of valid formats.
1274 1279
1275The scsi_host structure is defined in include/scsi/scsi_host.h 1280The scsi_host structure is defined in include/scsi/scsi_host.h
1276 1281
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 6fab2dcbb4d3..c4de6359d440 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -233,8 +233,8 @@ These protections are added to score to judge whether this zone should be used
233for page allocation or should be reclaimed. 233for page allocation or should be reclaimed.
234 234
235In this example, if normal pages (index=2) are required to this DMA zone and 235In this example, if normal pages (index=2) are required to this DMA zone and
236pages_high is used for watermark, the kernel judges this zone should not be 236watermark[WMARK_HIGH] is used for watermark, the kernel judges this zone should
237used because pages_free(1355) is smaller than watermark + protection[2] 237not be used because pages_free(1355) is smaller than watermark + protection[2]
238(4 + 2004 = 2008). If this protection value is 0, this zone would be used for 238(4 + 2004 = 2008). If this protection value is 0, this zone would be used for
239normal page requirement. If requirement is DMA zone(index=0), protection[0] 239normal page requirement. If requirement is DMA zone(index=0), protection[0]
240(=0) is used. 240(=0) is used.
@@ -280,9 +280,10 @@ The default value is 65536.
280min_free_kbytes: 280min_free_kbytes:
281 281
282This is used to force the Linux VM to keep a minimum number 282This is used to force the Linux VM to keep a minimum number
283of kilobytes free. The VM uses this number to compute a pages_min 283of kilobytes free. The VM uses this number to compute a
284value for each lowmem zone in the system. Each lowmem zone gets 284watermark[WMARK_MIN] value for each lowmem zone in the system.
285a number of reserved free pages based proportionally on its size. 285Each lowmem zone gets a number of reserved free pages based
286proportionally on its size.
286 287
287Some minimal amount of memory is needed to satisfy PF_MEMALLOC 288Some minimal amount of memory is needed to satisfy PF_MEMALLOC
288allocations; if you set this to lower than 1024KB, your system will 289allocations; if you set this to lower than 1024KB, your system will
@@ -314,10 +315,14 @@ min_unmapped_ratio:
314 315
315This is available only on NUMA kernels. 316This is available only on NUMA kernels.
316 317
317A percentage of the total pages in each zone. Zone reclaim will only 318This is a percentage of the total pages in each zone. Zone reclaim will
318occur if more than this percentage of pages are file backed and unmapped. 319only occur if more than this percentage of pages are in a state that
319This is to insure that a minimal amount of local pages is still available for 320zone_reclaim_mode allows to be reclaimed.
320file I/O even if the node is overallocated. 321
322If zone_reclaim_mode has the value 4 OR'd, then the percentage is compared
323against all file-backed unmapped pages including swapcache pages and tmpfs
324files. Otherwise, only unmapped pages backed by normal files but not tmpfs
325files and similar are considered.
321 326
322The default is 1 percent. 327The default is 1 percent.
323 328
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index 7bd27f0e2880..a39b3c749de5 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -7,7 +7,6 @@ Copyright 2008 Red Hat Inc.
7 (dual licensed under the GPL v2) 7 (dual licensed under the GPL v2)
8Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton, 8Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
9 John Kacur, and David Teigland. 9 John Kacur, and David Teigland.
10
11Written for: 2.6.28-rc2 10Written for: 2.6.28-rc2
12 11
13Introduction 12Introduction
@@ -33,13 +32,26 @@ The File System
33Ftrace uses the debugfs file system to hold the control files as 32Ftrace uses the debugfs file system to hold the control files as
34well as the files to display output. 33well as the files to display output.
35 34
36To mount the debugfs system: 35When debugfs is configured into the kernel (which selecting any ftrace
36option will do) the directory /sys/kernel/debug will be created. To mount
37this directory, you can add to your /etc/fstab file:
38
39 debugfs /sys/kernel/debug debugfs defaults 0 0
40
41Or you can mount it at run time with:
42
43 mount -t debugfs nodev /sys/kernel/debug
37 44
38 # mkdir /debug 45For quicker access to that directory you may want to make a soft link to
39 # mount -t debugfs nodev /debug 46it:
40 47
41( Note: it is more common to mount at /sys/kernel/debug, but for 48 ln -s /sys/kernel/debug /debug
42 simplicity this document will use /debug) 49
50Any selected ftrace option will also create a directory called tracing
51within the debugfs. The rest of the document will assume that you are in
52the ftrace directory (cd /sys/kernel/debug/tracing) and will only concentrate
53on the files within that directory and not distract from the content with
54the extended "/sys/kernel/debug/tracing" path name.
43 55
44That's it! (assuming that you have ftrace configured into your kernel) 56That's it! (assuming that you have ftrace configured into your kernel)
45 57
@@ -389,18 +401,18 @@ trace_options
389The trace_options file is used to control what gets printed in 401The trace_options file is used to control what gets printed in
390the trace output. To see what is available, simply cat the file: 402the trace output. To see what is available, simply cat the file:
391 403
392 cat /debug/tracing/trace_options 404 cat trace_options
393 print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \ 405 print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
394 noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj 406 noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj
395 407
396To disable one of the options, echo in the option prepended with 408To disable one of the options, echo in the option prepended with
397"no". 409"no".
398 410
399 echo noprint-parent > /debug/tracing/trace_options 411 echo noprint-parent > trace_options
400 412
401To enable an option, leave off the "no". 413To enable an option, leave off the "no".
402 414
403 echo sym-offset > /debug/tracing/trace_options 415 echo sym-offset > trace_options
404 416
405Here are the available options: 417Here are the available options:
406 418
@@ -476,11 +488,11 @@ sched_switch
476This tracer simply records schedule switches. Here is an example 488This tracer simply records schedule switches. Here is an example
477of how to use it. 489of how to use it.
478 490
479 # echo sched_switch > /debug/tracing/current_tracer 491 # echo sched_switch > current_tracer
480 # echo 1 > /debug/tracing/tracing_enabled 492 # echo 1 > tracing_enabled
481 # sleep 1 493 # sleep 1
482 # echo 0 > /debug/tracing/tracing_enabled 494 # echo 0 > tracing_enabled
483 # cat /debug/tracing/trace 495 # cat trace
484 496
485# tracer: sched_switch 497# tracer: sched_switch
486# 498#
@@ -583,13 +595,13 @@ new trace is saved.
583To reset the maximum, echo 0 into tracing_max_latency. Here is 595To reset the maximum, echo 0 into tracing_max_latency. Here is
584an example: 596an example:
585 597
586 # echo irqsoff > /debug/tracing/current_tracer 598 # echo irqsoff > current_tracer
587 # echo 0 > /debug/tracing/tracing_max_latency 599 # echo 0 > tracing_max_latency
588 # echo 1 > /debug/tracing/tracing_enabled 600 # echo 1 > tracing_enabled
589 # ls -ltr 601 # ls -ltr
590 [...] 602 [...]
591 # echo 0 > /debug/tracing/tracing_enabled 603 # echo 0 > tracing_enabled
592 # cat /debug/tracing/latency_trace 604 # cat latency_trace
593# tracer: irqsoff 605# tracer: irqsoff
594# 606#
595irqsoff latency trace v1.1.5 on 2.6.26 607irqsoff latency trace v1.1.5 on 2.6.26
@@ -690,13 +702,13 @@ Like the irqsoff tracer, it records the maximum latency for
690which preemption was disabled. The control of preemptoff tracer 702which preemption was disabled. The control of preemptoff tracer
691is much like the irqsoff tracer. 703is much like the irqsoff tracer.
692 704
693 # echo preemptoff > /debug/tracing/current_tracer 705 # echo preemptoff > current_tracer
694 # echo 0 > /debug/tracing/tracing_max_latency 706 # echo 0 > tracing_max_latency
695 # echo 1 > /debug/tracing/tracing_enabled 707 # echo 1 > tracing_enabled
696 # ls -ltr 708 # ls -ltr
697 [...] 709 [...]
698 # echo 0 > /debug/tracing/tracing_enabled 710 # echo 0 > tracing_enabled
699 # cat /debug/tracing/latency_trace 711 # cat latency_trace
700# tracer: preemptoff 712# tracer: preemptoff
701# 713#
702preemptoff latency trace v1.1.5 on 2.6.26-rc8 714preemptoff latency trace v1.1.5 on 2.6.26-rc8
@@ -837,13 +849,13 @@ tracer.
837Again, using this trace is much like the irqsoff and preemptoff 849Again, using this trace is much like the irqsoff and preemptoff
838tracers. 850tracers.
839 851
840 # echo preemptirqsoff > /debug/tracing/current_tracer 852 # echo preemptirqsoff > current_tracer
841 # echo 0 > /debug/tracing/tracing_max_latency 853 # echo 0 > tracing_max_latency
842 # echo 1 > /debug/tracing/tracing_enabled 854 # echo 1 > tracing_enabled
843 # ls -ltr 855 # ls -ltr
844 [...] 856 [...]
845 # echo 0 > /debug/tracing/tracing_enabled 857 # echo 0 > tracing_enabled
846 # cat /debug/tracing/latency_trace 858 # cat latency_trace
847# tracer: preemptirqsoff 859# tracer: preemptirqsoff
848# 860#
849preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8 861preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
@@ -999,12 +1011,12 @@ slightly differently than we did with the previous tracers.
999Instead of performing an 'ls', we will run 'sleep 1' under 1011Instead of performing an 'ls', we will run 'sleep 1' under
1000'chrt' which changes the priority of the task. 1012'chrt' which changes the priority of the task.
1001 1013
1002 # echo wakeup > /debug/tracing/current_tracer 1014 # echo wakeup > current_tracer
1003 # echo 0 > /debug/tracing/tracing_max_latency 1015 # echo 0 > tracing_max_latency
1004 # echo 1 > /debug/tracing/tracing_enabled 1016 # echo 1 > tracing_enabled
1005 # chrt -f 5 sleep 1 1017 # chrt -f 5 sleep 1
1006 # echo 0 > /debug/tracing/tracing_enabled 1018 # echo 0 > tracing_enabled
1007 # cat /debug/tracing/latency_trace 1019 # cat latency_trace
1008# tracer: wakeup 1020# tracer: wakeup
1009# 1021#
1010wakeup latency trace v1.1.5 on 2.6.26-rc8 1022wakeup latency trace v1.1.5 on 2.6.26-rc8
@@ -1114,11 +1126,11 @@ can be done from the debug file system. Make sure the
1114ftrace_enabled is set; otherwise this tracer is a nop. 1126ftrace_enabled is set; otherwise this tracer is a nop.
1115 1127
1116 # sysctl kernel.ftrace_enabled=1 1128 # sysctl kernel.ftrace_enabled=1
1117 # echo function > /debug/tracing/current_tracer 1129 # echo function > current_tracer
1118 # echo 1 > /debug/tracing/tracing_enabled 1130 # echo 1 > tracing_enabled
1119 # usleep 1 1131 # usleep 1
1120 # echo 0 > /debug/tracing/tracing_enabled 1132 # echo 0 > tracing_enabled
1121 # cat /debug/tracing/trace 1133 # cat trace
1122# tracer: function 1134# tracer: function
1123# 1135#
1124# TASK-PID CPU# TIMESTAMP FUNCTION 1136# TASK-PID CPU# TIMESTAMP FUNCTION
@@ -1155,7 +1167,7 @@ int trace_fd;
1155[...] 1167[...]
1156int main(int argc, char *argv[]) { 1168int main(int argc, char *argv[]) {
1157 [...] 1169 [...]
1158 trace_fd = open("/debug/tracing/tracing_enabled", O_WRONLY); 1170 trace_fd = open(tracing_file("tracing_enabled"), O_WRONLY);
1159 [...] 1171 [...]
1160 if (condition_hit()) { 1172 if (condition_hit()) {
1161 write(trace_fd, "0", 1); 1173 write(trace_fd, "0", 1);
@@ -1163,26 +1175,20 @@ int main(int argc, char *argv[]) {
1163 [...] 1175 [...]
1164} 1176}
1165 1177
1166Note: Here we hard coded the path name. The debugfs mount is not
1167guaranteed to be at /debug (and is more commonly at
1168/sys/kernel/debug). For simple one time traces, the above is
1169sufficent. For anything else, a search through /proc/mounts may
1170be needed to find where the debugfs file-system is mounted.
1171
1172 1178
1173Single thread tracing 1179Single thread tracing
1174--------------------- 1180---------------------
1175 1181
1176By writing into /debug/tracing/set_ftrace_pid you can trace a 1182By writing into set_ftrace_pid you can trace a
1177single thread. For example: 1183single thread. For example:
1178 1184
1179# cat /debug/tracing/set_ftrace_pid 1185# cat set_ftrace_pid
1180no pid 1186no pid
1181# echo 3111 > /debug/tracing/set_ftrace_pid 1187# echo 3111 > set_ftrace_pid
1182# cat /debug/tracing/set_ftrace_pid 1188# cat set_ftrace_pid
11833111 11893111
1184# echo function > /debug/tracing/current_tracer 1190# echo function > current_tracer
1185# cat /debug/tracing/trace | head 1191# cat trace | head
1186 # tracer: function 1192 # tracer: function
1187 # 1193 #
1188 # TASK-PID CPU# TIMESTAMP FUNCTION 1194 # TASK-PID CPU# TIMESTAMP FUNCTION
@@ -1193,8 +1199,8 @@ no pid
1193 yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel 1199 yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel
1194 yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll 1200 yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll
1195 yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll 1201 yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll
1196# echo -1 > /debug/tracing/set_ftrace_pid 1202# echo -1 > set_ftrace_pid
1197# cat /debug/tracing/trace |head 1203# cat trace |head
1198 # tracer: function 1204 # tracer: function
1199 # 1205 #
1200 # TASK-PID CPU# TIMESTAMP FUNCTION 1206 # TASK-PID CPU# TIMESTAMP FUNCTION
@@ -1216,6 +1222,51 @@ something like this simple program:
1216#include <fcntl.h> 1222#include <fcntl.h>
1217#include <unistd.h> 1223#include <unistd.h>
1218 1224
1225#define _STR(x) #x
1226#define STR(x) _STR(x)
1227#define MAX_PATH 256
1228
1229const char *find_debugfs(void)
1230{
1231 static char debugfs[MAX_PATH+1];
1232 static int debugfs_found;
1233 char type[100];
1234 FILE *fp;
1235
1236 if (debugfs_found)
1237 return debugfs;
1238
1239 if ((fp = fopen("/proc/mounts","r")) == NULL) {
1240 perror("/proc/mounts");
1241 return NULL;
1242 }
1243
1244 while (fscanf(fp, "%*s %"
1245 STR(MAX_PATH)
1246 "s %99s %*s %*d %*d\n",
1247 debugfs, type) == 2) {
1248 if (strcmp(type, "debugfs") == 0)
1249 break;
1250 }
1251 fclose(fp);
1252
1253 if (strcmp(type, "debugfs") != 0) {
1254 fprintf(stderr, "debugfs not mounted\n");
1255 return NULL;
1256 }
1257
1258 debugfs_found = 1;
1259
1260 return debugfs;
1261}
1262
1263const char *tracing_file(const char *file_name)
1264{
1265 static char trace_file[MAX_PATH+1];
1266 snprintf(trace_file, MAX_PATH, "%s/tracing/%s", find_debugfs(), file_name);
1267 return trace_file;
1268}
1269
1219int main (int argc, char **argv) 1270int main (int argc, char **argv)
1220{ 1271{
1221 if (argc < 1) 1272 if (argc < 1)
@@ -1226,12 +1277,12 @@ int main (int argc, char **argv)
1226 char line[64]; 1277 char line[64];
1227 int s; 1278 int s;
1228 1279
1229 ffd = open("/debug/tracing/current_tracer", O_WRONLY); 1280 ffd = open(tracing_file("current_tracer"), O_WRONLY);
1230 if (ffd < 0) 1281 if (ffd < 0)
1231 exit(-1); 1282 exit(-1);
1232 write(ffd, "nop", 3); 1283 write(ffd, "nop", 3);
1233 1284
1234 fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY); 1285 fd = open(tracing_file("set_ftrace_pid"), O_WRONLY);
1235 s = sprintf(line, "%d\n", getpid()); 1286 s = sprintf(line, "%d\n", getpid());
1236 write(fd, line, s); 1287 write(fd, line, s);
1237 1288
@@ -1383,22 +1434,22 @@ want, depending on your needs.
1383 tracing_cpu_mask file) or you might sometimes see unordered 1434 tracing_cpu_mask file) or you might sometimes see unordered
1384 function calls while cpu tracing switch. 1435 function calls while cpu tracing switch.
1385 1436
1386 hide: echo nofuncgraph-cpu > /debug/tracing/trace_options 1437 hide: echo nofuncgraph-cpu > trace_options
1387 show: echo funcgraph-cpu > /debug/tracing/trace_options 1438 show: echo funcgraph-cpu > trace_options
1388 1439
1389- The duration (function's time of execution) is displayed on 1440- The duration (function's time of execution) is displayed on
1390 the closing bracket line of a function or on the same line 1441 the closing bracket line of a function or on the same line
1391 as the current function in case of a leaf one. It is default 1442 as the current function in case of a leaf one. It is default
1392 enabled. 1443 enabled.
1393 1444
1394 hide: echo nofuncgraph-duration > /debug/tracing/trace_options 1445 hide: echo nofuncgraph-duration > trace_options
1395 show: echo funcgraph-duration > /debug/tracing/trace_options 1446 show: echo funcgraph-duration > trace_options
1396 1447
1397- The overhead field precedes the duration field in case of 1448- The overhead field precedes the duration field in case of
1398 reached duration thresholds. 1449 reached duration thresholds.
1399 1450
1400 hide: echo nofuncgraph-overhead > /debug/tracing/trace_options 1451 hide: echo nofuncgraph-overhead > trace_options
1401 show: echo funcgraph-overhead > /debug/tracing/trace_options 1452 show: echo funcgraph-overhead > trace_options
1402 depends on: funcgraph-duration 1453 depends on: funcgraph-duration
1403 1454
1404 ie: 1455 ie:
@@ -1427,8 +1478,8 @@ want, depending on your needs.
1427- The task/pid field displays the thread cmdline and pid which 1478- The task/pid field displays the thread cmdline and pid which
1428 executed the function. It is default disabled. 1479 executed the function. It is default disabled.
1429 1480
1430 hide: echo nofuncgraph-proc > /debug/tracing/trace_options 1481 hide: echo nofuncgraph-proc > trace_options
1431 show: echo funcgraph-proc > /debug/tracing/trace_options 1482 show: echo funcgraph-proc > trace_options
1432 1483
1433 ie: 1484 ie:
1434 1485
@@ -1451,8 +1502,8 @@ want, depending on your needs.
1451 system clock since it started. A snapshot of this time is 1502 system clock since it started. A snapshot of this time is
1452 given on each entry/exit of functions 1503 given on each entry/exit of functions
1453 1504
1454 hide: echo nofuncgraph-abstime > /debug/tracing/trace_options 1505 hide: echo nofuncgraph-abstime > trace_options
1455 show: echo funcgraph-abstime > /debug/tracing/trace_options 1506 show: echo funcgraph-abstime > trace_options
1456 1507
1457 ie: 1508 ie:
1458 1509
@@ -1549,7 +1600,7 @@ listed in:
1549 1600
1550 available_filter_functions 1601 available_filter_functions
1551 1602
1552 # cat /debug/tracing/available_filter_functions 1603 # cat available_filter_functions
1553put_prev_task_idle 1604put_prev_task_idle
1554kmem_cache_create 1605kmem_cache_create
1555pick_next_task_rt 1606pick_next_task_rt
@@ -1561,12 +1612,12 @@ mutex_lock
1561If I am only interested in sys_nanosleep and hrtimer_interrupt: 1612If I am only interested in sys_nanosleep and hrtimer_interrupt:
1562 1613
1563 # echo sys_nanosleep hrtimer_interrupt \ 1614 # echo sys_nanosleep hrtimer_interrupt \
1564 > /debug/tracing/set_ftrace_filter 1615 > set_ftrace_filter
1565 # echo ftrace > /debug/tracing/current_tracer 1616 # echo ftrace > current_tracer
1566 # echo 1 > /debug/tracing/tracing_enabled 1617 # echo 1 > tracing_enabled
1567 # usleep 1 1618 # usleep 1
1568 # echo 0 > /debug/tracing/tracing_enabled 1619 # echo 0 > tracing_enabled
1569 # cat /debug/tracing/trace 1620 # cat trace
1570# tracer: ftrace 1621# tracer: ftrace
1571# 1622#
1572# TASK-PID CPU# TIMESTAMP FUNCTION 1623# TASK-PID CPU# TIMESTAMP FUNCTION
@@ -1577,7 +1628,7 @@ If I am only interested in sys_nanosleep and hrtimer_interrupt:
1577 1628
1578To see which functions are being traced, you can cat the file: 1629To see which functions are being traced, you can cat the file:
1579 1630
1580 # cat /debug/tracing/set_ftrace_filter 1631 # cat set_ftrace_filter
1581hrtimer_interrupt 1632hrtimer_interrupt
1582sys_nanosleep 1633sys_nanosleep
1583 1634
@@ -1597,7 +1648,7 @@ Note: It is better to use quotes to enclose the wild cards,
1597 otherwise the shell may expand the parameters into names 1648 otherwise the shell may expand the parameters into names
1598 of files in the local directory. 1649 of files in the local directory.
1599 1650
1600 # echo 'hrtimer_*' > /debug/tracing/set_ftrace_filter 1651 # echo 'hrtimer_*' > set_ftrace_filter
1601 1652
1602Produces: 1653Produces:
1603 1654
@@ -1618,7 +1669,7 @@ Produces:
1618 1669
1619Notice that we lost the sys_nanosleep. 1670Notice that we lost the sys_nanosleep.
1620 1671
1621 # cat /debug/tracing/set_ftrace_filter 1672 # cat set_ftrace_filter
1622hrtimer_run_queues 1673hrtimer_run_queues
1623hrtimer_run_pending 1674hrtimer_run_pending
1624hrtimer_init 1675hrtimer_init
@@ -1644,17 +1695,17 @@ To append to the filters, use '>>'
1644To clear out a filter so that all functions will be recorded 1695To clear out a filter so that all functions will be recorded
1645again: 1696again:
1646 1697
1647 # echo > /debug/tracing/set_ftrace_filter 1698 # echo > set_ftrace_filter
1648 # cat /debug/tracing/set_ftrace_filter 1699 # cat set_ftrace_filter
1649 # 1700 #
1650 1701
1651Again, now we want to append. 1702Again, now we want to append.
1652 1703
1653 # echo sys_nanosleep > /debug/tracing/set_ftrace_filter 1704 # echo sys_nanosleep > set_ftrace_filter
1654 # cat /debug/tracing/set_ftrace_filter 1705 # cat set_ftrace_filter
1655sys_nanosleep 1706sys_nanosleep
1656 # echo 'hrtimer_*' >> /debug/tracing/set_ftrace_filter 1707 # echo 'hrtimer_*' >> set_ftrace_filter
1657 # cat /debug/tracing/set_ftrace_filter 1708 # cat set_ftrace_filter
1658hrtimer_run_queues 1709hrtimer_run_queues
1659hrtimer_run_pending 1710hrtimer_run_pending
1660hrtimer_init 1711hrtimer_init
@@ -1677,7 +1728,7 @@ hrtimer_init_sleeper
1677The set_ftrace_notrace prevents those functions from being 1728The set_ftrace_notrace prevents those functions from being
1678traced. 1729traced.
1679 1730
1680 # echo '*preempt*' '*lock*' > /debug/tracing/set_ftrace_notrace 1731 # echo '*preempt*' '*lock*' > set_ftrace_notrace
1681 1732
1682Produces: 1733Produces:
1683 1734
@@ -1767,13 +1818,13 @@ the effect on the tracing is different. Every read from
1767trace_pipe is consumed. This means that subsequent reads will be 1818trace_pipe is consumed. This means that subsequent reads will be
1768different. The trace is live. 1819different. The trace is live.
1769 1820
1770 # echo function > /debug/tracing/current_tracer 1821 # echo function > current_tracer
1771 # cat /debug/tracing/trace_pipe > /tmp/trace.out & 1822 # cat trace_pipe > /tmp/trace.out &
1772[1] 4153 1823[1] 4153
1773 # echo 1 > /debug/tracing/tracing_enabled 1824 # echo 1 > tracing_enabled
1774 # usleep 1 1825 # usleep 1
1775 # echo 0 > /debug/tracing/tracing_enabled 1826 # echo 0 > tracing_enabled
1776 # cat /debug/tracing/trace 1827 # cat trace
1777# tracer: function 1828# tracer: function
1778# 1829#
1779# TASK-PID CPU# TIMESTAMP FUNCTION 1830# TASK-PID CPU# TIMESTAMP FUNCTION
@@ -1809,7 +1860,7 @@ number listed is the number of entries that can be recorded per
1809CPU. To know the full size, multiply the number of possible CPUS 1860CPU. To know the full size, multiply the number of possible CPUS
1810with the number of entries. 1861with the number of entries.
1811 1862
1812 # cat /debug/tracing/buffer_size_kb 1863 # cat buffer_size_kb
18131408 (units kilobytes) 18641408 (units kilobytes)
1814 1865
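The multiplication described above can be sketched in C. This is only an illustration: total_buffer_kb() and print_total() are hypothetical names, not part of ftrace, and the caller supplies the CPU count rather than having it queried from the kernel.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of ftrace): total trace buffer size,
 * given the per-CPU value read from buffer_size_kb and the number of
 * possible CPUs.
 */
unsigned long total_buffer_kb(unsigned long per_cpu_kb, unsigned int nr_cpus)
{
	return per_cpu_kb * nr_cpus;
}

/* Print a one-line summary of the computation. */
void print_total(unsigned long per_cpu_kb, unsigned int nr_cpus)
{
	printf("%lu per CPU x %u CPUs = %lu\n",
	       per_cpu_kb, nr_cpus, total_buffer_kb(per_cpu_kb, nr_cpus));
}
```

With the value 1408 shown above and an assumed 4 possible CPUs, total_buffer_kb(1408, 4) gives 5632.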
1815Note, to modify this, you must have tracing completely disabled. 1866Note, to modify this, you must have tracing completely disabled.
@@ -1817,18 +1868,18 @@ To do that, echo "nop" into the current_tracer. If the
1817current_tracer is not set to "nop", an EINVAL error will be 1868current_tracer is not set to "nop", an EINVAL error will be
1818returned. 1869returned.
1819 1870
1820 # echo nop > /debug/tracing/current_tracer 1871 # echo nop > current_tracer
1821 # echo 10000 > /debug/tracing/buffer_size_kb 1872 # echo 10000 > buffer_size_kb
1822 # cat /debug/tracing/buffer_size_kb 1873 # cat buffer_size_kb
182310000 (units kilobytes) 187410000 (units kilobytes)
1824 1875
1825The number of pages which will be allocated is limited to a 1876The number of pages which will be allocated is limited to a
1826percentage of available memory. Allocating too much will produce 1877percentage of available memory. Allocating too much will produce
1827an error. 1878an error.
1828 1879
1829 # echo 1000000000000 > /debug/tracing/buffer_size_kb 1880 # echo 1000000000000 > buffer_size_kb
1830-bash: echo: write error: Cannot allocate memory 1881-bash: echo: write error: Cannot allocate memory
1831 # cat /debug/tracing/buffer_size_kb 1882 # cat buffer_size_kb
183285 188385
1833 1884
1834----------- 1885-----------
diff --git a/Documentation/trace/mmiotrace.txt b/Documentation/trace/mmiotrace.txt
index 5731c67abc55..162effbfbdec 100644
--- a/Documentation/trace/mmiotrace.txt
+++ b/Documentation/trace/mmiotrace.txt
@@ -32,41 +32,41 @@ is no way to automatically detect if you are losing events due to CPUs racing.
32Usage Quick Reference 32Usage Quick Reference
33--------------------- 33---------------------
34 34
35$ mount -t debugfs debugfs /debug 35$ mount -t debugfs debugfs /sys/kernel/debug
36$ echo mmiotrace > /debug/tracing/current_tracer 36$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
37$ cat /debug/tracing/trace_pipe > mydump.txt & 37$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
38Start X or whatever. 38Start X or whatever.
39$ echo "X is up" > /debug/tracing/trace_marker 39$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
40$ echo nop > /debug/tracing/current_tracer 40$ echo nop > /sys/kernel/debug/tracing/current_tracer
41Check for lost events. 41Check for lost events.
42 42
43 43
44Usage 44Usage
45----- 45-----
46 46
47Make sure debugfs is mounted to /debug. If not, (requires root privileges) 47Make sure debugfs is mounted to /sys/kernel/debug. If not, (requires root privileges)
48$ mount -t debugfs debugfs /debug 48$ mount -t debugfs debugfs /sys/kernel/debug
49 49
50Check that the driver you are about to trace is not loaded. 50Check that the driver you are about to trace is not loaded.
51 51
52Activate mmiotrace (requires root privileges): 52Activate mmiotrace (requires root privileges):
53$ echo mmiotrace > /debug/tracing/current_tracer 53$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
54 54
55Start storing the trace: 55Start storing the trace:
56$ cat /debug/tracing/trace_pipe > mydump.txt & 56$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
57The 'cat' process should stay running (sleeping) in the background. 57The 'cat' process should stay running (sleeping) in the background.
58 58
59Load the driver you want to trace and use it. Mmiotrace will only catch MMIO 59Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
60accesses to areas that are ioremapped while mmiotrace is active. 60accesses to areas that are ioremapped while mmiotrace is active.
61 61
62During tracing you can place comments (markers) into the trace by 62During tracing you can place comments (markers) into the trace by
63$ echo "X is up" > /debug/tracing/trace_marker 63$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
64This makes it easier to see which part of the (huge) trace corresponds to 64This makes it easier to see which part of the (huge) trace corresponds to
65which action. It is recommended to place descriptive markers about what you 65which action. It is recommended to place descriptive markers about what you
66do. 66do.
67 67
68Shut down mmiotrace (requires root privileges): 68Shut down mmiotrace (requires root privileges):
69$ echo nop > /debug/tracing/current_tracer 69$ echo nop > /sys/kernel/debug/tracing/current_tracer
70The 'cat' process exits. If it does not, kill it by issuing 'fg' command and 70The 'cat' process exits. If it does not, kill it by issuing 'fg' command and
71pressing ctrl+c. 71pressing ctrl+c.
72 72
@@ -78,10 +78,10 @@ to view your kernel log and look for "mmiotrace has lost events" warning. If
78events were lost, the trace is incomplete. You should enlarge the buffers and 78events were lost, the trace is incomplete. You should enlarge the buffers and
79try again. Buffers are enlarged by first seeing how large the current buffers 79try again. Buffers are enlarged by first seeing how large the current buffers
80are: 80are:
81$ cat /debug/tracing/buffer_size_kb 81$ cat /sys/kernel/debug/tracing/buffer_size_kb
82gives you a number. Approximately double this number and write it back, for 82gives you a number. Approximately double this number and write it back, for
83instance: 83instance:
84$ echo 128000 > /debug/tracing/buffer_size_kb 84$ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb
85Then start again from the top. 85Then start again from the top.
86 86
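The "read the size, double it, write it back" step can also be done from a small program. The following is a minimal sketch, not part of mmiotrace: the function name double_buffer_size() is hypothetical, the path is passed in by the caller (on a real system it would be /sys/kernel/debug/tracing/buffer_size_kb, and root privileges would be required), and it doubles exactly rather than approximately.

```c
#include <stdio.h>

/*
 * Sketch: read the current size (in KB) from `path`, then write back
 * twice that value. Returns the new size, or -1 on error.
 */
long double_buffer_size(const char *path)
{
	unsigned long kb;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fscanf(f, "%lu", &kb) != 1) {
		fclose(f);
		return -1;
	}
	fclose(f);

	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%lu\n", kb * 2);
	/* fclose() flushes the write; a failure here means it was rejected */
	if (fclose(f) != 0)
		return -1;

	return (long)(kb * 2);
}
```

Checking the return value of fclose() matters here because stdio buffers the write, so an error on the write would typically only surface when the stream is flushed.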
87If you are doing a trace for a driver project, e.g. Nouveau, you should also 87If you are doing a trace for a driver project, e.g. Nouveau, you should also
diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885
index 91aa3c0f0dd2..450b8f8c389b 100644
--- a/Documentation/video4linux/CARDLIST.cx23885
+++ b/Documentation/video4linux/CARDLIST.cx23885
@@ -16,3 +16,8 @@
16 15 -> TeVii S470 [d470:9022] 16 15 -> TeVii S470 [d470:9022]
17 16 -> DVBWorld DVB-S2 2005 [0001:2005] 17 16 -> DVBWorld DVB-S2 2005 [0001:2005]
18 17 -> NetUP Dual DVB-S2 CI [1b55:2a2c] 18 17 -> NetUP Dual DVB-S2 CI [1b55:2a2c]
19 18 -> Hauppauge WinTV-HVR1270 [0070:2211]
20 19 -> Hauppauge WinTV-HVR1275 [0070:2215]
21 20 -> Hauppauge WinTV-HVR1255 [0070:2251]
22 21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295]
23 22 -> Mygica X8506 DMB-TH [14f1:8651]
diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88
index 71e9db0b26f7..89093f531727 100644
--- a/Documentation/video4linux/CARDLIST.cx88
+++ b/Documentation/video4linux/CARDLIST.cx88
@@ -78,3 +78,5 @@
78 77 -> TBS 8910 DVB-S [8910:8888] 78 77 -> TBS 8910 DVB-S [8910:8888]
79 78 -> Prof 6200 DVB-S [b022:3022] 79 78 -> Prof 6200 DVB-S [b022:3022]
80 79 -> Terratec Cinergy HT PCI MKII [153b:1177] 80 79 -> Terratec Cinergy HT PCI MKII [153b:1177]
81 80 -> Hauppauge WinTV-IR Only [0070:9290]
82 81 -> Leadtek WinFast DTV1800 Hybrid [107d:6654]
diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx
index 78d0a6eed571..a98a688c11b8 100644
--- a/Documentation/video4linux/CARDLIST.em28xx
+++ b/Documentation/video4linux/CARDLIST.em28xx
@@ -17,7 +17,7 @@
17 16 -> Hauppauge WinTV HVR 950 (em2883) [2040:6513,2040:6517,2040:651b] 17 16 -> Hauppauge WinTV HVR 950 (em2883) [2040:6513,2040:6517,2040:651b]
18 17 -> Pinnacle PCTV HD Pro Stick (em2880) [2304:0227] 18 17 -> Pinnacle PCTV HD Pro Stick (em2880) [2304:0227]
19 18 -> Hauppauge WinTV HVR 900 (R2) (em2880) [2040:6502] 19 18 -> Hauppauge WinTV HVR 900 (R2) (em2880) [2040:6502]
20 19 -> PointNix Intra-Oral Camera (em2860) 20 19 -> EM2860/SAA711X Reference Design (em2860)
21 20 -> AMD ATI TV Wonder HD 600 (em2880) [0438:b002] 21 20 -> AMD ATI TV Wonder HD 600 (em2880) [0438:b002]
22 21 -> eMPIA Technology, Inc. GrabBeeX+ Video Encoder (em2800) [eb1a:2801] 22 21 -> eMPIA Technology, Inc. GrabBeeX+ Video Encoder (em2800) [eb1a:2801]
23 22 -> Unknown EM2750/EM2751 webcam grabber (em2750) [eb1a:2750,eb1a:2751] 23 22 -> Unknown EM2750/EM2751 webcam grabber (em2750) [eb1a:2750,eb1a:2751]
@@ -61,3 +61,7 @@
61 63 -> Kaiomy TVnPC U2 (em2860) [eb1a:e303] 61 63 -> Kaiomy TVnPC U2 (em2860) [eb1a:e303]
62 64 -> Easy Cap Capture DC-60 (em2860) 62 64 -> Easy Cap Capture DC-60 (em2860)
63 65 -> IO-DATA GV-MVP/SZ (em2820/em2840) [04bb:0515] 63 65 -> IO-DATA GV-MVP/SZ (em2820/em2840) [04bb:0515]
64 66 -> Empire dual TV (em2880)
65 67 -> Terratec Grabby (em2860) [0ccd:0096]
66 68 -> Terratec AV350 (em2860) [0ccd:0084]
67 69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313]
diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134
index 6dacf2825259..15562427e8a9 100644
--- a/Documentation/video4linux/CARDLIST.saa7134
+++ b/Documentation/video4linux/CARDLIST.saa7134
@@ -124,10 +124,10 @@
124123 -> Beholder BeholdTV 407 [0000:4070] 124123 -> Beholder BeholdTV 407 [0000:4070]
125124 -> Beholder BeholdTV 407 FM [0000:4071] 125124 -> Beholder BeholdTV 407 FM [0000:4071]
126125 -> Beholder BeholdTV 409 [0000:4090] 126125 -> Beholder BeholdTV 409 [0000:4090]
127126 -> Beholder BeholdTV 505 FM/RDS [0000:5051,0000:505B,5ace:5050] 127126 -> Beholder BeholdTV 505 FM [5ace:5050]
128127 -> Beholder BeholdTV 507 FM/RDS / BeholdTV 509 FM [0000:5071,0000:507B,5ace:5070,5ace:5090] 128127 -> Beholder BeholdTV 507 FM / BeholdTV 509 FM [5ace:5070,5ace:5090]
129128 -> Beholder BeholdTV Columbus TVFM [0000:5201] 129128 -> Beholder BeholdTV Columbus TVFM [0000:5201]
130129 -> Beholder BeholdTV 607 / BeholdTV 609 [5ace:6070,5ace:6071,5ace:6072,5ace:6073,5ace:6090,5ace:6091,5ace:6092,5ace:6093] 130129 -> Beholder BeholdTV 607 FM [5ace:6070]
131130 -> Beholder BeholdTV M6 [5ace:6190] 131130 -> Beholder BeholdTV M6 [5ace:6190]
132131 -> Twinhan Hybrid DTV-DVB 3056 PCI [1822:0022] 132131 -> Twinhan Hybrid DTV-DVB 3056 PCI [1822:0022]
133132 -> Genius TVGO AM11MCE 133132 -> Genius TVGO AM11MCE
@@ -143,7 +143,7 @@
143142 -> Beholder BeholdTV H6 [5ace:6290] 143142 -> Beholder BeholdTV H6 [5ace:6290]
144143 -> Beholder BeholdTV M63 [5ace:6191] 144143 -> Beholder BeholdTV M63 [5ace:6191]
145144 -> Beholder BeholdTV M6 Extra [5ace:6193] 145144 -> Beholder BeholdTV M6 Extra [5ace:6193]
146145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636] 146145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636,1461:f736]
147146 -> ASUSTeK P7131 Analog 147146 -> ASUSTeK P7131 Analog
148147 -> Asus Tiger 3in1 [1043:4878] 148147 -> Asus Tiger 3in1 [1043:4878]
149148 -> Encore ENLTV-FM v5.3 [1a7f:2008] 149148 -> Encore ENLTV-FM v5.3 [1a7f:2008]
@@ -154,4 +154,16 @@
154153 -> Kworld Plus TV Analog Lite PCI [17de:7128] 154153 -> Kworld Plus TV Analog Lite PCI [17de:7128]
155154 -> Avermedia AVerTV GO 007 FM Plus [1461:f31d] 155154 -> Avermedia AVerTV GO 007 FM Plus [1461:f31d]
156155 -> Hauppauge WinTV-HVR1120 ATSC/QAM-Hybrid [0070:6706,0070:6708] 156155 -> Hauppauge WinTV-HVR1120 ATSC/QAM-Hybrid [0070:6706,0070:6708]
157156 -> Hauppauge WinTV-HVR1110r3 [0070:6707,0070:6709,0070:670a] 157156 -> Hauppauge WinTV-HVR1110r3 DVB-T/Hybrid [0070:6707,0070:6709,0070:670a]
158157 -> Avermedia AVerTV Studio 507UA [1461:a11b]
159158 -> AVerMedia Cardbus TV/Radio (E501R) [1461:b7e9]
160159 -> Beholder BeholdTV 505 RDS [0000:505B]
161160 -> Beholder BeholdTV 507 RDS [0000:5071]
162161 -> Beholder BeholdTV 507 RDS [0000:507B]
163162 -> Beholder BeholdTV 607 FM [5ace:6071]
164163 -> Beholder BeholdTV 609 FM [5ace:6090]
165164 -> Beholder BeholdTV 609 FM [5ace:6091]
166165 -> Beholder BeholdTV 607 RDS [5ace:6072]
167166 -> Beholder BeholdTV 607 RDS [5ace:6073]
168167 -> Beholder BeholdTV 609 RDS [5ace:6092]
169168 -> Beholder BeholdTV 609 RDS [5ace:6093]
diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner
index 691d2f37dc57..be67844074dd 100644
--- a/Documentation/video4linux/CARDLIST.tuner
+++ b/Documentation/video4linux/CARDLIST.tuner
@@ -76,3 +76,5 @@ tuner=75 - Philips TEA5761 FM Radio
76tuner=76 - Xceive 5000 tuner 76tuner=76 - Xceive 5000 tuner
77tuner=77 - TCL tuner MF02GIP-5N-E 77tuner=77 - TCL tuner MF02GIP-5N-E
78tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner 78tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner
79tuner=79 - Philips PAL/SECAM multi (FM1216 MK5)
80tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough
diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt
index 98529e03a46e..2bcf78896e22 100644
--- a/Documentation/video4linux/gspca.txt
+++ b/Documentation/video4linux/gspca.txt
@@ -163,10 +163,11 @@ sunplus 055f:c650 Mustek MDC5500Z
163zc3xx 055f:d003 Mustek WCam300A 163zc3xx 055f:d003 Mustek WCam300A
164zc3xx 055f:d004 Mustek WCam300 AN 164zc3xx 055f:d004 Mustek WCam300 AN
165conex 0572:0041 Creative Notebook cx11646 165conex 0572:0041 Creative Notebook cx11646
166ov519 05a9:0519 OmniVision 166ov519 05a9:0519 OV519 Microphone
167ov519 05a9:0530 OmniVision 167ov519 05a9:0530 OmniVision
168ov519 05a9:4519 OmniVision 168ov519 05a9:4519 Webcam Classic
169ov519 05a9:8519 OmniVision 169ov519 05a9:8519 OmniVision
170ov519 05a9:a518 D-Link DSB-C310 Webcam
170sunplus 05da:1018 Digital Dream Enigma 1.3 171sunplus 05da:1018 Digital Dream Enigma 1.3
171stk014 05e1:0893 Syntek DV4000 172stk014 05e1:0893 Syntek DV4000
172spca561 060b:a001 Maxell Compact Pc PM3 173spca561 060b:a001 Maxell Compact Pc PM3
@@ -178,6 +179,7 @@ spca506 06e1:a190 ADS Instant VCD
178ov534 06f8:3002 Hercules Blog Webcam 179ov534 06f8:3002 Hercules Blog Webcam
179ov534 06f8:3003 Hercules Dualpix HD Weblog 180ov534 06f8:3003 Hercules Dualpix HD Weblog
180sonixj 06f8:3004 Hercules Classic Silver 181sonixj 06f8:3004 Hercules Classic Silver
182sonixj 06f8:3008 Hercules Deluxe Optical Glass
181spca508 0733:0110 ViewQuest VQ110 183spca508 0733:0110 ViewQuest VQ110
182spca508 0130:0130 Clone Digital Webcam 11043 184spca508 0130:0130 Clone Digital Webcam 11043
183spca501 0733:0401 Intel Create and Share 185spca501 0733:0401 Intel Create and Share
@@ -209,6 +211,7 @@ sunplus 08ca:2050 Medion MD 41437
209sunplus 08ca:2060 Aiptek PocketDV5300 211sunplus 08ca:2060 Aiptek PocketDV5300
210tv8532 0923:010f ICM532 cams 212tv8532 0923:010f ICM532 cams
211mars 093a:050f Mars-Semi Pc-Camera 213mars 093a:050f Mars-Semi Pc-Camera
214mr97310a 093a:010f Sakar Digital no. 77379
212pac207 093a:2460 Qtec Webcam 100 215pac207 093a:2460 Qtec Webcam 100
213pac207 093a:2461 HP Webcam 216pac207 093a:2461 HP Webcam
214pac207 093a:2463 Philips SPC 220 NC 217pac207 093a:2463 Philips SPC 220 NC
@@ -265,6 +268,11 @@ sonixj 0c45:60ec SN9C105+MO4000
265sonixj 0c45:60fb Surfer NoName 268sonixj 0c45:60fb Surfer NoName
266sonixj 0c45:60fc LG-LIC300 269sonixj 0c45:60fc LG-LIC300
267sonixj 0c45:60fe Microdia Audio 270sonixj 0c45:60fe Microdia Audio
271sonixj 0c45:6100 PC Camera (SN9C128)
272sonixj 0c45:610a PC Camera (SN9C128)
273sonixj 0c45:610b PC Camera (SN9C128)
274sonixj 0c45:610c PC Camera (SN9C128)
275sonixj 0c45:610e PC Camera (SN9C128)
268sonixj 0c45:6128 Microdia/Sonix SNP325 276sonixj 0c45:6128 Microdia/Sonix SNP325
269sonixj 0c45:612a Avant Camera 277sonixj 0c45:612a Avant Camera
270sonixj 0c45:612c Typhoon Rasy Cam 1.3MPix 278sonixj 0c45:612c Typhoon Rasy Cam 1.3MPix
diff --git a/Documentation/video4linux/pxa_camera.txt b/Documentation/video4linux/pxa_camera.txt
index b1137f9a53eb..4f6d0ca01956 100644
--- a/Documentation/video4linux/pxa_camera.txt
+++ b/Documentation/video4linux/pxa_camera.txt
@@ -26,6 +26,55 @@ Global video workflow
26 26
27 Once the last buffer is filled in, the QCI interface stops. 27 Once the last buffer is filled in, the QCI interface stops.
28 28
29 c) Capture global finite state machine schema
30
31 +----+ +---+ +----+
32 | DQ | | Q | | DQ |
33 | v | v | v
34 +-----------+ +------------------------+
35 | STOP | | Wait for capture start |
36 +-----------+ Q +------------------------+
37+-> | QCI: stop | ------------------> | QCI: run | <------------+
38| | DMA: stop | | DMA: stop | |
39| +-----------+ +-----> +------------------------+ |
40| / | |
41| / +---+ +----+ | |
42|capture list empty / | Q | | DQ | | QCI Irq EOF |
43| / | v | v v |
44| +--------------------+ +----------------------+ |
45| | DMA hotlink missed | | Capture running | |
46| +--------------------+ +----------------------+ |
47| | QCI: run | +-----> | QCI: run | <-+ |
48| | DMA: stop | / | DMA: run | | |
49| +--------------------+ / +----------------------+ | Other |
50| ^ /DMA still | | channels |
51| | capture list / running | DMA Irq End | not |
52| | not empty / | | finished |
53| | / v | yet |
54| +----------------------+ +----------------------+ | |
55| | Videobuf released | | Channel completed | | |
56| +----------------------+ +----------------------+ | |
57+-- | QCI: run | | QCI: run | --+ |
58 | DMA: run | | DMA: run | |
59 +----------------------+ +----------------------+ |
60 ^ / | |
61 | no overrun / | overrun |
62 | / v |
63 +--------------------+ / +----------------------+ |
64 | Frame completed | / | Frame overran | |
65 +--------------------+ <-----+ +----------------------+ restart frame |
66 | QCI: run | | QCI: stop | --------------+
67 | DMA: run | | DMA: stop |
68 +--------------------+ +----------------------+
69
70 Legend: - each box is a FSM state
71 - each arrow is the condition to transition to another state
72 - an arrow with a comment is a mandatory transition (no condition)
73 - arrow "Q" means : a buffer was enqueued
74 - arrow "DQ" means : a buffer was dequeued
75 - "QCI: stop" means the QCI interface is not enabled
76 - "DMA: stop" means all 3 DMA channels are stopped
77 - "DMA: run" means at least 1 DMA channel is still running
29 78
30DMA usage 79DMA usage
31--------- 80---------
diff --git a/Documentation/video4linux/v4l2-framework.txt b/Documentation/video4linux/v4l2-framework.txt
index 854808b67fae..d54c1e4c6a9c 100644
--- a/Documentation/video4linux/v4l2-framework.txt
+++ b/Documentation/video4linux/v4l2-framework.txt
@@ -89,6 +89,11 @@ from dev (driver name followed by the bus_id, to be precise). If you set it
89up before calling v4l2_device_register then it will be untouched. If dev is 89up before calling v4l2_device_register then it will be untouched. If dev is
90NULL, then you *must* setup v4l2_dev->name before calling v4l2_device_register. 90NULL, then you *must* setup v4l2_dev->name before calling v4l2_device_register.
91 91
92You can use v4l2_device_set_name() to set the name based on a driver name and
93a driver-global atomic_t instance. This will generate names like ivtv0, ivtv1,
94etc. If the name ends with a digit, then it will insert a dash: cx18-0,
95cx18-1, etc. This function returns the instance number.
96
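The naming rule described above can be illustrated with a small userspace sketch. This is NOT the kernel's v4l2_device_set_name(): the function sketch_set_name() is hypothetical, and a plain int counter stands in for the driver-global atomic_t instance.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Userspace illustration only. Builds "<basename><N>", or
 * "<basename>-<N>" when basename already ends with a digit, and
 * returns the instance number N that was used.
 */
int sketch_set_name(char *out, size_t len, const char *basename, int *instance)
{
	int num = (*instance)++;	/* the kernel would use an atomic op */
	size_t n = strlen(basename);

	if (n > 0 && isdigit((unsigned char)basename[n - 1]))
		snprintf(out, len, "%s-%d", basename, num);
	else
		snprintf(out, len, "%s%d", basename, num);

	return num;
}
```

Starting from a counter of 0, "ivtv" produces ivtv0 and then ivtv1, while "cx18" produces cx18-0 and then cx18-1, matching the examples in the text.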
92The first 'dev' argument is normally the struct device pointer of a pci_dev, 97The first 'dev' argument is normally the struct device pointer of a pci_dev,
93usb_interface or platform_device. It is rare for dev to be NULL, but it happens 98usb_interface or platform_device. It is rare for dev to be NULL, but it happens
94with ISA devices or when one device creates multiple PCI devices, thus making 99with ISA devices or when one device creates multiple PCI devices, thus making
diff --git a/Documentation/vm/Makefile b/Documentation/vm/Makefile
index 6f562f778b28..5bd269b3731a 100644
--- a/Documentation/vm/Makefile
+++ b/Documentation/vm/Makefile
@@ -2,7 +2,7 @@
2obj- := dummy.o 2obj- := dummy.o
3 3
4# List of programs to build 4# List of programs to build
5hostprogs-y := slabinfo 5hostprogs-y := slabinfo page-types
6 6
7# Tell kbuild to always build the programs 7# Tell kbuild to always build the programs
8always := $(hostprogs-y) 8always := $(hostprogs-y)
diff --git a/Documentation/vm/balance b/Documentation/vm/balance
index bd3d31bc4915..c46e68cf9344 100644
--- a/Documentation/vm/balance
+++ b/Documentation/vm/balance
@@ -75,15 +75,15 @@ Page stealing from process memory and shm is done if stealing the page would
75alleviate memory pressure on any zone in the page's node that has fallen below 75alleviate memory pressure on any zone in the page's node that has fallen below
76its watermark. 76its watermark.
77 77
78pages_min/pages_low/pages_high/low_on_memory/zone_wake_kswapd: These are 78watermark[WMARK_MIN/WMARK_LOW/WMARK_HIGH]/low_on_memory/zone_wake_kswapd: These
79per-zone fields, used to determine when a zone needs to be balanced. When 79are per-zone fields, used to determine when a zone needs to be balanced. When
80the number of pages falls below pages_min, the hysteric field low_on_memory 80the number of pages falls below watermark[WMARK_MIN], the hysteric field
81gets set. This stays set till the number of free pages becomes pages_high. 81low_on_memory gets set. This stays set till the number of free pages becomes
82When low_on_memory is set, page allocation requests will try to free some 82watermark[WMARK_HIGH]. When low_on_memory is set, page allocation requests will
83pages in the zone (providing GFP_WAIT is set in the request). Orthogonal 83try to free some pages in the zone (providing GFP_WAIT is set in the request).
84to this, is the decision to poke kswapd to free some zone pages. That 84Orthogonal to this, is the decision to poke kswapd to free some zone pages.
85decision is not hysteresis based, and is done when the number of free 85That decision is not hysteresis based, and is done when the number of free
86pages is below pages_low; in which case zone_wake_kswapd is also set. 86pages is below watermark[WMARK_LOW]; in which case zone_wake_kswapd is also set.
87 87
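The two decisions described in the paragraph above (one hysteresis based, one not) can be modelled as follows. This is an illustrative sketch only: struct zone_sketch and zone_update() are hypothetical and do not match the kernel's struct zone. Clearing zone_wake_kswapd once free pages are no longer below the low watermark is an assumption, since the text only says when the flag is set.

```c
#include <stdbool.h>

enum { WMARK_MIN, WMARK_LOW, WMARK_HIGH, NR_WMARK };

/* Hypothetical model of the per-zone fields named in the text. */
struct zone_sketch {
	unsigned long watermark[NR_WMARK];
	unsigned long free_pages;
	bool low_on_memory;	/* set below MIN, cleared only at HIGH */
	bool zone_wake_kswapd;	/* simply tracks free_pages < LOW */
};

void zone_update(struct zone_sketch *z, unsigned long free_pages)
{
	z->free_pages = free_pages;

	/* Hysteresis: between MIN and HIGH the previous value is kept. */
	if (free_pages < z->watermark[WMARK_MIN])
		z->low_on_memory = true;
	else if (free_pages >= z->watermark[WMARK_HIGH])
		z->low_on_memory = false;

	/* No hysteresis: decided from the current count alone. */
	z->zone_wake_kswapd = free_pages < z->watermark[WMARK_LOW];
}
```

With watermarks of 10/20/30, dropping to 5 free pages sets both flags. Recovering to 25 clears zone_wake_kswapd but leaves low_on_memory set, and it is only cleared once 30 free pages are reached.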
88 88
89(Good) Ideas that I have heard: 89(Good) Ideas that I have heard:
diff --git a/Documentation/vm/page-types.c b/Documentation/vm/page-types.c
new file mode 100644
index 000000000000..0833f44ba16b
--- /dev/null
+++ b/Documentation/vm/page-types.c
@@ -0,0 +1,698 @@
1/*
2 * page-types: Tool for querying page flags
3 *
4 * Copyright (C) 2009 Intel corporation
5 * Copyright (C) 2009 Wu Fengguang <fengguang.wu@intel.com>
6 */
7
8#include <stdio.h>
9#include <stdlib.h>
10#include <unistd.h>
11#include <stdint.h>
12#include <stdarg.h>
13#include <string.h>
14#include <getopt.h>
15#include <limits.h>
16#include <sys/types.h>
17#include <sys/errno.h>
18#include <sys/fcntl.h>
19
20
21/*
22 * kernel page flags
23 */
24
25#define KPF_BYTES 8
26#define PROC_KPAGEFLAGS "/proc/kpageflags"
27
28/* copied from kpageflags_read() */
29#define KPF_LOCKED 0
30#define KPF_ERROR 1
31#define KPF_REFERENCED 2
32#define KPF_UPTODATE 3
33#define KPF_DIRTY 4
34#define KPF_LRU 5
35#define KPF_ACTIVE 6
36#define KPF_SLAB 7
37#define KPF_WRITEBACK 8
38#define KPF_RECLAIM 9
39#define KPF_BUDDY 10
40
41/* [11-20] new additions in 2.6.31 */
42#define KPF_MMAP 11
43#define KPF_ANON 12
44#define KPF_SWAPCACHE 13
45#define KPF_SWAPBACKED 14
46#define KPF_COMPOUND_HEAD 15
47#define KPF_COMPOUND_TAIL 16
48#define KPF_HUGE 17
49#define KPF_UNEVICTABLE 18
50#define KPF_NOPAGE 20
51
52/* [32-] kernel hacking assistances */
53#define KPF_RESERVED 32
54#define KPF_MLOCKED 33
55#define KPF_MAPPEDTODISK 34
56#define KPF_PRIVATE 35
57#define KPF_PRIVATE_2 36
58#define KPF_OWNER_PRIVATE 37
59#define KPF_ARCH 38
60#define KPF_UNCACHED 39
61
62/* [48-] take some arbitrary free slots for expanding overloaded flags
63 * not part of kernel API
64 */
65#define KPF_READAHEAD 48
66#define KPF_SLOB_FREE 49
67#define KPF_SLUB_FROZEN 50
68#define KPF_SLUB_DEBUG 51
69
70#define KPF_ALL_BITS ((uint64_t)~0ULL)
71#define KPF_HACKERS_BITS (0xffffULL << 32)
72#define KPF_OVERLOADED_BITS (0xffffULL << 48)
73#define BIT(name) (1ULL << KPF_##name)
74#define BITS_COMPOUND (BIT(COMPOUND_HEAD) | BIT(COMPOUND_TAIL))
75
76static char *page_flag_names[] = {
77 [KPF_LOCKED] = "L:locked",
78 [KPF_ERROR] = "E:error",
79 [KPF_REFERENCED] = "R:referenced",
80 [KPF_UPTODATE] = "U:uptodate",
81 [KPF_DIRTY] = "D:dirty",
82 [KPF_LRU] = "l:lru",
83 [KPF_ACTIVE] = "A:active",
84 [KPF_SLAB] = "S:slab",
85 [KPF_WRITEBACK] = "W:writeback",
86 [KPF_RECLAIM] = "I:reclaim",
87 [KPF_BUDDY] = "B:buddy",
88
89 [KPF_MMAP] = "M:mmap",
90 [KPF_ANON] = "a:anonymous",
91 [KPF_SWAPCACHE] = "s:swapcache",
92 [KPF_SWAPBACKED] = "b:swapbacked",
93 [KPF_COMPOUND_HEAD] = "H:compound_head",
94 [KPF_COMPOUND_TAIL] = "T:compound_tail",
95 [KPF_HUGE] = "G:huge",
96 [KPF_UNEVICTABLE] = "u:unevictable",
97 [KPF_NOPAGE] = "n:nopage",
98
99 [KPF_RESERVED] = "r:reserved",
100 [KPF_MLOCKED] = "m:mlocked",
101 [KPF_MAPPEDTODISK] = "d:mappedtodisk",
102 [KPF_PRIVATE] = "P:private",
103 [KPF_PRIVATE_2] = "p:private_2",
104 [KPF_OWNER_PRIVATE] = "O:owner_private",
105 [KPF_ARCH] = "h:arch",
106 [KPF_UNCACHED] = "c:uncached",
107
108 [KPF_READAHEAD] = "I:readahead",
109 [KPF_SLOB_FREE] = "P:slob_free",
110 [KPF_SLUB_FROZEN] = "A:slub_frozen",
111 [KPF_SLUB_DEBUG] = "E:slub_debug",
112};
113
114
115/*
116 * data structures
117 */
118
119static int opt_raw; /* for kernel developers */
120static int opt_list; /* list pages (in ranges) */
121static int opt_no_summary; /* don't show summary */
122static pid_t opt_pid; /* process to walk */
123
124#define MAX_ADDR_RANGES 1024
125static int nr_addr_ranges;
126static unsigned long opt_offset[MAX_ADDR_RANGES];
127static unsigned long opt_size[MAX_ADDR_RANGES];
128
129#define MAX_BIT_FILTERS 64
130static int nr_bit_filters;
131static uint64_t opt_mask[MAX_BIT_FILTERS];
132static uint64_t opt_bits[MAX_BIT_FILTERS];
133
134static int page_size;
135
136#define PAGES_BATCH (64 << 10) /* 64k pages */
137static int kpageflags_fd;
138static uint64_t kpageflags_buf[KPF_BYTES * PAGES_BATCH];
139
140#define HASH_SHIFT 13
141#define HASH_SIZE (1 << HASH_SHIFT)
142#define HASH_MASK (HASH_SIZE - 1)
143#define HASH_KEY(flags) (flags & HASH_MASK)
144
145static unsigned long total_pages;
146static unsigned long nr_pages[HASH_SIZE];
147static uint64_t page_flags[HASH_SIZE];
148
149
150/*
151 * helper functions
152 */
153
154#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
155
156#define min_t(type, x, y) ({ \
157 type __min1 = (x); \
158 type __min2 = (y); \
159 __min1 < __min2 ? __min1 : __min2; })
160
161unsigned long pages2mb(unsigned long pages)
162{
163 return (pages * page_size) >> 20;
164}
165
166void fatal(const char *x, ...)
167{
168 va_list ap;
169
170 va_start(ap, x);
171 vfprintf(stderr, x, ap);
172 va_end(ap);
173 exit(EXIT_FAILURE);
174}
175
176
177/*
178 * page flag names
179 */
180
181char *page_flag_name(uint64_t flags)
182{
183 static char buf[65];
184 int present;
185 int i, j;
186
187 for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) {
188 present = (flags >> i) & 1;
189 if (!page_flag_names[i]) {
190 if (present)
191 fatal("unknown flag bit %d\n", i);
192 continue;
193 }
194 buf[j++] = present ? page_flag_names[i][0] : '_';
195 }
196
197 return buf;
198}
199
200char *page_flag_longname(uint64_t flags)
201{
202 static char buf[1024];
203 int i, n;
204
205 for (i = 0, n = 0; i < ARRAY_SIZE(page_flag_names); i++) {
206 if (!page_flag_names[i])
207 continue;
208 if ((flags >> i) & 1)
209 n += snprintf(buf + n, sizeof(buf) - n, "%s,",
210 page_flag_names[i] + 2);
211 }
212 if (n)
213 n--;
214 buf[n] = '\0';
215
216 return buf;
217}
218
219
220/*
221 * page list and summary
222 */
223
224void show_page_range(unsigned long offset, uint64_t flags)
225{
226 static uint64_t flags0;
227 static unsigned long index;
228 static unsigned long count;
229
230 if (flags == flags0 && offset == index + count) {
231 count++;
232 return;
233 }
234
235 if (count)
236 printf("%lu\t%lu\t%s\n",
237 index, count, page_flag_name(flags0));
238
239 flags0 = flags;
240 index = offset;
241 count = 1;
242}
243
244void show_page(unsigned long offset, uint64_t flags)
245{
246 printf("%lu\t%s\n", offset, page_flag_name(flags));
247}
248
249void show_summary(void)
250{
251 int i;
252
253 printf(" flags\tpage-count MB"
254 " symbolic-flags\t\t\tlong-symbolic-flags\n");
255
256 for (i = 0; i < ARRAY_SIZE(nr_pages); i++) {
257 if (nr_pages[i])
258 printf("0x%016llx\t%10lu %8lu %s\t%s\n",
259 (unsigned long long)page_flags[i],
260 nr_pages[i],
261 pages2mb(nr_pages[i]),
262 page_flag_name(page_flags[i]),
263 page_flag_longname(page_flags[i]));
264 }
265
266 printf(" total\t%10lu %8lu\n",
267 total_pages, pages2mb(total_pages));
268}
269
270
271/*
272 * page flag filters
273 */
274
275int bit_mask_ok(uint64_t flags)
276{
277 int i;
278
279 for (i = 0; i < nr_bit_filters; i++) {
280 if (opt_bits[i] == KPF_ALL_BITS) {
281 if ((flags & opt_mask[i]) == 0)
282 return 0;
283 } else {
284 if ((flags & opt_mask[i]) != opt_bits[i])
285 return 0;
286 }
287 }
288
289 return 1;
290}
291
292uint64_t expand_overloaded_flags(uint64_t flags)
293{
294 /* SLOB/SLUB overload several page flags */
295 if (flags & BIT(SLAB)) {
296 if (flags & BIT(PRIVATE))
297 flags ^= BIT(PRIVATE) | BIT(SLOB_FREE);
298 if (flags & BIT(ACTIVE))
299 flags ^= BIT(ACTIVE) | BIT(SLUB_FROZEN);
300 if (flags & BIT(ERROR))
301 flags ^= BIT(ERROR) | BIT(SLUB_DEBUG);
302 }
303
304 /* PG_reclaim is overloaded as PG_readahead in the read path */
305 if ((flags & (BIT(RECLAIM) | BIT(WRITEBACK))) == BIT(RECLAIM))
306 flags ^= BIT(RECLAIM) | BIT(READAHEAD);
307
308 return flags;
309}
310
311uint64_t well_known_flags(uint64_t flags)
312{
313 /* hide flags intended only for kernel hackers */
314 flags &= ~KPF_HACKERS_BITS;
315
316 /* hide non-hugeTLB compound pages */
317 if ((flags & BITS_COMPOUND) && !(flags & BIT(HUGE)))
318 flags &= ~BITS_COMPOUND;
319
320 return flags;
321}
322
323
324/*
325 * page frame walker
326 */
327
328int hash_slot(uint64_t flags)
329{
330 int k = HASH_KEY(flags);
331 int i;
332
333 /* Explicitly reserve slot 0 for flags 0: the following logic
334 * cannot distinguish an unoccupied slot from slot (flags==0).
335 */
336 if (flags == 0)
337 return 0;
338
339 /* search through the remaining (HASH_SIZE-1) slots */
340 for (i = 1; i < ARRAY_SIZE(page_flags); i++, k++) {
341 if (!k || k >= ARRAY_SIZE(page_flags))
342 k = 1;
343 if (page_flags[k] == 0) {
344 page_flags[k] = flags;
345 return k;
346 }
347 if (page_flags[k] == flags)
348 return k;
349 }
350
351 fatal("hash table full: bump up HASH_SHIFT?\n");
352 exit(EXIT_FAILURE);
353}
354
355void add_page(unsigned long offset, uint64_t flags)
356{
357 flags = expand_overloaded_flags(flags);
358
359 if (!opt_raw)
360 flags = well_known_flags(flags);
361
362 if (!bit_mask_ok(flags))
363 return;
364
365 if (opt_list == 1)
366 show_page_range(offset, flags);
367 else if (opt_list == 2)
368 show_page(offset, flags);
369
370 nr_pages[hash_slot(flags)]++;
371 total_pages++;
372}
373
374void walk_pfn(unsigned long index, unsigned long count)
375{
376 unsigned long batch;
377 long n; /* signed: read() returns -1 on error */
378 unsigned long i;
379
380 if (index > ULONG_MAX / KPF_BYTES)
381 fatal("index overflow: %lu\n", index);
382
383 lseek(kpageflags_fd, index * KPF_BYTES, SEEK_SET);
384
385 while (count) {
386 batch = min_t(unsigned long, count, PAGES_BATCH);
387 n = read(kpageflags_fd, kpageflags_buf, batch * KPF_BYTES);
388 if (n == 0)
389 break;
390 if (n < 0) {
391 perror(PROC_KPAGEFLAGS);
392 exit(EXIT_FAILURE);
393 }
394
395 if (n % KPF_BYTES != 0)
396 fatal("partial read: %ld bytes\n", n);
397 n = n / KPF_BYTES;
398
399 for (i = 0; i < n; i++)
400 add_page(index + i, kpageflags_buf[i]);
401
402 index += batch;
403 count -= batch;
404 }
405}
406
407void walk_addr_ranges(void)
408{
409 int i;
410
411 kpageflags_fd = open(PROC_KPAGEFLAGS, O_RDONLY);
412 if (kpageflags_fd < 0) {
413 perror(PROC_KPAGEFLAGS);
414 exit(EXIT_FAILURE);
415 }
416
417 if (!nr_addr_ranges)
418 walk_pfn(0, ULONG_MAX);
419
420 for (i = 0; i < nr_addr_ranges; i++)
421 walk_pfn(opt_offset[i], opt_size[i]);
422
423 close(kpageflags_fd);
424}
425
426
427/*
428 * user interface
429 */
430
431const char *page_flag_type(uint64_t flag)
432{
433 if (flag & KPF_HACKERS_BITS)
434 return "(r)";
435 if (flag & KPF_OVERLOADED_BITS)
436 return "(o)";
437 return " ";
438}
439
440void usage(void)
441{
442 int i, j;
443
444 printf(
445"page-types [options]\n"
446" -r|--raw Raw mode, for kernel developers\n"
447" -a|--addr addr-spec Walk a range of pages\n"
448" -b|--bits bits-spec Walk pages with specified bits\n"
449#if 0 /* planned features */
450" -p|--pid pid Walk process address space\n"
451" -f|--file filename Walk file address space\n"
452#endif
453" -l|--list Show page details in ranges\n"
454" -L|--list-each Show page details one by one\n"
455" -N|--no-summary Don't show summary info\n"
456" -h|--help Show this usage message\n"
457"addr-spec:\n"
458" N one page at offset N (unit: pages)\n"
459" N+M pages range from N to N+M-1\n"
460" N,M pages range from N to M-1\n"
461" N, pages range from N to end\n"
462" ,M pages range from 0 to M-1\n"
463"bits-spec:\n"
464" bit1,bit2 (flags & (bit1|bit2)) != 0\n"
465" bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1\n"
466" bit1,~bit2 (flags & (bit1|bit2)) == bit1\n"
467" =bit1,bit2 flags == (bit1|bit2)\n"
468"bit-names:\n"
469 );
470
471 for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) {
472 if (!page_flag_names[i])
473 continue;
474 printf("%16s%s", page_flag_names[i] + 2,
475 page_flag_type(1ULL << i));
476 if (++j > 3) {
477 j = 0;
478 putchar('\n');
479 }
480 }
481 printf("\n "
482 "(r) raw mode bits (o) overloaded bits\n");
483}
484
485unsigned long long parse_number(const char *str)
486{
487 unsigned long long n;
488
489 n = strtoull(str, NULL, 0);
490
491 if (n == 0 && str[0] != '0')
492 fatal("invalid name or number: %s\n", str);
493
494 return n;
495}
496
497void parse_pid(const char *str)
498{
499 opt_pid = parse_number(str);
500}
501
502void parse_file(const char *name)
503{
504}
505
506void add_addr_range(unsigned long offset, unsigned long size)
507{
508 if (nr_addr_ranges >= MAX_ADDR_RANGES)
509 fatal("too many addr ranges\n");
510
511 opt_offset[nr_addr_ranges] = offset;
512 opt_size[nr_addr_ranges] = size;
513 nr_addr_ranges++;
514}
515
516void parse_addr_range(const char *optarg)
517{
518 unsigned long offset;
519 unsigned long size;
520 char *p;
521
522 p = strchr(optarg, ',');
523 if (!p)
524 p = strchr(optarg, '+');
525
526 if (p == optarg) {
527 offset = 0;
528 size = parse_number(p + 1);
529 } else if (p) {
530 offset = parse_number(optarg);
531 if (p[1] == '\0')
532 size = ULONG_MAX;
533 else {
534 size = parse_number(p + 1);
535 if (*p == ',') {
536 if (size < offset)
537 fatal("invalid range: %lu,%lu\n",
538 offset, size);
539 size -= offset;
540 }
541 }
542 } else {
543 offset = parse_number(optarg);
544 size = 1;
545 }
546
547 add_addr_range(offset, size);
548}
549
550void add_bits_filter(uint64_t mask, uint64_t bits)
551{
552 if (nr_bit_filters >= MAX_BIT_FILTERS)
553 fatal("too many bit filters\n");
554
555 opt_mask[nr_bit_filters] = mask;
556 opt_bits[nr_bit_filters] = bits;
557 nr_bit_filters++;
558}
559
560uint64_t parse_flag_name(const char *str, int len)
561{
562 int i;
563
564 if (!*str || !len)
565 return 0;
566
567 if (len <= 8 && !strncmp(str, "compound", len))
568 return BITS_COMPOUND;
569
570 for (i = 0; i < ARRAY_SIZE(page_flag_names); i++) {
571 if (!page_flag_names[i])
572 continue;
573 if (!strncmp(str, page_flag_names[i] + 2, len))
574 return 1ULL << i;
575 }
576
577 return parse_number(str);
578}
579
580uint64_t parse_flag_names(const char *str, int all)
581{
582 const char *p = str;
583 uint64_t flags = 0;
584
585 while (1) {
586 if (*p == ',' || *p == '=' || *p == '\0') {
587 if ((*str != '~') || (*str == '~' && all && *++str))
588 flags |= parse_flag_name(str, p - str);
589 if (*p != ',')
590 break;
591 str = p + 1;
592 }
593 p++;
594 }
595
596 return flags;
597}
598
599void parse_bits_mask(const char *optarg)
600{
601 uint64_t mask;
602 uint64_t bits;
603 const char *p;
604
605 p = strchr(optarg, '=');
606 if (p == optarg) {
607 mask = KPF_ALL_BITS;
608 bits = parse_flag_names(p + 1, 0);
609 } else if (p) {
610 mask = parse_flag_names(optarg, 0);
611 bits = parse_flag_names(p + 1, 0);
612 } else if (strchr(optarg, '~')) {
613 mask = parse_flag_names(optarg, 1);
614 bits = parse_flag_names(optarg, 0);
615 } else {
616 mask = parse_flag_names(optarg, 0);
617 bits = KPF_ALL_BITS;
618 }
619
620 add_bits_filter(mask, bits);
621}
622
623
624struct option opts[] = {
625 { "raw" , 0, NULL, 'r' },
626 { "pid" , 1, NULL, 'p' },
627 { "file" , 1, NULL, 'f' },
628 { "addr" , 1, NULL, 'a' },
629 { "bits" , 1, NULL, 'b' },
630 { "list" , 0, NULL, 'l' },
631 { "list-each" , 0, NULL, 'L' },
632 { "no-summary", 0, NULL, 'N' },
633 { "help" , 0, NULL, 'h' },
634 { NULL , 0, NULL, 0 }
635};
636
637int main(int argc, char *argv[])
638{
639 int c;
640
641 page_size = getpagesize();
642
643 while ((c = getopt_long(argc, argv,
644 "rp:f:a:b:lLNh", opts, NULL)) != -1) {
645 switch (c) {
646 case 'r':
647 opt_raw = 1;
648 break;
649 case 'p':
650 parse_pid(optarg);
651 break;
652 case 'f':
653 parse_file(optarg);
654 break;
655 case 'a':
656 parse_addr_range(optarg);
657 break;
658 case 'b':
659 parse_bits_mask(optarg);
660 break;
661 case 'l':
662 opt_list = 1;
663 break;
664 case 'L':
665 opt_list = 2;
666 break;
667 case 'N':
668 opt_no_summary = 1;
669 break;
670 case 'h':
671 usage();
672 exit(0);
673 default:
674 usage();
675 exit(1);
676 }
677 }
678
679 if (opt_list == 1)
680 printf("offset\tcount\tflags\n");
681 if (opt_list == 2)
682 printf("offset\tflags\n");
683
684 walk_addr_ranges();
685
686 if (opt_list == 1)
687 show_page_range(0, 0); /* drain the buffer */
688
689 if (opt_no_summary)
690 return 0;
691
692 if (opt_list)
693 printf("\n\n");
694
695 show_summary();
696
697 return 0;
698}
diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt
index ce72c0fe6177..600a304a828c 100644
--- a/Documentation/vm/pagemap.txt
+++ b/Documentation/vm/pagemap.txt
@@ -12,9 +12,9 @@ There are three components to pagemap:
    value for each virtual page, containing the following data (from
    fs/proc/task_mmu.c, above pagemap_read):
 
-    * Bits 0-55  page frame number (PFN) if present
+    * Bits 0-54  page frame number (PFN) if present
     * Bits 0-4   swap type if swapped
-    * Bits 5-55  swap offset if swapped
+    * Bits 5-54  swap offset if swapped
     * Bits 55-60 page shift (page size = 1<<page shift)
     * Bit  61    reserved for future use
     * Bit  62    page swapped
@@ -36,7 +36,7 @@ There are three components to pagemap:
  * /proc/kpageflags.  This file contains a 64-bit set of flags for each
    page, indexed by PFN.
 
-   The flags are (from fs/proc/proc_misc, above kpageflags_read):
+   The flags are (from fs/proc/page.c, above kpageflags_read):
 
      0. LOCKED
      1. ERROR
@@ -49,6 +49,68 @@ There are three components to pagemap:
      8. WRITEBACK
      9. RECLAIM
     10. BUDDY
+    11. MMAP
+    12. ANON
+    13. SWAPCACHE
+    14. SWAPBACKED
+    15. COMPOUND_HEAD
+    16. COMPOUND_TAIL
+    17. HUGE
+    18. UNEVICTABLE
+    20. NOPAGE
+
+Short descriptions of the page flags:
+
+ 0. LOCKED
+    page is being locked for exclusive access, eg. by undergoing read/write IO
+
+ 7. SLAB
+    page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
+    When compound page is used, SLUB/SLQB will only set this flag on the head
+    page; SLOB will not flag it at all.
+
+10. BUDDY
+    a free memory block managed by the buddy system allocator
+    The buddy system organizes free memory in blocks of various orders.
+    An order N block has 2^N physically contiguous pages, with the BUDDY flag
+    set for and _only_ for the first page.
+
+15. COMPOUND_HEAD
+16. COMPOUND_TAIL
+    A compound page with order N consists of 2^N physically contiguous pages.
+    A compound page with order 2 takes the form of "HTTT", where H denotes its
+    head page and T denotes its tail page(s).  The major consumers of compound
+    pages are hugeTLB pages (Documentation/vm/hugetlbpage.txt), the SLUB etc.
+    memory allocators and various device drivers. However in this interface,
+    only huge/giga pages are made visible to end users.
+17. HUGE
+    this is an integral part of a HugeTLB page
+
+20. NOPAGE
+    no page frame exists at the requested address
+
+    [IO related page flags]
+ 1. ERROR     IO error occurred
+ 3. UPTODATE  page has up-to-date data
+              ie. for file backed page: (in-memory data revision >= on-disk one)
+ 4. DIRTY     page has been written to, hence contains new data
+              ie. for file backed page: (in-memory data revision > on-disk one)
+ 8. WRITEBACK page is being synced to disk
+
+    [LRU related page flags]
+ 5. LRU         page is in one of the LRU lists
+ 6. ACTIVE      page is in the active LRU list
+18. UNEVICTABLE page is in the unevictable (non-)LRU list
+               It is somehow pinned and not a candidate for LRU page reclaims,
+               eg. ramfs pages, shmctl(SHM_LOCK) and mlock() memory segments
+ 2. REFERENCED  page has been referenced since last LRU list enqueue/requeue
+ 9. RECLAIM     page will be reclaimed soon after its pageout IO completed
+11. MMAP        a memory mapped page
+12. ANON        a memory mapped page that is not part of a file
+13. SWAPCACHE   page is mapped to swap space, ie. has an associated swap entry
+14. SWAPBACKED  page is backed by swap/RAM
+
+The page-types tool in this directory can be used to query the above flags.
 
 Using pagemap to do something useful:
 