aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorJames Morris <jmorris@macbook.(none)>2009-12-03 01:33:40 -0500
committerJames Morris <jmorris@macbook.(none)>2009-12-03 01:33:40 -0500
commitc84d6efd363a3948eb32ec40d46bab6338580454 (patch)
tree3ba7ac46e6626fe8ac843834588609eb6ccee5c6 /Documentation
parent7539cf4b92be4aecc573ea962135f246a7a33401 (diff)
parent22763c5cf3690a681551162c15d34d935308c8d7 (diff)
Merge branch 'master' into next
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-bus-pci-devices-cciss28
-rw-r--r--Documentation/ABI/testing/sysfs-class-uwb_rc-wusbhc (renamed from Documentation/ABI/testing/sysfs-class-usb_host)4
-rw-r--r--Documentation/ABI/testing/sysfs-devices-cache_disable18
-rw-r--r--Documentation/ABI/testing/sysfs-devices-system-cpu156
-rw-r--r--Documentation/SubmittingPatches2
-rw-r--r--Documentation/arm/tcm.txt10
-rw-r--r--Documentation/cgroups/cgroups.txt11
-rw-r--r--Documentation/connector/cn_test.c2
-rw-r--r--Documentation/connector/connector.txt8
-rw-r--r--Documentation/cputopology.txt47
-rw-r--r--Documentation/debugging-via-ohci1394.txt8
-rw-r--r--Documentation/fb/framebuffer.txt6
-rw-r--r--Documentation/feature-removal-schedule.txt38
-rw-r--r--Documentation/filesystems/caching/fscache.txt110
-rw-r--r--Documentation/filesystems/caching/netfs-api.txt21
-rw-r--r--Documentation/filesystems/ext3.txt16
-rw-r--r--Documentation/filesystems/ext4.txt21
-rw-r--r--Documentation/filesystems/ocfs2.txt6
-rw-r--r--Documentation/filesystems/proc.txt1
-rw-r--r--Documentation/filesystems/vfat.txt2
-rw-r--r--Documentation/flexible-arrays.txt43
-rw-r--r--Documentation/hwmon/ltc42157
-rw-r--r--Documentation/hwmon/ltc42457
-rw-r--r--Documentation/hwmon/sysfs-interface57
-rw-r--r--Documentation/i2c/busses/i2c-piix42
-rw-r--r--Documentation/i2c/instantiating-devices2
-rw-r--r--Documentation/infiniband/user_mad.txt4
-rw-r--r--Documentation/infiniband/user_verbs.txt2
-rw-r--r--Documentation/isdn/INTERFACE.CAPI83
-rw-r--r--Documentation/kernel-parameters.txt1
-rw-r--r--Documentation/lguest/lguest.c1
-rw-r--r--Documentation/misc-devices/eeprom (renamed from Documentation/i2c/chips/eeprom)0
-rw-r--r--Documentation/misc-devices/max6875 (renamed from Documentation/i2c/chips/max6875)6
-rw-r--r--Documentation/networking/pktgen.txt8
-rw-r--r--Documentation/networking/timestamping/timestamping.c2
-rw-r--r--Documentation/scsi/hptiop.txt21
-rw-r--r--Documentation/slow-work.txt160
-rw-r--r--Documentation/sound/alsa/ALSA-Configuration.txt2
-rw-r--r--Documentation/sound/alsa/HD-Audio-Models.txt2
-rw-r--r--Documentation/thermal/sysfs-api.txt389
-rw-r--r--Documentation/trace/ftrace.txt2
-rw-r--r--Documentation/vm/hwpoison.txt136
-rw-r--r--Documentation/vm/ksm.txt13
-rw-r--r--Documentation/vm/page-types.c304
-rw-r--r--Documentation/vm/pagemap.txt8
-rw-r--r--Documentation/w1/masters/ds24826
46 files changed, 1370 insertions, 413 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss b/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss
index 0a92a7c93a62..4f29e5f1ebfa 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss
+++ b/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss
@@ -31,3 +31,31 @@ Date: March 2009
31Kernel Version: 2.6.30 31Kernel Version: 2.6.30
32Contact: iss_storagedev@hp.com 32Contact: iss_storagedev@hp.com
33Description: A symbolic link to /sys/block/cciss!cXdY 33Description: A symbolic link to /sys/block/cciss!cXdY
34
35Where: /sys/bus/pci/devices/<dev>/ccissX/rescan
36Date: August 2009
37Kernel Version: 2.6.31
38Contact: iss_storagedev@hp.com
39Description: Kicks of a rescan of the controller to discover logical
40 drive topology changes.
41
42Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/lunid
43Date: August 2009
44Kernel Version: 2.6.31
45Contact: iss_storagedev@hp.com
46Description: Displays the 8-byte LUN ID used to address logical
47 drive Y of controller X.
48
49Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/raid_level
50Date: August 2009
51Kernel Version: 2.6.31
52Contact: iss_storagedev@hp.com
53Description: Displays the RAID level of logical drive Y of
54 controller X.
55
56Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/usage_count
57Date: August 2009
58Kernel Version: 2.6.31
59Contact: iss_storagedev@hp.com
60Description: Displays the usage count (number of opens) of logical drive Y
61 of controller X.
diff --git a/Documentation/ABI/testing/sysfs-class-usb_host b/Documentation/ABI/testing/sysfs-class-uwb_rc-wusbhc
index 46b66ad1f1b4..4e8106f7cfd9 100644
--- a/Documentation/ABI/testing/sysfs-class-usb_host
+++ b/Documentation/ABI/testing/sysfs-class-uwb_rc-wusbhc
@@ -1,4 +1,4 @@
1What: /sys/class/usb_host/usb_hostN/wusb_chid 1What: /sys/class/uwb_rc/uwbN/wusbhc/wusb_chid
2Date: July 2008 2Date: July 2008
3KernelVersion: 2.6.27 3KernelVersion: 2.6.27
4Contact: David Vrabel <david.vrabel@csr.com> 4Contact: David Vrabel <david.vrabel@csr.com>
@@ -9,7 +9,7 @@ Description:
9 9
10 Set an all zero CHID to stop the host controller. 10 Set an all zero CHID to stop the host controller.
11 11
12What: /sys/class/usb_host/usb_hostN/wusb_trust_timeout 12What: /sys/class/uwb_rc/uwbN/wusbhc/wusb_trust_timeout
13Date: July 2008 13Date: July 2008
14KernelVersion: 2.6.27 14KernelVersion: 2.6.27
15Contact: David Vrabel <david.vrabel@csr.com> 15Contact: David Vrabel <david.vrabel@csr.com>
diff --git a/Documentation/ABI/testing/sysfs-devices-cache_disable b/Documentation/ABI/testing/sysfs-devices-cache_disable
deleted file mode 100644
index 175bb4f70512..000000000000
--- a/Documentation/ABI/testing/sysfs-devices-cache_disable
+++ /dev/null
@@ -1,18 +0,0 @@
1What: /sys/devices/system/cpu/cpu*/cache/index*/cache_disable_X
2Date: August 2008
3KernelVersion: 2.6.27
4Contact: mark.langsdorf@amd.com
5Description: These files exist in every cpu's cache index directories.
6 There are currently 2 cache_disable_# files in each
7 directory. Reading from these files on a supported
8 processor will return that cache disable index value
9 for that processor and node. Writing to one of these
10 files will cause the specificed cache index to be disabled.
11
12 Currently, only AMD Family 10h Processors support cache index
13 disable, and only for their L3 caches. See the BIOS and
14 Kernel Developer's Guide at
15 http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/31116-Public-GH-BKDG_3.20_2-4-09.pdf
16 for formatting information and other details on the
17 cache index disable.
18Users: joachim.deguara@amd.com
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
new file mode 100644
index 000000000000..a703b9e9aeb9
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -0,0 +1,156 @@
1What: /sys/devices/system/cpu/
2Date: pre-git history
3Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
4Description:
5 A collection of both global and individual CPU attributes
6
7 Individual CPU attributes are contained in subdirectories
8 named by the kernel's logical CPU number, e.g.:
9
10 /sys/devices/system/cpu/cpu#/
11
12What: /sys/devices/system/cpu/sched_mc_power_savings
13 /sys/devices/system/cpu/sched_smt_power_savings
14Date: June 2006
15Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
16Description: Discover and adjust the kernel's multi-core scheduler support.
17
18 Possible values are:
19
20 0 - No power saving load balance (default value)
21 1 - Fill one thread/core/package first for long running threads
22 2 - Also bias task wakeups to semi-idle cpu package for power
23 savings
24
25 sched_mc_power_savings is dependent upon SCHED_MC, which is
26 itself architecture dependent.
27
28 sched_smt_power_savings is dependent upon SCHED_SMT, which
29 is itself architecture dependent.
30
31 The two files are independent of each other. It is possible
32 that one file may be present without the other.
33
34 Introduced by git commit 5c45bf27.
35
36
37What: /sys/devices/system/cpu/kernel_max
38 /sys/devices/system/cpu/offline
39 /sys/devices/system/cpu/online
40 /sys/devices/system/cpu/possible
41 /sys/devices/system/cpu/present
42Date: December 2008
43Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
44Description: CPU topology files that describe kernel limits related to
45 hotplug. Briefly:
46
47 kernel_max: the maximum cpu index allowed by the kernel
48 configuration.
49
50 offline: cpus that are not online because they have been
51 HOTPLUGGED off or exceed the limit of cpus allowed by the
52 kernel configuration (kernel_max above).
53
54 online: cpus that are online and being scheduled.
55
56 possible: cpus that have been allocated resources and can be
57 brought online if they are present.
58
59 present: cpus that have been identified as being present in
60 the system.
61
62 See Documentation/cputopology.txt for more information.
63
64
65
66What: /sys/devices/system/cpu/cpu#/node
67Date: October 2009
68Contact: Linux memory management mailing list <linux-mm@kvack.org>
69Description: Discover NUMA node a CPU belongs to
70
71 When CONFIG_NUMA is enabled, a symbolic link that points
72 to the corresponding NUMA node directory.
73
74 For example, the following symlink is created for cpu42
75 in NUMA node 2:
76
77 /sys/devices/system/cpu/cpu42/node2 -> ../../node/node2
78
79
80What: /sys/devices/system/cpu/cpu#/topology/core_id
81 /sys/devices/system/cpu/cpu#/topology/core_siblings
82 /sys/devices/system/cpu/cpu#/topology/core_siblings_list
83 /sys/devices/system/cpu/cpu#/topology/physical_package_id
84 /sys/devices/system/cpu/cpu#/topology/thread_siblings
85 /sys/devices/system/cpu/cpu#/topology/thread_siblings_list
86Date: December 2008
87Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
88Description: CPU topology files that describe a logical CPU's relationship
89 to other cores and threads in the same physical package.
90
91 One cpu# directory is created per logical CPU in the system,
92 e.g. /sys/devices/system/cpu/cpu42/.
93
94 Briefly, the files above are:
95
96 core_id: the CPU core ID of cpu#. Typically it is the
97 hardware platform's identifier (rather than the kernel's).
98 The actual value is architecture and platform dependent.
99
100 core_siblings: internal kernel map of cpu#'s hardware threads
101 within the same physical_package_id.
102
103 core_siblings_list: human-readable list of the logical CPU
104 numbers within the same physical_package_id as cpu#.
105
106 physical_package_id: physical package id of cpu#. Typically
107 corresponds to a physical socket number, but the actual value
108 is architecture and platform dependent.
109
110 thread_siblings: internel kernel map of cpu#'s hardware
111 threads within the same core as cpu#
112
113 thread_siblings_list: human-readable list of cpu#'s hardware
114 threads within the same core as cpu#
115
116 See Documentation/cputopology.txt for more information.
117
118
119What: /sys/devices/system/cpu/cpuidle/current_driver
120 /sys/devices/system/cpu/cpuidle/current_governer_ro
121Date: September 2007
122Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
123Description: Discover cpuidle policy and mechanism
124
125 Various CPUs today support multiple idle levels that are
126 differentiated by varying exit latencies and power
127 consumption during idle.
128
129 Idle policy (governor) is differentiated from idle mechanism
130 (driver)
131
132 current_driver: displays current idle mechanism
133
134 current_governor_ro: displays current idle policy
135
136 See files in Documentation/cpuidle/ for more information.
137
138
139What: /sys/devices/system/cpu/cpu*/cache/index*/cache_disable_X
140Date: August 2008
141KernelVersion: 2.6.27
142Contact: mark.langsdorf@amd.com
143Description: These files exist in every cpu's cache index directories.
144 There are currently 2 cache_disable_# files in each
145 directory. Reading from these files on a supported
146 processor will return that cache disable index value
147 for that processor and node. Writing to one of these
148 files will cause the specificed cache index to be disabled.
149
150 Currently, only AMD Family 10h Processors support cache index
151 disable, and only for their L3 caches. See the BIOS and
152 Kernel Developer's Guide at
153 http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/31116-Public-GH-BKDG_3.20_2-4-09.pdf
154 for formatting information and other details on the
155 cache index disable.
156Users: joachim.deguara@amd.com
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index b7f9d3b4bbf6..72651f788f4e 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -232,7 +232,7 @@ your e-mail client so that it sends your patches untouched.
232When sending patches to Linus, always follow step #7. 232When sending patches to Linus, always follow step #7.
233 233
234Large changes are not appropriate for mailing lists, and some 234Large changes are not appropriate for mailing lists, and some
235maintainers. If your patch, uncompressed, exceeds 40 kB in size, 235maintainers. If your patch, uncompressed, exceeds 300 kB in size,
236it is preferred that you store your patch on an Internet-accessible 236it is preferred that you store your patch on an Internet-accessible
237server, and provide instead a URL (link) pointing to your patch. 237server, and provide instead a URL (link) pointing to your patch.
238 238
diff --git a/Documentation/arm/tcm.txt b/Documentation/arm/tcm.txt
index 074f4be6667f..77fd9376e6d7 100644
--- a/Documentation/arm/tcm.txt
+++ b/Documentation/arm/tcm.txt
@@ -29,11 +29,13 @@ TCM location and size. Notice that this is not a MMU table: you
29actually move the physical location of the TCM around. At the 29actually move the physical location of the TCM around. At the
30place you put it, it will mask any underlying RAM from the 30place you put it, it will mask any underlying RAM from the
31CPU so it is usually wise not to overlap any physical RAM with 31CPU so it is usually wise not to overlap any physical RAM with
32the TCM. The TCM memory exists totally outside the MMU and will 32the TCM.
33override any MMU mappings.
34 33
35Code executing inside the ITCM does not "see" any MMU mappings 34The TCM memory can then be remapped to another address again using
36and e.g. register accesses must be made to physical addresses. 35the MMU, but notice that the TCM if often used in situations where
36the MMU is turned off. To avoid confusion the current Linux
37implementation will map the TCM 1 to 1 from physical to virtual
38memory in the location specified by the machine.
37 39
38TCM is used for a few things: 40TCM is used for a few things:
39 41
diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 455d4e6d346d..0b33bfe7dde9 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -227,7 +227,14 @@ as the path relative to the root of the cgroup file system.
227Each cgroup is represented by a directory in the cgroup file system 227Each cgroup is represented by a directory in the cgroup file system
228containing the following files describing that cgroup: 228containing the following files describing that cgroup:
229 229
230 - tasks: list of tasks (by pid) attached to that cgroup 230 - tasks: list of tasks (by pid) attached to that cgroup. This list
231 is not guaranteed to be sorted. Writing a thread id into this file
232 moves the thread into this cgroup.
233 - cgroup.procs: list of tgids in the cgroup. This list is not
234 guaranteed to be sorted or free of duplicate tgids, and userspace
235 should sort/uniquify the list if this property is required.
236 Writing a tgid into this file moves all threads with that tgid into
237 this cgroup.
231 - notify_on_release flag: run the release agent on exit? 238 - notify_on_release flag: run the release agent on exit?
232 - release_agent: the path to use for release notifications (this file 239 - release_agent: the path to use for release notifications (this file
233 exists in the top cgroup only) 240 exists in the top cgroup only)
@@ -374,7 +381,7 @@ Now you want to do something with this cgroup.
374 381
375In this directory you can find several files: 382In this directory you can find several files:
376# ls 383# ls
377notify_on_release tasks 384cgroup.procs notify_on_release tasks
378(plus whatever files added by the attached subsystems) 385(plus whatever files added by the attached subsystems)
379 386
380Now attach your shell to this cgroup: 387Now attach your shell to this cgroup:
diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c
index 1711adc33373..b07add3467f1 100644
--- a/Documentation/connector/cn_test.c
+++ b/Documentation/connector/cn_test.c
@@ -34,7 +34,7 @@ static char cn_test_name[] = "cn_test";
34static struct sock *nls; 34static struct sock *nls;
35static struct timer_list cn_test_timer; 35static struct timer_list cn_test_timer;
36 36
37static void cn_test_callback(struct cn_msg *msg) 37static void cn_test_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp)
38{ 38{
39 pr_info("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n", 39 pr_info("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n",
40 __func__, jiffies, msg->id.idx, msg->id.val, 40 __func__, jiffies, msg->id.idx, msg->id.val,
diff --git a/Documentation/connector/connector.txt b/Documentation/connector/connector.txt
index 81e6bf6ead57..78c9466a9aa8 100644
--- a/Documentation/connector/connector.txt
+++ b/Documentation/connector/connector.txt
@@ -23,7 +23,7 @@ handling, etc... The Connector driver allows any kernelspace agents to use
23netlink based networking for inter-process communication in a significantly 23netlink based networking for inter-process communication in a significantly
24easier way: 24easier way:
25 25
26int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); 26int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct cn_msg *, struct netlink_skb_parms *));
27void cn_netlink_send(struct cn_msg *msg, u32 __group, int gfp_mask); 27void cn_netlink_send(struct cn_msg *msg, u32 __group, int gfp_mask);
28 28
29struct cb_id 29struct cb_id
@@ -53,15 +53,15 @@ struct cn_msg
53Connector interfaces. 53Connector interfaces.
54/*****************************************/ 54/*****************************************/
55 55
56int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); 56int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct cn_msg *, struct netlink_skb_parms *));
57 57
58 Registers new callback with connector core. 58 Registers new callback with connector core.
59 59
60 struct cb_id *id - unique connector's user identifier. 60 struct cb_id *id - unique connector's user identifier.
61 It must be registered in connector.h for legal in-kernel users. 61 It must be registered in connector.h for legal in-kernel users.
62 char *name - connector's callback symbolic name. 62 char *name - connector's callback symbolic name.
63 void (*callback) (void *) - connector's callback. 63 void (*callback) (struct cn..) - connector's callback.
64 Argument must be dereferenced to struct cn_msg *. 64 cn_msg and the sender's credentials
65 65
66 66
67void cn_del_callback(struct cb_id *id); 67void cn_del_callback(struct cb_id *id);
diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt
index b41f3e58aefa..f1c5c4bccd3e 100644
--- a/Documentation/cputopology.txt
+++ b/Documentation/cputopology.txt
@@ -1,15 +1,28 @@
1 1
2Export cpu topology info via sysfs. Items (attributes) are similar 2Export CPU topology info via sysfs. Items (attributes) are similar
3to /proc/cpuinfo. 3to /proc/cpuinfo.
4 4
51) /sys/devices/system/cpu/cpuX/topology/physical_package_id: 51) /sys/devices/system/cpu/cpuX/topology/physical_package_id:
6represent the physical package id of cpu X; 6
7 physical package id of cpuX. Typically corresponds to a physical
8 socket number, but the actual value is architecture and platform
9 dependent.
10
72) /sys/devices/system/cpu/cpuX/topology/core_id: 112) /sys/devices/system/cpu/cpuX/topology/core_id:
8represent the cpu core id to cpu X; 12
13 the CPU core ID of cpuX. Typically it is the hardware platform's
14 identifier (rather than the kernel's). The actual value is
15 architecture and platform dependent.
16
93) /sys/devices/system/cpu/cpuX/topology/thread_siblings: 173) /sys/devices/system/cpu/cpuX/topology/thread_siblings:
10represent the thread siblings to cpu X in the same core; 18
19 internel kernel map of cpuX's hardware threads within the same
20 core as cpuX
21
114) /sys/devices/system/cpu/cpuX/topology/core_siblings: 224) /sys/devices/system/cpu/cpuX/topology/core_siblings:
12represent the thread siblings to cpu X in the same physical package; 23
24 internal kernel map of cpuX's hardware threads within the same
25 physical_package_id.
13 26
14To implement it in an architecture-neutral way, a new source file, 27To implement it in an architecture-neutral way, a new source file,
15drivers/base/topology.c, is to export the 4 attributes. 28drivers/base/topology.c, is to export the 4 attributes.
@@ -32,32 +45,32 @@ not defined by include/asm-XXX/topology.h:
323) thread_siblings: just the given CPU 453) thread_siblings: just the given CPU
334) core_siblings: just the given CPU 464) core_siblings: just the given CPU
34 47
35Additionally, cpu topology information is provided under 48Additionally, CPU topology information is provided under
36/sys/devices/system/cpu and includes these files. The internal 49/sys/devices/system/cpu and includes these files. The internal
37source for the output is in brackets ("[]"). 50source for the output is in brackets ("[]").
38 51
39 kernel_max: the maximum cpu index allowed by the kernel configuration. 52 kernel_max: the maximum CPU index allowed by the kernel configuration.
40 [NR_CPUS-1] 53 [NR_CPUS-1]
41 54
42 offline: cpus that are not online because they have been 55 offline: CPUs that are not online because they have been
43 HOTPLUGGED off (see cpu-hotplug.txt) or exceed the limit 56 HOTPLUGGED off (see cpu-hotplug.txt) or exceed the limit
44 of cpus allowed by the kernel configuration (kernel_max 57 of CPUs allowed by the kernel configuration (kernel_max
45 above). [~cpu_online_mask + cpus >= NR_CPUS] 58 above). [~cpu_online_mask + cpus >= NR_CPUS]
46 59
47 online: cpus that are online and being scheduled [cpu_online_mask] 60 online: CPUs that are online and being scheduled [cpu_online_mask]
48 61
49 possible: cpus that have been allocated resources and can be 62 possible: CPUs that have been allocated resources and can be
50 brought online if they are present. [cpu_possible_mask] 63 brought online if they are present. [cpu_possible_mask]
51 64
52 present: cpus that have been identified as being present in the 65 present: CPUs that have been identified as being present in the
53 system. [cpu_present_mask] 66 system. [cpu_present_mask]
54 67
55The format for the above output is compatible with cpulist_parse() 68The format for the above output is compatible with cpulist_parse()
56[see <linux/cpumask.h>]. Some examples follow. 69[see <linux/cpumask.h>]. Some examples follow.
57 70
58In this example, there are 64 cpus in the system but cpus 32-63 exceed 71In this example, there are 64 CPUs in the system but cpus 32-63 exceed
59the kernel max which is limited to 0..31 by the NR_CPUS config option 72the kernel max which is limited to 0..31 by the NR_CPUS config option
60being 32. Note also that cpus 2 and 4-31 are not online but could be 73being 32. Note also that CPUs 2 and 4-31 are not online but could be
61brought online as they are both present and possible. 74brought online as they are both present and possible.
62 75
63 kernel_max: 31 76 kernel_max: 31
@@ -67,8 +80,8 @@ brought online as they are both present and possible.
67 present: 0-31 80 present: 0-31
68 81
69In this example, the NR_CPUS config option is 128, but the kernel was 82In this example, the NR_CPUS config option is 128, but the kernel was
70started with possible_cpus=144. There are 4 cpus in the system and cpu2 83started with possible_cpus=144. There are 4 CPUs in the system and cpu2
71was manually taken offline (and is the only cpu that can be brought 84was manually taken offline (and is the only CPU that can be brought
72online.) 85online.)
73 86
74 kernel_max: 127 87 kernel_max: 127
@@ -78,4 +91,4 @@ online.)
78 present: 0-3 91 present: 0-3
79 92
80See cpu-hotplug.txt for the possible_cpus=NUM kernel start parameter 93See cpu-hotplug.txt for the possible_cpus=NUM kernel start parameter
81as well as more information on the various cpumask's. 94as well as more information on the various cpumasks.
diff --git a/Documentation/debugging-via-ohci1394.txt b/Documentation/debugging-via-ohci1394.txt
index 59a91e5c6909..611f5a5499b1 100644
--- a/Documentation/debugging-via-ohci1394.txt
+++ b/Documentation/debugging-via-ohci1394.txt
@@ -64,14 +64,14 @@ be used to view the printk buffer of a remote machine, even with live update.
64 64
65Bernhard Kaindl enhanced firescope to support accessing 64-bit machines 65Bernhard Kaindl enhanced firescope to support accessing 64-bit machines
66from 32-bit firescope and vice versa: 66from 32-bit firescope and vice versa:
67- ftp://ftp.suse.de/private/bk/firewire/tools/firescope-0.2.2.tar.bz2 67- http://halobates.de/firewire/firescope-0.2.2.tar.bz2
68 68
69and he implemented fast system dump (alpha version - read README.txt): 69and he implemented fast system dump (alpha version - read README.txt):
70- ftp://ftp.suse.de/private/bk/firewire/tools/firedump-0.1.tar.bz2 70- http://halobates.de/firewire/firedump-0.1.tar.bz2
71 71
72There is also a gdb proxy for firewire which allows to use gdb to access 72There is also a gdb proxy for firewire which allows to use gdb to access
73data which can be referenced from symbols found by gdb in vmlinux: 73data which can be referenced from symbols found by gdb in vmlinux:
74- ftp://ftp.suse.de/private/bk/firewire/tools/fireproxy-0.33.tar.bz2 74- http://halobates.de/firewire/fireproxy-0.33.tar.bz2
75 75
76The latest version of this gdb proxy (fireproxy-0.34) can communicate (not 76The latest version of this gdb proxy (fireproxy-0.34) can communicate (not
77yet stable) with kgdb over an memory-based communication module (kgdbom). 77yet stable) with kgdb over an memory-based communication module (kgdbom).
@@ -178,7 +178,7 @@ Step-by-step instructions for using firescope with early OHCI initialization:
178 178
179Notes 179Notes
180----- 180-----
181Documentation and specifications: ftp://ftp.suse.de/private/bk/firewire/docs 181Documentation and specifications: http://halobates.de/firewire/
182 182
183FireWire is a trademark of Apple Inc. - for more information please refer to: 183FireWire is a trademark of Apple Inc. - for more information please refer to:
184http://en.wikipedia.org/wiki/FireWire 184http://en.wikipedia.org/wiki/FireWire
diff --git a/Documentation/fb/framebuffer.txt b/Documentation/fb/framebuffer.txt
index b3e3a0356839..fe79e3c8847d 100644
--- a/Documentation/fb/framebuffer.txt
+++ b/Documentation/fb/framebuffer.txt
@@ -312,10 +312,8 @@ and to the following documentation:
3128. Mailing list 3128. Mailing list
313--------------- 313---------------
314 314
315There are several frame buffer device related mailing lists at SourceForge: 315There is a frame buffer device related mailing list at kernel.org:
316 - linux-fbdev-announce@lists.sourceforge.net, for announcements, 316linux-fbdev@vger.kernel.org.
317 - linux-fbdev-user@lists.sourceforge.net, for generic user support,
318 - linux-fbdev-devel@lists.sourceforge.net, for project developers.
319 317
320Point your web browser to http://sourceforge.net/projects/linux-fbdev/ for 318Point your web browser to http://sourceforge.net/projects/linux-fbdev/ for
321subscription information and archive browsing. 319subscription information and archive browsing.
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 89a47b5aff07..bc693fffabe0 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -418,6 +418,14 @@ When: 2.6.33
418Why: Should be implemented in userspace, policy daemon. 418Why: Should be implemented in userspace, policy daemon.
419Who: Johannes Berg <johannes@sipsolutions.net> 419Who: Johannes Berg <johannes@sipsolutions.net>
420 420
421---------------------------
422
423What: CONFIG_INOTIFY
424When: 2.6.33
425Why: last user (audit) will be converted to the newer more generic
426 and more easily maintained fsnotify subsystem
427Who: Eric Paris <eparis@redhat.com>
428
421---------------------------- 429----------------------------
422 430
423What: lock_policy_rwsem_* and unlock_policy_rwsem_* will not be 431What: lock_policy_rwsem_* and unlock_policy_rwsem_* will not be
@@ -451,3 +459,33 @@ Why: OSS sound_core grabs all legacy minors (0-255) of SOUND_MAJOR
451 will also allow making ALSA OSS emulation independent of 459 will also allow making ALSA OSS emulation independent of
452 sound_core. The dependency will be broken then too. 460 sound_core. The dependency will be broken then too.
453Who: Tejun Heo <tj@kernel.org> 461Who: Tejun Heo <tj@kernel.org>
462
463----------------------------
464
465What: Support for VMware's guest paravirtuliazation technique [VMI] will be
466 dropped.
467When: 2.6.37 or earlier.
468Why: With the recent innovations in CPU hardware acceleration technologies
469 from Intel and AMD, VMware ran a few experiments to compare these
470 techniques to guest paravirtualization technique on VMware's platform.
471 These hardware assisted virtualization techniques have outperformed the
472 performance benefits provided by VMI in most of the workloads. VMware
473 expects that these hardware features will be ubiquitous in a couple of
474 years, as a result, VMware has started a phased retirement of this
475 feature from the hypervisor. We will be removing this feature from the
476 Kernel too. Right now we are targeting 2.6.37 but can retire earlier if
477 technical reasons (read opportunity to remove major chunk of pvops)
478 arise.
479
480 Please note that VMI has always been an optimization and non-VMI kernels
481 still work fine on VMware's platform.
482 Latest versions of VMware's product which support VMI are,
483 Workstation 7.0 and VSphere 4.0 on ESX side, future maintainence
484 releases for these products will continue supporting VMI.
485
486 For more details about VMI retirement take a look at this,
487 http://blogs.vmware.com/guestosguide/2009/09/vmi-retirement.html
488
489Who: Alok N Kataria <akataria@vmware.com>
490
491----------------------------
diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 9e94b9491d89..a91e2e2095b0 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -235,6 +235,7 @@ proc files.
235 neg=N Number of negative lookups made 235 neg=N Number of negative lookups made
236 pos=N Number of positive lookups made 236 pos=N Number of positive lookups made
237 crt=N Number of objects created by lookup 237 crt=N Number of objects created by lookup
238 tmo=N Number of lookups timed out and requeued
238 Updates n=N Number of update cookie requests seen 239 Updates n=N Number of update cookie requests seen
239 nul=N Number of upd reqs given a NULL parent 240 nul=N Number of upd reqs given a NULL parent
240 run=N Number of upd reqs granted CPU time 241 run=N Number of upd reqs granted CPU time
@@ -250,8 +251,10 @@ proc files.
250 ok=N Number of successful alloc reqs 251 ok=N Number of successful alloc reqs
251 wt=N Number of alloc reqs that waited on lookup completion 252 wt=N Number of alloc reqs that waited on lookup completion
252 nbf=N Number of alloc reqs rejected -ENOBUFS 253 nbf=N Number of alloc reqs rejected -ENOBUFS
254 int=N Number of alloc reqs aborted -ERESTARTSYS
253 ops=N Number of alloc reqs submitted 255 ops=N Number of alloc reqs submitted
254 owt=N Number of alloc reqs waited for CPU time 256 owt=N Number of alloc reqs waited for CPU time
257 abt=N Number of alloc reqs aborted due to object death
255 Retrvls n=N Number of retrieval (read) requests seen 258 Retrvls n=N Number of retrieval (read) requests seen
256 ok=N Number of successful retr reqs 259 ok=N Number of successful retr reqs
257 wt=N Number of retr reqs that waited on lookup completion 260 wt=N Number of retr reqs that waited on lookup completion
@@ -261,6 +264,7 @@ proc files.
261 oom=N Number of retr reqs failed -ENOMEM 264 oom=N Number of retr reqs failed -ENOMEM
262 ops=N Number of retr reqs submitted 265 ops=N Number of retr reqs submitted
263 owt=N Number of retr reqs waited for CPU time 266 owt=N Number of retr reqs waited for CPU time
267 abt=N Number of retr reqs aborted due to object death
264 Stores n=N Number of storage (write) requests seen 268 Stores n=N Number of storage (write) requests seen
265 ok=N Number of successful store reqs 269 ok=N Number of successful store reqs
266 agn=N Number of store reqs on a page already pending storage 270 agn=N Number of store reqs on a page already pending storage
@@ -268,12 +272,37 @@ proc files.
268 oom=N Number of store reqs failed -ENOMEM 272 oom=N Number of store reqs failed -ENOMEM
269 ops=N Number of store reqs submitted 273 ops=N Number of store reqs submitted
270 run=N Number of store reqs granted CPU time 274 run=N Number of store reqs granted CPU time
275 pgs=N Number of pages given store req processing time
276 rxd=N Number of store reqs deleted from tracking tree
277 olm=N Number of store reqs over store limit
278 VmScan nos=N Number of release reqs against pages with no pending store
279 gon=N Number of release reqs against pages stored by time lock granted
280 bsy=N Number of release reqs ignored due to in-progress store
281 can=N Number of page stores cancelled due to release req
271 Ops pend=N Number of times async ops added to pending queues 282 Ops pend=N Number of times async ops added to pending queues
272 run=N Number of times async ops given CPU time 283 run=N Number of times async ops given CPU time
273 enq=N Number of times async ops queued for processing 284 enq=N Number of times async ops queued for processing
285 can=N Number of async ops cancelled
286 rej=N Number of async ops rejected due to object lookup/create failure
274 dfr=N Number of async ops queued for deferred release 287 dfr=N Number of async ops queued for deferred release
275 rel=N Number of async ops released 288 rel=N Number of async ops released
276 gc=N Number of deferred-release async ops garbage collected 289 gc=N Number of deferred-release async ops garbage collected
290 CacheOp alo=N Number of in-progress alloc_object() cache ops
291 luo=N Number of in-progress lookup_object() cache ops
292 luc=N Number of in-progress lookup_complete() cache ops
293 gro=N Number of in-progress grab_object() cache ops
294 upo=N Number of in-progress update_object() cache ops
295 dro=N Number of in-progress drop_object() cache ops
296 pto=N Number of in-progress put_object() cache ops
297 syn=N Number of in-progress sync_cache() cache ops
298 atc=N Number of in-progress attr_changed() cache ops
299 rap=N Number of in-progress read_or_alloc_page() cache ops
300 ras=N Number of in-progress read_or_alloc_pages() cache ops
301 alp=N Number of in-progress allocate_page() cache ops
302 als=N Number of in-progress allocate_pages() cache ops
303 wrp=N Number of in-progress write_page() cache ops
304 ucp=N Number of in-progress uncache_page() cache ops
305 dsp=N Number of in-progress dissociate_pages() cache ops
277 306
278 307
279 (*) /proc/fs/fscache/histogram 308 (*) /proc/fs/fscache/histogram
@@ -299,6 +328,87 @@ proc files.
299 jiffy range covered, and the SECS field the equivalent number of seconds. 328 jiffy range covered, and the SECS field the equivalent number of seconds.
300 329
301 330
331===========
332OBJECT LIST
333===========
334
335If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
336list of all the objects currently allocated and allow them to be viewed
337through:
338
339 /proc/fs/fscache/objects
340
341This will look something like:
342
343 [root@andromeda ~]# head /proc/fs/fscache/objects
344 OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
345 ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
346 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 8 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
347 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 8 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
348
349where the first set of columns before the '|' describe the object:
350
351 COLUMN DESCRIPTION
352 ======= ===============================================================
353 OBJECT Object debugging ID (appears as OBJ%x in some debug messages)
354 PARENT Debugging ID of parent object
355 STAT Object state
356 CHLDN Number of child objects of this object
357 OPS Number of outstanding operations on this object
358 OOP Number of outstanding child object management operations
359 IPR
360 EX Number of outstanding exclusive operations
361 READS Number of outstanding read operations
362 EM Object's event mask
363 EV Events raised on this object
364 F Object flags
365 S Object slow-work work item flags
366
367and the second set of columns describe the object's cookie, if present:
368
369 COLUMN DESCRIPTION
370 =============== =======================================================
371 NETFS_COOKIE_DEF Name of netfs cookie definition
372 TY Cookie type (IX - index, DT - data, hex - special)
373 FL Cookie flags
374 NETFS_DATA Netfs private data stored in the cookie
375 OBJECT_KEY Object key } 1 column, with separating comma
376 AUX_DATA Object aux data } presence may be configured
377
378The data shown may be filtered by attaching the a key to an appropriate keyring
379before viewing the file. Something like:
380
381 keyctl add user fscache:objlist <restrictions> @s
382
383where <restrictions> are a selection of the following letters:
384
385 K Show hexdump of object key (don't show if not given)
386 A Show hexdump of object aux data (don't show if not given)
387
388and the following paired letters:
389
390 C Show objects that have a cookie
391 c Show objects that don't have a cookie
392 B Show objects that are busy
393 b Show objects that aren't busy
394 W Show objects that have pending writes
395 w Show objects that don't have pending writes
396 R Show objects that have outstanding reads
397 r Show objects that don't have outstanding reads
398 S Show objects that have slow work queued
399 s Show objects that don't have slow work queued
400
401If neither side of a letter pair is given, then both are implied. For example:
402
403 keyctl add user fscache:objlist KB @s
404
405shows objects that are busy, and lists their object keys, but does not dump
406their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
407not implied.
408
409By default all objects and all fields will be shown.
410
411
302========= 412=========
303DEBUGGING 413DEBUGGING
304========= 414=========
diff --git a/Documentation/filesystems/caching/netfs-api.txt b/Documentation/filesystems/caching/netfs-api.txt
index 2666b1ed5e9e..1902c57b72ef 100644
--- a/Documentation/filesystems/caching/netfs-api.txt
+++ b/Documentation/filesystems/caching/netfs-api.txt
@@ -641,7 +641,7 @@ data file must be retired (see the relinquish cookie function below).
641 641
642Furthermore, note that this does not cancel the asynchronous read or write 642Furthermore, note that this does not cancel the asynchronous read or write
643operation started by the read/alloc and write functions, so the page 643operation started by the read/alloc and write functions, so the page
644invalidation and release functions must use: 644invalidation functions must use:
645 645
646 bool fscache_check_page_write(struct fscache_cookie *cookie, 646 bool fscache_check_page_write(struct fscache_cookie *cookie,
647 struct page *page); 647 struct page *page);
@@ -654,6 +654,25 @@ to see if a page is being written to the cache, and:
654to wait for it to finish if it is. 654to wait for it to finish if it is.
655 655
656 656
657When releasepage() is being implemented, a special FS-Cache function exists to
658manage the heuristics of coping with vmscan trying to eject pages, which may
659conflict with the cache trying to write pages to the cache (which may itself
660need to allocate memory):
661
662 bool fscache_maybe_release_page(struct fscache_cookie *cookie,
663 struct page *page,
664 gfp_t gfp);
665
666This takes the netfs cookie, and the page and gfp arguments as supplied to
667releasepage(). It will return false if the page cannot be released yet for
668some reason and if it returns true, the page has been uncached and can now be
669released.
670
671To make a page available for release, this function may wait for an outstanding
672storage request to complete, or it may attempt to cancel the storage request -
673in which case the page will not be stored in the cache this time.
674
675
657========================== 676==========================
658INDEX AND DATA FILE UPDATE 677INDEX AND DATA FILE UPDATE
659========================== 678==========================
diff --git a/Documentation/filesystems/ext3.txt b/Documentation/filesystems/ext3.txt
index 570f9bd9be2b..05d5cf1d743f 100644
--- a/Documentation/filesystems/ext3.txt
+++ b/Documentation/filesystems/ext3.txt
@@ -123,10 +123,18 @@ resuid=n The user ID which may use the reserved blocks.
123 123
124sb=n Use alternate superblock at this location. 124sb=n Use alternate superblock at this location.
125 125
126quota 126quota These options are ignored by the filesystem. They
127noquota 127noquota are used only by quota tools to recognize volumes
128grpquota 128grpquota where quota should be turned on. See documentation
129usrquota 129usrquota in the quota-tools package for more details
130 (http://sourceforge.net/projects/linuxquota).
131
132jqfmt=<quota type> These options tell filesystem details about quota
133usrjquota=<file> so that quota information can be properly updated
134grpjquota=<file> during journal replay. They replace the above
135 quota options. See documentation in the quota-tools
136 package for more details
137 (http://sourceforge.net/projects/linuxquota).
130 138
131bh (*) ext3 associates buffer heads to data pages to 139bh (*) ext3 associates buffer heads to data pages to
132nobh (a) cache disk block mapping information 140nobh (a) cache disk block mapping information
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 18b5ec8cea45..6d94e0696f8c 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -134,9 +134,15 @@ ro Mount filesystem read only. Note that ext4 will
134 mount options "ro,noload" can be used to prevent 134 mount options "ro,noload" can be used to prevent
135 writes to the filesystem. 135 writes to the filesystem.
136 136
137journal_checksum Enable checksumming of the journal transactions.
138 This will allow the recovery code in e2fsck and the
139 kernel to detect corruption in the kernel. It is a
140 compatible change and will be ignored by older kernels.
141
137journal_async_commit Commit block can be written to disk without waiting 142journal_async_commit Commit block can be written to disk without waiting
138 for descriptor blocks. If enabled older kernels cannot 143 for descriptor blocks. If enabled older kernels cannot
139 mount the device. 144 mount the device. This will enable 'journal_checksum'
145 internally.
140 146
141journal=update Update the ext4 file system's journal to the current 147journal=update Update the ext4 file system's journal to the current
142 format. 148 format.
@@ -282,9 +288,16 @@ stripe=n Number of filesystem blocks that mballoc will try
282 to use for allocation size and alignment. For RAID5/6 288 to use for allocation size and alignment. For RAID5/6
283 systems this should be the number of data 289 systems this should be the number of data
284 disks * RAID chunk size in file system blocks. 290 disks * RAID chunk size in file system blocks.
285delalloc (*) Deferring block allocation until write-out time. 291
286nodelalloc Disable delayed allocation. Blocks are allocation 292delalloc (*) Defer block allocation until just before ext4
287 when data is copied from user to page cache. 293 writes out the block(s) in question. This
294 allows ext4 to better allocation decisions
295 more efficiently.
296nodelalloc Disable delayed allocation. Blocks are allocated
297 when the data is copied from userspace to the
298 page cache, either via the write(2) system call
299 or when an mmap'ed page which was previously
300 unallocated is written for the first time.
288 301
289max_batch_time=usec Maximum amount of time ext4 should wait for 302max_batch_time=usec Maximum amount of time ext4 should wait for
290 additional filesystem operations to be batch 303 additional filesystem operations to be batch
diff --git a/Documentation/filesystems/ocfs2.txt b/Documentation/filesystems/ocfs2.txt
index c2a0871280a0..c58b9f5ba002 100644
--- a/Documentation/filesystems/ocfs2.txt
+++ b/Documentation/filesystems/ocfs2.txt
@@ -20,15 +20,16 @@ Lots of code taken from ext3 and other projects.
20Authors in alphabetical order: 20Authors in alphabetical order:
21Joel Becker <joel.becker@oracle.com> 21Joel Becker <joel.becker@oracle.com>
22Zach Brown <zach.brown@oracle.com> 22Zach Brown <zach.brown@oracle.com>
23Mark Fasheh <mark.fasheh@oracle.com> 23Mark Fasheh <mfasheh@suse.com>
24Kurt Hackel <kurt.hackel@oracle.com> 24Kurt Hackel <kurt.hackel@oracle.com>
25Tao Ma <tao.ma@oracle.com>
25Sunil Mushran <sunil.mushran@oracle.com> 26Sunil Mushran <sunil.mushran@oracle.com>
26Manish Singh <manish.singh@oracle.com> 27Manish Singh <manish.singh@oracle.com>
28Tiger Yang <tiger.yang@oracle.com>
27 29
28Caveats 30Caveats
29======= 31=======
30Features which OCFS2 does not support yet: 32Features which OCFS2 does not support yet:
31 - quotas
32 - Directory change notification (F_NOTIFY) 33 - Directory change notification (F_NOTIFY)
33 - Distributed Caching (F_SETLEASE/F_GETLEASE/break_lease) 34 - Distributed Caching (F_SETLEASE/F_GETLEASE/break_lease)
34 35
@@ -70,7 +71,6 @@ commit=nrsec (*) Ocfs2 can be told to sync all its data and metadata
70 performance. 71 performance.
71localalloc=8(*) Allows custom localalloc size in MB. If the value is too 72localalloc=8(*) Allows custom localalloc size in MB. If the value is too
72 large, the fs will silently revert it to the default. 73 large, the fs will silently revert it to the default.
73 Localalloc is not enabled for local mounts.
74localflocks This disables cluster aware flock. 74localflocks This disables cluster aware flock.
75inode64 Indicates that Ocfs2 is allowed to create inodes at 75inode64 Indicates that Ocfs2 is allowed to create inodes at
76 any location in the filesystem, including those which 76 any location in the filesystem, including those which
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index b5aee7838a00..2c48f945546b 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1113,7 +1113,6 @@ Table 1-12: Files in /proc/fs/ext4/<devname>
1113.............................................................................. 1113..............................................................................
1114 File Content 1114 File Content
1115 mb_groups details of multiblock allocator buddy cache of free blocks 1115 mb_groups details of multiblock allocator buddy cache of free blocks
1116 mb_history multiblock allocation history
1117.............................................................................. 1116..............................................................................
1118 1117
1119 1118
diff --git a/Documentation/filesystems/vfat.txt b/Documentation/filesystems/vfat.txt
index b58b84b50fa2..eed520fd0c8e 100644
--- a/Documentation/filesystems/vfat.txt
+++ b/Documentation/filesystems/vfat.txt
@@ -102,7 +102,7 @@ shortname=lower|win95|winnt|mixed
102 winnt: emulate the Windows NT rule for display/create. 102 winnt: emulate the Windows NT rule for display/create.
103 mixed: emulate the Windows NT rule for display, 103 mixed: emulate the Windows NT rule for display,
104 emulate the Windows 95 rule for create. 104 emulate the Windows 95 rule for create.
105 Default setting is `lower'. 105 Default setting is `mixed'.
106 106
107tz=UTC -- Interpret timestamps as UTC rather than local time. 107tz=UTC -- Interpret timestamps as UTC rather than local time.
108 This option disables the conversion of timestamps 108 This option disables the conversion of timestamps
diff --git a/Documentation/flexible-arrays.txt b/Documentation/flexible-arrays.txt
index 84eb26808dee..cb8a3a00cc92 100644
--- a/Documentation/flexible-arrays.txt
+++ b/Documentation/flexible-arrays.txt
@@ -1,5 +1,5 @@
1Using flexible arrays in the kernel 1Using flexible arrays in the kernel
2Last updated for 2.6.31 2Last updated for 2.6.32
3Jonathan Corbet <corbet@lwn.net> 3Jonathan Corbet <corbet@lwn.net>
4 4
5Large contiguous memory allocations can be unreliable in the Linux kernel. 5Large contiguous memory allocations can be unreliable in the Linux kernel.
@@ -40,6 +40,13 @@ argument is passed directly to the internal memory allocation calls. With
40the current code, using flags to ask for high memory is likely to lead to 40the current code, using flags to ask for high memory is likely to lead to
41notably unpleasant side effects. 41notably unpleasant side effects.
42 42
43It is also possible to define flexible arrays at compile time with:
44
45 DEFINE_FLEX_ARRAY(name, element_size, total);
46
47This macro will result in a definition of an array with the given name; the
48element size and total will be checked for validity at compile time.
49
43Storing data into a flexible array is accomplished with a call to: 50Storing data into a flexible array is accomplished with a call to:
44 51
45 int flex_array_put(struct flex_array *array, unsigned int element_nr, 52 int flex_array_put(struct flex_array *array, unsigned int element_nr,
@@ -76,16 +83,30 @@ particular element has never been allocated.
76Note that it is possible to get back a valid pointer for an element which 83Note that it is possible to get back a valid pointer for an element which
77has never been stored in the array. Memory for array elements is allocated 84has never been stored in the array. Memory for array elements is allocated
78one page at a time; a single allocation could provide memory for several 85one page at a time; a single allocation could provide memory for several
79adjacent elements. The flexible array code does not know if a specific 86adjacent elements. Flexible array elements are normally initialized to the
80element has been written; it only knows if the associated memory is 87value FLEX_ARRAY_FREE (defined as 0x6c in <linux/poison.h>), so errors
81present. So a flex_array_get() call on an element which was never stored 88involving that number probably result from use of unstored array entries.
82in the array has the potential to return a pointer to random data. If the 89Note that, if array elements are allocated with __GFP_ZERO, they will be
83caller does not have a separate way to know which elements were actually 90initialized to zero and this poisoning will not happen.
84stored, it might be wise, at least, to add GFP_ZERO to the flags argument 91
85to ensure that all elements are zeroed. 92Individual elements in the array can be cleared with:
86 93
87There is no way to remove a single element from the array. It is possible, 94 int flex_array_clear(struct flex_array *array, unsigned int element_nr);
88though, to remove all elements with a call to: 95
96This function will set the given element to FLEX_ARRAY_FREE and return
97zero. If storage for the indicated element is not allocated for the array,
98flex_array_clear() will return -EINVAL instead. Note that clearing an
99element does not release the storage associated with it; to reduce the
100allocated size of an array, call:
101
102 int flex_array_shrink(struct flex_array *array);
103
104The return value will be the number of pages of memory actually freed.
105This function works by scanning the array for pages containing nothing but
106FLEX_ARRAY_FREE bytes, so (1) it can be expensive, and (2) it will not work
107if the array's pages are allocated with __GFP_ZERO.
108
109It is possible to remove all elements of an array with a call to:
89 110
90 void flex_array_free_parts(struct flex_array *array); 111 void flex_array_free_parts(struct flex_array *array);
91 112
diff --git a/Documentation/hwmon/ltc4215 b/Documentation/hwmon/ltc4215
index 2e6a21eb656c..c196a1846259 100644
--- a/Documentation/hwmon/ltc4215
+++ b/Documentation/hwmon/ltc4215
@@ -22,12 +22,13 @@ Usage Notes
22----------- 22-----------
23 23
24This driver does not probe for LTC4215 devices, due to the fact that some 24This driver does not probe for LTC4215 devices, due to the fact that some
25of the possible addresses are unfriendly to probing. You will need to use 25of the possible addresses are unfriendly to probing. You will have to
26the "force" parameter to tell the driver where to find the device. 26instantiate the devices explicitly.
27 27
28Example: the following will load the driver for an LTC4215 at address 0x44 28Example: the following will load the driver for an LTC4215 at address 0x44
29on I2C bus #0: 29on I2C bus #0:
30$ modprobe ltc4215 force=0,0x44 30$ modprobe ltc4215
31$ echo ltc4215 0x44 > /sys/bus/i2c/devices/i2c-0/new_device
31 32
32 33
33Sysfs entries 34Sysfs entries
diff --git a/Documentation/hwmon/ltc4245 b/Documentation/hwmon/ltc4245
index bae7a3adc5d8..02838a47d862 100644
--- a/Documentation/hwmon/ltc4245
+++ b/Documentation/hwmon/ltc4245
@@ -23,12 +23,13 @@ Usage Notes
23----------- 23-----------
24 24
25This driver does not probe for LTC4245 devices, due to the fact that some 25This driver does not probe for LTC4245 devices, due to the fact that some
26of the possible addresses are unfriendly to probing. You will need to use 26of the possible addresses are unfriendly to probing. You will have to
27the "force" parameter to tell the driver where to find the device. 27instantiate the devices explicitly.
28 28
29Example: the following will load the driver for an LTC4245 at address 0x23 29Example: the following will load the driver for an LTC4245 at address 0x23
30on I2C bus #1: 30on I2C bus #1:
31$ modprobe ltc4245 force=1,0x23 31$ modprobe ltc4245
32$ echo ltc4245 0x23 > /sys/bus/i2c/devices/i2c-1/new_device
32 33
33 34
34Sysfs entries 35Sysfs entries
diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface
index dcbd502c8792..82def883361b 100644
--- a/Documentation/hwmon/sysfs-interface
+++ b/Documentation/hwmon/sysfs-interface
@@ -353,10 +353,20 @@ power[1-*]_average Average power use
353 Unit: microWatt 353 Unit: microWatt
354 RO 354 RO
355 355
356power[1-*]_average_interval Power use averaging interval 356power[1-*]_average_interval Power use averaging interval. A poll
357 notification is sent to this file if the
358 hardware changes the averaging interval.
357 Unit: milliseconds 359 Unit: milliseconds
358 RW 360 RW
359 361
362power[1-*]_average_interval_max Maximum power use averaging interval
363 Unit: milliseconds
364 RO
365
366power[1-*]_average_interval_min Minimum power use averaging interval
367 Unit: milliseconds
368 RO
369
360power[1-*]_average_highest Historical average maximum power use 370power[1-*]_average_highest Historical average maximum power use
361 Unit: microWatt 371 Unit: microWatt
362 RO 372 RO
@@ -365,6 +375,18 @@ power[1-*]_average_lowest Historical average minimum power use
365 Unit: microWatt 375 Unit: microWatt
366 RO 376 RO
367 377
378power[1-*]_average_max A poll notification is sent to
379 power[1-*]_average when power use
380 rises above this value.
381 Unit: microWatt
382 RW
383
384power[1-*]_average_min A poll notification is sent to
385 power[1-*]_average when power use
386 sinks below this value.
387 Unit: microWatt
388 RW
389
368power[1-*]_input Instantaneous power use 390power[1-*]_input Instantaneous power use
369 Unit: microWatt 391 Unit: microWatt
370 RO 392 RO
@@ -381,6 +403,39 @@ power[1-*]_reset_history Reset input_highest, input_lowest,
381 average_highest and average_lowest. 403 average_highest and average_lowest.
382 WO 404 WO
383 405
406power[1-*]_accuracy Accuracy of the power meter.
407 Unit: Percent
408 RO
409
410power[1-*]_alarm 1 if the system is drawing more power than the
411 cap allows; 0 otherwise. A poll notification is
412 sent to this file when the power use exceeds the
413 cap. This file only appears if the cap is known
414 to be enforced by hardware.
415 RO
416
417power[1-*]_cap If power use rises above this limit, the
418 system should take action to reduce power use.
419 A poll notification is sent to this file if the
420 cap is changed by the hardware. The *_cap
421 files only appear if the cap is known to be
422 enforced by hardware.
423 Unit: microWatt
424 RW
425
426power[1-*]_cap_hyst Margin of hysteresis built around capping and
427 notification.
428 Unit: microWatt
429 RW
430
431power[1-*]_cap_max Maximum cap that can be set.
432 Unit: microWatt
433 RO
434
435power[1-*]_cap_min Minimum cap that can be set.
436 Unit: microWatt
437 RO
438
384********** 439**********
385* Energy * 440* Energy *
386********** 441**********
diff --git a/Documentation/i2c/busses/i2c-piix4 b/Documentation/i2c/busses/i2c-piix4
index c5b37c570554..ac540c71c7eb 100644
--- a/Documentation/i2c/busses/i2c-piix4
+++ b/Documentation/i2c/busses/i2c-piix4
@@ -8,7 +8,7 @@ Supported adapters:
8 Datasheet: Only available via NDA from ServerWorks 8 Datasheet: Only available via NDA from ServerWorks
9 * ATI IXP200, IXP300, IXP400, SB600, SB700 and SB800 southbridges 9 * ATI IXP200, IXP300, IXP400, SB600, SB700 and SB800 southbridges
10 Datasheet: Not publicly available 10 Datasheet: Not publicly available
11 * AMD SB900 11 * AMD Hudson-2
12 Datasheet: Not publicly available 12 Datasheet: Not publicly available
13 * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge 13 * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge
14 Datasheet: Publicly available at the SMSC website http://www.smsc.com 14 Datasheet: Publicly available at the SMSC website http://www.smsc.com
diff --git a/Documentation/i2c/instantiating-devices b/Documentation/i2c/instantiating-devices
index c740b7b41088..e89490270aba 100644
--- a/Documentation/i2c/instantiating-devices
+++ b/Documentation/i2c/instantiating-devices
@@ -188,7 +188,7 @@ segment, the address is sufficient to uniquely identify the device to be
188deleted. 188deleted.
189 189
190Example: 190Example:
191# echo eeprom 0x50 > /sys/class/i2c-adapter/i2c-3/new_device 191# echo eeprom 0x50 > /sys/bus/i2c/devices/i2c-3/new_device
192 192
193While this interface should only be used when in-kernel device declaration 193While this interface should only be used when in-kernel device declaration
194can't be done, there is a variety of cases where it can be helpful: 194can't be done, there is a variety of cases where it can be helpful:
diff --git a/Documentation/infiniband/user_mad.txt b/Documentation/infiniband/user_mad.txt
index 744687dd195b..8a366959f5cc 100644
--- a/Documentation/infiniband/user_mad.txt
+++ b/Documentation/infiniband/user_mad.txt
@@ -128,8 +128,8 @@ Setting IsSM Capability Bit
128 To create the appropriate character device files automatically with 128 To create the appropriate character device files automatically with
129 udev, a rule like 129 udev, a rule like
130 130
131 KERNEL="umad*", NAME="infiniband/%k" 131 KERNEL=="umad*", NAME="infiniband/%k"
132 KERNEL="issm*", NAME="infiniband/%k" 132 KERNEL=="issm*", NAME="infiniband/%k"
133 133
134 can be used. This will create device nodes named 134 can be used. This will create device nodes named
135 135
diff --git a/Documentation/infiniband/user_verbs.txt b/Documentation/infiniband/user_verbs.txt
index f847501e50b5..afe3f8da9018 100644
--- a/Documentation/infiniband/user_verbs.txt
+++ b/Documentation/infiniband/user_verbs.txt
@@ -58,7 +58,7 @@ Memory pinning
58 To create the appropriate character device files automatically with 58 To create the appropriate character device files automatically with
59 udev, a rule like 59 udev, a rule like
60 60
61 KERNEL="uverbs*", NAME="infiniband/%k" 61 KERNEL=="uverbs*", NAME="infiniband/%k"
62 62
63 can be used. This will create device nodes named 63 can be used. This will create device nodes named
64 64
diff --git a/Documentation/isdn/INTERFACE.CAPI b/Documentation/isdn/INTERFACE.CAPI
index 686e107923ec..5fe8de5cc727 100644
--- a/Documentation/isdn/INTERFACE.CAPI
+++ b/Documentation/isdn/INTERFACE.CAPI
@@ -60,10 +60,9 @@ open() operation on regular files or character devices.
60 60
61After a successful return from register_appl(), CAPI messages from the 61After a successful return from register_appl(), CAPI messages from the
62application may be passed to the driver for the device via calls to the 62application may be passed to the driver for the device via calls to the
63send_message() callback function. The CAPI message to send is stored in the 63send_message() callback function. Conversely, the driver may call Kernel
64data portion of an skb. Conversely, the driver may call Kernel CAPI's 64CAPI's capi_ctr_handle_message() function to pass a received CAPI message to
65capi_ctr_handle_message() function to pass a received CAPI message to Kernel 65Kernel CAPI for forwarding to an application, specifying its ApplID.
66CAPI for forwarding to an application, specifying its ApplID.
67 66
68Deregistration requests (CAPI operation CAPI_RELEASE) from applications are 67Deregistration requests (CAPI operation CAPI_RELEASE) from applications are
69forwarded as calls to the release_appl() callback function, passing the same 68forwarded as calls to the release_appl() callback function, passing the same
@@ -142,6 +141,7 @@ u16 (*send_message)(struct capi_ctr *ctrlr, struct sk_buff *skb)
142 to accepting or queueing the message. Errors occurring during the 141 to accepting or queueing the message. Errors occurring during the
143 actual processing of the message should be signaled with an 142 actual processing of the message should be signaled with an
144 appropriate reply message. 143 appropriate reply message.
144 May be called in process or interrupt context.
145 Calls to this function are not serialized by Kernel CAPI, ie. it must 145 Calls to this function are not serialized by Kernel CAPI, ie. it must
146 be prepared to be re-entered. 146 be prepared to be re-entered.
147 147
@@ -154,7 +154,8 @@ read_proc_t *ctr_read_proc
154 system entry, /proc/capi/controllers/<n>; will be called with a 154 system entry, /proc/capi/controllers/<n>; will be called with a
155 pointer to the device's capi_ctr structure as the last (data) argument 155 pointer to the device's capi_ctr structure as the last (data) argument
156 156
157Note: Callback functions are never called in interrupt context. 157Note: Callback functions except send_message() are never called in interrupt
158context.
158 159
159- to be filled in before calling capi_ctr_ready(): 160- to be filled in before calling capi_ctr_ready():
160 161
@@ -171,14 +172,40 @@ u8 serial[CAPI_SERIAL_LEN]
171 value to return for CAPI_GET_SERIAL 172 value to return for CAPI_GET_SERIAL
172 173
173 174
1744.3 The _cmsg Structure 1754.3 SKBs
176
177CAPI messages are passed between Kernel CAPI and the driver via send_message()
178and capi_ctr_handle_message(), stored in the data portion of a socket buffer
179(skb). Each skb contains a single CAPI message coded according to the CAPI 2.0
180standard.
181
182For the data transfer messages, DATA_B3_REQ and DATA_B3_IND, the actual
183payload data immediately follows the CAPI message itself within the same skb.
184The Data and Data64 parameters are not used for processing. The Data64
185parameter may be omitted by setting the length field of the CAPI message to 22
186instead of 30.
187
188
1894.4 The _cmsg Structure
175 190
176(declared in <linux/isdn/capiutil.h>) 191(declared in <linux/isdn/capiutil.h>)
177 192
178The _cmsg structure stores the contents of a CAPI 2.0 message in an easily 193The _cmsg structure stores the contents of a CAPI 2.0 message in an easily
179accessible form. It contains members for all possible CAPI 2.0 parameters, of 194accessible form. It contains members for all possible CAPI 2.0 parameters,
180which only those appearing in the message type currently being processed are 195including subparameters of the Additional Info and B Protocol structured
181actually used. Unused members should be set to zero. 196parameters, with the following exceptions:
197
198* second Calling party number (CONNECT_IND)
199
200* Data64 (DATA_B3_REQ and DATA_B3_IND)
201
202* Sending complete (subparameter of Additional Info, CONNECT_REQ and INFO_REQ)
203
204* Global Configuration (subparameter of B Protocol, CONNECT_REQ, CONNECT_RESP
205 and SELECT_B_PROTOCOL_REQ)
206
207Only those parameters appearing in the message type currently being processed
208are actually used. Unused members should be set to zero.
182 209
183Members are named after the CAPI 2.0 standard names of the parameters they 210Members are named after the CAPI 2.0 standard names of the parameters they
184represent. See <linux/isdn/capiutil.h> for the exact spelling. Member data 211represent. See <linux/isdn/capiutil.h> for the exact spelling. Member data
@@ -190,18 +217,19 @@ u16 for CAPI parameters of type 'word'
190 217
191u32 for CAPI parameters of type 'dword' 218u32 for CAPI parameters of type 'dword'
192 219
193_cstruct for CAPI parameters of type 'struct' not containing any 220_cstruct for CAPI parameters of type 'struct'
194 variably-sized (struct) subparameters (eg. 'Called Party Number')
195 The member is a pointer to a buffer containing the parameter in 221 The member is a pointer to a buffer containing the parameter in
196 CAPI encoding (length + content). It may also be NULL, which will 222 CAPI encoding (length + content). It may also be NULL, which will
197 be taken to represent an empty (zero length) parameter. 223 be taken to represent an empty (zero length) parameter.
224 Subparameters are stored in encoded form within the content part.
198 225
199_cmstruct for CAPI parameters of type 'struct' containing 'struct' 226_cmstruct alternative representation for CAPI parameters of type 'struct'
200 subparameters ('Additional Info' and 'B Protocol') 227 (used only for the 'Additional Info' and 'B Protocol' parameters)
201 The representation is a single byte containing one of the values: 228 The representation is a single byte containing one of the values:
202 CAPI_DEFAULT: the parameter is empty 229 CAPI_DEFAULT: The parameter is empty/absent.
203 CAPI_COMPOSE: the values of the subparameters are stored 230 CAPI_COMPOSE: The parameter is present.
204 individually in the corresponding _cmsg structure members 231 Subparameter values are stored individually in the corresponding
232 _cmsg structure members.
205 233
206Functions capi_cmsg2message() and capi_message2cmsg() are provided to convert 234Functions capi_cmsg2message() and capi_message2cmsg() are provided to convert
207messages between their transport encoding described in the CAPI 2.0 standard 235messages between their transport encoding described in the CAPI 2.0 standard
@@ -297,3 +325,26 @@ char *capi_cmd2str(u8 Command, u8 Subcommand)
297 be NULL if the command/subcommand is not one of those defined in the 325 be NULL if the command/subcommand is not one of those defined in the
298 CAPI 2.0 standard. 326 CAPI 2.0 standard.
299 327
328
3297. Debugging
330
331The module kernelcapi has a module parameter showcapimsgs controlling some
332debugging output produced by the module. It can only be set when the module is
333loaded, via a parameter "showcapimsgs=<n>" to the modprobe command, either on
334the command line or in the configuration file.
335
336If the lowest bit of showcapimsgs is set, kernelcapi logs controller and
337application up and down events.
338
339In addition, every registered CAPI controller has an associated traceflag
340parameter controlling how CAPI messages sent from and to tha controller are
341logged. The traceflag parameter is initialized with the value of the
342showcapimsgs parameter when the controller is registered, but can later be
343changed via the MANUFACTURER_REQ command KCAPI_CMD_TRACE.
344
345If the value of traceflag is non-zero, CAPI messages are logged.
346DATA_B3 messages are only logged if the value of traceflag is > 2.
347
348If the lowest bit of traceflag is set, only the command/subcommand and message
349length are logged. Otherwise, kernelcapi logs a readable representation of
350the entire message.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 5d386b4ff6a0..332fe3b47e0c 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -670,6 +670,7 @@ and is between 256 and 4096 characters. It is defined in the file
670 earlyprintk= [X86,SH,BLACKFIN] 670 earlyprintk= [X86,SH,BLACKFIN]
671 earlyprintk=vga 671 earlyprintk=vga
672 earlyprintk=serial[,ttySn[,baudrate]] 672 earlyprintk=serial[,ttySn[,baudrate]]
673 earlyprintk=ttySn[,baudrate]
673 earlyprintk=dbgp[debugController#] 674 earlyprintk=dbgp[debugController#]
674 675
675 Append ",keep" to not disable it when the real console 676 Append ",keep" to not disable it when the real console
diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c
index ba9373f82ab5..098de5bce00a 100644
--- a/Documentation/lguest/lguest.c
+++ b/Documentation/lguest/lguest.c
@@ -42,7 +42,6 @@
42#include <signal.h> 42#include <signal.h>
43#include "linux/lguest_launcher.h" 43#include "linux/lguest_launcher.h"
44#include "linux/virtio_config.h" 44#include "linux/virtio_config.h"
45#include <linux/virtio_ids.h>
46#include "linux/virtio_net.h" 45#include "linux/virtio_net.h"
47#include "linux/virtio_blk.h" 46#include "linux/virtio_blk.h"
48#include "linux/virtio_console.h" 47#include "linux/virtio_console.h"
diff --git a/Documentation/i2c/chips/eeprom b/Documentation/misc-devices/eeprom
index f7e8104b5764..f7e8104b5764 100644
--- a/Documentation/i2c/chips/eeprom
+++ b/Documentation/misc-devices/eeprom
diff --git a/Documentation/i2c/chips/max6875 b/Documentation/misc-devices/max6875
index 10ca43cd1a72..1e89ee3ccc1b 100644
--- a/Documentation/i2c/chips/max6875
+++ b/Documentation/misc-devices/max6875
@@ -42,10 +42,12 @@ General Remarks
42 42
43Valid addresses for the MAX6875 are 0x50 and 0x52. 43Valid addresses for the MAX6875 are 0x50 and 0x52.
44Valid addresses for the MAX6874 are 0x50, 0x52, 0x54 and 0x56. 44Valid addresses for the MAX6874 are 0x50, 0x52, 0x54 and 0x56.
45The driver does not probe any address, so you must force the address. 45The driver does not probe any address, so you explicitly instantiate the
46devices.
46 47
47Example: 48Example:
48$ modprobe max6875 force=0,0x50 49$ modprobe max6875
50$ echo max6875 0x50 > /sys/bus/i2c/devices/i2c-0/new_device
49 51
50The MAX6874/MAX6875 ignores address bit 0, so this driver attaches to multiple 52The MAX6874/MAX6875 ignores address bit 0, so this driver attaches to multiple
51addresses. For example, for address 0x50, it also reserves 0x51. 53addresses. For example, for address 0x50, it also reserves 0x51.
diff --git a/Documentation/networking/pktgen.txt b/Documentation/networking/pktgen.txt
index c6cf4a3c16e0..61bb645d50e0 100644
--- a/Documentation/networking/pktgen.txt
+++ b/Documentation/networking/pktgen.txt
@@ -90,6 +90,11 @@ Examples:
90 pgset "dstmac 00:00:00:00:00:00" sets MAC destination address 90 pgset "dstmac 00:00:00:00:00:00" sets MAC destination address
91 pgset "srcmac 00:00:00:00:00:00" sets MAC source address 91 pgset "srcmac 00:00:00:00:00:00" sets MAC source address
92 92
93 pgset "queue_map_min 0" Sets the min value of tx queue interval
94 pgset "queue_map_max 7" Sets the max value of tx queue interval, for multiqueue devices
95 To select queue 1 of a given device,
96 use queue_map_min=1 and queue_map_max=1
97
93 pgset "src_mac_count 1" Sets the number of MACs we'll range through. 98 pgset "src_mac_count 1" Sets the number of MACs we'll range through.
94 The 'minimum' MAC is what you set with srcmac. 99 The 'minimum' MAC is what you set with srcmac.
95 100
@@ -101,6 +106,9 @@ Examples:
101 IPDST_RND, UDPSRC_RND, 106 IPDST_RND, UDPSRC_RND,
102 UDPDST_RND, MACSRC_RND, MACDST_RND 107 UDPDST_RND, MACSRC_RND, MACDST_RND
103 MPLS_RND, VID_RND, SVID_RND 108 MPLS_RND, VID_RND, SVID_RND
109 QUEUE_MAP_RND # queue map random
110 QUEUE_MAP_CPU # queue map mirrors smp_processor_id()
111
104 112
105 pgset "udp_src_min 9" set UDP source port min, If < udp_src_max, then 113 pgset "udp_src_min 9" set UDP source port min, If < udp_src_max, then
106 cycle through the port range. 114 cycle through the port range.
diff --git a/Documentation/networking/timestamping/timestamping.c b/Documentation/networking/timestamping/timestamping.c
index 43d143104210..a7936fe8444a 100644
--- a/Documentation/networking/timestamping/timestamping.c
+++ b/Documentation/networking/timestamping/timestamping.c
@@ -381,7 +381,7 @@ int main(int argc, char **argv)
381 memset(&hwtstamp, 0, sizeof(hwtstamp)); 381 memset(&hwtstamp, 0, sizeof(hwtstamp));
382 strncpy(hwtstamp.ifr_name, interface, sizeof(hwtstamp.ifr_name)); 382 strncpy(hwtstamp.ifr_name, interface, sizeof(hwtstamp.ifr_name));
383 hwtstamp.ifr_data = (void *)&hwconfig; 383 hwtstamp.ifr_data = (void *)&hwconfig;
384 memset(&hwconfig, 0, sizeof(&hwconfig)); 384 memset(&hwconfig, 0, sizeof(hwconfig));
385 hwconfig.tx_type = 385 hwconfig.tx_type =
386 (so_timestamping_flags & SOF_TIMESTAMPING_TX_HARDWARE) ? 386 (so_timestamping_flags & SOF_TIMESTAMPING_TX_HARDWARE) ?
387 HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF; 387 HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
diff --git a/Documentation/scsi/hptiop.txt b/Documentation/scsi/hptiop.txt
index a6eb4add1be6..9605179711f4 100644
--- a/Documentation/scsi/hptiop.txt
+++ b/Documentation/scsi/hptiop.txt
@@ -3,6 +3,25 @@ HIGHPOINT ROCKETRAID 3xxx/4xxx ADAPTER DRIVER (hptiop)
3Controller Register Map 3Controller Register Map
4------------------------- 4-------------------------
5 5
6For RR44xx Intel IOP based adapters, the controller IOP is accessed via PCI BAR0 and BAR2:
7
8 BAR0 offset Register
9 0x11C5C Link Interface IRQ Set
10 0x11C60 Link Interface IRQ Clear
11
12 BAR2 offset Register
13 0x10 Inbound Message Register 0
14 0x14 Inbound Message Register 1
15 0x18 Outbound Message Register 0
16 0x1C Outbound Message Register 1
17 0x20 Inbound Doorbell Register
18 0x24 Inbound Interrupt Status Register
19 0x28 Inbound Interrupt Mask Register
20 0x30 Outbound Interrupt Status Register
21 0x34 Outbound Interrupt Mask Register
22 0x40 Inbound Queue Port
23 0x44 Outbound Queue Port
24
6For Intel IOP based adapters, the controller IOP is accessed via PCI BAR0: 25For Intel IOP based adapters, the controller IOP is accessed via PCI BAR0:
7 26
8 BAR0 offset Register 27 BAR0 offset Register
@@ -93,7 +112,7 @@ The driver exposes following sysfs attributes:
93 112
94 113
95----------------------------------------------------------------------------- 114-----------------------------------------------------------------------------
96Copyright (C) 2006-2007 HighPoint Technologies, Inc. All Rights Reserved. 115Copyright (C) 2006-2009 HighPoint Technologies, Inc. All Rights Reserved.
97 116
98 This file is distributed in the hope that it will be useful, 117 This file is distributed in the hope that it will be useful,
99 but WITHOUT ANY WARRANTY; without even the implied warranty of 118 but WITHOUT ANY WARRANTY; without even the implied warranty of
diff --git a/Documentation/slow-work.txt b/Documentation/slow-work.txt
index ebc50f808ea4..9dbf4470c7e1 100644
--- a/Documentation/slow-work.txt
+++ b/Documentation/slow-work.txt
@@ -41,6 +41,13 @@ expand files, provided the time taken to do so isn't too long.
41Operations of both types may sleep during execution, thus tying up the thread 41Operations of both types may sleep during execution, thus tying up the thread
42loaned to it. 42loaned to it.
43 43
44A further class of work item is available, based on the slow work item class:
45
46 (*) Delayed slow work items.
47
48These are slow work items that have a timer to defer queueing of the item for
49a while.
50
44 51
45THREAD-TO-CLASS ALLOCATION 52THREAD-TO-CLASS ALLOCATION
46-------------------------- 53--------------------------
@@ -64,9 +71,11 @@ USING SLOW WORK ITEMS
64Firstly, a module or subsystem wanting to make use of slow work items must 71Firstly, a module or subsystem wanting to make use of slow work items must
65register its interest: 72register its interest:
66 73
67 int ret = slow_work_register_user(); 74 int ret = slow_work_register_user(struct module *module);
68 75
69This will return 0 if successful, or a -ve error upon failure. 76This will return 0 if successful, or a -ve error upon failure. The module
77pointer should be the module interested in using this facility (almost
78certainly THIS_MODULE).
70 79
71 80
72Slow work items may then be set up by: 81Slow work items may then be set up by:
@@ -93,6 +102,10 @@ Slow work items may then be set up by:
93 102
94 or: 103 or:
95 104
105 delayed_slow_work_init(&myitem, &myitem_ops);
106
107 or:
108
96 vslow_work_init(&myitem, &myitem_ops); 109 vslow_work_init(&myitem, &myitem_ops);
97 110
98 depending on its class. 111 depending on its class.
@@ -102,15 +115,92 @@ A suitably set up work item can then be enqueued for processing:
102 int ret = slow_work_enqueue(&myitem); 115 int ret = slow_work_enqueue(&myitem);
103 116
104This will return a -ve error if the thread pool is unable to gain a reference 117This will return a -ve error if the thread pool is unable to gain a reference
105on the item, 0 otherwise. 118on the item, 0 otherwise, or (for delayed work):
119
120 int ret = delayed_slow_work_enqueue(&myitem, my_jiffy_delay);
106 121
107 122
108The items are reference counted, so there ought to be no need for a flush 123The items are reference counted, so there ought to be no need for a flush
109operation. When all a module's slow work items have been processed, and the 124operation. But as the reference counting is optional, means to cancel
125existing work items are also included:
126
127 cancel_slow_work(&myitem);
128 cancel_delayed_slow_work(&myitem);
129
130can be used to cancel pending work. The above cancel function waits for
131existing work to have been executed (or prevent execution of them, depending
132on timing).
133
134
135When all a module's slow work items have been processed, and the
110module has no further interest in the facility, it should unregister its 136module has no further interest in the facility, it should unregister its
111interest: 137interest:
112 138
113 slow_work_unregister_user(); 139 slow_work_unregister_user(struct module *module);
140
141The module pointer is used to wait for all outstanding work items for that
142module before completing the unregistration. This prevents the put_ref() code
143from being taken away before it completes. module should almost certainly be
144THIS_MODULE.
145
146
147================
148HELPER FUNCTIONS
149================
150
151The slow-work facility provides a function by which it can be determined
152whether or not an item is queued for later execution:
153
154 bool queued = slow_work_is_queued(struct slow_work *work);
155
156If it returns false, then the item is not on the queue (it may be executing
157with a requeue pending). This can be used to work out whether an item on which
158another depends is on the queue, thus allowing a dependent item to be queued
159after it.
160
161If the above shows an item on which another depends not to be queued, then the
162owner of the dependent item might need to wait. However, to avoid locking up
163the threads unnecessarily be sleeping in them, it can make sense under some
164circumstances to return the work item to the queue, thus deferring it until
165some other items have had a chance to make use of the yielded thread.
166
167To yield a thread and defer an item, the work function should simply enqueue
168the work item again and return. However, this doesn't work if there's nothing
169actually on the queue, as the thread just vacated will jump straight back into
170the item's work function, thus busy waiting on a CPU.
171
172Instead, the item should use the thread to wait for the dependency to go away,
173but rather than using schedule() or schedule_timeout() to sleep, it should use
174the following function:
175
176 bool requeue = slow_work_sleep_till_thread_needed(
177 struct slow_work *work,
178 signed long *_timeout);
179
180This will add a second wait and then sleep, such that it will be woken up if
181either something appears on the queue that could usefully make use of the
182thread - and behind which this item can be queued, or if the event the caller
183set up to wait for happens. True will be returned if something else appeared
184on the queue and this work function should perhaps return, of false if
185something else woke it up. The timeout is as for schedule_timeout().
186
187For example:
188
189 wq = bit_waitqueue(&my_flags, MY_BIT);
190 init_wait(&wait);
191 requeue = false;
192 do {
193 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
194 if (!test_bit(MY_BIT, &my_flags))
195 break;
196 requeue = slow_work_sleep_till_thread_needed(&my_work,
197 &timeout);
198 } while (timeout > 0 && !requeue);
199 finish_wait(wq, &wait);
200 if (!test_bit(MY_BIT, &my_flags)
201 goto do_my_thing;
202 if (requeue)
203 return; // to slow_work
114 204
115 205
116=============== 206===============
@@ -118,7 +208,8 @@ ITEM OPERATIONS
118=============== 208===============
119 209
120Each work item requires a table of operations of type struct slow_work_ops. 210Each work item requires a table of operations of type struct slow_work_ops.
121All members are required: 211Only ->execute() is required; the getting and putting of a reference and the
212describing of an item are all optional.
122 213
123 (*) Get a reference on an item: 214 (*) Get a reference on an item:
124 215
@@ -148,6 +239,16 @@ All members are required:
148 This should perform the work required of the item. It may sleep, it may 239 This should perform the work required of the item. It may sleep, it may
149 perform disk I/O and it may wait for locks. 240 perform disk I/O and it may wait for locks.
150 241
242 (*) View an item through /proc:
243
244 void (*desc)(struct slow_work *work, struct seq_file *m);
245
246 If supplied, this should print to 'm' a small string describing the work
247 the item is to do. This should be no more than about 40 characters, and
248 shouldn't include a newline character.
249
250 See the 'Viewing executing and queued items' section below.
251
151 252
152================== 253==================
153POOL CONFIGURATION 254POOL CONFIGURATION
@@ -172,3 +273,50 @@ The slow-work thread pool has a number of configurables:
172 is bounded to between 1 and one fewer than the number of active threads. 273 is bounded to between 1 and one fewer than the number of active threads.
173 This ensures there is always at least one thread that can process very 274 This ensures there is always at least one thread that can process very
174 slow work items, and always at least one thread that won't. 275 slow work items, and always at least one thread that won't.
276
277
278==================================
279VIEWING EXECUTING AND QUEUED ITEMS
280==================================
281
282If CONFIG_SLOW_WORK_DEBUG is enabled, a debugfs file is made available:
283
284 /sys/kernel/debug/slow_work/runqueue
285
286through which the list of work items being executed and the queues of items to
287be executed may be viewed. The owner of a work item is given the chance to
288add some information of its own.
289
290The contents look something like the following:
291
292 THR PID ITEM ADDR FL MARK DESC
293 === ===== ================ == ===== ==========
294 0 3005 ffff880023f52348 a 952ms FSC: OBJ17d3: LOOK
295 1 3006 ffff880024e33668 2 160ms FSC: OBJ17e5 OP60d3b: Write1/Store fl=2
296 2 3165 ffff8800296dd180 a 424ms FSC: OBJ17e4: LOOK
297 3 4089 ffff8800262c8d78 a 212ms FSC: OBJ17ea: CRTN
298 4 4090 ffff88002792bed8 2 388ms FSC: OBJ17e8 OP60d36: Write1/Store fl=2
299 5 4092 ffff88002a0ef308 2 388ms FSC: OBJ17e7 OP60d2e: Write1/Store fl=2
300 6 4094 ffff88002abaf4b8 2 132ms FSC: OBJ17e2 OP60d4e: Write1/Store fl=2
301 7 4095 ffff88002bb188e0 a 388ms FSC: OBJ17e9: CRTN
302 vsq - ffff880023d99668 1 308ms FSC: OBJ17e0 OP60f91: Write1/EnQ fl=2
303 vsq - ffff8800295d1740 1 212ms FSC: OBJ16be OP4d4b6: Write1/EnQ fl=2
304 vsq - ffff880025ba3308 1 160ms FSC: OBJ179a OP58dec: Write1/EnQ fl=2
305 vsq - ffff880024ec83e0 1 160ms FSC: OBJ17ae OP599f2: Write1/EnQ fl=2
306 vsq - ffff880026618e00 1 160ms FSC: OBJ17e6 OP60d33: Write1/EnQ fl=2
307 vsq - ffff880025a2a4b8 1 132ms FSC: OBJ16a2 OP4d583: Write1/EnQ fl=2
308 vsq - ffff880023cbe6d8 9 212ms FSC: OBJ17eb: LOOK
309 vsq - ffff880024d37590 9 212ms FSC: OBJ17ec: LOOK
310 vsq - ffff880027746cb0 9 212ms FSC: OBJ17ed: LOOK
311 vsq - ffff880024d37ae8 9 212ms FSC: OBJ17ee: LOOK
312 vsq - ffff880024d37cb0 9 212ms FSC: OBJ17ef: LOOK
313 vsq - ffff880025036550 9 212ms FSC: OBJ17f0: LOOK
314 vsq - ffff8800250368e0 9 212ms FSC: OBJ17f1: LOOK
315 vsq - ffff880025036aa8 9 212ms FSC: OBJ17f2: LOOK
316
317In the 'THR' column, executing items show the thread they're occupying and
318queued threads indicate which queue they're on. 'PID' shows the process ID of
319a slow-work thread that's executing something. 'FL' shows the work item flags.
320'MARK' indicates how long since an item was queued or began executing. Lastly,
321the 'DESC' column permits the owner of an item to give some information.
322
diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt
index 1c8eb4518ce0..fd9a2f67edf2 100644
--- a/Documentation/sound/alsa/ALSA-Configuration.txt
+++ b/Documentation/sound/alsa/ALSA-Configuration.txt
@@ -522,7 +522,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
522 pcm_devs - Number of PCM devices assigned to each card 522 pcm_devs - Number of PCM devices assigned to each card
523 (default = 1, up to 4) 523 (default = 1, up to 4)
524 pcm_substreams - Number of PCM substreams assigned to each PCM 524 pcm_substreams - Number of PCM substreams assigned to each PCM
525 (default = 8, up to 16) 525 (default = 8, up to 128)
526 hrtimer - Use hrtimer (=1, default) or system timer (=0) 526 hrtimer - Use hrtimer (=1, default) or system timer (=0)
527 fake_buffer - Fake buffer allocations (default = 1) 527 fake_buffer - Fake buffer allocations (default = 1)
528 528
diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt
index f1708b79f963..4c7f9aee5c4e 100644
--- a/Documentation/sound/alsa/HD-Audio-Models.txt
+++ b/Documentation/sound/alsa/HD-Audio-Models.txt
@@ -209,6 +209,7 @@ AD1884A / AD1883 / AD1984A / AD1984B
209 laptop laptop with HP jack sensing 209 laptop laptop with HP jack sensing
210 mobile mobile devices with HP jack sensing 210 mobile mobile devices with HP jack sensing
211 thinkpad Lenovo Thinkpad X300 211 thinkpad Lenovo Thinkpad X300
212 touchsmart HP Touchsmart
212 213
213AD1884 214AD1884
214====== 215======
@@ -358,6 +359,7 @@ STAC9227/9228/9229/927x
358 5stack-no-fp D965 5stack without front panel 359 5stack-no-fp D965 5stack without front panel
359 dell-3stack Dell Dimension E520 360 dell-3stack Dell Dimension E520
360 dell-bios Fixes with Dell BIOS setup 361 dell-bios Fixes with Dell BIOS setup
362 volknob Fixes with volume-knob widget 0x24
361 auto BIOS setup (default) 363 auto BIOS setup (default)
362 364
363STAC92HD71B* 365STAC92HD71B*
diff --git a/Documentation/thermal/sysfs-api.txt b/Documentation/thermal/sysfs-api.txt
index 70d68ce8640a..a87dc277a5ca 100644
--- a/Documentation/thermal/sysfs-api.txt
+++ b/Documentation/thermal/sysfs-api.txt
@@ -1,5 +1,5 @@
1Generic Thermal Sysfs driver How To 1Generic Thermal Sysfs driver How To
2========================= 2===================================
3 3
4Written by Sujith Thomas <sujith.thomas@intel.com>, Zhang Rui <rui.zhang@intel.com> 4Written by Sujith Thomas <sujith.thomas@intel.com>, Zhang Rui <rui.zhang@intel.com>
5 5
@@ -10,20 +10,20 @@ Copyright (c) 2008 Intel Corporation
10 10
110. Introduction 110. Introduction
12 12
13The generic thermal sysfs provides a set of interfaces for thermal zone devices (sensors) 13The generic thermal sysfs provides a set of interfaces for thermal zone
14and thermal cooling devices (fan, processor...) to register with the thermal management 14devices (sensors) and thermal cooling devices (fan, processor...) to register
15solution and to be a part of it. 15with the thermal management solution and to be a part of it.
16 16
17This how-to focuses on enabling new thermal zone and cooling devices to participate 17This how-to focuses on enabling new thermal zone and cooling devices to
18in thermal management. 18participate in thermal management.
19This solution is platform independent and any type of thermal zone devices and 19This solution is platform independent and any type of thermal zone devices
20cooling devices should be able to make use of the infrastructure. 20and cooling devices should be able to make use of the infrastructure.
21 21
22The main task of the thermal sysfs driver is to expose thermal zone attributes as well 22The main task of the thermal sysfs driver is to expose thermal zone attributes
23as cooling device attributes to the user space. 23as well as cooling device attributes to the user space.
24An intelligent thermal management application can make decisions based on inputs 24An intelligent thermal management application can make decisions based on
25from thermal zone attributes (the current temperature and trip point temperature) 25inputs from thermal zone attributes (the current temperature and trip point
26and throttle appropriate devices. 26temperature) and throttle appropriate devices.
27 27
28[0-*] denotes any positive number starting from 0 28[0-*] denotes any positive number starting from 0
29[1-*] denotes any positive number starting from 1 29[1-*] denotes any positive number starting from 1
@@ -31,77 +31,77 @@ and throttle appropriate devices.
311. thermal sysfs driver interface functions 311. thermal sysfs driver interface functions
32 32
331.1 thermal zone device interface 331.1 thermal zone device interface
341.1.1 struct thermal_zone_device *thermal_zone_device_register(char *name, int trips, 341.1.1 struct thermal_zone_device *thermal_zone_device_register(char *name,
35 void *devdata, struct thermal_zone_device_ops *ops) 35 int trips, void *devdata, struct thermal_zone_device_ops *ops)
36 36
37 This interface function adds a new thermal zone device (sensor) to 37 This interface function adds a new thermal zone device (sensor) to
38 /sys/class/thermal folder as thermal_zone[0-*]. 38 /sys/class/thermal folder as thermal_zone[0-*]. It tries to bind all the
39 It tries to bind all the thermal cooling devices registered at the same time. 39 thermal cooling devices registered at the same time.
40 40
41 name: the thermal zone name. 41 name: the thermal zone name.
42 trips: the total number of trip points this thermal zone supports. 42 trips: the total number of trip points this thermal zone supports.
43 devdata: device private data 43 devdata: device private data
44 ops: thermal zone device call-backs. 44 ops: thermal zone device call-backs.
45 .bind: bind the thermal zone device with a thermal cooling device. 45 .bind: bind the thermal zone device with a thermal cooling device.
46 .unbind: unbind the thermal zone device with a thermal cooling device. 46 .unbind: unbind the thermal zone device with a thermal cooling device.
47 .get_temp: get the current temperature of the thermal zone. 47 .get_temp: get the current temperature of the thermal zone.
48 .get_mode: get the current mode (user/kernel) of the thermal zone. 48 .get_mode: get the current mode (user/kernel) of the thermal zone.
49 "kernel" means thermal management is done in kernel. 49 - "kernel" means thermal management is done in kernel.
50 "user" will prevent kernel thermal driver actions upon trip points 50 - "user" will prevent kernel thermal driver actions upon trip points
51 so that user applications can take charge of thermal management. 51 so that user applications can take charge of thermal management.
52 .set_mode: set the mode (user/kernel) of the thermal zone. 52 .set_mode: set the mode (user/kernel) of the thermal zone.
53 .get_trip_type: get the type of certain trip point. 53 .get_trip_type: get the type of certain trip point.
54 .get_trip_temp: get the temperature above which the certain trip point 54 .get_trip_temp: get the temperature above which the certain trip point
55 will be fired. 55 will be fired.
56 56
571.1.2 void thermal_zone_device_unregister(struct thermal_zone_device *tz) 571.1.2 void thermal_zone_device_unregister(struct thermal_zone_device *tz)
58 58
59 This interface function removes the thermal zone device. 59 This interface function removes the thermal zone device.
60 It deletes the corresponding entry form /sys/class/thermal folder and unbind all 60 It deletes the corresponding entry form /sys/class/thermal folder and
61 the thermal cooling devices it uses. 61 unbind all the thermal cooling devices it uses.
62 62
631.2 thermal cooling device interface 631.2 thermal cooling device interface
641.2.1 struct thermal_cooling_device *thermal_cooling_device_register(char *name, 641.2.1 struct thermal_cooling_device *thermal_cooling_device_register(char *name,
65 void *devdata, struct thermal_cooling_device_ops *) 65 void *devdata, struct thermal_cooling_device_ops *)
66 66
67 This interface function adds a new thermal cooling device (fan/processor/...) to 67 This interface function adds a new thermal cooling device (fan/processor/...)
68 /sys/class/thermal/ folder as cooling_device[0-*]. 68 to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself
69 It tries to bind itself to all the thermal zone devices register at the same time. 69 to all the thermal zone devices register at the same time.
70 name: the cooling device name. 70 name: the cooling device name.
71 devdata: device private data. 71 devdata: device private data.
72 ops: thermal cooling devices call-backs. 72 ops: thermal cooling devices call-backs.
73 .get_max_state: get the Maximum throttle state of the cooling device. 73 .get_max_state: get the Maximum throttle state of the cooling device.
74 .get_cur_state: get the Current throttle state of the cooling device. 74 .get_cur_state: get the Current throttle state of the cooling device.
75 .set_cur_state: set the Current throttle state of the cooling device. 75 .set_cur_state: set the Current throttle state of the cooling device.
76 76
771.2.2 void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev) 771.2.2 void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
78 78
79 This interface function remove the thermal cooling device. 79 This interface function remove the thermal cooling device.
80 It deletes the corresponding entry form /sys/class/thermal folder and unbind 80 It deletes the corresponding entry form /sys/class/thermal folder and
81 itself from all the thermal zone devices using it. 81 unbind itself from all the thermal zone devices using it.
82 82
831.3 interface for binding a thermal zone device with a thermal cooling device 831.3 interface for binding a thermal zone device with a thermal cooling device
841.3.1 int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz, 841.3.1 int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
85 int trip, struct thermal_cooling_device *cdev); 85 int trip, struct thermal_cooling_device *cdev);
86 86
87 This interface function bind a thermal cooling device to the certain trip point 87 This interface function bind a thermal cooling device to the certain trip
88 of a thermal zone device. 88 point of a thermal zone device.
89 This function is usually called in the thermal zone device .bind callback. 89 This function is usually called in the thermal zone device .bind callback.
90 tz: the thermal zone device 90 tz: the thermal zone device
91 cdev: thermal cooling device 91 cdev: thermal cooling device
92 trip: indicates which trip point the cooling devices is associated with 92 trip: indicates which trip point the cooling devices is associated with
93 in this thermal zone. 93 in this thermal zone.
94 94
951.3.2 int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz, 951.3.2 int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz,
96 int trip, struct thermal_cooling_device *cdev); 96 int trip, struct thermal_cooling_device *cdev);
97 97
98 This interface function unbind a thermal cooling device from the certain trip point 98 This interface function unbind a thermal cooling device from the certain
99 of a thermal zone device. 99 trip point of a thermal zone device. This function is usually called in
100 This function is usually called in the thermal zone device .unbind callback. 100 the thermal zone device .unbind callback.
101 tz: the thermal zone device 101 tz: the thermal zone device
102 cdev: thermal cooling device 102 cdev: thermal cooling device
103 trip: indicates which trip point the cooling devices is associated with 103 trip: indicates which trip point the cooling devices is associated with
104 in this thermal zone. 104 in this thermal zone.
105 105
1062. sysfs attributes structure 1062. sysfs attributes structure
107 107
@@ -114,153 +114,166 @@ if hwmon is compiled in or built as a module.
114 114
115Thermal zone device sys I/F, created once it's registered: 115Thermal zone device sys I/F, created once it's registered:
116/sys/class/thermal/thermal_zone[0-*]: 116/sys/class/thermal/thermal_zone[0-*]:
117 |-----type: Type of the thermal zone 117 |---type: Type of the thermal zone
118 |-----temp: Current temperature 118 |---temp: Current temperature
119 |-----mode: Working mode of the thermal zone 119 |---mode: Working mode of the thermal zone
120 |-----trip_point_[0-*]_temp: Trip point temperature 120 |---trip_point_[0-*]_temp: Trip point temperature
121 |-----trip_point_[0-*]_type: Trip point type 121 |---trip_point_[0-*]_type: Trip point type
122 122
123Thermal cooling device sys I/F, created once it's registered: 123Thermal cooling device sys I/F, created once it's registered:
124/sys/class/thermal/cooling_device[0-*]: 124/sys/class/thermal/cooling_device[0-*]:
125 |-----type : Type of the cooling device(processor/fan/...) 125 |---type: Type of the cooling device(processor/fan/...)
126 |-----max_state: Maximum cooling state of the cooling device 126 |---max_state: Maximum cooling state of the cooling device
127 |-----cur_state: Current cooling state of the cooling device 127 |---cur_state: Current cooling state of the cooling device
128 128
129 129
130These two dynamic attributes are created/removed in pairs. 130Then next two dynamic attributes are created/removed in pairs. They represent
131They represent the relationship between a thermal zone and its associated cooling device. 131the relationship between a thermal zone and its associated cooling device.
132They are created/removed for each 132They are created/removed for each successful execution of
133thermal_zone_bind_cooling_device/thermal_zone_unbind_cooling_device successful execution. 133thermal_zone_bind_cooling_device/thermal_zone_unbind_cooling_device.
134 134
135/sys/class/thermal/thermal_zone[0-*] 135/sys/class/thermal/thermal_zone[0-*]:
136 |-----cdev[0-*]: The [0-*]th cooling device in the current thermal zone 136 |---cdev[0-*]: [0-*]th cooling device in current thermal zone
137 |-----cdev[0-*]_trip_point: Trip point that cdev[0-*] is associated with 137 |---cdev[0-*]_trip_point: Trip point that cdev[0-*] is associated with
138 138
139Besides the thermal zone device sysfs I/F and cooling device sysfs I/F, 139Besides the thermal zone device sysfs I/F and cooling device sysfs I/F,
140the generic thermal driver also creates a hwmon sysfs I/F for each _type_ of 140the generic thermal driver also creates a hwmon sysfs I/F for each _type_
141thermal zone device. E.g. the generic thermal driver registers one hwmon class device 141of thermal zone device. E.g. the generic thermal driver registers one hwmon
142and build the associated hwmon sysfs I/F for all the registered ACPI thermal zones. 142class device and build the associated hwmon sysfs I/F for all the registered
143ACPI thermal zones.
144
143/sys/class/hwmon/hwmon[0-*]: 145/sys/class/hwmon/hwmon[0-*]:
144 |-----name: The type of the thermal zone devices. 146 |---name: The type of the thermal zone devices
145 |-----temp[1-*]_input: The current temperature of thermal zone [1-*]. 147 |---temp[1-*]_input: The current temperature of thermal zone [1-*]
146 |-----temp[1-*]_critical: The critical trip point of thermal zone [1-*]. 148 |---temp[1-*]_critical: The critical trip point of thermal zone [1-*]
149
147Please read Documentation/hwmon/sysfs-interface for additional information. 150Please read Documentation/hwmon/sysfs-interface for additional information.
148 151
149*************************** 152***************************
150* Thermal zone attributes * 153* Thermal zone attributes *
151*************************** 154***************************
152 155
153type Strings which represent the thermal zone type. 156type
154 This is given by thermal zone driver as part of registration. 157 Strings which represent the thermal zone type.
155 Eg: "acpitz" indicates it's an ACPI thermal device. 158 This is given by thermal zone driver as part of registration.
156 In order to keep it consistent with hwmon sys attribute, 159 E.g: "acpitz" indicates it's an ACPI thermal device.
157 this should be a short, lowercase string, 160 In order to keep it consistent with hwmon sys attribute; this should
158 not containing spaces nor dashes. 161 be a short, lowercase string, not containing spaces nor dashes.
159 RO 162 RO, Required
160 Required 163
161 164temp
162temp Current temperature as reported by thermal zone (sensor) 165 Current temperature as reported by thermal zone (sensor).
163 Unit: millidegree Celsius 166 Unit: millidegree Celsius
164 RO 167 RO, Required
165 Required 168
166 169mode
167mode One of the predefined values in [kernel, user] 170 One of the predefined values in [kernel, user].
168 This file gives information about the algorithm 171 This file gives information about the algorithm that is currently
169 that is currently managing the thermal zone. 172 managing the thermal zone. It can be either default kernel based
170 It can be either default kernel based algorithm 173 algorithm or user space application.
171 or user space application. 174 kernel = Thermal management in kernel thermal zone driver.
172 RW 175 user = Preventing kernel thermal zone driver actions upon
173 Optional 176 trip points so that user application can take full
174 kernel = Thermal management in kernel thermal zone driver. 177 charge of the thermal management.
175 user = Preventing kernel thermal zone driver actions upon 178 RW, Optional
176 trip points so that user application can take full 179
177 charge of the thermal management. 180trip_point_[0-*]_temp
178 181 The temperature above which trip point will be fired.
179trip_point_[0-*]_temp The temperature above which trip point will be fired 182 Unit: millidegree Celsius
180 Unit: millidegree Celsius 183 RO, Optional
181 RO 184
182 Optional 185trip_point_[0-*]_type
183 186 Strings which indicate the type of the trip point.
184trip_point_[0-*]_type Strings which indicate the type of the trip point 187 E.g. it can be one of critical, hot, passive, active[0-*] for ACPI
185 E.g. it can be one of critical, hot, passive, 188 thermal zone.
186 active[0-*] for ACPI thermal zone. 189 RO, Optional
187 RO 190
188 Optional 191cdev[0-*]
189 192 Sysfs link to the thermal cooling device node where the sys I/F
190cdev[0-*] Sysfs link to the thermal cooling device node where the sys I/F 193 for cooling device throttling control represents.
191 for cooling device throttling control represents. 194 RO, Optional
192 RO 195
193 Optional 196cdev[0-*]_trip_point
194 197 The trip point with which cdev[0-*] is associated in this thermal
195cdev[0-*]_trip_point The trip point with which cdev[0-*] is associated in this thermal zone 198 zone; -1 means the cooling device is not associated with any trip
196 -1 means the cooling device is not associated with any trip point. 199 point.
197 RO 200 RO, Optional
198 Optional 201
199 202passive
200****************************** 203 Attribute is only present for zones in which the passive cooling
201* Cooling device attributes * 204 policy is not supported by native thermal driver. Default is zero
202****************************** 205 and can be set to a temperature (in millidegrees) to enable a
203 206 passive trip point for the zone. Activation is done by polling with
204type String which represents the type of device 207 an interval of 1 second.
205 eg: For generic ACPI: this should be "Fan", 208 Unit: millidegrees Celsius
206 "Processor" or "LCD" 209 RW, Optional
207 eg. For memory controller device on intel_menlow platform: 210
208 this should be "Memory controller" 211*****************************
209 RO 212* Cooling device attributes *
210 Required 213*****************************
211 214
212max_state The maximum permissible cooling state of this cooling device. 215type
213 RO 216 String which represents the type of device, e.g:
214 Required 217 - for generic ACPI: should be "Fan", "Processor" or "LCD"
215 218 - for memory controller device on intel_menlow platform:
216cur_state The current cooling state of this cooling device. 219 should be "Memory controller".
217 the value can any integer numbers between 0 and max_state, 220 RO, Required
218 cur_state == 0 means no cooling 221
219 cur_state == max_state means the maximum cooling. 222max_state
220 RW 223 The maximum permissible cooling state of this cooling device.
221 Required 224 RO, Required
225
226cur_state
227 The current cooling state of this cooling device.
228 The value can any integer numbers between 0 and max_state:
229 - cur_state == 0 means no cooling
230 - cur_state == max_state means the maximum cooling.
231 RW, Required
222 232
2233. A simple implementation 2333. A simple implementation
224 234
225ACPI thermal zone may support multiple trip points like critical/hot/passive/active. 235ACPI thermal zone may support multiple trip points like critical, hot,
226If an ACPI thermal zone supports critical, passive, active[0] and active[1] at the same time, 236passive, active. If an ACPI thermal zone supports critical, passive,
227it may register itself as a thermal_zone_device (thermal_zone1) with 4 trip points in all. 237active[0] and active[1] at the same time, it may register itself as a
228It has one processor and one fan, which are both registered as thermal_cooling_device. 238thermal_zone_device (thermal_zone1) with 4 trip points in all.
229If the processor is listed in _PSL method, and the fan is listed in _AL0 method, 239It has one processor and one fan, which are both registered as
230the sys I/F structure will be built like this: 240thermal_cooling_device.
241
242If the processor is listed in _PSL method, and the fan is listed in _AL0
243method, the sys I/F structure will be built like this:
231 244
232/sys/class/thermal: 245/sys/class/thermal:
233 246
234|thermal_zone1: 247|thermal_zone1:
235 |-----type: acpitz 248 |---type: acpitz
236 |-----temp: 37000 249 |---temp: 37000
237 |-----mode: kernel 250 |---mode: kernel
238 |-----trip_point_0_temp: 100000 251 |---trip_point_0_temp: 100000
239 |-----trip_point_0_type: critical 252 |---trip_point_0_type: critical
240 |-----trip_point_1_temp: 80000 253 |---trip_point_1_temp: 80000
241 |-----trip_point_1_type: passive 254 |---trip_point_1_type: passive
242 |-----trip_point_2_temp: 70000 255 |---trip_point_2_temp: 70000
243 |-----trip_point_2_type: active0 256 |---trip_point_2_type: active0
244 |-----trip_point_3_temp: 60000 257 |---trip_point_3_temp: 60000
245 |-----trip_point_3_type: active1 258 |---trip_point_3_type: active1
246 |-----cdev0: --->/sys/class/thermal/cooling_device0 259 |---cdev0: --->/sys/class/thermal/cooling_device0
247 |-----cdev0_trip_point: 1 /* cdev0 can be used for passive */ 260 |---cdev0_trip_point: 1 /* cdev0 can be used for passive */
248 |-----cdev1: --->/sys/class/thermal/cooling_device3 261 |---cdev1: --->/sys/class/thermal/cooling_device3
249 |-----cdev1_trip_point: 2 /* cdev1 can be used for active[0]*/ 262 |---cdev1_trip_point: 2 /* cdev1 can be used for active[0]*/
250 263
251|cooling_device0: 264|cooling_device0:
252 |-----type: Processor 265 |---type: Processor
253 |-----max_state: 8 266 |---max_state: 8
254 |-----cur_state: 0 267 |---cur_state: 0
255 268
256|cooling_device3: 269|cooling_device3:
257 |-----type: Fan 270 |---type: Fan
258 |-----max_state: 2 271 |---max_state: 2
259 |-----cur_state: 0 272 |---cur_state: 0
260 273
261/sys/class/hwmon: 274/sys/class/hwmon:
262 275
263|hwmon0: 276|hwmon0:
264 |-----name: acpitz 277 |---name: acpitz
265 |-----temp1_input: 37000 278 |---temp1_input: 37000
266 |-----temp1_crit: 100000 279 |---temp1_crit: 100000
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index 957b22fde2df..8179692fbb90 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -1231,6 +1231,7 @@ something like this simple program:
1231#include <sys/stat.h> 1231#include <sys/stat.h>
1232#include <fcntl.h> 1232#include <fcntl.h>
1233#include <unistd.h> 1233#include <unistd.h>
1234#include <string.h>
1234 1235
1235#define _STR(x) #x 1236#define _STR(x) #x
1236#define STR(x) _STR(x) 1237#define STR(x) _STR(x)
@@ -1265,6 +1266,7 @@ const char *find_debugfs(void)
1265 return NULL; 1266 return NULL;
1266 } 1267 }
1267 1268
1269 strcat(debugfs, "/tracing/");
1268 debugfs_found = 1; 1270 debugfs_found = 1;
1269 1271
1270 return debugfs; 1272 return debugfs;
diff --git a/Documentation/vm/hwpoison.txt b/Documentation/vm/hwpoison.txt
new file mode 100644
index 000000000000..3ffadf8da61f
--- /dev/null
+++ b/Documentation/vm/hwpoison.txt
@@ -0,0 +1,136 @@
1What is hwpoison?
2
3Upcoming Intel CPUs have support for recovering from some memory errors
4(``MCA recovery''). This requires the OS to declare a page "poisoned",
5kill the processes associated with it and avoid using it in the future.
6
7This patchkit implements the necessary infrastructure in the VM.
8
9To quote the overview comment:
10
11 * High level machine check handler. Handles pages reported by the
12 * hardware as being corrupted usually due to a 2bit ECC memory or cache
13 * failure.
14 *
15 * This focusses on pages detected as corrupted in the background.
16 * When the current CPU tries to consume corruption the currently
17 * running process can just be killed directly instead. This implies
18 * that if the error cannot be handled for some reason it's safe to
19 * just ignore it because no corruption has been consumed yet. Instead
20 * when that happens another machine check will happen.
21 *
22 * Handles page cache pages in various states. The tricky part
23 * here is that we can access any page asynchronous to other VM
24 * users, because memory failures could happen anytime and anywhere,
25 * possibly violating some of their assumptions. This is why this code
26 * has to be extremely careful. Generally it tries to use normal locking
27 * rules, as in get the standard locks, even if that means the
28 * error handling takes potentially a long time.
29 *
30 * Some of the operations here are somewhat inefficient and have non
31 * linear algorithmic complexity, because the data structures have not
32 * been optimized for this case. This is in particular the case
33 * for the mapping from a vma to a process. Since this case is expected
34 * to be rare we hope we can get away with this.
35
36The code consists of a the high level handler in mm/memory-failure.c,
37a new page poison bit and various checks in the VM to handle poisoned
38pages.
39
40The main target right now is KVM guests, but it works for all kinds
41of applications. KVM support requires a recent qemu-kvm release.
42
43For the KVM use there was need for a new signal type so that
44KVM can inject the machine check into the guest with the proper
45address. This in theory allows other applications to handle
46memory failures too. The expection is that near all applications
47won't do that, but some very specialized ones might.
48
49---
50
51There are two (actually three) modi memory failure recovery can be in:
52
53vm.memory_failure_recovery sysctl set to zero:
54 All memory failures cause a panic. Do not attempt recovery.
55 (on x86 this can be also affected by the tolerant level of the
56 MCE subsystem)
57
58early kill
59 (can be controlled globally and per process)
60 Send SIGBUS to the application as soon as the error is detected
61 This allows applications who can process memory errors in a gentle
62 way (e.g. drop affected object)
63 This is the mode used by KVM qemu.
64
65late kill
66 Send SIGBUS when the application runs into the corrupted page.
67 This is best for memory error unaware applications and default
68 Note some pages are always handled as late kill.
69
70---
71
72User control:
73
74vm.memory_failure_recovery
75 See sysctl.txt
76
77vm.memory_failure_early_kill
78 Enable early kill mode globally
79
80PR_MCE_KILL
81 Set early/late kill mode/revert to system default
82 arg1: PR_MCE_KILL_CLEAR: Revert to system default
83 arg1: PR_MCE_KILL_SET: arg2 defines thread specific mode
84 PR_MCE_KILL_EARLY: Early kill
85 PR_MCE_KILL_LATE: Late kill
86 PR_MCE_KILL_DEFAULT: Use system global default
87PR_MCE_KILL_GET
88 return current mode
89
90
91---
92
93Testing:
94
95madvise(MADV_POISON, ....)
96 (as root)
97 Poison a page in the process for testing
98
99
100hwpoison-inject module through debugfs
101 /sys/debug/hwpoison/corrupt-pfn
102
103Inject hwpoison fault at PFN echoed into this file
104
105
106Architecture specific MCE injector
107
108x86 has mce-inject, mce-test
109
110Some portable hwpoison test programs in mce-test, see blow.
111
112---
113
114References:
115
116http://halobates.de/mce-lc09-2.pdf
117 Overview presentation from LinuxCon 09
118
119git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git
120 Test suite (hwpoison specific portable tests in tsrc)
121
122git://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
123 x86 specific injector
124
125
126---
127
128Limitations:
129
130- Not all page types are supported and never will. Most kernel internal
131objects cannot be recovered, only LRU pages for now.
132- Right now hugepage support is missing.
133
134---
135Andi Kleen, Oct 2009
136
diff --git a/Documentation/vm/ksm.txt b/Documentation/vm/ksm.txt
index 72a22f65960e..262d8e6793a3 100644
--- a/Documentation/vm/ksm.txt
+++ b/Documentation/vm/ksm.txt
@@ -52,15 +52,15 @@ The KSM daemon is controlled by sysfs files in /sys/kernel/mm/ksm/,
52readable by all but writable only by root: 52readable by all but writable only by root:
53 53
54max_kernel_pages - set to maximum number of kernel pages that KSM may use 54max_kernel_pages - set to maximum number of kernel pages that KSM may use
55 e.g. "echo 2000 > /sys/kernel/mm/ksm/max_kernel_pages" 55 e.g. "echo 100000 > /sys/kernel/mm/ksm/max_kernel_pages"
56 Value 0 imposes no limit on the kernel pages KSM may use; 56 Value 0 imposes no limit on the kernel pages KSM may use;
57 but note that any process using MADV_MERGEABLE can cause 57 but note that any process using MADV_MERGEABLE can cause
58 KSM to allocate these pages, unswappable until it exits. 58 KSM to allocate these pages, unswappable until it exits.
59 Default: 2000 (chosen for demonstration purposes) 59 Default: quarter of memory (chosen to not pin too much)
60 60
61pages_to_scan - how many present pages to scan before ksmd goes to sleep 61pages_to_scan - how many present pages to scan before ksmd goes to sleep
62 e.g. "echo 200 > /sys/kernel/mm/ksm/pages_to_scan" 62 e.g. "echo 100 > /sys/kernel/mm/ksm/pages_to_scan"
63 Default: 200 (chosen for demonstration purposes) 63 Default: 100 (chosen for demonstration purposes)
64 64
65sleep_millisecs - how many milliseconds ksmd should sleep before next scan 65sleep_millisecs - how many milliseconds ksmd should sleep before next scan
66 e.g. "echo 20 > /sys/kernel/mm/ksm/sleep_millisecs" 66 e.g. "echo 20 > /sys/kernel/mm/ksm/sleep_millisecs"
@@ -70,7 +70,8 @@ run - set 0 to stop ksmd from running but keep merged pages,
70 set 1 to run ksmd e.g. "echo 1 > /sys/kernel/mm/ksm/run", 70 set 1 to run ksmd e.g. "echo 1 > /sys/kernel/mm/ksm/run",
71 set 2 to stop ksmd and unmerge all pages currently merged, 71 set 2 to stop ksmd and unmerge all pages currently merged,
72 but leave mergeable areas registered for next run 72 but leave mergeable areas registered for next run
73 Default: 1 (for immediate use by apps which register) 73 Default: 0 (must be changed to 1 to activate KSM,
74 except if CONFIG_SYSFS is disabled)
74 75
75The effectiveness of KSM and MADV_MERGEABLE is shown in /sys/kernel/mm/ksm/: 76The effectiveness of KSM and MADV_MERGEABLE is shown in /sys/kernel/mm/ksm/:
76 77
@@ -86,4 +87,4 @@ pages_volatile embraces several different kinds of activity, but a high
86proportion there would also indicate poor use of madvise MADV_MERGEABLE. 87proportion there would also indicate poor use of madvise MADV_MERGEABLE.
87 88
88Izik Eidus, 89Izik Eidus,
89Hugh Dickins, 30 July 2009 90Hugh Dickins, 24 Sept 2009
diff --git a/Documentation/vm/page-types.c b/Documentation/vm/page-types.c
index fa1a30d9e9d5..4793c6aac733 100644
--- a/Documentation/vm/page-types.c
+++ b/Documentation/vm/page-types.c
@@ -2,7 +2,10 @@
2 * page-types: Tool for querying page flags 2 * page-types: Tool for querying page flags
3 * 3 *
4 * Copyright (C) 2009 Intel corporation 4 * Copyright (C) 2009 Intel corporation
5 * Copyright (C) 2009 Wu Fengguang <fengguang.wu@intel.com> 5 *
6 * Authors: Wu Fengguang <fengguang.wu@intel.com>
7 *
8 * Released under the General Public License (GPL).
6 */ 9 */
7 10
8#define _LARGEFILE64_SOURCE 11#define _LARGEFILE64_SOURCE
@@ -69,7 +72,9 @@
69#define KPF_COMPOUND_TAIL 16 72#define KPF_COMPOUND_TAIL 16
70#define KPF_HUGE 17 73#define KPF_HUGE 17
71#define KPF_UNEVICTABLE 18 74#define KPF_UNEVICTABLE 18
75#define KPF_HWPOISON 19
72#define KPF_NOPAGE 20 76#define KPF_NOPAGE 20
77#define KPF_KSM 21
73 78
74/* [32-] kernel hacking assistances */ 79/* [32-] kernel hacking assistances */
75#define KPF_RESERVED 32 80#define KPF_RESERVED 32
@@ -116,7 +121,9 @@ static char *page_flag_names[] = {
116 [KPF_COMPOUND_TAIL] = "T:compound_tail", 121 [KPF_COMPOUND_TAIL] = "T:compound_tail",
117 [KPF_HUGE] = "G:huge", 122 [KPF_HUGE] = "G:huge",
118 [KPF_UNEVICTABLE] = "u:unevictable", 123 [KPF_UNEVICTABLE] = "u:unevictable",
124 [KPF_HWPOISON] = "X:hwpoison",
119 [KPF_NOPAGE] = "n:nopage", 125 [KPF_NOPAGE] = "n:nopage",
126 [KPF_KSM] = "x:ksm",
120 127
121 [KPF_RESERVED] = "r:reserved", 128 [KPF_RESERVED] = "r:reserved",
122 [KPF_MLOCKED] = "m:mlocked", 129 [KPF_MLOCKED] = "m:mlocked",
@@ -152,9 +159,6 @@ static unsigned long opt_size[MAX_ADDR_RANGES];
152static int nr_vmas; 159static int nr_vmas;
153static unsigned long pg_start[MAX_VMAS]; 160static unsigned long pg_start[MAX_VMAS];
154static unsigned long pg_end[MAX_VMAS]; 161static unsigned long pg_end[MAX_VMAS];
155static unsigned long voffset;
156
157static int pagemap_fd;
158 162
159#define MAX_BIT_FILTERS 64 163#define MAX_BIT_FILTERS 64
160static int nr_bit_filters; 164static int nr_bit_filters;
@@ -163,9 +167,16 @@ static uint64_t opt_bits[MAX_BIT_FILTERS];
163 167
164static int page_size; 168static int page_size;
165 169
166#define PAGES_BATCH (64 << 10) /* 64k pages */ 170static int pagemap_fd;
167static int kpageflags_fd; 171static int kpageflags_fd;
168 172
173static int opt_hwpoison;
174static int opt_unpoison;
175
176static char *hwpoison_debug_fs = "/debug/hwpoison";
177static int hwpoison_inject_fd;
178static int hwpoison_forget_fd;
179
169#define HASH_SHIFT 13 180#define HASH_SHIFT 13
170#define HASH_SIZE (1 << HASH_SHIFT) 181#define HASH_SIZE (1 << HASH_SHIFT)
171#define HASH_MASK (HASH_SIZE - 1) 182#define HASH_MASK (HASH_SIZE - 1)
@@ -207,6 +218,74 @@ static void fatal(const char *x, ...)
207 exit(EXIT_FAILURE); 218 exit(EXIT_FAILURE);
208} 219}
209 220
221static int checked_open(const char *pathname, int flags)
222{
223 int fd = open(pathname, flags);
224
225 if (fd < 0) {
226 perror(pathname);
227 exit(EXIT_FAILURE);
228 }
229
230 return fd;
231}
232
233/*
234 * pagemap/kpageflags routines
235 */
236
237static unsigned long do_u64_read(int fd, char *name,
238 uint64_t *buf,
239 unsigned long index,
240 unsigned long count)
241{
242 long bytes;
243
244 if (index > ULONG_MAX / 8)
245 fatal("index overflow: %lu\n", index);
246
247 if (lseek(fd, index * 8, SEEK_SET) < 0) {
248 perror(name);
249 exit(EXIT_FAILURE);
250 }
251
252 bytes = read(fd, buf, count * 8);
253 if (bytes < 0) {
254 perror(name);
255 exit(EXIT_FAILURE);
256 }
257 if (bytes % 8)
258 fatal("partial read: %lu bytes\n", bytes);
259
260 return bytes / 8;
261}
262
263static unsigned long kpageflags_read(uint64_t *buf,
264 unsigned long index,
265 unsigned long pages)
266{
267 return do_u64_read(kpageflags_fd, PROC_KPAGEFLAGS, buf, index, pages);
268}
269
270static unsigned long pagemap_read(uint64_t *buf,
271 unsigned long index,
272 unsigned long pages)
273{
274 return do_u64_read(pagemap_fd, "/proc/pid/pagemap", buf, index, pages);
275}
276
277static unsigned long pagemap_pfn(uint64_t val)
278{
279 unsigned long pfn;
280
281 if (val & PM_PRESENT)
282 pfn = PM_PFRAME(val);
283 else
284 pfn = 0;
285
286 return pfn;
287}
288
210 289
211/* 290/*
212 * page flag names 291 * page flag names
@@ -255,7 +334,8 @@ static char *page_flag_longname(uint64_t flags)
255 * page list and summary 334 * page list and summary
256 */ 335 */
257 336
258static void show_page_range(unsigned long offset, uint64_t flags) 337static void show_page_range(unsigned long voffset,
338 unsigned long offset, uint64_t flags)
259{ 339{
260 static uint64_t flags0; 340 static uint64_t flags0;
261 static unsigned long voff; 341 static unsigned long voff;
@@ -281,7 +361,8 @@ static void show_page_range(unsigned long offset, uint64_t flags)
281 count = 1; 361 count = 1;
282} 362}
283 363
284static void show_page(unsigned long offset, uint64_t flags) 364static void show_page(unsigned long voffset,
365 unsigned long offset, uint64_t flags)
285{ 366{
286 if (opt_pid) 367 if (opt_pid)
287 printf("%lx\t", voffset); 368 printf("%lx\t", voffset);
@@ -362,6 +443,62 @@ static uint64_t well_known_flags(uint64_t flags)
362 return flags; 443 return flags;
363} 444}
364 445
446static uint64_t kpageflags_flags(uint64_t flags)
447{
448 flags = expand_overloaded_flags(flags);
449
450 if (!opt_raw)
451 flags = well_known_flags(flags);
452
453 return flags;
454}
455
456/*
457 * page actions
458 */
459
460static void prepare_hwpoison_fd(void)
461{
462 char buf[100];
463
464 if (opt_hwpoison && !hwpoison_inject_fd) {
465 sprintf(buf, "%s/corrupt-pfn", hwpoison_debug_fs);
466 hwpoison_inject_fd = checked_open(buf, O_WRONLY);
467 }
468
469 if (opt_unpoison && !hwpoison_forget_fd) {
470 sprintf(buf, "%s/renew-pfn", hwpoison_debug_fs);
471 hwpoison_forget_fd = checked_open(buf, O_WRONLY);
472 }
473}
474
475static int hwpoison_page(unsigned long offset)
476{
477 char buf[100];
478 int len;
479
480 len = sprintf(buf, "0x%lx\n", offset);
481 len = write(hwpoison_inject_fd, buf, len);
482 if (len < 0) {
483 perror("hwpoison inject");
484 return len;
485 }
486 return 0;
487}
488
489static int unpoison_page(unsigned long offset)
490{
491 char buf[100];
492 int len;
493
494 len = sprintf(buf, "0x%lx\n", offset);
495 len = write(hwpoison_forget_fd, buf, len);
496 if (len < 0) {
497 perror("hwpoison forget");
498 return len;
499 }
500 return 0;
501}
365 502
366/* 503/*
367 * page frame walker 504 * page frame walker
@@ -394,104 +531,83 @@ static int hash_slot(uint64_t flags)
394 exit(EXIT_FAILURE); 531 exit(EXIT_FAILURE);
395} 532}
396 533
397static void add_page(unsigned long offset, uint64_t flags) 534static void add_page(unsigned long voffset,
535 unsigned long offset, uint64_t flags)
398{ 536{
399 flags = expand_overloaded_flags(flags); 537 flags = kpageflags_flags(flags);
400
401 if (!opt_raw)
402 flags = well_known_flags(flags);
403 538
404 if (!bit_mask_ok(flags)) 539 if (!bit_mask_ok(flags))
405 return; 540 return;
406 541
542 if (opt_hwpoison)
543 hwpoison_page(offset);
544 if (opt_unpoison)
545 unpoison_page(offset);
546
407 if (opt_list == 1) 547 if (opt_list == 1)
408 show_page_range(offset, flags); 548 show_page_range(voffset, offset, flags);
409 else if (opt_list == 2) 549 else if (opt_list == 2)
410 show_page(offset, flags); 550 show_page(voffset, offset, flags);
411 551
412 nr_pages[hash_slot(flags)]++; 552 nr_pages[hash_slot(flags)]++;
413 total_pages++; 553 total_pages++;
414} 554}
415 555
416static void walk_pfn(unsigned long index, unsigned long count) 556#define KPAGEFLAGS_BATCH (64 << 10) /* 64k pages */
557static void walk_pfn(unsigned long voffset,
558 unsigned long index,
559 unsigned long count)
417{ 560{
561 uint64_t buf[KPAGEFLAGS_BATCH];
418 unsigned long batch; 562 unsigned long batch;
419 unsigned long n; 563 unsigned long pages;
420 unsigned long i; 564 unsigned long i;
421 565
422 if (index > ULONG_MAX / KPF_BYTES)
423 fatal("index overflow: %lu\n", index);
424
425 lseek(kpageflags_fd, index * KPF_BYTES, SEEK_SET);
426
427 while (count) { 566 while (count) {
428 uint64_t kpageflags_buf[KPF_BYTES * PAGES_BATCH]; 567 batch = min_t(unsigned long, count, KPAGEFLAGS_BATCH);
429 568 pages = kpageflags_read(buf, index, batch);
430 batch = min_t(unsigned long, count, PAGES_BATCH); 569 if (pages == 0)
431 n = read(kpageflags_fd, kpageflags_buf, batch * KPF_BYTES);
432 if (n == 0)
433 break; 570 break;
434 if (n < 0) {
435 perror(PROC_KPAGEFLAGS);
436 exit(EXIT_FAILURE);
437 }
438 571
439 if (n % KPF_BYTES != 0) 572 for (i = 0; i < pages; i++)
440 fatal("partial read: %lu bytes\n", n); 573 add_page(voffset + i, index + i, buf[i]);
441 n = n / KPF_BYTES;
442 574
443 for (i = 0; i < n; i++) 575 index += pages;
444 add_page(index + i, kpageflags_buf[i]); 576 count -= pages;
445
446 index += batch;
447 count -= batch;
448 } 577 }
449} 578}
450 579
451 580#define PAGEMAP_BATCH (64 << 10)
452#define PAGEMAP_BATCH 4096 581static void walk_vma(unsigned long index, unsigned long count)
453static unsigned long task_pfn(unsigned long pgoff)
454{ 582{
455 static uint64_t buf[PAGEMAP_BATCH]; 583 uint64_t buf[PAGEMAP_BATCH];
456 static unsigned long start; 584 unsigned long batch;
457 static long count; 585 unsigned long pages;
458 uint64_t pfn; 586 unsigned long pfn;
587 unsigned long i;
459 588
460 if (pgoff < start || pgoff >= start + count) { 589 while (count) {
461 if (lseek64(pagemap_fd, 590 batch = min_t(unsigned long, count, PAGEMAP_BATCH);
462 (uint64_t)pgoff * PM_ENTRY_BYTES, 591 pages = pagemap_read(buf, index, batch);
463 SEEK_SET) < 0) { 592 if (pages == 0)
464 perror("pagemap seek"); 593 break;
465 exit(EXIT_FAILURE);
466 }
467 count = read(pagemap_fd, buf, sizeof(buf));
468 if (count == 0)
469 return 0;
470 if (count < 0) {
471 perror("pagemap read");
472 exit(EXIT_FAILURE);
473 }
474 if (count % PM_ENTRY_BYTES) {
475 fatal("pagemap read not aligned.\n");
476 exit(EXIT_FAILURE);
477 }
478 count /= PM_ENTRY_BYTES;
479 start = pgoff;
480 }
481 594
482 pfn = buf[pgoff - start]; 595 for (i = 0; i < pages; i++) {
483 if (pfn & PM_PRESENT) 596 pfn = pagemap_pfn(buf[i]);
484 pfn = PM_PFRAME(pfn); 597 if (pfn)
485 else 598 walk_pfn(index + i, pfn, 1);
486 pfn = 0; 599 }
487 600
488 return pfn; 601 index += pages;
602 count -= pages;
603 }
489} 604}
490 605
491static void walk_task(unsigned long index, unsigned long count) 606static void walk_task(unsigned long index, unsigned long count)
492{ 607{
493 int i = 0;
494 const unsigned long end = index + count; 608 const unsigned long end = index + count;
609 unsigned long start;
610 int i = 0;
495 611
496 while (index < end) { 612 while (index < end) {
497 613
@@ -501,15 +617,11 @@ static void walk_task(unsigned long index, unsigned long count)
501 if (pg_start[i] >= end) 617 if (pg_start[i] >= end)
502 return; 618 return;
503 619
504 voffset = max_t(unsigned long, pg_start[i], index); 620 start = max_t(unsigned long, pg_start[i], index);
505 index = min_t(unsigned long, pg_end[i], end); 621 index = min_t(unsigned long, pg_end[i], end);
506 622
507 assert(voffset < index); 623 assert(start < index);
508 for (; voffset < index; voffset++) { 624 walk_vma(start, index - start);
509 unsigned long pfn = task_pfn(voffset);
510 if (pfn)
511 walk_pfn(pfn, 1);
512 }
513 } 625 }
514} 626}
515 627
@@ -527,18 +639,14 @@ static void walk_addr_ranges(void)
527{ 639{
528 int i; 640 int i;
529 641
530 kpageflags_fd = open(PROC_KPAGEFLAGS, O_RDONLY); 642 kpageflags_fd = checked_open(PROC_KPAGEFLAGS, O_RDONLY);
531 if (kpageflags_fd < 0) {
532 perror(PROC_KPAGEFLAGS);
533 exit(EXIT_FAILURE);
534 }
535 643
536 if (!nr_addr_ranges) 644 if (!nr_addr_ranges)
537 add_addr_range(0, ULONG_MAX); 645 add_addr_range(0, ULONG_MAX);
538 646
539 for (i = 0; i < nr_addr_ranges; i++) 647 for (i = 0; i < nr_addr_ranges; i++)
540 if (!opt_pid) 648 if (!opt_pid)
541 walk_pfn(opt_offset[i], opt_size[i]); 649 walk_pfn(0, opt_offset[i], opt_size[i]);
542 else 650 else
543 walk_task(opt_offset[i], opt_size[i]); 651 walk_task(opt_offset[i], opt_size[i]);
544 652
@@ -575,6 +683,8 @@ static void usage(void)
575" -l|--list Show page details in ranges\n" 683" -l|--list Show page details in ranges\n"
576" -L|--list-each Show page details one by one\n" 684" -L|--list-each Show page details one by one\n"
577" -N|--no-summary Don't show summay info\n" 685" -N|--no-summary Don't show summay info\n"
686" -X|--hwpoison hwpoison pages\n"
687" -x|--unpoison unpoison pages\n"
578" -h|--help Show this usage message\n" 688" -h|--help Show this usage message\n"
579"addr-spec:\n" 689"addr-spec:\n"
580" N one page at offset N (unit: pages)\n" 690" N one page at offset N (unit: pages)\n"
@@ -624,11 +734,7 @@ static void parse_pid(const char *str)
624 opt_pid = parse_number(str); 734 opt_pid = parse_number(str);
625 735
626 sprintf(buf, "/proc/%d/pagemap", opt_pid); 736 sprintf(buf, "/proc/%d/pagemap", opt_pid);
627 pagemap_fd = open(buf, O_RDONLY); 737 pagemap_fd = checked_open(buf, O_RDONLY);
628 if (pagemap_fd < 0) {
629 perror(buf);
630 exit(EXIT_FAILURE);
631 }
632 738
633 sprintf(buf, "/proc/%d/maps", opt_pid); 739 sprintf(buf, "/proc/%d/maps", opt_pid);
634 file = fopen(buf, "r"); 740 file = fopen(buf, "r");
@@ -788,6 +894,8 @@ static struct option opts[] = {
788 { "list" , 0, NULL, 'l' }, 894 { "list" , 0, NULL, 'l' },
789 { "list-each" , 0, NULL, 'L' }, 895 { "list-each" , 0, NULL, 'L' },
790 { "no-summary", 0, NULL, 'N' }, 896 { "no-summary", 0, NULL, 'N' },
897 { "hwpoison" , 0, NULL, 'X' },
898 { "unpoison" , 0, NULL, 'x' },
791 { "help" , 0, NULL, 'h' }, 899 { "help" , 0, NULL, 'h' },
792 { NULL , 0, NULL, 0 } 900 { NULL , 0, NULL, 0 }
793}; 901};
@@ -799,7 +907,7 @@ int main(int argc, char *argv[])
799 page_size = getpagesize(); 907 page_size = getpagesize();
800 908
801 while ((c = getopt_long(argc, argv, 909 while ((c = getopt_long(argc, argv,
802 "rp:f:a:b:lLNh", opts, NULL)) != -1) { 910 "rp:f:a:b:lLNXxh", opts, NULL)) != -1) {
803 switch (c) { 911 switch (c) {
804 case 'r': 912 case 'r':
805 opt_raw = 1; 913 opt_raw = 1;
@@ -825,6 +933,14 @@ int main(int argc, char *argv[])
825 case 'N': 933 case 'N':
826 opt_no_summary = 1; 934 opt_no_summary = 1;
827 break; 935 break;
936 case 'X':
937 opt_hwpoison = 1;
938 prepare_hwpoison_fd();
939 break;
940 case 'x':
941 opt_unpoison = 1;
942 prepare_hwpoison_fd();
943 break;
828 case 'h': 944 case 'h':
829 usage(); 945 usage();
830 exit(0); 946 exit(0);
@@ -844,7 +960,7 @@ int main(int argc, char *argv[])
844 walk_addr_ranges(); 960 walk_addr_ranges();
845 961
846 if (opt_list == 1) 962 if (opt_list == 1)
847 show_page_range(0, 0); /* drain the buffer */ 963 show_page_range(0, 0, 0); /* drain the buffer */
848 964
849 if (opt_no_summary) 965 if (opt_no_summary)
850 return 0; 966 return 0;
diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt
index 600a304a828c..df09b9650a81 100644
--- a/Documentation/vm/pagemap.txt
+++ b/Documentation/vm/pagemap.txt
@@ -57,7 +57,9 @@ There are three components to pagemap:
57 16. COMPOUND_TAIL 57 16. COMPOUND_TAIL
58 16. HUGE 58 16. HUGE
59 18. UNEVICTABLE 59 18. UNEVICTABLE
60 19. HWPOISON
60 20. NOPAGE 61 20. NOPAGE
62 21. KSM
61 63
62Short descriptions to the page flags: 64Short descriptions to the page flags:
63 65
@@ -86,9 +88,15 @@ Short descriptions to the page flags:
8617. HUGE 8817. HUGE
87 this is an integral part of a HugeTLB page 89 this is an integral part of a HugeTLB page
88 90
9119. HWPOISON
92 hardware detected memory corruption on this page: don't touch the data!
93
8920. NOPAGE 9420. NOPAGE
90 no page frame exists at the requested address 95 no page frame exists at the requested address
91 96
9721. KSM
98 identical memory pages dynamically shared between one or more processes
99
92 [IO related page flags] 100 [IO related page flags]
93 1. ERROR IO error occurred 101 1. ERROR IO error occurred
94 3. UPTODATE page has up-to-date data 102 3. UPTODATE page has up-to-date data
diff --git a/Documentation/w1/masters/ds2482 b/Documentation/w1/masters/ds2482
index 9210d6fa5024..299b91c7609f 100644
--- a/Documentation/w1/masters/ds2482
+++ b/Documentation/w1/masters/ds2482
@@ -24,8 +24,8 @@ General Remarks
24 24
25Valid addresses are 0x18, 0x19, 0x1a, and 0x1b. 25Valid addresses are 0x18, 0x19, 0x1a, and 0x1b.
26However, the device cannot be detected without writing to the i2c bus, so no 26However, the device cannot be detected without writing to the i2c bus, so no
27detection is done. 27detection is done. You should instantiate the device explicitly.
28You should force the device address.
29 28
30$ modprobe ds2482 force=0,0x18 29$ modprobe ds2482
30$ echo ds2482 0x18 > /sys/bus/i2c/devices/i2c-0/new_device
31 31