14 files changed, 621 insertions, 137 deletions
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mpic-msgr.txt b/Documentation/devicetree/bindings/powerpc/fsl/mpic-msgr.txt
new file mode 100644
index 000000000000..bc8ded641ab6
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/mpic-msgr.txt
@@ -0,0 +1,63 @@
+* FSL MPIC Message Registers
+This binding specifies what properties must be available in the device tree
+representation of the message register blocks found in some FSL MPIC
+implementations.
+Required properties:
+    - compatible: Specifies the compatibility list for the message register
+      block.  The type shall be <string-list> and the value shall be of the form
+      "fsl,mpic-v<version>-msgr", where <version> is the version number of
+      the MPIC containing the message registers.
+    - reg: Specifies the base physical address(s) and size(s) of the
+      message register block's addressable register space.  The type shall be
+      <prop-encoded-array>.
+    - interrupts: Specifies a list of interrupt-specifiers which are available
+      for receiving interrupts. Interrupt-specifier consists of two cells: first
+      cell is interrupt-number and second cell is level-sense. The type shall be
+      <prop-encoded-array>.
+Optional properties:
+    - mpic-msgr-receive-mask: Specifies what registers in the containing block
+      are allowed to receive interrupts. The value is a bit mask where a set
+      bit at bit 'n' indicates that message register 'n' can receive interrupts.
+      Note that "bit 'n'" is numbered from LSB for PPC hardware. The type shall
+      be <u32>. If not present, then all of the message registers in the block
+      are available.
+Aliases:
+    An alias should be created for every message register block.  The binding
+    itself does not require aliases, but a particular implementation of it
+    may require them to be present.  Aliases are of the form
+    'mpic-msgr-block<n>', where <n> is an integer specifying the block's number.
+    Numbers shall start at 0.
+Example:
+        aliases {
+                mpic-msgr-block0 = &mpic_msgr_block0;
+                mpic-msgr-block1 = &mpic_msgr_block1;
+        };
+        mpic_msgr_block0: mpic-msgr-block@41400 {
+                compatible = "fsl,mpic-v3.1-msgr";
+                reg = <0x41400 0x200>;
+                // Message registers 0 and 2 in this block can receive interrupts on
+                // sources 0xb0 and 0xb2, respectively.
+                interrupts = <0xb0 2 0xb2 2>;
+                mpic-msgr-receive-mask = <0x5>;
+        };
+        mpic_msgr_block1: mpic-msgr-block@42400 {
+                compatible = "fsl,mpic-v3.1-msgr";
+                reg = <0x42400 0x200>;
+                // Message registers 0 and 2 in this block can receive interrupts on
+                // sources 0xb4 and 0xb6, respectively.
+                interrupts = <0xb4 2 0xb6 2>;
+                mpic-msgr-receive-mask = <0x5>;
+        };
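Illustrative note, not part of the patch: the receive mask in the example above can be decoded as in the following minimal Python sketch. The function names are hypothetical, and pairing the interrupt specifiers with the receivable registers in ascending register order is an assumption taken from the comments in the example, not something the binding text states explicitly.

```python
def receivable_registers(mask):
    """Return the message register numbers whose bit is set in a u32 mask.

    Bit n is counted from the LSB, as the binding text requires.
    """
    return [n for n in range(32) if mask & (1 << n)]


def pair_interrupts(mask, interrupts):
    """Pair each receivable register with a two-cell interrupt specifier.

    `interrupts` is the flat cell list from the device tree, e.g.
    [0xb0, 2, 0xb2, 2]. Ascending-register pairing is an assumption
    based on the example comments.
    """
    registers = receivable_registers(mask)
    # Each specifier is two cells: interrupt number, then level-sense.
    specs = [tuple(interrupts[i:i + 2]) for i in range(0, len(interrupts), 2)]
    if len(specs) != len(registers):
        raise ValueError("mask and interrupts describe different counts")
    return dict(zip(registers, specs))
```

With the values of mpic_msgr_block0, a mask of 0x5 selects registers 0 and 2, which then map to sources 0xb0 and 0xb2.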
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt b/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt
index 2cf38bd841fd..dc5744636a57 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt
+++ b/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt
@@ -56,7 +56,27 @@ PROPERTIES
          to the client.  The presence of this property also mandates
          that any initialization related to interrupt sources shall
          be limited to sources explicitly referenced in the device tree.
-       
+  - big-endian
+      Usage: optional
+      Value type: <empty>
+          If present the MPIC will be assumed to be big-endian.  Some
+          device-trees omit this property on MPIC nodes even when the MPIC is
+          in fact big-endian, so certain boards override this property.
+  - single-cpu-affinity
+      Usage: optional
+      Value type: <empty>
+          If present the MPIC will be assumed to only be able to route
+          non-IPI interrupts to a single CPU at a time (e.g. Freescale MPIC).
+  - last-interrupt-source
+      Usage: optional
+      Value type: <u32>
+          Some MPICs do not correctly report the number of hardware sources
+          in the global feature registers.  If specified, this field will
+          override the value read from MPIC_GREG_FEATURE_LAST_SRC.
 INTERRUPT SPECIFIER DEFINITION
  Interrupt specifiers consists of 4 cells encoded as
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt b/Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt
index 5d586e1ccaf5..5693877ab377 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt
+++ b/Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt
@@ -6,8 +6,10 @@ Required properties:
  etc.) and the second is "fsl,mpic-msi" or "fsl,ipic-msi" depending on
  the parent type.
- reg : should contain the address and the length of the shared message
+- reg : It may contain one or two regions. The first region should contain
-  interrupt register set.
+  the address and the length of the shared message interrupt register set.
+  The second region should contain the address of aliased MSIIR register for
+  platforms that have such an alias.
 - msi-available-ranges: use <start count> style section to define which
  msi interrupt can be used in the 256 msi interrupts. This property is
diff --git a/Documentation/filesystems/debugfs.txt b/Documentation/filesystems/debugfs.txt
index 4e2575873187..7a34f827989c 100644
--- a/Documentation/filesystems/debugfs.txt
+++ b/Documentation/filesystems/debugfs.txt
@@ -136,7 +136,7 @@ file.
        void __iomem *base;
    };
-    struct dentry *debugfs_create_regset32(const char *name, mode_t mode,
+    struct dentry *debugfs_create_regset32(const char *name, umode_t mode,
                                     struct dentry *parent,
                                     struct debugfs_regset32 *regset);
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index b4a3d765ff9a..74acd9618819 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -429,3 +429,9 @@ filemap_write_and_wait_range() so that all dirty pages are synced out properly.
 You must also keep in mind that ->fsync() is not called with i_mutex held
 anymore, so if you require i_mutex locking you must make sure to take it and
 release it yourself.
+--
+[mandatory]
+        d_alloc_root() is gone, along with a lot of bugs caused by code
+misusing it.  Replacement: d_make_root(inode).  The difference is,
+d_make_root() drops the reference to inode if dentry allocation fails.  
diff --git a/Documentation/filesystems/qnx6.txt b/Documentation/filesystems/qnx6.txt
new file mode 100644
index 000000000000..050223ea03c7
--- /dev/null
+++ b/Documentation/filesystems/qnx6.txt
@@ -0,0 +1,174 @@
+The QNX6 Filesystem
+===================
+The qnx6fs is used by newer QNX operating system versions (e.g. Neutrino).
+It was introduced in QNX 6.4.0 and has been the default since 6.4.1.
+Option
+======
+mmi_fs          Mount filesystem as used for example by Audi MMI 3G system
+Specification
+=============
+qnx6fs shares many properties with traditional Unix filesystems. It has the
+concepts of blocks, inodes and directories.
+On QNX it is possible to create little endian and big endian qnx6 filesystems.
+This makes it possible to create a filesystem on a host of one endianness
+for a target platform of a different endianness (QNX is used on quite a
+range of embedded systems).
+The Linux driver handles both endiannesses (LE and BE) transparently.
+Blocks
+------
+The space in the device or file is split up into blocks. These are a fixed
+size of 512, 1024, 2048 or 4096, which is decided when the filesystem is
+created.
+Block pointers are 32 bits wide, so the maximum space that can be addressed
+is 2^32 * 4096 bytes, or 16TB.
+The superblocks
+---------------
+The superblock contains all global information about the filesystem.
+Each qnx6fs has two superblocks, each one having a 64bit serial number.
+That serial number is used to identify the "active" superblock.
+In write mode, with each new snapshot (after each synchronous write), the
+serial of the new master superblock is increased (old superblock serial + 1).
+So basically the snapshot functionality is realized by an atomic final
+update of the serial number. Before updating that serial, all modifications
+are done by copying all modified blocks during that specific write request
+(or period) and building up a new (stable) filesystem structure under the
+inactive superblock.
+Each superblock holds a set of root inodes for the different filesystem
+parts (Inode, Bitmap and Longfilenames).
+Each of these root nodes holds information like total size of the stored
+data and the addressing levels in that specific tree.
+If the level value is 0, up to 16 direct blocks can be addressed by each
+node.
+Level 1 adds an additional indirect addressing level where each indirect
+addressing block holds up to blocksize / 4 four-byte pointers to data blocks.
+Level 2 adds an additional indirect addressing block level (so, already up
+to 16 * 256 * 256 = 1048576 blocks that can be addressed by such a tree).
+Unused block pointers are always set to ~0 - regardless of root node,
+indirect addressing blocks or inodes.
+Data leaves are always on the lowest level. So no data is stored on upper
+tree levels.
+The first Superblock is located at 0x2000. (0x2000 is the bootblock size)
+The Audi MMI 3G first superblock directly starts at byte 0.
+Second superblock position can either be calculated from the superblock
+information (total number of filesystem blocks) or by taking the highest
+device address, zeroing the last 3 bytes and then subtracting 0x1000 from
+that address.
+0x1000 is the size reserved for each superblock - regardless of the
+blocksize of the filesystem.
+Inodes
+------
+Each object in the filesystem is represented by an inode. (index node)
+The inode structure contains pointers to the filesystem blocks which contain
+the data held in the object and all of the metadata about an object except
+its longname. (filenames longer than 27 characters)
+The metadata about an object includes the permissions, owner, group, flags,
+size, number of blocks used, access time, change time and modification time.
+Object mode field is POSIX format. (which makes things easier)
+There are also pointers to the first 16 blocks, if the object data can be
+addressed with 16 direct blocks.
+For more than 16 blocks, indirect addressing in the form of another tree is
+used. (scheme is the same as the one used for the superblock root nodes)
+The file size is stored as a 64bit value. Inode counting starts with 1
+(whilst long filename inodes start with 0).
+Directories
+-----------
+A directory is a filesystem object and has an inode just like a file.
+It is a specially formatted file containing records which associate each
+name with an inode number.
+'.' inode number points to the directory inode
+'..' inode number points to the parent directory inode
+Each filename record additionally has a filename length field.
+One special case is long filenames or subdirectory names.
+These have a filename length field of 0xff in the corresponding directory
+record, plus the longfile inode number also stored in that record.
+With that longfilename inode number, the longfilename tree can be walked
+starting with the superblock longfilename root node pointers.
+Special files
+-------------
+Symbolic links are also filesystem objects with inodes. They have a specific
+bit in the inode mode field identifying them as symbolic links.
+The directory entry file inode pointer points to the target file inode.
+Hard links have an inode and a directory entry, but with a specific mode bit
+set, no block pointers, and the directory file record pointing to the target
+file inode.
+Character and block special devices do not exist in QNX as those files
+are handled by the QNX kernel/drivers and created in /dev independent of the
+underlying filesystem.
+Long filenames
+--------------
+Long filenames are stored in a separate addressing tree. The starting point
+is the longfilename root node in the active superblock.
+Each data block (tree leaf) holds one long filename. That filename is
+limited to 510 bytes. The first two bytes are used as a length field
+for the actual filename.
+Since that structure has to fit into every allowed blocksize, including the
+smallest one (512 bytes), the stored filename is limited to 510 bytes.
+Bitmap
+------
+The qnx6fs filesystem allocation bitmap is stored in a tree under bitmap
+root node in the superblock and each bit in the bitmap represents one
+filesystem block.
+The first block is block 0, which starts 0x1000 after superblock start.
+So for a normal qnx6fs 0x3000 (bootblock + superblock) is the physical
+address at which block 0 is located.
+Bits at the end of the last bitmap block are set to 1, if the device is
+smaller than addressing space in the bitmap.
+Bitmap system area
+------------------
+The bitmap itself is divided into three parts.
+First the system area, which is split into two halves.
+Then userspace.
+The requirement for a static, fixed preallocated system area comes from how
+qnx6fs deals with writes.
+Each superblock has its own half of the system area. So superblock #1
+always uses blocks from the lower half whilst superblock #2 just writes to
+blocks represented by the upper half bitmap system area bits.
+Bitmap blocks, Inode blocks and indirect addressing blocks for those two
+tree structures are treated as system blocks.
+The rationale behind that is that a write request can work on a new snapshot
+(system area of the inactive - resp. lower serial numbered superblock) while
+at the same time there is still a complete stable filesystem structure in the
+other half of the system area.
+When writing is finished (a sync write is completed, the maximum sync leap
+time is reached or a filesystem sync is requested), the serial of the
+previously inactive superblock is atomically increased and the fs switches
+over to that superblock, which is then declared stable.
+For all data outside the system area, blocks are just copied while writing.
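Illustrative note, not part of the patch: the addressing arithmetic described in qnx6.txt can be checked with the following minimal Python sketch. The function and constant names are hypothetical. The sketch assumes that filesystem blocks are laid out contiguously after block 0, which the text implies but does not state outright.

```python
POINTERS_PER_NODE = 16    # direct block pointers in a root node or inode
POINTER_SIZE = 4          # block pointers are 32 bits wide
SUPERBLOCK_AREA = 0x1000  # space reserved for each superblock


def max_tree_blocks(blocksize, levels):
    """Number of data blocks addressable by a tree with `levels` indirect levels."""
    pointers_per_block = blocksize // POINTER_SIZE
    return POINTERS_PER_NODE * pointers_per_block ** levels


def block_offset(block, blocksize, superblock_start=0x2000):
    """Byte offset of a filesystem block on the device.

    Block 0 starts 0x1000 after the start of the first superblock.
    superblock_start is 0x2000 for a normal qnx6fs and 0 for the
    Audi MMI 3G layout. Contiguous block layout is an assumption.
    """
    return superblock_start + SUPERBLOCK_AREA + block * blocksize


# Maximum addressable space with 32-bit block pointers and 4096-byte blocks.
MAX_FS_BYTES = 2 ** 32 * 4096
```

The 16 * 256 * 256 figure quoted for level 2 corresponds to 1024-byte blocks, since 1024 / 4 = 256 pointers fit into one indirect block.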
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 68fbfb6529eb..3b7488fc3373 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -218,6 +218,7 @@ Code  Seq#(hex)	Include File		Comments
 'h'     00-7F                           conflict! Charon filesystem
                                        <mailto:zapman@interlan.net>
 'h'     00-1F   linux/hpet.h            conflict!
+'h'     80-8F   fs/hfsplus/ioctl.c
 'i'     00-3F   linux/i2o-dev.h         conflict!
 'i'     0B-1F   linux/ipmi.h            conflict!
 'i'     80-8F   linux/i8k.h
diff --git a/Documentation/networking/dns_resolver.txt b/Documentation/networking/dns_resolver.txt
index 7f531ad83285..d86adcdae420 100644
--- a/Documentation/networking/dns_resolver.txt
+++ b/Documentation/networking/dns_resolver.txt
@@ -102,6 +102,10 @@ implemented in the module can be called after doing:
     If _expiry is non-NULL, the expiry time (TTL) of the result will be
     returned also.
+The kernel maintains an internal keyring in which it caches looked up keys.
+This can be cleared by any process that has the CAP_SYS_ADMIN capability by
+the use of KEYCTL_KEYRING_CLEAR on the keyring ID.
 ===============================
 READING DNS KEYS FROM USERSPACE
diff --git a/Documentation/powerpc/firmware-assisted-dump.txt b/Documentation/powerpc/firmware-assisted-dump.txt
new file mode 100644
index 000000000000..3007bc98af28
--- /dev/null
+++ b/Documentation/powerpc/firmware-assisted-dump.txt
@@ -0,0 +1,270 @@
+                   Firmware-Assisted Dump
+                   ------------------------
+                       July 2011
+The goal of firmware-assisted dump is to enable the dump of
+a crashed system, and to do so from a fully-reset system, and
+to minimize the total elapsed time until the system is back
+in production use.
+- Firmware assisted dump (fadump) infrastructure is intended to replace
+  the existing phyp assisted dump.
+- Fadump uses the same firmware interfaces and memory reservation model
+  as phyp assisted dump.
+- Unlike phyp dump, fadump exports the memory dump through /proc/vmcore
+  in the ELF format in the same way as kdump. This helps us reuse the
+  kdump infrastructure for dump capture and filtering.
+- Unlike phyp dump, the userspace tool does not need to refer to any sysfs
+  interface while reading /proc/vmcore.
+- Unlike phyp dump, fadump allows user to release all the memory reserved
+  for dump, with a single operation of echo 1 > /sys/kernel/fadump_release_mem.
+- Once enabled through kernel boot parameter, fadump can be
+  started/stopped through /sys/kernel/fadump_registered interface (see
+  sysfs files section below) and can be easily integrated with kdump
+  service start/stop init scripts.
+Compared with kdump or other strategies, firmware-assisted
+dump offers several strong, practical advantages:
+-- Unlike kdump, the system has been reset, and loaded
+   with a fresh copy of the kernel.  In particular,
+   PCI and I/O devices have been reinitialized and are
+   in a clean, consistent state.
+-- Once the dump is copied out, the memory that held the dump
+   is immediately available to the running kernel. Therefore,
+   unlike kdump, fadump doesn't need a 2nd reboot to get the
+   system back to the production configuration.
+The above can only be accomplished by coordination with,
+and assistance from the Power firmware. The procedure is
+as follows:
+-- The first kernel registers the sections of memory with the
+   Power firmware for dump preservation during OS initialization.
+   These registered sections of memory are reserved by the first
+   kernel during early boot.
+-- When a system crashes, the Power firmware will save
+   the low memory (boot memory, whose size is the larger of 5% of
+   system RAM or 256MB) to the previously registered region. It will
+   also save system registers, and hardware PTE's.
+   NOTE: The term 'boot memory' means size of the low memory chunk
+         that is required for a kernel to boot successfully when
+         booted with restricted memory. By default, the boot memory
+         size will be the larger of 5% of system RAM or 256MB.
+         Alternatively, user can also specify boot memory size
+         through boot parameter 'fadump_reserve_mem=' which will
+         override the default calculated size. Use this option
+         if default boot memory size is not sufficient for second
+         kernel to boot successfully.
+-- After the low memory (boot memory) area has been saved, the
+   firmware will reset PCI and other hardware state.  It will
+   *not* clear the RAM. It will then launch the bootloader, as
+   normal.
+-- The freshly booted kernel will notice that there is a new
+   node (ibm,dump-kernel) in the device tree, indicating that
+   there is crash data available from a previous boot. During
+   early boot, the OS will reserve the rest of the memory above
+   boot memory size, effectively booting with restricted memory
+   size. This will make sure that the second kernel will not
+   touch any of the dump memory area.
+-- User-space tools will read /proc/vmcore to obtain the contents
+   of memory, which holds the previous crashed kernel dump in ELF
+   format. The userspace tools may copy this info to disk, or
+   network, nas, san, iscsi, etc. as desired.
+-- Once the userspace tool is done saving dump, it will echo
+   '1' to /sys/kernel/fadump_release_mem to release the reserved
+   memory back to general use, except the memory required for
+   next firmware-assisted dump registration.
+   e.g.
+     # echo 1 > /sys/kernel/fadump_release_mem
+Please note that the firmware-assisted dump feature
+is only available on Power6 and above systems with recent
+firmware versions.
+Implementation details:
+----------------------
+During boot, a check is made to see if firmware supports
+this feature on that particular machine. If it does, then
+we check to see if an active dump is waiting for us. If yes
+then everything but boot memory size of RAM is reserved during
+early boot (See Fig. 2). This area is released once we finish
+collecting the dump from user land scripts (e.g. kdump scripts)
+that are run. If there is dump data, then the
+/sys/kernel/fadump_release_mem file is created, and the reserved
+memory is held.
+If there is no waiting dump data, then only the memory required
+to hold CPU state, HPTE region, boot memory dump and elfcore
+header, is reserved at the top of memory (see Fig. 1). This area
+is *not* released: this region will be kept permanently reserved,
+so that it can act as a receptacle for a copy of the boot memory
+content in addition to CPU state and HPTE region, in the case a
+crash does occur.
+  o Memory Reservation during first kernel
+  Low memory                                        Top of memory
+  0      boot memory size                                       |
+  |           |                       |<--Reserved dump area -->|
+  V           V                       |   Permanent Reservation V
+  +-----------+----------/ /----------+---+----+-----------+----+
+  |           |                       |CPU|HPTE|  DUMP     |ELF |
+  +-----------+----------/ /----------+---+----+-----------+----+
+        |                                           ^
+        |                                           |
+        \                                           /
+         -------------------------------------------
+          Boot memory content gets transferred to
+          reserved area by firmware at the time of
+          crash
+                   Fig. 1
+  o Memory Reservation during second kernel after crash
+  Low memory                                        Top of memory
+  0      boot memory size                                       |
+  |           |<------------- Reserved dump area ----------- -->|
+  V           V                                                 V
+  +-----------+----------/ /----------+---+----+-----------+----+
+  |           |                       |CPU|HPTE|  DUMP     |ELF |
+  +-----------+----------/ /----------+---+----+-----------+----+
+        |                                                    |
+        V                                                    V
+   Used by second                                    /proc/vmcore
+   kernel to boot
+                   Fig. 2
+Currently the dump will be copied from /proc/vmcore to
+a new file upon user intervention. The dump data available through
+/proc/vmcore will be in ELF format. Hence the existing kdump
+infrastructure (kdump scripts) to save the dump works fine with
+minor modifications.
+The tools to examine the dump will be same as the ones
+used for kdump.
+How to enable firmware-assisted dump (fadump):
+-------------------------------------
+1. Set config option CONFIG_FA_DUMP=y and build kernel.
+2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
+3. Optionally, user can also set 'fadump_reserve_mem=' kernel cmdline
+   to specify size of the memory to reserve for boot memory dump
+   preservation.
+NOTE: If firmware-assisted dump fails to reserve memory then it will
+   fall back to the existing kdump mechanism if 'crashkernel=' option
+   is set at kernel cmdline.
+Sysfs/debugfs files:
+------------
+Firmware-assisted dump feature uses sysfs file system to hold
+the control files and debugfs file to display memory reserved region.
+Here is the list of files under kernel sysfs:
+ /sys/kernel/fadump_enabled
+    This is used to display the fadump status.
+    0 = fadump is disabled
+    1 = fadump is enabled
+    This interface can be used by kdump init scripts to identify if
+    fadump is enabled in the kernel and act accordingly.
+ /sys/kernel/fadump_registered
+    This is used to display the fadump registration status as well
+    as to control (start/stop) the fadump registration.
+    0 = fadump is not registered.
+    1 = fadump is registered and ready to handle system crash.
+    To register fadump, echo 1 > /sys/kernel/fadump_registered; to
+    un-register and stop fadump, echo 0 > /sys/kernel/fadump_registered.
+    Once fadump is un-registered, a system crash will not
+    be handled and vmcore will not be captured. This interface can be
+    easily integrated with kdump service start/stop.
+ /sys/kernel/fadump_release_mem
+    This file is available only when fadump is active during the
+    second kernel. It is used to release the reserved memory
+    regions that are held for saving the crash dump. To release the
+    reserved memory echo 1 to it:
+    echo 1  > /sys/kernel/fadump_release_mem
+    After echo 1, the content of the /sys/kernel/debug/powerpc/fadump_region
+    file will change to reflect the new memory reservations.
+    The existing userspace tools (kdump infrastructure) can be easily
+    enhanced to use this interface to release the memory reserved for
+    dump and continue without 2nd reboot.
+Here is the list of files under powerpc debugfs:
+(Assuming debugfs is mounted on /sys/kernel/debug directory.)
+ /sys/kernel/debug/powerpc/fadump_region
+    This file shows the reserved memory regions if fadump is
+    enabled; otherwise this file is empty. The output format
+    is:
+    <region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
+    e.g.
+    Contents when fadump is registered during first kernel
+    # cat /sys/kernel/debug/powerpc/fadump_region
+    CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
+    HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
+    DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
+    Contents when fadump is active during second kernel
+    # cat /sys/kernel/debug/powerpc/fadump_region
+    CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
+    HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x1000
+    DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
+        : [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
+NOTE: Please refer to Documentation/filesystems/debugfs.txt on
+      how to mount the debugfs filesystem.
+TODO:
+-----
+ o Need to come up with a better approach to find out a more
+   accurate boot memory size that is required for a kernel to
+   boot successfully when booted with restricted memory.
+ o The fadump implementation introduces a fadump crash info structure
+   in the scratch area before the ELF core header. The idea of introducing
+   this structure is to pass some important crash info data to the second
+   kernel which will help second kernel to populate ELF core header with
+   correct data before it gets exported through /proc/vmcore. The current
+   design implementation does not address a possibility of introducing
+   additional fields (in future) to this structure without affecting
+   compatibility. Need to come up with a better approach to address this.
+   The possible approaches are:
+        1. Introduce version field for version tracking, bump up the version
+        whenever a new field is added to the structure in future. The version
+        field can be used to find out what fields are valid for the current
+        version of the structure.
+        2. Reserve the area of predefined size (say PAGE_SIZE) for this
+        structure and have unused area as reserved (initialized to zero)
+        for future field additions.
+   The advantage of approach 1 over 2 is we don't need to reserve extra space.
+---
+Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
+This document is based on the original documentation written for phyp
+assisted dump by Linas Vepstas and Manish Ahuja.
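Illustrative note, not part of the patch: the fadump_region output format and the default boot memory size rule documented above can be exercised with the following minimal Python sketch. The function names are hypothetical, the regular expression is derived only from the sample output in the document, and treating "256MB" as 256 * 1024 * 1024 bytes is an assumption. The kernel may round or align the computed size differently.

```python
import re

# One line of /sys/kernel/debug/powerpc/fadump_region, per the documented
# format: <region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
REGION_RE = re.compile(
    r"^\s*(?P<name>\w*)\s*:\s*"
    r"\[(?P<start>0x[0-9a-fA-F]+)-(?P<end>0x[0-9a-fA-F]+)\]\s+"
    r"(?P<size>0x[0-9a-fA-F]+) bytes, Dumped: (?P<dumped>0x[0-9a-fA-F]+)\s*$"
)


def parse_fadump_region(text):
    """Parse fadump_region output into a list of dicts; skip other lines."""
    regions = []
    for line in text.splitlines():
        match = REGION_RE.match(line)
        if match is None:
            continue
        regions.append({
            "name": match.group("name"),  # empty for the unnamed region
            "start": int(match.group("start"), 16),
            "end": int(match.group("end"), 16),
            "size": int(match.group("size"), 16),
            "dumped": int(match.group("dumped"), 16),
        })
    return regions


def default_boot_memory_size(system_ram_bytes):
    """Larger of 5% of system RAM or 256MB (taken here as 256 MiB)."""
    return max(system_ram_bytes * 5 // 100, 256 * 1024 * 1024)
```

In the sample output, each region's end address is inclusive, so end - start + 1 equals the reported reserved size.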
diff --git a/Documentation/powerpc/mpc52xx.txt b/Documentation/powerpc/mpc52xx.txt
index 10dd4ab93b85..0d540a31ea1a 100644
--- a/Documentation/powerpc/mpc52xx.txt
+++ b/Documentation/powerpc/mpc52xx.txt
@@ -2,7 +2,7 @@ Linux 2.6.x on MPC52xx family
 -----------------------------
 For the latest info, go to http://www.246tNt.com/mpc52xx/
- 
 To compile/use :
  - U-Boot:
@@ -10,23 +10,23 @@ To compile/use :
        if you wish to ).
     # make lite5200_defconfig
     # make uImage
-    
     then, on U-boot:
     => tftpboot 200000 uImage
     => tftpboot 400000 pRamdisk
     => bootm 200000 400000
-    
  - DBug:
     # <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
        if you wish to ).
     # make lite5200_defconfig
     # cp your_initrd.gz arch/ppc/boot/images/ramdisk.image.gz
-     # make zImage.initrd 
+     # make zImage.initrd
-     # make 
+     # make
     then in DBug:
     DBug> dn -i zImage.initrd.lite5200
-     
 Some remarks :
 - The port is named mpc52xxx, and config options are PPC_MPC52xx. The MGT5100
diff --git a/Documentation/powerpc/phyp-assisted-dump.txt b/Documentation/powerpc/phyp-assisted-dump.txt
deleted file mode 100644
index ad340205d96a..000000000000
--- a/Documentation/powerpc/phyp-assisted-dump.txt
+++ /dev/null
@@ -1,127 +0,0 @@
-                   Hypervisor-Assisted Dump
-                   ------------------------
-                       November 2007
-The goal of hypervisor-assisted dump is to enable the dump of
-a crashed system, and to do so from a fully-reset system, and
-to minimize the total elapsed time until the system is back
-in production use.
-As compared to kdump or other strategies, hypervisor-assisted
-dump offers several strong, practical advantages:
-- Unlike kdump, the system has been reset, and loaded
-   with a fresh copy of the kernel.  In particular,
-   PCI and I/O devices have been reinitialized and are
-   in a clean, consistent state.
-- As the dump is performed, the dumped memory becomes
-   immediately available to the system for normal use.
-- After the dump is completed, no further reboots are
-   required; the system will be fully usable, and running
-   in its normal, production mode on its normal kernel.
-The above can only be accomplished by coordination with,
-and assistance from the hypervisor. The procedure is
-as follows:
-- When a system crashes, the hypervisor will save
-   the low 256MB of RAM to a previously registered
-   save region. It will also save system state, system
-   registers, and hardware PTE's.
-- After the low 256MB area has been saved, the
-   hypervisor will reset PCI and other hardware state.
-   It will *not* clear RAM. It will then launch the
-   bootloader, as normal.
-- The freshly booted kernel will notice that there
-   is a new node (ibm,dump-kernel) in the device tree,
-   indicating that there is crash data available from
-   a previous boot. It will boot into only 256MB of RAM,
-   reserving the rest of system memory.
-- Userspace tools will parse /sys/kernel/release_region
-   and read /proc/vmcore to obtain the contents of memory,
-   which holds the previous crashed kernel. The userspace
-   tools may copy this info to disk, or network, nas, san,
-   iscsi, etc. as desired.
-   For Example: the values in /sys/kernel/release-region
-   would look something like this (address-range pairs).
-   CPU:0x177fee000-0x10000: HPTE:0x177ffe020-0x1000: /
-   DUMP:0x177fff020-0x10000000, 0x10000000-0x16F1D370A
-- As the userspace tools complete saving a portion of
-   dump, they echo an offset and size to
-   /sys/kernel/release_region to release the reserved
-   memory back to general use.
-   An example of this is:
-     "echo 0x40000000 0x10000000 > /sys/kernel/release_region"
-   which will release 256MB at the 1GB boundary.
-Please note that the hypervisor-assisted dump feature
-is only available on Power6-based systems with recent
-firmware versions.
-Implementation details:
----------------------
-During boot, a check is made to see if firmware supports
-this feature on this particular machine. If it does, then
-we check to see if a active dump is waiting for us. If yes
-then everything but 256 MB of RAM is reserved during early
-boot. This area is released once we collect a dump from user
-land scripts that are run. If there is dump data, then
-the /sys/kernel/release_region file is created, and
-the reserved memory is held.
-If there is no waiting dump data, then only the highest
-256MB of the ram is reserved as a scratch area. This area
-is *not* released: this region will be kept permanently
-reserved, so that it can act as a receptacle for a copy
-of the low 256MB in the case a crash does occur. See,
-however, "open issues" below, as to whether
-such a reserved region is really needed.
-Currently the dump will be copied from /proc/vmcore to a
-a new file upon user intervention. The starting address
-to be read and the range for each data point in provided
-in /sys/kernel/release_region.
-The tools to examine the dump will be same as the ones
-used for kdump.
-General notes:
--------------
-Security: please note that there are potential security issues
-with any sort of dump mechanism. In particular, plaintext
-(unencrypted) data, and possibly passwords, may be present in
-the dump data. Userspace tools must take adequate precautions to
-preserve security.
-Open issues/ToDo:
------------
- o The various code paths that tell the hypervisor that a crash
-   occurred, vs. it simply being a normal reboot, should be
-   reviewed, and possibly clarified/fixed.
- o Instead of using /sys/kernel, should there be a /sys/dump
-   instead? There is a dump_subsys being created by the s390 code,
-   perhaps the pseries code should use a similar layout as well.
- o Is reserving a 256MB region really required? The goal of
-   reserving a 256MB scratch area is to make sure that no
-   important crash data is clobbered when the hypervisor
-   save low mem to the scratch area. But, if one could assure
-   that nothing important is located in some 256MB area, then
-   it would not need to be reserved. Something that can be
-   improved in subsequent versions.
- o Still working the kdump team to integrate this with kdump,
-   some work remains but this would not affect the current
-   patches.
- o Still need to write a shell script, to copy the dump away.
-   Currently I am parsing it manually.
diff --git a/Documentation/security/00-INDEX b/Documentation/security/00-INDEX
index 99b85d39751c..eeed1de546d4 100644
--- a/Documentation/security/00-INDEX
+++ b/Documentation/security/00-INDEX
@@ -6,6 +6,8 @@ SELinux.txt
        - how to get started with the SELinux security enhancement.
 Smack.txt
        - documentation on the Smack Linux Security Module.
+Yama.txt
+        - documentation on the Yama Linux Security Module.
 apparmor.txt
        - documentation on the AppArmor security extension.
 credentials.txt
diff --git a/Documentation/security/Yama.txt b/Documentation/security/Yama.txt
new file mode 100644
index 000000000000..a9511f179069
--- /dev/null
+++ b/Documentation/security/Yama.txt
@@ -0,0 +1,65 @@
+Yama is a Linux Security Module that collects a number of system-wide DAC
+security protections that are not handled by the core kernel itself. To
+select it at boot time, specify "security=yama" (though this will disable
+any other LSM).
+Yama is controlled through sysctl in /proc/sys/kernel/yama:
+- ptrace_scope
+==============================================================
+ptrace_scope:
+As Linux grows in popularity, it will become a larger target for
+malware. One particularly troubling weakness of the Linux process
+interfaces is that a single user is able to examine the memory and
+running state of any of their processes. For example, if one application
+(e.g. Pidgin) was compromised, it would be possible for an attacker to
+attach to other running processes (e.g. Firefox, SSH sessions, GPG agent,
+etc.) to extract additional credentials and continue to expand the scope
+of their attack without resorting to user-assisted phishing.
+This is not a theoretical problem. SSH session hijacking
+(http://www.storm.net.nz/projects/7) and arbitrary code injection
+(http://c-skills.blogspot.com/2007/05/injectso.html) attacks already
+exist and remain possible if ptrace is allowed to operate as before.
+Since ptrace is not commonly used by non-developers and non-admins, system
+builders should be allowed the option to disable this debugging system.
+For a solution, some applications use prctl(PR_SET_DUMPABLE, ...) to
+specifically disallow such ptrace attachment (e.g. ssh-agent), but many
+do not. A more general solution is to only allow ptrace directly from a
+parent to a child process (i.e. direct "gdb EXE" and "strace EXE" still
+work), or with CAP_SYS_PTRACE (i.e. "gdb --pid=PID", and "strace -p PID"
+still work as root).
+For software that has defined application-specific relationships
+between a debugging process and its inferior (crash handlers, etc.),
+prctl(PR_SET_PTRACER, pid, ...) can be used. An inferior can declare which
+other process (and its descendants) is allowed to call PTRACE_ATTACH
+against it. Only one such declared debugging process can exist for
+each inferior at a time. For example, this is used by KDE, Chromium, and
+Firefox's crash handlers, and by Wine for allowing only Wine processes
+to ptrace each other. If a process wishes to entirely disable these ptrace
+restrictions, it can call prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, ...)
+so that any otherwise allowed process (even those in external pid namespaces)
+may attach.
+The sysctl settings are:
+0 - classic ptrace permissions: a process can PTRACE_ATTACH to any other
+    process running under the same uid, as long as it is dumpable (i.e.
+    did not transition uids, start privileged, or call
+    prctl(PR_SET_DUMPABLE...) already).
+1 - restricted ptrace: a process must have a predefined relationship
+    with the inferior it wants to call PTRACE_ATTACH on. By default,
+    this relationship is that of only its descendants when the above
+    classic criteria are also met. To change the relationship, an
+    inferior can call prctl(PR_SET_PTRACER, debugger, ...) to declare
+    an allowed debugger PID to call PTRACE_ATTACH on the inferior.
+The original children-only logic was based on the restrictions in grsecurity.
+==============================================================
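Review note (not part of the patch): the PR_SET_PTRACER usage described in Yama.txt above can be sketched roughly as follows. This is a minimal illustration, not code from the patch. The helper names yama_allow_ptracer(), yama_allow_any_ptracer() and yama_scope_describe() are hypothetical, and the fallback #defines assume the values used by linux/prctl.h for toolchains whose headers predate Yama.

```c
/* Sketch only: the helper names below are hypothetical, not a Yama API. */
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>

/* Fallbacks for older headers (assumed to match linux/prctl.h). */
#ifndef PR_SET_PTRACER
#define PR_SET_PTRACER 0x59616d61
#endif
#ifndef PR_SET_PTRACER_ANY
#define PR_SET_PTRACER_ANY ((unsigned long)-1)
#endif

/*
 * Called by the inferior: declare `debugger` (and its descendants) as
 * allowed to PTRACE_ATTACH to the calling process.
 * Returns 0 on success or -errno on failure (typically -EINVAL when the
 * running kernel does not have Yama enabled).
 */
int yama_allow_ptracer(pid_t debugger)
{
    if (prctl(PR_SET_PTRACER, (unsigned long)debugger, 0, 0, 0) == -1)
        return -errno;
    return 0;
}

/*
 * Called by the inferior: lift the Yama restriction for this process so
 * that any otherwise allowed process may attach.
 */
int yama_allow_any_ptracer(void)
{
    if (prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0) == -1)
        return -errno;
    return 0;
}

/* Map a ptrace_scope value to the wording used in Yama.txt. */
const char *yama_scope_describe(int scope)
{
    switch (scope) {
    case 0:
        return "classic ptrace permissions";
    case 1:
        return "restricted ptrace";
    default:
        return "not described in this document";
    }
}
```

A crash-handling inferior would typically call yama_allow_ptracer(handler_pid) before asking its handler to attach. Because only one declared debugger can exist per inferior at a time, a later call replaces the earlier declaration.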
diff --git a/Documentation/security/keys.txt b/Documentation/security/keys.txt
index fcbe7a703405..787717091421 100644
--- a/Documentation/security/keys.txt
+++ b/Documentation/security/keys.txt
@@ -554,6 +554,10 @@ The keyctl syscall functions are:
     process must have write permission on the keyring, and it must be a
     keyring (or else error ENOTDIR will result).
+     This function can also be used to clear special kernel keyrings if they
+     are appropriately marked and the user has the CAP_SYS_ADMIN capability.  The
+     DNS resolver cache keyring is an example of this.
 (*) Link a key into a keyring: