<feed xmlns='http://www.w3.org/2005/Atom'>
<title>litmus-rt.git/arch/powerpc/lib, branch wip-default-clustering</title>
<subtitle>The LITMUS^RT kernel.</subtitle>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/'/>
<entry>
<title>Merge commit 'paulus-perf/master' into next</title>
<updated>2010-07-09T01:25:48+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2010-07-09T01:25:48+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=5f07aa7524e98d6f68f2bec54f155ef6012e2c9a'/>
<id>5f07aa7524e98d6f68f2bec54f155ef6012e2c9a</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix feature-fixup tests for gcc 4.5</title>
<updated>2010-07-08T08:11:41+00:00</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2010-06-28T21:08:29+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=3880ecb05bc5ece4c6e392a21ea77518e55b4935'/>
<id>3880ecb05bc5ece4c6e392a21ea77518e55b4935</id>
<content type='text'>
The feature-fixup test declare some extern void variables and then take
their addresses.  Fix this by declaring them as extern u8 instead.

Fixes these warnings (treated as errors):

  CC      arch/powerpc/lib/feature-fixups.o
cc1: warnings being treated as errors
arch/powerpc/lib/feature-fixups.c: In function 'test_cpu_macros':
arch/powerpc/lib/feature-fixups.c:293:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:294:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:297:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:297:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c: In function 'test_fw_macros':
arch/powerpc/lib/feature-fixups.c:306:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:307:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:310:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:310:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c: In function 'test_lwsync_macros':
arch/powerpc/lib/feature-fixups.c:321:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:322:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:326:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:326:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:329:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:329:3: error: taking address of expression of type 'void'

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The feature-fixup test declare some extern void variables and then take
their addresses.  Fix this by declaring them as extern u8 instead.

Fixes these warnings (treated as errors):

  CC      arch/powerpc/lib/feature-fixups.o
cc1: warnings being treated as errors
arch/powerpc/lib/feature-fixups.c: In function 'test_cpu_macros':
arch/powerpc/lib/feature-fixups.c:293:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:294:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:297:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:297:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c: In function 'test_fw_macros':
arch/powerpc/lib/feature-fixups.c:306:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:307:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:310:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:310:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c: In function 'test_lwsync_macros':
arch/powerpc/lib/feature-fixups.c:321:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:322:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:326:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:326:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:329:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:329:3: error: taking address of expression of type 'void'

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix module building for gcc 4.5 and 64 bit</title>
<updated>2010-07-08T08:11:38+00:00</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2010-06-29T20:08:42+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=7fca5dc8aa7aaa6a1023bd3587901b88ebfe8154'/>
<id>7fca5dc8aa7aaa6a1023bd3587901b88ebfe8154</id>
<content type='text'>
Gcc 4.5 is now generating out of line register save and restore
in the function prefix and postfix when we use -Os.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Gcc 4.5 is now generating out of line register save and restore
in the function prefix and postfix when we use -Os.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors</title>
<updated>2010-06-22T09:40:50+00:00</updated>
<author>
<name>K.Prasad</name>
<email>prasad@linux.vnet.ibm.com</email>
</author>
<published>2010-06-15T06:05:19+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=5aae8a53708025d4e718f0d2e7c2f766779ddc71'/>
<id>5aae8a53708025d4e718f0d2e7c2f766779ddc71</id>
<content type='text'>
Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras &lt;paulus@samba.org&gt; to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]

Signed-off-by: K.Prasad &lt;prasad@linux.vnet.ibm.com&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras &lt;paulus@samba.org&gt; to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]

Signed-off-by: K.Prasad &lt;prasad@linux.vnet.ibm.com&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Emulate most Book I instructions in emulate_step()</title>
<updated>2010-06-22T09:40:29+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2010-06-15T04:48:58+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=0016a4cf5582415849fafbf9f019dd9530824789'/>
<id>0016a4cf5582415849fafbf9f019dd9530824789</id>
<content type='text'>
This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix string library functions</title>
<updated>2010-05-21T07:31:08+00:00</updated>
<author>
<name>Andreas Schwab</name>
<email>schwab@linux-m68k.org</email>
</author>
<published>2010-05-18T08:15:21+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=ca5d0674c37840366f04a7bbfbf78e7b5f3ce0a4'/>
<id>ca5d0674c37840366f04a7bbfbf78e7b5f3ce0a4</id>
<content type='text'>
The powerpc strncmp implementation does not correctly handle a zero
length, despite the claim in 0119536cd314ef95553604208c25bc35581f7f0a
(Add hand-coded assembly strcmp).

Additionally, all the length arguments are size_t, not int, so use
PPC_LCMPI and eq instead of cmpwi and le throughout.

Signed-off-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Acked-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The powerpc strncmp implementation does not correctly handle a zero
length, despite the claim in 0119536cd314ef95553604208c25bc35581f7f0a
(Add hand-coded assembly strcmp).

Additionally, all the length arguments are size_t, not int, so use
PPC_LCMPI and eq instead of cmpwi and le throughout.

Signed-off-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Acked-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix handling of strncmp with zero len</title>
<updated>2010-04-07T08:00:39+00:00</updated>
<author>
<name>Jeff Mahoney</name>
<email>jeffm@suse.com</email>
</author>
<published>2010-03-17T10:55:51+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=637a99022fb119b90fb281715d13172f0394fc12'/>
<id>637a99022fb119b90fb281715d13172f0394fc12</id>
<content type='text'>
Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.

This patch lifts the len &lt;= 0 handling code from memcpy to handle that
case.

Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
CC: &lt;stable@kernel.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.

This patch lifts the len &lt;= 0 handling code from memcpy to handle that
case.

Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
CC: &lt;stable@kernel.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h</title>
<updated>2010-03-30T13:02:32+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2010-03-24T08:04:11+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=5a0e3ad6af8660be21ca98a971cd00f331318c05'/>
<id>5a0e3ad6af8660be21ca98a971cd00f331318c05</id>
<content type='text'>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -&gt; slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Guess-its-ok-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -&gt; slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Guess-its-ok-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix lwsync feature fixup vs. modules on 64-bit</title>
<updated>2010-02-26T07:29:17+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2010-02-26T07:29:17+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=3d98ffbffb16f2a1569b83cb78db0b5100e6c937'/>
<id>3d98ffbffb16f2a1569b83cb78db0b5100e6c937</id>
<content type='text'>
Anton's commit enabling the use of the lwsync fixup mechanism on 64-bit
breaks modules. The lwsync fixup section uses .long instead of the
FTR_ENTRY_OFFSET macro used by other fixups sections, and thus will
generate 32-bit relocations that our module loader cannot resolve.

This changes it to use the same type as other feature sections.

Note however that we might want to consider using 32-bit for all the
feature fixup offsets and add support for R_PPC_REL32 to module_64.c
instead as that would reduce the size of the kernel image. I'll leave
that as an exercise for the reader for now...

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Anton's commit enabling the use of the lwsync fixup mechanism on 64-bit
breaks modules. The lwsync fixup section uses .long instead of the
FTR_ENTRY_OFFSET macro used by other fixups sections, and thus will
generate 32-bit relocations that our module loader cannot resolve.

This changes it to use the same type as other feature sections.

Note however that we might want to consider using 32-bit for all the
feature fixup offsets and add support for R_PPC_REL32 to module_64.c
instead as that would reduce the size of the kernel image. I'll leave
that as an exercise for the reader for now...

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Improve 64bit copy_tofrom_user</title>
<updated>2010-02-17T03:03:16+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2010-02-10T14:56:26+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=789c299ca280f96368c0296b739e89c0bb232f8a'/>
<id>789c299ca280f96368c0296b739e89c0bb232f8a</id>
<content type='text'>
Here is a patch from Paul Mackerras that improves the ppc64 copy_tofrom_user.
The loop now does 32 bytes at a time and as well as pairing loads and stores.

A quick test case that reads 8kB over and over shows the improvement:

POWER6: 53% faster
POWER7: 51% faster

#define _XOPEN_SOURCE 500
#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;

#define BUFSIZE (8 * 1024)
#define ITERATIONS 10000000

int main()
{
	char tmpfile[] = "/tmp/copy_to_user_testXXXXXX";
	int fd;
	char *buf[BUFSIZE];
	unsigned long i;

	fd = mkstemp(tmpfile);
	if (fd &lt; 0) {
		perror("open");
		exit(1);
	}

	if (write(fd, buf, BUFSIZE) != BUFSIZE) {
		perror("open");
		exit(1);
	}

	for (i = 0; i &lt; 10000000; i++) {
		if (pread(fd, buf, BUFSIZE, 0) != BUFSIZE) {
			perror("pread");
			exit(1);
		}
	}

	unlink(tmpfile);

	return 0;
}

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Here is a patch from Paul Mackerras that improves the ppc64 copy_tofrom_user.
The loop now does 32 bytes at a time and as well as pairing loads and stores.

A quick test case that reads 8kB over and over shows the improvement:

POWER6: 53% faster
POWER7: 51% faster

#define _XOPEN_SOURCE 500
#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;

#define BUFSIZE (8 * 1024)
#define ITERATIONS 10000000

int main()
{
	char tmpfile[] = "/tmp/copy_to_user_testXXXXXX";
	int fd;
	char *buf[BUFSIZE];
	unsigned long i;

	fd = mkstemp(tmpfile);
	if (fd &lt; 0) {
		perror("open");
		exit(1);
	}

	if (write(fd, buf, BUFSIZE) != BUFSIZE) {
		perror("open");
		exit(1);
	}

	for (i = 0; i &lt; 10000000; i++) {
		if (pread(fd, buf, BUFSIZE, 0) != BUFSIZE) {
			perror("pread");
			exit(1);
		}
	}

	unlink(tmpfile);

	return 0;
}

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
