author: Benjamin Herrenschmidt <benh@kernel.crashing.org> 2006-10-06 03:43:53 -0400
committer: Linus Torvalds <torvalds@g5.osdl.org> 2006-10-06 11:53:40 -0400
commit: 7f7bbbe50b8a28f4dfaa4cea939ddb50198c4a99
tree: 6421bee762afd8f09c643e90860c4bdbe4dde93b
parent: 1ca4cb2418c04914e4661c059cf5b7b9262c645a
[PATCH] page fault retry with NOPAGE_REFAULT
Add a way for a no_page() handler to request a retry of the faulting instruction. It goes back to userland on page faults and just tries again in get_user_pages(). I added a cond_resched() in the loop in that latter case.

The problem I have with signal and spufs is an actual bug affecting apps and I don't see other ways of fixing it.

In addition, we are having issues with infiniband and 64k pages (related to the way the hypervisor deals with some HV cards) that will require us to muck around with the MMU from within the IB driver's no_page() (it's a pSeries specific driver) and return to the caller the same way using NOPAGE_REFAULT.

And to add to this, the graphics folks have been following a new approach of memory management that involves transparently swapping objects between video RAM and main memory. To do that, they need to install PTEs from a no_page() handler as well, and that also requires returning with NOPAGE_REFAULT.

(For the latter, they are currently using io_remap_pfn_range to install one PTE from no_page(), which is a bit racy: we need to add a check for the PTE having already been installed after taking the lock, but that's ok, they are only at the proof-of-concept stage. I'll send a patch adding a "clean" function to do that. We can use that from spufs too and get rid of the sparsemem hacks we do to create struct page for SPEs. Basically, that provides a generic solution for being able to have no_page() map hardware devices, which is something that I think sound driver folks have been asking for for some time too.)

All of these things depend on having the NOPAGE_REFAULT exit path from no_page() handlers.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
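The exit path described in the message can be sketched as a small userspace model. This is only an illustration, not kernel code: the three NOPAGE_* sentinels are copied from the patch, while `struct fake_device`, `demo_nopage()`, `classify_nopage_result()` and the `FAULT_*` codes are hypothetical names invented for the example. It shows a handler that installs its own mapping and then asks for a retry, and a caller that maps each sentinel to a fault result.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct page; only pointer identity matters here. */
struct page { int unused; };

/* Sentinel return values, as defined by the patch in include/linux/mm.h. */
#define NOPAGE_SIGBUS   (NULL)
#define NOPAGE_OOM      ((struct page *) (-1))
#define NOPAGE_REFAULT  ((struct page *) (-2))

/* Hypothetical result codes; the kernel's real VM_FAULT_* values differ. */
enum fault_result { FAULT_SIGBUS, FAULT_OOM, FAULT_MINOR, FAULT_HAVE_PAGE };

/* Hypothetical device state: records whether the handler installed a mapping. */
struct fake_device { int pte_installed; };

/*
 * Hypothetical no_page()-style handler. It sets up the mapping itself
 * (checking first whether that already happened) and returns
 * NOPAGE_REFAULT so the caller simply reruns the faulting access.
 */
struct page *demo_nopage(struct fake_device *dev)
{
	if (!dev->pte_installed)
		dev->pte_installed = 1;
	return NOPAGE_REFAULT;
}

/* Mirrors the order of the checks the patch adds to the fault path. */
enum fault_result classify_nopage_result(struct page *new_page)
{
	if (new_page == NOPAGE_SIGBUS)
		return FAULT_SIGBUS;
	else if (new_page == NOPAGE_OOM)
		return FAULT_OOM;
	else if (new_page == NOPAGE_REFAULT)
		return FAULT_MINOR;	/* nothing to install; just retry */
	return FAULT_HAVE_PAGE;
}
```

Calling `classify_nopage_result(demo_nopage(&dev))` on a zero-initialised `struct fake_device` yields `FAULT_MINOR` and leaves `dev.pte_installed` set, which is the "handler did the work, caller retries" behaviour the message relies on.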
-rw-r--r-- include/linux/mm.h 1
-rw-r--r-- mm/memory.c 9
2 files changed, 7 insertions, 3 deletions
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b7966ab8cb6a..26146623be2f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -593,6 +593,7 @@ static inline int page_mapped(struct page *page)
  */
 #define NOPAGE_SIGBUS	(NULL)
 #define NOPAGE_OOM	((struct page *) (-1))
+#define NOPAGE_REFAULT	((struct page *) (-2))	/* Return to userspace, rerun */
 
 /*
  * Error return values for the *_nopfn functions
diff --git a/mm/memory.c b/mm/memory.c
index 9cf3f341a28a..b5a4aadd961a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1086,6 +1086,7 @@ int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 				default:
 					BUG();
 				}
+				cond_resched();
 			}
 			if (pages) {
 				pages[i] = page;
@@ -2169,11 +2170,13 @@ retry:
 	 * after the next truncate_count read.
 	 */
 
-	/* no page was available -- either SIGBUS or OOM */
-	if (new_page == NOPAGE_SIGBUS)
+	/* no page was available -- either SIGBUS, OOM or REFAULT */
+	if (unlikely(new_page == NOPAGE_SIGBUS))
 		return VM_FAULT_SIGBUS;
-	if (new_page == NOPAGE_OOM)
+	else if (unlikely(new_page == NOPAGE_OOM))
 		return VM_FAULT_OOM;
+	else if (unlikely(new_page == NOPAGE_REFAULT))
+		return VM_FAULT_MINOR;
 
 	/*
 	 * Should we do an early C-O-W break?
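
The get_user_pages() side of the change, where a refault result just means "look the page up again", can be modelled in userspace as follows. This is a rough sketch under stated assumptions: `struct demo_mapping`, `demo_follow_page()`, `demo_handle_fault()` and `demo_get_page()` are hypothetical stand-ins, and the `yields` counter only marks where the patch calls cond_resched().

```c
#include <stddef.h>

/* Hypothetical model of one address being faulted in. */
struct demo_mapping {
	int refaults_left;	/* times the handler will still ask for a retry */
	int mapped;		/* nonzero once a lookup would succeed */
	int yields;		/* counts stand-ins for cond_resched() */
};

/* Hypothetical stand-in for the page lookup: nonzero once mapped. */
int demo_follow_page(const struct demo_mapping *m)
{
	return m->mapped;
}

/*
 * Hypothetical stand-in for the fault path. While refaults_left is
 * positive the handler reports "refault" and nothing is mapped yet;
 * afterwards the mapping is established.
 */
void demo_handle_fault(struct demo_mapping *m)
{
	if (m->refaults_left > 0)
		m->refaults_left--;
	else
		m->mapped = 1;
}

/*
 * Retry loop in the style of the patched get_user_pages(): keep faulting
 * until the lookup succeeds, yielding after every attempt so that a long
 * run of retries does not monopolise the CPU. Returns the attempt count.
 */
int demo_get_page(struct demo_mapping *m)
{
	int attempts = 0;

	while (!demo_follow_page(m)) {
		demo_handle_fault(m);
		attempts++;
		m->yields++;	/* the patch calls cond_resched() here */
	}
	return attempts;
}
```

With `refaults_left` set to 2, the loop runs three times: two retries, then one pass that establishes the mapping. The per-iteration yield is the reason the patch adds cond_resched(), since the number of retries is decided by the handler rather than by the loop.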