path: root/include/linux/sched.h
author     Wu Fengguang <fengguang.wu@intel.com>  2011-06-11 20:10:12 -0400
committer  Wu Fengguang <fengguang.wu@intel.com>  2011-10-03 09:08:57 -0400
commit     9d823e8f6b1b7b39f952d7d1795f29162143a433 (patch)
tree       2ef4c0d29353452dd2f894e7dbd240a31bdd0a02 /include/linux/sched.h
parent     7381131cbcf7e15d201a0ffd782a4698efe4e740 (diff)
writeback: per task dirty rate limit
Add two fields to task_struct:

1) account dirtied pages in the individual tasks, for accuracy
2) per-task balance_dirty_pages() call intervals, for flexibility

The balance_dirty_pages() call interval (ie. nr_dirtied_pause) will scale
near-sqrt to the safety gap between dirty pages and threshold.

The main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying pages
at exactly the same time, each task will be assigned a large initial
nr_dirtied_pause, so that the dirty threshold will be exceeded long before
each task reached its nr_dirtied_pause and hence call balance_dirty_pages().

The solution is to watch for the number of pages dirtied on each CPU in
between the calls into balance_dirty_pages(). If it exceeds ratelimit_pages
(3% dirty threshold), force call balance_dirty_pages() for a chance to set
bdi->dirty_exceeded. In normal situations, this safeguarding condition is not
expected to trigger at all.

On the sqrt in dirty_poll_interval():

It will serve as an initial guess when dirty pages are still in the freerun
area. When dirty pages are floating inside the dirty control scope
[freerun, limit], a followup patch will use some refined dirty poll interval
to get the desired pause time.

	thresh-dirty (MB)    sqrt
	                1      16
	                2      22
	                4      32
	                8      45
	               16      64
	               32      90
	               64     128
	              128     181
	              256     256
	              512     362
	             1024     512

The above table means, given 1MB (or 1GB) gap and the dd tasks polling
balance_dirty_pages() on every 16 (or 512) pages, the dirty limit won't be
exceeded as long as there are less than 16 (or 512) concurrent dd's.

So sqrt naturally leads to less overheads and more safe concurrent tasks for
large memory servers, which have large (thresh-freerun) gaps.

peter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case

CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Diffstat (limited to 'include/linux/sched.h')
-rw-r--r--include/linux/sched.h7
1 files changed, 7 insertions, 0 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 41d0237fd449..a4a5582dc618 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1525,6 +1525,13 @@ struct task_struct {
 	int make_it_fail;
 #endif
 	struct prop_local_single dirties;
+	/*
+	 * when (nr_dirtied >= nr_dirtied_pause), it's time to call
+	 * balance_dirty_pages() for some dirty throttling pause
+	 */
+	int nr_dirtied;
+	int nr_dirtied_pause;
+
 #ifdef CONFIG_LATENCYTOP
 	int latency_record_count;
 	struct latency_record latency_record[LT_SAVECOUNT];