aboutsummaryrefslogtreecommitdiffstats
path: root/fs/jfs
Commit message (Collapse)AuthorAge
...
* SL*B: drop kmem cache argument from constructorAlexey Dobriyan2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* quota: move function-macros from quota.h to quotaops.hJan Kara2008-07-25
| | | | | | | | | | | | | | | | Move declarations of some macros, which should be in fact functions to quotaops.h. This way they can be later converted to inline functions because we can now use declarations from quota.h. Also add necessary includes of quotaops.h to a few files. [akpm@linux-foundation.org: fix JFS build] [akpm@linux-foundation.org: fix UFS build] [vegard.nossum@gmail.com: fix QUOTA=n build] Signed-off-by: Jan Kara <jack@suse.cz> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Arjen Pool <arjenpool@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* jfs: remove DIRENTSIZAdrian Bunk2008-06-10
| | | | | | | | | After fat gets fixed the unused DIRENTSIZ macro was the last user of struct dirent we should get rid of since the kernel and userspace versions differed. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: diAlloc() should return -EIO rather than EIOLi Zefan2008-05-28
| | | | | | | | | | | | | | | | | | The comment above the function says one of its return value is -EIO, and also the caller of diAlloc() checks for -EIO: struct inode *ialloc(struct inode *parent, umode_t mode) { ... rc = diAlloc(parent, S_ISDIR(mode), inode); if (rc) { jfs_warn("ialloc: diAlloc returned %d!", rc); if (rc == -EIO) make_bad_inode(inode); ... Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: skip bad iput() call in error pathDave Kleikamp2008-05-21
| | | | | | | If jfs_iget() fails, we can't call iput() on the returned error. Thanks to Eric Sesterhenn's fuzzer testing for reporting the problem. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: switch to seq_filesAlexey Dobriyan2008-05-13
| | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: 0 is not valid errno value so return NULL from jfs_lookupMarcin Slusarz2008-05-12
| | | | | | | Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: jfs-discussion@lists.sourceforge.net Cc: Alexander Viro <viro@zeniv.linux.org.uk>
* proc: remove proc_root_fsAlexey Dobriyan2008-04-29
| | | | | | | | Use creation by full path instead: "fs/foo". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] r/o bind mounts: elevate write count for ioctls()Dave Hansen2008-04-19
| | | | | | | | | | | | | Some ioctl()s can cause writes to the filesystem. Take these, and make them use mnt_want/drop_write() instead. [AV: updated] Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* jfs: replace __inline with inlineHarvey Harrison2008-03-05
| | | | | Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* jfs: le*_add_cpu conversionMarcin Slusarz2008-02-13
| | | | | | | | | | | | | replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: jfs-discussion@lists.sourceforge.net
* BKL-removal: Implement a compat_ioctl handler for JFSAndi Kleen2008-02-07
| | | | | | | | The ioctls were already compatible except for the actual values so this was fairly easy to do. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* BKL-removal: Use unlocked_ioctl for jfsAndi Kleen2008-02-07
| | | | | | | | | Convert jfs_ioctl over to not use the BKL. The only potential race I could see was with two ioctls in parallel changing the flags and losing the updates. Use the i_mutex to protect against this. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* iget: stop JFS from using iget() and read_inode()David Howells2008-02-07
| | | | | | | | | | | | | | | | Stop the JFS filesystem from using iget() and read_inode(). Replace jfs_read_inode() with jfs_iget(), and call that instead of iget(). jfs_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. jfs_fill_super() returns any error incurred when getting the root inode instead of EINVAL. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Spelling fixes: lenght->lengthPaulius Zaleckas2008-02-03
| | | | | Signed-off-by: Paulius Zaleckas <pauliusz@yahoo.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>
* mount options: fix jfsMiklos Szeredi2008-01-24
| | | | | | | | Add iocharset= and errors= options to /proc/mounts for jfs filesystems. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: simplify types to get rid of sparse warningDave Kleikamp2008-01-10
| | | | | | | jfs_metapage.c was using uints and unsigned ints inconsistently when regular ints suffice. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: FIx one more plain integer as NULL pointer warningDave Kleikamp2008-01-03
| | | | Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: Remove defconfig ptr comparison to 0Joe Perches2008-01-03
| | | | | | | Remove sparse warning: Using plain integer as NULL pointer Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: use DIV_ROUND_UP where appropriateShaun Zinck2008-01-03
| | | | | | | | This replaces some macros and code, which do the same thing as DIV_ROUND_UP defined in kernel.h, to use the DIV_ROUND_UP macro. Signed-off-by: Shaun Zinck <shaun.zinck@gmail.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* Remove unnecessary kmalloc casts in the jfs filesystemJack Stone2008-01-03
| | | | | Signed-off-by: Jack Stone <jack@hawkeye.stone.uk.eu.org> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS is missing a memory barrierNick Piggin2008-01-03
| | | | | | | | | | | | | JFS is missing a memory barrier needed to close the critical section before clearing the lock bit. Use lock bitops for this. unlock_page() has a second barrier after clearing the lock, which is required because it checks whether the waitqueue is active without locks. Such a barrier is not required here because the waitqueue spinlock is always taken (something to think about if performance is an issue). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: Make sure special inode data is written after journal is flushedDave Kleikamp2008-01-03
| | | | | | | | | This patch makes sure that data that we tried to flush before the journal was completely written actually gets pushed to disk. To avoid duplicating code, moved common code to write_special_inodes(). Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: clear PAGECACHE_TAG_DIRTY for no-write pagesDave Kleikamp2008-01-03
| | | | | | | | | | | | | | | | When JFS decides to drop a dirty metapage, it simply clears the META_dirty bit and leave alone the PG_dirty and PAGECACHE_TAG_DIRTY bits. When such no-write page goes to metapage_writepage(), the `relic' PAGECACHE_TAG_DIRTY tag should be cleared, to prevent pdflush from repeatedly trying to sync them. This is done through set_page_writeback(), so call it should be called in all cases. If no I/O is initiated, end_page_writeback() should be called immediately. This is how __block_write_full_page() does things. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> CC: Fengguang Wu <wfg@mail.ustc.edu.cn>
* Forbid user to change file flags on quota filesJan Kara2007-11-14
| | | | | | | | | | | | | | | | Forbid user from changing file flags on quota files. User has no bussiness in playing with these flags when quota is on. Furthermore there is a remote possibility of deadlock due to a lock inversion between quota file's i_mutex and transaction's start (i_mutex for quota file is locked only when trasaction is started in quota operations) in ext3 and ext4. Signed-off-by: Jan Kara <jack@suse.cz> Cc: LIOU Payphone <lioupayphone@gmail.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Dave Kleikamp <shaggy@austin.ibm.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* exportfs: make struct export_operations constChristoph Hellwig2007-10-22
| | | | | | | | | | | | | | | | | | | | | | | Now that nfsd has stopped writing to the find_exported_dentry member we an mark the export_operations const Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* jfs: new export opsChristoph Hellwig2007-10-22
| | | | | | | | | | | Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix misspellings of "system", "controller", "interrupt" and "necessary".Robert P. J. Day2007-10-19
| | | | | | | | Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>
* introduce I_SYNCJoern Engel2007-10-17
| | | | | | | | | | | | | | | | | | | | I_LOCK was used for several unrelated purposes, which caused deadlock situations in certain filesystems as a side effect. One of the purposes now uses the new I_SYNC bit. Also document the various bits and change their order from historical to logical. [bunk@stusta.de: make fs/inode.c:wake_up_inode() static] Signed-off-by: Joern Engel <joern@wohnheim.fh-wedel.de> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: David Chinner <dgc@sgi.com> Cc: Anton Altaparmakov <aia21@cam.ac.uk> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Slab API: remove useless ctor parameter and reorder parametersChristoph Lameter2007-10-17
| | | | | | | | | | | | | | | | | | | | | Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: restore nobhNick Piggin2007-10-16
| | | | | | | | | | | | | | Implement nobh in new aops. This is a bit tricky. FWIW, nobh_truncate is now implemented in a way that does not create blocks in sparse regions, which is a silly thing for it to have been doing (isn't it?) ext2 survives fsx and fsstress. jfs is converted as well... ext3 should be easy to do (but not done yet). [akpm@linux-foundation.org: coding-style fixes] Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* jfs: convert to new aopsNick Piggin2007-10-16
| | | | | | | Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* more low-hanging fruits - kernel, fs, lib signednessAl Viro2007-10-14
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* JFS: Bio cleanup: Replace missing return statementsDave Kleikamp2007-10-13
| | | | | | | | | | | | commit e30408b2a99cb7b8bf529c7dc2328a19d71894cf ("JFS: fix bio-related build breakage") removed some "return 0;" statements, rather than changing them to null returns. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix up more bio falloutAl Viro2007-10-12
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* JFS: fix bio-related build breakageJeff Garzik2007-10-12
| | | | | Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Drop 'size' argument from bio_endio and bi_end_ioNeilBrown2007-10-10
| | | | | | | | | | | | | As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* mm: Remove slab destructors from kmem_cache_create().Paul Mundt2007-07-19
| | | | | | | | | | | | | | Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid checkSatyam Sharma2007-07-17
| | | | | | | | | | | | | | | | | | | | Introduce is_owner_or_cap() macro in fs.h, and convert over relevant users to it. This is done because we want to avoid bugs in the future where we check for only effective fsuid of the current task against a file's owning uid, without simultaneously checking for CAP_FOWNER as well, thus violating its semantics. [ XFS uses special macros and structures, and in general looked ... untouchable, so we leave it alone -- but it has been looked over. ] The (current->fsuid != inode->i_uid) check in generic_permission() and exec_permission_lite() is left alone, because those operations are covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: exportfs: remove iget abuseChristoph Hellwig2007-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | When the exportfs interface was added the expectation was that filesystems provide an operation to convert from a file handle to an inode/dentry, but it kept a backwards compat option that still calls into iget. Calling into iget from non-filesystem code is very bad, because it gives too little information to filesystem, and simply crashes if the filesystem doesn't implement the ->read_inode routine. Fortunately there are only two filesystems left using this fallback: efs and jfs. This patch moves a copy of export_iget to each of those to implement the get_dentry method. While this is a temporary increase of lines of code in the kernel it allows for a much cleaner interface and important code restructuring in later patches. [akpm@linux-foundation.org: add jfs_get_inode_flags() declaration] Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: exportfs: add exportfs.h headerChristoph Hellwig2007-07-17
| | | | | | | | | | | | | currently the export_operation structure and helpers related to it are in fs.h. fs.h is already far too large and there are very few places needing the export bits, so split them off into a separate header. [akpm@linux-foundation.org: fix cifs build] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sendfile: remove .sendfile from filesystems that use generic_file_sendfile()Jens Axboe2007-07-10
| | | | | | | They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* JFS: Update print_hex_dump() syntaxDave Kleikamp2007-06-13
| | | | Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: use print_hex_dump() rather than private dump_mem() functionDave Kleikamp2007-06-06
| | | | Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* JFS: Whitespace cleanup and remove some dead codeDave Kleikamp2007-06-06
| | | | Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
* Remove SLAB_CTOR_CONSTRUCTORChristoph Lameter2007-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix occurrences of "the the "Michael Opdenacker2007-05-09
| | | | | Signed-off-by: Michael Opdenacker <michael@free-electrons.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* Merge branch 'for-linus' of ↵Linus Torvalds2007-05-08
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6: JFS: Fix race waking up jfsIO kernel thread JFS: use __set_current_state() Copy i_flags to jfs inode flags on write JFS: document uid, gid, and umask mount options in jfs.txt
| * JFS: Fix race waking up jfsIO kernel threadDave Kleikamp2007-05-05
| | | | | | | | | | | | | | | | | | | | | | It's possible for a journal I/O request to be added to the log_redrive queue and the jfsIO thread to be awakened after the thread releases log_redrive_lock but before it sets its state to TASK_INTERRUPTIBLE. The jfsIO thread should set the state before giving up the spinlock, so the waking thread will really wake it. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
| * JFS: use __set_current_state()Milind Arun Choudhary2007-04-26
| | | | | | | | | | | | | | | | use __set_current_state(TASK_*) instead of current->state = TASK_* Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
v>
dee5d0d518de
c04fc586c1a4





d33601644cd3



c04fc586c1a4







8732794b1661
10fbcf4c6cb1

dee5d0d518de


10fbcf4c6cb1
8732794b1661

c04fc586c1a4





d33601644cd3

c04fc586c1a4
9ae49fab239f
c04fc586c1a4

9ae49fab239f

c04fc586c1a4
9ae49fab239f



d33601644cd3

c04fc586c1a4

475049809977
c04fc586c1a4





9ae49fab239f
c04fc586c1a4
8732794b1661
10fbcf4c6cb1

8732794b1661
c04fc586c1a4
9ae49fab239f
c04fc586c1a4







63d027a63888
c04fc586c1a4




c04fc586c1a4




321bf4ed5ff5






63d027a63888
321bf4ed5ff5
c04fc586c1a4




c04fc586c1a4
63d027a63888

10fbcf4c6cb1
c04fc586c1a4

4faf8d950ec4
39da08cb074c
4faf8d950ec4



39da08cb074c

















8732794b1661
39da08cb074c
4faf8d950ec4







39da08cb074c





4faf8d950ec4
8732794b1661
4faf8d950ec4
39da08cb074c
4faf8d950ec4









39da08cb074c


c04fc586c1a4
39da08cb074c
4faf8d950ec4
39da08cb074c

4faf8d950ec4




39da08cb074c



c04fc586c1a4
0fc44159bfcb


76b67ed9dce6
0fc44159bfcb





8732794b1661
0fc44159bfcb
8732794b1661




76b67ed9dce6





c04fc586c1a4


39da08cb074c


0fc44159bfcb







8732794b1661
8732794b1661
0fc44159bfcb

bde631a51876







f62388187207


bde631a51876


b15f562fc2f5
10fbcf4c6cb1
b15f562fc2f5

bde631a51876
10fbcf4c6cb1

bde631a51876
b15f562fc2f5

bde631a51876

b15f562fc2f5
10fbcf4c6cb1
bde631a51876
b15f562fc2f5
fcf07d22f089


bde631a51876
fcf07d22f089
bde631a51876
20b2f52b73fe


fcf07d22f089
bde631a51876

10fbcf4c6cb1
fcf07d22f089


3701cde6e352
fcf07d22f089
3701cde6e352
20b2f52b73fe


fcf07d22f089
3701cde6e352

bde631a51876
10fbcf4c6cb1








4faf8d950ec4
4b45099b7583
1da177e4c3f4
bde631a51876

3701cde6e352


10fbcf4c6cb1
4faf8d950ec4
6e259e7dc482




4faf8d950ec4
bde631a51876





1da177e4c3f4

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
  
                               

   


                         
                         
                         
                           

                          
                             


                           
                      
                         
                       
                       
 
                                      
                       
                           


  
                                                                        

                                             
                                                                       

                

                                                                            
 
                   

                                                           

                          


                   

                                                                         


                                             

                                                                         



                                             

                                                              

                                       

                                                                 



                          

                                 
                        








                                                          
                                                          














                                                                        
                     
                             


                                                          




                                                           
      
                             




                                                          
                                                          
                                                          





                                                          




                                                          
                                                                   
                                                                  
                                                                   
                                                                    
                                  
                                                                 
                                                                               


                                                                   
      
                                                              

                                                                   
                                                                  
                                                                     
                                                               
                                                                       


                                                                             
                                  
                                                                          

                                                                               


                                                                            
      




                                                       
                                                              
 

                                                                         
 






                                             





                                                                     
 
                                                                
 

                                                                         

                          







                                                               
 
                                                            
 

                                                                  




                          




                                                                           






                                                                                       
                                                                
 



                                                                          

                                                                       






                                                                         
                                                           
 
                                      
                                                             
                                              


                            



















                                                                      

















                                                                      

  
                                                   



                                                      
                                                                         


                  

                                     
                                                
                                            

                    





                                                                   

                                                     
 
                                            

                                               



                     








                                                                        





                                                           
 
                                               
                                                                               
 
                                      
 
 
                                        
 




                                                               
                
                           
 


                              
                                  


                         
                                                             

                                                          



                                            

                                                                             



                                                                 
                           



                              
                                  


                         
                                                       
                                                    
                                     
                                                                      
 


                 

















                                                                        
                





                                                        



                                                                      







                                                                
                                                                            

                                                                          


                                   
                                                                   

                                                                            





                                                             

                                                                 
 
                                                               

                                                        

                                              
                               



                                     

                                                       

                                                                
                        





                                           
                                                            
                                 
                                                               

                                                           
                                                                     
         
                                      







                                                                               
                                            




                                                                        




                                                       






                                                                        
                                                                      
 




                                                                 
         

                    
                                                

                   
 
                       



                                                                    

















                                                                             
                                                                    
 







                                                                





                                                                            
                                        
                                                                     
                      
 









                                


                                           
                                                   
                                          
 

                                               




                                                                   



                                               
 


                              
                





                                              
                                                      
 




                                                                             





                                                                  


                                                          


                                                               







                                 
                                           
                                 

 







                                                                   


                                                                     


                 
                  
                                     

                               
 

                                                                        
 

                                                                          

 
                                 
                                                            
 
                                             


                                                                           
                     
                                                                     
      


                                                      
                                             

  
                                               


                                                    
                     
                                                  
      


                                             
                                          

            
 








                                                               
                                                             
                                          
 

                


                                                                       
                                                                         
                   




                                                                        
         





                                                                         

                                      
/*
 * Basic Node interface support
 */

#include <linux/module.h>
#include <linux/init.h>
#include <linux/mm.h>
#include <linux/memory.h>
#include <linux/vmstat.h>
#include <linux/notifier.h>
#include <linux/node.h>
#include <linux/hugetlb.h>
#include <linux/compaction.h>
#include <linux/cpumask.h>
#include <linux/topology.h>
#include <linux/nodemask.h>
#include <linux/cpu.h>
#include <linux/device.h>
#include <linux/swap.h>
#include <linux/slab.h>

static struct bus_type node_subsys = {
	.name = "node",
	.dev_name = "node",
};


static ssize_t node_read_cpumap(struct device *dev, int type, char *buf)
{
	struct node *node_dev = to_node(dev);
	const struct cpumask *mask = cpumask_of_node(node_dev->dev.id);
	int len;

	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));

	len = type?
		cpulist_scnprintf(buf, PAGE_SIZE-2, mask) :
		cpumask_scnprintf(buf, PAGE_SIZE-2, mask);
 	buf[len++] = '\n';
 	buf[len] = '\0';
	return len;
}

static inline ssize_t node_read_cpumask(struct device *dev,
				struct device_attribute *attr, char *buf)
{
	return node_read_cpumap(dev, 0, buf);
}
static inline ssize_t node_read_cpulist(struct device *dev,
				struct device_attribute *attr, char *buf)
{
	return node_read_cpumap(dev, 1, buf);
}

static DEVICE_ATTR(cpumap,  S_IRUGO, node_read_cpumask, NULL);
static DEVICE_ATTR(cpulist, S_IRUGO, node_read_cpulist, NULL);

#define K(x) ((x) << (PAGE_SHIFT - 10))
static ssize_t node_read_meminfo(struct device *dev,
			struct device_attribute *attr, char *buf)
{
	int n;
	int nid = dev->id;
	struct sysinfo i;

	si_meminfo_node(&i, nid);
	n = sprintf(buf,
		       "Node %d MemTotal:       %8lu kB\n"
		       "Node %d MemFree:        %8lu kB\n"
		       "Node %d MemUsed:        %8lu kB\n"
		       "Node %d Active:         %8lu kB\n"
		       "Node %d Inactive:       %8lu kB\n"
		       "Node %d Active(anon):   %8lu kB\n"
		       "Node %d Inactive(anon): %8lu kB\n"
		       "Node %d Active(file):   %8lu kB\n"
		       "Node %d Inactive(file): %8lu kB\n"
		       "Node %d Unevictable:    %8lu kB\n"
		       "Node %d Mlocked:        %8lu kB\n",
		       nid, K(i.totalram),
		       nid, K(i.freeram),
		       nid, K(i.totalram - i.freeram),
		       nid, K(node_page_state(nid, NR_ACTIVE_ANON) +
				node_page_state(nid, NR_ACTIVE_FILE)),
		       nid, K(node_page_state(nid, NR_INACTIVE_ANON) +
				node_page_state(nid, NR_INACTIVE_FILE)),
		       nid, K(node_page_state(nid, NR_ACTIVE_ANON)),
		       nid, K(node_page_state(nid, NR_INACTIVE_ANON)),
		       nid, K(node_page_state(nid, NR_ACTIVE_FILE)),
		       nid, K(node_page_state(nid, NR_INACTIVE_FILE)),
		       nid, K(node_page_state(nid, NR_UNEVICTABLE)),
		       nid, K(node_page_state(nid, NR_MLOCK)));

#ifdef CONFIG_HIGHMEM
	n += sprintf(buf + n,
		       "Node %d HighTotal:      %8lu kB\n"
		       "Node %d HighFree:       %8lu kB\n"
		       "Node %d LowTotal:       %8lu kB\n"
		       "Node %d LowFree:        %8lu kB\n",
		       nid, K(i.totalhigh),
		       nid, K(i.freehigh),
		       nid, K(i.totalram - i.totalhigh),
		       nid, K(i.freeram - i.freehigh));
#endif
	n += sprintf(buf + n,
		       "Node %d Dirty:          %8lu kB\n"
		       "Node %d Writeback:      %8lu kB\n"
		       "Node %d FilePages:      %8lu kB\n"
		       "Node %d Mapped:         %8lu kB\n"
		       "Node %d AnonPages:      %8lu kB\n"
		       "Node %d Shmem:          %8lu kB\n"
		       "Node %d KernelStack:    %8lu kB\n"
		       "Node %d PageTables:     %8lu kB\n"
		       "Node %d NFS_Unstable:   %8lu kB\n"
		       "Node %d Bounce:         %8lu kB\n"
		       "Node %d WritebackTmp:   %8lu kB\n"
		       "Node %d Slab:           %8lu kB\n"
		       "Node %d SReclaimable:   %8lu kB\n"
		       "Node %d SUnreclaim:     %8lu kB\n"
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
		       "Node %d AnonHugePages:  %8lu kB\n"
#endif
			,
		       nid, K(node_page_state(nid, NR_FILE_DIRTY)),
		       nid, K(node_page_state(nid, NR_WRITEBACK)),
		       nid, K(node_page_state(nid, NR_FILE_PAGES)),
		       nid, K(node_page_state(nid, NR_FILE_MAPPED)),
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
		       nid, K(node_page_state(nid, NR_ANON_PAGES)
			+ node_page_state(nid, NR_ANON_TRANSPARENT_HUGEPAGES) *
			HPAGE_PMD_NR),
#else
		       nid, K(node_page_state(nid, NR_ANON_PAGES)),
#endif
		       nid, K(node_page_state(nid, NR_SHMEM)),
		       nid, node_page_state(nid, NR_KERNEL_STACK) *
				THREAD_SIZE / 1024,
		       nid, K(node_page_state(nid, NR_PAGETABLE)),
		       nid, K(node_page_state(nid, NR_UNSTABLE_NFS)),
		       nid, K(node_page_state(nid, NR_BOUNCE)),
		       nid, K(node_page_state(nid, NR_WRITEBACK_TEMP)),
		       nid, K(node_page_state(nid, NR_SLAB_RECLAIMABLE) +
				node_page_state(nid, NR_SLAB_UNRECLAIMABLE)),
		       nid, K(node_page_state(nid, NR_SLAB_RECLAIMABLE)),
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
		       nid, K(node_page_state(nid, NR_SLAB_UNRECLAIMABLE))
			, nid,
			K(node_page_state(nid, NR_ANON_TRANSPARENT_HUGEPAGES) *
			HPAGE_PMD_NR));
#else
		       nid, K(node_page_state(nid, NR_SLAB_UNRECLAIMABLE)));
#endif
	n += hugetlb_report_node_meminfo(nid, buf + n);
	return n;
}

#undef K
static DEVICE_ATTR(meminfo, S_IRUGO, node_read_meminfo, NULL);

static ssize_t node_read_numastat(struct device *dev,
				struct device_attribute *attr, char *buf)
{
	return sprintf(buf,
		       "numa_hit %lu\n"
		       "numa_miss %lu\n"
		       "numa_foreign %lu\n"
		       "interleave_hit %lu\n"
		       "local_node %lu\n"
		       "other_node %lu\n",
		       node_page_state(dev->id, NUMA_HIT),
		       node_page_state(dev->id, NUMA_MISS),
		       node_page_state(dev->id, NUMA_FOREIGN),
		       node_page_state(dev->id, NUMA_INTERLEAVE_HIT),
		       node_page_state(dev->id, NUMA_LOCAL),
		       node_page_state(dev->id, NUMA_OTHER));
}
static DEVICE_ATTR(numastat, S_IRUGO, node_read_numastat, NULL);

static ssize_t node_read_vmstat(struct device *dev,
				struct device_attribute *attr, char *buf)
{
	int nid = dev->id;
	int i;
	int n = 0;

	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
		n += sprintf(buf+n, "%s %lu\n", vmstat_text[i],
			     node_page_state(nid, i));

	return n;
}
static DEVICE_ATTR(vmstat, S_IRUGO, node_read_vmstat, NULL);

static ssize_t node_read_distance(struct device *dev,
			struct device_attribute *attr, char * buf)
{
	int nid = dev->id;
	int len = 0;
	int i;

	/*
	 * buf is currently PAGE_SIZE in length and each node needs 4 chars
	 * at the most (distance + space or newline).
	 */
	BUILD_BUG_ON(MAX_NUMNODES * 4 > PAGE_SIZE);

	for_each_online_node(i)
		len += sprintf(buf + len, "%s%d", i ? " " : "", node_distance(nid, i));

	len += sprintf(buf + len, "\n");
	return len;
}
static DEVICE_ATTR(distance, S_IRUGO, node_read_distance, NULL);

#ifdef CONFIG_HUGETLBFS
/*
 * hugetlbfs per node attributes registration interface:
 * When/if hugetlb[fs] subsystem initializes [sometime after this module],
 * it will register its per node attributes for all online nodes with
 * memory.  It will also call register_hugetlbfs_with_node(), below, to
 * register its attribute registration functions with this node driver.
 * Once these hooks have been initialized, the node driver will call into
 * the hugetlb module to [un]register attributes for hot-plugged nodes.
 */
static node_registration_func_t __hugetlb_register_node;
static node_registration_func_t __hugetlb_unregister_node;

static inline bool hugetlb_register_node(struct node *node)
{
	if (__hugetlb_register_node &&
			node_state(node->dev.id, N_MEMORY)) {
		__hugetlb_register_node(node);
		return true;
	}
	return false;
}

static inline void hugetlb_unregister_node(struct node *node)
{
	if (__hugetlb_unregister_node)
		__hugetlb_unregister_node(node);
}

void register_hugetlbfs_with_node(node_registration_func_t doregister,
				  node_registration_func_t unregister)
{
	__hugetlb_register_node   = doregister;
	__hugetlb_unregister_node = unregister;
}
#else
static inline void hugetlb_register_node(struct node *node) {}

static inline void hugetlb_unregister_node(struct node *node) {}
#endif

static void node_device_release(struct device *dev)
{
	struct node *node = to_node(dev);

#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS)
	/*
	 * We schedule the work only when a memory section is
	 * onlined/offlined on this node. When we come here,
	 * all the memory on this node has been offlined,
	 * so we won't enqueue new work to this work.
	 *
	 * The work is using node->node_work, so we should
	 * flush work before freeing the memory.
	 */
	flush_work(&node->node_work);
#endif
	kfree(node);
}

/*
 * register_node - Setup a sysfs device for a node.
 * @num - Node number to use when creating the device.
 *
 * Initialize and register the node device.
 */
static int register_node(struct node *node, int num, struct node *parent)
{
	int error;

	node->dev.id = num;
	node->dev.bus = &node_subsys;
	node->dev.release = node_device_release;
	error = device_register(&node->dev);

	if (!error){
		device_create_file(&node->dev, &dev_attr_cpumap);
		device_create_file(&node->dev, &dev_attr_cpulist);
		device_create_file(&node->dev, &dev_attr_meminfo);
		device_create_file(&node->dev, &dev_attr_numastat);
		device_create_file(&node->dev, &dev_attr_distance);
		device_create_file(&node->dev, &dev_attr_vmstat);

		scan_unevictable_register_node(node);

		hugetlb_register_node(node);

		compaction_register_node(node);
	}
	return error;
}

/**
 * unregister_node - unregister a node device
 * @node: node going away
 *
 * Unregisters a node device @node.  All the devices on the node must be
 * unregistered before calling this function.
 */
void unregister_node(struct node *node)
{
	device_remove_file(&node->dev, &dev_attr_cpumap);
	device_remove_file(&node->dev, &dev_attr_cpulist);
	device_remove_file(&node->dev, &dev_attr_meminfo);
	device_remove_file(&node->dev, &dev_attr_numastat);
	device_remove_file(&node->dev, &dev_attr_distance);
	device_remove_file(&node->dev, &dev_attr_vmstat);

	scan_unevictable_unregister_node(node);
	hugetlb_unregister_node(node);		/* no-op, if memoryless node */

	device_unregister(&node->dev);
}

struct node *node_devices[MAX_NUMNODES];

/*
 * register cpu under node
 */
int register_cpu_under_node(unsigned int cpu, unsigned int nid)
{
	int ret;
	struct device *obj;

	if (!node_online(nid))
		return 0;

	obj = get_cpu_device(cpu);
	if (!obj)
		return 0;

	ret = sysfs_create_link(&node_devices[nid]->dev.kobj,
				&obj->kobj,
				kobject_name(&obj->kobj));
	if (ret)
		return ret;

	return sysfs_create_link(&obj->kobj,
				 &node_devices[nid]->dev.kobj,
				 kobject_name(&node_devices[nid]->dev.kobj));
}

int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
{
	struct device *obj;

	if (!node_online(nid))
		return 0;

	obj = get_cpu_device(cpu);
	if (!obj)
		return 0;

	sysfs_remove_link(&node_devices[nid]->dev.kobj,
			  kobject_name(&obj->kobj));
	sysfs_remove_link(&obj->kobj,
			  kobject_name(&node_devices[nid]->dev.kobj));

	return 0;
}

#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
#define page_initialized(page)  (page->lru.next)

static int get_nid_for_pfn(unsigned long pfn)
{
	struct page *page;

	if (!pfn_valid_within(pfn))
		return -1;
	page = pfn_to_page(pfn);
	if (!page_initialized(page))
		return -1;
	return pfn_to_nid(pfn);
}

/* register memory section under specified node if it spans that node */
int register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
{
	int ret;
	unsigned long pfn, sect_start_pfn, sect_end_pfn;

	if (!mem_blk)
		return -EFAULT;
	if (!node_online(nid))
		return 0;

	sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
	sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
	sect_end_pfn += PAGES_PER_SECTION - 1;
	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
		int page_nid;

		page_nid = get_nid_for_pfn(pfn);
		if (page_nid < 0)
			continue;
		if (page_nid != nid)
			continue;
		ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
					&mem_blk->dev.kobj,
					kobject_name(&mem_blk->dev.kobj));
		if (ret)
			return ret;

		return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
				&node_devices[nid]->dev.kobj,
				kobject_name(&node_devices[nid]->dev.kobj));
	}
	/* mem section does not span the specified node */
	return 0;
}

/* unregister memory section under all nodes that it spans */
int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
				    unsigned long phys_index)
{
	NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL);
	unsigned long pfn, sect_start_pfn, sect_end_pfn;

	if (!mem_blk) {
		NODEMASK_FREE(unlinked_nodes);
		return -EFAULT;
	}
	if (!unlinked_nodes)
		return -ENOMEM;
	nodes_clear(*unlinked_nodes);

	sect_start_pfn = section_nr_to_pfn(phys_index);
	sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
		int nid;

		nid = get_nid_for_pfn(pfn);
		if (nid < 0)
			continue;
		if (!node_online(nid))
			continue;
		if (node_test_and_set(nid, *unlinked_nodes))
			continue;
		sysfs_remove_link(&node_devices[nid]->dev.kobj,
			 kobject_name(&mem_blk->dev.kobj));
		sysfs_remove_link(&mem_blk->dev.kobj,
			 kobject_name(&node_devices[nid]->dev.kobj));
	}
	NODEMASK_FREE(unlinked_nodes);
	return 0;
}

static int link_mem_sections(int nid)
{
	unsigned long start_pfn = NODE_DATA(nid)->node_start_pfn;
	unsigned long end_pfn = start_pfn + NODE_DATA(nid)->node_spanned_pages;
	unsigned long pfn;
	struct memory_block *mem_blk = NULL;
	int err = 0;

	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
		unsigned long section_nr = pfn_to_section_nr(pfn);
		struct mem_section *mem_sect;
		int ret;

		if (!present_section_nr(section_nr))
			continue;
		mem_sect = __nr_to_section(section_nr);

		/* same memblock ? */
		if (mem_blk)
			if ((section_nr >= mem_blk->start_section_nr) &&
			    (section_nr <= mem_blk->end_section_nr))
				continue;

		mem_blk = find_memory_block_hinted(mem_sect, mem_blk);

		ret = register_mem_sect_under_node(mem_blk, nid);
		if (!err)
			err = ret;

		/* discard ref obtained in find_memory_block() */
	}

	if (mem_blk)
		kobject_put(&mem_blk->dev.kobj);
	return err;
}

#ifdef CONFIG_HUGETLBFS
/*
 * Handle per node hstate attribute [un]registration on transistions
 * to/from memoryless state.
 */
static void node_hugetlb_work(struct work_struct *work)
{
	struct node *node = container_of(work, struct node, node_work);

	/*
	 * We only get here when a node transitions to/from memoryless state.
	 * We can detect which transition occurred by examining whether the
	 * node has memory now.  hugetlb_register_node() already check this
	 * so we try to register the attributes.  If that fails, then the
	 * node has transitioned to memoryless, try to unregister the
	 * attributes.
	 */
	if (!hugetlb_register_node(node))
		hugetlb_unregister_node(node);
}

static void init_node_hugetlb_work(int nid)
{
	INIT_WORK(&node_devices[nid]->node_work, node_hugetlb_work);
}

static int node_memory_callback(struct notifier_block *self,
				unsigned long action, void *arg)
{
	struct memory_notify *mnb = arg;
	int nid = mnb->status_change_nid;

	switch (action) {
	case MEM_ONLINE:
	case MEM_OFFLINE:
		/*
		 * offload per node hstate [un]registration to a work thread
		 * when transitioning to/from memoryless state.
		 */
		if (nid != NUMA_NO_NODE)
			schedule_work(&node_devices[nid]->node_work);
		break;

	case MEM_GOING_ONLINE:
	case MEM_GOING_OFFLINE:
	case MEM_CANCEL_ONLINE:
	case MEM_CANCEL_OFFLINE:
	default:
		break;
	}

	return NOTIFY_OK;
}
#endif	/* CONFIG_HUGETLBFS */
#else	/* !CONFIG_MEMORY_HOTPLUG_SPARSE */

static int link_mem_sections(int nid) { return 0; }
#endif	/* CONFIG_MEMORY_HOTPLUG_SPARSE */

#if !defined(CONFIG_MEMORY_HOTPLUG_SPARSE) || \
    !defined(CONFIG_HUGETLBFS)
static inline int node_memory_callback(struct notifier_block *self,
				unsigned long action, void *arg)
{
	return NOTIFY_OK;
}

static void init_node_hugetlb_work(int nid) { }

#endif

int register_one_node(int nid)
{
	int error = 0;
	int cpu;

	if (node_online(nid)) {
		int p_node = parent_node(nid);
		struct node *parent = NULL;

		if (p_node != nid)
			parent = node_devices[p_node];

		node_devices[nid] = kzalloc(sizeof(struct node), GFP_KERNEL);
		if (!node_devices[nid])
			return -ENOMEM;

		error = register_node(node_devices[nid], nid, parent);

		/* link cpu under this node */
		for_each_present_cpu(cpu) {
			if (cpu_to_node(cpu) == nid)
				register_cpu_under_node(cpu, nid);
		}

		/* link memory sections under this node */
		error = link_mem_sections(nid);

		/* initialize work queue for memory hot plug */
		init_node_hugetlb_work(nid);
	}

	return error;

}

void unregister_one_node(int nid)
{
	unregister_node(node_devices[nid]);
	node_devices[nid] = NULL;
}

/*
 * node states attributes
 */

static ssize_t print_nodes_state(enum node_states state, char *buf)
{
	int n;

	n = nodelist_scnprintf(buf, PAGE_SIZE-2, node_states[state]);
	buf[n++] = '\n';
	buf[n] = '\0';
	return n;
}

struct node_attr {
	struct device_attribute attr;
	enum node_states state;
};

static ssize_t show_node_state(struct device *dev,
			       struct device_attribute *attr, char *buf)
{
	struct node_attr *na = container_of(attr, struct node_attr, attr);
	return print_nodes_state(na->state, buf);
}

#define _NODE_ATTR(name, state) \
	{ __ATTR(name, 0444, show_node_state, NULL), state }

static struct node_attr node_state_attr[] = {
	[N_POSSIBLE] = _NODE_ATTR(possible, N_POSSIBLE),
	[N_ONLINE] = _NODE_ATTR(online, N_ONLINE),
	[N_NORMAL_MEMORY] = _NODE_ATTR(has_normal_memory, N_NORMAL_MEMORY),
#ifdef CONFIG_HIGHMEM
	[N_HIGH_MEMORY] = _NODE_ATTR(has_high_memory, N_HIGH_MEMORY),
#endif
#ifdef CONFIG_MOVABLE_NODE
	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
#endif
	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
};

static struct attribute *node_state_attrs[] = {
	&node_state_attr[N_POSSIBLE].attr.attr,
	&node_state_attr[N_ONLINE].attr.attr,
	&node_state_attr[N_NORMAL_MEMORY].attr.attr,
#ifdef CONFIG_HIGHMEM
	&node_state_attr[N_HIGH_MEMORY].attr.attr,
#endif
#ifdef CONFIG_MOVABLE_NODE
	&node_state_attr[N_MEMORY].attr.attr,
#endif
	&node_state_attr[N_CPU].attr.attr,
	NULL
};

static struct attribute_group memory_root_attr_group = {
	.attrs = node_state_attrs,
};

static const struct attribute_group *cpu_root_attr_groups[] = {
	&memory_root_attr_group,
	NULL,
};

#define NODE_CALLBACK_PRI	2	/* lower than SLAB */
static int __init register_node_type(void)
{
	int ret;

 	BUILD_BUG_ON(ARRAY_SIZE(node_state_attr) != NR_NODE_STATES);
 	BUILD_BUG_ON(ARRAY_SIZE(node_state_attrs)-1 != NR_NODE_STATES);

	ret = subsys_system_register(&node_subsys, cpu_root_attr_groups);
	if (!ret) {
		static struct notifier_block node_memory_callback_nb = {
			.notifier_call = node_memory_callback,
			.priority = NODE_CALLBACK_PRI,
		};
		register_hotmemory_notifier(&node_memory_callback_nb);
	}

	/*
	 * Note:  we're not going to unregister the node class if we fail
	 * to register the node state class attribute files.
	 */
	return ret;
}
postcore_initcall(register_node_type);