Commit message | Author | Age
* serial: turn serial console suspend into a boot-time rather than compile-time optionAndres Salomon2007-10-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, there's a CONFIG_DISABLE_CONSOLE_SUSPEND that allows one to stop the serial console from being suspended when the rest of the machine goes to sleep. This is incredibly useful for debugging power management-related things; however, having it as a compile-time option has proved to be incredibly inconvenient for us (OLPC). There are plenty of times that we want serial console to not suspend, but for the most part we'd like serial console to be suspended. This drops CONFIG_DISABLE_CONSOLE_SUSPEND, and replaces it with a kernel boot parameter (no_console_suspend). By default, the serial console will be suspended along with the rest of the system; by passing 'no_console_suspend' to the kernel during boot, serial console will remain alive during suspend. For now, this is pretty serial console specific; further fixes could be applied to make this work for things like netconsole. Signed-off-by: Andres Salomon <dilinger@debian.org> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
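The move from a compile-time option to a boot parameter can be pictured with a small userspace sketch. This is not the kernel code from the patch: the helper `cmdline_has_flag()` and the function `console_suspend_enabled()` are hypothetical names chosen for illustration, and only the parameter name `no_console_suspend` comes from the commit message.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `flag` appears as a whole,
 * space-separated word in the boot command line `cmdline`. */
static bool cmdline_has_flag(const char *cmdline, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts_word = (p == cmdline) || (p[-1] == ' ');
        bool ends_word = (p[len] == '\0') || (p[len] == ' ');

        if (starts_word && ends_word)
            return true;
        p++;    /* keep scanning after a partial match */
    }
    return false;
}

/* Default as described in the commit message: the console is suspended
 * unless no_console_suspend was passed at boot. */
static bool console_suspend_enabled(const char *cmdline)
{
    return !cmdline_has_flag(cmdline, "no_console_suspend");
}
```

The point of the design is that one kernel image can serve both cases: the decision moves from build time to the command line of each boot.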
* freezer: measure freezing timeRafael J. Wysocki2007-10-18
| | | | | | | | | Measure the time of the freezing of tasks, even if it doesn't fail. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* freezer: be more verboseRafael J. Wysocki2007-10-18
| | | | | | | | | | | Increase the freezer's verbosity a bit, so that it's easier to read problem reports related to it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm_trace displays the wrong time from the RTCRafael J. Wysocki2007-10-18
| | | | | | | | | | | | The way in which read_magic_time() displays the date read from the RTC is apparently confusing to the users (cf. https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=250238). Make it print dates in the standard way. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* unexport pm_power_off_prepareAdrian Bunk2007-10-18
| | | | | | | | | This patch removes the unused EXPORT_SYMBOL(pm_power_off_prepare). Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* freezer: do not send signals to kernel threadsRafael J. Wysocki2007-10-18
| | | | | | | | | | | | | | | | | | The freezer should not send signals to kernel threads, since that may lead to subtle problems. In particular, commit b74d0deb968e1f85942f17080eace015ce3c332c has changed recalc_sigpending_tsk() so that it doesn't clear TIF_SIGPENDING. For this reason, if the freezer continues to send fake signals to kernel threads and the freezing of kernel threads fails, some of them may be running with TIF_SIGPENDING set forever. Accordingly, recalc_sigpending_tsk() shouldn't set the task's TIF_SIGPENDING flag if TIF_FREEZE is set. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* freezer: introduce freezer-friendly waiting macrosRafael J. Wysocki2007-10-18
| | | | | | | | | | | | | | | | Introduce freezer-friendly wrappers around wait_event_interruptible() and wait_event_interruptible_timeout(), originally defined in <linux/wait.h>, to be used in freezable kernel threads. Make some of the freezable kernel threads use them. This is necessary for the freezer to stop sending signals to kernel threads, which is implemented in the next patch. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* freezer: prevent new tasks from inheriting TIF_FREEZE setRafael J. Wysocki2007-10-18
| | | | | | | | | | | | Tasks should go to the refrigerator only if explicitly requested to do that by the freezer and not as a result of inheriting the TIF_FREEZE flag set from the parent. Make it happen. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* freezer: do not sync filesystems from freeze_processesRafael J. Wysocki2007-10-18
| | | | | | | | | | | | | | | | | The syncing of filesystems from within the freezer is generally not needed. Also, if there's an ext3 filesystem loopback-mounted from a FUSE one, the syncing results in writes to it and deadlocks. Similarly, it will deadlock if FUSE implements sync. Change freeze_processes() so that it doesn't execute sys_sync() and make the suspend and hibernation code path sync filesystems independently of the freezer. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* freezer: document relationship with memory shrinkingRafael J. Wysocki2007-10-18
| | | | | | | | | | | | One important reason to freeze tasks, which is that we don't want them to allocate memory after freeing it for the hibernation image, has not been documented. Fix it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* PM: Rename hibernation_ops to platform_hibernation_opsRafael J. Wysocki2007-10-18
| | | | | | | | | | | | Rename 'struct hibernation_ops' to 'struct platform_hibernation_ops' in analogy with 'struct platform_suspend_ops'. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* PM: Rework struct hibernation_opsRafael J. Wysocki2007-10-18
| | | | | | | | | | | | | | | | | During hibernation we also need to tell the ACPI core that we're going to put the system into the S4 sleep state. For this reason, an additional method in 'struct hibernation_ops' is needed, playing the role of set_target() in 'struct platform_suspend_operations'. Moreover, the role of the .prepare() method is now different, so it's better to introduce another method, that in general may be different from .prepare(), that will be used to prepare the platform for creating the hibernation image (.prepare() is used anyway to notify the platform that we're going to enter the low power state after the image has been saved). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* PM: Make suspend_ops staticRafael J. Wysocki2007-10-18
| | | | | | | | | | | The variable suspend_ops representing the set of global platform-specific suspend-related operations, used by the PM core, need not be exported outside of kernel/power/main.c .  Make it static. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* PM: Rework struct platform_suspend_opsRafael J. Wysocki2007-10-18
| | | | | | | | | | | | | | | | | There is no reason why the .prepare() and .finish() methods in 'struct platform_suspend_ops' should take any arguments, since architectures don't use these methods' argument in any practically meaningful way (ie. either the target system sleep state is conveyed to the platform by .set_target(), or there is only one suspend state supported and it is indicated to the PM core by .valid(), or .prepare() and .finish() aren't defined at all).  There also is no reason why .finish() should return any result. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
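A minimal sketch of an operations table with the shape described above, where `.prepare()` and `.finish()` take no arguments and `.finish()` returns nothing. All `demo_` names, the state type, and the calling order in `demo_suspend_enter()` are assumptions for illustration, not the kernel's definitions.

```c
#include <stddef.h>

typedef int demo_suspend_state_t;   /* hypothetical stand-in for the state type */

/* Hypothetical ops table: the target state reaches the platform through
 * .set_target(), so .prepare() and .finish() need no argument. */
struct demo_platform_suspend_ops {
    int  (*valid)(demo_suspend_state_t state);
    int  (*set_target)(demo_suspend_state_t state);
    int  (*prepare)(void);
    int  (*enter)(demo_suspend_state_t state);
    void (*finish)(void);
};

/* Hypothetical core routine. Optional callbacks may be NULL.
 * Returns 0 on success, -1 if the state is rejected or .enter is missing,
 * otherwise the error code of the failing callback. */
static int demo_suspend_enter(const struct demo_platform_suspend_ops *ops,
                              demo_suspend_state_t state)
{
    int err;

    if (!ops || !ops->enter)
        return -1;
    if (ops->valid && !ops->valid(state))
        return -1;
    if (ops->set_target) {
        err = ops->set_target(state);
        if (err)
            return err;
    }
    if (ops->prepare) {
        err = ops->prepare();
        if (err)
            return err;
    }
    err = ops->enter(state);
    if (ops->finish)
        ops->finish();
    return err;
}

/* Example platform: supports only state 3 and remembers the target. */
static demo_suspend_state_t demo_target = -1;
static int demo_finished;

static int demo_valid(demo_suspend_state_t s) { return s == 3; }
static int demo_set_target(demo_suspend_state_t s) { demo_target = s; return 0; }
static int demo_enter(demo_suspend_state_t s) { return s == demo_target ? 0 : -1; }
static void demo_finish(void) { demo_finished = 1; }

static const struct demo_platform_suspend_ops demo_ops = {
    .valid = demo_valid,
    .set_target = demo_set_target,
    .enter = demo_enter,
    .finish = demo_finish,
};
```

Calling `demo_suspend_enter(&demo_ops, 3)` walks the callbacks in order; the example platform leaves `.prepare` unset, which corresponds to the case where `.prepare()` is not defined at all.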
* PM: Rename struct pm_ops and related thingsRafael J. Wysocki2007-10-18
| | | | | | | | | | | | | | | | The name of 'struct pm_ops' suggests that it is related to the power management in general, but in fact it is only related to suspend.  Moreover, its name should indicate what this structure is used for, so it seems reasonable to change it to 'struct platform_suspend_ops'.  In that case, the name of the global variable of this type used by the PM core and the names of related functions should be changed accordingly. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* PM: Move definition of struct pm_ops to suspend.hRafael J. Wysocki2007-10-18
Move the definition of 'struct pm_ops' and related functions from <linux/pm.h> to <linux/suspend.h>.

There are, at least, the following reasons to do that:

* 'struct pm_ops' is specifically related to suspend and not to the power management in general.
* As long as 'struct pm_ops' is defined in <linux/pm.h>, any modification of it causes the entire kernel to be recompiled, which is unnecessary and annoying.
* Some suspend-related features are already defined in <linux/suspend.h>, so it is logical to move the definition of 'struct pm_ops' into there.
* 'struct hibernation_ops', being the hibernation-related counterpart of 'struct pm_ops', is defined in <linux/suspend.h>.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* make kernel/power/main.c:suspend_enter() staticAdrian Bunk2007-10-18
| | | | | | | | | | suspend_enter() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* logo.c: get rid of mips_machgroupRalf Baechle2007-10-18
There has not been any serious user of this ill-conceived thing since its original invention in about '95, so I recently deleted it from everywhere except the last instance in logo.c. This patch removes the last two instances in logo.c. The conditions were not useful anyway, as when compiled in they would always evaluate as true.

Last but not least, this is necessary to get the SGI IP22 and DECstation kernels to compile again.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fb modedb: Refactor confusing mode_option assignmentGeert Uytterhoeven2007-10-18
| | | | | | | Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* tty_ioctl: fix the baud_table check in encode_baud_rateMaciej W. Rozycki2007-10-18
| | | | | | | | | | | The tty_termios_encode_baud_rate() function as defined by tty_ioctl.c has a problem with the baud_table within. The comparison operators are reversed and as a result this table's entries never match and BOTHER is always used. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
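Why reversed comparison operators make a table lookup never match can be seen in a short sketch. The table contents, the `demo_` names, and the tolerance parameter are assumptions for illustration; this is not the code of tty_termios_encode_baud_rate().

```c
#include <stddef.h>

/* Hypothetical table of standard rates; a real table is longer. */
static const unsigned int demo_baud_table[] = {
    1200, 2400, 4800, 9600, 19200, 38400, 57600, 115200
};
#define DEMO_N_BAUD (sizeof(demo_baud_table) / sizeof(demo_baud_table[0]))

/* Return the index of the first table entry within `tolerance` of `baud`,
 * or -1 if none matches (the caller would then fall back to a custom
 * rate, as BOTHER does). */
static int demo_find_baud(unsigned int baud, unsigned int tolerance)
{
    unsigned int lo = baud > tolerance ? baud - tolerance : 0;
    unsigned int hi = baud + tolerance;
    size_t i;

    for (i = 0; i < DEMO_N_BAUD; i++) {
        /* Correct orientation: lo <= entry <= hi.  With both operators
         * reversed (lo >= entry && hi <= entry) the test asks for
         * hi <= entry <= lo, which cannot hold when tolerance > 0. */
        if (lo <= demo_baud_table[i] && demo_baud_table[i] <= hi)
            return (int)i;
    }
    return -1;
}
```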
* Remove CONFIG_VT_UNICODEJan Engelhardt2007-10-18
| | | | | | | | | | | | | | Since default_utf8 is already a sysfs attribute, having an extra CONFIG_VT_UNICODE compile-time option is redundant, since sysfs attributes can be set at boot and run time. Also let Linux VCs default to UTF-8 (as per the discussion at http://lkml.org/lkml/2007/9/6/99). Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Cc: Bill Nottingham <notting@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Kexec: Update URL in MAINTAINERS fileSimon Horman2007-10-18
I'm not sure that the new URL satisfies the requirement of status/info, but it does at least as good a job as the old URL, and contains current releases of kexec-tools, rather than somewhat ancient versions.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i4l: Fix random hard freeze with AVM c4 cardKarsten Keil2007-10-18
The patch

- Includes the call to capilib_data_b3_req in the spinlock. This routine in turn calls the offending mq_enqueue routine that triggered the freeze if not locked. This should also fix other indicators of an inconsistent capilib_msgidqueue list, which trigger messages like:

      Oct 5 03:05:57 BERL0 kernel: kcapi: msgid 3019 ncci 0x30301 not on queue

  that we saw several times a day (usually several in a row).

- Fixes all occurrences of c4_dispatch_tx to be called with active spinlock; there were some instances where no lock was active. Mostly these are in very infrequently called routines, so the additional performance penalty is minimal.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rainer Brestan <rainer.brestan@frequentis.com>
Signed-off-by: Ralf Schlatterbeck <rsc@runtux.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i4l: fix random freezes with AVM B1 driversKarsten Keil2007-10-18
This fixes, for the B1 versions, the same issue that was debugged for the C4 controller. The capilib_ functions modify or traverse a linked list without locking. This patch extends the existing locking to the calls of these functions to prevent access to a list which is in the middle of a modification.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* au1100fb: fix modpost warningsRalf Baechle2007-10-18
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x170be8): Section mismatch: reference to .init.data:au1100fb_fix (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170dc4): Section mismatch: reference to .init.data:au1100fb_var (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170dd0): Section mismatch: reference to .init.data:au1100fb_fix (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170de0): Section mismatch: reference to .init.data:au1100fb_var (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170e70): Section mismatch: reference to .init.data:au1100fb_var (between 'au1100fb_drv_probe' and 'read_null')

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* newport_con.c: fix build errors and warningsRalf Baechle2007-10-18
Fix build broken by accaa24c492f1aa3b9c37226d868dc59c3007531:

  CC drivers/video/console/newport_con.o
drivers/video/console/newport_con.c: In function 'newport_show_logo':
drivers/video/console/newport_con.c:111: error: assignment of read-only location
drivers/video/console/newport_con.c:111: warning: assignment makes integer from pointer without a cast
drivers/video/console/newport_con.c:112: error: assignment of read-only location
drivers/video/console/newport_con.c:112: warning: assignment makes integer from pointer without a cast

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext4: lighten up resize transaction requirementsEric Sandeen2007-10-17
When resizing online, setup_new_group_blocks attempts to reserve a potentially very large transaction, depending on the current filesystem geometry. For some journal sizes, there may not be enough room for this transaction, and the online resize will fail.

The patch below resizes & restarts the transaction as necessary while setting up the new group, and should work with even the smallest journal.

Tested with something like:

    [root@newbox ~]# dd if=/dev/zero of=fsfile bs=1024 count=32768
    [root@newbox ~]# mkfs.ext3 -b 1024 fsfile 16384
    [root@newbox ~]# mount -o loop fsfile mnt/
    [root@newbox ~]# resize2fs /dev/loop0
    resize2fs 1.40.2 (12-Jul-2007)
    Filesystem at /dev/loop0 is mounted on /root/mnt; on-line resizing required
    old desc_blocks = 1, new_desc_blocks = 1
    Performing an on-line resize of /dev/loop0 to 32768 (1k) blocks.
    resize2fs: No space left on device While trying to add group #2
    [root@newbox ~]# dmesg | tail -n 1
    JBD: resize2fs wants too many credits (258 > 256)
    [root@newbox ~]#

With the below change, it works.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Andreas Dilger <adilger@clusterfs.com>
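The idea of restarting a transaction instead of reserving everything up front can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the patch itself: `struct demo_handle` and both functions are hypothetical, and a real journal would commit the work done so far when a handle is restarted.

```c
/* Hypothetical model of a journal handle: `credits` blocks may still be
 * modified under it, and no single handle may hold more than
 * `max_credits`. */
struct demo_handle {
    int credits;
    int max_credits;
    int restarts;   /* how many times the handle was (re)started */
};

/* Make sure at least `need` credits are available before touching the
 * next block.  If the handle has run dry, restart it with a fresh batch.
 * Returns 0 on success, -1 if `need` can never fit in one handle. */
static int demo_ensure_credits(struct demo_handle *h, int need)
{
    if (need > h->max_credits)
        return -1;
    if (h->credits >= need)
        return 0;
    h->credits = h->max_credits;
    h->restarts++;
    return 0;
}

/* Walk `nblocks` metadata blocks, spending one credit on each. */
static int demo_setup_blocks(struct demo_handle *h, int nblocks)
{
    int i;

    for (i = 0; i < nblocks; i++) {
        if (demo_ensure_credits(h, 1) != 0)
            return -1;
        h->credits--;   /* one block modified under this handle */
    }
    return 0;
}
```

With a limit of 256 credits, a single request for 258 cannot succeed, which mirrors the "258 > 256" failure in the transcript, while the block-by-block walk completes by starting a second batch.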
* ext4: fix setup_new_group_blocks lockingEric Sandeen2007-10-17
| | | | | | | | | setup_new_group_blocks() manipulates the group descriptor block bh under the block_bitmap bh's lock. It shouldn't matter since nobody but resize should be touching these blocks, but it's worth fixing up. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com>
* ext4: sparse fixesAneesh Kumar K.V2007-10-17
| | | | Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* ext4: Convert ext4_extent_idx.ei_leaf to ext4_extent_idx.ei_leaf_loAneesh Kumar K.V2007-10-17
Convert ext4_extent_idx.ei_leaf to ext4_extent_idx.ei_leaf_lo.

This helps in finding BUGs due to direct partial access of these split 48-bit values.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* ext4: Convert ext4_extent.ee_start to ext4_extent.ee_start_loAneesh Kumar K.V2007-10-17
Convert ext4_extent.ee_start to ext4_extent.ee_start_lo.

This helps in finding BUGs due to direct partial access of these split 48-bit values.

Also fix direct partial access in ext4 code.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* ext4: Convert s_r_blocks_count and s_free_blocks_countAneesh Kumar K.V2007-10-17
Convert s_r_blocks_count and s_free_blocks_count to s_r_blocks_count_lo and s_free_blocks_count_lo.

This helps in finding BUGs due to direct partial access of these split 64-bit values.

Also fix direct partial access in ext4 code.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Convert s_blocks_count to s_blocks_count_loAneesh Kumar K.V2007-10-17
Convert s_blocks_count to s_blocks_count_lo.

This helps in finding BUGs due to direct partial access of these split 64-bit values.

Also fix direct partial access in ext4 code.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* ext4: Convert bg_inode_bitmap and bg_inode_tableAneesh Kumar K.V2007-10-17
Convert bg_inode_bitmap and bg_inode_table to bg_inode_bitmap_lo and bg_inode_table_lo.

This helps in finding BUGs due to direct partial access of these split 64-bit values.

Also fix one direct partial access.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
* ext4: Convert bg_block_bitmap to bg_block_bitmap_loAneesh Kumar K.V2007-10-17
Convert bg_block_bitmap to bg_block_bitmap_lo.

This helps in catching some BUGs due to direct partial access of these split fields.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
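The `_lo` renames in this series guard against reading only half of a split value: once the low part carries a `_lo` suffix, code that still uses the old field name stops compiling and has to go through an accessor. A minimal sketch of such accessors follows, assuming a hypothetical `struct demo_extent` kept in host byte order (real on-disk fields are little-endian and would need byte-order conversion).

```c
#include <stdint.h>

/* Hypothetical record with a 48-bit block number split into a 32-bit low
 * part and a 16-bit high part. */
struct demo_extent {
    uint32_t start_lo;
    uint16_t start_hi;
};

/* Combine both halves; reading start_lo alone would silently truncate
 * block numbers above 2^32 - 1. */
static uint64_t demo_extent_start(const struct demo_extent *ex)
{
    return ((uint64_t)ex->start_hi << 32) | ex->start_lo;
}

/* Store a 48-bit block number into both halves. */
static void demo_extent_store_start(struct demo_extent *ex, uint64_t block)
{
    ex->start_lo = (uint32_t)(block & 0xffffffffu);
    ex->start_hi = (uint16_t)((block >> 32) & 0xffffu);
}
```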
* ext4: FLEX_BG Kernel support v2.Jose R. Santos2007-10-17
This feature relaxes check restrictions on where each block group's metadata is located within the storage media. This allows for the allocation of bitmaps or inode tables outside the block group boundaries in cases where bad blocks force us to look for new blocks which the owning block group cannot satisfy. This will also allow for new metadata allocation schemes to improve performance and scalability.

Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ext4: Fix sparse warningsAneesh Kumar K.V2007-10-17
| | | | | Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Ext4: Uninitialized Block GroupsAndreas Dilger2007-10-17
In pass1 of e2fsck, every inode table in the filesystem is scanned and checked, regardless of whether it is in use. This is the most time-consuming part of the filesystem check. The uninitialized block group feature can greatly reduce e2fsck time by eliminating checking of uninitialized inodes.

With this feature, there is a high water mark of used inodes for each block group. Block and inode bitmaps can be uninitialized on disk via a flag in the group descriptor to avoid reading or scanning them at e2fsck time. A checksum of each group descriptor is used to ensure that corruption in the group descriptor's bit flags does not cause incorrect operation.

The feature is enabled through a mkfs option:

    mke2fs /dev/ -O uninit_groups

A patch adding support for uninitialized block groups to e2fsprogs tools has been posted to the linux-ext4 mailing list.

The patches have been stress tested with fsstress and fsx. In performance tests testing e2fsck time, we have seen that e2fsck time on ext3 grows linearly with the total number of inodes in the filesystem. In ext4 with the uninitialized block groups feature, the e2fsck time is constant, based solely on the number of used inodes rather than the total inode count. Since typical ext4 filesystems only use 1-10% of their inodes, this feature can greatly reduce e2fsck time for users, with a performance improvement of 2-20 times, depending on how full the filesystem is.

The attached graph shows the major improvements in e2fsck times in filesystems with a large total inode count, but few inodes in use.

In each group descriptor, if we have

EXT4_BG_INODE_UNINIT set in bg_flags: The inode table is not initialized/used in this group, so we can skip the consistency check during fsck.
EXT4_BG_BLOCK_UNINIT set in bg_flags: No block in the group is used, so we can skip the block bitmap verification for this group.

We also add two new fields to the group descriptor as a part of the uninitialized group patch.

    __le16 bg_itable_unused; /* Unused inodes count */
    __le16 bg_checksum; /* crc16(sb_uuid+group+desc) */

bg_itable_unused: If EXT4_BG_INODE_UNINIT is not set in bg_flags, then bg_itable_unused gives the offset within the inode table up to which the inodes are used. This can be used by fsck to skip the list of inodes that are marked unused.

bg_checksum: Now that we depend on bg_flags and bg_itable_unused to determine the block and inode usage, we need to make sure the group descriptor is not corrupt. We add a checksum to the group descriptor to detect corruption. If the descriptor is found to be corrupt, we mark all the blocks and inodes in the group used.

Signed-off-by: Avantika Mathur <mathur@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
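How a checker might combine the flags with the descriptor checksum can be sketched as follows. Everything prefixed `demo_` is hypothetical: the trimmed-down descriptor, the flag values, the host byte order, and the initial CRC value are assumptions for illustration, and only the `crc16(sb_uuid+group+desc)` recipe and the "treat a corrupt descriptor as fully used" rule come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BG_INODE_UNINIT 0x0001  /* inode table not initialized/used */
#define DEMO_BG_BLOCK_UNINIT 0x0002  /* no block in the group is used */

/* Hypothetical, trimmed-down group descriptor (host byte order). */
struct demo_group_desc {
    uint16_t bg_flags;
    uint16_t bg_itable_unused;
    uint16_t bg_checksum;
};

/* Bitwise CRC-16 with reflected polynomial 0xA001, continuing from `crc`. */
static uint16_t demo_crc16(uint16_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Checksum over the superblock UUID, the group number, and the descriptor
 * fields that precede bg_checksum. */
static uint16_t demo_desc_csum(const uint8_t uuid[16], uint32_t group,
                               const struct demo_group_desc *gd)
{
    uint16_t crc = demo_crc16(0xFFFF, uuid, 16);

    crc = demo_crc16(crc, &group, sizeof(group));
    crc = demo_crc16(crc, gd, offsetof(struct demo_group_desc, bg_checksum));
    return crc;
}

/* Returns 1 if a checker may skip the inode table of this group. */
static int demo_can_skip_inode_table(const uint8_t uuid[16], uint32_t group,
                                     const struct demo_group_desc *gd)
{
    if (demo_desc_csum(uuid, group, gd) != gd->bg_checksum)
        return 0;   /* corrupt descriptor: treat everything as in use */
    return (gd->bg_flags & DEMO_BG_INODE_UNINIT) != 0;
}
```

Mixing the UUID and group number into the checksum means a descriptor copied from another group or another filesystem fails verification as well.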
* ext4: remove #ifdef CONFIG_EXT4_INDEXEric Sandeen2007-10-17
| | | | | | | | | | | | CONFIG_EXT4_INDEX is not an exposed config option in the kernel, and it is unconditionally defined in ext4_fs.h. tune2fs is already able to turn off dir indexing, so at this point it's just cluttering up the code. Remove it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ext4: Remove (partial, never completed) fragment supportColy Li2007-10-17
| | | | | | | | | | Fragment support in ext2/3/4 was never implemented, and it probably will never be implemented. So remove it from ext4. Signed-off-by: Coly Li <coyli@suse.de> Acked-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* JBD2: debug code cleanup.Jose R. Santos2007-10-17
Mostly stolen from akpm's JBD cleanup patch.

- use `#ifdef foo' instead of `#if defined(foo)'
- Make journal_enable_debug __read_mostly just for the heck of it
- Make jbd_debugfs_dir and jbd_debug static
- debugfs_remove(NULL) is legal: remove unneeded tests
- remove unnecessary empty loops

Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* jbd2: fix commit code to properly abort journalJan Kara2007-10-17
| | | | | | | | | | | We should really call journal_abort() and not __journal_abort_hard() in case of errors. The latter call does not record the error in the journal superblock and thus filesystem won't be marked as with errors later (and user could happily mount it without any warning). Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* jbd2: JBD_XXX to JBD2_XXX naming cleanupMingming Cao2007-10-17
| | | | | | | change JBD_XXX macros to JBD2_XXX in JBD2/Ext4 Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* JBD2/Ext4: Convert kmalloc to kzalloc in jbd2/ext4Mingming Cao2007-10-17
| | | | | | Convert kmalloc to kzalloc() and get rid of the memset(). Signed-off-by: Mingming Cao <cmm@us.ibm.com>
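The kmalloc-plus-memset to kzalloc conversion has a direct userspace analogue, sketched here with `malloc()`/`memset()` versus `calloc()`. The `demo_item` type and both functions are hypothetical, and this is an analogy rather than kernel code.

```c
#include <stdlib.h>
#include <string.h>

struct demo_item {
    int id;
    char name[16];
};

/* Before: allocate, then clear by hand. */
static struct demo_item *demo_alloc_old(void)
{
    struct demo_item *it = malloc(sizeof(*it));

    if (it)
        memset(it, 0, sizeof(*it));
    return it;
}

/* After: a zeroing allocator does both steps in one call. */
static struct demo_item *demo_alloc_new(void)
{
    return calloc(1, sizeof(struct demo_item));
}
```

Both functions return a zero-filled object or NULL; the second form simply leaves no room to forget the clearing step.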
* JBD2: replace jbd_kmalloc with kmalloc directly.Mingming Cao2007-10-17
This patch cleans up jbd_kmalloc and replaces it with kmalloc directly.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
* JBD: replace jbd_kmalloc with kmalloc directlyMingming Cao2007-10-17
This patch cleans up jbd_kmalloc and replaces it with kmalloc directly.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
* JBD2: jbd2 slab allocation cleanupsMingming Cao2007-10-17
JBD2: Replace slab allocations with page allocations

JBD2 allocates memory for committed_data and frozen_data from slab. However, JBD2 should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
* JBD: JBD slab allocation cleanupsMingming Cao2007-10-17
JBD: Replace slab allocations with page allocations

JBD allocates memory for committed_data and frozen_data from slab. However, JBD should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
* Merge branch 'release' of ↵Linus Torvalds2007-10-17
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

  * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
    [IA64] fix non-numa build
| * [IA64] fix non-numa buildAndrew Morton2007-10-17
arch/ia64/kernel/machine_kexec.c: In function `arch_crash_save_vmcoreinfo':
arch/ia64/kernel/machine_kexec.c:131: error: `pgdat_list' undeclared (first use in this function)
arch/ia64/kernel/machine_kexec.c:131: error: (Each undeclared identifier is reported only once
arch/ia64/kernel/machine_kexec.c:131: error: for each function it appears in.)
arch/ia64/kernel/machine_kexec.c:134: error: `node_memblk' undeclared (first use in this function)
arch/ia64/kernel/machine_kexec.c:135: error: `NR_NODE_MEMBLKS' undeclared (first use in this function)
arch/ia64/kernel/machine_kexec.c:136: error: invalid application of `sizeof' to incomplete type `node_memblk_s'
arch/ia64/kernel/machine_kexec.c:137: error: dereferencing pointer to incomplete type
arch/ia64/kernel/machine_kexec.c:138: error: dereferencing pointer to incomplete type
make[1]: *** [arch/ia64/kernel/machine_kexec.o] Error 1

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>