litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	ieee1394: sbp2: remove irritating log message	Stefan Richter	2006-12-07
\| \| \| \| \| \| \| \| \| \| \|	The queue depth can be read from /sys/bus/scsi/devices/*/queue_depth, so don't log it. And the hint about speed improvements is misleading, at least under current kernels. If serialization is switched off, read performance is typically increased by less than 10%. (I did not test write performance recently.) On the other hand, serialize_io=0 is not yet safe due to some implementation issues that are not trivial to fix. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ohci1394: shortcut irq printing	Alexey Dobriyan	2006-12-07
\| \| \| \| \| \| \|	To print irq number no need to transform to string using %d, then print using %s. Just use %d. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
*	ieee1394: nodemgr: take it easy if bus_rescan_devices fails	Stefan Richter	2006-12-07
\| \| \| \| \| \|	This happens. No need to log a BUG trace. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	drivers/ieee1394/*: use kmemdup()	Eric Sesterhenn	2006-12-07
\| \| \| \| \| \|	Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: ohci1394: proper log messages in suspend and resume	Stefan Richter	2006-12-07
\| \| \| \| \| \| \| \| \| \|	- correct thinko in one of my last commits: cannot use PRINT macro with ohci == NULL - add log messages on ohci == NULL and on pci_enable_device != 0 - update log macros from patch "revert fail on error in suspend" to use PRINT and DBGMSG where possible Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: ohci1394: revert fail on error in suspend	Stefan Richter	2006-12-07
\| \| \| \| \| \| \| \|	Some errors during preparation for suspended state can be skipped with a warning instead of a failure of the whole suspend transition, notably an error in pci_set_power_state. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: only build OUI database files if config enabled	Randy Dunlap	2006-12-07
\| \| \| \| \| \| \|	Only build IEEE1394 OUI database files if the config option is enabled. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: fix printk format warning	Randy Dunlap	2006-12-07
\| \| \| \| \| \| \| \|	Fix printk format warning: drivers/ieee1394/nodemgr.c:364: warning: long long unsigned int format, u64 arg (arg 3) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: nodemgr: revise semaphore protection of driver core data	Stefan Richter	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- The list "struct class.children" is supposed to be protected by class.sem, not by class.subsys.rwsem. - nodemgr_remove_uds() iterated over nodemgr_ud_class.children without proper protection. This was never observed as a bug since the code is usually only accessed by knodemgrd. All knodemgrds are currently globally serialized. But userspace can trigger this code too by writing to /sys/bus/ieee1394/destroy_node. - Clean up access to the FireWire bus type's subsys.rwsem: Access it uniformly via ieee1394_bus_type. Shrink rwsem protected regions where possible. Expand them where necessary. The latter wasn't a problem so far because knodemgr is globally serialized. This should harden the interaction of ieee1394 with sysfs and lay ground for deserialized operation of multiple knodemgrds and for implementation of subthreads for parallelized scanning and probing. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: nodemgr: reflect which return values are errors	Stefan Richter	2006-12-07
\| \| \| \| \| \|	Give better names to local variables. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: nodemgr: small fix after sysfs errors patch	Stefan Richter	2006-12-07
\| \| \| \| \| \|	One hunk in "ieee1394: handle sysfs errors" was wrong. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	dv1394: remove BKL contention	Stefan Richter	2006-12-07
\| \| \| \| \| \|	Purges the one remaining call to lock_kernel() from the 1394 subsystem. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	video1394: remove BKL contention	Daniel Drake	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \|	video1394 does not need to take the BKL. The data structures shared between file_operations and interrupts are already protected through context-specific spinlocks. The only other danger is video1394_release() being called during another operation, however this cannot happen because release is only ever invoked when the last thread has closed the fd. Signed-off-by: Daniel Drake <ddrake@brontes3d.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	video1394: small optimizations to frame retrieval codepath	Daniel Drake	2006-12-07
\| \| \| \| \| \| \| \| \|	Add some GCC branch prediction optimizations to unlikely error/safety conditions in the ioctl handling code commonly called during an application's capture loop. Signed-off-by: Daniel Drake <ddrake@brontes3d.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: handle sysfs errors	Stefan Richter	2006-12-07
\| \| \| \| \| \|	Handle driver core errors with as much care as appropriate. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: coding style in hosts.c	Stefan Richter	2006-12-07
\| \| \| \| \| \|	Some 80-columns pedantry, and touch up of a // comment. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: lock smaller region by host_num_alloc mutex	Stefan Richter	2006-12-07
\| \| \| \| \| \|	We need the mutex only around the iteration over existing hosts. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: usecs_to_jiffies takes unsigned int argument	Stefan Richter	2006-12-07
\| \| \| \|	Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: ohci1394: suspend/resume cosmetics	Stefan Richter	2006-12-07
\| \| \| \| \| \| \|	Reorder the definitions of ohci1394_pci_suspend and _resume. Remove redundant comments. Beautify return statements. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ohci1394: steps to implement suspend/resume	Bernhard Kaindl	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I did a quick shot on what I described and the appended patch does the first thing needed for working suspend/resume in ohci1394 which is HW de- and re-initialisation. It works with suspend2disk on my Ricoh R5C552 IEEE 1394 Controller with the 2.6.17 kernel to the extent that if I call dvgrab --interactive after suspend2disk without unloading ohci1394, it does not lock up dvgrab with 100% CPU but properly connects to the camera, given that I first unplug and plug the camera after coming back from suspend. I guess that could be fixed by forcing a bus reset in the resume function. I cannot test suspend to RAM here at the moment and should follow the guidelines in Documentation/power/pci.txt also, so this is rather a quick report than a finished patch and there are some rough edges: However, with this patch, I have to unload at least some in-kernel users of ohci1394 like dv1394 or video1394 before suspending. Not doing that caused an Oops and a bad tasklet error, probably from not handling ISO tasklets during suspend/resume properly. Maybe these can be temporarily cleared or unregistered and re-registered for suspend/resume with help from the other layers or from the highlevel 1394 core, but I do not really know what these do. But this patch provides a useful base to start from and is already of much help for people which do not need dv1394 and video1394 or can unload them at least during suspend. I cannot test function with sbp2 at the moment, but raw1394 seems to work fine. Signed-off-by: Bernhard Kaindl <bk@fsfe.org> Update 1: merge with previous two ohci1394 suspend/resume patches Update 2: version for application on top of Linux 2.6.19-rc4 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: raw1394: add comments on lock usage	Stefan Richter	2006-12-07
\| \| \| \| \| \| \|	Add a who-is-who about some locks and list heads in raw1394's struct definitions. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: sbp2: slightly reorder sbp2scsi_abort	Stefan Richter	2006-12-07
\| \| \| \| \| \| \| \| \|	Put the target's fetch agent into reset state before the underlying ORB DMA is unmapped and the ->done handler is called. It is highly unlikely but the target could access that ORB right before sbp2 sends the reset request. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	ieee1394: remove unused struct member from highlevel API	Stefan Richter	2006-12-07
\| \| \| \| \| \| \| \| \|	struct hpsb_highlevel's struct module *owner is neither used by the IEEE 1394 core nor set by any of the in-tree drivers or the two out-of-tree highlevel drivers I know about (dfg1394, mem1394) --- nor is this member documented. An unscheduled removal seems acceptable. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	Add "run_scheduled_work()" workqueue function	Linus Torvalds	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows workqueue users to run just their own pending work, rather than wait for the whole workqueue to finish running. This solves the deadlock with networking libphy that was due to other workqueue entries possibly needing a lock that was held by the routine that wanted to flush its own work. It's not wonderful: if you absolutely need to synchronize with the work function having been executed, any user strictly speaking should have its own completion tracking logic, since when we run things explicitly by hand, the generic workqueue layer can no longer help us synchronize. Also, this is strictly only usable for work that has been scheduled without any delayed timers. You can not mix the new interface with schedule_delayed_work(). But it's better than what we had currently. Acked-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	Merge branch 'upstream-linus' of ↵	Linus Torvalds	2006-12-07
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] libata: Incorrect timing computation for PIO5/6 [PATCH] sata_promise: new EH conversion, take 2 [PATCH] libata: let ATA_FLAG_PIO_POLLING use polling pio for ATA_PROT_NODATA [PATCH] sata_promise: cleanups, take 2
\| *	[PATCH] libata: Incorrect timing computation for PIO5/6	Alan	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ata timing computation code makes some mistakes in PIO5/6 because a check was not updated correctly when I put this support into the kernel. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| *	[PATCH] sata_promise: new EH conversion, take 2	Mikael Pettersson	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch converts sata_promise to use new-style libata error handling on Promise SATA chips, for both SATA and PATA ports. * ATA_FLAG_SRST is no longer set * ->phy_reset is no longer set as it is unused when ->error_handler is present, and pdc_sata_phy_reset() has been removed * pdc_freeze() masks interrupts and halts DMA via PDC_CTLSTAT * pdc_thaw() clears interrupt status in PDC_INT_SEQMASK and then unmasks interrupts in PDC_CTLSTAT * pdc_error_handler() reinitialises the port if it isn't frozen, and then invokes ata_do_eh() with standard {s,}ata reset methods * pdc_post_internal_cmd() resets the port in case of errors * the PATA-only 20619 chip continues to use old-style EH: not by necessity but simply because I don't have documentation for it or any way to test it Since the previous version pdc_error_handler() has been rewritten and it now mostly matches ahci and sata_sil24. In case anyone wonders: the call to pdc_reset_port() isn't a heavy-duty reset, it's a light-weight reset to quickly put a port into a sane state. The discussion about the PCI flushes in pdc_freeze() and pdc_thaw() seemed to end with a consensus that the flushes are OK and not obviously redundant, so I decided to keep them for now. This patch was prepared against 2.6.19-git7, but it also applies to 2.6.19 + libata #upstream, with or without the revised sata_promise cleanup patch I recently submitted. This patch does conflict with the #promise-sata-pata patch: this patch removes pdc_sata_phy_reset() while #promise-sata-pata modifies it. The correct patch resolution is to remove the function. Tested on 2037x and 2057x chips, with PATA patches on top and disks on both SATA and PATA ports. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| *	[PATCH] libata: let ATA_FLAG_PIO_POLLING use polling pio for ATA_PROT_NODATA	Albert Lee	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Even if ATA_FLAG_PIO_POLLING is set, libata uses irq pio for the ATA_PROT_NODATA protocol. This patch let ATA_FLAG_PIO_POLLING use polling pio for the ATA_PROT_NODATA protocol. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| *	[PATCH] sata_promise: cleanups, take 2	Mikael Pettersson	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch performs two simple cleanups of sata_promise. * Remove board_20771 and map device id 0x3577 to board_2057x. After the recent corrections for SATAII chips, board_20771 and board_2057x were equivalent in the driver. * Remove hp->hotplug_offset and use hp->flags & PDC_FLAG_GEN_II to compute hotplug_offset in pdc_host_init(). hp->hotplug_offset was used to distinguish 1st and 2nd generation chips in one particular case, but now we have that information in a more general form in hp->flags, so hp->hotplug_offset is redundant. Changes since previous submission: rebased on libata-dev #upstream, cleaned up hotplug_offset computation based on Tejun's comments, expanded hotplug_offset removal rationale. This patch does not depend on the pending new EH conversion patch. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* \|	Merge master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw	Linus Torvalds	2006-12-07
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (73 commits) [DLM] Clean up lowcomms [GFS2] Change gfs2_fsync() to use write_inode_now() [GFS2] Fix indent in recovery.c [GFS2] Don't flush everything on fdatasync [GFS2] Add a comment about reading the super block [GFS2] Mount problem with the GFS2 code [GFS2] Remove gfs2_check_acl() [DLM] fix format warnings in rcom.c and recoverd.c [GFS2] lock function parameter [DLM] don't accept replies to old recovery messages [DLM] fix size of STATUS_REPLY message [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning [DLM] fix add_requestqueue checking nodes list [GFS2] Fix recursive locking in gfs2_getattr [GFS2] Fix recursive locking in gfs2_permission [GFS2] Reduce number of arguments to meta_io.c:getbuf() [GFS2] Move gfs2_meta_syncfs() into log.c [GFS2] Fix journal flush problem [GFS2] mark_inode_dirty after write to stuffed file [GFS2] Fix glock ordering on inode creation ...
\| * \|	[DLM] Clean up lowcomms	Patrick Caulfield	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes up most of the things pointed out by akpm and Pavel Machek with comments below indicating why some things have been left: Andrew Morton wrote: > >> +static struct nodeinfo nodeid2nodeinfo(int nodeid, gfp_t alloc) >> +{ >> + struct nodeinfo ni; >> + int r; >> + int n; >> + >> + down_read(&nodeinfo_lock); > > Given that this function can sleep, I wonder if `alloc' is useful. > > I see lots of callers passing in a literal "0" for `alloc'. That's in fact > a secret (GFP_ATOMIC & ~__GFP_HIGH). I doubt if that's what you really > meant. Particularly as the code could at least have used __GFP_WAIT (aka > GFP_NOIO) which is much, much more reliable than "0". In fact "0" is the > least reliable mode possible. > > IOW, this is all bollixed up. When 0 is passed into nodeid2nodeinfo the function does not try to allocate a new structure at all. it's an indication that the caller only wants the nodeinfo struct for that nodeid if there actually is one in existance. I've tidied the function itself so it's more obvious, (and tidier!) >> +/* Data received from remote end / >> +static int receive_from_sock(void) >> +{ >> + int ret = 0; >> + struct msghdr msg; >> + struct kvec iov[2]; >> + unsigned len; >> + int r; >> + struct sctp_sndrcvinfo sinfo; >> + struct cmsghdr cmsg; >> + struct nodeinfo ni; >> + >> + /* These two are marginally too big for stack allocation, but this >> + * function is (currently) only called by dlm_recvd so static should be >> + * OK. >> + / >> + static struct sockaddr_storage msgname; >> + static char incmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > whoa. This is globally singly-threaded code?? Yes. it is only ever run in the context of dlm_recvd. >> >> +static void initiate_association(int nodeid) >> +{ >> + struct sockaddr_storage rem_addr; >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > Another static buffer to worry about. Globally singly-threaded code? Yes. Only ever called by dlm_sendd. >> + >> +/ Send a message / >> +static int send_to_sock(struct nodeinfo ni) >> +{ >> + int ret = 0; >> + struct writequeue_entry e; >> + int len, offset; >> + struct msghdr outmsg; >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > Singly-threaded? Yep. >> >> +static void dealloc_nodeinfo(void) >> +{ >> + int i; >> + >> + for (i=1; i<=max_nodeid; i++) { >> + struct nodeinfo ni = nodeid2nodeinfo(i, 0); >> + if (ni) { >> + idr_remove(&nodeinfo_idr, i); > > Didn't that need locking? Not. it's only ever called at DLM shutdown after all the other threads have been stopped. >> >> +static int write_list_empty(void) >> +{ >> + int status; >> + >> + spin_lock_bh(&write_nodes_lock); >> + status = list_empty(&write_nodes); >> + spin_unlock_bh(&write_nodes_lock); >> + >> + return status; >> +} > > This function's return value is meaningless. As soon as the lock gets > dropped, the return value can get out of sync with reality. > > Looking at the caller, this _might_ happen to be OK, but it's a nasty and > dangerous thing. Really the locking should be moved into the caller. It's just an optimisation to allow the caller to schedule if there is no work to do. if something arrives immediately afterwards then it will get picked up when the process re-awakes (and it will be woken by that arrival). The 'accepting' atomic has gone completely. as Andrew pointed out it didn't really achieve much anyway. I suspect it was a plaster over some other startup or shutdown bug to be honest. Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Pavel Machek <pavel@ucw.cz>
\| * \|	[GFS2] Change gfs2_fsync() to use write_inode_now()	Steven Whitehouse	2006-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a bit better than the previous version of gfs2_fsync() although it would be better still if we were able to call a function which only wrote the inode & metadata. Its no big deal though that this will potentially write the data as well since the VFS has already done that before calling gfs2_fsync(). I've also added a comment to explain whats going on here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org>
\| * \|	[GFS2] Fix indent in recovery.c	Steven Whitehouse	2006-12-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As per comments from Andrew Morton and Jan Engelhardt, this fixes the indent and removes the "static" from a variable declaration since its not needed in this case (now allocated on the stack of the function in question). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Cc: Andrew Morton <akpm@osdl.org>
\| * \|	[GFS2] Don't flush everything on fdatasync	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The gfs2_fsync() function was doing a journal flush on each and every call. While this is correct, its also a lot of overhead. This patch means that on fdatasync flushes we rely on the VFS to flush the data for us and we don't do a journal flush unless we really need to. We have to do a journal flush for stuffed files though because they have the data and the inode metadata in the same block. Journaled files also need a journal flush too of course. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Add a comment about reading the super block	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The comment explains why we use the bio functions to read the super block. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Srinivasa Ds <srinivasa@in.ibm.com>
\| * \|	[GFS2] Mount problem with the GFS2 code	Srinivasa Ds	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While mounting the gfs2 filesystem,our test team had a problem and we got this error message. ======================================================= GFS2: fsid=: Trying to join cluster "lock_nolock", "dasde1" GFS2: fsid=dasde1.0: Joined cluster. Now mounting FS... GFS2: not a GFS2 filesystem GFS2: fsid=dasde1.0: can't read superblock: -22 ========================================================================== On debugging further we found that problem is while reading the super block(gfs2_read_super) and comparing the magic number in it. When I replace the submit_bio() call(present in gfs2_read_super) with the sb_getblk() and ll_rw_block(), mount operation succeded. On further analysis we found that before calling submit_bio(), bio->bi_sector was set to "sector" variable. This "sector" variable has the same value of bh->b_blocknr(block number). Hence there is a need to multiply this valuwith (blocksize >> 9)(9 because,sector size 2^9,samething happens in ll_rw_block also, before calling submit_bio()). So I have developed the patch which solves this problem. Please let me know your comments. ================================================================ Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Remove gfs2_check_acl()	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As pointed out by Adrian Bunk, the gfs2_check_acl() function is no longer used. This patch removes it and renamed gfs2_check_acl_locked() to gfs2_check_acl() since we only need one variant of that function now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Adrian Bunk <bunk@stusta.de>
\| * \|	[DLM] fix format warnings in rcom.c and recoverd.c	Ryusuke Konishi	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes the following gcc warnings generated on the architectures where uint64_t != unsigned long long (e.g. ppc64). fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t' fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t' fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] lock function parameter	Randy Dunlap	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix function parameter typing: fs/gfs2/glock.c:100: warning: function declaration isn't a prototype Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[DLM] don't accept replies to old recovery messages	David Teigland	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We often abort a recovery after sending a status request to a remote node. We want to ignore any potential status reply we get from the remote node. If we get one of these unwanted replies, we've often moved on to the next recovery message and incremented the message sequence counter, so the reply will be ignored due to the seq number. In some cases, we've not moved on to the next message so the seq number of the reply we want to ignore is still correct, causing the reply to be accepted. The next recovery message will then mistake this old reply as a new one. To fix this, we add the flag RCOM_WAIT to indicate when we can accept a new reply. We clear this flag if we abort recovery while waiting for a reply. Before the flag is set again (to allow new replies) we know that any old replies will be rejected due to their sequence number. We also initialize the recovery-message sequence number to a random value when a lockspace is first created. This makes it clear when messages are being rejected from an old instance of a lockspace that has since been recreated. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[DLM] fix size of STATUS_REPLY message	David Teigland	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the not_ready routine sends a "fake" status reply with blank status flags, it needs to use the correct size for a normal STATUS_REPLY by including the size of the would-be config parameters. We also fill in the non-existant config parameters with an invalid lvblen value so it's easier to notice if these invalid paratmers are ever being used. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning	Ryusuke Konishi	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a printk format warning in fs/gfs2/log.c: fs/gfs2/log.c:322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t' Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[DLM] fix add_requestqueue checking nodes list	David Teigland	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Requests that arrive after recovery has started are saved in the requestqueue and processed after recovery is done. Some of these requests are purged during recovery if they are from nodes that have been removed. We move the purging of the requests (dlm_purge_requestqueue) to later in the recovery sequence which allows the routine saving requests (dlm_add_requestqueue) to avoid filtering out requests by nodeid since the same will be done by the purge. The current code has add_requestqueue filtering by nodeid but doesn't hold any locks when accessing the list of current nodes. This also means that we need to call the purge routine when the lockspace is being shut down since the add routine will not be rejecting requests itself any more. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Fix recursive locking in gfs2_getattr	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The readdirplus NFS operation can result in gfs2_getattr being called with the glock already held. In this case we do not want to try and grab the lock again. This fixes Red Hat bugzilla #215727 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Fix recursive locking in gfs2_permission	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since gfs2_permission may be called either from the VFS (in which case we need to obtain a shared glock) or from GFS2 (in which case we already have a glock) we need to test to see whether or not a lock is required. The original test was buggy due to a potential race. This one should be safe. This fixes Red Hat bugzilla #217129 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Reduce number of arguments to meta_io.c:getbuf()	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the superblock and the address_space are determined by the glock, we might as well just pass that as the argument since all the callers already have that available. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Move gfs2_meta_syncfs() into log.c	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start() can be made static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Fix journal flush problem	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a bug which resulted in poor performance due to flushing the journal too often. The code path in question was via the inode_go_sync() function in glops.c. The solution is not to flush the journal immediately when inodes are ejected from memory, but batch up the work for glockd to deal with later on. This means that glocks may now live on beyond the end of the lifetime of their inodes (but not very much longer in the normal case). Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in calculation of the number of free journal blocks. The gfs2_logd process has been altered to be more responsive to the journal filling up. We now wake it up when the number of uncommitted journal blocks has reached the threshold level rather than trying to flush directly at the end of each transaction. This again means doing fewer, but larger, log flushes in general. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] mark_inode_dirty after write to stuffed file	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Writes to stuffed files were not being marked dirty correctly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	[GFS2] Fix glock ordering on inode creation	Steven Whitehouse	2006-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The lock order here should be parent -> child rather than numeric order. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>