aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge branch 'release' of ↵Linus Torvalds2006-12-07
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] replace kmalloc+memset with kzalloc [IA64] resolve name clash by renaming is_available_memory() [IA64] Need export for csum_ipv6_magic [IA64] Fix DISCONTIGMEM without VIRTUAL_MEM_MAP [PATCH] Add support for type argument in PAL_GET_PSTATE [IA64] tidy up return value of ip_fast_csum [IA64] implement csum_ipv6_magic for ia64. [IA64] More Itanium PAL spec updates [IA64] Update processor_info features [IA64] Add se bit to Processor State Parameter structure [IA64] Add dp bit to cache and bus check structs [IA64] SN: Correctly update smp_affinty mask [IA64] sparse cleanups [IA64] IA64 Kexec/kdump
| * [IA64] replace kmalloc+memset with kzallocYan Burman2006-12-07
| | | | | | | | | | | | | | | | Replace kmalloc+memset with kzalloc Signed-off-by: Yan Burman <burman.yan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] resolve name clash by renaming is_available_memory()Christoph Lameter2006-12-07
| | | | | | | | | | | | | | | | | | There is a name clash with ia64 arch code in Andrew's tree. Rename is_avialable_memory() to is_memory_available() to avoid the clash. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] Need export for csum_ipv6_magicTony Luck2006-12-07
| | | | | | | | | | | | | | | | Now we have our own highly optimized assembly code version of this routine (Thanks Ken!) we should export it so that it can be used. Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] Fix DISCONTIGMEM without VIRTUAL_MEM_MAPMatthew Wilcox2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | make allnoconfig currently fails to build because it selects DISCONTIGMEM without VIRTUAL_MEM_MAP. I see no particular reason this combination ought to fail, so I fixed it by: - Including memory_model.h in all circumstances, except when both DISCONTIGMEM and VIRTUAL_MEM_MAP are enabled. - Defining ia64_pfn_valid() to 1 unless VIRTUAL_MEM_MAP is enabled Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [PATCH] Add support for type argument in PAL_GET_PSTATEVenkatesh Pallipadi2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PAL_GET_PSTATE accepts a type argument to return different kinds of frequency information. Refer: Intel Itanium®Architecture Software Developer's Manual - Volume 2: System Architecture, Revision 2.2 (http://developer.intel.com/design/itanium/manuals/245318.htm) Add the support for type argument and use Instantaneous frequency in the acpi driver. Also fix a bug, where in return value of PAL_GET_PSTATE was getting compared with 'control' bits instead of 'status' bits. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] tidy up return value of ip_fast_csumChen, Kenneth W2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | While working on implementing csum_ipv6_magic, I noticed that current version of ip_fast_csum will potentially return bits above "unsigned short" as 1. While no harm is done right now because all call sites will chop off the upper bits when it uses the return value. However, this is still dangerous and buggy. Here is a patch to enforce that the function really returns unsigned short in the native register format. The fix is free as there are plenty open slot to add one more asm instruction. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] implement csum_ipv6_magic for ia64.Chen, Kenneth W2006-12-07
| | | | | | | | | | | | | | | | The asm version is 4.4 times faster than the generic C version and 10X smaller in code size. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] More Itanium PAL spec updatesRuss Anderson2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Additional updates to conform with Rev 2.2 of Volume 2 of "Intel Itanium Architecture Software Developer's Manual" (January 2006). Add pal_bus_features_s bits 52 & 53 (page 2:347) Add pal_vm_info_2_s field max_purges (page 2:2:451) Add PAL_GET_HW_POLICY call (page 2:381) Add PAL_SET_HW_POLICY call (page 2:439) Sample output before: --------------------------------------------------------------------- cobra:~ # cat /proc/pal/cpu0/vm_info Physical Address Space : 50 bits Virtual Address Space : 61 bits Protection Key Registers(PKR) : 16 Implemented bits in PKR.key : 24 Hash Tag ID : 0x2 Size of RR.rid : 24 Supported memory attributes : WB, UC, UCE, WC, NaTPage --------------------------------------------------------------------- Sample output after: --------------------------------------------------------------------- cobra:~ # cat /proc/pal/cpu0/vm_info Physical Address Space : 50 bits Virtual Address Space : 61 bits Protection Key Registers(PKR) : 16 Implemented bits in PKR.key : 24 Hash Tag ID : 0x2 Max Purges : 1 Size of RR.rid : 24 Supported memory attributes : WB, UC, UCE, WC, NaTPage --------------------------------------------------------------------- Signed-off-by: Russ Anderson (rja@sgi.com) Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] Update processor_info featuresRuss Anderson2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the printing of additional processor features to proc_features. Based on Rev 2.2 of Volume 2 of "Intel Itanium Architecture Software Developer's Manual" (January 2006) fields (pages 2:430-2:432). This patch gets the features back in sync with the spec. Sample output before: -------------------------------------------------------------- cobra:~ # cat /proc/pal/cpu0/processor_info XIP,XPSR,XFS implemented : On NoCtrl XR1-XR3 implemented : On NoCtrl Disable dynamic predicate prediction : NotImpl Disable processor physical number : NotImpl Disable dynamic data cache prefetch : NotImpl Disable dynamic inst cache prefetch : NotImpl Disable dynamic branch prediction : NotImpl Disable BINIT on processor time-out : On Ctrl Disable dynamic power management (DPM) : NotImpl Disable coherency : NotImpl Disable cache : NotImpl Enable CMCI promotion : Off Ctrl Enable MCA to BINIT promotion : Off Ctrl Enable MCA promotion : NotImpl Enable BERR promotion : NotImpl cobra:~ # -------------------------------------------------------------- Sample output after: -------------------------------------------------------------- cobra:~ # cat /proc/pal/cpu0/processor_info Unimplemented instruction address fault : NotImpl INIT, PMI, and LINT pins : NotImpl Simple unimplimented instr addresses : On NoCtrl Variable P-state performance : NotImpl Virtual machine features implemeted : On NoCtrl XIP,XPSR,XFS implemented : On NoCtrl XR1-XR3 implemented : On NoCtrl Disable dynamic predicate prediction : NotImpl Disable processor physical number : NotImpl Disable dynamic data cache prefetch : NotImpl Disable dynamic inst cache prefetch : NotImpl Disable dynamic branch prediction : NotImpl Disable P-states : Off Ctrl Enable MCA on Data Poisoning : Off Ctrl Enable vmsw instruction : On Ctrl Enable extern environmental notification : NotImpl Disable BINIT on processor time-out : On Ctrl Disable dynamic power management (DPM) : NotImpl Disable coherency : NotImpl Disable cache : NotImpl Enable CMCI promotion : Off Ctrl Enable MCA to BINIT promotion : Off Ctrl Enable MCA promotion : NotImpl Enable BERR promotion : NotImpl cobra:~ # -------------------------------------------------------------- Signed-off-by: Russ Anderson (rja@sgi.com) Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] Add se bit to Processor State Parameter structureRuss Anderson2006-12-07
| | | | | | | | | | | | | | | | | | | | Rev 2.2 of Volume 2 of "Intel Itanium Architecture Software Developer's Manual" (January 2006) adds a se bit to the Processor State Parameter fields (pages 2:299). This patch gets the structs back in sync with the spec. Signed-off-by: Russ Anderson (rja@sgi.com) Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] Add dp bit to cache and bus check structsRuss Anderson2006-12-07
| | | | | | | | | | | | | | | | | | | | Rev 2.2 of Volume 2 of "Intel Itanium Architecture Software Developer's Manual" (January 2006) adds a dp bit to the cache_check and bus_check fields (pages 2:401-2:404). This patch gets the structs back in sync with the spec. Signed-off-by: Russ Anderson (rja@sgi.com) Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] SN: Correctly update smp_affinty maskJohn Keller2006-12-07
| | | | | | | | | | | | | | | | | | On Altix systems, the /proc/irq/nn/smp_affinity mask is not being setup at device iniitalization, or updated after an interrupt redirection. This patch resolves those issues. Signed-off-by: John Keller <jpk@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] sparse cleanupsMatthew Wilcox2006-12-07
| | | | | | | | | | | | | | 0/NULL confusion and some missing UL on constants. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * [IA64] IA64 Kexec/kdumpZou Nan hai2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes and updates. 1. Remove fake rendz path and related code according to discuss with Khalid Aziz. 2. fc.i offset fix in relocate_kernel.S. 3. iospic shutdown code eoi and mask race fix from Fujitsu. 4. Warm boot hook in machine_kexec to SN SAL code from Jack Steiner. 5. Send slave to SAL slave loop patch from Jay Lan. 6. Kdump on non-recoverable MCA event patch from Jay Lan 7. Use CTL_UNNUMBERED in kdump_on_init sysctl. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | Merge branch 'intx' of ↵Linus Torvalds2006-12-07
|\ \ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'intx' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: PCI MSI: always toggle legacy-INTx-enable bit upon MSI entry/exit
| * | PCI MSI: always toggle legacy-INTx-enable bit upon MSI entry/exitJeff Garzik2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | The current code (prior to this change) would disable the PCI INTx legacy interrupt when enabling MSI... but only on PCI Express. We should do this for all MSI devices, for safety's sake. Signed-off-by: Jeff Garzik <jeff@garzik.org>
* | | Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ into merge_linusTrond Myklebust2006-12-07
|\ \ \
| * | | Add "run_scheduled_work()" workqueue functionLinus Torvalds2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows workqueue users to run just their own pending work, rather than wait for the whole workqueue to finish running. This solves the deadlock with networking libphy that was due to other workqueue entries possibly needing a lock that was held by the routine that wanted to flush its own work. It's not wonderful: if you absolutely need to synchronize with the work function having been executed, any user strictly speaking should have its own completion tracking logic, since when we run things explicitly by hand, the generic workqueue layer can no longer help us synchronize. Also, this is strictly only usable for work that has been scheduled without any delayed timers. You can not mix the new interface with schedule_delayed_work(). But it's better than what we had currently. Acked-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | | Merge branch 'upstream-linus' of ↵Linus Torvalds2006-12-07
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] libata: Incorrect timing computation for PIO5/6 [PATCH] sata_promise: new EH conversion, take 2 [PATCH] libata: let ATA_FLAG_PIO_POLLING use polling pio for ATA_PROT_NODATA [PATCH] sata_promise: cleanups, take 2
| | * | | [PATCH] libata: Incorrect timing computation for PIO5/6Alan2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ata timing computation code makes some mistakes in PIO5/6 because a check was not updated correctly when I put this support into the kernel. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| | * | | [PATCH] sata_promise: new EH conversion, take 2Mikael Pettersson2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts sata_promise to use new-style libata error handling on Promise SATA chips, for both SATA and PATA ports. * ATA_FLAG_SRST is no longer set * ->phy_reset is no longer set as it is unused when ->error_handler is present, and pdc_sata_phy_reset() has been removed * pdc_freeze() masks interrupts and halts DMA via PDC_CTLSTAT * pdc_thaw() clears interrupt status in PDC_INT_SEQMASK and then unmasks interrupts in PDC_CTLSTAT * pdc_error_handler() reinitialises the port if it isn't frozen, and then invokes ata_do_eh() with standard {s,}ata reset methods * pdc_post_internal_cmd() resets the port in case of errors * the PATA-only 20619 chip continues to use old-style EH: not by necessity but simply because I don't have documentation for it or any way to test it Since the previous version pdc_error_handler() has been rewritten and it now mostly matches ahci and sata_sil24. In case anyone wonders: the call to pdc_reset_port() isn't a heavy-duty reset, it's a light-weight reset to quickly put a port into a sane state. The discussion about the PCI flushes in pdc_freeze() and pdc_thaw() seemed to end with a consensus that the flushes are OK and not obviously redundant, so I decided to keep them for now. This patch was prepared against 2.6.19-git7, but it also applies to 2.6.19 + libata #upstream, with or without the revised sata_promise cleanup patch I recently submitted. This patch does conflict with the #promise-sata-pata patch: this patch removes pdc_sata_phy_reset() while #promise-sata-pata modifies it. The correct patch resolution is to remove the function. Tested on 2037x and 2057x chips, with PATA patches on top and disks on both SATA and PATA ports. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| | * | | [PATCH] libata: let ATA_FLAG_PIO_POLLING use polling pio for ATA_PROT_NODATAAlbert Lee2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even if ATA_FLAG_PIO_POLLING is set, libata uses irq pio for the ATA_PROT_NODATA protocol. This patch let ATA_FLAG_PIO_POLLING use polling pio for the ATA_PROT_NODATA protocol. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| | * | | [PATCH] sata_promise: cleanups, take 2Mikael Pettersson2006-12-07
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch performs two simple cleanups of sata_promise. * Remove board_20771 and map device id 0x3577 to board_2057x. After the recent corrections for SATAII chips, board_20771 and board_2057x were equivalent in the driver. * Remove hp->hotplug_offset and use hp->flags & PDC_FLAG_GEN_II to compute hotplug_offset in pdc_host_init(). hp->hotplug_offset was used to distinguish 1st and 2nd generation chips in one particular case, but now we have that information in a more general form in hp->flags, so hp->hotplug_offset is redundant. Changes since previous submission: rebased on libata-dev #upstream, cleaned up hotplug_offset computation based on Tejun's comments, expanded hotplug_offset removal rationale. This patch does not depend on the pending new EH conversion patch. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * | | Merge master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2006-12-07
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (73 commits) [DLM] Clean up lowcomms [GFS2] Change gfs2_fsync() to use write_inode_now() [GFS2] Fix indent in recovery.c [GFS2] Don't flush everything on fdatasync [GFS2] Add a comment about reading the super block [GFS2] Mount problem with the GFS2 code [GFS2] Remove gfs2_check_acl() [DLM] fix format warnings in rcom.c and recoverd.c [GFS2] lock function parameter [DLM] don't accept replies to old recovery messages [DLM] fix size of STATUS_REPLY message [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning [DLM] fix add_requestqueue checking nodes list [GFS2] Fix recursive locking in gfs2_getattr [GFS2] Fix recursive locking in gfs2_permission [GFS2] Reduce number of arguments to meta_io.c:getbuf() [GFS2] Move gfs2_meta_syncfs() into log.c [GFS2] Fix journal flush problem [GFS2] mark_inode_dirty after write to stuffed file [GFS2] Fix glock ordering on inode creation ...
| | * | | [DLM] Clean up lowcommsPatrick Caulfield2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes up most of the things pointed out by akpm and Pavel Machek with comments below indicating why some things have been left: Andrew Morton wrote: > >> +static struct nodeinfo *nodeid2nodeinfo(int nodeid, gfp_t alloc) >> +{ >> + struct nodeinfo *ni; >> + int r; >> + int n; >> + >> + down_read(&nodeinfo_lock); > > Given that this function can sleep, I wonder if `alloc' is useful. > > I see lots of callers passing in a literal "0" for `alloc'. That's in fact > a secret (GFP_ATOMIC & ~__GFP_HIGH). I doubt if that's what you really > meant. Particularly as the code could at least have used __GFP_WAIT (aka > GFP_NOIO) which is much, much more reliable than "0". In fact "0" is the > least reliable mode possible. > > IOW, this is all bollixed up. When 0 is passed into nodeid2nodeinfo the function does not try to allocate a new structure at all. it's an indication that the caller only wants the nodeinfo struct for that nodeid if there actually is one in existance. I've tidied the function itself so it's more obvious, (and tidier!) >> +/* Data received from remote end */ >> +static int receive_from_sock(void) >> +{ >> + int ret = 0; >> + struct msghdr msg; >> + struct kvec iov[2]; >> + unsigned len; >> + int r; >> + struct sctp_sndrcvinfo *sinfo; >> + struct cmsghdr *cmsg; >> + struct nodeinfo *ni; >> + >> + /* These two are marginally too big for stack allocation, but this >> + * function is (currently) only called by dlm_recvd so static should be >> + * OK. >> + */ >> + static struct sockaddr_storage msgname; >> + static char incmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > whoa. This is globally singly-threaded code?? Yes. it is only ever run in the context of dlm_recvd. >> >> +static void initiate_association(int nodeid) >> +{ >> + struct sockaddr_storage rem_addr; >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > Another static buffer to worry about. Globally singly-threaded code? Yes. Only ever called by dlm_sendd. >> + >> +/* Send a message */ >> +static int send_to_sock(struct nodeinfo *ni) >> +{ >> + int ret = 0; >> + struct writequeue_entry *e; >> + int len, offset; >> + struct msghdr outmsg; >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > Singly-threaded? Yep. >> >> +static void dealloc_nodeinfo(void) >> +{ >> + int i; >> + >> + for (i=1; i<=max_nodeid; i++) { >> + struct nodeinfo *ni = nodeid2nodeinfo(i, 0); >> + if (ni) { >> + idr_remove(&nodeinfo_idr, i); > > Didn't that need locking? Not. it's only ever called at DLM shutdown after all the other threads have been stopped. >> >> +static int write_list_empty(void) >> +{ >> + int status; >> + >> + spin_lock_bh(&write_nodes_lock); >> + status = list_empty(&write_nodes); >> + spin_unlock_bh(&write_nodes_lock); >> + >> + return status; >> +} > > This function's return value is meaningless. As soon as the lock gets > dropped, the return value can get out of sync with reality. > > Looking at the caller, this _might_ happen to be OK, but it's a nasty and > dangerous thing. Really the locking should be moved into the caller. It's just an optimisation to allow the caller to schedule if there is no work to do. if something arrives immediately afterwards then it will get picked up when the process re-awakes (and it will be woken by that arrival). The 'accepting' atomic has gone completely. as Andrew pointed out it didn't really achieve much anyway. I suspect it was a plaster over some other startup or shutdown bug to be honest. Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Pavel Machek <pavel@ucw.cz>
| | * | | [GFS2] Change gfs2_fsync() to use write_inode_now()Steven Whitehouse2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a bit better than the previous version of gfs2_fsync() although it would be better still if we were able to call a function which only wrote the inode & metadata. Its no big deal though that this will potentially write the data as well since the VFS has already done that before calling gfs2_fsync(). I've also added a comment to explain whats going on here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org>
| | * | | [GFS2] Fix indent in recovery.cSteven Whitehouse2006-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As per comments from Andrew Morton and Jan Engelhardt, this fixes the indent and removes the "static" from a variable declaration since its not needed in this case (now allocated on the stack of the function in question). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Cc: Andrew Morton <akpm@osdl.org>
| | * | | [GFS2] Don't flush everything on fdatasyncSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gfs2_fsync() function was doing a journal flush on each and every call. While this is correct, its also a lot of overhead. This patch means that on fdatasync flushes we rely on the VFS to flush the data for us and we don't do a journal flush unless we really need to. We have to do a journal flush for stuffed files though because they have the data and the inode metadata in the same block. Journaled files also need a journal flush too of course. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Add a comment about reading the super blockSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The comment explains why we use the bio functions to read the super block. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Srinivasa Ds <srinivasa@in.ibm.com>
| | * | | [GFS2] Mount problem with the GFS2 codeSrinivasa Ds2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While mounting the gfs2 filesystem,our test team had a problem and we got this error message. ======================================================= GFS2: fsid=: Trying to join cluster "lock_nolock", "dasde1" GFS2: fsid=dasde1.0: Joined cluster. Now mounting FS... GFS2: not a GFS2 filesystem GFS2: fsid=dasde1.0: can't read superblock: -22 ========================================================================== On debugging further we found that problem is while reading the super block(gfs2_read_super) and comparing the magic number in it. When I replace the submit_bio() call(present in gfs2_read_super) with the sb_getblk() and ll_rw_block(), mount operation succeded. On further analysis we found that before calling submit_bio(), bio->bi_sector was set to "sector" variable. This "sector" variable has the same value of bh->b_blocknr(block number). Hence there is a need to multiply this valuwith (blocksize >> 9)(9 because,sector size 2^9,samething happens in ll_rw_block also, before calling submit_bio()). So I have developed the patch which solves this problem. Please let me know your comments. ================================================================ Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Remove gfs2_check_acl()Steven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As pointed out by Adrian Bunk, the gfs2_check_acl() function is no longer used. This patch removes it and renamed gfs2_check_acl_locked() to gfs2_check_acl() since we only need one variant of that function now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Adrian Bunk <bunk@stusta.de>
| | * | | [DLM] fix format warnings in rcom.c and recoverd.cRyusuke Konishi2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the following gcc warnings generated on the architectures where uint64_t != unsigned long long (e.g. ppc64). fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t' fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t' fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] lock function parameterRandy Dunlap2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix function parameter typing: fs/gfs2/glock.c:100: warning: function declaration isn't a prototype Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [DLM] don't accept replies to old recovery messagesDavid Teigland2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We often abort a recovery after sending a status request to a remote node. We want to ignore any potential status reply we get from the remote node. If we get one of these unwanted replies, we've often moved on to the next recovery message and incremented the message sequence counter, so the reply will be ignored due to the seq number. In some cases, we've not moved on to the next message so the seq number of the reply we want to ignore is still correct, causing the reply to be accepted. The next recovery message will then mistake this old reply as a new one. To fix this, we add the flag RCOM_WAIT to indicate when we can accept a new reply. We clear this flag if we abort recovery while waiting for a reply. Before the flag is set again (to allow new replies) we know that any old replies will be rejected due to their sequence number. We also initialize the recovery-message sequence number to a random value when a lockspace is first created. This makes it clear when messages are being rejected from an old instance of a lockspace that has since been recreated. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [DLM] fix size of STATUS_REPLY messageDavid Teigland2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the not_ready routine sends a "fake" status reply with blank status flags, it needs to use the correct size for a normal STATUS_REPLY by including the size of the would-be config parameters. We also fill in the non-existant config parameters with an invalid lvblen value so it's easier to notice if these invalid paratmers are ever being used. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warningRyusuke Konishi2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a printk format warning in fs/gfs2/log.c: fs/gfs2/log.c:322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t' Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [DLM] fix add_requestqueue checking nodes listDavid Teigland2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Requests that arrive after recovery has started are saved in the requestqueue and processed after recovery is done. Some of these requests are purged during recovery if they are from nodes that have been removed. We move the purging of the requests (dlm_purge_requestqueue) to later in the recovery sequence which allows the routine saving requests (dlm_add_requestqueue) to avoid filtering out requests by nodeid since the same will be done by the purge. The current code has add_requestqueue filtering by nodeid but doesn't hold any locks when accessing the list of current nodes. This also means that we need to call the purge routine when the lockspace is being shut down since the add routine will not be rejecting requests itself any more. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Fix recursive locking in gfs2_getattrSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The readdirplus NFS operation can result in gfs2_getattr being called with the glock already held. In this case we do not want to try and grab the lock again. This fixes Red Hat bugzilla #215727 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Fix recursive locking in gfs2_permissionSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since gfs2_permission may be called either from the VFS (in which case we need to obtain a shared glock) or from GFS2 (in which case we already have a glock) we need to test to see whether or not a lock is required. The original test was buggy due to a potential race. This one should be safe. This fixes Red Hat bugzilla #217129 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Reduce number of arguments to meta_io.c:getbuf()Steven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the superblock and the address_space are determined by the glock, we might as well just pass that as the argument since all the callers already have that available. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Move gfs2_meta_syncfs() into log.cSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start() can be made static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Fix journal flush problemSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a bug which resulted in poor performance due to flushing the journal too often. The code path in question was via the inode_go_sync() function in glops.c. The solution is not to flush the journal immediately when inodes are ejected from memory, but batch up the work for glockd to deal with later on. This means that glocks may now live on beyond the end of the lifetime of their inodes (but not very much longer in the normal case). Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in calculation of the number of free journal blocks. The gfs2_logd process has been altered to be more responsive to the journal filling up. We now wake it up when the number of uncommitted journal blocks has reached the threshold level rather than trying to flush directly at the end of each transaction. This again means doing fewer, but larger, log flushes in general. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] mark_inode_dirty after write to stuffed fileSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Writes to stuffed files were not being marked dirty correctly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Fix glock ordering on inode creationSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lock order here should be parent -> child rather than numeric order. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Simplify glops functionsSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The go_sync callback took two flags, but one of them was set on every call, so this patch removes once of the flags and makes the previously conditional operations (on this flag), unconditional. The go_inval callback took three flags, each of which was set on every call to it. This patch removes the flags and makes the operations unconditional, which makes the logic rather more obvious. Two now unused flags are also removed from incore.h. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Fix Kconfig wrt CRC32Steven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GFS2 requires the CRC32 library function. This was reported by Toralf Förster. Cc: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Make sentinel dirents compatible with gfs1Steven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When deleting directory entries, we set the inum.no_addr to zero in a dirent when its the first dirent in a block and thus cannot be merged into the previous dirent as is the usual case. In gfs1, inum.no_formal_ino was used instead. This patch changes gfs2 to set both inum.no_addr and inum.no_formal_ino to zero. It also changes the test from just looking at inum.no_addr to look at both inum.no_addr and inum.no_formal_ino and a sentinel is now considered to be a dirent in which _either_ (or both) of them is set to zero. This resolves Red Hat bugzillas: #215809, #211465 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Remove unused function from inode.cSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gfs2_glock_nq_m_atime function is unused in so far as its only ever called with num_gh = 1, and this falls through to the gfs2_glock_nq_atime function, so we might as well call that directly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| | * | | [GFS2] Remove unused sysfs filesSteven Whitehouse2006-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Four of the sysfs files are unused and can therefore be removed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>