aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ocfs2/dlm/dlmrecovery.c
Commit message (Collapse)AuthorAge
* Use helpers to obtain task pid in printksPavel Emelyanov2007-10-19
| | | | | | | | | | | | | | | | The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in ↵Shani Moideen2007-07-10
| | | | | | | | | | fs/ocfs2/dlm/dlmrecovery.c Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c Signed-off-by: Shani Moideen <shani.moideen@wipro.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: use list_for_each_entry where beneficalChristoph Hellwig2007-07-10
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix sparse warnings in fs/ocfs2/dlmMark Fasheh2007-05-02
| | | | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: fix race in dlm_remaster_locksSrinivas Eeda2007-04-26
| | | | | | | | | | | There is a possibility that dlm_remaster_locks could overwride node->state with DLM_RECO_NODE_DATA_REQUESTED after dlm_reco_data_done_handler sets the node->state to DLM_RECO_NODE_DATA_DONE. This could lead to recovery getting stuck and requires a cluster reboot. Synchronize with dlm_reco_state_lock spinlock. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Added post handler callable function in o2net message handlerKurt Hackel2007-02-07
| | | | | | | | | | | | | Currently o2net allows one handler function per message type. This patch adds the ability to call another function to be called after the handler has returned the message to the other node. Handlers are now given the option of returning a context (in the form of a void **) which will be passed back into the post message handler function. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Cookies in locks not being printed correctly in error messagesKurt Hackel2007-02-07
| | | | | | | | | | | | The dlm encodes the node number and a sequence number in the lock cookie. It also stores the cookie in the lockres in the big endian format to avoid swapping 8 bytes on each lock request. The bug here was that it was assuming the cookie to be in the cpu format when decoding it for printing the error message. This patch swaps the bytes before the print. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: wake up sleepers on the lockres waitqueueKurt Hackel2007-02-07
| | | | | | | | | | The dlm was not waking up threads waiting on the lockres wait queue, waiting for the lockres to be no longer be in the DLM_LOCK_RES_IN_PROGRESS and the DLM_LOCK_RES_MIGRATING states. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Dlm dispatch was stopping too earlyKurt Hackel2007-02-07
| | | | | | | | | | dlm_dispatch_work was not processing the queued up tasks at the first sign of the node leaving the domain leading to not only incompleted tasks but also a mismatch in the dlm refcnt. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Drop inflight refmap even if no locks found on the lockresKurt Hackel2007-02-07
| | | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Fix migrate lockres handler queue scanningKurt Hackel2007-02-07
| | | | | | | | | | | | The migrate lockres handler was only searching for its lock on migrated lockres on the expected queue. This could be problematic as the new master could have also issued a convert request during the migration and thus moved the lock to the convert queue. We now search for the lock on all three queues. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Make dlmunlock() wait for migration to completeKurt Hackel2007-02-07
| | | | | | | | | dlmunlock() was not waiting for migration to complete before releasing locks on locally mastered locks. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: fix cluster-wide refcounting of lock resourcesKurt Hackel2007-02-07
| | | | | | | | | | | This was previously broken and migration of some locks had to be temporarily disabled. We use a new (and backward-incompatible) set of network messages to account for all references to a lock resources held across the cluster. once these are all freed, the master node may then free the lock resource memory once its local references are dropped. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] Fix numerous kcalloc() calls, convert to kzalloc()Robert P. J. Day2006-12-13
| | | | | | | | | | | | | | | | | | | All kcalloc() calls of the form "kcalloc(1,...)" are converted to the equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect ordering of the first two arguments are fixed. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Adam Belay <ambx1@neo.rr.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* WorkStruct: make allyesconfigDavid Howells2006-11-22
| | | | | | Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>
* ocfs2: Allow binary names in the DLMMark Fasheh2006-09-24
| | | | | | | | The OCFS2 DLM uses strlen() to determine lock name length, which excludes the possibility of putting binary values in the name string. Fix this by requiring that string length be passed in as a parameter. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs/ocfs2/dlm/dlmrecovery.c: make dlm_lockres_master_requery() staticAdrian Bunk2006-06-29
| | | | | | | dlm_lockres_master_requery() became global without any external usage. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] spin/rwlock init cleanupsIngo Molnar2006-06-27
| | | | | | | | | | | | | | | | | | | | | locking init cleanups: - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK() - convert rwlocks in a similar manner this patch was generated automatically. Motivation: - cleanliness - lockdep needs control of lock initialization, which the open-coded variants do not give - it's also useful for -rt and for lock debugging in general Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fs/ocfs2/dlm/: cleanupsAdrian Bunk2006-06-26
| | | | | | | | | | | | This patch #if 0's the no longer used dlm_dump_lock_resources(). Since this makes dlmdebug.h empty, this patch also removes this header. Additionally, the needlessly global dlm_is_node_recovered() is made static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: move dlm work to a private work queueKurt Hackel2006-06-26
| | | | | | | | The work that is done can block for long periods of time and so is not appropriate for keventd. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: tune down some noisy messages during dlm recoveryKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: display message before waiting for recovery to completeKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: use GFP_NOFS in some dlm operationsKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: wait for recovery when starting lock masteryKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: continue recovery when a dead node is encounteredKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: remove unneccesary spin_unlock() in dlm_remaster_locks()Kurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: dlm_remaster_locks() should never exit without completingKurt Hackel2006-06-26
| | | | | | | | | We cannot restart recovery. Once we begin to recover a node, keep the state of the recovery intact and follow through, regardless of any other node deaths that may occur. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: special case recovery lock in dlmlock_remote()Kurt Hackel2006-06-26
| | | | | | | | | | If the previous master of the recovery lock dies, let calc_usage take it down completely and let the caller completely redo the dlmlock() call. Otherwise, there will never be an opportunity to re-master the lockres and recovery wont be able to progress. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: update lvb immediately during recoveryKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: dump mismatching migrated lvbs before BUG()Kurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: make dlm recovery finalization 2 stageKurt Hackel2006-06-26
| | | | | | | Makes it easier for the recovery process to deal with node death. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: dlm recovery / lockres reference count fixKurt Hackel2006-06-26
| | | | | | | Take a reference on lockres structures while they are on the recovery list. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: clean up recovery related messagesKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: handle network errors during recoveryKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: only recover one dead node at a timeKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Better tracking for recovery state changesKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix empty lvb checkKurt Hackel2006-06-26
| | | | | | | | The check for an empty lvb should check the entire buffer not just the first byte. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix inverted logic in dlm_is_node_deadKurt Hackel2006-06-26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: allocate lockres hash pages in an arrayDaniel Phillips2006-06-26
| | | | | | | | This allows us to have a hash table greater than a single page which greatly improves dlm performance on some tests. Signed-off-by: Daniel Phillips <phillips@google.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: calculate lockid hash values outside of the spinlockMark Fasheh2006-06-26
| | | | | | Fixes a performance bug - pointed out by Andrew. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs: use list_move()Akinobu Mita2006-06-26
| | | | | | | | | | | | | | | | This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B) under fs/. Cc: Ian Kent <raven@themaw.net> Acked-by: Joel Becker <joel.becker@oracle.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Hans Reiser <reiserfs-dev@namesys.com> Cc: Urban Widmark <urban@teststation.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: don't use MLF* in dlm/ filesKurt Hackel2006-03-24
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: dlm recovery fixesKurt Hackel2006-03-24
| | | | | | | | | | when starting lock mastery (excepting the recovery lock) wait on any nodes needing recovery. fix one instance where lock resources were left attached to the recovery list after recovery completed. ensure that the node_down code is run uniformly regardless of which node found the dead node first. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: use hlists for lockres hashMark Fasheh2006-03-01
| | | | | | | Switch from list_head to hlist_head. Make the size of the hash dependent upon the allocated area, rather than a constant. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: add dlm_wait_for_node_deathKurt Hackel2006-02-16
| | | | | | | | | | * add dlm_wait_for_node_death function to be used after receiving a network error. this will wait for the given timeout to allow the heartbeat callbacks to update the domain map. without this, some paths may spin and consume enough cpu that the heartbeat gets starved and never updates. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: recheck recovery state after getting lockKurt Hackel2006-02-16
| | | | | | | | * after successfully taking the $RECOVERY lock in EX mode, recheck to make sure that recovery has not already begun or completed on another node Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs/ocfs2/dlm/dlmrecovery.c must #include <linux/delay.h>Adrian Bunk2006-02-03
| | | | | | | | | fs/ocfs2/dlm/dlmrecovery.c does now use msleep(), and does therefore need to #include <linux/delay.h> for getting the prototype of this function. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2/dlm: fixesKurt Hackel2006-02-03
| | | | | | | | | | | | | | | | | | | | | * fix a hang which can occur during shutdown migration * do not allow nodes to join during recovery * when restarting lock mastery, do not ignore nodes which come up * more than one node could become recovery master, fix this * sleep to allow some time for heartbeat state to catch up to network * extra debug info for bad recovery state problems * make DLM_RECO_NODE_DATA_DONE a valid state for non-master recovery nodes * prune all locks from dead nodes on $RECOVERY lock resources * do NOT automatically add new nodes to mle nodemaps until they have properly joined the domain * make sure dlm_pick_recovery_master only exits when all nodes have synced * properly handle dlmunlock errors in dlm_pick_recovery_master * do not propagate network errors in dlm_send_begin_reco_message * dead nodes were not being put in the recovery map sometimes, fix this * dlmunlock was failing to clear the unlock actions on DLM_DENIED Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] OCFS2: The Second Oracle Cluster FilesystemKurt Hackel2006-01-03
A distributed lock manager built with the cluster file system use case in mind. The OCFS2 dlm exposes a VMS style API, though things have been simplified internally. The only lock levels implemented currently are NLMODE, PRMODE and EXMODE. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>