aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAge
...
| | * | | | | | | | | nfs41: nfs41_sequence_free_slotAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [from nfs41: separate free slot from sequence done] Don't free the slot until after all rpc_restart_calls have completed. Session reset will require more work. As noted by Trond, since we're using rpc_wake_up_next rather than rpc_wake_up() we must always wake up the next task in the queue either by going through nfs4_free_slot, or just calling rpc_wake_up_next if no slot is to be freed. [nfs41: sequence res use slotid] [nfs41: remove SEQ4_STATUS_USE_TK_STATUS] [got rid of nfs4_sequence_res.sr_session, use nfs_client.cl_session instead] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rpc_wake_up_next if sessions slot was not consumed.] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_sequence_free_slot use nfs_client for data server] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: free slotAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Free a slot in the slot table. Mark the slot as free in the bitmap-based allocation table by clearing a bit corresponding to the slotid. Update lowest_free_slotid if freed slotid is lower than that. Update highest_used_slotid. In the case the freed slotid equals the highest_used_slotid, scan downwards for the next highest used slotid using the optimized fls* functions. Finally, wake up thread waiting on slot_tbl_waitq for a free slot to become available. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: free slot use slotid] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: use find_first_zero_bit for nfs4_find_slot] While at it, obliterate lowest_free_slotid and fix-up related comments. As per review comment 21/85. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: use __clear_bit for nfs4_free_slot] While at it, fix-up function comment. Part of review comment 22/85. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: use find_last_bit in nfs4_free_slot to determine highest used slot.] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock] Otherwise there's a race (we've hit) with nfs4_free_slot where nfs41_setup_sequence sees a full slot table, unlocks slot_tbl_lock, nfs4_free_slots happen concurrently and call rpc_wake_up_next where there's nobody to wake up yet, context goes back to nfs41_setup_sequence which goes to sleep when the slot table is actually empty now and there's no-one to wake it up anymore. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: setup_sequence methodAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocate a slot in the session slot table and set the sequence op arguments. Called at the rpc prepare stage. Add a status to nfs41_sequence_res, initialize it to one so that we catch rpc level failures which do not go through decode_sequence which sets the new status field. Note that upon an rpc level failure, we don't know if the server processed the sequence operation or not. Proceed as if the server did process the sequence operation. Signed-off-by: Rahul Iyer <iyer@netapp.com> [nfs41: sequence args use slotid] [nfs41: find slot return slotid] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: remove SEQ4_STATUS_USE_TK_STATUS] As per 11-14-08 review [move extern declaration from nfs41: sequence setup/done support] [removed sa_session definition, changed sa_cache_this into a u8 to reduce footprint] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock] Otherwise there's a race (we've hit) with nfs4_free_slot where nfs41_setup_sequence sees a full slot table, unlocks slot_tbl_lock, nfs4_free_slots happen concurrently and call rpc_wake_up_next where there's nobody to wake up yet, context goes back to nfs41_setup_sequence which goes to sleep when the slot table is actually empty now and there's no-one to wake it up anymore. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: find slotBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Find a free slot using bitmap-based allocation. Use the optimized ffz function to find a zero bit in the bitmap that indicates a free slot, starting the search from the 'lowest_free_slotid' position. If found, mark the slot as used in the bitmap, get the slot's slotid and seqid, and update max_slotid to be used by the SEQUENCE operation. Also, update lowest_free_slotid for next search. If no free slot was found the caller has to wait for a free slot (outside the scope of this function) Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: find slot return slotid] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [use find_first_zero_bit for nfs4_find_slot as per review comment 21/85.] [use NFS4_MAX_SLOT_TABLE rather than NFS4_NO_SLOT] [nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: nfs4_setup_sequenceAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perform the nfs4_setup_sequence in the rpc_call_prepare state. If a session slot is not available, we will rpc_sleep_on the slot wait queue leaving the tk_action as rpc_call_prepare. Once we have a session slot, hang on to it even through rpc_restart_calls. Ensure the nfs41_sequence_res sr_slot pointer is NULL before rpc_run_task is called as nfs41_setup_sequence will only find a new slot if it is NULL. A future patch will call free slot after any rpc_restart_calls, and handle the rpc restart that result from a sequence operation error. Signed-off-by: Rahul Iyer <iyer@netapp.com> [nfs41: sequence res use slotid] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: simplify nfs4_call_sync] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_call_sync] [nfs41: check for session not minorversion] [nfs41: remove rpc_message from nfs41_call_sync_args] [moved NFS4_MAX_SLOT_TABLE logic into nfs41_setup_sequence] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs41_call_sync_data use nfs_client not nfs_server] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: expose nfs4_call_sync_session for lease renewal] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: remove unnecessary return check] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: xdr {encode,decode}_sequenceAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement stubs for encode and decode sequence, defined as no-ops when CONFIG_NFS_V4_1 is not defined. Add the nfsv41 encode and decode sizes. Add encode_sequence to all nfs4_enc_* routines and decode_sequence to all nfs4_dec_* routines as required by v41. [was nfs41: minorversion support for xdr] [added nfs_client argument to encode_sequence so not to use sequence_args to pass sa_session] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: pass *session in seq_args and seq_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: encode minorversion in compound headerBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andy Adamdon <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: pass *session in seq_args and seq_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | NFS: use dynamically computed compound_hdr.replen for xdr_inline_pages offsetBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As Trond suggested, rather than passing a constant to xdr_inline_pages, keep a running count of the expected reply bytes. In preparation for nfs41, where additional op sequence are expteced when talking to nfs41 servers. [NFS: cb_compoundhdr.replen is in words not bytes] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: get fs_locations replen before encoding the GETATTR] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: get getacl replen before encoding the GETATTR] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | NFS: update hdr->replen for every encode opBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | NFS: define and initialize compound_hdr.replenBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | replen holds the running count of expected reply bytes. repl will then be used by encoding routines for xdr_inline_pages offset after which data bytes are to be received directly into the xdr buffer pages. NOTE: According to the nfsv4 and v4.1 RFCs, the replied tag SHOULD be the same is the one sent, but this is not required as a MUST for the server to do so. The server may screw us if it replies a tag of a different length in the compound result. [NFS: cb_compoundhdr.replen is in words not bytes] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | NFS: use decode_change_info_maxsz for xdr maxsz calculationsBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: set up seq_res.sr_slotidAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize nfs4_sequence_res sr_slotid to NFS4_MAX_SLOT_TABLE. [was nfs41: sequence res use slotid] Signed-off-by: Andy Adamson <andros@netapp.com> [pulled definition of struct nfs4_sequence_res.sr_slotid to here] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: nfs41: pass *session in seq_args and seq_resBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To be used for getting the rpc's minorversion and for nfs41 xdr {en,de}coding of the sequence operation. Reset the seq session ptrs for minorversion=0 rpc calls. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: introduce nfs4_call_syncAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use nfs4_call_sync rather than rpc_call_sync to provide for a nfs41 sessions-enabled interface for sessions manipulation. The nfs41 rpc logic uses the rpc_call_prepare method to recover and create the session, as well as selecting a free slot id and the rpc_call_done to free the slot and update slot table related metadata. In the coming patches we'll add rpc prepare and done routines for setting up the sequence op and processing the sequence result. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_call_sync] As per 11-14-08 review. Squash into "nfs41: introduce nfs4_call_sync" and "nfs41: nfs4_setup_sequence" Define two functions one for v4 and one for v41 add a pointer to struct nfs4_client to the correct one. Signed-off-by: Andy Adamson <andros@netapp.com> [added BUG() in _nfs4_call_sync_session if !CONFIG_NFS_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: check for session not minorversion] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [group minorversion specific stuff together] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_init_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [cleaned-up patch: got rid of nfs_call_sync_t, dprintks, cosmetics, extra server defs] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_fs_locations_resBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [find nfs4_fs_locations_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_setaclresBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs_setaclres] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | NFS: get rid of unused xdr decode_setattr(, res) argumentBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_getaclresBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: embed resp_len in nfs_getaclres] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_pathconf_resBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_pathconf_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_fsinfo_resBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_fsinfo_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_statfs_resBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_statfs_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_readlink_resBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_readlink_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: use nfs4_server_caps_argBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for nfs41 sequence processing. Signed-off-by: Andy Admason <andros@netapp.com> [define nfs4_server_caps_arg] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: sessions client infrastructureAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFSv4.1 Sessions basic data types, initialization, and destruction. The session is always associated with a struct nfs_client that holds the exchange_id results. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [remove extraneous rpc_clnt pointer, use the struct nfs_client cl_rpcclient. remove the rpc_clnt parameter from nfs4 nfs4_init_session] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Use the presence of a session to determine behaviour instead of the minorversion number.] Signed-off-by: Andy Adamson <andros@netapp.com> [constified nfs4_has_session's struct nfs_client parameter] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Rename nfs4_put_session() to nfs4_destroy_session() and call it from nfs4_free_client() not nfs4_free_server(). Also get rid of nfs4_get_session() and the ref_count in nfs4_session struct as keeping track of nfs_client should be sufficient] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> [nfs41: pass rsize and wsize into nfs4_init_session] Signed-off-by: Andy Adamson <andros@netapp.com> [separated out removal of rpc_clnt parameter from nfs4_init_session ot a patch of its own] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Pass the nfs_client pointer into nfs4_alloc_session] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: don't assign to session->clp->cl_session in nfs4_destroy_session] [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_clear_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Refactor nfs4_init_session] Moved session allocation into nfs4_init_client_minor_version, called from nfs4_init_client. Leave rwise and wsize initialization in nfs4_init_session, called from nfs4_init_server. Reverted moving of nfs_fsid definition to nfs_fs_sb.h Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Move NFS4_MAX_SLOT_TABLE define from under CONFIG_NFS_V4_1] [Fix comile error when CONFIG_NFS_V4_1 is not set.] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [moved nfs4_init_slot_table definition to "create_session operation"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: alloc session with GFP_KERNEL] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: translate NFS4ERR_MINOR_VERS_MISMATCH to EPROTONOSUPPORTBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To be returned to the mount command when trying to mount a v4 server using minorversion 1. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: Use mount minorversion optionBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the mount minorversion option to initialize the nfs_client cl_minorversion and match it in nfs_match_client() when looking up a nfs_client. [nfs41: remove ifdefs around nfs_client_initdata.minorversion] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: nfs_client.cl_minorversionBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This field is set to the nfsv4 minor version for this mount. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Note: This patch sets the referral to the same minorversion as the current mount. Revisit in future patch. Signed-off-by: Andy Adamson <andros@netapp.com> [removed cl_minorversion assignment in nfs_set_client] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [always define nfs_client.cl_minorversion] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: add mount command option minorversionMike Sager2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mount -t nfs4 -o minorversion=[0|1] specifies whether to use 4.0 or 4.1. By default, the minorversion is set to 0. Signed-off-by: Mike Sager <sager@netapp.com> [set default minorversion to 0 as per Trond and SteveD's request] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | | | | nfs41: Add Kconfig symbols for NFSv4.1Ricardo Labiaga2009-06-17
| | | |/ / / / / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added CONFIG_NFS_V4_1 and made it depend upon CONFIG_NFS_V4 and EXPERIMENTAL. Indicate that CONFIG_NFS_V4_1 is for NFS developers at the moment At the moment we're expecting folks trying out nfs41 to actively participate in the development process by helping us debug issues and ideally send patches to fix problems. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | | | | | | | | Making fs/minix/minix.h double including safeHitoshi Mitake2009-06-22
| |_|_|_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I happened to find that fs/minix/minix.h doesn't guard double include. Yes, I know this never cause something destructive because this is self-evidence that no source file includes minix.h twice, but I think fixing this is better than disregarding it. Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | | Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds2009-06-20
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits) perfcounter: Handle some IO return values perf_counter: Push perf_sample_data through the swcounter code perf_counter tools: Define and use our own u64, s64 etc. definitions perf_counter: Close race in perf_lock_task_context() perf_counter, x86: Improve interactions with fast-gup perf_counter: Simplify and fix task migration counting perf_counter tools: Add a data file header perf_counter: Update userspace callchain sampling uses perf_counter: Make callchain samples extensible perf report: Filter to parent set by default perf_counter tools: Handle lost events perf_counter: Add event overlow handling fs: Provide empty .set_page_dirty() aop for anon inodes perf_counter: tools: Makefile tweaks for 64-bit powerpc perf_counter: powerpc: Add processor back-end for MPC7450 family perf_counter: powerpc: Make powerpc perf_counter code safe for 32-bit kernels perf_counter: powerpc: Change how processor-specific back-ends get selected perf_counter: powerpc: Use unsigned long for register and constraint values perf_counter: powerpc: Enable use of software counters on 32-bit powerpc perf_counter tools: Add and use isprint() ...
| * | | | | | | | | fs: Provide empty .set_page_dirty() aop for anon inodesPeter Zijlstra2009-06-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | .set_page_dirty() is one of those a_ops that defaults to the buffer implementation when not set. Therefore provide a dummy function to make it do nothing. (Uncovered by perfcounters fd's which can now be writable-mmap-ed.) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | | | | fat: Fix the removal of opts->fs_dmaskOGAWA Hirofumi2009-06-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (ce3b0f8d5c2203301fc87f3aaaed73e5819e2a48: New helper - current_umask()) is removing the opts->fs_dmask, probably it's a cut-and-paste miss or something. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
* | | | | | | | | | Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds2009-06-19
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.infradead.org/users/eparis/notify: inotify: inotify_destroy_mark_entry could get called twice
| * | | | | | | | | | inotify: inotify_destroy_mark_entry could get called twiceEric Paris2009-06-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inotify_destroy_mark_entry could get called twice for the same mark since it is called directly in inotify_rm_watch and when the mark is being destroyed for another reason. As an example assume that the file being watched was just deleted so inotify_destroy_mark_entry would get called from the path fsnotify_inoderemove() -> fsnotify_destroy_marks_by_inode() -> fsnotify_destroy_mark_entry() -> inotify_destroy_mark_entry(). If this happened at the same time as userspace tried to remove a watch via inotify_rm_watch we could attempt to remove the mark from the idr twice and could thus double dec the ref cnt and potentially could be in a use after free/double free situation. The fix is to have inotify_rm_watch use the generic recursive safe fsnotify_destroy_mark_by_entry() so we are sure the inotify_destroy_mark_entry() function can only be called one. This patch also renames the function to inotify_ingored_remove_idr() so it is clear what is actually going on in the function. Hopefully this fixes: [ 20.342058] idr_remove called for id=20 which is not allocated. [ 20.348000] Pid: 1860, comm: udevd Not tainted 2.6.30-tip #1077 [ 20.353933] Call Trace: [ 20.356410] [<ffffffff811a82b7>] idr_remove+0x115/0x18f [ 20.361737] [<ffffffff8134259d>] ? _spin_lock+0x6d/0x75 [ 20.367061] [<ffffffff8111640a>] ? inotify_destroy_mark_entry+0xa3/0xcf [ 20.373771] [<ffffffff8111641e>] inotify_destroy_mark_entry+0xb7/0xcf [ 20.380306] [<ffffffff81115913>] inotify_freeing_mark+0xe/0x10 [ 20.386238] [<ffffffff8111410d>] fsnotify_destroy_mark_by_entry+0x143/0x170 [ 20.393293] [<ffffffff811163a3>] inotify_destroy_mark_entry+0x3c/0xcf [ 20.399829] [<ffffffff811164d1>] sys_inotify_rm_watch+0x9b/0xc6 [ 20.405850] [<ffffffff8100bcdb>] system_call_fastpath+0x16/0x1b Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Eric Paris <eparis@redhat.com> Tested-by: Peter Ziljlstra <peterz@infradead.org>
* | | | | | | | | | | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2009-06-19
|\ \ \ \ \ \ \ \ \ \ \ | |/ / / / / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Fix kernel-doc parameter name typo in blk-settings.c: block: rename CONFIG_LBD to CONFIG_LBDAF block: Fix bounce_pfn setting hd: stop defining MAJOR_NR
| * | | | | | | | | | block: rename CONFIG_LBD to CONFIG_LBDAFBartlomiej Zolnierkiewicz2009-06-19
| | |/ / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Follow-up to "block: enable by default support for large devices and files on 32-bit archs". Rename CONFIG_LBD to CONFIG_LBDAF to: - allow update of existing [def]configs for "default y" change - reflect that it is used also for large files support nowadays Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | | | | | | Merge branch 'for_linus' of ↵Linus Torvalds2009-06-18
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: clean up jbd2_journal_try_to_free_buffers() ext4: Don't update ctime for non-extent-mapped inodes ext4: Fix up whitespace issues in fs/ext4/inode.c ext4: Fix 64-bit block type problem on 32-bit platforms ext4: teach the inode allocator to use a goal inode number ext4: Use a hash of the topdir directory name for the Orlov parent group ext4: document the "abort" mount option ext4: move the abort flag from s_mount_opts to s_mount_flags ext4: update the s_last_mounted field in the superblock ext4: change s_mount_opt to be an unsigned int ext4: online defrag -- Add EXT4_IOC_MOVE_EXT ioctl ext4: avoid unnecessary spinlock in critical POSIX ACL path ext3: avoid unnecessary spinlock in critical POSIX ACL path ext4: convert instrumentation from markers to tracepoints jbd2: convert instrumentation from markers to tracepoints
| * | | | | | | | | | jbd2: clean up jbd2_journal_try_to_free_buffers()Hisashi Hifumi2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch reverts 3f31fddf, which is no longer needed because if a race between freeing buffer and committing transaction functionality occurs and dio gets error, currently dio falls back to buffered IO due to the commit 6ccfa806. Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Mingming Cao <cmm@us.ibm.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: Don't update ctime for non-extent-mapped inodesTheodore Ts'o2009-06-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The VFS handles updating ctime, so we don't need to update the inode's ctime in ext4_splace_branch() to update the direct or indirect blocks. This was harmless when we did this in ext3, but in ext4, thanks to delayed allocation, updating the ctime in ext4_splice_branch() can cause the ctime to mysteriously jump when the blocks are finally allocated. Thanks to Björn Steinbrink for pointing out this problem on the git mailing list. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: Fix up whitespace issues in fs/ext4/inode.cTheodore Ts'o2009-06-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a pure cleanup patch. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: Fix 64-bit block type problem on 32-bit platformsTheodore Ts'o2009-06-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function ext4_mb_free_blocks() was using an "unsigned long" to pass a block number; this will cause 64-bit block numbers to get truncated on x86 and other 32-bit platforms. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
| * | | | | | | | | | ext4: teach the inode allocator to use a goal inode numberAndreas Dilger2009-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enhance the inode allocator to take a goal inode number as a paremeter; if it is specified, it takes precedence over Orlov or parent directory inode allocation algorithms. The extents migration function uses the goal inode number so that the extent trees allocated the migration function use the correct flex_bg. In the future, the goal inode functionality will also be used to allocate an adjacent inode for the extended attributes. Also, for testing purposes the goal inode number can be specified via /sys/fs/{dev}/inode_goal. This can be useful for testing inode allocation beyond 2^32 blocks on very large filesystems. Signed-off-by: Andreas Dilger <adilger@sun.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: Use a hash of the topdir directory name for the Orlov parent groupTheodore Ts'o2009-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a random number to determine the goal parent grop for the Orlov top directories, use a hash of the directory name. This allows for repeatable results when trying to benchmark filesystem layout algorithms. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: move the abort flag from s_mount_opts to s_mount_flagsTheodore Ts'o2009-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're running out of space in the mount options word, and EXT4_MOUNT_ABORT isn't really a mount option, but a run-time flag. So move it to become EXT4_MF_FS_ABORTED in s_mount_flags. Also remove bogus ext2_fs.h / ext4.h simultaneous #include protection, which can never happen. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: update the s_last_mounted field in the superblockTheodore Ts'o2009-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This field can be very helpful when a system administrator is trying to sort through large numbers of block devices or filesystem images. What is stored in this field can be ambiguous if multiple filesystem namespaces are in play; what we store in practice is the mountpoint interpreted by the process's namespace which first opens a file in the filesystem. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: change s_mount_opt to be an unsigned intTheodore Ts'o2009-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can only fit 32 options in s_mount_opt because an unsigned long is 32-bits on a x86 machine. So use an unsigned int to save space on 64-bit platforms. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: online defrag -- Add EXT4_IOC_MOVE_EXT ioctlAkira Fujita2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The EXT4_IOC_MOVE_EXT exchanges the blocks between orig_fd and donor_fd, and then write the file data of orig_fd to donor_fd. ext4_mext_move_extent() is the main fucntion of ext4 online defrag, and this patch includes all functions related to ext4 online defrag. Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext4: avoid unnecessary spinlock in critical POSIX ACL pathTheodore Ts'o2009-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a filesystem supports POSIX ACL's, the VFS layer expects the filesystem to do POSIX ACL checks on any files not owned by the caller, and it does this for every single pathname component that it looks up. That obviously can be pretty expensive if the filesystem isn't careful about it, especially with locking. That's doubly sad, since the common case tends to be that there are no ACL's associated with the files in question. ext4 already caches the ACL data so that it doesn't have to look it up over and over again, but it does so by taking the inode->i_lock spinlock on every lookup. Which is a noticeable overhead even if it's a private lock, especially on CPU's where the serialization is expensive (eg Intel Netburst aka 'P4'). For the special case of not actually having any ACL's, all that locking is unnecessary. Even if somebody else were to be changing the ACL's on another CPU, we simply don't care - if we've seen a NULL ACL, we might as well use it. So just load the ACL speculatively without any locking, and if it was NULL, just use it. If it's non-NULL (either because we had a cached entry, or because the cache hasn't been filled in at all), it means that we'll need to get the lock and re-load it properly. (This commit was ported from a patch originally authored by Linus for ext3.) Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | | | | | | ext3: avoid unnecessary spinlock in critical POSIX ACL pathLinus Torvalds2009-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a filesystem supports POSIX ACL's, the VFS layer expects the filesystem to do POSIX ACL checks on any files not owned by the caller, and it does this for every single pathname component that it looks up. That obviously can be pretty expensive if the filesystem isn't careful about it, especially with locking. That's doubly sad, since the common case tends to be that there are no ACL's associated with the files in question. ext3 already caches the ACL data so that it doesn't have to look it up over and over again, but it does so by taking the inode->i_lock spinlock on every lookup. Which is a noticeable overhead even if it's a private lock, especially on CPU's where the serialization is expensive (eg Intel Netburst aka 'P4'). For the special case of not actually having any ACL's, all that locking is unnecessary. Even if somebody else were to be changing the ACL's on another CPU, we simply don't care - if we've seen a NULL ACL, we might as well use it. So just load the ACL speculatively without any locking, and if it was NULL, just use it. If it's non-NULL (either because we had a cached entry, or because the cache hasn't been filled in at all), it means that we'll need to get the lock and re-load it properly. This is noticeable even on Nehalem, which does locking quite well (much better than P4). From lmbench: Processor, Processes - times in microseconds - smaller is better -------------------------------------------------------------------- Host OS Mhz null null open slct fork exec sh call I/O stat clos TCP proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- - before: nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.45 2.18 69.1 273. 1141 nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.48 2.28 69.9 253. 1140 nehalem.l Linux 2.6.30- 3193 0.04 0.10 0.95 1.42 2.19 68.6 284. 1141 - after: nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.44 2.12 68.3 282. 1094 nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.20 67.0 308. 1123 nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.36 67.4 293. 1148 where you can see what appears to be a roughly 3% improvement in stat and open/close latencies from just the removal of the locking overhead. Of course, this only matters for files you don't own (the owner never needs to do the ACL checks), but that's the common case for libraries, header files, and executables. As well as for the base components of any absolute pathname, even if you are the owner of the final file. [ At some point we probably want to move this ACL caching logic entirely into the VFS layer (and only call down to the filesystem when uncached), but in the meantime this improves ext3 a bit. A similar fix to btrfs makes a much bigger difference (15x improvement in lmbench) due to broken caching. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk>