path: root/fs/f2fs
Commit message | Author | Age
* f2fs: fix to release count of meta page in ->invalidatepage | Chao Yu | 2015-02-11
    We will encounter deadloop in below scenario:
    1. increase page count for F2FS_DIRTY_META type in following path:
        ->recover_fsync_data
        ->recover_data
        ->do_recover_data
        ->recover_data_page
        ->change_curseg
        ->write_sum_page
        ->set_page_dirty
    2. fail in recover_data()
    3. invalidate meta pages in truncate_inode_pages_final without decreasing page count.
    4. deadloop when sync_meta_pages as page count will always be non-zero.

    message:
    NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
     [<c1129a37>] pagevec_lookup_tag+0x27/0x30
     [<f0e774c7>] sync_meta_pages+0x87/0x160 [f2fs]
     [<f0e86dd9>] recover_fsync_data+0xeb9/0xf10 [f2fs]
     [<f0e75398>] f2fs_fill_super+0x888/0x980 [f2fs]
     [<c11733ca>] mount_bdev+0x16a/0x1a0
     [<f0e7180f>] f2fs_mount+0x1f/0x30 [f2fs]
     [<c1173da6>] mount_fs+0x36/0x170
     [<c118b6f5>] vfs_kern_mount+0x55/0xe0
     [<c118d63f>] do_mount+0x1df/0x9f0
     [<c118e110>] SyS_mount+0x70/0xb0
     [<c15a0c48>] sysenter_do_call+0x12/0x12

    To avoid page count leak, let's add ->invalidatepage and ->releasepage in f2fs_meta_aops as f2fs_node_aops to release meta page count correctly.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: do checkpoint when umount flag is not set | Jaegeuk Kim | 2015-02-11
    If the previous checkpoint was done without CP_UMOUNT flag, it needs to do checkpoint with CP_UMOUNT for the next fast boot.

    Reviewed-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: trigger correct checkpoint during umount | Jaegeuk Kim | 2015-02-11
    This patch fixes to trigger checkpoint with umount flag when kill_sb was called. In kill_sb, f2fs_sync_fs was finally called, but at this time, f2fs can't do checkpoint with CP_UMOUNT. After then, f2fs_put_super is not doing checkpoint, since it is not dirty.

    So, this patch adds a flag to indicate f2fs_sync_fs is called during umount.

    Reviewed-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: update memory footprint information | Jaegeuk Kim | 2015-02-11
    This patch adds missing memory usages, and splits them in detail.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix wrong memory footprint statistics in debugfs | Chao Yu | 2015-02-11
    Our value of memory footprint statistics showed in debugfs is not calculated correctly. Fix it in this patch.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid infinite loop on cp_error | Jaegeuk Kim | 2015-02-11
    If cp_error is set, we should avoid all the infinite loop. In f2fs_sync_file, there is a hole, and this patch fixes that.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: pids_lock can be static | kbuild test robot | 2015-01-09
    fs/f2fs/trace.c:19:12: sparse: symbol 'pids_lock' was not declared. Should it be static?

    Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add f2fs_destroy_trace_ios to free radix tree | Jaegeuk Kim | 2015-01-09
    This patch removes radix tree after finishing tracing IOs.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add spin_lock to cover radix operations in IO tracer | Jaegeuk Kim | 2015-01-09
    This patch adds spin_lock to cover radix tree operations in IO tracer.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add nat/sit entries into status | Jaegeuk Kim | 2015-01-09
    This patch adds NAT/SIT entry information.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: free radix_tree_nodes used by nat_set entries | Jaegeuk Kim | 2015-01-09
    In the normal case, the radix_tree_nodes are freed successfully. But, when cp_error was detected, we should destroy them forcefully.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix wrong unlock_page call | Jaegeuk Kim | 2015-01-09
    This patch removes wrongly called unlock_page.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: get rid of kzalloc in __recover_inline_status | Chao Yu | 2015-01-09
    We use kzalloc to allocate memory in __recover_inline_status, and use this all-zero memory to check the inline data content of the inode page by comparing them. This is inefficient and not needed, so let's check the inline data content directly.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    [Jaegeuk Kim: make the code more neat]
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: align direct_io'ed data to section | Jaegeuk Kim | 2015-01-09
    This patch aligns the start block address of a file for direct io to the f2fs's section size.

    Some flash devices manage an over 4KB-sized page as a write unit, and if the direct_io'ed data are written but not aligned to that unit, the performance can be degraded due to the partial page copies.

    Thus, since f2fs has a section that is well aligned to FTL units, we can align the block address to the section size so that f2fs avoids this misalignment.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: remove uncovered code path | Jaegeuk Kim | 2015-01-09
    This patch removes unnecessary function calls.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid potential unnecessary codes | Jaegeuk Kim | 2015-01-09
    This patch relocates some operations to avoid unnecessary execution.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: clean up to remove parameter | Jaegeuk Kim | 2015-01-09
    This patch uses dn->data_blkaddr as a parameter for the destination block address.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: reuse inode_entry_slab in gc procedure for using slab more effectively | Chao Yu | 2015-01-09
    There are two slab cache inode_entry_slab and winode_slab using the same structure as below:

        struct dir_inode_entry {
                struct list_head list;  /* list head */
                struct inode *inode;    /* vfs inode pointer */
        };

        struct inode_entry {
                struct list_head list;
                struct inode *inode;
        };

    It's a little waste that the two cache can not share their memory space for each other. So in this patch we remove one redundant winode_slab slab cache, then use more universal name struct inode_entry as remaining data structure name of slab, finally we reuse the inode_entry_slab to store dirty dir item and gc item for more effective.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: cleanup parameters for trace_f2fs_submit_{read_,write_,page_,page_m}bio with fio | Chao Yu | 2015-01-09
    Cleanup parameters for trace_f2fs_submit_{read_,write_,page_,page_m}bio with fio as one parameter.

    Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: cleanup trace event of f2fs_submit_page_{m,}bio with DECLARE_EVENT_CLASS | Chao Yu | 2015-01-09
    This patch adds missing parameter _type_ for trace_f2fs_submit_page_bio, then use DECLARE_EVENT_CLASS/DEFINE_EVENT_CONDITION pair to cleanup some trace event code related to f2fs_submit_page_{m,}bio.

    Additionally, after we remove redundant code, size of code can be reduced:

       text    data     bss     dec     hex filename
     176787    8712      56  185555   2d4d3 f2fs.ko.org
     174408    8648      56  183112   2cb48 f2fs.ko

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix missing cold bit during recovery | Jaegeuk Kim | 2015-01-09
    In do_recover_data, we find and update previous node pages after updating its new block addresses. After then, we call fill_node_footer without reset field, we erase its cold bit so that this new cold node block is written to wrong log area.

    This patch fixes not to miss its old flag.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add block count by in-place-update in stat info | Changman Lee | 2015-01-09
    This patch adds block count by in-place-update in stat.

    Signed-off-by: Changman Lee <cm224.lee@samsung.com>
    Reviewed-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid double lock for cp_rwsem | Jaegeuk Kim | 2015-01-09
    The __f2fs_add_link is covered by cp_rwsem all the time. This calls init_inode_metadata, which conducts some acl operations including memory allocation with GFP_KERNEL previously.

    But, under memory pressure, f2fs_write_data_page can be called, which also grabs cp_rwsem too.

    In this case, this incurs a deadlock pointed by Chao.

        Thread #1        Thread #2
        down_read
                         down_write
        down_read
         -> here down_read should wait forever.

    Reviewed-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
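The interleaving in the cp_rwsem commit above can be replayed with a tiny single-threaded model. This is only a sketch under a stated assumption: new readers queue behind a waiting writer, which is the behaviour the commit message relies on. The type `rwsem_model` and all function names are hypothetical and are not kernel APIs; "thread 2" stands for any writer of the semaphore.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a reader/writer semaphore. */
struct rwsem_model {
	int active_readers;
	int waiting_writers;
	bool writer_active;
};

/* Returns true if the read lock is granted at once, false if the caller would sleep. */
static bool model_down_read(struct rwsem_model *s)
{
	assert(s != NULL);
	/* Assumption: a reader may not overtake a writer that is already waiting. */
	if (s->writer_active || s->waiting_writers > 0)
		return false;
	s->active_readers++;
	return true;
}

/* Returns true if the write lock is granted at once, false if the caller is queued. */
static bool model_down_write(struct rwsem_model *s)
{
	assert(s != NULL);
	if (s->writer_active || s->active_readers > 0) {
		s->waiting_writers++;
		return false;
	}
	s->writer_active = true;
	return true;
}

/* Replays the three steps of the diagram; returns true if they end in a deadlock. */
static bool replay_cp_rwsem_deadlock(void)
{
	struct rwsem_model cp_rwsem = { 0, 0, false };

	bool t1_first  = model_down_read(&cp_rwsem);  /* thread 1 takes the read lock */
	bool t2_write  = model_down_write(&cp_rwsem); /* thread 2 queues behind thread 1 */
	bool t1_second = model_down_read(&cp_rwsem);  /* thread 1 re-enters under reclaim */

	/* Thread 1 holds a read lock while waiting behind thread 2, and thread 2
	 * waits for thread 1 to drop that lock: neither can make progress. */
	return t1_first && !t2_write && !t1_second;
}
```

Calling `replay_cp_rwsem_deadlock()` reports whether the second read acquisition would block while the first is still held, which is the condition the patch avoids.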
* f2fs: activate f2fs_trace_ios | Jaegeuk Kim | 2015-01-09
    This patch activates f2fs_trace_ios.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: activate f2fs_trace_pid | Jaegeuk Kim | 2015-01-09
    This patch activates f2fs_trace_pid.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add key functions for f2fs_io_tracer | Jaegeuk Kim | 2015-01-09
    This patch adds two key functions to trace process ids and IOs.

    The basic idea is to
    1. remain process ids, pids, in page->private.
    2. show pids in IO traces.

    So, later we can retrieve process information according to IO traces.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add f2fs_io_tracer support | Jaegeuk Kim | 2015-01-09
    This patch adds:
     o initial trace.c and trace.h with skeleton functions
     o Kconfig and Makefile to activate this feature

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use f2fs_io_info to clean up messy parameters during IO path | Jaegeuk Kim | 2015-01-09
    This patch cleans up parameters on IO paths. The key idea is to use f2fs_io_info adding a parameter, block address, and then use this structure as parameters.

    Reviewed-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use ra_meta_pages to simplify readahead code in restore_node_summary | Chao Yu | 2015-01-09
    Use more common function ra_meta_pages() with META_POR to readahead node blocks in restore_node_summary() instead of ra_sum_pages(), hence we can simplify the readahead code there, and also we can remove unused function ra_sum_pages().

    changes from v2:
     o use invalidate_mapping_pages as before suggested by Changman Lee.
    changes from v1:
     o fix one bug when using truncate_inode_pages_range which is pointed out by Jaegeuk Kim.

    Reviewed-by: Changman Lee <cm224.lee@samsung.com>
    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: merge two uchar variable in struct node_info to reduce memory cost | Chao Yu | 2015-01-09
    This patch moves one member of struct nat_entry: _flag_ to struct node_info, so _version_ in struct node_info and _flag_ which are unsigned char type will merge to one 32-bit space in register/memory. So the size of nat_entry will be reduced from 28 bytes to 24 bytes (for 64-bit machine, reduce its size from 40 bytes to 32 bytes) and then slab memory using by f2fs will be reduced.

    changes from v2:
     o update description of memory usage gain for 64-bit machine suggested by Changman Lee.
    changes from v1:
     o introduce inline copy_node_info() to copy valid data from node info suggested by Jaegeuk Kim, it can avoid bug.

    Reviewed-by: Changman Lee <cm224.lee@samsung.com>
    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
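The size figures quoted in the node_info commit above come from structure padding, which a small userspace sketch can illustrate. The layouts below are simplified stand-ins, not the kernel definitions: the struct names are hypothetical, `list_head_sketch` replaces the kernel's `struct list_head`, and the three identifiers are assumed to be 32-bit values. Exact sizes depend on the ABI; on common 32-bit and 64-bit ABIs this layout should give 28 to 24 and 40 to 32 bytes respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct list_head: two pointers. */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Before: version lives in node_info, flag lives in nat_entry. */
struct node_info_before {
	uint32_t nid;
	uint32_t ino;
	uint32_t blk_addr;
	unsigned char version;  /* followed by tail padding */
};

struct nat_entry_before {
	struct list_head_sketch list;
	unsigned char flag;     /* one byte, then padding so that ni is aligned */
	struct node_info_before ni;
};

/* After: flag sits next to version and occupies what used to be tail padding. */
struct node_info_after {
	uint32_t nid;
	uint32_t ino;
	uint32_t blk_addr;
	unsigned char version;
	unsigned char flag;
};

struct nat_entry_after {
	struct list_head_sketch list;
	struct node_info_after ni;
};

/* Compile-time check that moving the member really shrinks the entry. */
static_assert(sizeof(struct nat_entry_after) < sizeof(struct nat_entry_before),
	      "moving flag into node_info should shrink nat_entry");

/* Bytes saved per entry by the move. */
static size_t nat_entry_bytes_saved(void)
{
	return sizeof(struct nat_entry_before) - sizeof(struct nat_entry_after);
}
```

Because one such entry is allocated per cached NAT entry, the per-entry saving returned by `nat_entry_bytes_saved()` is multiplied across the whole slab.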
* f2fs: readahead contiguous current summary blocks in checkpoint | Chao Yu | 2015-01-09
    Let's add readahead code for reading contiguous compact/normal summary blocks in checkpoint, then we will gain better performance in mount procedure.

    Changes from v1
     o remove inappropriate 'unlikely' in npages_for_summary_flush.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use missing the use of f2fs_kunmap_page | Jaegeuk Kim | 2015-01-09
    This patch calls f2fs_kunmap_page which I missed before.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: remove unnecessary call to invalidate inmemory pages | Jaegeuk Kim | 2015-01-09
    Now we use inmemory pages for atomic write only and provide abort procedure, we don't need to truncate them explicitly.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix small discards not to issue redundantly | Jaegeuk Kim | 2015-01-09
    The ckpt_valid_map and cur_valid_map are synced by seg_info_to_raw_sit. In the case of small discards, the candidates are selected before sync, while fitrim selects candidates after sync.

    So, for small discards, we need to add candidates only just being obsoleted.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: change atomic and volatile write policies | Jaegeuk Kim | 2015-01-09
    This patch adds two new ioctls to release inmemory pages grabbed by atomic writes.
     o f2fs_ioc_abort_volatile_write
      - If transaction was failed, all the grabbed pages and data should be written.
     o f2fs_ioc_release_volatile_write
      - This is to enhance the performance of PERSIST mode in sqlite.

    In order to avoid huge memory consumption which causes OOM, this patch changes volatile writes to use normal dirty pages, instead blocked flushing to the disk as long as system does not suffer from memory pressure.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: don't need to call lock_op and lock_page for abort | Jaegeuk Kim | 2015-01-09
    We don't need to call lock_op and lock_page at the aborting path.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix wrong condition check to trigger f2fs_sync_fs | Jaegeuk Kim | 2015-01-09
    If there is not enough available memory, we need to trigger f2fs_sync_fs.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: remove checking dirty_exceed | Jaegeuk Kim | 2015-01-09
    We don't need to force to write dirty_exceeded for f2fs_balance_fs_bg. This flag was only meaningful to write bypassing conditions.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid to ra unneeded blocks in recover flow | Chao Yu | 2014-12-08
    To improve recovery speed, f2fs tries to readahead many contiguous blocks in the warm node segment. But abnormal power-off does not occur frequently, so when mounting an f2fs image after a normal power-off, reading ahead so many blocks and then invalidating them hurts mount performance.

    It's better to just ra the first next-block for normal condition.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce is_valid_blkaddr to cleanup codes in ra_meta_pages | Chao Yu | 2014-12-08
    This patch does cleanup work: it introduces is_valid_blkaddr() to hold the verification code that checks blkaddr against its upper and lower boundary values, which was previously in ra_meta_pages.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to enable readahead for SSA/CP blocks | Chao Yu | 2014-12-08
    1. We use zero as upper boundary value for ra SSA/CP blocks, we will skip readahead as verification failure with max number, it causes low performance.
    2. Low boundary value is not accurate for SSA/CP/POR region verification, so these values need to be redefined.

    This patch fixes above issues.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use atomic for counting inode with inline_{dir,inode} flag | Chao Yu | 2014-12-08
    As inline_{dir,inode} stat is increased/decreased concurrently by multi threads, so the value is not so accurate, let's use atomic type for counting accurately.

    Signed-off-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
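The reasoning in the atomic-counter commit above can be sketched in userspace with C11 atomics and POSIX threads. This is not the kernel code: the kernel uses its own `atomic_t` helpers, and the counter name, `worker`, and `run_counter_sketch` below are hypothetical. With a plain `int` the concurrent read-modify-write updates could be lost; with an atomic type every increment and decrement is counted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_MAX_THREADS 16

/* Hypothetical stand-in for a statistic such as the inline inode count. */
static atomic_int inline_inode_count;

struct worker_arg {
	int iterations;
};

/* Each worker bumps the counter `iterations` times, then drops half of them. */
static void *worker(void *p)
{
	const struct worker_arg *arg = p;

	for (int i = 0; i < arg->iterations; i++)
		atomic_fetch_add(&inline_inode_count, 1);  /* atomic increment */
	for (int i = 0; i < arg->iterations / 2; i++)
		atomic_fetch_sub(&inline_inode_count, 1);  /* atomic decrement */
	return NULL;
}

/* Runs `nthreads` workers and returns the final counter value, or -1 on error. */
static int run_counter_sketch(int nthreads, int iterations)
{
	pthread_t tids[SKETCH_MAX_THREADS];
	struct worker_arg arg = { iterations };
	int started = 0;

	assert(iterations >= 0);
	if (nthreads < 1 || nthreads > SKETCH_MAX_THREADS)
		return -1;

	atomic_store(&inline_inode_count, 0);
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, worker, &arg) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return started == nthreads ? atomic_load(&inline_inode_count) : -1;
}
```

For example, `run_counter_sketch(4, 100000)` leaves each of the four workers with a net contribution of 50000, so the atomic counter ends at exactly 200000 regardless of how the threads interleave.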
* f2fs: cleanup path to need cp at fsync | Changman Lee | 2014-12-08
    Added some commentaries for code readability and cleaned up if-statement clearly.

    Signed-off-by: Changman Lee <cm224.lee@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: check if inode state is dirty at fsync | Changman Lee | 2014-12-08
    If inode state is dirty, go straight to write.

    Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Changman Lee <cm224.lee@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: count the number of inmemory pages | Jaegeuk Kim | 2014-12-08
    This patch adds counting # of inmemory pages in the page cache.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: release inmemory pages when the file was closed | Jaegeuk Kim | 2014-12-08
    If file is closed, let's drop inmemory pages.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: set page private for inmemory pages for truncation | Jaegeuk Kim | 2014-12-08
    The inmemory pages should be handled by invalidate_page since it needs to be released in the truncation path.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: count inline_xx in do_read_inode | Jaegeuk Kim | 2014-12-08
    In do_read_inode, if we failed __recover_inline_status, the inode has inline flag without increasing its count. Later, f2fs_evict_inode will decrease the count, which causes -1.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: do retry operations with cond_resched | Jaegeuk Kim | 2014-12-08
    This patch revisits retrial paths in f2fs. The basic idea is to use cond_resched instead of retrying from the very early stage.

    Suggested-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
    Reviewed-by: Chao Yu <chao2.yu@samsung.com>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: call radix_tree_preload before radix_tree_insert | Jaegeuk Kim | 2014-12-05
    This patch tries to fix:
     BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
      (radix_tree_node_alloc+0x14/0x74) from [<c033d8a0>] (radix_tree_insert+0x110/0x200)
      (radix_tree_insert+0x110/0x200) from [<c02e8264>] (gc_data_segment+0x340/0x52c)
      (gc_data_segment+0x340/0x52c) from [<c02e8658>] (f2fs_gc+0x208/0x400)
      (f2fs_gc+0x208/0x400) from [<c02e8a98>] (gc_thread_func+0x248/0x28c)
      (gc_thread_func+0x248/0x28c) from [<c0139944>] (kthread+0xa0/0xac)
      (kthread+0xa0/0xac) from [<c0105ef8>] (ret_from_fork+0x14/0x3c)

    The reason is that f2fs calls radix_tree_insert under enabled preemption. So, before calling it, we need to call radix_tree_preload.

    Otherwise, we should use _GFP_WAIT for the radix tree, and use mutex or semaphore to cover the radix tree operations.

    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>