litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
...
\| *	rtl8187: Fix wrong rfkill switch mask for some models	Larry Finger	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are different bits used to convey the setting of the rfkill switch to the driver. The current driver only supports one of these possibilities. These changes were derived from the latest version of the vendor driver. This patch fixes the regression noted in kernel Bugzilla #14743. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Reported-and-tested-by: Antti Kaijanmäki <antti@kaijanmaki.net> Tested-by: Hin-Tak Leung <hintak.leung@gmail.com> Cc: Stable <stable@kernel.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	ath9k: fix tx status reporting	Felix Fietkau	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes a bug in ath9k's tx status check, which caused mac80211 to consider regularly transmitted unicast frames as un-acked. When checking the ts_status field for errors, it needs to be masked with ATH9K_TXERR_FILT, because this field also contains other fields like ATH9K_TX_ACKED. Without this patch, AP mode is pretty much unusable, as hostapd checks the ACK status for the frames that it injects. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mac80211: Fix bug in computing crc over dynamic IEs in beacon	Vasanthakumar Thiagarajan	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a 32-bit machine, BIT() macro does not give the required bit value if the bit is mroe than 31. In ieee802_11_parse_elems_crc(), BIT() is suppossed to get the bit value more than 31 (42 (id of ERP_INFO_IE), 37 (CHANNEL_SWITCH_IE), (42), 32 (POWER_CONSTRAINT_IE), 45 (HT_CAP_IE), 61 (HT_INFO_IE)). As we do not get the required bit value for the above IEs, crc over these IEs are never calculated, so any dynamic change in these IEs after the association is not really handled on 32-bit platforms. This patch fixes this issue. Cc: stable@kernel.org Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	net/rfkill/core.c: work around gcc-4.0.2 silliness	Andrew Morton	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	net/rfkill/core.c: In function 'rfkill_type_show': net/rfkill/core.c:610: warning: control may reach end of non-void function 'rfkill_get_type_str' being inlined A gcc bug, but simple enough to squish. Cc: John W. Linville <linville@tuxdriver.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: don't complain about oversized beacons in FINALIZE_JOIN	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The FINALIZE_JOIN firmware command only looks at the first couple of fields in the beacon, and therefore it's not necessary to complain if the beacon is longer than 128 bytes. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: don't overwrite mwl8k_vif::bssid until after disassociation	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When disassociating, mac80211 zeroes vif->bss_info.bssid before calling our ->bss_info_changed(), but we need the BSSID to remove the hardware station database entry for our AP, so we can't clear our local copy of the BSSID until after we've done that. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: struct ieee80211_rx_status::qual is deprecated	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: don't forget to call pci_disable_device()	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't forget to call pci_disable_device() if pci_request_regions() fails during probe. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: increase firmware loading timeouts	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The time between loading the helper image and starting to upload the main firmware image should be at least 5 ms or so. We were doing an msleep(1) before, and 1 ms appears to not be enough in almost all cases, but building with HZ=100 has always masked this so far. Bumping the msleep argument to 5 fixes firmware loading e.g. when HZ=1000. Some firmware images need more than 200ms to initialize. Bump the ready code timeout to 500ms to accommodate for this. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: allow more time for transmit rings to drain	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before issuing any firmware commands, we wait for the transmit rings to drain, to prevent control versus data path synchronization issues. In some cases, this can end up taking longer than the current hardcoded limit of 5 seconds, for example if the transmit rings are filled with packets for a host that has dropped off the air and we end up retransmitting every pending packet at the lowest rate a couple of times. This patch changes mwl8k_tx_wait_empty() to only bail out on timeout expiry if there was no change in the number of packets pending in the transmit rings during the waiting period. If at least one transmit ring entry was reclaimed while we were waiting, we are apparently still making progress, and we'll allow waiting for another timeout period. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: allow more time for firmware commands to complete	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some firmware commands can under some circumstances take more than 2 seconds to complete. This patch bumps the timeout up to 10 seconds, and prints a message whenever a command takes more than 2 seconds. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: properly report rate on received 40MHz packets	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On 8366, bit 6 in the rx descriptor rate field indicates whether the packet was received on a 20MHz or 40MHz channel, and is not part of the MCS index. Handle this properly, which then prevents hitting the WARN_ON and being dropped in ieee80211_rx(). Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: fix addr4 zeroing and payload overwrite on DMA header creation	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When inserting a DMA header into a packet for transmission, mwl8k_add_dma_header() would blindly zero the addr4 field, which is not a good idea if the packet being transmitted is actually a 4-address packet. Also, if the transmitted packet was a 4-address with QoS packet, the memmove() to do the needed header reshuffling would inadvertently overwrite the first two bytes of the packet payload with the QoS field. This fixes both of these issues. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: prevent corruption of QoS field on receive	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Packets exchanged between the mwl8k driver and the firmware always have a 4-address header without QoS field. For QoS packets, the QoS field is passed to/from the firmware via the tx/rx descriptors. We were handling this correctly on transmit, but not on receive -- if a QoS packet was received, we would leave garbage in the QoS field in the packet passed up to the stack, which is Bad(tm). Also, if the packet received on the air was a 4-address without QoS packet, we would forget to skb_pull the 2-byte DMA length prefix off. This patch adds an argument to the ->rxd_process() receive descriptor operation to retrieve the QoS field from the receive descriptor, and extends mwl8k_remove_dma_header() to insert this field back into the packet if the packet received is a QoS packet. It also fixes mwl8k_remove_dma_header() to strip off the length prefix in all cases. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: fix UPDATE_STADB command struct legacy_rates array length	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There exist 12 802.11b/g rates, but mwl8k supports two additional (non-standard) rates, and includes those rates in rate bitmasks and in its internal rate table that hardware rate indices index. Commit "mwl8k: report rate and other information for received frames" added one of the nonstandard rates to the mwl8k_rates table to make the OFDM rates in the table line up with the rate indices that are reported in the receive descriptor (so that we can just simply copy the receive descriptor rate index into ieee80211_rx_status::rate_idx) and bumped MWL8K_IEEE_LEGACY_DATA_RATES from 12 to 13, but this screwed up the UPDATE_STADB command struct layout, as it also uses that define, for its legacy_rates array. To avoid having to convert rate indices and legacy rate bitmaps (e.g. ieee80211_bss_conf::basic_rates) between the 12-rate mac80211 format and the 14-rate mwl8k format, we'll report all 14 rates in our wiphy's band, but filter out the nonstandard ones e.g. in the case of the UPDATE_STADB command which only accepts 12 rates. In the commands that accept 14 rates (SET_AID, SET_RATE), replace the use of the MWL8K_RATE_INDEX_MAX_ARRAY define in the command struct by the constant 14, to make it clearer that these commands accept 14 rates. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mwl8k: fix MCS bitmap size in SET_RATE command	Lennert Buytenhek	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MCS bitmaps in the SET_RATE command structure were of the wrong size, due to use of the wrong define for the array length. Just hardcode the lengths as 16, and do the same for the MCS bitmaps in other command structures. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mac80211: Fix dynamic power save for scanning.	Vivek Natarajan	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not only ps_sdata but also IEEE80211_CONF_PS is to be considered before restoring PS in scan_ps_disable(). For instance, when ps_sdata is set but CONF_PS is not set just because the dynamic timer is still running, a sw scan leads to setting of CONF_PS in scan_ps_disable instead of restarting the dynamic PS timer. Also for the above case, a null data frame is to be sent after returning to operating channel which was not happening with the current implementation. This patch fixes this too. Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com> Reviewed-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mac80211: recalculate idle later in MLME	Johannes Berg	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	hwsim testing has revealed that when the MLME recalculates the idle state of the device, it sometimes does so before sending the final deauthentication or disassociation frame. This patch changes the place where the idle state is recalculated, but of course driver transmit is typically asynchronous while configuration is expected to be synchronous, so it doesn't fix all possible cases yet. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	wl1251: don't build null data template in wl1251_op_config()	Kalle Valo	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bssid can be zero when null data template is set in wl1251_op_config(). It's enough, and especially safe, to set it once after association. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	wl1251: fix bssid handling	Kalle Valo	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bssid needs to be copied first in wl1251_op_bss_info_changed(), otherwise templates will have incorrect bssid and power save will not work correctly. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	wl1251: remove false warning messages	Kalle Valo	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a warning from wl1251_op_bss_info_changed(): wl1251: WARNING Set ctsprotect failed 0 It was printed always, it's completely false and can be removed. Signed-off-by: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	iwlwifi: fix warning from ieee80211_stop_tx_ba_cb_irqsafe argument change	John W. Linville	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CC [M] drivers/net/wireless/iwlwifi/iwl-tx.o drivers/net/wireless/iwlwifi/iwl-tx.c: In function ‘iwl_tx_agg_stop’: drivers/net/wireless/iwlwifi/iwl-tx.c:1356: warning: passing argument 1 of ‘ieee80211_stop_tx_ba_cb_irqsafe’ from incompatible pointer type include/net/mac80211.h:2128: note: expected ‘struct ieee80211_vif ’ but argument is of type ‘struct ieee80211_hw ’ Signed-off-by: John W. Linville <linville@tuxdriver.com>
* \|	sctp: fix compile error due to sysctl mismerge	Linus Torvalds	2009-12-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I messed up the merge in d7fc02c7bae7b1cf69269992cf880a43a350cdaa, where the conflict in question wasn't just about CTL_UNNUMBERED being removed, but the 'strategy' field is too (sysctl handling is now done through the /proc interface, with no duplicate protocols for reading the data). Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	Merge branch 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2009-12-08
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block: (113 commits) cfq-iosched: Do not access cfqq after freeing it block: include linux/err.h to use ERR_PTR cfq-iosched: use call_rcu() instead of doing grace period stall on queue exit blkio: Allow CFQ group IO scheduling even when CFQ is a module blkio: Implement dynamic io controlling policy registration blkio: Export some symbols from blkio as its user CFQ can be a module block: Fix io_context leak after failure of clone with CLONE_IO block: Fix io_context leak after clone with CLONE_IO cfq-iosched: make nonrot check logic consistent io controller: quick fix for blk-cgroup and modular CFQ cfq-iosched: move IO controller declerations to a header file cfq-iosched: fix compile problem with !CONFIG_CGROUP blkio: Documentation blkio: Wait on sync-noidle queue even if rq_noidle = 1 blkio: Implement group_isolation tunable blkio: Determine async workload length based on total number of queues blkio: Wait for cfq queue to get backlogged if group is empty blkio: Propagate cgroup weight updation to cfq groups blkio: Drop the reference to queue once the task changes cgroup blkio: Provide some isolation between groups ...
\| * \|	cfq-iosched: Do not access cfqq after freeing it	Vivek Goyal	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a crash during boot reported by Jeff Moyer. Fix the issue of accessing cfqq after freeing it. Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@carl.(none)>
\| * \|	block: include linux/err.h to use ERR_PTR	Stephen Rothwell	2009-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	cfq-iosched: use call_rcu() instead of doing grace period stall on queue exit	Jens Axboe	2009-12-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After the merge of the IO controller patches, booting on my megaraid box ran much slower. Vivek Goyal traced it down to megaraid discovery creating tons of devices, each suffering a grace period when they later kill that queue (if no device is found). So lets use call_rcu() to batch these deferred frees, instead of taking the grace period hit for each one. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Allow CFQ group IO scheduling even when CFQ is a module	Vivek Goyal	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Now issues of blkio controller and CFQ in module mode should be fixed. Enable the cfq group scheduling support in module mode. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Implement dynamic io controlling policy registration	Vivek Goyal	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o One of the goals of block IO controller is that it should be able to support mulitple io control policies, some of which be operational at higher level in storage hierarchy. o To begin with, we had one io controlling policy implemented by CFQ, and I hard coded the CFQ functions called by blkio. This created issues when CFQ is compiled as module. o This patch implements a basic dynamic io controlling policy registration functionality in blkio. This is similar to elevator functionality where ioschedulers register the functions dynamically. o Now in future, when more IO controlling policies are implemented, these can dynakically register with block IO controller. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Export some symbols from blkio as its user CFQ can be a module	Vivek Goyal	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o blkio controller is inside the kernel and cfq makes use of interfaces exported by blkio. CFQ can be a module too, hence export symbols used by CFQ. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	block: Fix io_context leak after failure of clone with CLONE_IO	Louis Rilling	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With CLONE_IO, parent's io_context->nr_tasks is incremented, but never decremented whenever copy_process() fails afterwards, which prevents exit_io_context() from calling IO schedulers exit functions. Give a task_struct to exit_io_context(), and call exit_io_context() instead of put_io_context() in copy_process() cleanup path. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	block: Fix io_context leak after clone with CLONE_IO	Louis Rilling	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With CLONE_IO, copy_io() increments both ioc->refcount and ioc->nr_tasks. However exit_io_context() only decrements ioc->refcount if ioc->nr_tasks reaches 0. Always call put_io_context() in exit_io_context(). Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	cfq-iosched: make nonrot check logic consistent	Shaohua Li	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cfq_arm_slice_timer() has logic to disable idle window for SSD device. The same thing should be done at cfq_select_queue() too, otherwise we will still see idle window. This makes the nonrot check logic consistent in cfq. Tests in a intel SSD with low_latency knob close, below patch can triple disk thoughput for muti-thread sequential read. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	io controller: quick fix for blk-cgroup and modular CFQ	Jens Axboe	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's currently not an allowed configuration, so express that in Kconfig. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	cfq-iosched: move IO controller declerations to a header file	Jens Axboe	2009-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	They should not be declared inside some other file that's not related to CFQ. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	cfq-iosched: fix compile problem with !CONFIG_CGROUP	Jens Axboe	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Documentation	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Wait on sync-noidle queue even if rq_noidle = 1	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o rq_noidle() is supposed to tell cfq that do not expect a request after this one, hence don't idle. But this does not seem to work very well. For example for direct random readers, rq_noidle = 1 but there is next request coming after this. Not idling, leads to a group not getting its share even if group_isolation=1. o The right solution for this issue is to scan the higher layers and set right flag (WRITE_SYNC or WRITE_ODIRECT). For the time being, this single line fix helps. This should not have any significant impact when we are not using cgroups. I will later figure out IO paths in higher layer and fix it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Implement group_isolation tunable	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o If a group is running only a random reader, then it will not have enough traffic to keep disk busy and we will reduce overall throughput. This should result in better latencies for random reader though. If we don't idle on random reader service tree, then this random reader will experience large latencies if there are other groups present in system with sequential readers running in these. o One solution suggested by corrado is that by default keep the random readers or sync-noidle workload in root group so that during one dispatch round we idle only once on sync-noidle tree. This means that all the sync-idle workload queues will be in their respective group and we will see service differentiation in those but not on sync-noidle workload. o Provide a tunable group_isolation. If set, this will make sure that even sync-noidle queues go in their respective group and we wait on these. This provides stronger isolation between groups but at the expense of throughput if group does not have enough traffic to keep the disk busy. o By default group_isolation = 0 Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Determine async workload length based on total number of queues	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Async queues are not per group. Instead these are system wide and maintained in root group. Hence their workload slice length should be calculated based on total number of queues in the system and not just queues in the root group. o As root group's default weight is 1000, make sure to charge async queue more in terms of vtime so that it does not get more time on disk because root group has higher weight. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Wait for cfq queue to get backlogged if group is empty	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o If a queue consumes its slice and then gets deleted from service tree, its associated group will also get deleted from service tree if this was the only queue in the group. That will make group loose its share. o For the queues on which we have idling on and if these have used their slice, wait a bit for these queues to get backlogged again and then expire these queues so that group does not loose its share. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Propagate cgroup weight updation to cfq groups	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Propagate blkio cgroup weight updation to associated cfq groups. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Drop the reference to queue once the task changes cgroup	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o If a task changes cgroup, drop reference to the cfqq associated with io context and set cfqq pointer stored in ioc to NULL so that upon next request arrival we will allocate a new queue in new group. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Provide some isolation between groups	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Do not allow following three operations across groups for isolation. - selection of co-operating queues - preemtpions across groups - request merging across groups. o Async queues are currently global and not per group. Allow preemption of an async queue if a sync queue in other group gets backlogged. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Export disk time and sectors used by a group to user space	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Export disk time and sector used by a group to user space through cgroup interface. o Also export a "dequeue" interface to cgroup which keeps track of how many a times a group was deleted from service tree. Helps in debugging. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Some debugging aids for CFQ	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Some debugging aids for CFQ. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Take care of cgroup deletion and cfq group reference counting	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o One can choose to change elevator or delete a cgroup. Implement group reference counting so that both elevator exit and cgroup deletion can take place gracefully. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Nauman Rafique <nauman@google.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Dynamic cfq group creation based on cgroup tasks belongs to	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Determine the cgroup IO submitting task belongs to and create the cfq group if it does not exist already. o Also link cfqq and associated cfq group. o Currently all async IO is mapped to root group. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Group time used accounting and workload context save restore	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o This patch introduces the functionality to do the accounting of group time when a queue expires. This time used decides which is the group to go next. o Also introduce the functionlity to save and restore the workload type context with-in group. It might happen that once we expire the cfq queue and group, a different group will schedule in and we will lose the context of the workload type. Hence save and restore it upon queue expiry. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| * \|	blkio: Implement per cfq group latency target and busy queue avg	Vivek Goyal	2009-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o So far we had 300ms soft target latency system wide. Now with the introduction of cfq groups, divide that latency by number of groups so that one can come up with group target latency which will be helpful in determining the workload slice with-in group and also the dynamic slice length of the cfq queue. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>