| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Define a new api that could be used for doing fancy data transfers
like interleaved to contiguous copy and vice-versa.
Traditional SG_list based transfers tend to be very inefficient in
such cases as where the interleave and chunk are only a few bytes,
which call for a very condensed api to convey pattern of the transfer.
This api supports all 4 variants of scatter-gather and contiguous transfer.
Of course, neither can this api help transfers that don't lend to DMA by
nature, i.e, scattered tiny read/writes with no periodic pattern.
Also since now we support SLAVE channels that might not provide
device_prep_slave_sg callback but device_prep_interleaved_dma,
remove the BUG_ON check.
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Acked-by: Barry Song <Baohua.Song@csr.com>
[renamed dmaxfer_template to dma_interleaved_template
did fixup after the enum dma_transfer_merge]
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmaengine: use DEFINE_IDR for static initialization
ioat: fix xor_idx_to_desc
Avoid section type conflict in dma/ioat/dma_v3.c
ioat: Adding PCI IDs for IOAT devices on SandyBridge platforms
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We could use DEFINE_IDR for statically allocated idr
that allow us to save a few lines of code.
And also remove unneeded mutex_init() for dma_list_mutex, as
dma_list_mutex is initialized automatically by DEFINE_MUTEX().
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (37 commits)
Improve slave/cyclic DMA engine documentation
dmaengine: pl08x: handle the rest of enums in pl08x_width
DMA: PL08x: cleanup selection of burst size
DMA: PL08x: avoid recalculating cctl at each prepare
DMA: PL08x: cleanup selection of buswidth
DMA: PL08x: constify plchan->cd and plat->slave_channels
DMA: PL08x: separately store source/destination cctl
DMA: PL08x: separately store source/destination slave address
DMA: PL08x: clean up LLI debugging
DMA: PL08x: select LLI bus only once per LLI setup
DMA: PL08x: remove unused constants
ARM: mxs-dma: reset after disable channel
dma: intel_mid_dma: remove redundant pci_set_drvdata calls
dma: mxs-dma: fix unterminated platform_device_id table
dmaengine: pl330: make platform data optional
dmaengine: imx-sdma: return proper error if kzalloc fails
pch_dma: Fix CTL register access issue
dmaengine: mxs-dma: skip request_irq for NO_IRQ
dmaengine/coh901318: fix slave submission semantics
dmaengine/ste_dma40: allow memory buswidth/burst to be configured
...
Fix trivial whitespace conflict in drivers/dma/mv_xor.c
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There exist systems with multiple DMA controllers with different
capabilities. For example, on some sh-mobile / rmobile systems there are
DMA controllers, whose channels can be configured to be used with
SD- and MMC-host controllers, serial ports etc. Besides there are also
DMA controllers, that can only be used for one special function, e.g.,
for USB. In such cases the DMA client filter function can just choose
to specify to the DMA driver, which channel it needs. Then the
.device_alloc_chan_resources() method of the DMA driver will check,
whether it can provide that dunction. If not, it will fail and the loop
in __dma_request_channel() will continue to the next DMA device, until
it finds a suitable one. This works fine with just one minor glitch:
the kernel logs error messages like
dmaengine: failed to get <channel name>: (-<error code>)
after each such non-critical failure. This patch lowers priority of
this message to the debug level.
Reported-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).
To prevent mm.h inclusion via other channels also extract "enum dma_data_direction"
definition into separate header. This tiny piece is what gluing netdevice.h with mm.h
via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
Removal of mm.h from scatterlist.h was tried and was found not feasible
on most archs, so the link was cutoff earlier.
Hope people are OK with tiny include file.
Note, that mm_types.h is still dragged in, but it is a separate story.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The majority of drivers in drivers/dma/ will never establish cross
channel operation chains and do not need the extra overhead in struct
dma_async_tx_descriptor. Make channel switching opt-in by default.
Cc: Anatolij Gustschin <agust@denx.de>
Cc: Ira Snyder <iws@ovro.caltech.edu>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\ \ |
|
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Cyclic transfers are useful for audio where a single buffer divided
in periods has to be transfered endlessly until stopped. After being
prepared the transfer is started using the dma_async_descriptor->tx_submit
function. dma_async_descriptor->callback is called after each period.
The transfer is stopped using the DMA_TERMINATE_ALL callback.
While being used for cyclic transfers the channel cannot be used
for other transfer types.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for scatterlist to scatterlist DMA transfers. A
similar interface is exposed by the fsldma driver (through the DMA_SLAVE
API) and by the ste_dma40 driver (through an exported function).
This patch paves the way for making this type of copy operation a part
of the generic DMAEngine API. Futher patches will add support in
individual drivers.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\ |
|
| |
| |
| |
| |
| |
| |
| | |
Saves 24 bytes per descriptor (64-bit) when the channel-switching
capabilities of async_tx are not required.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The member 'private' of 'struct dma_chan' is meant for passing
data between client and the controller driver.
The DMA client driver may point it to platform specific stuff after
acquiring the channel. So, it is the responsiblity of the same code
to reset it, if it must.
The DMA engine doesn't set it and hence, shouldn't reset it either.
This reseting of private by DMA Engine comes in the way of implementing
default channel settings during DMAC probe. That capability is useful
for not having the clients to always provide platform specific data,
like Rx/Tx FIFO addresses, which usually doesn't change across channel
requests.
Signed-off-by: Jassi Brar <jassi.brar@samsung.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Convert the device_is_tx_complete() operation on the
DMA engine to a generic device_tx_status()operation which
can return three states, DMA_TX_RUNNING, DMA_TX_COMPLETE,
DMA_TX_PAUSED.
[dan.j.williams@intel.com: update for timberdale]
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Li Yang <leoli@freescale.com>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Magnus Damm <damm@opensource.se>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Cc: Joe Perches <joe@perches.com>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert the device_terminate_all() operation on the
DMA engine to a generic device_control() operation
which can now optionally support also pausing and
resuming DMA on a certain channel. Implemented for the
COH 901 318 DMAC as an example.
[dan.j.williams@intel.com: update for timberdale]
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Li Yang <leoli@freescale.com>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Magnus Damm <damm@opensource.se>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Cc: Joe Perches <joe@perches.com>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: add __percpu sparse annotations to what's left
percpu: add __percpu sparse annotations to fs
percpu: add __percpu sparse annotations to core kernel subsystems
local_t: Remove leftover local.h
this_cpu: Remove pageset_notifier
this_cpu: Page allocator conversion
percpu, x86: Generic inc / dec percpu instructions
local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c
module: Use this_cpu_xx to dynamically allocate counters
local_t: Remove cpu_local_xx macros
percpu: refactor the code in pcpu_[de]populate_chunk()
percpu: remove compile warnings caused by __verify_pcpu_ptr()
percpu: make accessors check for percpu pointer in sparse
percpu: add __percpu for sparse.
percpu: make access macros universal
percpu: remove per_cpu__ prefix.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add __percpu sparse annotations to places which didn't make it in one
of the previous patches. All converions are trivial.
These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Neil Brown <neilb@suse.de>
|
|/
|
|
|
|
|
|
|
|
|
| |
While debugging a dma driver I noticed a memleak after
unloading the driver module.
Caught by kmemleak.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
m68k: rename global variable vmalloc_end to m68k_vmalloc_end
percpu: add missing per_cpu_ptr_to_phys() definition for UP
percpu: Fix kdump failure if booted with percpu_alloc=page
percpu: make misc percpu symbols unique
percpu: make percpu symbols in ia64 unique
percpu: make percpu symbols in powerpc unique
percpu: make percpu symbols in x86 unique
percpu: make percpu symbols in xen unique
percpu: make percpu symbols in cpufreq unique
percpu: make percpu symbols in oprofile unique
percpu: make percpu symbols in tracer unique
percpu: make percpu symbols under kernel/ and mm/ unique
percpu: remove some sparse warnings
percpu: make alloc_percpu() handle array types
vmalloc: fix use of non-existent percpu variable in put_cpu_var()
this_cpu: Use this_cpu_xx in trace_functions_graph.c
this_cpu: Use this_cpu_xx for ftrace
this_cpu: Use this_cpu_xx in nmi handling
this_cpu: Use this_cpu operations in RCU
this_cpu: Use this_cpu ops for VM statistics
...
Fix up trivial (famous last words) global per-cpu naming conflicts in
arch/x86/kvm/svm.c
mm/slab.c
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are cases where we can use this_cpu_ptr and as the result
of using this_cpu_ptr() we no longer need to determine the
currently executing cpu.
In those places no get/put_cpu combination is needed anymore.
The local cpu variable can be eliminated.
Preemption still needs to be disabled and enabled since the
modifications of the per cpu variables is not atomic. There may
be multiple per cpu variables modified and those must all
be from the same processor.
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
cc: Eric Biederman <ebiederm@aristanetworks.com>
cc: Stephen Hemminger <shemminger@vyatta.com>
cc: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
ioat3.2 does not support asynchronous error notifications which makes
the driver experience latencies when non-zero pq validate results are
expected. Provide a mechanism for turning off async_xor_val and
async_syndrome_val via Kconfig. This approach is generally useful for
any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like
to force the async_tx api to fall back to the synchronous path for
certain operations.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|/
|
|
|
|
|
| |
A channel must include these capabilities to satisfy
ASYNC_TX_DISABLE_CHANNEL_SWITCH.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
crypto/async_tx/async_xor.c
drivers/dma/ioat/dma_v2.h
drivers/dma/ioat/pci.c
drivers/md/raid5.c
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The tx_list attribute of struct dma_async_tx_descriptor is common to
most, but not all dma driver implementations. None of the upper level
code (dmaengine/async_tx) uses it, so allow drivers to implement it
locally if they need it. This saves sizeof(struct list_head) bytes for
drivers that do not manage descriptors with a linked list (e.g.: ioatdma
v2,3).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Channel switching is problematic for some dmaengine drivers as the
architecture precludes separating the ->prep from ->submit. In these
cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify
the async_tx allocator to only return channels that support all of the
required asynchronous operations.
For example MD_RAID456=y selects support for asynchronous xor, xor
validate, pq, pq validate, and memcpy. When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these
capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to
quickly locate compatible channels with the guarantee that dependency
chains will remain on one channel. When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select
channels that lead to operation chains that need to cross channel
boundaries using the async_tx channel switch capability.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\ \
| |/
|/|
| |
| | |
Conflicts:
include/linux/dmaengine.h
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[ Based on an original patch by Yuri Tikhonov ]
This adds support for doing asynchronous GF multiplication by adding
two additional functions to the async_tx API:
async_gen_syndrome() does simultaneous XOR and Galois field
multiplication of sources.
async_syndrome_val() validates the given source buffers against known P
and Q values.
When a request is made to run async_pq against more than the hardware
maximum number of supported sources we need to reuse the previous
generated P and Q values as sources into the next operation. Care must
be taken to remove Q from P' and P from Q'. For example to perform a 5
source pq op with hardware that only supports 4 sources at a time the
following approach is taken:
p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))
p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4
Note: 4 is the minimum acceptable maxpq otherwise we punt to
synchronous-software path.
The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
sources (in the above manner) and fill the remaining slots up to maxpq
with the new sources/coefficients.
Note1: Some devices have native support for P+Q continuation and can skip
this extra work. Devices with this capability can advertise it with
dma_set_maxpq. It is up to each driver how to handle the
DMA_PREP_CONTINUE flag.
Note2: The api supports disabling the generation of P when generating Q,
this is ignored by the synchronous path but is implemented by some dma
devices to save unnecessary writes. In this case the continuation
algorithm is simplified to only reuse Q as a source.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We currently walk the parent chain when waiting for a given tx to
complete however this walk may race with the driver cleanup routine.
The routines in async_raid6_recov.c may fall back to the synchronous
path at any point so we need to be prepared to call async_tx_quiesce()
(which calls dma_wait_for_async_tx). To remove the ->parent walk we
guarantee that every time a dependency is attached ->issue_pending() is
invoked, then we can simply poll the initial descriptor until
completion.
This also allows for a lighter weight 'issue pending' implementation as
there is no longer a requirement to iterate through all the channels'
->issue_pending() routines as long as operations have been submitted in
an ordered chain. async_tx_issue_pending() is added for this case.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
'zero_sum' does not properly describe the operation of generating parity
and checking that it validates against an existing buffer. Change the
name of the operation to 'val' (for 'validate'). This is in
anticipation of the p+q case where it is a requirement to identify the
target parity buffers separately from the source buffers, because the
target parity buffers will not have corresponding pq coefficients.
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
as reported by Alexander Beregalov <a.beregalov@gmail.com>
ioatdma 0000:00:08.0: DMA-API: device driver frees DMA memory with
wrong function [device address=0x000000007f76f800] [size=2000 bytes]
[map
ped as single] [unmapped as page]
The ioatdma driver was unmapping all regions
(either allocated as page or single) using unmap_page.
This patch lets dma driver recognize if unmap_single or unmap_page should be used.
It introduces two new dma control flags:
DMA_COMPL_SRC_UNMAP_SINGLE and DMA_COMPL_DEST_UNMAP_SINGLE.
They should be set to indicate dma driver to do dma-unmapping as single
(first one for the source, tha latter for the destination).
If respective flag is not set, the driver assumes dma-unmapping as page.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently dma_request_channel() set DMA_PRIVATE capability but never
clear it. So if a public channel was once grabbed by
dma_request_channel(), the device stay PRIVATE forever. Add
privatecnt member to dma_device to correctly revert it.
[lg@denx.de: fix bad usage of 'chan' in dma_async_device_register]
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
| |
Centralize this common initialization (and one case where ipu_idmac is
duplicating ->chan initialization).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Atsushi points out:
"If alloc_percpu or kzalloc failed, chan_id does not match with its
position in device->channels list.
And above "continue" looks buggy anyway. Keeping incomplete channels
in device->channels list looks very dangerous..."
Also, fix up leakage of idr_ref in the idr_pre_get() and channel init
fail cases.
Reported-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The conversion of atmel-mci to dma_request_channel missed the
initialization of the channel dma_slave information. The filter_fn passed
to dma_request_channel is responsible for initializing the channel's
private data. This implementation has the additional benefit of enabling
a generic client-channel data passing mechanism.
Reviewed-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
| |
dma_find_channel and dma_issue_pending_all are good places to warn about
improper api usage. However, warning correctly means synchronizing with
dma_list_mutex, i.e. too much overhead for these fast-path calls.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In dmaengine we track the dependencies between the descriptors
using the 'next' pointers of the structure. These pointers are
set to NULL as soon as the corresponding descriptor has been
submitted to the channel (in dma_run_dependencies()).
But, the first 'next' in chain is still remaining set, regardless
the fact, that tx->next has been already submitted. This may lead to
multiple submissions of the same descriptor. This patch fixes this.
Actually, some previous implementation of the xxx_run_dependencies()
function already had this fix in place. The fdb..0eaf3 commit, beside the
correct things, broke this.
Cc: <stable@kernel.org>
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are dmaengine users that would like to register dma devices at
subsys_initcall time to ensure channels are available by device_initcall
time.
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow dma_filter_fn routines to disambiguate multiple channels on a device
rather than assuming that all channels on a device are equal.
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Reported-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This brings some predictability to dma device numbers, i.e. an rmmod/insmod
cycle may now result in /sys/class/dma/dma0chan0 being restored rather than
/sys/class/dma/dma1chan0 appearing.
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolves:
WARNING: at drivers/base/core.c:122 device_release+0x4d/0x52()
Device 'dma0chan0' does not have a release() function, it is broken and must be fixed.
The dma_chan_dev object is introduced to gear-match sysfs kobject and
dmaengine channel lifetimes. When a channel is removed access to the
sysfs entries return -ENODEV until the kobject can be released.
The bulk of the change is updates to existing code to handle the extra
layer of indirection between a dma_chan and its struct device.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
| |
DMA_NAK is now useless. We can just use a bool instead.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
Reference counting is done at the module level so clients need not worry
that a channel will leave while they are actively using dmaengine.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
All users have been converted to either the general-purpose allocator,
dma_find_channel, or dma_request_channel.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
Now that clients no longer need to be notified of channel arrival
dma_async_client_register can simply increment the dmaengine_ref_count.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
dma_request_channel provides an exclusive channel, so we no longer need to
pass slave data through dmaengine.
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This interface is primarily for device-to-memory clients which need to
search for dma channels with platform-specific characteristics. The
prototype is:
struct dma_chan *dma_request_channel(dma_cap_mask_t mask,
dma_filter_fn filter_fn,
void *filter_param);
When the optional 'filter_fn' parameter is set to NULL
dma_request_channel simply returns the first channel that satisfies the
capability mask. Otherwise, when the mask parameter is insufficient for
specifying the necessary channel, the filter_fn routine can be used to
disposition the available channels in the system. The filter_fn routine
is called once for each free channel in the system. Upon seeing a
suitable channel filter_fn returns DMA_ACK which flags that channel to
be the return value from dma_request_channel. A channel allocated via
this interface is exclusive to the caller, until dma_release_channel()
is called.
To ensure that all channels are not consumed by the general-purpose
allocator the DMA_PRIVATE capability is provided to exclude a dma_device
from general-purpose (memory-to-memory) consideration.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
async_tx and net_dma each have open-coded versions of issue_pending_all,
so provide a common routine in dmaengine.
The implementation needs to walk the global device list, so implement
rcu to allow dma_issue_pending_all to run lockless. Clients protect
themselves from channel removal events by holding a dmaengine reference.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allowing multiple clients to each define their own channel allocation
scheme quickly leads to a pathological situation. For memory-to-memory
offload all clients can share a central allocator.
This simply moves the existing async_tx allocator to dmaengine with
minimal fixups:
* async_tx.c:get_chan_ref_by_cap --> dmaengine.c:nth_chan
* async_tx.c:async_tx_rebalance --> dmaengine.c:dma_channel_rebalance
* split out common code from async_tx.c:__async_tx_find_channel -->
dma_find_channel
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simply, if a client wants any dmaengine channel then prevent all dmaengine
modules from being removed. Once the clients are done re-enable module
removal.
Why?, beyond reducing complication:
1/ Tracking reference counts per-transaction in an efficient manner, as
is currently done, requires a complicated scheme to avoid cache-line
bouncing effects.
2/ Per-transaction ref-counting gives the false impression that a
dma-driver can be gracefully removed ahead of its user (net, md, or
dma-slave)
3/ None of the in-tree dma-drivers talk to hot pluggable hardware, but
if such an engine were built one day we still would not need to notify
clients of remove events. The driver can simply return NULL to a
->prep() request, something that is much easier for a client to handle.
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|