1 files changed, 190 insertions, 0 deletions
diff --git a/Documentation/driver-api/dmaengine/pxa_dma.rst b/Documentation/driver-api/dmaengine/pxa_dma.rst
new file mode 100644
index 000000000000..442ee691a190
--- /dev/null
+++ b/Documentation/driver-api/dmaengine/pxa_dma.rst
@@ -0,0 +1,190 @@
+==============================
+PXA/MMP - DMA Slave controller
+==============================
+Constraints
+===========
+a) Transfers hot queuing
+A driver submitting a transfer and issuing it should be guaranteed that the
+transfer is queued even on a running DMA channel.
+This implies that the queuing doesn't wait for the previous transfer to end,
+and that the descriptor chaining is not only done in the irq/tasklet code
+triggered by the end of the transfer.
+A transfer which is submitted and issued on a phy doesn't wait for the phy to
+stop and restart, but is submitted on a "running channel". The other
+drivers, especially mmp_pdma, waited for the phy to stop before relaunching
+a new transfer.
+b) All transfers having asked for confirmation should be signaled
+Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
+This implies that even if an irq/tasklet is triggered by the end of tx1, but
+tx2 has already finished by the time the irq is handled, both
+tx1->complete() and tx2->complete() should be called.
+c) Channel running state
+A driver should be able to query if a channel is running or not. For the
+multimedia case, such as video capture, if a transfer is submitted and then
+a check of the DMA channel reports a "stopped channel", the transfer should
+not be issued until the next "start of frame interrupt", hence the need to
+know if a channel is in running or stopped state.
+d) Bandwidth guarantee
+The PXA architecture has 4 levels of DMA priorities: high, normal, low.
+The high priorities get twice as much bandwidth as the normal, which get twice
+as much as the low priorities.
+A driver should be able to request a priority, especially the real-time
+ones such as pxa_camera, which need high throughput.
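
As a worked example of the ratios stated above, the following user-space Python sketch computes relative bus shares. It assumes, hypothetically, that bandwidth is split among competing requesters in proportion to a per-priority weight; the `WEIGHTS` table and `bandwidth_shares()` are illustrative names, not driver symbols, and only the three priorities named in the text are modelled.

```python
from fractions import Fraction

# Relative weights derived from the text: high = 2 x normal, normal = 2 x low.
# Assumption: shares are proportional to these weights (not taken from the driver).
WEIGHTS = {"high": 4, "normal": 2, "low": 1}


def bandwidth_shares(priorities):
    """Return each requester's share of the bus as an exact fraction."""
    total = sum(WEIGHTS[p] for p in priorities)
    return [Fraction(WEIGHTS[p], total) for p in priorities]


# One requester at each of the three named priorities.
shares = bandwidth_shares(["high", "normal", "low"])
```

Under this assumption the three requesters get 4/7, 2/7 and 1/7 of the bandwidth respectively.
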
+Design
+======
+a) Virtual channels
+Same concept as in the sa11x0 driver, i.e. a driver is assigned a "virtual
+channel" linked to the requestor line, and the physical DMA channel is
+assigned on the fly when the transfer is issued.
+b) Transfer anatomy for a scatter-gather transfer
+::
+   +------------+-----+---------------+----------------+-----------------+
+   | desc-sg[0] | ... | desc-sg[last] | status updater | finisher/linker |
+   +------------+-----+---------------+----------------+-----------------+
+This structure is pointed to by dma->sg_cpu.
+The descriptors are used as follows:
+    - desc-sg[i]: i-th descriptor, transferring the i-th sg
+      element to the video buffer scatter gather
+    - status updater
+      Transfers a single u32 to a well-known DMA coherent memory location to
+      leave a trace that this transfer is done. This location is unique per
+      physical channel, meaning that a read of this value will tell which
+      is the last finished transfer at that point in time.
+    - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN
+    - linker: has ddadr= desc-sg[0] of next transfer, dcmd=0
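
The layout described above can be sketched as a small user-space Python model. This is only an illustration of the descriptor ordering: `Desc` and `build_transfer()` are hypothetical names, and register values are kept as symbolic strings copied from the text rather than real bit values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Desc:
    role: str                    # "sg", "updater", "finisher" or "linker"
    ddadr: Optional[str] = None  # symbolic next-descriptor address
    dcmd: str = ""               # symbolic command flags


def build_transfer(nb_sg, next_transfer=None):
    """Return the descriptor list of one scatter-gather transfer."""
    descs = [Desc(role="sg") for _ in range(nb_sg)]
    descs.append(Desc(role="updater"))
    if next_transfer is None:
        # Last transfer of the chain: stop the channel and raise the end irq.
        descs.append(Desc("finisher", ddadr="DADDR_STOP", dcmd="ENDIRQEN"))
    else:
        # Chained transfer: jump to the first descriptor of the next one.
        descs.append(Desc("linker", ddadr=f"{next_transfer}.desc-sg[0]", dcmd="0"))
    return descs


alone = build_transfer(nb_sg=3)
chained = build_transfer(nb_sg=2, next_transfer="tx2")
```
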
+c) Transfers hot-chaining
+Suppose the running chain is:
+::
+   Buffer 1              Buffer 2
+   +---------+----+---+  +----+----+----+---+
+   | d0 | .. | dN | l |  | d0 | .. | dN | f |
+   +---------+----+-|-+  ^----+----+----+---+
+                    |    |
+                    +----+
+After a call to dmaengine_submit(b3), the chain will look like:
+::
+   Buffer 1              Buffer 2              Buffer 3
+   +---------+----+---+  +----+----+----+---+  +----+----+----+---+
+   | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
+   +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
+                    |    |                |    |
+                    +----+                +----+
+                                         new_link
+If the DMA channel stopped while new_link was being created, it is _not_
+restarted. Hot-chaining doesn't break the assumption that
+dma_async_issue_pending() is to be used to ensure the transfer is actually started.
+One exception to this rule:
+- if Buffer1 and Buffer2 had all their addresses 8 bytes aligned
+- and if Buffer3 has at least one address not 4 bytes aligned
+- then hot-chaining cannot happen, as the channel must be stopped, the
+  "align bit" must be set, and the channel restarted. As a consequence,
+  the tx_submit() of such a transfer queues it on the submitted queue
+  without chaining it, in this specific case where the DMA is already
+  running in aligned mode.
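
The hot-chaining rule and its alignment exception can be sketched with a simplified user-space Python model. It is not the driver's code: `Transfer`, `Channel` and their fields are hypothetical, alignment is reduced to a single `misaligned` boolean per transfer, and stopping or restarting the hardware is not modelled.

```python
class Transfer:
    def __init__(self, name, misaligned=False):
        self.name = name
        self.misaligned = misaligned
        self.tail = "finisher"   # role of the last descriptor
        self.next = None         # transfer reached through the linker


class Channel:
    def __init__(self):
        self.running_chain = []      # transfers the hardware is walking
        self.submitted = []          # transfers waiting to be issued
        self.misaligned_mode = False # False: running in aligned mode

    def submit(self, tx):
        """Hot-chain tx when allowed, else park it on the submitted queue."""
        needs_restart = tx.misaligned and not self.misaligned_mode
        if self.running_chain and not needs_restart:
            last = self.running_chain[-1]
            # The previous finisher becomes a linker pointing at tx.
            last.tail, last.next = "linker", tx
            self.running_chain.append(tx)
            return "hot-chained"
        self.submitted.append(tx)
        return "queued"


chan = Channel()
b1, b2 = Transfer("b1"), Transfer("b2")
b1.tail, b1.next = "linker", b2
chan.running_chain = [b1, b2]

b3_result = chan.submit(Transfer("b3"))
b4_result = chan.submit(Transfer("b4", misaligned=True))
```

Here `b3` is linked behind `b2`, whose finisher turns into a linker, while the misaligned `b4` stays on the submitted queue.
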
+d) Transfers completion updater
+Each time a transfer is completed on a channel, an interrupt may or may not
+be generated, depending on the client's request. But in each case, the
+"status updater" descriptor of the transfer writes the latest completed
+transfer into the physical channel's completion mark.
+This speeds up residue calculation for large transfers such as video
+buffers, which hold around 6k descriptors or more. It also makes it possible
+to find the latest completed transfer in a running DMA chain without taking
+any lock.
+e) Transfers completion, irq and tasklet
+When a transfer flagged as "DMA_PREP_INTERRUPT" is finished, the dma irq
+is raised. Upon this interrupt, a tasklet is scheduled for the physical
+channel.
+The tasklet is responsible for:
+- reading the physical channel's last updater mark
+- calling all the transfer callbacks of finished transfers, based on
+  that mark and on each transfer's flags.
+If a transfer completes while this handling is in progress, a dma irq will
+be raised, and the tasklet will be scheduled once again, with a new
+updater mark.
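
The tasklet behaviour can be illustrated with a minimal user-space Python sketch. It assumes, for illustration only, that the completion mark holds the cookie of the last finished transfer and that cookies increase monotonically without wrapping; `run_tasklet()` and the dictionary keys are hypothetical names.

```python
from collections import deque


def run_tasklet(issued, completion_mark, called):
    """Complete every issued transfer up to the mark; return how many."""
    handled = 0
    while issued and issued[0]["cookie"] <= completion_mark:
        tx = issued.popleft()
        if tx["want_callback"]:       # models the DMA_PREP_INTERRUPT flag
            called.append(tx["name"])
        handled += 1
    return handled


issued = deque([
    {"name": "tx1", "cookie": 1, "want_callback": True},
    {"name": "tx2", "cookie": 2, "want_callback": True},
    {"name": "tx3", "cookie": 3, "want_callback": False},
    {"name": "tx4", "cookie": 4, "want_callback": True},
])
called = []

# The irq was raised at the end of tx1, but tx2 and tx3 had also
# finished by the time the tasklet read the mark.
handled = run_tasklet(issued, completion_mark=3, called=called)
```

A single tasklet run therefore signals both `tx1` and `tx2`, retires `tx3` silently, and leaves `tx4` on the issued queue.
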
+f) Residue
+Residue granularity will be descriptor based. The issued but not completed
+transfers will be scanned for all of their descriptors against the
+currently running descriptor.
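
A descriptor-granular residue can be sketched in user-space Python as follows. The sketch assumes that the descriptor currently being executed is counted in full, since progress inside a descriptor is not visible at this granularity; the function and parameter names are hypothetical.

```python
def residue(desc_lengths, running_index):
    """Bytes left in one transfer, counted in whole descriptors.

    desc_lengths:  byte length of each sg descriptor, in chain order.
    running_index: index of the descriptor the hardware is executing,
                   or None if the transfer has not started yet.
    """
    if running_index is None:
        return sum(desc_lengths)
    # Everything before the running descriptor is done; the running
    # descriptor and all the following ones still count.
    return sum(desc_lengths[running_index:])


lengths = [4096, 4096, 2048]
not_started = residue(lengths, None)
in_second = residue(lengths, 1)
in_last = residue(lengths, 2)
```
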
+g) Most complicated case of driver's tx queues
+The trickiest situation is when:
+ - there are transfers which are not yet "acked" (tx0)
+ - a driver submitted an aligned tx1, not chained
+ - a driver submitted an aligned tx2 => tx2 is cold chained to tx1
+ - a driver issued tx1+tx2 => channel is running in aligned mode
+ - a driver submitted an aligned tx3 => tx3 is hot-chained
+ - a driver submitted an unaligned tx4 => tx4 is put in submitted queue,
+   not chained
+ - a driver issued tx4 => tx4 is put in issued queue, not chained
+ - a driver submitted an aligned tx5 => tx5 is put in submitted queue, not
+   chained
+ - a driver submitted an aligned tx6 => tx6 is put in submitted queue,
+   cold chained to tx5
+ This translates into (after tx4 is issued) :
+ - issued queue
+ ::
+      +-----+ +-----+ +-----+ +-----+
+      | tx1 | | tx2 | | tx3 | | tx4 |
+      +---|-+ ^---|-+ ^-----+ +-----+
+          |   |   |   |
+          +---+   +---+
+        - submitted queue
+      +-----+ +-----+
+      | tx5 | | tx6 |
+      +---|-+ ^-----+
+          |   |
+          +---+
+- completed queue : empty
+- allocated queue : tx0
+It should be noted that after tx3 is completed, the channel is stopped, and
+restarted in "unaligned mode" to handle tx4.
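
The scenario above can be replayed with a simplified user-space Python model of the submitted and issued queues. The rules encoded here are assumptions chosen to reproduce the described outcome, not the driver's logic: a transfer is hot-chained only if the channel runs in aligned mode, the submitted queue is empty, the transfer is aligned, and the tail of the issued queue belongs to the running hardware chain; cold chaining links transfers of equal alignment. `Tx` and `Chan` are hypothetical names, and tx0 (allocated, not yet acked) is left out.

```python
class Tx:
    def __init__(self, name, aligned=True):
        self.name, self.aligned = name, aligned
        self.next = None          # chained successor, if any
        self.in_hw_chain = False  # reachable by the running hardware chain


class Chan:
    def __init__(self):
        self.submitted, self.issued = [], []
        self.running = False      # assumed to run in aligned mode

    def submit(self, tx):
        tail = self.issued[-1] if self.issued else None
        if (self.running and not self.submitted and tail is not None
                and tail.in_hw_chain and tx.aligned):
            tail.next, tx.in_hw_chain = tx, True      # hot-chain
            self.issued.append(tx)
            return
        if self.submitted and self.submitted[-1].aligned == tx.aligned:
            self.submitted[-1].next = tx              # cold-chain
        self.submitted.append(tx)

    def issue_pending(self):
        if not self.running and self.submitted:
            self.running = True
            tx = self.submitted[0]
            while tx is not None:   # the whole cold chain starts at once
                tx.in_hw_chain = True
                tx = tx.next
        self.issued += self.submitted
        self.submitted = []


chan = Chan()
tx1, tx2, tx3 = Tx("tx1"), Tx("tx2"), Tx("tx3")
tx4, tx5, tx6 = Tx("tx4", aligned=False), Tx("tx5"), Tx("tx6")
chan.submit(tx1)
chan.submit(tx2)
chan.issue_pending()   # tx1+tx2 issued, channel running
chan.submit(tx3)       # hot-chained
chan.submit(tx4)       # unaligned: submitted queue, not chained
chan.issue_pending()   # tx4 issued, still not chained
chan.submit(tx5)
chan.submit(tx6)       # cold-chained to tx5
```
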
+Author: Robert Jarzmik <robert.jarzmik@free.fr>
