| author | David Brownell <david-b@pacbell.net> | 2006-04-01 13:21:52 -0500 |
|---|---|---|
| committer | Greg Kroah-Hartman <gregkh@suse.de> | 2006-04-14 15:25:26 -0400 |
| commit | 21440d313358043b0ce5e43b00ff3c9b35a8616c (patch) | |
| tree | 32f3ed659a76ad6e4a6061b57346178cf3fa6256 | |
| parent | 2d1e1c754d641bb8a32f0ce909dcff32906830ef (diff) | |
[PATCH] dma doc updates
This updates the DMA API documentation to address a few issues:
- The dma_map_sg() call results are used like pci_map_sg() results:
using sg_dma_address() and sg_dma_len(). That's not wholly obvious
to folk reading _only_ the "new" DMA-API.txt writeup.
- Buffers allocated by dma_alloc_coherent() may not be completely
free of coherency concerns ... some CPUs also have write buffers
that may need to be flushed.
- Cacheline coherence issues are now mentioned as being among issues
which affect dma buffers, and complicate/prevent using of static and
(especially) stack based buffers with the DMA calls.
I don't think many drivers currently need to worry about flushing write
buffers, but I did hit it with one SOC using external SDRAM for DMA
descriptors: without explicit writebuffer flushing, the on-chip DMA
controller accessed descriptors before the CPU completed the writes.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
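
The write-buffer problem described in the commit message can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct toy_bus, the toy_* helpers, and the WB_SLOTS/MEM_WORDS sizes are hypothetical names invented for illustration, not kernel or hardware interfaces. It shows that a device reading memory sees stale data until the CPU's buffered writes are drained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WB_SLOTS  4   /* hypothetical write-buffer depth */
#define MEM_WORDS 8   /* hypothetical size of the descriptor memory */

/* Toy model: memory visible to the device plus a CPU write buffer. */
struct toy_bus {
    uint32_t mem[MEM_WORDS];      /* what the DMA controller can see */
    size_t   pend_idx[WB_SLOTS];  /* buffered writes not yet in memory */
    uint32_t pend_val[WB_SLOTS];
    size_t   pending;
};

/* Drain every buffered write to memory, oldest first. */
static void toy_flush_write_buffer(struct toy_bus *bus)
{
    for (size_t i = 0; i < bus->pending; i++)
        bus->mem[bus->pend_idx[i]] = bus->pend_val[i];
    bus->pending = 0;
}

/* CPU store: lands in the write buffer, not yet in memory. */
static void toy_cpu_write(struct toy_bus *bus, size_t idx, uint32_t val)
{
    assert(idx < MEM_WORDS);
    if (bus->pending == WB_SLOTS)
        toy_flush_write_buffer(bus);  /* a full buffer drains itself */
    bus->pend_idx[bus->pending] = idx;
    bus->pend_val[bus->pending] = val;
    bus->pending++;
}

/* Device read: sees memory only, never the CPU's write buffer. */
static uint32_t toy_device_read(const struct toy_bus *bus, size_t idx)
{
    assert(idx < MEM_WORDS);
    return bus->mem[idx];
}
```

In this model, a descriptor written with toy_cpu_write() is invisible to toy_device_read() until toy_flush_write_buffer() runs, which is the ordering hazard the commit message reports. How a real driver flushes write buffers is platform specific and is not modeled here.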
| -rw-r--r-- | Documentation/DMA-API.txt | 49 | ||||
| -rw-r--r-- | Documentation/DMA-mapping.txt | 22 |
2 files changed, 53 insertions, 18 deletions
diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 1af0f2d50220..2ffb0d62f0fe 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
| @@ -33,7 +33,9 @@ pci_alloc_consistent(struct pci_dev *dev, size_t size, | |||
| 33 | 33 | ||
| 34 | Consistent memory is memory for which a write by either the device or | 34 | Consistent memory is memory for which a write by either the device or |
| 35 | the processor can immediately be read by the processor or device | 35 | the processor can immediately be read by the processor or device |
| 36 | without having to worry about caching effects. | 36 | without having to worry about caching effects. (You may however need |
| 37 | to make sure to flush the processor's write buffers before telling | ||
| 38 | devices to read that memory.) | ||
| 37 | 39 | ||
| 38 | This routine allocates a region of <size> bytes of consistent memory. | 40 | This routine allocates a region of <size> bytes of consistent memory. |
| 39 | it also returns a <dma_handle> which may be cast to an unsigned | 41 | it also returns a <dma_handle> which may be cast to an unsigned |
| @@ -304,12 +306,12 @@ dma address with dma_mapping_error(). A non zero return value means the mapping | |||
| 304 | could not be created and the driver should take appropriate action (eg | 306 | could not be created and the driver should take appropriate action (eg |
| 305 | reduce current DMA mapping usage or delay and try again later). | 307 | reduce current DMA mapping usage or delay and try again later). |
| 306 | 308 | ||
| 307 | int | 309 | int |
| 308 | dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, | 310 | dma_map_sg(struct device *dev, struct scatterlist *sg, |
| 309 | enum dma_data_direction direction) | 311 | int nents, enum dma_data_direction direction) |
| 310 | int | 312 | int |
| 311 | pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, | 313 | pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, |
| 312 | int nents, int direction) | 314 | int nents, int direction) |
| 313 | 315 | ||
| 314 | Maps a scatter gather list from the block layer. | 316 | Maps a scatter gather list from the block layer. |
| 315 | 317 | ||
| @@ -327,12 +329,33 @@ critical that the driver do something, in the case of a block driver | |||
| 327 | aborting the request or even oopsing is better than doing nothing and | 329 | aborting the request or even oopsing is better than doing nothing and |
| 328 | corrupting the filesystem. | 330 | corrupting the filesystem. |
| 329 | 331 | ||
| 330 | void | 332 | With scatterlists, you use the resulting mapping like this: |
| 331 | dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries, | 333 | |
| 332 | enum dma_data_direction direction) | 334 | int i, count = dma_map_sg(dev, sglist, nents, direction); |
| 333 | void | 335 | struct scatterlist *sg; |
| 334 | pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, | 336 | |
| 335 | int nents, int direction) | 337 | for (i = 0, sg = sglist; i < count; i++, sg++) { |
| 338 | hw_address[i] = sg_dma_address(sg); | ||
| 339 | hw_len[i] = sg_dma_len(sg); | ||
| 340 | } | ||
| 341 | |||
| 342 | where nents is the number of entries in the sglist. | ||
| 343 | |||
| 344 | The implementation is free to merge several consecutive sglist entries | ||
| 345 | into one (e.g. with an IOMMU, or if several pages just happen to be | ||
| 346 | physically contiguous) and returns the actual number of sg entries it | ||
| 347 | mapped them to. On failure 0, is returned. | ||
| 348 | |||
| 349 | Then you should loop count times (note: this can be less than nents times) | ||
| 350 | and use sg_dma_address() and sg_dma_len() macros where you previously | ||
| 351 | accessed sg->address and sg->length as shown above. | ||
| 352 | |||
| 353 | void | ||
| 354 | dma_unmap_sg(struct device *dev, struct scatterlist *sg, | ||
| 355 | int nhwentries, enum dma_data_direction direction) | ||
| 356 | void | ||
| 357 | pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, | ||
| 358 | int nents, int direction) | ||
| 336 | 359 | ||
| 337 | unmap the previously mapped scatter/gather list. All the parameters | 360 | unmap the previously mapped scatter/gather list. All the parameters |
| 338 | must be the same as those and passed in to the scatter/gather mapping | 361 | must be the same as those and passed in to the scatter/gather mapping |
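
The text added to DMA-API.txt says the mapping may merge consecutive entries, so the driver must loop over the returned count rather than nents. A minimal userspace sketch of that behaviour follows. It is not kernel code: struct toy_sg, toy_map_sg() and toy_load_hw() are hypothetical stand-ins for struct scatterlist, dma_map_sg() and the driver loop, and the model assumes bus addresses equal physical addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct scatterlist; not the kernel's layout. */
struct toy_sg {
    uint64_t phys;      /* CPU-side physical address of the chunk */
    uint32_t length;    /* CPU-side length of the chunk */
    uint64_t dma_addr;  /* filled in by toy_map_sg() */
    uint32_t dma_len;   /* filled in by toy_map_sg() */
};

/*
 * Toy mapping: consecutive entries that are physically contiguous are
 * merged. The resulting DMA segments are stored in the first entries of
 * the list. Returns the number of segments, or 0 on failure.
 */
static int toy_map_sg(struct toy_sg *sgl, int nents)
{
    int count = 0;

    if (sgl == NULL || nents <= 0)
        return 0;

    for (int i = 0; i < nents; i++) {
        if (count > 0 &&
            sgl[count - 1].dma_addr + sgl[count - 1].dma_len == sgl[i].phys) {
            sgl[count - 1].dma_len += sgl[i].length;  /* merge with previous */
        } else {
            sgl[count].dma_addr = sgl[i].phys;        /* start a new segment */
            sgl[count].dma_len = sgl[i].length;
            count++;
        }
    }
    return count;
}

/* Driver-side loop: iterate over count segments, not over nents. */
static void toy_load_hw(const struct toy_sg *sgl, int count,
                        uint64_t *hw_address, uint32_t *hw_len)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        hw_address[i] = sgl[i].dma_addr;
        hw_len[i] = sgl[i].dma_len;
    }
}
```

With three input chunks of which the first two are adjacent, toy_map_sg() returns 2, and only two hardware descriptors need to be programmed. A real IOMMU may merge entries that are not physically contiguous, which this sketch does not attempt to model.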
diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt
index 10bf4deb96aa..7c717699032c 100644
--- a/Documentation/DMA-mapping.txt
+++ b/Documentation/DMA-mapping.txt
| @@ -58,11 +58,15 @@ translating each of those pages back to a kernel address using | |||
| 58 | something like __va(). [ EDIT: Update this when we integrate | 58 | something like __va(). [ EDIT: Update this when we integrate |
| 59 | Gerd Knorr's generic code which does this. ] | 59 | Gerd Knorr's generic code which does this. ] |
| 60 | 60 | ||
| 61 | This rule also means that you may not use kernel image addresses | 61 | This rule also means that you may use neither kernel image addresses |
| 62 | (ie. items in the kernel's data/text/bss segment, or your driver's) | 62 | (items in data/text/bss segments), nor module image addresses, nor |
| 63 | nor may you use kernel stack addresses for DMA. Both of these items | 63 | stack addresses for DMA. These could all be mapped somewhere entirely |
| 64 | might be mapped somewhere entirely different than the rest of physical | 64 | different than the rest of physical memory. Even if those classes of |
| 65 | memory. | 65 | memory could physically work with DMA, you'd need to ensure the I/O |
| 66 | buffers were cacheline-aligned. Without that, you'd see cacheline | ||
| 67 | sharing problems (data corruption) on CPUs with DMA-incoherent caches. | ||
| 68 | (The CPU could write to one word, DMA would write to a different one | ||
| 69 | in the same cache line, and one of them could be overwritten.) | ||
| 66 | 70 | ||
| 67 | Also, this means that you cannot take the return of a kmap() | 71 | Also, this means that you cannot take the return of a kmap() |
| 68 | call and DMA to/from that. This is similar to vmalloc(). | 72 | call and DMA to/from that. This is similar to vmalloc(). |
| @@ -284,6 +288,11 @@ There are two types of DMA mappings: | |||
| 284 | 288 | ||
| 285 | in order to get correct behavior on all platforms. | 289 | in order to get correct behavior on all platforms. |
| 286 | 290 | ||
| 291 | Also, on some platforms your driver may need to flush CPU write | ||
| 292 | buffers in much the same way as it needs to flush write buffers | ||
| 293 | found in PCI bridges (such as by reading a register's value | ||
| 294 | after writing it). | ||
| 295 | |||
| 287 | - Streaming DMA mappings which are usually mapped for one DMA transfer, | 296 | - Streaming DMA mappings which are usually mapped for one DMA transfer, |
| 288 | unmapped right after it (unless you use pci_dma_sync_* below) and for which | 297 | unmapped right after it (unless you use pci_dma_sync_* below) and for which |
| 289 | hardware can optimize for sequential accesses. | 298 | hardware can optimize for sequential accesses. |
| @@ -303,6 +312,9 @@ There are two types of DMA mappings: | |||
| 303 | 312 | ||
| 304 | Neither type of DMA mapping has alignment restrictions that come | 313 | Neither type of DMA mapping has alignment restrictions that come |
| 305 | from PCI, although some devices may have such restrictions. | 314 | from PCI, although some devices may have such restrictions. |
| 315 | Also, systems with caches that aren't DMA-coherent will work better | ||
| 316 | when the underlying buffers don't share cache lines with other data. | ||
| 317 | |||
| 306 | 318 | ||
| 307 | Using Consistent DMA mappings. | 319 | Using Consistent DMA mappings. |
| 308 | 320 | ||
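
The cacheline-sharing problem that the DMA-mapping.txt hunks describe can be illustrated with a toy model of one cache line on a CPU whose cache is not DMA-coherent. Everything here is an assumption made for illustration: struct toy_line, the toy_* helpers and the four-word line size are hypothetical and do not correspond to any kernel interface or real cache geometry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_LINE_WORDS 4  /* hypothetical cache line: four 32-bit words */

/* One line of RAM together with the CPU's cached copy of it. */
struct toy_line {
    uint32_t mem[TOY_LINE_WORDS];    /* RAM contents, what DMA reads and writes */
    uint32_t cache[TOY_LINE_WORDS];  /* CPU's cached copy of the same line */
    int dirty;                       /* set once the CPU modifies the copy */
};

/* CPU pulls the whole line from RAM into its cache. */
static void toy_cache_fill(struct toy_line *l)
{
    memcpy(l->cache, l->mem, sizeof(l->cache));
    l->dirty = 0;
}

/* CPU store: updates only the cached copy. */
static void toy_cpu_store(struct toy_line *l, int word, uint32_t val)
{
    assert(word >= 0 && word < TOY_LINE_WORDS);
    l->cache[word] = val;
    l->dirty = 1;
}

/* DMA store: goes straight to RAM, bypassing the incoherent cache. */
static void toy_dma_store(struct toy_line *l, int word, uint32_t val)
{
    assert(word >= 0 && word < TOY_LINE_WORDS);
    l->mem[word] = val;
}

/* Writeback: the whole dirty line is copied out, stale words included. */
static void toy_cache_writeback(struct toy_line *l)
{
    if (l->dirty) {
        memcpy(l->mem, l->cache, sizeof(l->mem));
        l->dirty = 0;
    }
}
```

If the CPU stores to word 0 and a device then DMAs into word 1 of the same line, the later writeback overwrites the device's data with the stale cached value. Giving the DMA buffer cache lines of its own, as the documentation text recommends, keeps CPU stores and DMA stores from ever landing in the same line.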
