author | Yinghai Lu <yinghai@kernel.org> | 2015-05-27 20:23:51 -0400 |
---|---|---|
committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2015-07-21 13:10:05 -0400 |
commit | 30e8a1821385dbd85830b407103f28d989764911 (patch) | |
tree | e7cb290a5616c2571d58410f1f37a305346dd835 /Documentation | |
parent | 7044198591216ee701f98eeefecabc9d0ad6c2ef (diff) |
PCI: Add pci_bus_addr_t

commit 3a9ad0b4fdcd57f775d3615004c8c64c021a9e7d upstream.

David Ahern reported that d63e2e1f3df9 ("sparc/PCI: Clip bridge windows
to fit in upstream windows") fails to boot on sparc/T5-8:

  pci 0000:06:00.0: reg 0x184: can't handle BAR above 4GB (bus address 0x110204000)

The problem is that sparc64 assumed that dma_addr_t only needed to hold DMA
addresses, i.e., bus addresses returned via the DMA API (dma_map_single(),
etc.), while the PCI core assumed dma_addr_t could hold *any* bus address,
including raw BAR values. On sparc64, all DMA addresses fit in 32 bits, so
dma_addr_t is a 32-bit type. However, BAR values can be 64 bits wide, so
they don't fit in a dma_addr_t. d63e2e1f3df9 added new checking that
tripped over this mismatch.

Add pci_bus_addr_t, which is wide enough to hold any PCI bus address,
including both raw BAR values and DMA addresses. This will be 64 bits
on 64-bit platforms and on platforms with a 64-bit dma_addr_t. Then
dma_addr_t only needs to be wide enough to hold addresses from the DMA API.

[bhelgaas: changelog, bugzilla, Kconfig to ensure pci_bus_addr_t is at
least as wide as dma_addr_t, documentation]
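
The width mismatch described in this changelog can be illustrated with a
small user-space sketch. It is not kernel code: the typedef names mirror the
kernel's, but the 32-bit dma_addr_t is an assumption that models sparc64 as
described above, and fits_in_dma_addr() is a hypothetical helper invented for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: models sparc64, where all DMA addresses fit in 32 bits. */
typedef uint32_t dma_addr_t;

/* Wide enough for any PCI bus address, including raw 64-bit BAR values. */
typedef uint64_t pci_bus_addr_t;

/* Compile-time analogue of the rule that pci_bus_addr_t must be at
 * least as wide as dma_addr_t. */
static_assert(sizeof(pci_bus_addr_t) >= sizeof(dma_addr_t),
	      "pci_bus_addr_t must be at least as wide as dma_addr_t");

/* Hypothetical helper (not a kernel function): true if a bus address
 * survives a round trip through the narrower dma_addr_t. */
static bool fits_in_dma_addr(pci_bus_addr_t bus)
{
	return (pci_bus_addr_t)(dma_addr_t)bus == bus;
}
```

Under these assumptions, the BAR value from the failure report
(0x110204000) does not fit in dma_addr_t, while an address below 4GB does.
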
Fixes: d63e2e1f3df9 ("sparc/PCI: Clip bridge windows to fit in upstream windows")
Fixes: 23b13bc76f35 ("PCI: Fail safely if we can't handle BARs larger than 4GB")
Link: http://lkml.kernel.org/r/CAE9FiQU1gJY1LYrxs+ma5LCTEEe4xmtjRG0aXJ9K_Tsu+m9Wuw@mail.gmail.com
Link: http://lkml.kernel.org/r/1427857069-6789-1-git-send-email-yinghai@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=96231
Reported-by: David Ahern <david.ahern@oracle.com>
Tested-by: David Ahern <david.ahern@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/DMA-API-HOWTO.txt | 29 | ||||
-rw-r--r-- | Documentation/DMA-API.txt | 30 |
2 files changed, 32 insertions, 27 deletions
diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt index 0f7afb2bb442..aef8cc5a677b 100644 --- a/Documentation/DMA-API-HOWTO.txt +++ b/Documentation/DMA-API-HOWTO.txt | |||
@@ -25,13 +25,18 @@ physical addresses. These are the addresses in /proc/iomem. The physical | |||
25 | address is not directly useful to a driver; it must use ioremap() to map | 25 | address is not directly useful to a driver; it must use ioremap() to map |
26 | the space and produce a virtual address. | 26 | the space and produce a virtual address. |
27 | 27 | ||
28 | I/O devices use a third kind of address: a "bus address" or "DMA address". | 28 | I/O devices use a third kind of address: a "bus address". If a device has |
29 | If a device has registers at an MMIO address, or if it performs DMA to read | 29 | registers at an MMIO address, or if it performs DMA to read or write system |
30 | or write system memory, the addresses used by the device are bus addresses. | 30 | memory, the addresses used by the device are bus addresses. In some |
31 | In some systems, bus addresses are identical to CPU physical addresses, but | 31 | systems, bus addresses are identical to CPU physical addresses, but in |
32 | in general they are not. IOMMUs and host bridges can produce arbitrary | 32 | general they are not. IOMMUs and host bridges can produce arbitrary |
33 | mappings between physical and bus addresses. | 33 | mappings between physical and bus addresses. |
34 | 34 | ||
35 | From a device's point of view, DMA uses the bus address space, but it may | ||
36 | be restricted to a subset of that space. For example, even if a system | ||
37 | supports 64-bit addresses for main memory and PCI BARs, it may use an IOMMU | ||
38 | so devices only need to use 32-bit DMA addresses. | ||
39 | |||
35 | Here's a picture and some examples: | 40 | Here's a picture and some examples: |
36 | 41 | ||
37 | CPU CPU Bus | 42 | CPU CPU Bus |
@@ -72,11 +77,11 @@ can use virtual address X to access the buffer, but the device itself | |||
72 | cannot because DMA doesn't go through the CPU virtual memory system. | 77 | cannot because DMA doesn't go through the CPU virtual memory system. |
73 | 78 | ||
74 | In some simple systems, the device can do DMA directly to physical address | 79 | In some simple systems, the device can do DMA directly to physical address |
75 | Y. But in many others, there is IOMMU hardware that translates bus | 80 | Y. But in many others, there is IOMMU hardware that translates DMA |
76 | addresses to physical addresses, e.g., it translates Z to Y. This is part | 81 | addresses to physical addresses, e.g., it translates Z to Y. This is part |
77 | of the reason for the DMA API: the driver can give a virtual address X to | 82 | of the reason for the DMA API: the driver can give a virtual address X to |
78 | an interface like dma_map_single(), which sets up any required IOMMU | 83 | an interface like dma_map_single(), which sets up any required IOMMU |
79 | mapping and returns the bus address Z. The driver then tells the device to | 84 | mapping and returns the DMA address Z. The driver then tells the device to |
80 | do DMA to Z, and the IOMMU maps it to the buffer at address Y in system | 85 | do DMA to Z, and the IOMMU maps it to the buffer at address Y in system |
81 | RAM. | 86 | RAM. |
82 | 87 | ||
@@ -98,7 +103,7 @@ First of all, you should make sure | |||
98 | #include <linux/dma-mapping.h> | 103 | #include <linux/dma-mapping.h> |
99 | 104 | ||
100 | is in your driver, which provides the definition of dma_addr_t. This type | 105 | is in your driver, which provides the definition of dma_addr_t. This type |
101 | can hold any valid DMA or bus address for the platform and should be used | 106 | can hold any valid DMA address for the platform and should be used |
102 | everywhere you hold a DMA address returned from the DMA mapping functions. | 107 | everywhere you hold a DMA address returned from the DMA mapping functions. |
103 | 108 | ||
104 | What memory is DMA'able? | 109 | What memory is DMA'able? |
@@ -316,7 +321,7 @@ There are two types of DMA mappings: | |||
316 | Think of "consistent" as "synchronous" or "coherent". | 321 | Think of "consistent" as "synchronous" or "coherent". |
317 | 322 | ||
318 | The current default is to return consistent memory in the low 32 | 323 | The current default is to return consistent memory in the low 32 |
319 | bits of the bus space. However, for future compatibility you should | 324 | bits of the DMA space. However, for future compatibility you should |
320 | set the consistent mask even if this default is fine for your | 325 | set the consistent mask even if this default is fine for your |
321 | driver. | 326 | driver. |
322 | 327 | ||
@@ -403,7 +408,7 @@ dma_alloc_coherent() returns two values: the virtual address which you | |||
403 | can use to access it from the CPU and dma_handle which you pass to the | 408 | can use to access it from the CPU and dma_handle which you pass to the |
404 | card. | 409 | card. |
405 | 410 | ||
406 | The CPU virtual address and the DMA bus address are both | 411 | The CPU virtual address and the DMA address are both |
407 | guaranteed to be aligned to the smallest PAGE_SIZE order which | 412 | guaranteed to be aligned to the smallest PAGE_SIZE order which |
408 | is greater than or equal to the requested size. This invariant | 413 | is greater than or equal to the requested size. This invariant |
409 | exists (for example) to guarantee that if you allocate a chunk | 414 | exists (for example) to guarantee that if you allocate a chunk |
@@ -645,8 +650,8 @@ PLEASE NOTE: The 'nents' argument to the dma_unmap_sg call must be | |||
645 | dma_map_sg call. | 650 | dma_map_sg call. |
646 | 651 | ||
647 | Every dma_map_{single,sg}() call should have its dma_unmap_{single,sg}() | 652 | Every dma_map_{single,sg}() call should have its dma_unmap_{single,sg}() |
648 | counterpart, because the bus address space is a shared resource and | 653 | counterpart, because the DMA address space is a shared resource and |
649 | you could render the machine unusable by consuming all bus addresses. | 654 | you could render the machine unusable by consuming all DMA addresses. |
650 | 655 | ||
651 | If you need to use the same streaming DMA region multiple times and touch | 656 | If you need to use the same streaming DMA region multiple times and touch |
652 | the data in between the DMA transfers, the buffer needs to be synced | 657 | the data in between the DMA transfers, the buffer needs to be synced |
diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index 52088408668a..7eba542eff7c 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt | |||
@@ -18,10 +18,10 @@ Part I - dma_ API | |||
18 | To get the dma_ API, you must #include <linux/dma-mapping.h>. This | 18 | To get the dma_ API, you must #include <linux/dma-mapping.h>. This |
19 | provides dma_addr_t and the interfaces described below. | 19 | provides dma_addr_t and the interfaces described below. |
20 | 20 | ||
21 | A dma_addr_t can hold any valid DMA or bus address for the platform. It | 21 | A dma_addr_t can hold any valid DMA address for the platform. It can be |
22 | can be given to a device to use as a DMA source or target. A CPU cannot | 22 | given to a device to use as a DMA source or target. A CPU cannot reference |
23 | reference a dma_addr_t directly because there may be translation between | 23 | a dma_addr_t directly because there may be translation between its physical |
24 | its physical address space and the bus address space. | 24 | address space and the DMA address space. |
25 | 25 | ||
26 | Part Ia - Using large DMA-coherent buffers | 26 | Part Ia - Using large DMA-coherent buffers |
27 | ------------------------------------------ | 27 | ------------------------------------------ |
@@ -42,7 +42,7 @@ It returns a pointer to the allocated region (in the processor's virtual | |||
42 | address space) or NULL if the allocation failed. | 42 | address space) or NULL if the allocation failed. |
43 | 43 | ||
44 | It also returns a <dma_handle> which may be cast to an unsigned integer the | 44 | It also returns a <dma_handle> which may be cast to an unsigned integer the |
45 | same width as the bus and given to the device as the bus address base of | 45 | same width as the bus and given to the device as the DMA address base of |
46 | the region. | 46 | the region. |
47 | 47 | ||
48 | Note: consistent memory can be expensive on some platforms, and the | 48 | Note: consistent memory can be expensive on some platforms, and the |
@@ -193,7 +193,7 @@ dma_map_single(struct device *dev, void *cpu_addr, size_t size, | |||
193 | enum dma_data_direction direction) | 193 | enum dma_data_direction direction) |
194 | 194 | ||
195 | Maps a piece of processor virtual memory so it can be accessed by the | 195 | Maps a piece of processor virtual memory so it can be accessed by the |
196 | device and returns the bus address of the memory. | 196 | device and returns the DMA address of the memory. |
197 | 197 | ||
198 | The direction for both APIs may be converted freely by casting. | 198 | The direction for both APIs may be converted freely by casting. |
199 | However the dma_ API uses a strongly typed enumerator for its | 199 | However the dma_ API uses a strongly typed enumerator for its |
@@ -212,20 +212,20 @@ contiguous piece of memory. For this reason, memory to be mapped by | |||
212 | this API should be obtained from sources which guarantee it to be | 212 | this API should be obtained from sources which guarantee it to be |
213 | physically contiguous (like kmalloc). | 213 | physically contiguous (like kmalloc). |
214 | 214 | ||
215 | Further, the bus address of the memory must be within the | 215 | Further, the DMA address of the memory must be within the |
216 | dma_mask of the device (the dma_mask is a bit mask of the | 216 | dma_mask of the device (the dma_mask is a bit mask of the |
217 | addressable region for the device, i.e., if the bus address of | 217 | addressable region for the device, i.e., if the DMA address of |
218 | the memory ANDed with the dma_mask is still equal to the bus | 218 | the memory ANDed with the dma_mask is still equal to the DMA |
219 | address, then the device can perform DMA to the memory). To | 219 | address, then the device can perform DMA to the memory). To |
220 | ensure that the memory allocated by kmalloc is within the dma_mask, | 220 | ensure that the memory allocated by kmalloc is within the dma_mask, |
221 | the driver may specify various platform-dependent flags to restrict | 221 | the driver may specify various platform-dependent flags to restrict |
222 | the bus address range of the allocation (e.g., on x86, GFP_DMA | 222 | the DMA address range of the allocation (e.g., on x86, GFP_DMA |
223 | guarantees to be within the first 16MB of available bus addresses, | 223 | guarantees to be within the first 16MB of available DMA addresses, |
224 | as required by ISA devices). | 224 | as required by ISA devices). |
225 | 225 | ||
226 | Note also that the above constraints on physical contiguity and | 226 | Note also that the above constraints on physical contiguity and |
227 | dma_mask may not apply if the platform has an IOMMU (a device which | 227 | dma_mask may not apply if the platform has an IOMMU (a device which |
228 | maps an I/O bus address to a physical memory address). However, to be | 228 | maps an I/O DMA address to a physical memory address). However, to be |
229 | portable, device driver writers may *not* assume that such an IOMMU | 229 | portable, device driver writers may *not* assume that such an IOMMU |
230 | exists. | 230 | exists. |
231 | 231 | ||
@@ -296,7 +296,7 @@ reduce current DMA mapping usage or delay and try again later). | |||
296 | dma_map_sg(struct device *dev, struct scatterlist *sg, | 296 | dma_map_sg(struct device *dev, struct scatterlist *sg, |
297 | int nents, enum dma_data_direction direction) | 297 | int nents, enum dma_data_direction direction) |
298 | 298 | ||
299 | Returns: the number of bus address segments mapped (this may be shorter | 299 | Returns: the number of DMA address segments mapped (this may be shorter |
300 | than <nents> passed in if some elements of the scatter/gather list are | 300 | than <nents> passed in if some elements of the scatter/gather list are |
301 | physically or virtually adjacent and an IOMMU maps them with a single | 301 | physically or virtually adjacent and an IOMMU maps them with a single |
302 | entry). | 302 | entry). |
@@ -340,7 +340,7 @@ must be the same as those and passed in to the scatter/gather mapping | |||
340 | API. | 340 | API. |
341 | 341 | ||
342 | Note: <nents> must be the number you passed in, *not* the number of | 342 | Note: <nents> must be the number you passed in, *not* the number of |
343 | bus address entries returned. | 343 | DMA address entries returned. |
344 | 344 | ||
345 | void | 345 | void |
346 | dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size, | 346 | dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size, |
@@ -507,7 +507,7 @@ it's asked for coherent memory for this device. | |||
507 | phys_addr is the CPU physical address to which the memory is currently | 507 | phys_addr is the CPU physical address to which the memory is currently |
508 | assigned (this will be ioremapped so the CPU can access the region). | 508 | assigned (this will be ioremapped so the CPU can access the region). |
509 | 509 | ||
510 | device_addr is the bus address the device needs to be programmed | 510 | device_addr is the DMA address the device needs to be programmed |
511 | with to actually address this memory (this will be handed out as the | 511 | with to actually address this memory (this will be handed out as the |
512 | dma_addr_t in dma_alloc_coherent()). | 512 | dma_addr_t in dma_alloc_coherent()). |
513 | 513 | ||
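
The dma_mask rule reworded in the DMA-API.txt hunk at line 215 (a device can
reach a DMA address if ANDing the address with the mask leaves it unchanged)
can be sketched in user-space C. This is an illustration only: the 64-bit
dma_addr_t is an assumption, and dma_addr_within_mask() is a hypothetical
helper, not a kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a platform where dma_addr_t is 64 bits wide. */
typedef uint64_t dma_addr_t;

/* Hypothetical helper (not a kernel function): the device can perform
 * DMA to addr if masking it with dma_mask leaves the address unchanged. */
static bool dma_addr_within_mask(dma_addr_t addr, uint64_t dma_mask)
{
	return (addr & dma_mask) == addr;
}
```

With a 32-bit mask (0xffffffff), for example, an address below 4GB passes the
check and 0x110204000 does not.
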