Oza Pawandeep
2017-05-22 16:39:39 UTC
iproc based PCI RC and Stingray SOC have a limitation of addressing only 512GB
of memory at once.
IOVA allocation honors the device's coherent_dma_mask/dma_mask.
In the PCI case, current code honors the DMA mask set by the EP; there is no
concept of a PCI host bridge dma-mask. There should be one, so that it
could truly reflect the limitation of the PCI host bridge.
However, even assuming Linux takes care of the largest possible dma_mask, the
limitation could still exist because of the way memory banks are implemented.
For example, memory banks:
<0x00000000 0x80000000 0x0 0x80000000>, /* 2G @ 2G */
<0x00000008 0x80000000 0x3 0x80000000>, /* 14G @ 34G */
<0x00000090 0x00000000 0x4 0x00000000>, /* 16G @ 576G */
<0x000000a0 0x00000000 0x4 0x00000000>; /* 16G @ 640G */
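To make the arithmetic behind the bank comments concrete, here is a minimal sketch that decodes each bank and checks it against the 512GB limit. It assumes each entry is four 32-bit cells (base high, base low, size high, size low) and that the 512GB window starts at address 0. The names decode, banks_out_of_reach and RC_LIMIT are made up for illustration and are not part of the patches.

```python
# Each bank is assumed to be four 32-bit cells: <base_hi base_lo size_hi size_lo>.
BANKS = [
    (0x00000000, 0x80000000, 0x0, 0x80000000),
    (0x00000008, 0x80000000, 0x3, 0x80000000),
    (0x00000090, 0x00000000, 0x4, 0x00000000),
    (0x000000a0, 0x00000000, 0x4, 0x00000000),
]

GIB = 1 << 30
# Assumption: the host bridge can address [0, 512GB).
RC_LIMIT = 512 * GIB


def decode(bank):
    """Combine the high/low cells into a 64-bit (base, size) pair."""
    base_hi, base_lo, size_hi, size_lo = bank
    return (base_hi << 32) | base_lo, (size_hi << 32) | size_lo


def banks_out_of_reach(banks, limit=RC_LIMIT):
    """Return the (base, size) pairs that end above the addressing limit."""
    out = []
    for bank in banks:
        base, size = decode(bank)
        if base + size > limit:
            out.append((base, size))
    return out


for base, size in map(decode, BANKS):
    reach = "reachable" if base + size <= RC_LIMIT else "out of reach"
    print(f"{size // GIB}G @ {base // GIB}G: {reach}")
```

Under these assumptions the banks at 576G and 640G fall outside the reachable range, which is where an identity-mapped IOVA would go wrong.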
The problem shows up when running user space (SPDK), which internally uses
vfio in order to access the PCI EndPoint directly.
Vfio uses huge-pages, which could come from 640G/0x000000a0.
The way vfio maps a hugepage is to use its phys addr as the iova:
VFIO_IOMMU_MAP_DMA ends up calling iommu_map, which
in turn calls arm_lpae_map, mapping iovas out of range.
So the way the kernel allocates IOVA (where it honours the device dma_mask) and
the way userspace gets IOVA are different.
dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>; will not work.
Instead we have to go for scattered dma-ranges, leaving holes.
Hence, we have to reserve IOVA allocations for inbound memory.
This patch-set addresses only the IOVA allocation problem.
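As an illustration of what "scattered dma-ranges leaving holes" implies for the allocator, the sketch below computes the gaps between inbound windows, that is, the IOVA ranges that would have to be reserved so they are never handed out. It is not the code from these patches. The window list simply reuses the memory banks above as identity-mapped windows, and the 1TB upper bound is an arbitrary value chosen for the example. The names iova_holes, WINDOWS and IOVA_LIMIT are hypothetical.

```python
GIB = 1 << 30

# Hypothetical inbound windows as (start, size), reusing the bank layout above.
WINDOWS = [
    (2 * GIB, 2 * GIB),
    (34 * GIB, 14 * GIB),
    (576 * GIB, 16 * GIB),
    (640 * GIB, 16 * GIB),
]

# Assumption for the example only: IOVA space of interest is [0, 1TB).
IOVA_LIMIT = 1024 * GIB


def iova_holes(windows, limit):
    """Return [start, end) gaps in [0, limit) not covered by any window."""
    holes = []
    cursor = 0  # first address not yet known to be covered
    for start, size in sorted(windows):
        if start > cursor:
            holes.append((cursor, min(start, limit)))
        cursor = max(cursor, start + size)
        if cursor >= limit:
            break
    if cursor < limit:
        holes.append((cursor, limit))
    return holes


for lo, hi in iova_holes(WINDOWS, IOVA_LIMIT):
    print(f"reserve [{lo // GIB}G, {hi // GIB}G)")
```

Every range printed is one that neither the kernel allocator nor an identity mapping should use, because no inbound window covers it.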
Changes since v7:
- Robin's comment addressed,
where he wanted to remove the dependency between the IOMMU and OF layers.
- Bjorn Helgaas's comments addressed.
Changes since v6:
- Robin's comments addressed.
Changes since v5:
Changes since v4:
Changes since v3:
Changes since v2:
- minor changes, redundant checks removed
- removed internal review
Changes since v1:
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct.
- Convert existing contents of of_dma_get_range function to
of_bus_default_dma_get_ranges and adding that to the
default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.
Oza Pawandeep (3):
OF/PCI: expose inbound memory interface to PCI RC drivers.
IOMMU/PCI: reserve IOVA for inbound memory for PCI masters
PCI: add support for inbound windows resources
drivers/iommu/dma-iommu.c | 44 ++++++++++++++++++++--
drivers/of/of_pci.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/probe.c | 30 +++++++++++++--
include/linux/of_pci.h | 7 ++++
include/linux/pci.h | 1 +
5 files changed, 170 insertions(+), 8 deletions(-)
--
1.9.1