Discussion:
[PATCH 1/2] [ARM] pxa/sgh_i900: Basic support for the Samsung SGH-i900 (Omnia) phone.
(too old to reply)
Jean-Francois Moine
2014-01-30 18:00:11 UTC
Permalink
On Thu, 30 Jan 2014 14:20:56 +0200
I am having trouble getting the tda998x-codec working on BBB. The=20
problem is I do not have a DT node for the tda998x driver. Instead I=20
have tilcdc-slave node that provides pdata for the tda-driver.
=20
I am thinking of solving the problem by adding a reference to the=20
i2c-adapter hosting tda998x as an optional DT property to the codec=20
node. I could then dig the driver instance from the i2c adapter's=20
children. Any better ideas?
I better think about a 'normal' DT definition:

- in the DT, define the tda998x in a i2c subnode:

&i2c0 {
tda998x: hdmi-encoder {
compatible =3D "nxp,tda998x";
reg =3D <0x70>;
/* the video ports are OK by default */
/* define the interrupt if you want to use it */
};
};

- in tilcdc_slave.c, in the function slave_encoder_create(), instead of
using drm_i2c_encoder_init(), do quite the same, but without calling
request_module() nor i2c_new_device().

This can be done as follows (the code is not compiled):

--------------------8<---------------------
static struct drm_encoder *slave_encoder_create(struct drm_device *dev,
struct slave_module *mod)
{
struct slave_encoder *slave_encoder;
struct drm_encoder *encoder;
int ret;
/* ------ added ------ */
struct device_node *np;
static const struct of_device_id tda_dt[] =3D {
{ .compatible =3D "nxp,tda998x" },
{ },
};
/* ------ end added ------ */

... no change ...

drm_encoder_helper_add(encoder, &slave_encoder_helper_funcs);

/* ------ added ------ */

/* search the tda998x device */
np =3D of_find_matching_node_and_match(NULL, tda_dt, NULL);
if (np && of_device_is_available(np)) {
struct i2c_client *i2c_client;
struct drm_i2c_encoder_driver *encoder_drv;
struct module *module;

/* the tda998x is in the DT */

i2c_client =3D of_find_i2c_device_by_node(np);
of_node_put(np);
if (!i2c_client) {
dev_err(dev->dev, "no tda998x i2c client\n");
goto fail;
}

to_encoder_slave(encoder)->bus_priv =3D i2c_client;

encoder_drv =3D to_drm_i2c_encoder_driver(
to_i2c_driver(i2c_client->dev.driver));

/* lock the tda998x module in memory */
module =3D to_i2c_driver(i2c_client->dev.driver)->driver.owner;
if (!module || !try_module_get(module)) {
dev_err(dev->dev, "cannot get module %s\n", module->name);
goto fail;
}

ret =3D encoder_drv->encoder_init(i2c_client, dev, encoder_slave);
if (ret < 0) {
dev_err(dev->dev, "slave encoder init failed\n");
module_put(module);
goto fail;
}
/* set_config is useless */
return encoder;
}

/* ------ end added ------ */

ret =3D drm_i2c_encoder_init(dev, to_encoder_slave(encoder), mod->i2c,=
&info);
if (ret)
goto fail;

return encoder;

fail:
slave_encoder_destroy(encoder);
return NULL;
}
--------------------8<---------------------

When the tda998x is in the DT, the i2c_client is already created.
It must not be freed, and so, the function drm_i2c_encoder_destroy()
must not be called. But, the module must be explicitly unlocked in
slave_encoder_destroy(), and then, there must be some flag in the
structures for this job to be done...

--=20
Ken ar c'henta=C3=B1 | ** Breizh ha Linux atav! **
Jef | http://moinejf.free.fr/
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" i=
n
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Catalin Marinas
2014-02-17 12:29:48 UTC
Permalink
The dma_mask for a device structure is a pointer. This pointer
needs to be set up before the dma mask can actually be set. Most
frameworks in the kernel take care of setting this up properly but
platform devices that don't follow a regular bus structure may not
ever have this set. As a result, checks such as dma_capable will
always return false on a raw platform device and dma_set_mask will
always return -EIO. Fix this by adding a dma_mask in the
platform_device archdata and setting it to be the dma_mask. Devices
used in other frameworks can change this as needed.
You shouldn't need to do this. I went through a lot of the drivers we
currently have, fixing them up in various manners. The basic rules
- It is the responsibility of the code creating the device to set a
reasonable default for the dma mask according to the bus and whether
DMA is supportable.
- It is the responsibility of the driver _always_ to make a call to
dma_set_mask() and/or dma_set_coherent_mask() according to the
driver's needs if the driver is going to be using DMA.
As a work-around for the buggy situation we have in the kernel with DT,
various buggy workarounds have been incorporated into drivers which
involve writing directly to the DMA masks, and other such games. None
of that is necessary when the dma_coerce_*() functions are used - but
these are a stop-gap until the DT code gets fixed.
The real answer here is to make DT conform to the first point above
and not add yet another different hack to the kernel.
powerpc ran into this exact problem before and fixed it using this
method (a77ce8167cc1d0370fcb1d79b367d62e050cb2b0
"driver core: Add ability for arch code to setup pdev_archdata" and
314b02f503c2c219fde0fcf6f086fda415f8a847 "powerpc: implement
arch_setup_pdev_archdata") so there is at least some precedent for this
method.
Are there patches/discussion somewhere else on what a proper solution
would be in the DT?
I guess this comes under the system topology discussion we briefly
touched at the kernel summit. Dave Martin (cc'ed) is looking into this
now, so at some point we'll have some longer discussions on the list.
--
Catalin
Russell King - ARM Linux
2014-02-17 12:52:50 UTC
Permalink
The dma_mask for a device structure is a pointer. This pointer
needs to be set up before the dma mask can actually be set. Most
frameworks in the kernel take care of setting this up properly but
platform devices that don't follow a regular bus structure may not
ever have this set. As a result, checks such as dma_capable will
always return false on a raw platform device and dma_set_mask will
always return -EIO. Fix this by adding a dma_mask in the
platform_device archdata and setting it to be the dma_mask. Devices
used in other frameworks can change this as needed.
You shouldn't need to do this. I went through a lot of the drivers we
currently have, fixing them up in various manners. The basic rules
- It is the responsibility of the code creating the device to set a
reasonable default for the dma mask according to the bus and whether
DMA is supportable.
- It is the responsibility of the driver _always_ to make a call to
dma_set_mask() and/or dma_set_coherent_mask() according to the
driver's needs if the driver is going to be using DMA.
As a work-around for the buggy situation we have in the kernel with DT,
various buggy workarounds have been incorporated into drivers which
involve writing directly to the DMA masks, and other such games. None
of that is necessary when the dma_coerce_*() functions are used - but
these are a stop-gap until the DT code gets fixed.
The real answer here is to make DT conform to the first point above
and not add yet another different hack to the kernel.
powerpc ran into this exact problem before and fixed it using this
method (a77ce8167cc1d0370fcb1d79b367d62e050cb2b0
"driver core: Add ability for arch code to setup pdev_archdata" and
314b02f503c2c219fde0fcf6f086fda415f8a847 "powerpc: implement
arch_setup_pdev_archdata") so there is at least some precedent for this
method.
The result being that all platform devices get their DMA masks setup
whether they like it or not - what they did was nothing more than a
work-around the problem.

That doesn't get away from what I stated above, which is what's expected
of drivers, and by drivers.

Drivers which perform DMA _must_ without fail call the appropriate DMA
mask setting functions.

Code declaring devices must set the DMA masks to a reasonable default
if the driver is to perform DMA with that device.

As the ARM architecture moves forward, and we start seeing 64-bit DMA
controllers, this becomes extremely important, because the first part
of that is part of the negotiation whether memory outside of the 4GB
DMA range can be passed to the DMA engine. We're also going to hit
problems where people have been lazy, and don't map/unmap/allocate DMA
memory against the DMA engine device, but against the DMA client device.
ASoC is going to be one such area of pain because the DMA engine backend
that was merged totally disregarded this point which was present in my
backend.
--
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up. Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".
Marcin Wojtas
2015-02-25 21:57:58 UTC
Permalink
Hi,

I send a series of patches adding I2S - S/PDIF support for Marvell Armada 38x
SoC. It is same controller as in Armada 370, however two quirks had to be taken
into consideration:

* new PLL frequency setting

* exceptive pinout of I2S and S/PDIF formats

Kirkwood-i2s driver was enhanced with obtaining mem resources by name. Sound
interface was enabled for Armada 385 DB board, using simple-card DT binding for
both I2S and S/PDIF. In addition to that minor fix was applied for Armada 370
audio DT description (stable-CC'ed).

Any comments and remarks are welcome.

Best regards,
Marcin

Marcin Wojtas (4):
ASoC: kirkwood: enable Kirkwood driver for Armada 38x platforms
ARM: mvebu: add audio I2S controller to Armada 38x Device Tree
ARM: mvebu: add audio support to Armada 385 DB
ARM: mvebu: fix routing for analog audio input of Armada 370 DB
platform

.../devicetree/bindings/sound/mvebu-audio.txt | 14 ++-
arch/arm/boot/dts/armada-370-db.dts | 2 +-
arch/arm/boot/dts/armada-388-db.dts | 69 +++++++++++
arch/arm/boot/dts/armada-38x.dtsi | 17 +++
sound/soc/kirkwood/kirkwood-i2s.c | 132 ++++++++++++++++++++-
sound/soc/kirkwood/kirkwood.h | 2 +
6 files changed, 232 insertions(+), 4 deletions(-)
--
1.8.3.1
Marcin Wojtas
2015-02-25 21:57:59 UTC
Permalink
The audio unit of Marvell Armada38x SoC is similar to the ones comprised by
other Marvell SoCs (Kirkwood, Dove and Armada 370). Therefore Kirkwood audio
driver can be used to support it and this commit adds new compatible string
to identify Armada 38x variant.

Two new memory regions are added: first one for PLL configuration and
the second one for choosing one of audio I/O modes (I2S or S/PDIF).
For the latter purpose a new optional DT property is added ('spdif-mode').

kirkwood-i2s driver is extended by adding a new init function for Armada
38x flavor and also a routine that enables PLL output (i.e. MCLK)
configuration.

Signed-off-by: Marcin Wojtas <***@semihalf.com>
---
.../devicetree/bindings/sound/mvebu-audio.txt | 14 ++-
sound/soc/kirkwood/kirkwood-i2s.c | 132 ++++++++++++++++++++-
sound/soc/kirkwood/kirkwood.h | 2 +
3 files changed, 145 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/sound/mvebu-audio.txt b/Documentation/devicetree/bindings/sound/mvebu-audio.txt
index cb8c07c..4f5dec5 100644
--- a/Documentation/devicetree/bindings/sound/mvebu-audio.txt
+++ b/Documentation/devicetree/bindings/sound/mvebu-audio.txt
@@ -6,9 +6,14 @@ Required properties:
"marvell,kirkwood-audio" for Kirkwood platforms
"marvell,dove-audio" for Dove platforms
"marvell,armada370-audio" for Armada 370 platforms
+ "marvell,armada-380-audio" for Armada 38x platforms

- reg: physical base address of the controller and length of memory mapped
- region.
+ region (named "i2s_regs").
+ With "marvell,armada-380-audio" two other regions are required:
+ first of those is dedicated for Audio PLL Configuration registers
+ (named "pll_regs") and the second one ("soc_ctrl") - for register
+ where one of exceptive I/O types (I2S or S/PDIF) is set.

- interrupts:
with "marvell,kirkwood-audio", the audio interrupt
@@ -23,6 +28,13 @@ Required properties:
"internal" for the internal clock
"extclk" for the external clock

+Optional properties:
+
+- spdif-mode:
+ Enable S/PDIF mode on Armada 38x SoC. Using this property
+ disables standard I2S I/O. Valid only with "marvell,armada-380-audio"
+ compatible string.
+
Example:

i2s1: audio-***@b4000 {
diff --git a/sound/soc/kirkwood/kirkwood-i2s.c b/sound/soc/kirkwood/kirkwood-i2s.c
index def7d82..8818861 100644
--- a/sound/soc/kirkwood/kirkwood-i2s.c
+++ b/sound/soc/kirkwood/kirkwood-i2s.c
@@ -37,6 +37,124 @@
(SNDRV_PCM_FMTBIT_S16_LE | \
SNDRV_PCM_FMTBIT_S24_LE)

+/* These registers are relative to the second register region -
+ * audio pll configuration.
+ */
+#define A38X_PLL_CONF_REG0 0x0
+#define A38X_PLL_FB_CLK_DIV_OFFSET 10
+#define A38X_PLL_FB_CLK_DIV_MASK 0x7fc00
+#define A38X_PLL_CONF_REG1 0x4
+#define A38X_PLL_FREQ_OFFSET_MASK 0xffff
+#define A38X_PLL_FREQ_OFFSET_VALID BIT(16)
+#define A38X_PLL_SW_RESET BIT(31)
+#define A38X_PLL_CONF_REG2 0x8
+#define A38X_PLL_AUDIO_POSTDIV_MASK 0x7f
+
+/* Bit below belongs to SoC control register corresponding to the third
+ * register region.
+ */
+#define A38X_SPDIF_MODE_ENABLE BIT(27)
+
+static int armada_38x_i2s_init_quirk(struct platform_device *pdev,
+ struct kirkwood_dma_data *priv,
+ struct snd_soc_dai_driver *dai_drv)
+{
+ struct resource *mem;
+ u32 reg_val;
+ int i;
+
+ mem = platform_get_resource_byname(pdev, IORESOURCE_MEM, "pll_regs");
+ priv->pll_config = devm_ioremap_resource(&pdev->dev, mem);
+ if (IS_ERR(priv->pll_config))
+ return -ENOMEM;
+
+ mem = platform_get_resource_byname(pdev, IORESOURCE_MEM, "soc_ctrl");
+ priv->soc_control = devm_ioremap_resource(&pdev->dev, mem);
+ if (IS_ERR(priv->soc_control))
+ return -ENOMEM;
+
+ /* Select one of exceptive modes: I2S or S/PDIF */
+ reg_val = readl(priv->soc_control);
+ if (of_property_read_bool(np, "spdif-mode")) {
+ reg_val |= A38X_SPDIF_MODE_ENABLE;
+ dev_info(&pdev->dev, "using S/PDIF mode\n");
+ } else {
+ reg_val &= ~A38X_SPDIF_MODE_ENABLE;
+ dev_info(&pdev->dev, "using I2S mode\n");
+ }
+ writel(reg_val, priv->soc_control);
+
+ /* Update available rates of mclk's fs */
+ for (i = 0; i < 2; i++) {
+ dai_drv[i].playback.rates |= SNDRV_PCM_RATE_192000;
+ dai_drv[i].capture.rates |= SNDRV_PCM_RATE_192000;
+ }
+
+ return 0;
+}
+
+static inline void armada_38x_set_pll(void __iomem *base, unsigned long rate)
+{
+ u32 reg_val;
+ u16 freq_offset = 0x22b0;
+ u8 audio_postdiv, fb_clk_div = 0x1d;
+
+ /* Set frequency offset value to not valid and enable PLL reset */
+ reg_val = readl(base + A38X_PLL_CONF_REG1);
+ reg_val &= ~A38X_PLL_FREQ_OFFSET_VALID;
+ reg_val &= ~A38X_PLL_SW_RESET;
+ writel(reg_val, base + A38X_PLL_CONF_REG1);
+
+ udelay(1);
+
+ /* Update PLL parameters */
+ switch (rate) {
+ default:
+ case 44100:
+ freq_offset = 0x735;
+ fb_clk_div = 0x1b;
+ audio_postdiv = 0xc;
+ break;
+ case 48000:
+ audio_postdiv = 0xc;
+ break;
+ case 96000:
+ audio_postdiv = 0x6;
+ break;
+ case 192000:
+ audio_postdiv = 0x3;
+ break;
+ }
+
+ reg_val = readl(base + A38X_PLL_CONF_REG0);
+ reg_val &= ~A38X_PLL_FB_CLK_DIV_MASK;
+ reg_val |= (fb_clk_div << A38X_PLL_FB_CLK_DIV_OFFSET);
+ writel(reg_val, base + A38X_PLL_CONF_REG0);
+
+ reg_val = readl(base + A38X_PLL_CONF_REG2);
+ reg_val &= ~A38X_PLL_AUDIO_POSTDIV_MASK;
+ reg_val |= audio_postdiv;
+ writel(reg_val, base + A38X_PLL_CONF_REG2);
+
+ reg_val = readl(base + A38X_PLL_CONF_REG1);
+ reg_val &= ~A38X_PLL_FREQ_OFFSET_MASK;
+ reg_val |= freq_offset;
+ writel(reg_val, base + A38X_PLL_CONF_REG1);
+
+ udelay(1);
+
+ /* Disable reset */
+ reg_val |= A38X_PLL_SW_RESET;
+ writel(reg_val, base + A38X_PLL_CONF_REG1);
+
+ /* Wait 50us for PLL to lock */
+ udelay(50);
+
+ /* Restore frequency offset value validity */
+ reg_val |= A38X_PLL_FREQ_OFFSET_VALID;
+ writel(reg_val, base + A38X_PLL_CONF_REG1);
+}
+
static int kirkwood_i2s_set_fmt(struct snd_soc_dai *cpu_dai,
unsigned int fmt)
{
@@ -112,7 +230,10 @@ static void kirkwood_set_rate(struct snd_soc_dai *dai,
* defined in kirkwood_i2s_dai */
dev_dbg(dai->dev, "%s: dco set rate = %lu\n",
__func__, rate);
- kirkwood_set_dco(priv->io, rate);
+ if (priv->pll_config)
+ armada_38x_set_pll(priv->pll_config, rate);
+ else
+ kirkwood_set_dco(priv->io, rate);

clks_ctrl = KIRKWOOD_MCLK_SOURCE_DCO;
} else {
@@ -544,7 +665,7 @@ static int kirkwood_i2s_dev_probe(struct platform_device *pdev)
}
dev_set_drvdata(&pdev->dev, priv);

- mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ mem = platform_get_resource_byname(pdev, IORESOURCE_MEM, "i2s_regs");
priv->io = devm_ioremap_resource(&pdev->dev, mem);
if (IS_ERR(priv->io))
return PTR_ERR(priv->io);
@@ -555,6 +676,12 @@ static int kirkwood_i2s_dev_probe(struct platform_device *pdev)
return -ENXIO;
}

+ if (of_device_is_compatible(np, "marvell,armada-380-audio")) {
+ err = armada_38x_i2s_init_quirk(pdev, priv, soc_dai);
+ if (err < 0)
+ return err;
+ }
+
if (np) {
priv->burst = 128; /* might be 32 or 128 */
} else if (data) {
@@ -647,6 +774,7 @@ static struct of_device_id mvebu_audio_of_match[] = {
{ .compatible = "marvell,kirkwood-audio" },
{ .compatible = "marvell,dove-audio" },
{ .compatible = "marvell,armada370-audio" },
+ { .compatible = "marvell,armada-380-audio" },
{ }
};
MODULE_DEVICE_TABLE(of, mvebu_audio_of_match);
diff --git a/sound/soc/kirkwood/kirkwood.h b/sound/soc/kirkwood/kirkwood.h
index 90e32a7..0bb6dc6 100644
--- a/sound/soc/kirkwood/kirkwood.h
+++ b/sound/soc/kirkwood/kirkwood.h
@@ -133,6 +133,8 @@

struct kirkwood_dma_data {
void __iomem *io;
+ void __iomem *pll_config;
+ void __iomem *soc_control;
struct clk *clk;
struct clk *extclk;
uint32_t ctl_play;
--
1.8.3.1
Marcin Wojtas
2015-02-25 21:58:00 UTC
Permalink
This commit adds the description of the I2S controller to the Marvell
Armada 38x SoC's Device Tree, as well as its pin configuration.

Signed-off-by: Marcin Wojtas <***@semihalf.com>
---
arch/arm/boot/dts/armada-38x.dtsi | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/arch/arm/boot/dts/armada-38x.dtsi b/arch/arm/boot/dts/armada-38x.dtsi
index 1dff30a..fa21cb5 100644
--- a/arch/arm/boot/dts/armada-38x.dtsi
+++ b/arch/arm/boot/dts/armada-38x.dtsi
@@ -314,6 +314,12 @@
marvell,pins = "mpp44";
marvell,function = "sata3";
};
+
+ i2s_pins: i2s_pins {
+ marvell,pins = "mpp48", "mpp49", "mpp50",
+ "mpp51", "mpp52", "mpp53";
+ marvell,function = "audio";
+ };
};

gpio0: ***@18100 {
@@ -555,6 +561,17 @@
status = "disabled";
};

+ audio_controller: audio-***@e8000 {
+ #sound-dai-cells = <1>;
+ compatible = "marvell,armada-380-audio";
+ reg = <0xe8000 0x4000>, <0x18410 0xc>, <0x18204 0x4>;
+ reg-names = "i2s_regs", "pll_regs", "soc_ctrl";
+ interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gateclk 0>;
+ clock-names = "internal";
+ status = "disabled";
+ };
+
***@f0000 {
compatible = "marvell,armada-380-xhci";
reg = <0xf0000 0x4000>,<0xf4000 0x4000>;
--
1.8.3.1
Sebastian Hesselbarth
2015-02-25 22:21:10 UTC
Permalink
Post by Marcin Wojtas
This commit adds the description of the I2S controller to the Marvell
Armada 38x SoC's Device Tree, as well as its pin configuration.
---
arch/arm/boot/dts/armada-38x.dtsi | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/arm/boot/dts/armada-38x.dtsi b/arch/arm/boot/dts/armada-38x.dtsi
index 1dff30a..fa21cb5 100644
--- a/arch/arm/boot/dts/armada-38x.dtsi
+++ b/arch/arm/boot/dts/armada-38x.dtsi
[...]
Post by Marcin Wojtas
@@ -555,6 +561,17 @@
status = "disabled";
};
+ #sound-dai-cells = <1>;
+ compatible = "marvell,armada-380-audio";
+ reg = <0xe8000 0x4000>, <0x18410 0xc>, <0x18204 0x4>;
+ reg-names = "i2s_regs", "pll_regs", "soc_ctrl";
Marcin,

sorry but NACK. The PLL and SoC ctrl are not even close to the audio
controller registers.
Post by Marcin Wojtas
+ interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gateclk 0>;
+ clock-names = "internal";
How about providing access to audio PLL by a clock driver and amend the
binding to allow for a second more precise PLL clock, e.g.

clocks = <&gateclk CLK_AUDIO>, <&pll PLL_AUDIO>;
clock-names = "internal", "pll";

we already check for an "extclk" on Dove for the same reason but the
name might be misleading here.

Also, i2c/spdif muxing option could be handled by 38x's pinctrl driver,
we have the same for Dove's internal i2c mux.

If you want to use i2s you just add the option to the default pinctrl
hog:

pinctrl-0 = <&i2s_pins &audio_mux_i2s>;
pinctrl-names = "default";

Sebastian
Post by Marcin Wojtas
+ status = "disabled";
+ };
+
compatible = "marvell,armada-380-xhci";
reg = <0xf0000 0x4000>,<0xf4000 0x4000>;
Marcin Wojtas
2015-02-26 00:11:44 UTC
Permalink
Hi Sebastian,

Thanks for your input.

2015-02-25 23:21 GMT+01:00 Sebastian Hesselbarth
Post by Sebastian Hesselbarth
Post by Marcin Wojtas
This commit adds the description of the I2S controller to the Marvell
Armada 38x SoC's Device Tree, as well as its pin configuration.
---
arch/arm/boot/dts/armada-38x.dtsi | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/arm/boot/dts/armada-38x.dtsi
b/arch/arm/boot/dts/armada-38x.dtsi
index 1dff30a..fa21cb5 100644
--- a/arch/arm/boot/dts/armada-38x.dtsi
+++ b/arch/arm/boot/dts/armada-38x.dtsi
[...]
Post by Marcin Wojtas
@@ -555,6 +561,17 @@
status = "disabled";
};
+ #sound-dai-cells = <1>;
+ compatible = "marvell,armada-380-audio";
+ reg = <0xe8000 0x4000>, <0x18410 0xc>,
<0x18204 0x4>;
+ reg-names = "i2s_regs", "pll_regs",
"soc_ctrl";
Marcin,
sorry but NACK. The PLL and SoC ctrl are not even close to the audio
controller registers.
Post by Marcin Wojtas
+ interrupts = <GIC_SPI 75
IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gateclk 0>;
+ clock-names = "internal";
How about providing access to audio PLL by a clock driver and amend the
binding to allow for a second more precise PLL clock, e.g.
clocks = <&gateclk CLK_AUDIO>, <&pll PLL_AUDIO>;
clock-names = "internal", "pll";
Good idea. Would you suggest some name for the new driver? How about
clk-audio.c under drivers/clk/mvebu?
Post by Sebastian Hesselbarth
we already check for an "extclk" on Dove for the same reason but the
name might be misleading here.
Also, i2c/spdif muxing option could be handled by 38x's pinctrl driver,
we have the same for Dove's internal i2c mux.
If you want to use i2s you just add the option to the default pinctrl
pinctrl-0 = <&i2s_pins &audio_mux_i2s>;
pinctrl-names = "default";
Ok, I'll try to extend pinctrl-armada-38x then.

Marcin
Sebastian Hesselbarth
2015-02-26 00:30:58 UTC
Permalink
Post by Marcin Wojtas
2015-02-25 23:21 GMT+01:00 Sebastian Hesselbarth
[...]
Post by Marcin Wojtas
Post by Sebastian Hesselbarth
sorry but NACK. The PLL and SoC ctrl are not even close to the audio
controller registers.
Post by Marcin Wojtas
+ interrupts = <GIC_SPI 75
IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gateclk 0>;
+ clock-names = "internal";
How about providing access to audio PLL by a clock driver and amend the
binding to allow for a second more precise PLL clock, e.g.
clocks = <&gateclk CLK_AUDIO>, <&pll PLL_AUDIO>;
clock-names = "internal", "pll";
Good idea. Would you suggest some name for the new driver? How about
clk-audio.c under drivers/clk/mvebu?
Depends on how much you know about the surrounding registers.

If there is more clock related stuff than the audio pll, I'd
suggest to have a single driver for everything that isn't core clk
or clk gates which are already dealt with in the existing driver.

Given that this PLL layout is most likely 38x specific, I'd just put it
into drivers/clk/mvebu/armada-38x.c.
Post by Marcin Wojtas
Post by Sebastian Hesselbarth
we already check for an "extclk" on Dove for the same reason but the
name might be misleading here.
Also, i2c/spdif muxing option could be handled by 38x's pinctrl driver,
we have the same for Dove's internal i2c mux.
If you want to use i2s you just add the option to the default pinctrl
pinctrl-0 = <&i2s_pins &audio_mux_i2s>;
pinctrl-names = "default";
Ok, I'll try to extend pinctrl-armada-38x then.
IMHO, adding a "syscon" compatible to the system-controller node should
do the trick. You can reference the regmap from 38x's pinctrl driver and
have a specific callback for the i2s/spdif muxing. The reg property
range of the system-controller should be extended but that definitely
depends on what the you or the Free Electron guys know about the
register set.

Sebastian
Marcin Wojtas
2015-02-26 09:05:08 UTC
Permalink
2015-02-26 1:30 GMT+01:00 Sebastian Hesselbarth
Post by Sebastian Hesselbarth
Post by Marcin Wojtas
2015-02-25 23:21 GMT+01:00 Sebastian Hesselbarth
[...]
Post by Marcin Wojtas
Post by Sebastian Hesselbarth
sorry but NACK. The PLL and SoC ctrl are not even close to the audio
controller registers.
Post by Marcin Wojtas
+ interrupts = <GIC_SPI 75
IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gateclk 0>;
+ clock-names = "internal";
How about providing access to audio PLL by a clock driver and amend the
binding to allow for a second more precise PLL clock, e.g.
clocks = <&gateclk CLK_AUDIO>, <&pll PLL_AUDIO>;
clock-names = "internal", "pll";
Good idea. Would you suggest some name for the new driver? How about
clk-audio.c under drivers/clk/mvebu?
Depends on how much you know about the surrounding registers.
If there is more clock related stuff than the audio pll, I'd
suggest to have a single driver for everything that isn't core clk
or clk gates which are already dealt with in the existing driver.
Given that this PLL layout is most likely 38x specific, I'd just put it
into drivers/clk/mvebu/armada-38x.c.
Audio PLL setting consist of 3 independent registers. Adjacent to them
there are same regs for setting TDM PLL, however I doubt telephony is
going to be supported in foreseeable future (if ever). I also have
strong suspicion that it looks the same in a39x - Thomas, can you take
a look into spec and confirm?
Post by Sebastian Hesselbarth
Post by Marcin Wojtas
Post by Sebastian Hesselbarth
we already check for an "extclk" on Dove for the same reason but the
name might be misleading here.
Also, i2c/spdif muxing option could be handled by 38x's pinctrl driver,
we have the same for Dove's internal i2c mux.
If you want to use i2s you just add the option to the default pinctrl
pinctrl-0 = <&i2s_pins &audio_mux_i2s>;
pinctrl-names = "default";
Ok, I'll try to extend pinctrl-armada-38x then.
IMHO, adding a "syscon" compatible to the system-controller node should
do the trick. You can reference the regmap from 38x's pinctrl driver and
have a specific callback for the i2s/spdif muxing. The reg property
range of the system-controller should be extended but that definitely
depends on what the you or the Free Electron guys know about the
register set.
For muxing i2s/spdif (exactly the same pins, they have to be either
way set to "audio" by pinctrl driver) changing single bit in "system
control 1" register is needed. However currently unused, it is already
a part of system-***@18200 regmap. Anyway the will of using
either i2s or spdif has to be explicitly represented in DT.

Marcin
Thomas Petazzoni
2015-02-27 14:03:03 UTC
Permalink
Dear Marcin Wojtas,
Post by Marcin Wojtas
Post by Sebastian Hesselbarth
Given that this PLL layout is most likely 38x specific, I'd just put it
into drivers/clk/mvebu/armada-38x.c.
Audio PLL setting consist of 3 independent registers. Adjacent to them
there are same regs for setting TDM PLL, however I doubt telephony is
going to be supported in foreseeable future (if ever).
We might be supporting TDM in the future. However, I see that the TDM
PLL registers are nicely grouped together, and so are the audio PLL
registers. So maybe we can simply have a driver to handle the audio PLL
registers for now, and if needed in the future, add another driver for
the TDM PLL.
Post by Marcin Wojtas
I also have strong suspicion that it looks the same in a39x - Thomas,
can you take a look into spec and confirm?
As far as I can see there is no audio support and no TDM support in
Armada 39x, and there those TDM PLL and audio PLL registers do not
exist.

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
Marcin Wojtas
2015-02-27 20:22:30 UTC
Permalink
Dear Thomas,

2015-02-27 15:03 GMT+01:00 Thomas Petazzoni
Post by Thomas Petazzoni
Dear Marcin Wojtas,
Post by Marcin Wojtas
Post by Sebastian Hesselbarth
Given that this PLL layout is most likely 38x specific, I'd just put it
into drivers/clk/mvebu/armada-38x.c.
Audio PLL setting consist of 3 independent registers. Adjacent to them
there are same regs for setting TDM PLL, however I doubt telephony is
going to be supported in foreseeable future (if ever).
We might be supporting TDM in the future. However, I see that the TDM
PLL registers are nicely grouped together, and so are the audio PLL
registers. So maybe we can simply have a driver to handle the audio PLL
registers for now, and if needed in the future, add another driver for
the TDM PLL.
I've got very fresh experience of adding TDM support for A38x, it
definitely would require longer discussion.
Post by Thomas Petazzoni
Post by Marcin Wojtas
I also have strong suspicion that it looks the same in a39x - Thomas,
can you take a look into spec and confirm?
As far as I can see there is no audio support and no TDM support in
Armada 39x, and there those TDM PLL and audio PLL registers do not
exist.
Given all the facts, do you think that drivers/clk/mvebu/armada-38x.c
is a right place for adding this PLL setting support?

Best regards,
Marcin
Thomas Petazzoni
2015-02-28 09:58:44 UTC
Permalink
Dear Marcin Wojtas,
Post by Marcin Wojtas
Given all the facts, do you think that drivers/clk/mvebu/armada-38x.c
is a right place for adding this PLL setting support?
Yes, sounds right to me. I was thinking that maybe it should be in a
separate file armada-38x-pll.c, but that's probably a bit too much for
something that is causing to be a simple driver.

Have you sorted out how to handle the pin-muxing part of the problem?

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
Marcin Wojtas
2015-03-04 12:12:10 UTC
Permalink
Hello Thomas,

2015-02-28 10:58 GMT+01:00 Thomas Petazzoni
Post by Thomas Petazzoni
Dear Marcin Wojtas,
Post by Marcin Wojtas
Given all the facts, do you think that drivers/clk/mvebu/armada-38x.c
is a right place for adding this PLL setting support?
Yes, sounds right to me. I was thinking that maybe it should be in a
separate file armada-38x-pll.c, but that's probably a bit too much for
something that is causing to be a simple driver.
I've got an update from Marvell - TDM and Audio are available in A39x
and it has the same set of PLL registers like A38x. Shouldn't we place
PLL configuration code in a separate file like clk-pll.c, or whatever
name you suggest?
Post by Thomas Petazzoni
Have you sorted out how to handle the pin-muxing part of the problem?
I'll come back with this issue after implementing poc.

Best regards,
Marcin
Thomas Petazzoni
2015-03-04 12:29:50 UTC
Permalink
Dear Marcin Wojtas,
Post by Marcin Wojtas
I've got an update from Marvell - TDM and Audio are available in A39x
and it has the same set of PLL registers like A38x. Shouldn't we place
PLL configuration code in a separate file like clk-pll.c, or whatever
name you suggest?
Yes, seems OK to me.
Post by Marcin Wojtas
Post by Thomas Petazzoni
Have you sorted out how to handle the pin-muxing part of the problem?
I'll come back with this issue after implementing poc.
Ok, thanks!

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
Marcin Wojtas
2015-02-25 21:58:01 UTC
Permalink
This commit adds the necessary Device Tree information to enable
audio support on the Armada 385 DB platform. In details it:

* Instantiates the CS42L51 audio codec on the I2C0 bus

* Adds simple-card DT binding for audio on Armada 385 DB

* Adds description for both analog I2S and S/PDIF I/O

Signed-off-by: Marcin Wojtas <***@semihalf.com>
---
arch/arm/boot/dts/armada-388-db.dts | 69 +++++++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)

diff --git a/arch/arm/boot/dts/armada-388-db.dts b/arch/arm/boot/dts/armada-388-db.dts
index 16512ef..bf5d9a7 100644
--- a/arch/arm/boot/dts/armada-388-db.dts
+++ b/arch/arm/boot/dts/armada-388-db.dts
@@ -82,6 +82,11 @@
***@11000 {
status = "okay";
clock-frequency = <100000>;
+ audio_codec: audio-***@4a {
+ #sound-dai-cells = <0>;
+ compatible = "cirrus,cs42l51";
+ reg = <0x4a>;
+ };
};

***@11100 {
@@ -158,6 +163,12 @@
no-1-8-v;
};

+ audio-***@e8000 {
+ pinctrl-0 = <&i2s_pins>;
+ pinctrl-names = "default";
+ status = "okay";
+ };
+
***@f0000 {
status = "okay";
};
@@ -183,4 +194,62 @@
};
};
};
+
+ sound {
+ compatible = "simple-audio-card";
+ simple-audio-card,name = "Armada 385 DB Audio";
+ simple-audio-card,mclk-fs = <256>;
+ simple-audio-card,widgets =
+ "Headphone", "Out Jack",
+ "Line", "In Jack";
+ simple-audio-card,routing =
+ "Out Jack", "HPL",
+ "Out Jack", "HPR",
+ "AIN1L", "In Jack",
+ "AIN1R", "In Jack";
+ status = "okay";
+
+ simple-audio-card,dai-***@0 {
+ format = "i2s";
+ cpu {
+ sound-dai = <&audio_controller 0>;
+ };
+
+ codec {
+ sound-dai = <&audio_codec>;
+ };
+ };
+
+ simple-audio-card,dai-***@1 {
+ format = "i2s";
+ cpu {
+ sound-dai = <&audio_controller 1>;
+ };
+
+ codec {
+ sound-dai = <&spdif_out>;
+ };
+ };
+
+ simple-audio-card,dai-***@2 {
+ format = "i2s";
+ cpu {
+ sound-dai = <&audio_controller 1>;
+ };
+
+ codec {
+ sound-dai = <&spdif_in>;
+ };
+ };
+ };
+
+ spdif_out: spdif-out {
+ #sound-dai-cells = <0>;
+ compatible = "linux,spdif-dit";
+ };
+
+ spdif_in: spdif-in {
+ #sound-dai-cells = <0>;
+ compatible = "linux,spdif-dir";
+ };
};
--
1.8.3.1
Marcin Wojtas
2015-02-25 21:58:02 UTC
Permalink
Armada 370 DB sound complex comprise routing for audio card with a
description for analog input line with two left channels instead of a pair
left-right. This commit introduces routing for the right channel.

Fixes: a6b334514be2 ("use simple-card DT binding for audio on Armada 370 DB")

Signed-off-by: Marcin Wojtas <***@semihalf.com>
Cc: <***@vger.kernel.org> # v3.19+
Reviewed-by: Thomas Petazzoni <***@free-electrons.com>
---
arch/arm/boot/dts/armada-370-db.dts | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/armada-370-db.dts b/arch/arm/boot/dts/armada-370-db.dts
index e993c46..d6e72a4 100644
--- a/arch/arm/boot/dts/armada-370-db.dts
+++ b/arch/arm/boot/dts/armada-370-db.dts
@@ -190,7 +190,7 @@
"Out Jack", "HPL",
"Out Jack", "HPR",
"AIN1L", "In Jack",
- "AIN1L", "In Jack";
+ "AIN1R", "In Jack";
status = "okay";

simple-audio-card,dai-***@0 {
--
1.8.3.1
Marcin Wojtas
2015-08-06 17:00:27 UTC
Permalink
Hello,

This is a set of three patches that fix long-lasting problems implemented in
the initial support for the Armada 375 network controller.

Due to an inappropriate concept of handling the per-CPU sent packets'
processing on TX path the driver numerous problems occured, such as RCU
stalls. Those have been fixed, of which details you can find in the commit
logs. The patches were intensively tested on top of v4.2-rc5.

I'm looking forward to any comments or remarks.

Thanks,
Marcin

Marcin Wojtas (3):
net: mvpp2: remove excessive spinlocks from driver initialization
net: mvpp2: enable proper per-CPU TX buffers unmapping
net: mvpp2: replace TX coalescing interrupts with hrtimer

drivers/net/ethernet/marvell/mvpp2.c | 244 ++++++++++++++++++++++++-----------
1 file changed, 167 insertions(+), 77 deletions(-)
--
1.8.3.1
Marcin Wojtas
2015-08-06 17:00:28 UTC
Permalink
Using spinlocks protection during one-time driver initialization is not
necessary. Moreover it resulted in invalid GFP_KERNEL allocation under the lock.

This commit removes redundant spinlocks from buffer manager part of mvpp2
initialization.

Signed-off-by: Marcin Wojtas <***@semihalf.com>
Reported-by: Alexandre Fournier <***@wisp-e.com>
---
drivers/net/ethernet/marvell/mvpp2.c | 15 ---------------
1 file changed, 15 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 3e8b1bf..f94bd12 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -913,8 +913,6 @@ struct mvpp2_bm_pool {
/* Occupied buffers indicator */
atomic_t in_use;
int in_use_thresh;
-
- spinlock_t lock;
};

struct mvpp2_buff_hdr {
@@ -3376,7 +3374,6 @@ static int mvpp2_bm_pool_create(struct platform_device *pdev,
bm_pool->pkt_size = 0;
bm_pool->buf_num = 0;
atomic_set(&bm_pool->in_use, 0);
- spin_lock_init(&bm_pool->lock);

return 0;
}
@@ -3647,7 +3644,6 @@ static struct mvpp2_bm_pool *
mvpp2_bm_pool_use(struct mvpp2_port *port, int pool, enum mvpp2_bm_type type,
int pkt_size)
{
- unsigned long flags = 0;
struct mvpp2_bm_pool *new_pool = &port->priv->bm_pools[pool];
int num;

@@ -3656,8 +3652,6 @@ mvpp2_bm_pool_use(struct mvpp2_port *port, int pool, enum mvpp2_bm_type type,
return NULL;
}

- spin_lock_irqsave(&new_pool->lock, flags);
-
if (new_pool->type == MVPP2_BM_FREE)
new_pool->type = type;

@@ -3686,8 +3680,6 @@ mvpp2_bm_pool_use(struct mvpp2_port *port, int pool, enum mvpp2_bm_type type,
if (num != pkts_num) {
WARN(1, "pool %d: %d of %d allocated\n",
new_pool->id, num, pkts_num);
- /* We need to undo the bufs_add() allocations */
- spin_unlock_irqrestore(&new_pool->lock, flags);
return NULL;
}
}
@@ -3695,15 +3687,12 @@ mvpp2_bm_pool_use(struct mvpp2_port *port, int pool, enum mvpp2_bm_type type,
mvpp2_bm_pool_bufsize_set(port->priv, new_pool,
MVPP2_RX_BUF_SIZE(new_pool->pkt_size));

- spin_unlock_irqrestore(&new_pool->lock, flags);
-
return new_pool;
}

/* Initialize pools for swf */
static int mvpp2_swf_bm_pool_init(struct mvpp2_port *port)
{
- unsigned long flags = 0;
int rxq;

if (!port->pool_long) {
@@ -3714,9 +3703,7 @@ static int mvpp2_swf_bm_pool_init(struct mvpp2_port *port)
if (!port->pool_long)
return -ENOMEM;

- spin_lock_irqsave(&port->pool_long->lock, flags);
port->pool_long->port_map |= (1 << port->id);
- spin_unlock_irqrestore(&port->pool_long->lock, flags);

for (rxq = 0; rxq < rxq_number; rxq++)
mvpp2_rxq_long_pool_set(port, rxq, port->pool_long->id);
@@ -3730,9 +3717,7 @@ static int mvpp2_swf_bm_pool_init(struct mvpp2_port *port)
if (!port->pool_short)
return -ENOMEM;

- spin_lock_irqsave(&port->pool_short->lock, flags);
port->pool_short->port_map |= (1 << port->id);
- spin_unlock_irqrestore(&port->pool_short->lock, flags);

for (rxq = 0; rxq < rxq_number; rxq++)
mvpp2_rxq_short_pool_set(port, rxq,
--
1.8.3.1
Marcin Wojtas
2015-08-06 17:00:29 UTC
Permalink
mvpp2 driver allows usage of per-CPU TX processing. Once the packets are
prepared independetly on each CPU, the hardware enqueues the descriptors in
common TX queue. After they are sent, the buffers and associated sk_buffs
should be released on the corresponding CPU.

This is why a special index is maintained in order to point to the right data to
be released after transmission takes place. Each per-CPU TX queue comprise an
array of sent sk_buffs, freed in mvpp2_txq_bufs_free function. However, the
index was used there also for obtaining a descriptor (and therefore a buffer to
be DMA-unmapped) from common TX queue, which was wrong, because it was not
referring to the current CPU.

This commit enables proper unmapping of sent data buffers by indexing them in
per-CPU queues using a dedicated array for keeping their physical addresses.

Signed-off-by: Marcin Wojtas <***@semihalf.com>
---
drivers/net/ethernet/marvell/mvpp2.c | 52 +++++++++++++++++++++++++-----------
1 file changed, 37 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index f94bd12..3e25d314 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -776,6 +776,9 @@ struct mvpp2_txq_pcpu {
/* Array of transmitted skb */
struct sk_buff **tx_skb;

+ /* Array of transmitted buffers' physical addresses */
+ dma_addr_t *tx_buffs;
+
/* Index of last TX DMA descriptor that was inserted */
int txq_put_index;

@@ -961,9 +964,13 @@ static void mvpp2_txq_inc_get(struct mvpp2_txq_pcpu *txq_pcpu)
}

static void mvpp2_txq_inc_put(struct mvpp2_txq_pcpu *txq_pcpu,
- struct sk_buff *skb)
+ struct sk_buff *skb,
+ struct mvpp2_tx_desc *tx_desc)
{
txq_pcpu->tx_skb[txq_pcpu->txq_put_index] = skb;
+ if (skb)
+ txq_pcpu->tx_buffs[txq_pcpu->txq_put_index] =
+ tx_desc->buf_phys_addr;
txq_pcpu->txq_put_index++;
if (txq_pcpu->txq_put_index == txq_pcpu->size)
txq_pcpu->txq_put_index = 0;
@@ -4392,8 +4399,8 @@ static void mvpp2_txq_bufs_free(struct mvpp2_port *port,
int i;

for (i = 0; i < num; i++) {
- struct mvpp2_tx_desc *tx_desc = txq->descs +
- txq_pcpu->txq_get_index;
+ dma_addr_t buf_phys_addr =
+ txq_pcpu->tx_buffs[txq_pcpu->txq_get_index];
struct sk_buff *skb = txq_pcpu->tx_skb[txq_pcpu->txq_get_index];

mvpp2_txq_inc_get(txq_pcpu);
@@ -4401,8 +4408,8 @@ static void mvpp2_txq_bufs_free(struct mvpp2_port *port,
if (!skb)
continue;

- dma_unmap_single(port->dev->dev.parent, tx_desc->buf_phys_addr,
- tx_desc->data_size, DMA_TO_DEVICE);
+ dma_unmap_single(port->dev->dev.parent, buf_phys_addr,
+ skb_headlen(skb), DMA_TO_DEVICE);
dev_kfree_skb_any(skb);
}
}
@@ -4634,12 +4641,13 @@ static int mvpp2_txq_init(struct mvpp2_port *port,
txq_pcpu->tx_skb = kmalloc(txq_pcpu->size *
sizeof(*txq_pcpu->tx_skb),
GFP_KERNEL);
- if (!txq_pcpu->tx_skb) {
- dma_free_coherent(port->dev->dev.parent,
- txq->size * MVPP2_DESC_ALIGNED_SIZE,
- txq->descs, txq->descs_phys);
- return -ENOMEM;
- }
+ if (!txq_pcpu->tx_skb)
+ goto error;
+
+ txq_pcpu->tx_buffs = kmalloc(txq_pcpu->size *
+ sizeof(dma_addr_t), GFP_KERNEL);
+ if (!txq_pcpu->tx_buffs)
+ goto error;

txq_pcpu->count = 0;
txq_pcpu->reserved_num = 0;
@@ -4648,6 +4656,19 @@ static int mvpp2_txq_init(struct mvpp2_port *port,
}

return 0;
+
+error:
+ for_each_present_cpu(cpu) {
+ txq_pcpu = per_cpu_ptr(txq->pcpu, cpu);
+ kfree(txq_pcpu->tx_skb);
+ kfree(txq_pcpu->tx_buffs);
+ }
+
+ dma_free_coherent(port->dev->dev.parent,
+ txq->size * MVPP2_DESC_ALIGNED_SIZE,
+ txq->descs, txq->descs_phys);
+
+ return -ENOMEM;
}

/* Free allocated TXQ resources */
@@ -4660,6 +4681,7 @@ static void mvpp2_txq_deinit(struct mvpp2_port *port,
for_each_present_cpu(cpu) {
txq_pcpu = per_cpu_ptr(txq->pcpu, cpu);
kfree(txq_pcpu->tx_skb);
+ kfree(txq_pcpu->tx_buffs);
}

if (txq->descs)
@@ -5129,11 +5151,11 @@ static int mvpp2_tx_frag_process(struct mvpp2_port *port, struct sk_buff *skb,
if (i == (skb_shinfo(skb)->nr_frags - 1)) {
/* Last descriptor */
tx_desc->command = MVPP2_TXD_L_DESC;
- mvpp2_txq_inc_put(txq_pcpu, skb);
+ mvpp2_txq_inc_put(txq_pcpu, skb, tx_desc);
} else {
/* Descriptor in the middle: Not First, Not Last */
tx_desc->command = 0;
- mvpp2_txq_inc_put(txq_pcpu, NULL);
+ mvpp2_txq_inc_put(txq_pcpu, NULL, tx_desc);
}
}

@@ -5199,12 +5221,12 @@ static int mvpp2_tx(struct sk_buff *skb, struct net_device *dev)
/* First and Last descriptor */
tx_cmd |= MVPP2_TXD_F_DESC | MVPP2_TXD_L_DESC;
tx_desc->command = tx_cmd;
- mvpp2_txq_inc_put(txq_pcpu, skb);
+ mvpp2_txq_inc_put(txq_pcpu, skb, tx_desc);
} else {
/* First but not Last */
tx_cmd |= MVPP2_TXD_F_DESC | MVPP2_TXD_PADDING_DISABLE;
tx_desc->command = tx_cmd;
- mvpp2_txq_inc_put(txq_pcpu, NULL);
+ mvpp2_txq_inc_put(txq_pcpu, NULL, tx_desc);

/* Continue with other skb fragments */
if (mvpp2_tx_frag_process(port, skb, aggr_txq, txq)) {
--
1.8.3.1
Marcin Wojtas
2015-08-06 17:00:30 UTC
Permalink
The PP2 controller is capable of per-CPU TX processing, which means there are
per-CPU banked register sets and queues. Current version of the driver supports
TX packet coalescing - once on given CPU sent packets amount reaches a threshold
value, an IRQ occurs. However, there is a single interrupt line responsible for
CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not
support RSS).

When the top-half executes the interrupt cause is not known. This is why in
NAPI poll function, along with RX processing, IRQ cause register on both
CPU's is accessed in order to determine on which of them the TX coalescing
threshold might have been reached. Thus the egress processing and releasing the
buffers is able to take place on the corresponding CPU. Hitherto approach lead
to an illegal usage of on_each_cpu function in softirq context.

The problem is solved by resigning from TX coalescing interrupts and separating
egress finalization from NAPI processing. For that purpose a method of using
hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released
once a software coalescing threshold is reached. In case not all the data is
processed a timer is set on this CPU - in its interrupt context a tasklet is
scheduled in which all queues are processed. At once only one timer per-CPU can
be running, which is controlled by a dedicated flag.

This commit removes TX processing from NAPI polling function, disables hardware
coalescing and enables hrtimer with tasklet, using new per-CPU port structure
(mvpp2_port_pcpu).

Signed-off-by: Marcin Wojtas <***@semihalf.com>
---
drivers/net/ethernet/marvell/mvpp2.c | 177 +++++++++++++++++++++++++----------
1 file changed, 130 insertions(+), 47 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 3e25d314..d9884fd 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -27,6 +27,8 @@
#include <linux/of_address.h>
#include <linux/phy.h>
#include <linux/clk.h>
+#include <linux/hrtimer.h>
+#include <linux/ktime.h>
#include <uapi/linux/ppp_defs.h>
#include <net/ip.h>
#include <net/ipv6.h>
@@ -299,6 +301,7 @@

/* Coalescing */
#define MVPP2_TXDONE_COAL_PKTS_THRESH 15
+#define MVPP2_TXDONE_HRTIMER_PERIOD_NS 1000000UL
#define MVPP2_RX_COAL_PKTS 32
#define MVPP2_RX_COAL_USEC 100

@@ -660,6 +663,14 @@ struct mvpp2_pcpu_stats {
u64 tx_bytes;
};

+/* Per-CPU port control */
+struct mvpp2_port_pcpu {
+ struct hrtimer tx_done_timer;
+ bool timer_scheduled;
+ /* Tasklet for egress finalization */
+ struct tasklet_struct tx_done_tasklet;
+};
+
struct mvpp2_port {
u8 id;

@@ -679,6 +690,9 @@ struct mvpp2_port {
u32 pending_cause_rx;
struct napi_struct napi;

+ /* Per-CPU port control */
+ struct mvpp2_port_pcpu __percpu *pcpu;
+
/* Flags */
unsigned long flags;

@@ -3798,7 +3812,6 @@ static void mvpp2_interrupts_unmask(void *arg)

mvpp2_write(port->priv, MVPP2_ISR_RX_TX_MASK_REG(port->id),
(MVPP2_CAUSE_MISC_SUM_MASK |
- MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_MASK |
MVPP2_CAUSE_RXQ_OCCUP_DESC_ALL_MASK));
}

@@ -4374,23 +4387,6 @@ static void mvpp2_rx_time_coal_set(struct mvpp2_port *port,
rxq->time_coal = usec;
}

-/* Set threshold for TX_DONE pkts coalescing */
-static void mvpp2_tx_done_pkts_coal_set(void *arg)
-{
- struct mvpp2_port *port = arg;
- int queue;
- u32 val;
-
- for (queue = 0; queue < txq_number; queue++) {
- struct mvpp2_tx_queue *txq = port->txqs[queue];
-
- val = (txq->done_pkts_coal << MVPP2_TRANSMITTED_THRESH_OFFSET) &
- MVPP2_TRANSMITTED_THRESH_MASK;
- mvpp2_write(port->priv, MVPP2_TXQ_NUM_REG, txq->id);
- mvpp2_write(port->priv, MVPP2_TXQ_THRESH_REG, val);
- }
-}
-
/* Free Tx queue skbuffs */
static void mvpp2_txq_bufs_free(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq,
@@ -4425,7 +4421,7 @@ static inline struct mvpp2_rx_queue *mvpp2_get_rx_queue(struct mvpp2_port *port,
static inline struct mvpp2_tx_queue *mvpp2_get_tx_queue(struct mvpp2_port *port,
u32 cause)
{
- int queue = fls(cause >> 16) - 1;
+ int queue = fls(cause) - 1;

return port->txqs[queue];
}
@@ -4452,6 +4448,29 @@ static void mvpp2_txq_done(struct mvpp2_port *port, struct mvpp2_tx_queue *txq,
netif_tx_wake_queue(nq);
}

+static unsigned int mvpp2_tx_done(struct mvpp2_port *port, u32 cause)
+{
+ struct mvpp2_tx_queue *txq;
+ struct mvpp2_txq_pcpu *txq_pcpu;
+ unsigned int tx_todo = 0;
+
+ while (cause) {
+ txq = mvpp2_get_tx_queue(port, cause);
+ if (!txq)
+ break;
+
+ txq_pcpu = this_cpu_ptr(txq->pcpu);
+
+ if (txq_pcpu->count) {
+ mvpp2_txq_done(port, txq, txq_pcpu);
+ tx_todo += txq_pcpu->count;
+ }
+
+ cause &= ~(1 << txq->log_id);
+ }
+ return tx_todo;
+}
+
/* Rx/Tx queue initialization/cleanup methods */

/* Allocate and initialize descriptors for aggr TXQ */
@@ -4812,7 +4831,6 @@ static int mvpp2_setup_txqs(struct mvpp2_port *port)
goto err_cleanup;
}

- on_each_cpu(mvpp2_tx_done_pkts_coal_set, port, 1);
on_each_cpu(mvpp2_txq_sent_counter_clear, port, 1);
return 0;

@@ -4894,6 +4912,49 @@ static void mvpp2_link_event(struct net_device *dev)
}
}

+static void mvpp2_timer_set(struct mvpp2_port_pcpu *port_pcpu)
+{
+ ktime_t interval;
+
+ if (!port_pcpu->timer_scheduled) {
+ port_pcpu->timer_scheduled = true;
+ interval = ktime_set(0, MVPP2_TXDONE_HRTIMER_PERIOD_NS);
+ hrtimer_start(&port_pcpu->tx_done_timer, interval,
+ HRTIMER_MODE_REL_PINNED);
+ }
+}
+
+static void mvpp2_tx_proc_cb(unsigned long data)
+{
+ struct net_device *dev = (struct net_device *)data;
+ struct mvpp2_port *port = netdev_priv(dev);
+ struct mvpp2_port_pcpu *port_pcpu = this_cpu_ptr(port->pcpu);
+ unsigned int tx_todo, cause;
+
+ if (!netif_running(dev))
+ return;
+ port_pcpu->timer_scheduled = false;
+
+ /* Process all the Tx queues */
+ cause = (1 << txq_number) - 1;
+ tx_todo = mvpp2_tx_done(port, cause);
+
+ /* Set the timer in case not all the packets were processed */
+ if (tx_todo)
+ mvpp2_timer_set(port_pcpu);
+}
+
+static enum hrtimer_restart mvpp2_hr_timer_cb(struct hrtimer *timer)
+{
+ struct mvpp2_port_pcpu *port_pcpu = container_of(timer,
+ struct mvpp2_port_pcpu,
+ tx_done_timer);
+
+ tasklet_schedule(&port_pcpu->tx_done_tasklet);
+
+ return HRTIMER_NORESTART;
+}
+
/* Main RX/TX processing routines */

/* Display more error info */
@@ -5262,6 +5323,17 @@ out:
dev_kfree_skb_any(skb);
}

+ /* Finalize TX processing */
+ if (txq_pcpu->count >= txq->done_pkts_coal)
+ mvpp2_txq_done(port, txq, txq_pcpu);
+
+ /* Set the timer in case not all frags were processed */
+ if (txq_pcpu->count <= frags && txq_pcpu->count > 0) {
+ struct mvpp2_port_pcpu *port_pcpu = this_cpu_ptr(port->pcpu);
+
+ mvpp2_timer_set(port_pcpu);
+ }
+
return NETDEV_TX_OK;
}

@@ -5275,10 +5347,11 @@ static inline void mvpp2_cause_error(struct net_device *dev, int cause)
netdev_err(dev, "tx fifo underrun error\n");
}

-static void mvpp2_txq_done_percpu(void *arg)
+static int mvpp2_poll(struct napi_struct *napi, int budget)
{
- struct mvpp2_port *port = arg;
- u32 cause_rx_tx, cause_tx, cause_misc;
+ u32 cause_rx_tx, cause_rx, cause_misc;
+ int rx_done = 0;
+ struct mvpp2_port *port = netdev_priv(napi->dev);

/* Rx/Tx cause register
*
@@ -5292,7 +5365,7 @@ static void mvpp2_txq_done_percpu(void *arg)
*/
cause_rx_tx = mvpp2_read(port->priv,
MVPP2_ISR_RX_TX_CAUSE_REG(port->id));
- cause_tx = cause_rx_tx & MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_MASK;
+ cause_rx_tx &= ~MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_MASK;
cause_misc = cause_rx_tx & MVPP2_CAUSE_MISC_SUM_MASK;

if (cause_misc) {
@@ -5304,26 +5377,6 @@ static void mvpp2_txq_done_percpu(void *arg)
cause_rx_tx & ~MVPP2_CAUSE_MISC_SUM_MASK);
}

- /* Release TX descriptors */
- if (cause_tx) {
- struct mvpp2_tx_queue *txq = mvpp2_get_tx_queue(port, cause_tx);
- struct mvpp2_txq_pcpu *txq_pcpu = this_cpu_ptr(txq->pcpu);
-
- if (txq_pcpu->count)
- mvpp2_txq_done(port, txq, txq_pcpu);
- }
-}
-
-static int mvpp2_poll(struct napi_struct *napi, int budget)
-{
- u32 cause_rx_tx, cause_rx;
- int rx_done = 0;
- struct mvpp2_port *port = netdev_priv(napi->dev);
-
- on_each_cpu(mvpp2_txq_done_percpu, port, 1);
-
- cause_rx_tx = mvpp2_read(port->priv,
- MVPP2_ISR_RX_TX_CAUSE_REG(port->id));
cause_rx = cause_rx_tx & MVPP2_CAUSE_RXQ_OCCUP_DESC_ALL_MASK;

/* Process RX packets */
@@ -5568,6 +5621,8 @@ err_cleanup_rxqs:
static int mvpp2_stop(struct net_device *dev)
{
struct mvpp2_port *port = netdev_priv(dev);
+ struct mvpp2_port_pcpu *port_pcpu;
+ int cpu;

mvpp2_stop_dev(port);
mvpp2_phy_disconnect(port);
@@ -5576,6 +5631,13 @@ static int mvpp2_stop(struct net_device *dev)
on_each_cpu(mvpp2_interrupts_mask, port, 1);

free_irq(port->irq, port);
+ for_each_present_cpu(cpu) {
+ port_pcpu = per_cpu_ptr(port->pcpu, cpu);
+
+ hrtimer_cancel(&port_pcpu->tx_done_timer);
+ port_pcpu->timer_scheduled = false;
+ tasklet_kill(&port_pcpu->tx_done_tasklet);
+ }
mvpp2_cleanup_rxqs(port);
mvpp2_cleanup_txqs(port);

@@ -5791,7 +5853,6 @@ static int mvpp2_ethtool_set_coalesce(struct net_device *dev,
txq->done_pkts_coal = c->tx_max_coalesced_frames;
}

- on_each_cpu(mvpp2_tx_done_pkts_coal_set, port, 1);
return 0;
}

@@ -6042,6 +6103,7 @@ static int mvpp2_port_probe(struct platform_device *pdev,
{
struct device_node *phy_node;
struct mvpp2_port *port;
+ struct mvpp2_port_pcpu *port_pcpu;
struct net_device *dev;
struct resource *res;
const char *dt_mac_addr;
@@ -6051,7 +6113,7 @@ static int mvpp2_port_probe(struct platform_device *pdev,
int features;
int phy_mode;
int priv_common_regs_num = 2;
- int err, i;
+ int err, i, cpu;

dev = alloc_etherdev_mqs(sizeof(struct mvpp2_port), txq_number,
rxq_number);
@@ -6142,6 +6204,24 @@ static int mvpp2_port_probe(struct platform_device *pdev,
}
mvpp2_port_power_up(port);

+ port->pcpu = alloc_percpu(struct mvpp2_port_pcpu);
+ if (!port->pcpu) {
+ err = -ENOMEM;
+ goto err_free_txq_pcpu;
+ }
+
+ for_each_present_cpu(cpu) {
+ port_pcpu = per_cpu_ptr(port->pcpu, cpu);
+
+ hrtimer_init(&port_pcpu->tx_done_timer, CLOCK_MONOTONIC,
+ HRTIMER_MODE_REL_PINNED);
+ port_pcpu->tx_done_timer.function = mvpp2_hr_timer_cb;
+ port_pcpu->timer_scheduled = false;
+
+ tasklet_init(&port_pcpu->tx_done_tasklet, mvpp2_tx_proc_cb,
+ (unsigned long)dev);
+ }
+
netif_napi_add(dev, &port->napi, mvpp2_poll, NAPI_POLL_WEIGHT);
features = NETIF_F_SG | NETIF_F_IP_CSUM;
dev->features = features | NETIF_F_RXCSUM;
@@ -6151,7 +6231,7 @@ static int mvpp2_port_probe(struct platform_device *pdev,
err = register_netdev(dev);
if (err < 0) {
dev_err(&pdev->dev, "failed to register netdev\n");
- goto err_free_txq_pcpu;
+ goto err_free_port_pcpu;
}
netdev_info(dev, "Using %s mac address %pM\n", mac_from, dev->dev_addr);

@@ -6160,6 +6240,8 @@ static int mvpp2_port_probe(struct platform_device *pdev,
priv->port_list[id] = port;
return 0;

+err_free_port_pcpu:
+ free_percpu(port->pcpu);
err_free_txq_pcpu:
for (i = 0; i < txq_number; i++)
free_percpu(port->txqs[i]->pcpu);
@@ -6178,6 +6260,7 @@ static void mvpp2_port_remove(struct mvpp2_port *port)
int i;

unregister_netdev(port->dev);
+ free_percpu(port->pcpu);
free_percpu(port->stats);
for (i = 0; i < txq_number; i++)
free_percpu(port->txqs[i]->pcpu);
--
1.8.3.1
David Miller
2015-08-10 17:57:36 UTC
Permalink
From: Marcin Wojtas <***@semihalf.com>
Date: Thu, 6 Aug 2015 19:00:27 +0200
Post by Marcin Wojtas
This is a set of three patches that fix long-lasting problems implemented in
the initial support for the Armada 375 network controller.
Due to an inappropriate concept of handling the per-CPU sent packets'
processing on TX path the driver numerous problems occured, such as RCU
stalls. Those have been fixed, of which details you can find in the commit
logs. The patches were intensively tested on top of v4.2-rc5.
I'm looking forward to any comments or remarks.
Series applied, thanks.
Sunil Goutham
2015-11-30 18:18:56 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This patch series contains fixes for various issues and support for
HW TSO which is added in pass2 silicon.

Sunil Goutham (9):
net: thunderx: Increase transmit queue length
net: thunderx: Set CQ timer threshold properly
net: thunderx: Switchon carrier only upon interface link up
net: thunderx: Remove unnecessary rcv buffer start address management
net: thunderx: Enable BGX LMAC's RX/TX only after VF is up
net: thunderx: HW TSO support for pass2 chips
net: thunderx: Enable CQE count threshold interrupt for pass2 chip
net: thunderx: HW errata workaround for non-tunneled TSO pkts
net: thunderx: Fix for duplicate CQEs by HW for TSO packets

Thanneeru Srinivasulu (3):
net: thunderx: Force to load octeon-mdio before bgx driver.
net: thunderx: Wait for delayed work to finish before destroying it
net: thunderx: Add TX timeout, RX buff allocation fail to driver
stats.

Yury Norov (1):
net: thunderx: nicvf_queues: nivc_*_intr: remove duplication

drivers/net/ethernet/cavium/thunder/nic.h | 22 ++-
drivers/net/ethernet/cavium/thunder/nic_main.c | 34 ++-
drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c | 18 ++-
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 20 ++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 219 +++++++-------------
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 19 +-
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++--
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 28 +++-
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 +
9 files changed, 189 insertions(+), 203 deletions(-)
David Miller
2015-11-30 19:32:26 UTC
Permalink
From: Sunil Goutham <***@gmail.com>
Date: Mon, 30 Nov 2015 23:48:56 +0530
Post by Sunil Goutham
This patch series contains fixes for various issues and support for
HW TSO which is added in pass2 silicon.
Please do not mix bug fixes and support for new features and cleanups.

First, post a separate series of just the bug fixes, targetting the
'net' tree.

Then once that propagates into 'net-next', you can submit the feature
additions and cleanups relative to that.

Thank you.
Sunil Goutham
2015-11-30 18:18:57 UTC
Permalink
From: Thanneeru Srinivasulu <***@caviumnetworks.com>

Signed-off-by: Thanneeru Srinivasulu <***@caviumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +++
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 1 +
2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 180aa9f..1fb72dd 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1009,6 +1009,9 @@ static int bgx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
struct bgx *bgx = NULL;
u8 lmac;

+ /*Load octeon mdio driver*/
+ octeon_mdiobus_force_mod_depencency();
+
bgx = devm_kzalloc(dev, sizeof(*bgx), GFP_KERNEL);
if (!bgx)
return -ENOMEM;
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
index 07b7ec6..89a02fa 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
@@ -182,6 +182,7 @@ enum MCAST_MODE {
#define BCAST_ACCEPT 1
#define CAM_ACCEPT 1

+void octeon_mdiobus_force_mod_depencency(void);
void bgx_add_dmac_addr(u64 dmac, int node, int bgx_idx, int lmac);
unsigned bgx_get_map(int node);
int bgx_get_lmac_count(int node, int bgx);
--
1.7.1
Sunil Goutham
2015-11-30 18:18:58 UTC
Permalink
From: Thanneeru Srinivasulu <***@caviumnetworks.com>

While VNIC or BGX driver teardown, wait for already scheduled delayed work to
finish before destroying it.

Signed-off-by: Thanneeru Srinivasulu <***@caviumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic_main.c | 3 +--
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index c561fdc..cfc24a1 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1074,8 +1074,7 @@ static void nic_remove(struct pci_dev *pdev)

if (nic->check_link) {
/* Destroy work Queue */
- cancel_delayed_work(&nic->dwork);
- flush_workqueue(nic->check_link);
+ cancel_delayed_work_sync(&nic->dwork);
destroy_workqueue(nic->check_link);
}

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 1fb72dd..2076ac3 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -695,8 +695,7 @@ static void bgx_lmac_disable(struct bgx *bgx, u8 lmacid)
lmac = &bgx->lmac[lmacid];
if (lmac->check_link) {
/* Destroy work queue */
- cancel_delayed_work(&lmac->dwork);
- flush_workqueue(lmac->check_link);
+ cancel_delayed_work_sync(&lmac->dwork);
destroy_workqueue(lmac->check_link);
}
--
1.7.1
Sunil Goutham
2015-11-30 18:19:00 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Properly set CQ timer threshold and also set it to 2us.
With previous incorrect settings it was set to 0.5us which is too less.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 5 ++---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 2 +-
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 2 +-
3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index d3950b2..39ca674 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -120,10 +120,9 @@
* Calculated for SCLK of 700Mhz
* value written should be a 1/16th of what is expected
*
- * 1 tick per 0.05usec = value of 2.2
- * This 10% would be covered in CQ timer thresh value
+ * 1 tick per 0.025usec
*/
-#define NICPF_CLK_PER_INT_TICK 2
+#define NICPF_CLK_PER_INT_TICK 1

/* Time to wait before we decide that a SQ is stuck.
*
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index e404ea8..206b6a7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -592,7 +592,7 @@ void nicvf_cmp_queue_config(struct nicvf *nic, struct queue_set *qs,
/* Set threshold value for interrupt generation */
nicvf_queue_reg_write(nic, NIC_QSET_CQ_0_7_THRESH, qidx, cq->thresh);
nicvf_queue_reg_write(nic, NIC_QSET_CQ_0_7_CFG2,
- qidx, nic->cq_coalesce_usecs);
+ qidx, CMP_QUEUE_TIMER_THRESH);
}

/* Configures transmit queue */
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index b1e93a9..457eaee 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -76,7 +76,7 @@
#define CMP_QSIZE CMP_QUEUE_SIZE3
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
#define CMP_QUEUE_CQE_THRESH 0
-#define CMP_QUEUE_TIMER_THRESH 220 /* 10usec */
+#define CMP_QUEUE_TIMER_THRESH 80 /* ~2usec */

#define RBDR_SIZE RBDR_SIZE0
#define RCV_BUF_COUNT (1ULL << (RBDR_SIZE + 13))
--
1.7.1
Sunil Goutham
2015-11-30 18:18:59 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Under high transmit rates and with TSO enabled observing fluctuations
in TX performance. Seen especially with iperf3 application.

Since TSO is taken care at driver level, with 64KB of TSO packets
and when window size is also high the rate at which CPU fills in
transmit descriptors is much higher than what HW is able to process.
Each 64KB TSO packet occupies gets segmented to ~43 1500 byte MTU packets
and occupies ~130 descriptors. Hence increasing transmit queue size.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index fb4957d..b1e93a9 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -62,7 +62,7 @@
#define SND_QUEUE_CNT 8
#define CMP_QUEUE_CNT 8 /* Max of RCV and SND qcount */

-#define SND_QSIZE SND_QUEUE_SIZE2
+#define SND_QSIZE SND_QUEUE_SIZE3
#define SND_QUEUE_LEN (1ULL << (SND_QSIZE + 10))
#define MAX_SND_QUEUE_LEN (1ULL << (SND_QUEUE_SIZE6 + 10))
#define SND_QUEUE_THRESH 2ULL
@@ -73,7 +73,7 @@
/* Keep CQ and SQ sizes same, if timestamping
* is enabled this equation will change.
*/
-#define CMP_QSIZE CMP_QUEUE_SIZE2
+#define CMP_QSIZE CMP_QUEUE_SIZE3
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
#define CMP_QUEUE_CQE_THRESH 0
#define CMP_QUEUE_TIMER_THRESH 220 /* 10usec */
--
1.7.1
Sunil Goutham
2015-11-30 18:19:01 UTC
Permalink
From: Yury Norov <***@auriga.com>

The same switch-case repeates for nivc_*_intr functions.
In this patch it is moved to a helper nicvf_int_type_to_mask().

By the way:
- Unneeded write to NICVF register dropped if int_type is unknown.
- netdev_dbg() is used instead of netdev_err().

Signed-off-by: Yury Norov <***@auriga.com>
Signed-off-by: Aleksey Makarov <***@caviumnetworks.com>
Acked-by: Vadim Lomovtsev <***@caiumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 140 ++++++--------------
1 files changed, 40 insertions(+), 100 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 206b6a7..99a29d0 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1234,153 +1234,93 @@ struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic, struct cqe_rx_t *cqe_rx)
return skb;
}

-/* Enable interrupt */
-void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx)
+static u64 nicvf_int_type_to_mask(int int_type, int q_idx)
{
u64 reg_val;

- reg_val = nicvf_reg_read(nic, NIC_VF_ENA_W1S);
-
switch (int_type) {
case NICVF_INTR_CQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
+ reg_val = ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
break;
case NICVF_INTR_SQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
+ reg_val = ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
break;
case NICVF_INTR_RBDR:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
+ reg_val = ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
break;
case NICVF_INTR_PKT_DROP:
- reg_val |= (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
break;
case NICVF_INTR_TCP_TIMER:
- reg_val |= (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
break;
case NICVF_INTR_MBOX:
- reg_val |= (1ULL << NICVF_INTR_MBOX_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_MBOX_SHIFT);
break;
case NICVF_INTR_QS_ERR:
- reg_val |= (1ULL << NICVF_INTR_QS_ERR_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_QS_ERR_SHIFT);
break;
default:
- netdev_err(nic->netdev,
- "Failed to enable interrupt: unknown type\n");
- break;
+ reg_val = 0;
}

- nicvf_reg_write(nic, NIC_VF_ENA_W1S, reg_val);
+ return reg_val;
+}
+
+/* Enable interrupt */
+void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx)
+{
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);
+
+ if (!mask) {
+ netdev_dbg(nic->netdev,
+ "Failed to enable interrupt: unknown type\n");
+ return;
+ }
+ nicvf_reg_write(nic, NIC_VF_ENA_W1S,
+ nicvf_reg_read(nic, NIC_VF_ENA_W1S) | mask);
}

/* Disable interrupt */
void nicvf_disable_intr(struct nicvf *nic, int int_type, int q_idx)
{
- u64 reg_val = 0;
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);

- switch (int_type) {
- case NICVF_INTR_CQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
- break;
- case NICVF_INTR_SQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
- break;
- case NICVF_INTR_RBDR:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
- break;
- case NICVF_INTR_PKT_DROP:
- reg_val |= (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
- break;
- case NICVF_INTR_TCP_TIMER:
- reg_val |= (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
- break;
- case NICVF_INTR_MBOX:
- reg_val |= (1ULL << NICVF_INTR_MBOX_SHIFT);
- break;
- case NICVF_INTR_QS_ERR:
- reg_val |= (1ULL << NICVF_INTR_QS_ERR_SHIFT);
- break;
- default:
- netdev_err(nic->netdev,
+ if (!mask) {
+ netdev_dbg(nic->netdev,
"Failed to disable interrupt: unknown type\n");
- break;
+ return;
}

- nicvf_reg_write(nic, NIC_VF_ENA_W1C, reg_val);
+ nicvf_reg_write(nic, NIC_VF_ENA_W1C, mask);
}

/* Clear interrupt */
void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx)
{
- u64 reg_val = 0;
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);

- switch (int_type) {
- case NICVF_INTR_CQ:
- reg_val = ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
- break;
- case NICVF_INTR_SQ:
- reg_val = ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
- break;
- case NICVF_INTR_RBDR:
- reg_val = ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
- break;
- case NICVF_INTR_PKT_DROP:
- reg_val = (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
- break;
- case NICVF_INTR_TCP_TIMER:
- reg_val = (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
- break;
- case NICVF_INTR_MBOX:
- reg_val = (1ULL << NICVF_INTR_MBOX_SHIFT);
- break;
- case NICVF_INTR_QS_ERR:
- reg_val |= (1ULL << NICVF_INTR_QS_ERR_SHIFT);
- break;
- default:
- netdev_err(nic->netdev,
+ if (!mask) {
+ netdev_dbg(nic->netdev,
"Failed to clear interrupt: unknown type\n");
- break;
+ return;
}

- nicvf_reg_write(nic, NIC_VF_INT, reg_val);
+ nicvf_reg_write(nic, NIC_VF_INT, mask);
}

/* Check if interrupt is enabled */
int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx)
{
- u64 reg_val;
- u64 mask = 0xff;
-
- reg_val = nicvf_reg_read(nic, NIC_VF_ENA_W1S);
-
- switch (int_type) {
- case NICVF_INTR_CQ:
- mask = ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
- break;
- case NICVF_INTR_SQ:
- mask = ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
- break;
- case NICVF_INTR_RBDR:
- mask = ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
- break;
- case NICVF_INTR_PKT_DROP:
- mask = NICVF_INTR_PKT_DROP_MASK;
- break;
- case NICVF_INTR_TCP_TIMER:
- mask = NICVF_INTR_TCP_TIMER_MASK;
- break;
- case NICVF_INTR_MBOX:
- mask = NICVF_INTR_MBOX_MASK;
- break;
- case NICVF_INTR_QS_ERR:
- mask = NICVF_INTR_QS_ERR_MASK;
- break;
- default:
- netdev_err(nic->netdev,
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);
+ /* If interrupt type is unknown, we treat it disabled. */
+ if (!mask) {
+ netdev_dbg(nic->netdev,
"Failed to check interrupt enable: unknown type\n");
- break;
+ return 0;
}

- return (reg_val & mask);
+ return mask & nicvf_reg_read(nic, NIC_VF_ENA_W1S);
}

void nicvf_update_rq_stats(struct nicvf *nic, int rq_idx)
--
1.7.1
Sunil Goutham
2015-11-30 18:19:02 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Call netif_carrier_on() only if interface's link is up. Switching this on
upon IFF_UP by default, is causing issues with ethernet channel bonding
in LACP mode. Initial NETDEV_CHANGE notification was being skipped.

Also fixed some issues with link/speed/duplex reporting via ethtool.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
.../net/ethernet/cavium/thunder/nicvf_ethtool.c | 16 +++++++++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 4 +---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 ++
3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
index af54c10..a12b2e3 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
@@ -112,6 +112,13 @@ static int nicvf_get_settings(struct net_device *netdev,

cmd->supported = 0;
cmd->transceiver = XCVR_EXTERNAL;
+
+ if (!nic->link_up) {
+ cmd->duplex = DUPLEX_UNKNOWN;
+ ethtool_cmd_speed_set(cmd, SPEED_UNKNOWN);
+ return 0;
+ }
+
if (nic->speed <= 1000) {
cmd->port = PORT_MII;
cmd->autoneg = AUTONEG_ENABLE;
@@ -125,6 +132,13 @@ static int nicvf_get_settings(struct net_device *netdev,
return 0;
}

+static u32 nicvf_get_link(struct net_device *netdev)
+{
+ struct nicvf *nic = netdev_priv(netdev);
+
+ return nic->link_up;
+}
+
static void nicvf_get_drvinfo(struct net_device *netdev,
struct ethtool_drvinfo *info)
{
@@ -660,7 +674,7 @@ static int nicvf_set_channels(struct net_device *dev,

static const struct ethtool_ops nicvf_ethtool_ops = {
.get_settings = nicvf_get_settings,
- .get_link = ethtool_op_get_link,
+ .get_link = nicvf_get_link,
.get_drvinfo = nicvf_get_drvinfo,
.get_msglevel = nicvf_get_msglevel,
.set_msglevel = nicvf_set_msglevel,
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 7f709cb..dde8dc7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1057,6 +1057,7 @@ int nicvf_stop(struct net_device *netdev)

netif_carrier_off(netdev);
netif_tx_stop_all_queues(nic->netdev);
+ nic->link_up = false;

/* Teardown secondary qsets first */
if (!nic->sqs_mode) {
@@ -1211,9 +1212,6 @@ int nicvf_open(struct net_device *netdev)
nic->drv_stats.txq_stop = 0;
nic->drv_stats.txq_wake = 0;

- netif_carrier_on(netdev);
- netif_tx_start_all_queues(netdev);
-
return 0;
cleanup:
nicvf_disable_intr(nic, NICVF_INTR_MBOX, 0);
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 2076ac3..d9f27ad 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -612,6 +612,8 @@ static void bgx_poll_for_link(struct work_struct *work)
lmac->last_duplex = 1;
} else {
lmac->link_up = 0;
+ lmac->last_speed = SPEED_UNKNOWN;
+ lmac->last_duplex = DUPLEX_UNKNOWN;
}

if (lmac->last_link != lmac->link_up) {
--
1.7.1
Sunil Goutham
2015-11-30 18:19:04 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Enable or disable BGX LMAC's RX/TX based on corresponding VF's
status. If otherwise, when multiple LMAC's physical link is up
then packets from all LMAC's whose corresponding VF is not yet
initialized will get forwarded to VF0. This is due to VNIC's default
configuration where CPI, RSSI e.t.c point to VF0/QSET0/RQ0.

This patch will prevent multiple copies of packets on VF0.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic_main.c | 20 +++++++++++++++++++-
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 20 ++++++++++++++++++--
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 1 +
3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index cfc24a1..4b7fd63 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -37,6 +37,7 @@ struct nicpf {
#define NIC_GET_BGX_FROM_VF_LMAC_MAP(map) ((map >> 4) & 0xF)
#define NIC_GET_LMAC_FROM_VF_LMAC_MAP(map) (map & 0xF)
u8 vf_lmac_map[MAX_LMAC];
+ u8 lmac_cnt;
struct delayed_work dwork;
struct workqueue_struct *check_link;
u8 link[MAX_LMAC];
@@ -279,6 +280,7 @@ static void nic_set_lmac_vf_mapping(struct nicpf *nic)
u64 lmac_credit;

nic->num_vf_en = 0;
+ nic->lmac_cnt = 0;

for (bgx = 0; bgx < NIC_MAX_BGX; bgx++) {
if (!(bgx_map & (1 << bgx)))
@@ -288,6 +290,7 @@ static void nic_set_lmac_vf_mapping(struct nicpf *nic)
nic->vf_lmac_map[next_bgx_lmac++] =
NIC_SET_VF_LMAC_MAP(bgx, lmac);
nic->num_vf_en += lmac_cnt;
+ nic->lmac_cnt += lmac_cnt;

/* Program LMAC credits */
lmac_credit = (1ull << 1); /* channel credit enable */
@@ -715,6 +718,13 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
case NIC_MBOX_MSG_CFG_DONE:
/* Last message of VF config msg sequence */
nic->vf_enabled[vf] = true;
+ if (vf >= nic->lmac_cnt)
+ goto unlock;
+
+ bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+ lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+
+ bgx_lmac_rx_tx_enable(nic->node, bgx, lmac, true);
goto unlock;
case NIC_MBOX_MSG_SHUTDOWN:
/* First msg in VF teardown sequence */
@@ -722,6 +732,14 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
if (vf >= nic->num_vf_en)
nic->sqs_used[vf - nic->num_vf_en] = false;
nic->pqs_vf[vf] = 0;
+
+ if (vf >= nic->lmac_cnt)
+ break;
+
+ bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+ lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+
+ bgx_lmac_rx_tx_enable(nic->node, bgx, lmac, false);
break;
case NIC_MBOX_MSG_ALLOC_SQS:
nic_alloc_sqs(nic, &mbx.sqs_alloc);
@@ -940,7 +958,7 @@ static void nic_poll_for_link(struct work_struct *work)

mbx.link_status.msg = NIC_MBOX_MSG_BGX_LINK_CHANGE;

- for (vf = 0; vf < nic->num_vf_en; vf++) {
+ for (vf = 0; vf < nic->lmac_cnt; vf++) {
/* Poll only if VF is UP */
if (!nic->vf_enabled[vf])
continue;
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index d9f27ad..c51d5c9 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -186,6 +186,23 @@ void bgx_set_lmac_mac(int node, int bgx_idx, int lmacid, const u8 *mac)
}
EXPORT_SYMBOL(bgx_set_lmac_mac);

+void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable)
+{
+ struct bgx *bgx = bgx_vnic[(node * MAX_BGX_PER_CN88XX) + bgx_idx];
+ u64 cfg;
+
+ if (!bgx)
+ return;
+
+ cfg = bgx_reg_read(bgx, lmacid, BGX_CMRX_CFG);
+ if (enable)
+ cfg |= CMR_PKT_RX_EN | CMR_PKT_TX_EN;
+ else
+ cfg &= ~(CMR_PKT_RX_EN | CMR_PKT_TX_EN);
+ bgx_reg_write(bgx, lmacid, BGX_CMRX_CFG, cfg);
+}
+EXPORT_SYMBOL(bgx_lmac_rx_tx_enable);
+
static void bgx_sgmii_change_link_state(struct lmac *lmac)
{
struct bgx *bgx = lmac->bgx;
@@ -656,8 +673,7 @@ static int bgx_lmac_enable(struct bgx *bgx, u8 lmacid)
}

/* Enable lmac */
- bgx_reg_modify(bgx, lmacid, BGX_CMRX_CFG,
- CMR_EN | CMR_PKT_RX_EN | CMR_PKT_TX_EN);
+ bgx_reg_modify(bgx, lmacid, BGX_CMRX_CFG, CMR_EN);

/* Restore default cfg, incase low level firmware changed it */
bgx_reg_write(bgx, lmacid, BGX_CMRX_RX_DMAC_CTL, 0x03);
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
index 89a02fa..149e179 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
@@ -183,6 +183,7 @@ enum MCAST_MODE {
#define CAM_ACCEPT 1

void octeon_mdiobus_force_mod_depencency(void);
+void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable);
void bgx_add_dmac_addr(u64 dmac, int node, int bgx_idx, int lmac);
unsigned bgx_get_map(int node);
int bgx_get_lmac_count(int node, int bgx);
--
1.7.1
Sunil Goutham
2015-11-30 18:19:07 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

A HW errata mandates inner layer3 offset field of send header for a non-tunneled
TSO packet to point to L2 ethertype in the payload. Also added counter for HW
TSO packets.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 17 ++++++++++-------
1 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 5855828..4e9709e 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -943,8 +943,8 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff *skb)
* First subdescriptor for every send descriptor.
*/
static inline void
-nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry, int subdesc_cnt,
- struct sk_buff *skb, int len, bool hw_tso)
+nicvf_sq_add_hdr_subdesc(struct nicvf *nic, struct snd_queue *sq, int qentry,
+ int subdesc_cnt, struct sk_buff *skb, int len)
{
int proto;
struct sq_hdr_subdesc *hdr;
@@ -980,10 +980,13 @@ nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry, int subdesc_cnt,
}
}

- if (hw_tso && skb_shinfo(skb)->gso_size) {
+ if (nic->hw_tso && skb_shinfo(skb)->gso_size) {
hdr->tso = 1;
hdr->tso_start = skb_transport_offset(skb) + tcp_hdrlen(skb);
hdr->tso_max_paysize = skb_shinfo(skb)->gso_size;
+ /* For non-tunneled pkts, point this to L2 ethertype */
+ hdr->inner_l3_offset = skb_network_offset(skb) - 2;
+ nic->drv_stats.tx_tso++;
}
}

@@ -1054,8 +1057,8 @@ static int nicvf_sq_append_tso(struct nicvf *nic, struct snd_queue *sq,
data_left -= size;
tso_build_data(skb, &tso, size);
}
- nicvf_sq_add_hdr_subdesc(sq, hdr_qentry,
- seg_subdescs - 1, skb, seg_len, false);
+ nicvf_sq_add_hdr_subdesc(nic, sq, hdr_qentry,
+ seg_subdescs - 1, skb, seg_len);
sq->skbuff[hdr_qentry] = (u64)NULL;
qentry = nicvf_get_nxt_sqentry(sq, qentry);

@@ -1111,8 +1114,8 @@ int nicvf_sq_append_skb(struct nicvf *nic, struct sk_buff *skb)
return nicvf_sq_append_tso(nic, sq, sq_num, qentry, skb);

/* Add SQ header subdesc */
- nicvf_sq_add_hdr_subdesc(sq, qentry, subdesc_cnt - 1, skb,
- skb->len, nic->hw_tso);
+ nicvf_sq_add_hdr_subdesc(nic, sq, qentry, subdesc_cnt - 1,
+ skb, skb->len);

/* Add SQ gather subdescs */
qentry = nicvf_get_nxt_sqentry(sq, qentry);
--
1.7.1
Sunil Goutham
2015-11-30 18:19:08 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

HW sends a CQE for each segment of a TSO packet instead of a single CQE
for whole TSO packet transmitted. Each of this CQE points to the same SQE.
SW should takecare of not freeing same SQE multiple times.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 51bbe67..a75f21f 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -525,14 +525,20 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
__func__, cqe_tx->sq_qs, cqe_tx->sq_idx,
cqe_tx->sqe_ptr, hdr->subdesc_cnt);

- nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
nicvf_check_cqe_tx_errs(nic, cq, cqe_tx);
skb = (struct sk_buff *)sq->skbuff[cqe_tx->sqe_ptr];
- /* For TSO offloaded packets only one head SKB needs to be freed */
+ /* For SW TSO offloaded packets only one head SKB needs to be freed */
if (skb) {
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
prefetch(skb);
dev_consume_skb_any(skb);
sq->skbuff[cqe_tx->sqe_ptr] = (u64)NULL;
+ } else {
+ /* In case of HW TSO, a CQE_TX is added by HW for each segment.
+ * Free SQEs only once.
+ */
+ if (!nic->hw_tso)
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
}
}
--
1.7.1
Sunil Goutham
2015-11-30 18:19:03 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Since we have moved on to using allocated pages to carve receive
buffers instead of netdev_alloc_skb() there is no need to store
any pointers for later retrieval. Earlier we had to store
skb and skb->data pointers which later are used to handover
received packet to network stack.

This will avoid an unnecessary cache miss as well.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 49 ++++----------------
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 10 +---
2 files changed, 11 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 99a29d0..1fbd908 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -18,14 +18,6 @@
#include "q_struct.h"
#include "nicvf_queues.h"

-struct rbuf_info {
- struct page *page;
- void *data;
- u64 offset;
-};
-
-#define GET_RBUF_INFO(x) ((struct rbuf_info *)(x - NICVF_RCV_BUF_ALIGN_BYTES))
-
/* Poll a register for a specific value */
static int nicvf_poll_reg(struct nicvf *nic, int qidx,
u64 reg, int bit_pos, int bits, int val)
@@ -86,8 +78,6 @@ static void nicvf_free_q_desc_mem(struct nicvf *nic, struct q_desc_mem *dmem)
static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, gfp_t gfp,
u32 buf_len, u64 **rbuf)
{
- u64 data;
- struct rbuf_info *rinfo;
int order = get_order(buf_len);

/* Check if request can be accomodated in previous allocated page */
@@ -113,46 +103,28 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, gfp_t gfp,
nic->rb_page_offset = 0;
}

- data = (u64)page_address(nic->rb_page) + nic->rb_page_offset;
-
- /* Align buffer addr to cache line i.e 128 bytes */
- rinfo = (struct rbuf_info *)(data + NICVF_RCV_BUF_ALIGN_LEN(data));
- /* Save page address for reference updation */
- rinfo->page = nic->rb_page;
- /* Store start address for later retrieval */
- rinfo->data = (void *)data;
- /* Store alignment offset */
- rinfo->offset = NICVF_RCV_BUF_ALIGN_LEN(data);
+ *rbuf = (u64 *)((u64)page_address(nic->rb_page) + nic->rb_page_offset);

- data += rinfo->offset;
-
- /* Give next aligned address to hw for DMA */
- *rbuf = (u64 *)(data + NICVF_RCV_BUF_ALIGN_BYTES);
return 0;
}

-/* Retrieve actual buffer start address and build skb for received packet */
+/* Build skb around receive buffer */
static struct sk_buff *nicvf_rb_ptr_to_skb(struct nicvf *nic,
u64 rb_ptr, int len)
{
+ void *data;
struct sk_buff *skb;
- struct rbuf_info *rinfo;

- rb_ptr = (u64)phys_to_virt(rb_ptr);
- /* Get buffer start address and alignment offset */
- rinfo = GET_RBUF_INFO(rb_ptr);
+ data = phys_to_virt(rb_ptr);

/* Now build an skb to give to stack */
- skb = build_skb(rinfo->data, RCV_FRAG_LEN);
+ skb = build_skb(data, RCV_FRAG_LEN);
if (!skb) {
- put_page(rinfo->page);
+ put_page(virt_to_page(data));
return NULL;
}

- /* Set correct skb->data */
- skb_reserve(skb, rinfo->offset + NICVF_RCV_BUF_ALIGN_BYTES);
-
- prefetch((void *)rb_ptr);
+ prefetch(skb->data);
return skb;
}

@@ -196,7 +168,6 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
int head, tail;
u64 buf_addr;
struct rbdr_entry_t *desc;
- struct rbuf_info *rinfo;

if (!rbdr)
return;
@@ -212,16 +183,14 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
while (head != tail) {
desc = GET_RBDR_DESC(rbdr, head);
buf_addr = desc->buf_addr << NICVF_RCV_BUF_ALIGN;
- rinfo = GET_RBUF_INFO((u64)phys_to_virt(buf_addr));
- put_page(rinfo->page);
+ put_page(virt_to_page(phys_to_virt(buf_addr)));
head++;
head &= (rbdr->dmem.q_len - 1);
}
/* Free SKB of tail desc */
desc = GET_RBDR_DESC(rbdr, tail);
buf_addr = desc->buf_addr << NICVF_RCV_BUF_ALIGN;
- rinfo = GET_RBUF_INFO((u64)phys_to_virt(buf_addr));
- put_page(rinfo->page);
+ put_page(virt_to_page(phys_to_virt(buf_addr)));

/* Free RBDR ring */
nicvf_free_q_desc_mem(nic, &rbdr->dmem);
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 457eaee..bce8099 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -83,10 +83,8 @@
#define MAX_RCV_BUF_COUNT (1ULL << (RBDR_SIZE6 + 13))
#define RBDR_THRESH (RCV_BUF_COUNT / 2)
#define DMA_BUFFER_LEN 2048 /* In multiples of 128bytes */
-#define RCV_FRAG_LEN (SKB_DATA_ALIGN(DMA_BUFFER_LEN + NET_SKB_PAD) + \
- SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + \
- (NICVF_RCV_BUF_ALIGN_BYTES * 2))
-#define RCV_DATA_OFFSET NICVF_RCV_BUF_ALIGN_BYTES
+#define RCV_FRAG_LEN (SKB_DATA_ALIGN(DMA_BUFFER_LEN + NET_SKB_PAD) + \
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))

#define MAX_CQES_FOR_TX ((SND_QUEUE_LEN / MIN_SQ_DESC_PER_PKT_XMIT) * \
MAX_CQE_PER_PKT_XMIT)
@@ -108,10 +106,6 @@
#define NICVF_SQ_BASE_ALIGN_BYTES 128 /* 7 bits */

#define NICVF_ALIGNED_ADDR(ADDR, ALIGN_BYTES) ALIGN(ADDR, ALIGN_BYTES)
-#define NICVF_ADDR_ALIGN_LEN(ADDR, BYTES)\
- (NICVF_ALIGNED_ADDR(ADDR, BYTES) - BYTES)
-#define NICVF_RCV_BUF_ALIGN_LEN(X)\
- (NICVF_ALIGNED_ADDR(X, NICVF_RCV_BUF_ALIGN_BYTES) - X)

/* Queue enable/disable */
#define NICVF_SQ_EN BIT_ULL(19)
--
1.7.1
Sunil Goutham
2015-11-30 18:19:09 UTC
Permalink
From: Thanneeru Srinivasulu <***@caviumnetworks.com>

Instead of error message added counter for receive buffer allocation failures.
Also added counter for transmit timeout event for debug purposes.

Signed-off-by: Thanneeru Srinivasulu <***@caviumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 3 +++
.../net/ethernet/cavium/thunder/nicvf_ethtool.c | 2 ++
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 1 +
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 3 +--
4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index e782856..defbf17 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -250,10 +250,13 @@ struct nicvf_drv_stats {
u64 rx_frames_jumbo;
u64 rx_drops;

+ u64 rcv_buffer_alloc_failures;
+
/* Tx */
u64 tx_frames_ok;
u64 tx_drops;
u64 tx_tso;
+ u64 tx_timeout;
u64 txq_stop;
u64 txq_wake;
};
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
index a12b2e3..d2d8ef2 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
@@ -89,9 +89,11 @@ static const struct nicvf_stat nicvf_drv_stats[] = {
NICVF_DRV_STAT(rx_frames_1518),
NICVF_DRV_STAT(rx_frames_jumbo),
NICVF_DRV_STAT(rx_drops),
+ NICVF_DRV_STAT(rcv_buffer_alloc_failures),
NICVF_DRV_STAT(tx_frames_ok),
NICVF_DRV_STAT(tx_tso),
NICVF_DRV_STAT(tx_drops),
+ NICVF_DRV_STAT(tx_timeout),
NICVF_DRV_STAT(txq_stop),
NICVF_DRV_STAT(txq_wake),
};
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index a75f21f..a7e2951 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1392,6 +1392,7 @@ static void nicvf_tx_timeout(struct net_device *dev)
netdev_warn(dev, "%s: Transmit timed out, resetting\n",
dev->name);

+ nic->drv_stats.tx_timeout++;
schedule_work(&nic->reset_task);
}

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 4e9709e..28df11a 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -96,8 +96,7 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, gfp_t gfp,
nic->rb_page = alloc_pages(gfp | __GFP_COMP | __GFP_NOWARN,
order);
if (!nic->rb_page) {
- netdev_err(nic->netdev,
- "Failed to allocate new rcv buffer\n");
+ nic->drv_stats.rcv_buffer_alloc_failures++;
return -ENOMEM;
}
nic->rb_page_offset = 0;
--
1.7.1
Sunil Goutham
2015-11-30 18:19:05 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Added support to offload TSO to HW for ThunderX pass2 chips.

Signed-off-by: Thanneeru Srinivasulu <***@caviumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 12 ++++++--
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 3 ++
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 19 ++++++++----
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
5 files changed, 44 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 39ca674..02571f4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -262,9 +262,10 @@ struct nicvf {
struct pci_dev *pdev;
u8 vf_id;
u8 node;
- u8 tns_mode:1;
- u8 sqs_mode:1;
- u8 loopback_supported:1;
+ bool tns_mode:1;
+ bool sqs_mode:1;
+ bool loopback_supported:1;
+ bool hw_tso:1;
u16 mtu;
struct queue_set *qs;
#define MAX_SQS_PER_VF_SINGLE_NODE 5
@@ -489,6 +490,11 @@ static inline int nic_get_node_id(struct pci_dev *pdev)
return ((addr >> NIC_NODE_ID_SHIFT) & NIC_NODE_ID_MASK);
}

+static inline bool pass1_silicon(struct pci_dev *pdev)
+{
+ return pdev->revision < 8;
+}
+
int nicvf_set_real_num_queues(struct net_device *netdev,
int tx_queues, int rx_queues);
int nicvf_open(struct net_device *netdev);
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 4b7fd63..9f80de4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -55,11 +55,6 @@ struct nicpf {
bool irq_allocated[NIC_PF_MSIX_VECTORS];
};

-static inline bool pass1_silicon(struct nicpf *nic)
-{
- return nic->pdev->revision < 8;
-}
-
/* Supported devices */
static const struct pci_device_id nic_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, PCI_DEVICE_ID_THUNDER_NIC_PF) },
@@ -123,7 +118,7 @@ static void nic_send_msg_to_vf(struct nicpf *nic, int vf, union nic_mbx *mbx)
* when PF writes to MBOX(1), in next revisions when
* PF writes to MBOX(0)
*/
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
/* see the comment for nic_reg_write()/nic_reg_read()
* functions above
*/
@@ -400,7 +395,7 @@ static void nic_config_cpi(struct nicpf *nic, struct cpi_cfg_msg *cfg)
padd = cpi % 8; /* 3 bits CS out of 6bits DSCP */

/* Leave RSS_SIZE as '0' to disable RSS */
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi << 3),
(vnic << 24) | (padd << 16) |
(rssi_base + rssi));
@@ -470,7 +465,7 @@ static void nic_config_rss(struct nicpf *nic, struct rss_cfg_msg *cfg)
}

cpi_base = nic->cpi_base[cfg->vf_id];
- if (pass1_silicon(nic))
+ if (pass1_silicon(nic->pdev))
idx_addr = NIC_PF_CPI_0_2047_CFG;
else
idx_addr = NIC_PF_MPI_0_2047_CFG;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index dde8dc7..df07bd7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1549,6 +1549,9 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

netdev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;

+ if (!pass1_silicon(nic->pdev))
+ nic->hw_tso = true;
+
netdev->netdev_ops = &nicvf_netdev_ops;
netdev->watchdog_timeo = NICVF_TX_TIMEOUT;

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 1fbd908..209e495 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -925,7 +925,7 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff *skb)
{
int subdesc_cnt = MIN_SQ_DESC_PER_PKT_XMIT;

- if (skb_shinfo(skb)->gso_size) {
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso) {
subdesc_cnt = nicvf_tso_count_subdescs(skb);
return subdesc_cnt;
}
@@ -940,8 +940,8 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff *skb)
* First subdescriptor for every send descriptor.
*/
static inline void
-nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
- int subdesc_cnt, struct sk_buff *skb, int len)
+nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry, int subdesc_cnt,
+ struct sk_buff *skb, int len, bool hw_tso)
{
int proto;
struct sq_hdr_subdesc *hdr;
@@ -976,6 +976,12 @@ nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
break;
}
}
+
+ if (hw_tso && skb_shinfo(skb)->gso_size) {
+ hdr->tso = 1;
+ hdr->tso_start = skb_transport_offset(skb) + tcp_hdrlen(skb);
+ hdr->tso_max_paysize = skb_shinfo(skb)->gso_size;
+ }
}

/* SQ GATHER subdescriptor
@@ -1046,7 +1052,7 @@ static int nicvf_sq_append_tso(struct nicvf *nic, struct snd_queue *sq,
tso_build_data(skb, &tso, size);
}
nicvf_sq_add_hdr_subdesc(sq, hdr_qentry,
- seg_subdescs - 1, skb, seg_len);
+ seg_subdescs - 1, skb, seg_len, false);
sq->skbuff[hdr_qentry] = (u64)NULL;
qentry = nicvf_get_nxt_sqentry(sq, qentry);

@@ -1098,11 +1104,12 @@ int nicvf_sq_append_skb(struct nicvf *nic, struct sk_buff *skb)
qentry = nicvf_get_sq_desc(sq, subdesc_cnt);

/* Check if its a TSO packet */
- if (skb_shinfo(skb)->gso_size)
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso)
return nicvf_sq_append_tso(nic, sq, sq_num, qentry, skb);

/* Add SQ header subdesc */
- nicvf_sq_add_hdr_subdesc(sq, qentry, subdesc_cnt - 1, skb, skb->len);
+ nicvf_sq_add_hdr_subdesc(sq, qentry, subdesc_cnt - 1, skb,
+ skb->len, nic->hw_tso);

/* Add SQ gather subdescs */
qentry = nicvf_get_nxt_sqentry(sq, qentry);
diff --git a/drivers/net/ethernet/cavium/thunder/q_struct.h b/drivers/net/ethernet/cavium/thunder/q_struct.h
index 3c1de97..9e6d987 100644
--- a/drivers/net/ethernet/cavium/thunder/q_struct.h
+++ b/drivers/net/ethernet/cavium/thunder/q_struct.h
@@ -545,25 +545,28 @@ struct sq_hdr_subdesc {
u64 subdesc_cnt:8;
u64 csum_l4:2;
u64 csum_l3:1;
- u64 rsvd0:5;
+ u64 csum_inner_l4:2;
+ u64 csum_inner_l3:1;
+ u64 rsvd0:2;
u64 l4_offset:8;
u64 l3_offset:8;
u64 rsvd1:4;
u64 tot_len:20; /* W0 */

- u64 tso_sdc_cont:8;
- u64 tso_sdc_first:8;
- u64 tso_l4_offset:8;
- u64 tso_flags_last:12;
- u64 tso_flags_first:12;
- u64 rsvd2:2;
+ u64 rsvd2:24;
+ u64 inner_l4_offset:8;
+ u64 inner_l3_offset:8;
+ u64 tso_start:8;
+ u64 rsvd3:2;
u64 tso_max_paysize:14; /* W1 */
#elif defined(__LITTLE_ENDIAN_BITFIELD)
u64 tot_len:20;
u64 rsvd1:4;
u64 l3_offset:8;
u64 l4_offset:8;
- u64 rsvd0:5;
+ u64 rsvd0:2;
+ u64 csum_inner_l3:1;
+ u64 csum_inner_l4:2;
u64 csum_l3:1;
u64 csum_l4:2;
u64 subdesc_cnt:8;
@@ -574,12 +577,11 @@ struct sq_hdr_subdesc {
u64 subdesc_type:4; /* W0 */

u64 tso_max_paysize:14;
- u64 rsvd2:2;
- u64 tso_flags_first:12;
- u64 tso_flags_last:12;
- u64 tso_l4_offset:8;
- u64 tso_sdc_first:8;
- u64 tso_sdc_cont:8; /* W1 */
+ u64 rsvd3:2;
+ u64 tso_start:8;
+ u64 inner_l3_offset:8;
+ u64 inner_l4_offset:8;
+ u64 rsvd2:24; /* W1 */
#endif
};
--
1.7.1
Sunil Goutham
2015-11-30 18:19:06 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 2 ++
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 2 +-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 5 ++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 3 ++-
4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 02571f4..e782856 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -34,6 +34,8 @@
/* NIC priv flags */
#define NIC_SRIOV_ENABLED BIT(0)

+#define VNIC_NAPI_WEIGHT NAPI_POLL_WEIGHT
+
/* Min/Max packet size */
#define NIC_HW_MIN_FRS 64
#define NIC_HW_MAX_FRS 9200 /* 9216 max packet including FCS */
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index df07bd7..51bbe67 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1147,7 +1147,7 @@ int nicvf_open(struct net_device *netdev)
cq_poll->cq_idx = qidx;
cq_poll->nicvf = nic;
netif_napi_add(netdev, &cq_poll->napi, nicvf_poll,
- NAPI_POLL_WEIGHT);
+ VNIC_NAPI_WEIGHT);
napi_enable(&cq_poll->napi);
nic->napi[qidx] = cq_poll;
}
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 209e495..5855828 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -299,7 +299,10 @@ static int nicvf_init_cmp_queue(struct nicvf *nic,
return err;

cq->desc = cq->dmem.base;
- cq->thresh = CMP_QUEUE_CQE_THRESH;
+ if (!pass1_silicon(nic->pdev))
+ cq->thresh = CMP_QUEUE_CQE_THRESH;
+ else
+ cq->thresh = 0;
nic->cq_coalesce_usecs = (CMP_QUEUE_TIMER_THRESH * 0.05) - 1;

return 0;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index bce8099..2cfc85b 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -10,6 +10,7 @@
#define NICVF_QUEUES_H

#include <linux/netdevice.h>
+#include "nic.h"
#include "q_struct.h"

#define MAX_QUEUE_SET 128
@@ -75,7 +76,7 @@
*/
#define CMP_QSIZE CMP_QUEUE_SIZE3
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
-#define CMP_QUEUE_CQE_THRESH 0
+#define CMP_QUEUE_CQE_THRESH (VNIC_NAPI_WEIGHT / 2)
#define CMP_QUEUE_TIMER_THRESH 80 /* ~2usec */

#define RBDR_SIZE RBDR_SIZE0
--
1.7.1
Sunil Goutham
2015-12-01 09:13:37 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This patch series contains fixes for various issues observed
with BGX and NIC drivers.

Sunil Goutham (4):
net: thunderx: Increase transmit queue length
net: thunderx: Set CQ timer threshold properly
net: thunderx: Switchon carrier only upon interface link up
net: thunderx: Enable BGX LMAC's RX/TX only after VF is up

Thanneeru Srinivasulu (2):
net: thunderx: Force to load octeon-mdio before bgx driver.
net: thunderx: Wait for delayed work to finish before destroying it

drivers/net/ethernet/cavium/thunder/nic.h | 5 +--
drivers/net/ethernet/cavium/thunder/nic_main.c | 23 ++++++++++++++--
.../net/ethernet/cavium/thunder/nicvf_ethtool.c | 16 ++++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 4 +--
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 2 +-
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 6 ++--
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 28 +++++++++++++++++---
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 +
8 files changed, 68 insertions(+), 18 deletions(-)
Sunil Goutham
2015-12-01 09:13:38 UTC
Permalink
From: Thanneeru Srinivasulu <***@caviumnetworks.com>

Signed-off-by: Thanneeru Srinivasulu <***@caviumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +++
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 1 +
2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 180aa9f..1fb72dd 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1009,6 +1009,9 @@ static int bgx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
struct bgx *bgx = NULL;
u8 lmac;

+ /*Load octeon mdio driver*/
+ octeon_mdiobus_force_mod_depencency();
+
bgx = devm_kzalloc(dev, sizeof(*bgx), GFP_KERNEL);
if (!bgx)
return -ENOMEM;
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
index 07b7ec6..89a02fa 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
@@ -182,6 +182,7 @@ enum MCAST_MODE {
#define BCAST_ACCEPT 1
#define CAM_ACCEPT 1

+void octeon_mdiobus_force_mod_depencency(void);
void bgx_add_dmac_addr(u64 dmac, int node, int bgx_idx, int lmac);
unsigned bgx_get_map(int node);
int bgx_get_lmac_count(int node, int bgx);
--
1.7.1
Sergei Shtylyov
2015-12-01 13:17:56 UTC
Permalink
Hello.
Post by Sunil Goutham
---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +++
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 1 +
2 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 180aa9f..1fb72dd 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1009,6 +1009,9 @@ static int bgx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
struct bgx *bgx = NULL;
u8 lmac;
+ /*Load octeon mdio driver*/
Please insert spaces after /* and before */ (unless it's a common comment
style in this driver).

[...]

MBR, Sergei
Sunil Kovvuri
2015-12-01 16:24:57 UTC
Permalink
Thanks a lot, will fix and repost the patch.

On Tue, Dec 1, 2015 at 6:47 PM, Sergei Shtylyov
Post by Sergei Shtylyov
Hello.
Post by Sunil Goutham
---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +++
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 1 +
2 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 180aa9f..1fb72dd 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1009,6 +1009,9 @@ static int bgx_probe(struct pci_dev *pdev, const
struct pci_device_id *ent)
struct bgx *bgx = NULL;
u8 lmac;
+ /*Load octeon mdio driver*/
Please insert spaces after /* and before */ (unless it's a common comment
style in this driver).
[...]
MBR, Sergei
Sunil Goutham
2015-12-01 09:13:39 UTC
Permalink
From: Thanneeru Srinivasulu <***@caviumnetworks.com>

While VNIC or BGX driver teardown, wait for already scheduled delayed work to
finish before destroying it.

Signed-off-by: Thanneeru Srinivasulu <***@caviumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic_main.c | 3 +--
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index c561fdc..cfc24a1 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1074,8 +1074,7 @@ static void nic_remove(struct pci_dev *pdev)

if (nic->check_link) {
/* Destroy work Queue */
- cancel_delayed_work(&nic->dwork);
- flush_workqueue(nic->check_link);
+ cancel_delayed_work_sync(&nic->dwork);
destroy_workqueue(nic->check_link);
}

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 1fb72dd..2076ac3 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -695,8 +695,7 @@ static void bgx_lmac_disable(struct bgx *bgx, u8 lmacid)
lmac = &bgx->lmac[lmacid];
if (lmac->check_link) {
/* Destroy work queue */
- cancel_delayed_work(&lmac->dwork);
- flush_workqueue(lmac->check_link);
+ cancel_delayed_work_sync(&lmac->dwork);
destroy_workqueue(lmac->check_link);
}
--
1.7.1
Sunil Goutham
2015-12-01 09:13:40 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Under high transmit rates and with TSO enabled observing fluctuations
in TX performance. Seen especially with iperf3 application.

Since TSO is taken care at driver level, with 64KB of TSO packets
and when window size is also high the rate at which CPU fills in
transmit descriptors is much higher than what HW is able to process.
Each 64KB TSO packet occupies gets segmented to ~43 1500 byte MTU packets
and occupies ~130 descriptors. Hence increasing transmit queue size.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index fb4957d..b1e93a9 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -62,7 +62,7 @@
#define SND_QUEUE_CNT 8
#define CMP_QUEUE_CNT 8 /* Max of RCV and SND qcount */

-#define SND_QSIZE SND_QUEUE_SIZE2
+#define SND_QSIZE SND_QUEUE_SIZE3
#define SND_QUEUE_LEN (1ULL << (SND_QSIZE + 10))
#define MAX_SND_QUEUE_LEN (1ULL << (SND_QUEUE_SIZE6 + 10))
#define SND_QUEUE_THRESH 2ULL
@@ -73,7 +73,7 @@
/* Keep CQ and SQ sizes same, if timestamping
* is enabled this equation will change.
*/
-#define CMP_QSIZE CMP_QUEUE_SIZE2
+#define CMP_QSIZE CMP_QUEUE_SIZE3
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
#define CMP_QUEUE_CQE_THRESH 0
#define CMP_QUEUE_TIMER_THRESH 220 /* 10usec */
--
1.7.1
Pavel Fedin
2015-12-01 14:40:09 UTC
Permalink
Hello!

This one breaks networking on my machine:
--- cut ---
swiotlb: coherent allocation failed for device 0002:01:08.4 size=4198400
CPU: 2 PID: 3655 Comm: NetworkManager Tainted: G W O 4.2.6+ #201
Hardware name: Cavium ThunderX CN88XX board (DT)
Call trace:
[<ffffffc00008a3d4>] dump_backtrace+0x0/0x124
[<ffffffc00008a50c>] show_stack+0x14/0x1c
[<ffffffc0006df258>] dump_stack+0x88/0xc8
[<ffffffc000416994>] swiotlb_alloc_coherent+0x150/0x168
[<ffffffc00009555c>] __dma_alloc+0x58/0x1e8
[<ffffffc000507cd8>] nicvf_alloc_q_desc_mem.isra.14+0xc0/0xdc
[<ffffffc000508c8c>] nicvf_config_data_transfer+0x448/0x728
[<ffffffc000506c64>] nicvf_open+0x184/0x994
[<ffffffc0005ea0b4>] __dev_open+0xb4/0x120
[<ffffffc0005ea388>] __dev_change_flags+0x8c/0x150
[<ffffffc0005ea46c>] dev_change_flags+0x20/0x5c
[<ffffffc0005fb210>] do_setlink+0x27c/0x8e0
[<ffffffc0005fbe28>] rtnl_newlink+0x4b4/0x6c4
[<ffffffc0005fa6f0>] rtnetlink_rcv_msg+0x90/0x22c
[<ffffffc000611b74>] netlink_rcv_skb+0xd8/0x100
[<ffffffc0005fa650>] rtnetlink_rcv+0x30/0x40
[<ffffffc000611498>] netlink_unicast+0x158/0x1f8
[<ffffffc000611910>] netlink_sendmsg+0x324/0x380
[<ffffffc0005c6690>] sock_sendmsg+0x18/0x58
[<ffffffc0005c6db8>] ___sys_sendmsg+0x254/0x264
[<ffffffc0005c7bb8>] __sys_sendmsg+0x40/0x84
[<ffffffc0005c7c0c>] SyS_sendmsg+0x10/0x20
thunder-nicvf 0002:01:08.4 enP2p1s8f4: Failed to alloc/config VF's QSet resources
--- cut ---

And none of interfaces work after this. Reverting this patch helps.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
-----Original Message-----
Goutham
Sent: Tuesday, December 01, 2015 12:14 PM
Subject: [PATCH 3/6] net: thunderx: Increase transmit queue length
Under high transmit rates and with TSO enabled observing fluctuations
in TX performance. Seen especially with iperf3 application.
Since TSO is taken care at driver level, with 64KB of TSO packets
and when window size is also high the rate at which CPU fills in
transmit descriptors is much higher than what HW is able to process.
Each 64KB TSO packet occupies gets segmented to ~43 1500 byte MTU packets
and occupies ~130 descriptors. Hence increasing transmit queue size.
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index fb4957d..b1e93a9 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -62,7 +62,7 @@
#define SND_QUEUE_CNT 8
#define CMP_QUEUE_CNT 8 /* Max of RCV and SND qcount */
-#define SND_QSIZE SND_QUEUE_SIZE2
+#define SND_QSIZE SND_QUEUE_SIZE3
#define SND_QUEUE_LEN (1ULL << (SND_QSIZE + 10))
#define MAX_SND_QUEUE_LEN (1ULL << (SND_QUEUE_SIZE6 + 10))
#define SND_QUEUE_THRESH 2ULL
@@ -73,7 +73,7 @@
/* Keep CQ and SQ sizes same, if timestamping
* is enabled this equation will change.
*/
-#define CMP_QSIZE CMP_QUEUE_SIZE2
+#define CMP_QSIZE CMP_QUEUE_SIZE3
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
#define CMP_QUEUE_CQE_THRESH 0
#define CMP_QUEUE_TIMER_THRESH 220 /* 10usec */
--
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric Dumazet
2015-12-01 15:33:05 UTC
Permalink
Post by Pavel Fedin
Hello!
--- cut ---
swiotlb: coherent allocation failed for device 0002:01:08.4 size=4198400
CPU: 2 PID: 3655 Comm: NetworkManager Tainted: G W O 4.2.6+ #201
Hardware name: Cavium ThunderX CN88XX
Are you sure 4.2.6 kernel is suitable for backporting this patch aimed
for linux-4.5 ?
Sunil Kovvuri
2015-12-01 16:30:49 UTC
Permalink
Hi Pavel Fedin,

Increasing descriptor ring size will lead to more memory allocation.
And what you are seeing is a memory alloc failure and doesn't seem to
be due to this driver change.
I mean it looks like the behavior will be same for other drivers as well.

Probably you might have to set "coherent_pool" size in bootargs to a
higher value.
Can you please check.

Thanks,
Sunil.
Post by Eric Dumazet
Post by Pavel Fedin
Hello!
--- cut ---
swiotlb: coherent allocation failed for device 0002:01:08.4 size=4198400
CPU: 2 PID: 3655 Comm: NetworkManager Tainted: G W O 4.2.6+ #201
Hardware name: Cavium ThunderX CN88XX
Are you sure 4.2.6 kernel is suitable for backporting this patch aimed
for linux-4.5 ?
David Miller
2015-12-01 19:30:47 UTC
Permalink
From: Sunil Kovvuri <***@gmail.com>
Date: Tue, 1 Dec 2015 22:00:49 +0530
Post by Sunil Kovvuri
Increasing descriptor ring size will lead to more memory allocation.
And what you are seeing is a memory alloc failure and doesn't seem
to be due to this driver change. I mean it looks like the behavior
will be same for other drivers as well.
The driver should successfully recover from out of memory situations
and not stop RX/TX completely.

There might be some other problem with your driver causing this.

Don't put this off as not "related" to your patch, it is as this
introduces the behavior for this user, and you should fix it before
expecting me to apply this patch series.
Sunil Kovvuri
2015-12-02 05:48:43 UTC
Permalink
Post by David Miller
The driver should successfully recover from out of memory situations
and not stop RX/TX completely.
This memory allocation is while interface bringup/initialization and not during
packet I/O.
Post by David Miller
Don't put this off as not "related" to your patch, it is as this
introduces the behavior for this user, and you should fix it before
expecting me to apply this patch series.
I would disagree on this, as this patch hasn't introduced any failure here,
if this user has connected any device which asks for a bit large amount
of coherent memory then i am sure he will see the same issue.
And above i have suggested what could be done to not see this issue.

But anyway for the time being I will resubmit the patch series without this
patch, hope that would be okay.
Post by David Miller
Date: Tue, 1 Dec 2015 22:00:49 +0530
Post by Sunil Kovvuri
Increasing descriptor ring size will lead to more memory allocation.
And what you are seeing is a memory alloc failure and doesn't seem
to be due to this driver change. I mean it looks like the behavior
will be same for other drivers as well.
The driver should successfully recover from out of memory situations
and not stop RX/TX completely.
There might be some other problem with your driver causing this.
Don't put this off as not "related" to your patch, it is as this
introduces the behavior for this user, and you should fix it before
expecting me to apply this patch series.
Eric Dumazet
2015-12-02 13:25:33 UTC
Permalink
Post by Sunil Kovvuri
Post by David Miller
The driver should successfully recover from out of memory situations
and not stop RX/TX completely.
This memory allocation is while interface bringup/initialization and not during
packet I/O.
Post by David Miller
Don't put this off as not "related" to your patch, it is as this
introduces the behavior for this user, and you should fix it before
expecting me to apply this patch series.
I would disagree on this, as this patch hasn't introduced any failure here,
if this user has connected any device which asks for a bit large amount
of coherent memory then i am sure he will see the same issue.
And above i have suggested what could be done to not see this issue.
This is unacceptable.

Maybe you did not complete tests. changelog has no 'Tested:' section.

You can not claim this patch was good, especially considering no precise
numbers were given.

If the performance increase is 4 %, then surely using twice more memory
is not worth it.

RX/TX ring buffer sizes should be :

- Default to reasonable sizes (ie not gigantic memory usage for typical
use). For multiqueue devices, one also has to take into account the
number of queues:

If a 10Gbit NIC has 128 queues, then probably having 8192 slots
per queue is too much, if this maps to 512 MB of memory !

- ethtool -G support to let the admin change the conservative settings
for the exceptional workloads.
Sunil Kovvuri
2015-12-02 16:50:46 UTC
Permalink
Post by Eric Dumazet
If the performance increase is 4 %, then surely using twice more memory
is not worth it.
I haven't mentioned anywhere that i am seeing 4% increase in performance.
Anyways as said already i will recheck and resubmit.
Eric Dumazet
2015-12-02 16:59:40 UTC
Permalink
Post by Sunil Kovvuri
Post by Eric Dumazet
If the performance increase is 4 %, then surely using twice more memory
is not worth it.
I haven't mentioned anywhere that i am seeing 4% increase in performance.
Yes, this is the complain I made :

No numbers, just a patch increasing ring size by a 100% factor !

I really hope you'll have very convincing numbers, especially if your
driver does not implement BQL.
David Miller
2015-12-02 17:31:03 UTC
Permalink
From: Sunil Kovvuri <***@gmail.com>
Date: Wed, 2 Dec 2015 11:18:43 +0530
Post by Sunil Kovvuri
Post by David Miller
The driver should successfully recover from out of memory situations
and not stop RX/TX completely.
This memory allocation is while interface bringup/initialization and not during
packet I/O.
Post by David Miller
Don't put this off as not "related" to your patch, it is as this
introduces the behavior for this user, and you should fix it before
expecting me to apply this patch series.
I would disagree on this, as this patch hasn't introduced any failure here,
if this user has connected any device which asks for a bit large amount
of coherent memory then i am sure he will see the same issue.
It's not the memory allocation that's the problem.

It's the the device completely dies and does not recover even when
memory does become available later.

That is a hard regression which this change introduces.
Pavel Fedin
2015-12-02 09:05:41 UTC
Permalink
Hello!
Post by Sunil Kovvuri
Probably you might have to set "coherent_pool" size in bootargs to a
higher value.
Can you please check.
I have tried to do this. I was able to enlarge the pool up to 4MB, and still got allocation failures. At 8MB pool preallocation stops working:
--- cut ---
Call trace:
[<ffffffc00012ddb8>] __alloc_pages_nodemask+0x4f4/0x7d4
[<ffffffc0007be370>] atomic_pool_init+0x60/0x1a4
[<ffffffc0007be4d4>] arm64_dma_init+0x20/0x28
[<ffffffc000082848>] do_one_initcall+0x8c/0x1a4
[<ffffffc0007baac0>] kernel_init_freeable+0x154/0x1f4
[<ffffffc0005c2b14>] kernel_init+0x10/0xd8
DMA: failed to allocate 8192 KiB pool for atomic coherent allocation
--- cut ---
and i get even worse faults in the driver.

I know that it is possible to allocate larger pools by setting CONFIG_FORCE_MAX_ZONEORDER, but:
a) This is done on per-platform basis. For ThunderX we used to have a patch (http://www.spinics.net/lists/arm-kernel/msg415457.html), which never made it upstream, because vGIC fixes stopped requiring it at some point. And also we may want to use the nicvf driver not only on actual hardware, but also inside virtual machine in KVM. So do we need to set CONFIG_FORCE_MAX_ZONEORDER for virt too? And what if at some point qemu emulates not only "virt", but some other machine (let's say AMD X-Gene), and we run it on ThunderX and want to use nicvf with this model?
b) IMHO it's not good to have a driver which simply does not work without some obscure option in boot arguments.

So, i see several possible ways to solve this:

1. Introduce some mechanism which would allow the driver to tell the kernel that it needs coherent pool of large size. Can be problematic because the driver can be a module, and pool allocation happens early.
2. Can we use some other method for allocating queues, which would not require such a huge coherent pool?
3. The driver could check value of atomic_pool_size and adjust own memory requirements accordingly. This indeed looks like a quick hack, but would at least make things running quickly.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Pavel Fedin
2015-12-02 10:31:42 UTC
Permalink
Hello!
Post by Pavel Fedin
Post by Sunil Kovvuri
Probably you might have to set "coherent_pool" size in bootargs to a
higher value.
Can you please check.
I have tried to do this. I was able to enlarge the pool up to 4MB, and still got allocation
--- cut ---
[<ffffffc00012ddb8>] __alloc_pages_nodemask+0x4f4/0x7d4
[<ffffffc0007be370>] atomic_pool_init+0x60/0x1a4
[<ffffffc0007be4d4>] arm64_dma_init+0x20/0x28
[<ffffffc000082848>] do_one_initcall+0x8c/0x1a4
[<ffffffc0007baac0>] kernel_init_freeable+0x154/0x1f4
[<ffffffc0005c2b14>] kernel_init+0x10/0xd8
DMA: failed to allocate 8192 KiB pool for atomic coherent allocation
--- cut ---
and i get even worse faults in the driver.
I know that it is possible to allocate larger pools by setting CONFIG_FORCE_MAX_ZONEORDER,
a) This is done on per-platform basis. For ThunderX we used to have a patch
(http://www.spinics.net/lists/arm-kernel/msg415457.html), which never made it upstream,
because vGIC fixes stopped requiring it at some point. And also we may want to use the nicvf
driver not only on actual hardware, but also inside virtual machine in KVM. So do we need to
set CONFIG_FORCE_MAX_ZONEORDER for virt too? And what if at some point qemu emulates not only
"virt", but some other machine (let's say AMD X-Gene), and we run it on ThunderX and want to
use nicvf with this model?
b) IMHO it's not good to have a driver which simply does not work without some obscure option
in boot arguments.
1. Introduce some mechanism which would allow the driver to tell the kernel that it needs
coherent pool of large size. Can be problematic because the driver can be a module, and pool
allocation happens early.
2. Can we use some other method for allocating queues, which would not require such a huge
coherent pool?
3. The driver could check value of atomic_pool_size and adjust own memory requirements
accordingly. This indeed looks like a quick hack, but would at least make things running
quickly.
I have also noticed that CONFIG_DMA_CMA is turned off in my kernel. I guess it was a leftover from old defconfig, because i carry over my .config from version to version. I enabled it and rebuilt the kernel, but in order to get the driver working with this patch i had to also add cma=32M option to kernel arguments. With default of 16M the allocation still fails.
Should we add Kconfig dependencies?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Pavel Fedin
2015-12-02 12:29:10 UTC
Permalink
Hello!
Post by Pavel Fedin
Post by Pavel Fedin
1. Introduce some mechanism which would allow the driver to tell the kernel that it needs
coherent pool of large size. Can be problematic because the driver can be a module, and pool
allocation happens early.
2. Can we use some other method for allocating queues, which would not require such a huge
coherent pool?
3. The driver could check value of atomic_pool_size and adjust own memory requirements
accordingly. This indeed looks like a quick hack, but would at least make things running
quickly.
I have also noticed that CONFIG_DMA_CMA is turned off in my kernel. I guess it was a leftover
from old defconfig, because i carry over my .config from version to version. I enabled it and
rebuilt the kernel, but in order to get the driver working with this patch i had to also add
cma=32M option to kernel arguments. With default of 16M the allocation still fails.
Should we add Kconfig dependencies?
After getting it working in guest i tried to apply it to host. With total of 128 virtual functions (= 128 interfaces) it does not work at all. Even after bumping cma region size to insane value of 2GB more than half of interfaces still failed to allocate queues. And after setting cma=3G i could not mount my rootfs.
So, absolute NAK, unfortunately.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Sunil Kovvuri
2015-12-02 12:57:14 UTC
Permalink
Post by Pavel Fedin
After getting it working in guest i tried to apply it to host. With total of 128 virtual functions (= 128 interfaces) it does not work at all.
Even after bumping cma region size to insane value of 2GB more than half of interfaces still failed to allocate queues.
And after setting cma=3G i could not mount my rootfs.
Here what you are saying is half of the interfaces were initialized
succesfully and rest didn't.
So this issue is not something which is introduced by this patch.
Post by Pavel Fedin
Can we use some other method for allocating queues, which would not require such a huge coherent pool?
There are so many drivers which use dma_alloc_coherent() directly or
via pci_alloc_consistent() to
allocate memory. How many drivers should we modify.
Pavel Fedin
2015-12-02 13:22:13 UTC
Permalink
Hello!
Post by Pavel Fedin
Post by Pavel Fedin
After getting it working in guest i tried to apply it to host. With total of 128 virtual
functions (= 128 interfaces) it does not work at all.
Post by Pavel Fedin
Even after bumping cma region size to insane value of 2GB more than half of interfaces still
failed to allocate queues.
Post by Pavel Fedin
And after setting cma=3G i could not mount my rootfs.
Here what you are saying is half of the interfaces were initialized
succesfully and rest didn't.
After setting cma=2G. With default setting of 16M none of them initialized.
Post by Pavel Fedin
So this issue is not something which is introduced by this patch.
Before this patch all my interfaces were working.
I would say the problem with your patch is that it introduces memory requirements which cannot be satisfied by the platform. It's combination of several factors which stops the thing from working, not a single factor. Using dma_alloc_coherent() is not all wrong by itself, of course.
Perhaps you did some tricks with your configuration, which make it working. Then, i guess, you should have at least described them in commit message of your patch. Or describe all dependencies in KConfig of your driver, which is better. Or, if the platform needs some very special defconfig, add it to arch/arm64/configs (however, i guess, the goal of ARM64 Linux is to run on all possible hardware, so this would not be good from maintainers' POV).

Sorry, but this is all i can say. In previous messages i have already suggested several ways to solve the problem (too lazy to quite here, 4 IIRC), or you can suggest your own one and let us test it, or you can even stick to "It works for me, i am the only right guy in the world, and i don't care if it doesn't work for you" position and let David decide who of us is right (and he already did that once).
Basically, here is what i did: i took kernel 4.2, added ThunderX PCI drivers to it (they were posted but NAKed those days back, there's some lazy progress on them currently), added necessary errata patches (also posted on lists, all merged into 4.4), took defconfig, adjusted it according to my needs, and this is what i'm running on my board and this is what i'm using for development. If you point me at what i'm doing wrong way, i'll be glad to accept this.
I'm over.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Pavel Fedin
2015-12-02 08:09:15 UTC
Permalink
Hello!
Post by Eric Dumazet
Post by Pavel Fedin
swiotlb: coherent allocation failed for device 0002:01:08.4 size=4198400
CPU: 2 PID: 3655 Comm: NetworkManager Tainted: G W O 4.2.6+ #201
Hardware name: Cavium ThunderX CN88XX
Are you sure 4.2.6 kernel is suitable for backporting this patch aimed
for linux-4.5 ?
Why not? It's just a networking driver. And, i also have the same problem on 4.3 running as KVM guest with VFIO.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Sunil Goutham
2015-12-01 09:13:41 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Properly set CQ timer threshold and also set it to 2us.
With previous incorrect settings it was set to 0.5us which is too less.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 5 ++---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 2 +-
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 2 +-
3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index d3950b2..39ca674 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -120,10 +120,9 @@
* Calculated for SCLK of 700Mhz
* value written should be a 1/16th of what is expected
*
- * 1 tick per 0.05usec = value of 2.2
- * This 10% would be covered in CQ timer thresh value
+ * 1 tick per 0.025usec
*/
-#define NICPF_CLK_PER_INT_TICK 2
+#define NICPF_CLK_PER_INT_TICK 1

/* Time to wait before we decide that a SQ is stuck.
*
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index e404ea8..206b6a7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -592,7 +592,7 @@ void nicvf_cmp_queue_config(struct nicvf *nic, struct queue_set *qs,
/* Set threshold value for interrupt generation */
nicvf_queue_reg_write(nic, NIC_QSET_CQ_0_7_THRESH, qidx, cq->thresh);
nicvf_queue_reg_write(nic, NIC_QSET_CQ_0_7_CFG2,
- qidx, nic->cq_coalesce_usecs);
+ qidx, CMP_QUEUE_TIMER_THRESH);
}

/* Configures transmit queue */
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index b1e93a9..457eaee 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -76,7 +76,7 @@
#define CMP_QSIZE CMP_QUEUE_SIZE3
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
#define CMP_QUEUE_CQE_THRESH 0
-#define CMP_QUEUE_TIMER_THRESH 220 /* 10usec */
+#define CMP_QUEUE_TIMER_THRESH 80 /* ~2usec */

#define RBDR_SIZE RBDR_SIZE0
#define RCV_BUF_COUNT (1ULL << (RBDR_SIZE + 13))
--
1.7.1
Sunil Goutham
2015-12-01 09:13:42 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Call netif_carrier_on() only if interface's link is up. Switching this on
upon IFF_UP by default, is causing issues with ethernet channel bonding
in LACP mode. Initial NETDEV_CHANGE notification was being skipped.

Also fixed some issues with link/speed/duplex reporting via ethtool.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
.../net/ethernet/cavium/thunder/nicvf_ethtool.c | 16 +++++++++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 4 +---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 ++
3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
index af54c10..a12b2e3 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
@@ -112,6 +112,13 @@ static int nicvf_get_settings(struct net_device *netdev,

cmd->supported = 0;
cmd->transceiver = XCVR_EXTERNAL;
+
+ if (!nic->link_up) {
+ cmd->duplex = DUPLEX_UNKNOWN;
+ ethtool_cmd_speed_set(cmd, SPEED_UNKNOWN);
+ return 0;
+ }
+
if (nic->speed <= 1000) {
cmd->port = PORT_MII;
cmd->autoneg = AUTONEG_ENABLE;
@@ -125,6 +132,13 @@ static int nicvf_get_settings(struct net_device *netdev,
return 0;
}

+static u32 nicvf_get_link(struct net_device *netdev)
+{
+ struct nicvf *nic = netdev_priv(netdev);
+
+ return nic->link_up;
+}
+
static void nicvf_get_drvinfo(struct net_device *netdev,
struct ethtool_drvinfo *info)
{
@@ -660,7 +674,7 @@ static int nicvf_set_channels(struct net_device *dev,

static const struct ethtool_ops nicvf_ethtool_ops = {
.get_settings = nicvf_get_settings,
- .get_link = ethtool_op_get_link,
+ .get_link = nicvf_get_link,
.get_drvinfo = nicvf_get_drvinfo,
.get_msglevel = nicvf_get_msglevel,
.set_msglevel = nicvf_set_msglevel,
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 7f709cb..dde8dc7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1057,6 +1057,7 @@ int nicvf_stop(struct net_device *netdev)

netif_carrier_off(netdev);
netif_tx_stop_all_queues(nic->netdev);
+ nic->link_up = false;

/* Teardown secondary qsets first */
if (!nic->sqs_mode) {
@@ -1211,9 +1212,6 @@ int nicvf_open(struct net_device *netdev)
nic->drv_stats.txq_stop = 0;
nic->drv_stats.txq_wake = 0;

- netif_carrier_on(netdev);
- netif_tx_start_all_queues(netdev);
-
return 0;
cleanup:
nicvf_disable_intr(nic, NICVF_INTR_MBOX, 0);
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 2076ac3..d9f27ad 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -612,6 +612,8 @@ static void bgx_poll_for_link(struct work_struct *work)
lmac->last_duplex = 1;
} else {
lmac->link_up = 0;
+ lmac->last_speed = SPEED_UNKNOWN;
+ lmac->last_duplex = DUPLEX_UNKNOWN;
}

if (lmac->last_link != lmac->link_up) {
--
1.7.1
Pavel Fedin
2015-12-01 15:32:36 UTC
Permalink
Hello!

This one causes the network to stop working on Fedora 21. Probably has to do with NetworkManager, which sees something unexpected.
IP address is never set up and connection is never activated, despite it has UP flag.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
-----Original Message-----
Goutham
Sent: Tuesday, December 01, 2015 12:14 PM
Subject: [PATCH 5/6] net: thunderx: Switchon carrier only upon interface link up
Call netif_carrier_on() only if interface's link is up. Switching this on
upon IFF_UP by default, is causing issues with ethernet channel bonding
in LACP mode. Initial NETDEV_CHANGE notification was being skipped.
Also fixed some issues with link/speed/duplex reporting via ethtool.
---
.../net/ethernet/cavium/thunder/nicvf_ethtool.c | 16 +++++++++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 4 +---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 ++
3 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
index af54c10..a12b2e3 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
@@ -112,6 +112,13 @@ static int nicvf_get_settings(struct net_device *netdev,
cmd->supported = 0;
cmd->transceiver = XCVR_EXTERNAL;
+
+ if (!nic->link_up) {
+ cmd->duplex = DUPLEX_UNKNOWN;
+ ethtool_cmd_speed_set(cmd, SPEED_UNKNOWN);
+ return 0;
+ }
+
if (nic->speed <= 1000) {
cmd->port = PORT_MII;
cmd->autoneg = AUTONEG_ENABLE;
@@ -125,6 +132,13 @@ static int nicvf_get_settings(struct net_device *netdev,
return 0;
}
+static u32 nicvf_get_link(struct net_device *netdev)
+{
+ struct nicvf *nic = netdev_priv(netdev);
+
+ return nic->link_up;
+}
+
static void nicvf_get_drvinfo(struct net_device *netdev,
struct ethtool_drvinfo *info)
{
@@ -660,7 +674,7 @@ static int nicvf_set_channels(struct net_device *dev,
static const struct ethtool_ops nicvf_ethtool_ops = {
.get_settings = nicvf_get_settings,
- .get_link = ethtool_op_get_link,
+ .get_link = nicvf_get_link,
.get_drvinfo = nicvf_get_drvinfo,
.get_msglevel = nicvf_get_msglevel,
.set_msglevel = nicvf_set_msglevel,
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 7f709cb..dde8dc7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1057,6 +1057,7 @@ int nicvf_stop(struct net_device *netdev)
netif_carrier_off(netdev);
netif_tx_stop_all_queues(nic->netdev);
+ nic->link_up = false;
/* Teardown secondary qsets first */
if (!nic->sqs_mode) {
@@ -1211,9 +1212,6 @@ int nicvf_open(struct net_device *netdev)
nic->drv_stats.txq_stop = 0;
nic->drv_stats.txq_wake = 0;
- netif_carrier_on(netdev);
- netif_tx_start_all_queues(netdev);
-
return 0;
nicvf_disable_intr(nic, NICVF_INTR_MBOX, 0);
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 2076ac3..d9f27ad 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -612,6 +612,8 @@ static void bgx_poll_for_link(struct work_struct *work)
lmac->last_duplex = 1;
} else {
lmac->link_up = 0;
+ lmac->last_speed = SPEED_UNKNOWN;
+ lmac->last_duplex = DUPLEX_UNKNOWN;
}
if (lmac->last_link != lmac->link_up) {
--
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Sunil Kovvuri
2015-12-01 16:39:44 UTC
Permalink
Hi Pavel Fedin,

Are running Fedora 21 on Cavium ThunderX ?
Do you see linkup notification (dmesg) ?

If you see the existing driver (pasted snippet below), it does call
netif_carrier_on() upon receiving l
ink up notification from BGX driver.

===
case NIC_MBOX_MSG_BGX_LINK_CHANGE:
.........
if (nic->link_up) {
netdev_info(nic->netdev, "%s: Link is Up %d Mbps %s\n",
nic->netdev->name, nic->speed,
nic->duplex == DUPLEX_FULL ?
"Full duplex" : "Half duplex");
netif_carrier_on(nic->netdev);
netif_tx_start_all_queues(nic->netdev);
========

This patch removes calling carrier on by default.


Thanks,
Sunil.
Post by Pavel Fedin
Hello!
This one causes the network to stop working on Fedora 21. Probably has to do with NetworkManager, which sees something unexpected.
IP address is never set up and connection is never activated, despite it has UP flag.
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
-----Original Message-----
Goutham
Sent: Tuesday, December 01, 2015 12:14 PM
Subject: [PATCH 5/6] net: thunderx: Switchon carrier only upon interface link up
Call netif_carrier_on() only if interface's link is up. Switching this on
upon IFF_UP by default, is causing issues with ethernet channel bonding
in LACP mode. Initial NETDEV_CHANGE notification was being skipped.
Also fixed some issues with link/speed/duplex reporting via ethtool.
---
.../net/ethernet/cavium/thunder/nicvf_ethtool.c | 16 +++++++++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 4 +---
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 ++
3 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
index af54c10..a12b2e3 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
@@ -112,6 +112,13 @@ static int nicvf_get_settings(struct net_device *netdev,
cmd->supported = 0;
cmd->transceiver = XCVR_EXTERNAL;
+
+ if (!nic->link_up) {
+ cmd->duplex = DUPLEX_UNKNOWN;
+ ethtool_cmd_speed_set(cmd, SPEED_UNKNOWN);
+ return 0;
+ }
+
if (nic->speed <= 1000) {
cmd->port = PORT_MII;
cmd->autoneg = AUTONEG_ENABLE;
@@ -125,6 +132,13 @@ static int nicvf_get_settings(struct net_device *netdev,
return 0;
}
+static u32 nicvf_get_link(struct net_device *netdev)
+{
+ struct nicvf *nic = netdev_priv(netdev);
+
+ return nic->link_up;
+}
+
static void nicvf_get_drvinfo(struct net_device *netdev,
struct ethtool_drvinfo *info)
{
@@ -660,7 +674,7 @@ static int nicvf_set_channels(struct net_device *dev,
static const struct ethtool_ops nicvf_ethtool_ops = {
.get_settings = nicvf_get_settings,
- .get_link = ethtool_op_get_link,
+ .get_link = nicvf_get_link,
.get_drvinfo = nicvf_get_drvinfo,
.get_msglevel = nicvf_get_msglevel,
.set_msglevel = nicvf_set_msglevel,
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 7f709cb..dde8dc7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1057,6 +1057,7 @@ int nicvf_stop(struct net_device *netdev)
netif_carrier_off(netdev);
netif_tx_stop_all_queues(nic->netdev);
+ nic->link_up = false;
/* Teardown secondary qsets first */
if (!nic->sqs_mode) {
@@ -1211,9 +1212,6 @@ int nicvf_open(struct net_device *netdev)
nic->drv_stats.txq_stop = 0;
nic->drv_stats.txq_wake = 0;
- netif_carrier_on(netdev);
- netif_tx_start_all_queues(netdev);
-
return 0;
nicvf_disable_intr(nic, NICVF_INTR_MBOX, 0);
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 2076ac3..d9f27ad 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -612,6 +612,8 @@ static void bgx_poll_for_link(struct work_struct *work)
lmac->last_duplex = 1;
} else {
lmac->link_up = 0;
+ lmac->last_speed = SPEED_UNKNOWN;
+ lmac->last_duplex = DUPLEX_UNKNOWN;
}
if (lmac->last_link != lmac->link_up) {
--
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Sunil Goutham
2015-12-01 09:13:43 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Enable or disable BGX LMAC's RX/TX based on corresponding VF's
status. If otherwise, when multiple LMAC's physical link is up
then packets from all LMAC's whose corresponding VF is not yet
initialized will get forwarded to VF0. This is due to VNIC's default
configuration where CPI, RSSI e.t.c point to VF0/QSET0/RQ0.

This patch will prevent multiple copies of packets on VF0.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic_main.c | 20 +++++++++++++++++++-
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 20 ++++++++++++++++++--
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 1 +
3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index cfc24a1..4b7fd63 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -37,6 +37,7 @@ struct nicpf {
#define NIC_GET_BGX_FROM_VF_LMAC_MAP(map) ((map >> 4) & 0xF)
#define NIC_GET_LMAC_FROM_VF_LMAC_MAP(map) (map & 0xF)
u8 vf_lmac_map[MAX_LMAC];
+ u8 lmac_cnt;
struct delayed_work dwork;
struct workqueue_struct *check_link;
u8 link[MAX_LMAC];
@@ -279,6 +280,7 @@ static void nic_set_lmac_vf_mapping(struct nicpf *nic)
u64 lmac_credit;

nic->num_vf_en = 0;
+ nic->lmac_cnt = 0;

for (bgx = 0; bgx < NIC_MAX_BGX; bgx++) {
if (!(bgx_map & (1 << bgx)))
@@ -288,6 +290,7 @@ static void nic_set_lmac_vf_mapping(struct nicpf *nic)
nic->vf_lmac_map[next_bgx_lmac++] =
NIC_SET_VF_LMAC_MAP(bgx, lmac);
nic->num_vf_en += lmac_cnt;
+ nic->lmac_cnt += lmac_cnt;

/* Program LMAC credits */
lmac_credit = (1ull << 1); /* channel credit enable */
@@ -715,6 +718,13 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
case NIC_MBOX_MSG_CFG_DONE:
/* Last message of VF config msg sequence */
nic->vf_enabled[vf] = true;
+ if (vf >= nic->lmac_cnt)
+ goto unlock;
+
+ bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+ lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+
+ bgx_lmac_rx_tx_enable(nic->node, bgx, lmac, true);
goto unlock;
case NIC_MBOX_MSG_SHUTDOWN:
/* First msg in VF teardown sequence */
@@ -722,6 +732,14 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
if (vf >= nic->num_vf_en)
nic->sqs_used[vf - nic->num_vf_en] = false;
nic->pqs_vf[vf] = 0;
+
+ if (vf >= nic->lmac_cnt)
+ break;
+
+ bgx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+ lmac = NIC_GET_LMAC_FROM_VF_LMAC_MAP(nic->vf_lmac_map[vf]);
+
+ bgx_lmac_rx_tx_enable(nic->node, bgx, lmac, false);
break;
case NIC_MBOX_MSG_ALLOC_SQS:
nic_alloc_sqs(nic, &mbx.sqs_alloc);
@@ -940,7 +958,7 @@ static void nic_poll_for_link(struct work_struct *work)

mbx.link_status.msg = NIC_MBOX_MSG_BGX_LINK_CHANGE;

- for (vf = 0; vf < nic->num_vf_en; vf++) {
+ for (vf = 0; vf < nic->lmac_cnt; vf++) {
/* Poll only if VF is UP */
if (!nic->vf_enabled[vf])
continue;
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index d9f27ad..c51d5c9 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -186,6 +186,23 @@ void bgx_set_lmac_mac(int node, int bgx_idx, int lmacid, const u8 *mac)
}
EXPORT_SYMBOL(bgx_set_lmac_mac);

+void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable)
+{
+ struct bgx *bgx = bgx_vnic[(node * MAX_BGX_PER_CN88XX) + bgx_idx];
+ u64 cfg;
+
+ if (!bgx)
+ return;
+
+ cfg = bgx_reg_read(bgx, lmacid, BGX_CMRX_CFG);
+ if (enable)
+ cfg |= CMR_PKT_RX_EN | CMR_PKT_TX_EN;
+ else
+ cfg &= ~(CMR_PKT_RX_EN | CMR_PKT_TX_EN);
+ bgx_reg_write(bgx, lmacid, BGX_CMRX_CFG, cfg);
+}
+EXPORT_SYMBOL(bgx_lmac_rx_tx_enable);
+
static void bgx_sgmii_change_link_state(struct lmac *lmac)
{
struct bgx *bgx = lmac->bgx;
@@ -656,8 +673,7 @@ static int bgx_lmac_enable(struct bgx *bgx, u8 lmacid)
}

/* Enable lmac */
- bgx_reg_modify(bgx, lmacid, BGX_CMRX_CFG,
- CMR_EN | CMR_PKT_RX_EN | CMR_PKT_TX_EN);
+ bgx_reg_modify(bgx, lmacid, BGX_CMRX_CFG, CMR_EN);

/* Restore default cfg, incase low level firmware changed it */
bgx_reg_write(bgx, lmacid, BGX_CMRX_RX_DMAC_CTL, 0x03);
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
index 89a02fa..149e179 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
@@ -183,6 +183,7 @@ enum MCAST_MODE {
#define CAM_ACCEPT 1

void octeon_mdiobus_force_mod_depencency(void);
+void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable);
void bgx_add_dmac_addr(u64 dmac, int node, int bgx_idx, int lmac);
unsigned bgx_get_map(int node);
int bgx_get_lmac_count(int node, int bgx);
--
1.7.1
Sunil Goutham
2015-12-07 05:00:31 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This patch series contains contains couple of cleanup patches.

Sunil Goutham (1):
net, thunderx: Remove unnecessary rcv buffer start address management

Yury Norov (1):
net: thunderx: nicvf_queues: nivc_*_intr: remove duplication

drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 189 +++++---------------
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 10 +-
2 files changed, 51 insertions(+), 148 deletions(-)
Pavel Fedin
2015-12-07 10:33:55 UTC
Permalink
Tested-by: Pavel Fedin <***@samsung.com>

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
-----Original Message-----
Goutham
Sent: Monday, December 07, 2015 8:01 AM
Subject: [PATCH 0/2] net: thunderx: Miscellaneous cleanups
This patch series contains contains couple of cleanup patches.
net, thunderx: Remove unnecessary rcv buffer start address management
net: thunderx: nicvf_queues: nivc_*_intr: remove duplication
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 189 +++++---------------
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 10 +-
2 files changed, 51 insertions(+), 148 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2015-12-07 18:40:29 UTC
Permalink
From: Sunil Goutham <***@gmail.com>
Date: Mon, 7 Dec 2015 10:30:31 +0530
Post by Sunil Goutham
This patch series contains contains couple of cleanup patches.
Series applied to net-next, thanks.
Sunil Goutham
2015-12-07 05:00:32 UTC
Permalink
From: Yury Norov <***@auriga.com>

The same switch-case repeates for nivc_*_intr functions.
In this patch it is moved to a helper nicvf_int_type_to_mask().

By the way:
- Unneeded write to NICVF register dropped if int_type is unknown.
- netdev_dbg() is used instead of netdev_err().

Signed-off-by: Yury Norov <***@auriga.com>
Signed-off-by: Aleksey Makarov <***@caviumnetworks.com>
Acked-by: Vadim Lomovtsev <***@caiumnetworks.com>
Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 140 ++++++--------------
1 files changed, 40 insertions(+), 100 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 206b6a7..99a29d0 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1234,153 +1234,93 @@ struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic, struct cqe_rx_t *cqe_rx)
return skb;
}

-/* Enable interrupt */
-void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx)
+static u64 nicvf_int_type_to_mask(int int_type, int q_idx)
{
u64 reg_val;

- reg_val = nicvf_reg_read(nic, NIC_VF_ENA_W1S);
-
switch (int_type) {
case NICVF_INTR_CQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
+ reg_val = ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
break;
case NICVF_INTR_SQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
+ reg_val = ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
break;
case NICVF_INTR_RBDR:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
+ reg_val = ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
break;
case NICVF_INTR_PKT_DROP:
- reg_val |= (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
break;
case NICVF_INTR_TCP_TIMER:
- reg_val |= (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
break;
case NICVF_INTR_MBOX:
- reg_val |= (1ULL << NICVF_INTR_MBOX_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_MBOX_SHIFT);
break;
case NICVF_INTR_QS_ERR:
- reg_val |= (1ULL << NICVF_INTR_QS_ERR_SHIFT);
+ reg_val = (1ULL << NICVF_INTR_QS_ERR_SHIFT);
break;
default:
- netdev_err(nic->netdev,
- "Failed to enable interrupt: unknown type\n");
- break;
+ reg_val = 0;
}

- nicvf_reg_write(nic, NIC_VF_ENA_W1S, reg_val);
+ return reg_val;
+}
+
+/* Enable interrupt */
+void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx)
+{
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);
+
+ if (!mask) {
+ netdev_dbg(nic->netdev,
+ "Failed to enable interrupt: unknown type\n");
+ return;
+ }
+ nicvf_reg_write(nic, NIC_VF_ENA_W1S,
+ nicvf_reg_read(nic, NIC_VF_ENA_W1S) | mask);
}

/* Disable interrupt */
void nicvf_disable_intr(struct nicvf *nic, int int_type, int q_idx)
{
- u64 reg_val = 0;
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);

- switch (int_type) {
- case NICVF_INTR_CQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
- break;
- case NICVF_INTR_SQ:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
- break;
- case NICVF_INTR_RBDR:
- reg_val |= ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
- break;
- case NICVF_INTR_PKT_DROP:
- reg_val |= (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
- break;
- case NICVF_INTR_TCP_TIMER:
- reg_val |= (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
- break;
- case NICVF_INTR_MBOX:
- reg_val |= (1ULL << NICVF_INTR_MBOX_SHIFT);
- break;
- case NICVF_INTR_QS_ERR:
- reg_val |= (1ULL << NICVF_INTR_QS_ERR_SHIFT);
- break;
- default:
- netdev_err(nic->netdev,
+ if (!mask) {
+ netdev_dbg(nic->netdev,
"Failed to disable interrupt: unknown type\n");
- break;
+ return;
}

- nicvf_reg_write(nic, NIC_VF_ENA_W1C, reg_val);
+ nicvf_reg_write(nic, NIC_VF_ENA_W1C, mask);
}

/* Clear interrupt */
void nicvf_clear_intr(struct nicvf *nic, int int_type, int q_idx)
{
- u64 reg_val = 0;
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);

- switch (int_type) {
- case NICVF_INTR_CQ:
- reg_val = ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
- break;
- case NICVF_INTR_SQ:
- reg_val = ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
- break;
- case NICVF_INTR_RBDR:
- reg_val = ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
- break;
- case NICVF_INTR_PKT_DROP:
- reg_val = (1ULL << NICVF_INTR_PKT_DROP_SHIFT);
- break;
- case NICVF_INTR_TCP_TIMER:
- reg_val = (1ULL << NICVF_INTR_TCP_TIMER_SHIFT);
- break;
- case NICVF_INTR_MBOX:
- reg_val = (1ULL << NICVF_INTR_MBOX_SHIFT);
- break;
- case NICVF_INTR_QS_ERR:
- reg_val |= (1ULL << NICVF_INTR_QS_ERR_SHIFT);
- break;
- default:
- netdev_err(nic->netdev,
+ if (!mask) {
+ netdev_dbg(nic->netdev,
"Failed to clear interrupt: unknown type\n");
- break;
+ return;
}

- nicvf_reg_write(nic, NIC_VF_INT, reg_val);
+ nicvf_reg_write(nic, NIC_VF_INT, mask);
}

/* Check if interrupt is enabled */
int nicvf_is_intr_enabled(struct nicvf *nic, int int_type, int q_idx)
{
- u64 reg_val;
- u64 mask = 0xff;
-
- reg_val = nicvf_reg_read(nic, NIC_VF_ENA_W1S);
-
- switch (int_type) {
- case NICVF_INTR_CQ:
- mask = ((1ULL << q_idx) << NICVF_INTR_CQ_SHIFT);
- break;
- case NICVF_INTR_SQ:
- mask = ((1ULL << q_idx) << NICVF_INTR_SQ_SHIFT);
- break;
- case NICVF_INTR_RBDR:
- mask = ((1ULL << q_idx) << NICVF_INTR_RBDR_SHIFT);
- break;
- case NICVF_INTR_PKT_DROP:
- mask = NICVF_INTR_PKT_DROP_MASK;
- break;
- case NICVF_INTR_TCP_TIMER:
- mask = NICVF_INTR_TCP_TIMER_MASK;
- break;
- case NICVF_INTR_MBOX:
- mask = NICVF_INTR_MBOX_MASK;
- break;
- case NICVF_INTR_QS_ERR:
- mask = NICVF_INTR_QS_ERR_MASK;
- break;
- default:
- netdev_err(nic->netdev,
+ u64 mask = nicvf_int_type_to_mask(int_type, q_idx);
+ /* If interrupt type is unknown, we treat it disabled. */
+ if (!mask) {
+ netdev_dbg(nic->netdev,
"Failed to check interrupt enable: unknown type\n");
- break;
+ return 0;
}

- return (reg_val & mask);
+ return mask & nicvf_reg_read(nic, NIC_VF_ENA_W1S);
}

void nicvf_update_rq_stats(struct nicvf *nic, int rq_idx)
--
1.7.1
Sunil Goutham
2015-12-07 05:00:33 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

Since we have moved on to using allocated pages to carve receive
buffers instead of netdev_alloc_skb() there is no need to store
any pointers for later retrieval. Earlier we had to store
skb and skb->data pointers which later are used to handover
received packet to network stack.

This will avoid an unnecessary cache miss as well.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 49 ++++----------------
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 10 +---
2 files changed, 11 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 99a29d0..1fbd908 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -18,14 +18,6 @@
#include "q_struct.h"
#include "nicvf_queues.h"

-struct rbuf_info {
- struct page *page;
- void *data;
- u64 offset;
-};
-
-#define GET_RBUF_INFO(x) ((struct rbuf_info *)(x - NICVF_RCV_BUF_ALIGN_BYTES))
-
/* Poll a register for a specific value */
static int nicvf_poll_reg(struct nicvf *nic, int qidx,
u64 reg, int bit_pos, int bits, int val)
@@ -86,8 +78,6 @@ static void nicvf_free_q_desc_mem(struct nicvf *nic, struct q_desc_mem *dmem)
static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, gfp_t gfp,
u32 buf_len, u64 **rbuf)
{
- u64 data;
- struct rbuf_info *rinfo;
int order = get_order(buf_len);

/* Check if request can be accomodated in previous allocated page */
@@ -113,46 +103,28 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, gfp_t gfp,
nic->rb_page_offset = 0;
}

- data = (u64)page_address(nic->rb_page) + nic->rb_page_offset;
-
- /* Align buffer addr to cache line i.e 128 bytes */
- rinfo = (struct rbuf_info *)(data + NICVF_RCV_BUF_ALIGN_LEN(data));
- /* Save page address for reference updation */
- rinfo->page = nic->rb_page;
- /* Store start address for later retrieval */
- rinfo->data = (void *)data;
- /* Store alignment offset */
- rinfo->offset = NICVF_RCV_BUF_ALIGN_LEN(data);
+ *rbuf = (u64 *)((u64)page_address(nic->rb_page) + nic->rb_page_offset);

- data += rinfo->offset;
-
- /* Give next aligned address to hw for DMA */
- *rbuf = (u64 *)(data + NICVF_RCV_BUF_ALIGN_BYTES);
return 0;
}

-/* Retrieve actual buffer start address and build skb for received packet */
+/* Build skb around receive buffer */
static struct sk_buff *nicvf_rb_ptr_to_skb(struct nicvf *nic,
u64 rb_ptr, int len)
{
+ void *data;
struct sk_buff *skb;
- struct rbuf_info *rinfo;

- rb_ptr = (u64)phys_to_virt(rb_ptr);
- /* Get buffer start address and alignment offset */
- rinfo = GET_RBUF_INFO(rb_ptr);
+ data = phys_to_virt(rb_ptr);

/* Now build an skb to give to stack */
- skb = build_skb(rinfo->data, RCV_FRAG_LEN);
+ skb = build_skb(data, RCV_FRAG_LEN);
if (!skb) {
- put_page(rinfo->page);
+ put_page(virt_to_page(data));
return NULL;
}

- /* Set correct skb->data */
- skb_reserve(skb, rinfo->offset + NICVF_RCV_BUF_ALIGN_BYTES);
-
- prefetch((void *)rb_ptr);
+ prefetch(skb->data);
return skb;
}

@@ -196,7 +168,6 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
int head, tail;
u64 buf_addr;
struct rbdr_entry_t *desc;
- struct rbuf_info *rinfo;

if (!rbdr)
return;
@@ -212,16 +183,14 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
while (head != tail) {
desc = GET_RBDR_DESC(rbdr, head);
buf_addr = desc->buf_addr << NICVF_RCV_BUF_ALIGN;
- rinfo = GET_RBUF_INFO((u64)phys_to_virt(buf_addr));
- put_page(rinfo->page);
+ put_page(virt_to_page(phys_to_virt(buf_addr)));
head++;
head &= (rbdr->dmem.q_len - 1);
}
/* Free SKB of tail desc */
desc = GET_RBDR_DESC(rbdr, tail);
buf_addr = desc->buf_addr << NICVF_RCV_BUF_ALIGN;
- rinfo = GET_RBUF_INFO((u64)phys_to_virt(buf_addr));
- put_page(rinfo->page);
+ put_page(virt_to_page(phys_to_virt(buf_addr)));

/* Free RBDR ring */
nicvf_free_q_desc_mem(nic, &rbdr->dmem);
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 033e830..a4f6667 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -83,10 +83,8 @@
#define MAX_RCV_BUF_COUNT (1ULL << (RBDR_SIZE6 + 13))
#define RBDR_THRESH (RCV_BUF_COUNT / 2)
#define DMA_BUFFER_LEN 2048 /* In multiples of 128bytes */
-#define RCV_FRAG_LEN (SKB_DATA_ALIGN(DMA_BUFFER_LEN + NET_SKB_PAD) + \
- SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + \
- (NICVF_RCV_BUF_ALIGN_BYTES * 2))
-#define RCV_DATA_OFFSET NICVF_RCV_BUF_ALIGN_BYTES
+#define RCV_FRAG_LEN (SKB_DATA_ALIGN(DMA_BUFFER_LEN + NET_SKB_PAD) + \
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))

#define MAX_CQES_FOR_TX ((SND_QUEUE_LEN / MIN_SQ_DESC_PER_PKT_XMIT) * \
MAX_CQE_PER_PKT_XMIT)
@@ -108,10 +106,6 @@
#define NICVF_SQ_BASE_ALIGN_BYTES 128 /* 7 bits */

#define NICVF_ALIGNED_ADDR(ADDR, ALIGN_BYTES) ALIGN(ADDR, ALIGN_BYTES)
-#define NICVF_ADDR_ALIGN_LEN(ADDR, BYTES)\
- (NICVF_ALIGNED_ADDR(ADDR, BYTES) - BYTES)
-#define NICVF_RCV_BUF_ALIGN_LEN(X)\
- (NICVF_ALIGNED_ADDR(X, NICVF_RCV_BUF_ALIGN_BYTES) - X)

/* Queue enable/disable */
#define NICVF_SQ_EN BIT_ULL(19)
--
1.7.1
Sunil Goutham
2015-12-09 11:38:00 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This adds support for offloading TCP segmentation to HW in pass-2
revision of hardware. Both driver level SW TSO for pass1.x chips
and HW TSO for pass-2 chip will co-exist.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 12 ++++++--
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 15 ++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 20 ++++++++++---
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
5 files changed, 56 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 39ca674..02571f4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -262,9 +262,10 @@ struct nicvf {
struct pci_dev *pdev;
u8 vf_id;
u8 node;
- u8 tns_mode:1;
- u8 sqs_mode:1;
- u8 loopback_supported:1;
+ bool tns_mode:1;
+ bool sqs_mode:1;
+ bool loopback_supported:1;
+ bool hw_tso:1;
u16 mtu;
struct queue_set *qs;
#define MAX_SQS_PER_VF_SINGLE_NODE 5
@@ -489,6 +490,11 @@ static inline int nic_get_node_id(struct pci_dev *pdev)
return ((addr >> NIC_NODE_ID_SHIFT) & NIC_NODE_ID_MASK);
}

+static inline bool pass1_silicon(struct pci_dev *pdev)
+{
+ return pdev->revision < 8;
+}
+
int nicvf_set_real_num_queues(struct net_device *netdev,
int tx_queues, int rx_queues);
int nicvf_open(struct net_device *netdev);
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 4b7fd63..9f80de4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -55,11 +55,6 @@ struct nicpf {
bool irq_allocated[NIC_PF_MSIX_VECTORS];
};

-static inline bool pass1_silicon(struct nicpf *nic)
-{
- return nic->pdev->revision < 8;
-}
-
/* Supported devices */
static const struct pci_device_id nic_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, PCI_DEVICE_ID_THUNDER_NIC_PF) },
@@ -123,7 +118,7 @@ static void nic_send_msg_to_vf(struct nicpf *nic, int vf, union nic_mbx *mbx)
* when PF writes to MBOX(1), in next revisions when
* PF writes to MBOX(0)
*/
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
/* see the comment for nic_reg_write()/nic_reg_read()
* functions above
*/
@@ -400,7 +395,7 @@ static void nic_config_cpi(struct nicpf *nic, struct cpi_cfg_msg *cfg)
padd = cpi % 8; /* 3 bits CS out of 6bits DSCP */

/* Leave RSS_SIZE as '0' to disable RSS */
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi << 3),
(vnic << 24) | (padd << 16) |
(rssi_base + rssi));
@@ -470,7 +465,7 @@ static void nic_config_rss(struct nicpf *nic, struct rss_cfg_msg *cfg)
}

cpi_base = nic->cpi_base[cfg->vf_id];
- if (pass1_silicon(nic))
+ if (pass1_silicon(nic->pdev))
idx_addr = NIC_PF_CPI_0_2047_CFG;
else
idx_addr = NIC_PF_MPI_0_2047_CFG;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index dde8dc7..c24cb2a 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -525,14 +525,22 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
__func__, cqe_tx->sq_qs, cqe_tx->sq_idx,
cqe_tx->sqe_ptr, hdr->subdesc_cnt);

- nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
nicvf_check_cqe_tx_errs(nic, cq, cqe_tx);
skb = (struct sk_buff *)sq->skbuff[cqe_tx->sqe_ptr];
- /* For TSO offloaded packets only one head SKB needs to be freed */
+ /* For TSO offloaded packets only one SQE will have a valid SKB */
if (skb) {
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
prefetch(skb);
dev_consume_skb_any(skb);
sq->skbuff[cqe_tx->sqe_ptr] = (u64)NULL;
+ } else {
+ /* In case of HW TSO, HW sends a CQE for each segment of a TSO
+ * packet instead of a single CQE for the whole TSO packet
+ * transmitted. Each of this CQE points to the same SQE, so
+ * avoid freeing same SQE multiple times.
+ */
+ if (!nic->hw_tso)
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
}
}

@@ -1549,6 +1557,9 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

netdev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;

+ if (!pass1_silicon(nic->pdev))
+ nic->hw_tso = true;
+
netdev->netdev_ops = &nicvf_netdev_ops;
netdev->watchdog_timeo = NICVF_TX_TIMEOUT;

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 1fbd908..b11fc09 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -925,7 +925,7 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff *skb)
{
int subdesc_cnt = MIN_SQ_DESC_PER_PKT_XMIT;

- if (skb_shinfo(skb)->gso_size) {
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso) {
subdesc_cnt = nicvf_tso_count_subdescs(skb);
return subdesc_cnt;
}
@@ -940,7 +940,7 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff *skb)
* First subdescriptor for every send descriptor.
*/
static inline void
-nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
+nicvf_sq_add_hdr_subdesc(struct nicvf *nic, struct snd_queue *sq, int qentry,
int subdesc_cnt, struct sk_buff *skb, int len)
{
int proto;
@@ -976,6 +976,15 @@ nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
break;
}
}
+
+ if (nic->hw_tso && skb_shinfo(skb)->gso_size) {
+ hdr->tso = 1;
+ hdr->tso_start = skb_transport_offset(skb) + tcp_hdrlen(skb);
+ hdr->tso_max_paysize = skb_shinfo(skb)->gso_size;
+ /* For non-tunneled pkts, point this to L2 ethertype */
+ hdr->inner_l3_offset = skb_network_offset(skb) - 2;
+ nic->drv_stats.tx_tso++;
+ }
}

/* SQ GATHER subdescriptor
@@ -1045,7 +1054,7 @@ static int nicvf_sq_append_tso(struct nicvf *nic, struct snd_queue *sq,
data_left -= size;
tso_build_data(skb, &tso, size);
}
- nicvf_sq_add_hdr_subdesc(sq, hdr_qentry,
+ nicvf_sq_add_hdr_subdesc(nic, sq, hdr_qentry,
seg_subdescs - 1, skb, seg_len);
sq->skbuff[hdr_qentry] = (u64)NULL;
qentry = nicvf_get_nxt_sqentry(sq, qentry);
@@ -1098,11 +1107,12 @@ int nicvf_sq_append_skb(struct nicvf *nic, struct sk_buff *skb)
qentry = nicvf_get_sq_desc(sq, subdesc_cnt);

/* Check if its a TSO packet */
- if (skb_shinfo(skb)->gso_size)
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso)
return nicvf_sq_append_tso(nic, sq, sq_num, qentry, skb);

/* Add SQ header subdesc */
- nicvf_sq_add_hdr_subdesc(sq, qentry, subdesc_cnt - 1, skb, skb->len);
+ nicvf_sq_add_hdr_subdesc(nic, sq, qentry, subdesc_cnt - 1,
+ skb, skb->len);

/* Add SQ gather subdescs */
qentry = nicvf_get_nxt_sqentry(sq, qentry);
diff --git a/drivers/net/ethernet/cavium/thunder/q_struct.h b/drivers/net/ethernet/cavium/thunder/q_struct.h
index 3c1de97..9e6d987 100644
--- a/drivers/net/ethernet/cavium/thunder/q_struct.h
+++ b/drivers/net/ethernet/cavium/thunder/q_struct.h
@@ -545,25 +545,28 @@ struct sq_hdr_subdesc {
u64 subdesc_cnt:8;
u64 csum_l4:2;
u64 csum_l3:1;
- u64 rsvd0:5;
+ u64 csum_inner_l4:2;
+ u64 csum_inner_l3:1;
+ u64 rsvd0:2;
u64 l4_offset:8;
u64 l3_offset:8;
u64 rsvd1:4;
u64 tot_len:20; /* W0 */

- u64 tso_sdc_cont:8;
- u64 tso_sdc_first:8;
- u64 tso_l4_offset:8;
- u64 tso_flags_last:12;
- u64 tso_flags_first:12;
- u64 rsvd2:2;
+ u64 rsvd2:24;
+ u64 inner_l4_offset:8;
+ u64 inner_l3_offset:8;
+ u64 tso_start:8;
+ u64 rsvd3:2;
u64 tso_max_paysize:14; /* W1 */
#elif defined(__LITTLE_ENDIAN_BITFIELD)
u64 tot_len:20;
u64 rsvd1:4;
u64 l3_offset:8;
u64 l4_offset:8;
- u64 rsvd0:5;
+ u64 rsvd0:2;
+ u64 csum_inner_l3:1;
+ u64 csum_inner_l4:2;
u64 csum_l3:1;
u64 csum_l4:2;
u64 subdesc_cnt:8;
@@ -574,12 +577,11 @@ struct sq_hdr_subdesc {
u64 subdesc_type:4; /* W0 */

u64 tso_max_paysize:14;
- u64 rsvd2:2;
- u64 tso_flags_first:12;
- u64 tso_flags_last:12;
- u64 tso_l4_offset:8;
- u64 tso_sdc_first:8;
- u64 tso_sdc_cont:8; /* W1 */
+ u64 rsvd3:2;
+ u64 tso_start:8;
+ u64 inner_l3_offset:8;
+ u64 inner_l4_offset:8;
+ u64 rsvd2:24; /* W1 */
#endif
};
--
1.7.1
Pavel Fedin
2015-12-09 12:05:01 UTC
Permalink
Hello!
-----Original Message-----
Goutham
Sent: Wednesday, December 09, 2015 2:38 PM
Subject: [PATCH 1/2] net: thunderx: HW TSO support for pass-2 hardware
This adds support for offloading TCP segmentation to HW in pass-2
revision of hardware. Both driver level SW TSO for pass1.x chips
and HW TSO for pass-2 chip will co-exist.
---
drivers/net/ethernet/cavium/thunder/nic.h | 12 ++++++--
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 15 ++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 20 ++++++++++---
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
5 files changed, 56 insertions(+), 32 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/nic.h
b/drivers/net/ethernet/cavium/thunder/nic.h
index 39ca674..02571f4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -262,9 +262,10 @@ struct nicvf {
struct pci_dev *pdev;
u8 vf_id;
u8 node;
- u8 tns_mode:1;
- u8 sqs_mode:1;
- u8 loopback_supported:1;
+ bool tns_mode:1;
+ bool sqs_mode:1;
These little refactors are creeping in your code without even being mentioned in the commit message, this is not good practice
IMHO. Additionally, may be turn these two flags into something like:

enum nicvf_mode {
NICVF_BYPASS,
NICVF_TNS,
NICVF_SQS
};

? Anyway, these modes are mutually exclusive.
+ bool loopback_supported:1;
+ bool hw_tso:1;
u16 mtu;
struct queue_set *qs;
#define MAX_SQS_PER_VF_SINGLE_NODE 5
@@ -489,6 +490,11 @@ static inline int nic_get_node_id(struct pci_dev *pdev)
return ((addr >> NIC_NODE_ID_SHIFT) & NIC_NODE_ID_MASK);
}
+static inline bool pass1_silicon(struct pci_dev *pdev)
+{
+ return pdev->revision < 8;
+}
+
int nicvf_set_real_num_queues(struct net_device *netdev,
int tx_queues, int rx_queues);
int nicvf_open(struct net_device *netdev);
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c
b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 4b7fd63..9f80de4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -55,11 +55,6 @@ struct nicpf {
bool irq_allocated[NIC_PF_MSIX_VECTORS];
};
-static inline bool pass1_silicon(struct nicpf *nic)
-{
- return nic->pdev->revision < 8;
-}
-
/* Supported devices */
static const struct pci_device_id nic_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, PCI_DEVICE_ID_THUNDER_NIC_PF) },
@@ -123,7 +118,7 @@ static void nic_send_msg_to_vf(struct nicpf *nic, int vf, union nic_mbx
*mbx)
* when PF writes to MBOX(1), in next revisions when
* PF writes to MBOX(0)
*/
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
/* see the comment for nic_reg_write()/nic_reg_read()
* functions above
*/
@@ -400,7 +395,7 @@ static void nic_config_cpi(struct nicpf *nic, struct cpi_cfg_msg *cfg)
padd = cpi % 8; /* 3 bits CS out of 6bits DSCP */
/* Leave RSS_SIZE as '0' to disable RSS */
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi << 3),
(vnic << 24) | (padd << 16) |
(rssi_base + rssi));
@@ -470,7 +465,7 @@ static void nic_config_rss(struct nicpf *nic, struct rss_cfg_msg *cfg)
}
cpi_base = nic->cpi_base[cfg->vf_id];
- if (pass1_silicon(nic))
+ if (pass1_silicon(nic->pdev))
idx_addr = NIC_PF_CPI_0_2047_CFG;
else
idx_addr = NIC_PF_MPI_0_2047_CFG;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index dde8dc7..c24cb2a 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -525,14 +525,22 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
__func__, cqe_tx->sq_qs, cqe_tx->sq_idx,
cqe_tx->sqe_ptr, hdr->subdesc_cnt);
- nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
nicvf_check_cqe_tx_errs(nic, cq, cqe_tx);
skb = (struct sk_buff *)sq->skbuff[cqe_tx->sqe_ptr];
- /* For TSO offloaded packets only one head SKB needs to be freed */
+ /* For TSO offloaded packets only one SQE will have a valid SKB */
if (skb) {
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
prefetch(skb);
dev_consume_skb_any(skb);
sq->skbuff[cqe_tx->sqe_ptr] = (u64)NULL;
+ } else {
+ /* In case of HW TSO, HW sends a CQE for each segment of a TSO
+ * packet instead of a single CQE for the whole TSO packet
+ * transmitted. Each of this CQE points to the same SQE, so
+ * avoid freeing same SQE multiple times.
+ */
+ if (!nic->hw_tso)
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
}
}
@@ -1549,6 +1557,9 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id
*ent)
netdev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
+ if (!pass1_silicon(nic->pdev))
+ nic->hw_tso = true;
+
netdev->netdev_ops = &nicvf_netdev_ops;
netdev->watchdog_timeo = NICVF_TX_TIMEOUT;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 1fbd908..b11fc09 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -925,7 +925,7 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff
*skb)
{
int subdesc_cnt = MIN_SQ_DESC_PER_PKT_XMIT;
- if (skb_shinfo(skb)->gso_size) {
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso) {
subdesc_cnt = nicvf_tso_count_subdescs(skb);
return subdesc_cnt;
}
@@ -940,7 +940,7 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff
*skb)
* First subdescriptor for every send descriptor.
*/
static inline void
-nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
+nicvf_sq_add_hdr_subdesc(struct nicvf *nic, struct snd_queue *sq, int qentry,
int subdesc_cnt, struct sk_buff *skb, int len)
{
int proto;
@@ -976,6 +976,15 @@ nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
break;
}
}
+
+ if (nic->hw_tso && skb_shinfo(skb)->gso_size) {
+ hdr->tso = 1;
+ hdr->tso_start = skb_transport_offset(skb) + tcp_hdrlen(skb);
+ hdr->tso_max_paysize = skb_shinfo(skb)->gso_size;
+ /* For non-tunneled pkts, point this to L2 ethertype */
+ hdr->inner_l3_offset = skb_network_offset(skb) - 2;
+ nic->drv_stats.tx_tso++;
+ }
}
/* SQ GATHER subdescriptor
@@ -1045,7 +1054,7 @@ static int nicvf_sq_append_tso(struct nicvf *nic, struct snd_queue *sq,
data_left -= size;
tso_build_data(skb, &tso, size);
}
- nicvf_sq_add_hdr_subdesc(sq, hdr_qentry,
+ nicvf_sq_add_hdr_subdesc(nic, sq, hdr_qentry,
seg_subdescs - 1, skb, seg_len);
sq->skbuff[hdr_qentry] = (u64)NULL;
qentry = nicvf_get_nxt_sqentry(sq, qentry);
@@ -1098,11 +1107,12 @@ int nicvf_sq_append_skb(struct nicvf *nic, struct sk_buff *skb)
qentry = nicvf_get_sq_desc(sq, subdesc_cnt);
/* Check if its a TSO packet */
- if (skb_shinfo(skb)->gso_size)
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso)
return nicvf_sq_append_tso(nic, sq, sq_num, qentry, skb);
/* Add SQ header subdesc */
- nicvf_sq_add_hdr_subdesc(sq, qentry, subdesc_cnt - 1, skb, skb->len);
+ nicvf_sq_add_hdr_subdesc(nic, sq, qentry, subdesc_cnt - 1,
+ skb, skb->len);
/* Add SQ gather subdescs */
qentry = nicvf_get_nxt_sqentry(sq, qentry);
diff --git a/drivers/net/ethernet/cavium/thunder/q_struct.h
b/drivers/net/ethernet/cavium/thunder/q_struct.h
index 3c1de97..9e6d987 100644
--- a/drivers/net/ethernet/cavium/thunder/q_struct.h
+++ b/drivers/net/ethernet/cavium/thunder/q_struct.h
@@ -545,25 +545,28 @@ struct sq_hdr_subdesc {
u64 subdesc_cnt:8;
u64 csum_l4:2;
u64 csum_l3:1;
- u64 rsvd0:5;
+ u64 csum_inner_l4:2;
+ u64 csum_inner_l3:1;
+ u64 rsvd0:2;
u64 l4_offset:8;
u64 l3_offset:8;
u64 rsvd1:4;
u64 tot_len:20; /* W0 */
- u64 tso_sdc_cont:8;
- u64 tso_sdc_first:8;
- u64 tso_l4_offset:8;
- u64 tso_flags_last:12;
- u64 tso_flags_first:12;
- u64 rsvd2:2;
+ u64 rsvd2:24;
+ u64 inner_l4_offset:8;
+ u64 inner_l3_offset:8;
+ u64 tso_start:8;
+ u64 rsvd3:2;
u64 tso_max_paysize:14; /* W1 */
#elif defined(__LITTLE_ENDIAN_BITFIELD)
u64 tot_len:20;
u64 rsvd1:4;
u64 l3_offset:8;
u64 l4_offset:8;
- u64 rsvd0:5;
+ u64 rsvd0:2;
+ u64 csum_inner_l3:1;
+ u64 csum_inner_l4:2;
u64 csum_l3:1;
u64 csum_l4:2;
u64 subdesc_cnt:8;
@@ -574,12 +577,11 @@ struct sq_hdr_subdesc {
u64 subdesc_type:4; /* W0 */
u64 tso_max_paysize:14;
- u64 rsvd2:2;
- u64 tso_flags_first:12;
- u64 tso_flags_last:12;
- u64 tso_l4_offset:8;
- u64 tso_sdc_first:8;
- u64 tso_sdc_cont:8; /* W1 */
+ u64 rsvd3:2;
+ u64 tso_start:8;
+ u64 inner_l3_offset:8;
+ u64 inner_l4_offset:8;
+ u64 rsvd2:24; /* W1 */
#endif
};
--
1.7.1
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Sunil Kovvuri
2015-12-09 12:24:16 UTC
Permalink
Post by Pavel Fedin
+ bool tns_mode:1;
+ bool sqs_mode:1;
These little refactors are creeping in your code without even being
mentioned in the commit message, this is not good practice.
Okay, will include these in the commit message.
Post by Pavel Fedin
enum nicvf_mode {
NICVF_BYPASS,
NICVF_TNS,
NICVF_SQS
};
? Anyway, these modes are mutually exclusive.
VF driver always assumed to be in BYPASS mode.
Will submit a seperate cleanup patch to get rid of 'tns_mode' boolean.

Thanks,
Sunil.
David Miller
2015-12-09 20:26:22 UTC
Permalink
From: Pavel Fedin <***@samsung.com>
Date: Wed, 09 Dec 2015 15:05:01 +0300
Post by Pavel Fedin
Hello!
-----Original Message-----
Goutham
Sent: Wednesday, December 09, 2015 2:38 PM
Subject: [PATCH 1/2] net: thunderx: HW TSO support for pass-2 hardware
This adds support for offloading TCP segmentation to HW in pass-2
revision of hardware. Both driver level SW TSO for pass1.x chips
and HW TSO for pass-2 chip will co-exist.
---
drivers/net/ethernet/cavium/thunder/nic.h | 12 ++++++--
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 15 ++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 20 ++++++++++---
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
5 files changed, 56 insertions(+), 32 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/nic.h
b/drivers/net/ethernet/cavium/thunder/nic.h
index 39ca674..02571f4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -262,9 +262,10 @@ struct nicvf {
struct pci_dev *pdev;
u8 vf_id;
u8 node;
- u8 tns_mode:1;
- u8 sqs_mode:1;
- u8 loopback_supported:1;
+ bool tns_mode:1;
+ bool sqs_mode:1;
These little refactors are creeping in your code without even being mentioned in the commit message, this is not good practice
Also I disagree completely with boolean bitfields. Just use plain 'bool' and let
the compiler decide how to lay it out.
Sunil Goutham
2015-12-09 11:37:59 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This patch set adds support for new features added in pass-2 revision
of hardware like TSO and count based interrupt coalescing.

Sunil Goutham (2):
net: thunderx: HW TSO support for pass-2 hardware
net: thunderx: Enable CQE count threshold interrupt

drivers/net/ethernet/cavium/thunder/nic.h | 14 +++++++--
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 17 +++++++++--
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 25 ++++++++++++----
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 3 +-
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
6 files changed, 65 insertions(+), 35 deletions(-)
Sunil Goutham
2015-12-09 11:38:01 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This feature is introduced in pass-2 chip and with this CQ interrupt
coalescing will work based on both timer and count.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 2 ++
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 2 +-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 5 ++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 3 ++-
4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 02571f4..e782856 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -34,6 +34,8 @@
/* NIC priv flags */
#define NIC_SRIOV_ENABLED BIT(0)

+#define VNIC_NAPI_WEIGHT NAPI_POLL_WEIGHT
+
/* Min/Max packet size */
#define NIC_HW_MIN_FRS 64
#define NIC_HW_MAX_FRS 9200 /* 9216 max packet including FCS */
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index c24cb2a..e06a7f8 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1155,7 +1155,7 @@ int nicvf_open(struct net_device *netdev)
cq_poll->cq_idx = qidx;
cq_poll->nicvf = nic;
netif_napi_add(netdev, &cq_poll->napi, nicvf_poll,
- NAPI_POLL_WEIGHT);
+ VNIC_NAPI_WEIGHT);
napi_enable(&cq_poll->napi);
nic->napi[qidx] = cq_poll;
}
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index b11fc09..4e9709e 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -299,7 +299,10 @@ static int nicvf_init_cmp_queue(struct nicvf *nic,
return err;

cq->desc = cq->dmem.base;
- cq->thresh = CMP_QUEUE_CQE_THRESH;
+ if (!pass1_silicon(nic->pdev))
+ cq->thresh = CMP_QUEUE_CQE_THRESH;
+ else
+ cq->thresh = 0;
nic->cq_coalesce_usecs = (CMP_QUEUE_TIMER_THRESH * 0.05) - 1;

return 0;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index a4f6667..0fae6ad 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -10,6 +10,7 @@
#define NICVF_QUEUES_H

#include <linux/netdevice.h>
+#include "nic.h"
#include "q_struct.h"

#define MAX_QUEUE_SET 128
@@ -75,7 +76,7 @@
*/
#define CMP_QSIZE CMP_QUEUE_SIZE2
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
-#define CMP_QUEUE_CQE_THRESH 0
+#define CMP_QUEUE_CQE_THRESH (VNIC_NAPI_WEIGHT / 2)
#define CMP_QUEUE_TIMER_THRESH 80 /* ~2usec */

#define RBDR_SIZE RBDR_SIZE0
--
1.7.1
Pavel Fedin
2015-12-09 12:07:55 UTC
Permalink
Hello!
-----Original Message-----
Goutham
Sent: Wednesday, December 09, 2015 2:38 PM
Subject: [PATCH 2/2] net: thunderx: Enable CQE count threshold interrupt
This feature is introduced in pass-2 chip and with this CQ interrupt
coalescing will work based on both timer and count.
---
drivers/net/ethernet/cavium/thunder/nic.h | 2 ++
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 2 +-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 5 ++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 3 ++-
4 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/cavium/thunder/nic.h
b/drivers/net/ethernet/cavium/thunder/nic.h
index 02571f4..e782856 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -34,6 +34,8 @@
/* NIC priv flags */
#define NIC_SRIOV_ENABLED BIT(0)
+#define VNIC_NAPI_WEIGHT NAPI_POLL_WEIGHT
+
/* Min/Max packet size */
#define NIC_HW_MIN_FRS 64
#define NIC_HW_MAX_FRS 9200 /* 9216 max packet including FCS */
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index c24cb2a..e06a7f8 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1155,7 +1155,7 @@ int nicvf_open(struct net_device *netdev)
cq_poll->cq_idx = qidx;
cq_poll->nicvf = nic;
netif_napi_add(netdev, &cq_poll->napi, nicvf_poll,
- NAPI_POLL_WEIGHT);
+ VNIC_NAPI_WEIGHT);
What's the sense in introducing another constant which is aliased to the previous one? Making LOC bigger?
napi_enable(&cq_poll->napi);
nic->napi[qidx] = cq_poll;
}
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index b11fc09..4e9709e 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -299,7 +299,10 @@ static int nicvf_init_cmp_queue(struct nicvf *nic,
return err;
cq->desc = cq->dmem.base;
- cq->thresh = CMP_QUEUE_CQE_THRESH;
+ if (!pass1_silicon(nic->pdev))
+ cq->thresh = CMP_QUEUE_CQE_THRESH;
+ else
+ cq->thresh = 0;
IMHO "cq->thresh = pass1_silicon(nic->pdev) ? CMP_QUEUE_CQE_THRESH : 0" looks less bulky.
nic->cq_coalesce_usecs = (CMP_QUEUE_TIMER_THRESH * 0.05) - 1;
return 0;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index a4f6667..0fae6ad 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -10,6 +10,7 @@
#define NICVF_QUEUES_H
#include <linux/netdevice.h>
+#include "nic.h"
#include "q_struct.h"
#define MAX_QUEUE_SET 128
@@ -75,7 +76,7 @@
*/
#define CMP_QSIZE CMP_QUEUE_SIZE2
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
-#define CMP_QUEUE_CQE_THRESH 0
+#define CMP_QUEUE_CQE_THRESH (VNIC_NAPI_WEIGHT / 2)
#define CMP_QUEUE_TIMER_THRESH 80 /* ~2usec */
#define RBDR_SIZE RBDR_SIZE0
--
1.7.1
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Sunil Kovvuri
2015-12-09 12:26:52 UTC
Permalink
Will take care of above suggestions and resubmit.

Thanks,
Sunil.
Sunil Goutham
2015-12-10 07:55:18 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This patch set adds support for new features added in pass-2 revision
of hardware like TSO and count based interrupt coalescing.

Changes from v1:
- Addressed comments received regarding boolean bit field changes
by excluding them from this patch. Will submit a seperate
patch along with cleanup of unsed field.
- Got rid of new macro 'VNIC_NAPI_WEIGHT' introduced in
count threshold interrupt patch.

Sunil Goutham (2):
net: thunderx: HW TSO support for pass-2 hardware
net: thunderx: Enable CQE count threshold interrupt

drivers/net/ethernet/cavium/thunder/nic.h | 6 ++++
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 15 ++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 22 ++++++++++----
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 2 +-
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
6 files changed, 55 insertions(+), 31 deletions(-)
Pavel Fedin
2015-12-10 08:52:15 UTC
Permalink
All series:

Reviewed-by: Pavel Fedin <***@samsung.com>

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

-----Original Message-----
From: netdev-***@vger.kernel.org [mailto:netdev-***@vger.kernel.org] On Behalf Of Sunil Goutham
Sent: Thursday, December 10, 2015 10:55 AM
To: ***@vger.kernel.org
Cc: linux-***@vger.kernel.org; linux-arm-***@lists.infradead.org; ***@samsung.com; ***@caviumnetworks.com; Sunil
Goutham
Subject: [PATCH v2 0/2] net: thunderx: Support for pass-2 hw features

From: Sunil Goutham <***@cavium.com>

This patch set adds support for new features added in pass-2 revision of hardware like TSO and count based interrupt coalescing.

Changes from v1:
- Addressed comments received regarding boolean bit field changes
by excluding them from this patch. Will submit a seperate
patch along with cleanup of unsed field.
- Got rid of new macro 'VNIC_NAPI_WEIGHT' introduced in
count threshold interrupt patch.

Sunil Goutham (2):
net: thunderx: HW TSO support for pass-2 hardware
net: thunderx: Enable CQE count threshold interrupt

drivers/net/ethernet/cavium/thunder/nic.h | 6 ++++
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 15 ++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 22 ++++++++++----
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 2 +-
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
6 files changed, 55 insertions(+), 31 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to ***@vger.kernel.org More
majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2015-12-12 04:38:32 UTC
Permalink
From: Sunil Goutham <***@gmail.com>
Date: Thu, 10 Dec 2015 13:25:18 +0530
Post by Sunil Goutham
This patch set adds support for new features added in pass-2 revision
of hardware like TSO and count based interrupt coalescing.
- Addressed comments received regarding boolean bit field changes
by excluding them from this patch. Will submit a seperate
patch along with cleanup of unsed field.
- Got rid of new macro 'VNIC_NAPI_WEIGHT' introduced in
count threshold interrupt patch.
Series applied to net-next, thanks.
Sunil Goutham
2015-12-10 07:55:20 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This feature is introduced in pass-2 chip and with this CQ interrupt
coalescing will work based on both timer and count.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 2 +-
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index b11fc09..d0d1b54 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -299,7 +299,7 @@ static int nicvf_init_cmp_queue(struct nicvf *nic,
return err;

cq->desc = cq->dmem.base;
- cq->thresh = CMP_QUEUE_CQE_THRESH;
+ cq->thresh = pass1_silicon(nic->pdev) ? 0 : CMP_QUEUE_CQE_THRESH;
nic->cq_coalesce_usecs = (CMP_QUEUE_TIMER_THRESH * 0.05) - 1;

return 0;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index a4f6667..c5030a7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -75,7 +75,7 @@
*/
#define CMP_QSIZE CMP_QUEUE_SIZE2
#define CMP_QUEUE_LEN (1ULL << (CMP_QSIZE + 10))
-#define CMP_QUEUE_CQE_THRESH 0
+#define CMP_QUEUE_CQE_THRESH (NAPI_POLL_WEIGHT / 2)
#define CMP_QUEUE_TIMER_THRESH 80 /* ~2usec */

#define RBDR_SIZE RBDR_SIZE0
--
1.7.1
Sunil Goutham
2015-12-10 07:55:19 UTC
Permalink
From: Sunil Goutham <***@cavium.com>

This adds support for offloading TCP segmentation to HW in pass-2
revision of hardware. Both driver level SW TSO for pass1.x chips
and HW TSO for pass-2 chip will co-exist. Modified SQ descriptor
structures to reflect pass-2 hw implementation.

Signed-off-by: Sunil Goutham <***@cavium.com>
---
drivers/net/ethernet/cavium/thunder/nic.h | 6 ++++
drivers/net/ethernet/cavium/thunder/nic_main.c | 11 ++-----
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 15 ++++++++-
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 20 ++++++++++---
drivers/net/ethernet/cavium/thunder/q_struct.h | 30 ++++++++++---------
5 files changed, 53 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 39ca674..6888288 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -265,6 +265,7 @@ struct nicvf {
u8 tns_mode:1;
u8 sqs_mode:1;
u8 loopback_supported:1;
+ bool hw_tso;
u16 mtu;
struct queue_set *qs;
#define MAX_SQS_PER_VF_SINGLE_NODE 5
@@ -489,6 +490,11 @@ static inline int nic_get_node_id(struct pci_dev *pdev)
return ((addr >> NIC_NODE_ID_SHIFT) & NIC_NODE_ID_MASK);
}

+static inline bool pass1_silicon(struct pci_dev *pdev)
+{
+ return pdev->revision < 8;
+}
+
int nicvf_set_real_num_queues(struct net_device *netdev,
int tx_queues, int rx_queues);
int nicvf_open(struct net_device *netdev);
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 4b7fd63..9f80de4 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -55,11 +55,6 @@ struct nicpf {
bool irq_allocated[NIC_PF_MSIX_VECTORS];
};

-static inline bool pass1_silicon(struct nicpf *nic)
-{
- return nic->pdev->revision < 8;
-}
-
/* Supported devices */
static const struct pci_device_id nic_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, PCI_DEVICE_ID_THUNDER_NIC_PF) },
@@ -123,7 +118,7 @@ static void nic_send_msg_to_vf(struct nicpf *nic, int vf, union nic_mbx *mbx)
* when PF writes to MBOX(1), in next revisions when
* PF writes to MBOX(0)
*/
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
/* see the comment for nic_reg_write()/nic_reg_read()
* functions above
*/
@@ -400,7 +395,7 @@ static void nic_config_cpi(struct nicpf *nic, struct cpi_cfg_msg *cfg)
padd = cpi % 8; /* 3 bits CS out of 6bits DSCP */

/* Leave RSS_SIZE as '0' to disable RSS */
- if (pass1_silicon(nic)) {
+ if (pass1_silicon(nic->pdev)) {
nic_reg_write(nic, NIC_PF_CPI_0_2047_CFG | (cpi << 3),
(vnic << 24) | (padd << 16) |
(rssi_base + rssi));
@@ -470,7 +465,7 @@ static void nic_config_rss(struct nicpf *nic, struct rss_cfg_msg *cfg)
}

cpi_base = nic->cpi_base[cfg->vf_id];
- if (pass1_silicon(nic))
+ if (pass1_silicon(nic->pdev))
idx_addr = NIC_PF_CPI_0_2047_CFG;
else
idx_addr = NIC_PF_MPI_0_2047_CFG;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index dde8dc7..c24cb2a 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -525,14 +525,22 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
__func__, cqe_tx->sq_qs, cqe_tx->sq_idx,
cqe_tx->sqe_ptr, hdr->subdesc_cnt);

- nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
nicvf_check_cqe_tx_errs(nic, cq, cqe_tx);
skb = (struct sk_buff *)sq->skbuff[cqe_tx->sqe_ptr];
- /* For TSO offloaded packets only one head SKB needs to be freed */
+ /* For TSO offloaded packets only one SQE will have a valid SKB */
if (skb) {
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
prefetch(skb);
dev_consume_skb_any(skb);
sq->skbuff[cqe_tx->sqe_ptr] = (u64)NULL;
+ } else {
+ /* In case of HW TSO, HW sends a CQE for each segment of a TSO
+ * packet instead of a single CQE for the whole TSO packet
+ * transmitted. Each of this CQE points to the same SQE, so
+ * avoid freeing same SQE multiple times.
+ */
+ if (!nic->hw_tso)
+ nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
}
}

@@ -1549,6 +1557,9 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

netdev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;

+ if (!pass1_silicon(nic->pdev))
+ nic->hw_tso = true;
+
netdev->netdev_ops = &nicvf_netdev_ops;
netdev->watchdog_timeo = NICVF_TX_TIMEOUT;

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 1fbd908..b11fc09 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -925,7 +925,7 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff *skb)
{
int subdesc_cnt = MIN_SQ_DESC_PER_PKT_XMIT;

- if (skb_shinfo(skb)->gso_size) {
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso) {
subdesc_cnt = nicvf_tso_count_subdescs(skb);
return subdesc_cnt;
}
@@ -940,7 +940,7 @@ static int nicvf_sq_subdesc_required(struct nicvf *nic, struct sk_buff *skb)
* First subdescriptor for every send descriptor.
*/
static inline void
-nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
+nicvf_sq_add_hdr_subdesc(struct nicvf *nic, struct snd_queue *sq, int qentry,
int subdesc_cnt, struct sk_buff *skb, int len)
{
int proto;
@@ -976,6 +976,15 @@ nicvf_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
break;
}
}
+
+ if (nic->hw_tso && skb_shinfo(skb)->gso_size) {
+ hdr->tso = 1;
+ hdr->tso_start = skb_transport_offset(skb) + tcp_hdrlen(skb);
+ hdr->tso_max_paysize = skb_shinfo(skb)->gso_size;
+ /* For non-tunneled pkts, point this to L2 ethertype */
+ hdr->inner_l3_offset = skb_network_offset(skb) - 2;
+ nic->drv_stats.tx_tso++;
+ }
}

/* SQ GATHER subdescriptor
@@ -1045,7 +1054,7 @@ static int nicvf_sq_append_tso(struct nicvf *nic, struct snd_queue *sq,
data_left -= size;
tso_build_data(skb, &tso, size);
}
- nicvf_sq_add_hdr_subdesc(sq, hdr_qentry,
+ nicvf_sq_add_hdr_subdesc(nic, sq, hdr_qentry,
seg_subdescs - 1, skb, seg_len);
sq->skbuff[hdr_qentry] = (u64)NULL;
qentry = nicvf_get_nxt_sqentry(sq, qentry);
@@ -1098,11 +1107,12 @@ int nicvf_sq_append_skb(struct nicvf *nic, struct sk_buff *skb)
qentry = nicvf_get_sq_desc(sq, subdesc_cnt);

/* Check if its a TSO packet */
- if (skb_shinfo(skb)->gso_size)
+ if (skb_shinfo(skb)->gso_size && !nic->hw_tso)
return nicvf_sq_append_tso(nic, sq, sq_num, qentry, skb);

/* Add SQ header subdesc */
- nicvf_sq_add_hdr_subdesc(sq, qentry, subdesc_cnt - 1, skb, skb->len);
+ nicvf_sq_add_hdr_subdesc(nic, sq, qentry, subdesc_cnt - 1,
+ skb, skb->len);

/* Add SQ gather subdescs */
qentry = nicvf_get_nxt_sqentry(sq, qentry);
diff --git a/drivers/net/ethernet/cavium/thunder/q_struct.h b/drivers/net/ethernet/cavium/thunder/q_struct.h
index 3c1de97..9e6d987 100644
--- a/drivers/net/ethernet/cavium/thunder/q_struct.h
+++ b/drivers/net/ethernet/cavium/thunder/q_struct.h
@@ -545,25 +545,28 @@ struct sq_hdr_subdesc {
u64 subdesc_cnt:8;
u64 csum_l4:2;
u64 csum_l3:1;
- u64 rsvd0:5;
+ u64 csum_inner_l4:2;
+ u64 csum_inner_l3:1;
+ u64 rsvd0:2;
u64 l4_offset:8;
u64 l3_offset:8;
u64 rsvd1:4;
u64 tot_len:20; /* W0 */

- u64 tso_sdc_cont:8;
- u64 tso_sdc_first:8;
- u64 tso_l4_offset:8;
- u64 tso_flags_last:12;
- u64 tso_flags_first:12;
- u64 rsvd2:2;
+ u64 rsvd2:24;
+ u64 inner_l4_offset:8;
+ u64 inner_l3_offset:8;
+ u64 tso_start:8;
+ u64 rsvd3:2;
u64 tso_max_paysize:14; /* W1 */
#elif defined(__LITTLE_ENDIAN_BITFIELD)
u64 tot_len:20;
u64 rsvd1:4;
u64 l3_offset:8;
u64 l4_offset:8;
- u64 rsvd0:5;
+ u64 rsvd0:2;
+ u64 csum_inner_l3:1;
+ u64 csum_inner_l4:2;
u64 csum_l3:1;
u64 csum_l4:2;
u64 subdesc_cnt:8;
@@ -574,12 +577,11 @@ struct sq_hdr_subdesc {
u64 subdesc_type:4; /* W0 */

u64 tso_max_paysize:14;
- u64 rsvd2:2;
- u64 tso_flags_first:12;
- u64 tso_flags_last:12;
- u64 tso_l4_offset:8;
- u64 tso_sdc_first:8;
- u64 tso_sdc_cont:8; /* W1 */
+ u64 rsvd3:2;
+ u64 tso_start:8;
+ u64 inner_l3_offset:8;
+ u64 inner_l4_offset:8;
+ u64 rsvd2:24; /* W1 */
#endif
};
--
1.7.1
Arvind Yadav
2016-12-06 07:42:51 UTC
Permalink
Free memory mapping, if fimc_is_probe is not successful.

Signed-off-by: Arvind Yadav <***@gmail.com>
---
drivers/media/platform/exynos4-is/fimc-is.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/exynos4-is/fimc-is.c b/drivers/media/platform/exynos4-is/fimc-is.c
index 32ca55f..10d98a5 100644
--- a/drivers/media/platform/exynos4-is/fimc-is.c
+++ b/drivers/media/platform/exynos4-is/fimc-is.c
@@ -818,12 +818,13 @@ static int fimc_is_probe(struct platform_device *pdev)
is->irq = irq_of_parse_and_map(dev->of_node, 0);
if (!is->irq) {
dev_err(dev, "no irq found\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto err_iounmap;
}

ret = fimc_is_get_clocks(is);
if (ret < 0)
- return ret;
+ goto err_iounmap;

platform_set_drvdata(pdev, is);

@@ -877,6 +878,8 @@ err_irq:
free_irq(is->irq, is);
err_clk:
fimc_is_put_clocks(is);
+err_iounmap:
+ iounmap(is->pmu_regs);
return ret;
}

@@ -932,6 +935,7 @@ static int fimc_is_remove(struct platform_device *pdev)
fimc_is_unregister_subdevs(is);
vb2_dma_contig_clear_max_seg_size(dev);
fimc_is_put_clocks(is);
+ iounmap(is->pmu_regs);
fimc_is_debugfs_remove(is);
release_firmware(is->fw.f_w);
fimc_is_free_cpu_memory(is);
--
1.7.9.5
Zhangfei Gao
2016-12-27 14:22:40 UTC
Permalink
Some platforms like hi3660 need do reset first to allow accessing registers

Signed-off-by: Zhangfei Gao <***@linaro.org>
Reviewed-by: Andy Shevchenko <***@linux.intel.com>
Tested-by: Ramiro Oliveira <***@synopsys.com>
---
rebase to 4.10-rc1

drivers/i2c/busses/i2c-designware-core.h | 1 +
drivers/i2c/busses/i2c-designware-platdrv.c | 28 ++++++++++++++++++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/i2c/busses/i2c-designware-core.h b/drivers/i2c/busses/i2c-designware-core.h
index 26250b4..302807c 100644
--- a/drivers/i2c/busses/i2c-designware-core.h
+++ b/drivers/i2c/busses/i2c-designware-core.h
@@ -88,6 +88,7 @@ struct dw_i2c_dev {
void __iomem *base;
struct completion cmd_complete;
struct clk *clk;
+ struct reset_control *rst;
u32 (*get_clk_rate_khz) (struct dw_i2c_dev *dev);
struct dw_pci_controller *controller;
int cmd_err;
diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
index 6ce4313..79c4b4e 100644
--- a/drivers/i2c/busses/i2c-designware-platdrv.c
+++ b/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -38,6 +38,7 @@
#include <linux/pm_runtime.h>
#include <linux/property.h>
#include <linux/io.h>
+#include <linux/reset.h>
#include <linux/slab.h>
#include <linux/acpi.h>
#include <linux/platform_data/i2c-designware.h>
@@ -199,6 +200,14 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
dev->irq = irq;
platform_set_drvdata(pdev, dev);

+ dev->rst = devm_reset_control_get_optional_exclusive(&pdev->dev, NULL);
+ if (IS_ERR(dev->rst)) {
+ if (PTR_ERR(dev->rst) == -EPROBE_DEFER)
+ return -EPROBE_DEFER;
+ } else {
+ reset_control_deassert(dev->rst);
+ }
+
if (pdata) {
dev->clk_freq = pdata->i2c_scl_freq;
} else {
@@ -235,12 +244,13 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
&& dev->clk_freq != 1000000 && dev->clk_freq != 3400000) {
dev_err(&pdev->dev,
"Only 100kHz, 400kHz, 1MHz and 3.4MHz supported");
- return -EINVAL;
+ r = -EINVAL;
+ goto exit_reset;
}

r = i2c_dw_eval_lock_support(dev);
if (r)
- return r;
+ goto exit_reset;

dev->functionality = I2C_FUNC_10BIT_ADDR | DW_IC_DEFAULT_FUNCTIONALITY;

@@ -286,10 +296,18 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
}

r = i2c_dw_probe(dev);
- if (r && !dev->pm_runtime_disabled)
- pm_runtime_disable(&pdev->dev);
+ if (r)
+ goto exit_probe;

return r;
+
+exit_probe:
+ if (!dev->pm_runtime_disabled)
+ pm_runtime_disable(&pdev->dev);
+exit_reset:
+ if (!IS_ERR_OR_NULL(dev->rst))
+ reset_control_assert(dev->rst);
+ return r;
}

static int dw_i2c_plat_remove(struct platform_device *pdev)
@@ -306,6 +324,8 @@ static int dw_i2c_plat_remove(struct platform_device *pdev)
pm_runtime_put_sync(&pdev->dev);
if (!dev->pm_runtime_disabled)
pm_runtime_disable(&pdev->dev);
+ if (!IS_ERR_OR_NULL(dev->rst))
+ reset_control_assert(dev->rst);

return 0;
}
--
2.7.4
Jarkko Nikula
2016-12-27 14:41:15 UTC
Permalink
Post by Zhangfei Gao
Some platforms like hi3660 need do reset first to allow accessing registers
---
rebase to 4.10-rc1
drivers/i2c/busses/i2c-designware-core.h | 1 +
drivers/i2c/busses/i2c-designware-platdrv.c | 28 ++++++++++++++++++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)
Acked-by: Jarkko Nikula <***@linux.intel.com>
Wolfram Sang
2017-01-12 18:54:45 UTC
Permalink
Post by Zhangfei Gao
Some platforms like hi3660 need do reset first to allow accessing registers
Applied to for-next, thanks!
Ramiro Oliveira
2017-03-06 15:11:48 UTC
Permalink
Hi Wolfram
Post by Wolfram Sang
Post by Zhangfei Gao
Some platforms like hi3660 need do reset first to allow accessing registers
Applied to for-next, thanks!
I haven't been able to find this patch anywhere. Do you still have plans to
apply it to for-next?
--
Best Regards

Ramiro Oliveira
***@synopsys.com
Wolfram Sang
2017-03-09 14:43:40 UTC
Permalink
Post by Ramiro Oliveira
Hi Wolfram
Post by Wolfram Sang
Post by Zhangfei Gao
Some platforms like hi3660 need do reset first to allow accessing registers
Applied to for-next, thanks!
I haven't been able to find this patch anywhere. Do you still have plans to
apply it to for-next?
Uh, thanks for the heads up. Can't really tell why it fell through the
cracks :( It probably got lost by an improper rebase by me? Sorry about
that.

Applied to for-current, thanks again!

Loading...