The Mellanox OCP NIC shows uneven data transfer speeds.
This is my last attempt at getting help before I return this server. I bought a Tyan 1U model with a Mellanox 25Gb/s OCP NIC, part number MCX4411A-ACUN. We don't have any 25Gb/s clients; ours top out at 2.5Gb/s, and our switches each have two 10Gb/s SFP+ ports, normally used only for uplinks between switches, but here one port connects to the Mellanox NIC's SFP+ cage. The spec sheet says the card supports 25Gb/s, 10Gb/s, and 1Gb/s, and both the switch and Windows report a successful 10Gb/s link.

I installed the latest drivers from Nvidia and ran tests. Files sent to the server moved at normal rates (about 285MB/s), but transfers from the server were much slower, averaging around 85MB/s. After confirming storage speeds were fine, I used iperf3 to check raw network performance; it showed the same asymmetry between sending and receiving. I shared two test runs, one with the server as the target and one with it as the client. I also tried swapping cables and switches, and even got a replacement Mellanox card, but nothing changed. Any advice would be appreciated.
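For reference, the iperf3 runs were along these lines (the IP addresses are placeholders for the server and client, not the actual ones used):

```shell
# On the server (the Windows box with the Mellanox OCP NIC):
iperf3 -s

# From a client, testing client -> server (this direction was fast):
iperf3 -c 192.168.1.10

# Then swapping roles so the server runs as the iperf3 client,
# testing server -> client (this direction was slow):
iperf3 -c 192.168.1.20
```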
The client requirements and NIC specifications are covered above. Interesting note: using -R flips the test direction from the client side, so nothing needs to change on the server. This problem persists in newer Windows Server releases. I've also seen network speed problems on Win10 that Win11 resolves, likely driver conflicts, and similar issues on Linux with 6.x kernels, though that's a long story.
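For anyone following along, -R reverses the transfer direction without touching the server, which just keeps running `iperf3 -s` (the IP is a placeholder):

```shell
# Normal run: the client transmits to the server
iperf3 -c 192.168.1.10

# Reverse run: the server transmits to the client,
# with no server-side changes needed
iperf3 -c 192.168.1.10 -R
```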
There are no unmanaged switches in the path. I've tested every flow-control combination between the server and client on both devices. At this point the issue likely lies with the NIC or its driver, since a separate NIC in a PCIe slot works without problems.
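On Windows, the flow-control combinations can be cycled from PowerShell rather than the adapter property dialog. A sketch of what I mean; the adapter name "Ethernet 2" and the exact display values are examples and vary by driver:

```shell
# PowerShell: show the current flow-control setting on the adapter
Get-NetAdapterAdvancedProperty -Name "Ethernet 2" -DisplayName "*Flow Control*"

# Cycle through the combinations to test each one
# (valid DisplayValue strings depend on the driver)
Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -DisplayName "Flow Control" -DisplayValue "Disabled"
Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -DisplayName "Flow Control" -DisplayValue "Rx & Tx Enabled"
```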
I don't have any other 10Gb/s links. I could buy a crossover cable and move the PCIe NIC to another machine, but I'm not sure what the results would tell me. The 10Gb port on the switch works fine with the other 10Gb NIC.
Crossover cables haven't been required since the 10/100 era; auto-MDI-X is handled by the hardware now, so a direct connection takes the switch out of the picture entirely. I suspect the problem is either flow control misbehaving on a 25G-capable port negotiated down to 10G, or a Mellanox driver issue; both are common. Personally, I've had to replace three of the four Mellanox NICs I've owned... I'm not a fan. The ixgbe driver problems with Intel 10G NICs weren't great either, but at least those cards generally worked.
Absolutely. Upgrading my Linux server to 10Gbit required turning on Flow Control before it would play nicely with slower Windows clients; before that I was limited to around 300Mbit to Gigabit clients. Even now I still see cases where the Windows transmit speed lags behind its receive speed. The current Windows network stack seems to behave unpredictably, sometimes changing between updates or even across different machines with the same NIC.
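On the Linux side, pause-frame (flow control) settings can be checked and toggled with ethtool; the interface name here is a placeholder:

```shell
# Show the current pause (flow control) settings for the interface
ethtool -a eth0

# Enable RX and TX pause frames
ethtool -A eth0 rx on tx on
```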
I hoped for a better outcome. Flow Control was set to Rx & Tx by default in the driver. Turning each off separately, or both off, didn't change transmit speeds from the server. The only thing that helped was using many parallel streams, something the other 10G NIC doesn't need.
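For completeness, the parallel-stream run that got closer to full speed looked like this (the IP and the stream count of 8 are just examples):

```shell
# A single stream from the server side was slow;
# -P opens multiple parallel streams, which saturated the link
iperf3 -c 192.168.1.20 -P 8
```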