I'm running the latest version of the UPF on a VM with Mellanox VFs attached.
To be able to use them, I patched the Dockerfile of the UPF to build DPDK with Mellanox support:
diff --git a/Dockerfile b/Dockerfile
index 6913931..90839f0 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -14,6 +14,7 @@ RUN apt-get update && apt-get install -y \
--no-install-recommends \
git \
ca-certificates \
+ wget \
libbpf0 \
libelf-dev && \
apt-get clean && \
@@ -21,10 +22,24 @@ RUN apt-get update && apt-get install -y \
# BESS pre-reqs
WORKDIR /bess
+COPY mellanox_fix.patch /tmp/mellanox_fix.patch
RUN git clone https://github.com/omec-project/bess.git . && \
git checkout ${BESS_COMMIT} && \
+ mv /tmp/mellanox_fix.patch . && git apply mellanox_fix.patch && \
cp -a protobuf /protobuf
+# Install Mellanox user-space libraries for support in DPDK
+ARG MLNX_EN_VERSION=23.10-2.1.3.1
+RUN \
+ cd /tmp && \
+ wget "https://www.mellanox.com/downloads/ofed/MLNX_EN-${MLNX_EN_VERSION}/mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64.tgz" && \
+ tar zxf mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64.tgz && \
+ rm mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64.tgz && \
+ cd mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64 && \
+ echo y | ./install --dpdk --user-space-only --without-fw-update && \
+ cd .. && \
+ rm -rf mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64
+
# Build DPDK
RUN ./build.py dpdk
Since DPDK with Mellanox doesn't require unbinding the card from the normal driver, I moved the network interfaces into the container like this:
sudo ip link set enp6s16np0 netns pause
sudo ip link set enp6s17np0 netns pause
Using the 59c79ca commit of PR #39 everything works, but with commit af34576 it stopped working. The problem is this check.
I'm not sure why, but since rte_eth_dev_socket_id checks the NUMA zone of the interface, I think it may be related to the fact that it's running in a Docker container and BESS reports -1 as the NUMA zone:
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Probe PCI driver: mlx5_pci (15b3:101e) device: 0000:06:10.0 (socket -1)
EAL: Probe PCI driver: mlx5_pci (15b3:101e) device: 0000:06:11.0 (socket -1)
EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:12.0 (socket -1)
eth_virtio_pci_init(): Failed to init PCI device
EAL: Requested device 0000:06:12.0 cannot be used
TELEMETRY: No legacy callbacks, legacy socket not created
Segment 0-0: IOVA:0x380000000, len:1073741824, virt:0x140000000, socket_id:0, hugepage_sz:1073741824, nchannel:0, nrank:0 fd:10
Segment 0-2: IOVA:0x400000000, len:1073741824, virt:0x1c0000000, socket_id:0, hugepage_sz:1073741824, nchannel:0, nrank:0 fd:11
I0311 11:30:57.800395 1 packet_pool.cc:53] Creating DpdkPacketPool for 262144 packets on node 0
I0311 11:30:57.800410 1 packet_pool.cc:74] PacketPool0 requests for 262144 packets
I0311 11:30:57.812450 1 packet_pool.cc:161] PacketPool0 has been created with 262144 packets
I0311 11:30:57.812634 1 pmd.cc:74] 2 DPDK PMD ports have been recognized:
I0311 11:30:57.812654 1 pmd.cc:98] DPDK port_id 0 (mlx5_pci) RXQ 1024 TXQ 1024 94:6d:ae:00:00:00 00000000:06:10.00 15b3:101e numa_node -1
I0311 11:30:57.812659 1 pmd.cc:98] DPDK port_id 1 (mlx5_pci) RXQ 1024 TXQ 1024 94:6d:ae:00:00:01 00000000:06:11.00 15b3:101e numa_node -1
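For anyone trying to reproduce this, the NUMA node the kernel itself reports for each VF can be cross-checked against the -1 in the log above. This is only a minimal sketch: the helper name pci_numa_node and the fallback to node 0 are my own assumptions for illustration, not something BESS or DPDK does. It reads the standard sysfs numa_node attribute, which is -1 when the platform provides no NUMA affinity for the device.

```python
from pathlib import Path


def pci_numa_node(pci_addr: str,
                  sysfs_root: str = "/sys/bus/pci/devices",
                  default: int = 0) -> int:
    """Return the NUMA node sysfs reports for a PCI device.

    Hypothetical helper: falls back to `default` when the attribute is
    missing, unreadable, or negative (-1 means "no NUMA affinity known").
    """
    node_file = Path(sysfs_root) / pci_addr / "numa_node"
    try:
        node = int(node_file.read_text().strip())
    except (OSError, ValueError):
        # Device not present or attribute not parseable.
        return default
    return node if node >= 0 else default


if __name__ == "__main__":
    # PCI addresses taken from the EAL probe lines in the log.
    for addr in ("0000:06:10.0", "0000:06:11.0"):
        print(addr, pci_numa_node(addr))
```

If the raw sysfs value is also -1 inside the VM, the -1 comes from the platform rather than from the container.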
The crash happens when the UPF start script runs the command docker exec bess ./bessctl run up4. The output is the following (I edited the Python script to also print the exception string):
*** Error: Unhandled exception in the configuration script (most recent call last)
File "/opt/bess/bessctl/conf/up4.bess", line 141, in <module>
p.init_port(idx, parser.mode)
File "/opt/bess/bessctl/conf/ports.py", line 244, in init_port
sys.exit()
SystemExit
Command failed: run up4
max_ip_defrag_flows value not set. Not installing IP4Defrag module.
ip_frag_with_eth_mtu value not set. Not installing IP4Frag module.
Can't parse unix socket paths for notify! Setting it to default values (/tmp/notifycp)
Can't parse unix socket paths for end marker! Setting it to default values (/tmp/pfcpport)
Setting up port access on worker ids [0, 1]
#####################
errno=19 (ENODEV: No such device), Port id 0 is not attached
#####################
Registered dpdk ports do not exist.
Am I doing something wrong, or is this a bug with Mellanox NICs?