Port id is not attached error with Mellanox NICs #53

Closed
AldericoGallo opened this issue Mar 11, 2024 · 2 comments · Fixed by #57

Comments

@AldericoGallo

I'm running the latest version of the UPF on a VM with Mellanox VFs attached.

To be able to use them, I patched the UPF Dockerfile to build DPDK with Mellanox support:

diff --git a/Dockerfile b/Dockerfile
index 6913931..90839f0 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -14,6 +14,7 @@ RUN apt-get update && apt-get install -y \
     --no-install-recommends \
     git \
     ca-certificates \
+    wget \
     libbpf0 \
     libelf-dev && \
     apt-get clean && \
@@ -21,10 +22,24 @@ RUN apt-get update && apt-get install -y \

 # BESS pre-reqs
 WORKDIR /bess
+COPY mellanox_fix.patch /tmp/mellanox_fix.patch
 RUN git clone https://github.com/omec-project/bess.git . && \
     git checkout ${BESS_COMMIT} && \
+    mv /tmp/mellanox_fix.patch . && git apply mellanox_fix.patch && \
     cp -a protobuf /protobuf

+# Install Mellanox user-space libraries for support in DPDK
+ARG MLNX_EN_VERSION=23.10-2.1.3.1
+RUN \
+        cd /tmp && \
+        wget "https://www.mellanox.com/downloads/ofed/MLNX_EN-${MLNX_EN_VERSION}/mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64.tgz" && \
+        tar zxf mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64.tgz && \
+        rm mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64.tgz && \
+        cd mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64 && \
+        echo y | ./install --dpdk --user-space-only --without-fw-update && \
+        cd .. && \
+        rm -rf mlnx-en-${MLNX_EN_VERSION}-ubuntu20.04-x86_64
+
 # Build DPDK
 RUN ./build.py dpdk

Since DPDK with Mellanox doesn't require unbinding the card from its normal kernel driver, I moved the network interfaces into the container like this:

sudo ip link set enp6s16np0 netns pause
sudo ip link set enp6s17np0 netns pause

With commit 59c79ca of PR #39 everything works, but with commit af34576 it stopped working. The problem is this check.

I'm not sure why, but since that rte_eth_dev_socket_id check looks at the NUMA node of the interface, I think it may be related to the fact that it's running in a Docker container and BESS reports -1 as the NUMA node:

EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Probe PCI driver: mlx5_pci (15b3:101e) device: 0000:06:10.0 (socket -1)
EAL: Probe PCI driver: mlx5_pci (15b3:101e) device: 0000:06:11.0 (socket -1)
EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:12.0 (socket -1)
eth_virtio_pci_init(): Failed to init PCI device
EAL: Requested device 0000:06:12.0 cannot be used
TELEMETRY: No legacy callbacks, legacy socket not created
Segment 0-0: IOVA:0x380000000, len:1073741824, virt:0x140000000, socket_id:0, hugepage_sz:1073741824, nchannel:0, nrank:0 fd:10
Segment 0-2: IOVA:0x400000000, len:1073741824, virt:0x1c0000000, socket_id:0, hugepage_sz:1073741824, nchannel:0, nrank:0 fd:11
I0311 11:30:57.800395     1 packet_pool.cc:53] Creating DpdkPacketPool for 262144 packets on node 0
I0311 11:30:57.800410     1 packet_pool.cc:74] PacketPool0 requests for 262144 packets
I0311 11:30:57.812450     1 packet_pool.cc:161] PacketPool0 has been created with 262144 packets
I0311 11:30:57.812634     1 pmd.cc:74] 2 DPDK PMD ports have been recognized:
I0311 11:30:57.812654     1 pmd.cc:98] DPDK port_id 0 (mlx5_pci)   RXQ 1024 TXQ 1024  94:6d:ae:00:00:00  00000000:06:10.00 15b3:101e   numa_node -1
I0311 11:30:57.812659     1 pmd.cc:98] DPDK port_id 1 (mlx5_pci)   RXQ 1024 TXQ 1024  94:6d:ae:00:00:01  00000000:06:11.00 15b3:101e   numa_node -1
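
The -1 in the log above can be cross-checked against what the kernel itself reports. The following is a minimal Python sketch, assuming a Linux host where each PCI device exposes a `numa_node` attribute under `/sys/bus/pci/devices`. The helper names `read_numa_node` and `effective_numa_node` are hypothetical and not part of BESS or DPDK, and the fall-back-to-node-0 policy is only an illustration, not necessarily what the actual fix does.

```python
from pathlib import Path

# Standard Linux sysfs location for PCI devices.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def read_numa_node(pci_addr: str, sysfs_root: Path = SYSFS_PCI) -> int:
    """Return the kernel-reported NUMA node of a PCI device, or -1 if unknown."""
    node_file = sysfs_root / pci_addr / "numa_node"
    try:
        return int(node_file.read_text().strip())
    except (OSError, ValueError):
        # Missing device, unreadable file, or unexpected content.
        return -1


def effective_numa_node(reported: int, fallback: int = 0) -> int:
    """Hypothetical policy: treat a negative (unknown) node as `fallback`."""
    return fallback if reported < 0 else reported


if __name__ == "__main__":
    # PCI addresses of the two Mellanox VFs taken from the EAL log.
    for addr in ("0000:06:10.0", "0000:06:11.0"):
        reported = read_numa_node(addr)
        print(addr, "reported:", reported, "effective:", effective_numa_node(reported))
```

If the kernel also reports -1 for these devices, the value is most likely coming from the virtualized platform rather than from BESS, so a check that treats -1 as "port not attached" would reject ports that are otherwise usable.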

The crash happens when the UPF start script runs the command docker exec bess ./bessctl run up4. The output is the following (I edited the Python script to also print the exception string):

*** Error: Unhandled exception in the configuration script (most recent call last)
  File "/opt/bess/bessctl/conf/up4.bess", line 141, in <module>
    p.init_port(idx, parser.mode)
  File "/opt/bess/bessctl/conf/ports.py", line 244, in init_port
    sys.exit()
SystemExit
  Command failed: run up4
max_ip_defrag_flows value not set. Not installing IP4Defrag module.
ip_frag_with_eth_mtu value not set. Not installing IP4Frag module.
Can't parse unix socket paths for notify! Setting it to default values (/tmp/notifycp)
Can't parse unix socket paths for end marker! Setting it to default values (/tmp/pfcpport)
Setting up port access on worker ids [0, 1]
#####################
errno=19 (ENODEV: No such device), Port id 0 is not attached
#####################
Registered dpdk ports do not exist.
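
For readers unfamiliar with the `errno=19 (ENODEV: ...)` line in the output above, the sketch below shows how such a string can be produced from a numeric error code using only Python's standard library. The function name `describe_errno` is hypothetical; it is not the code in ports.py, only an illustration of how the number, symbolic name and message relate.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Format an errno value as 'errno=<n> (<NAME>: <message>)'."""
    # errno.errorcode maps numeric codes to symbolic names such as 'ENODEV'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's human-readable message for the code.
    return f"errno={code} ({name}: {os.strerror(code)})"


if __name__ == "__main__":
    print(describe_errno(errno.ENODEV))
```

The message text comes from the platform's C library, so the exact wording may differ between systems.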

Am I doing something wrong, or is this a bug with Mellanox NICs?

@gab-arrobo
Contributor

@AldericoGallo, PR #57 should resolve the issue. Please confirm. Thanks!

@AldericoGallo
Author

@gab-arrobo, I confirm that the issue is resolved. Thank you!
