AMD OpenNIC driver includes the Linux kernel driver

License: GNU General Public License v2.0

Makefile 0.18% C 88.34% C++ 11.48%
driver linux-kernel smartnic datacenter

open-nic-driver's Introduction

AMD OpenNIC Driver

This is one of the three components of the OpenNIC project. The other components are:

OpenNIC driver implements a Linux kernel driver for OpenNIC shell. It supports multiple PCI-e PFs with multiple TX/RX queues in each PF, and up to two 100Gbps ports on the same card. As of version 1.0, the driver has not implemented the ethtool routines to change the hash key and the indirection table.

The driver has been tested under Ubuntu 18.04, 20.04, and 22.04 with multiple versions of the Linux kernel.

Building the Driver

Follow the steps below to build the driver.

  1. Run make to compile the loadable kernel module onic.ko.
  2. Connect 100Gbps cables/loopback adapters to enabled ports before inserting the kernel module. Currently, the driver does not detect link status changes, so links should be ready before the driver is loaded.
  3. Run sudo insmod onic.ko to insert the kernel module. (There is an optional parameter RS_FEC_ENABLED, which can be set to either zero or one.)
  4. Verify that dmesg shows no error messages and that new devices show up in the ifconfig output.

The driver registers a net device for each PF it probes. Net devices are registered with multiple queues, and the number of queues depends on the number of MSI-X vectors available through the associated PF. For PF0, which acts as the master PF, the number of queues equals the number of MSI-X vectors minus 2: one vector is reserved for the card-level error interrupt and one for the function-level user interrupt. For other PFs, it equals the number of MSI-X vectors minus 1. Each net device has the same number of TX and RX queues.
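The queue-count rule above can be written as a small function. This is a minimal sketch of the arithmetic only; the helper name and the `ONIC_SKETCH_*` constants are hypothetical and are not taken from the driver source.

```c
#include <stdbool.h>

/* Vectors reserved on the master PF (PF0): one for the card-level error
 * interrupt and one for the function-level user interrupt. */
#define ONIC_SKETCH_MASTER_RESERVED 2
/* Vectors reserved on every other PF ("minus 1" in the text). */
#define ONIC_SKETCH_OTHER_RESERVED 1

/* Hypothetical helper: returns the number of TX queues (and, equally, RX
 * queues) for a PF, or 0 when there are not enough MSI-X vectors. */
int onic_sketch_num_queues(int num_msix_vectors, bool is_master_pf)
{
    int reserved = is_master_pf ? ONIC_SKETCH_MASTER_RESERVED
                                : ONIC_SKETCH_OTHER_RESERVED;

    if (num_msix_vectors <= reserved)
        return 0;
    return num_msix_vectors - reserved;
}
```

For example, with 10 MSI-X vectors this gives 8 queues on PF0 and 9 queues on any other PF.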

For each FPGA card loaded with the OpenNIC shell bitstream, the driver detects the number of CMAC instances and manages the links accordingly. Only PF0 can enable/disable the links. The default bitstream, when configured with 2 PFs and 2 CMAC instances, maps PF0 to port0 and PF1 to port1.

Testing the Driver

Loopback Test

A loopback register is accessible from BAR2 at offset 0x8090 for port0 and 0xC090 for port1. One can use pcimem to read and write PCI device registers.

To enable loopback, write 0x1 to the loopback register. For instance, to enable loopback on port0, issue the following command.

sudo ./pcimem /sys/devices/pci0000:d7/0000:d7:00.0/0000:d8:00.0/resource2 0x8090 w 0x1

After that, packets received by CMAC0 will be looped back to the host. Write 0x0 to disable loopback.
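For readers who prefer to toggle loopback from their own program rather than pcimem, the register write can be sketched in C. The function names below are hypothetical. The sketch assumes `bar2` already points at a mapping of BAR2 (for example one obtained with mmap() on the sysfs resource2 file, which is the approach pcimem takes), and it only covers the two offsets given above.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets of the loopback register inside BAR2, as given in the text. */
#define LOOPBACK_OFFSET_PORT0 0x8090u
#define LOOPBACK_OFFSET_PORT1 0xC090u

/* Returns the BAR2 offset of the loopback register, or 0 for an unknown port. */
uint32_t loopback_offset(int port)
{
    switch (port) {
    case 0: return LOOPBACK_OFFSET_PORT0;
    case 1: return LOOPBACK_OFFSET_PORT1;
    default: return 0;
    }
}

/* Writes 0x1 (enable) or 0x0 (disable) to the loopback register of `port`.
 * `bar2` must point to the start of a 4-byte-aligned mapping of BAR2 that is
 * at least `bar2_len` bytes long. Returns 0 on success, -1 on bad arguments. */
int set_loopback(volatile uint8_t *bar2, size_t bar2_len, int port, int enable)
{
    uint32_t off = loopback_offset(port);

    if (bar2 == NULL || off == 0 || (size_t)off + sizeof(uint32_t) > bar2_len)
        return -1;
    *(volatile uint32_t *)(bar2 + off) = enable ? 0x1u : 0x0u;
    return 0;
}
```

Calling `set_loopback(bar2, len, 0, 1)` corresponds to the pcimem command shown above, and passing 0 as the last argument disables loopback again.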

Here is a simple scenario to test the loopback mode. Assume the interface name is enp216s0f0 and the IP address is 192.168.1.10.

  1. Run tcpdump to capture packets on the interface.

     sudo tcpdump -i enp216s0f0 -xx
    
  2. Run ping 192.168.1.10.

  3. Observe that packets captured by tcpdump are always duplicated.

ETHTOOL Test

ethtool is a Linux utility used to control and read the status of various MAC parameters. It is also used to read counter registers (such as total good packets) from the MAC.

If the tool is not found, install it from the distribution's package repository.

$ sudo apt install ethtool

List all options that the tool supports.

$ ethtool -h

The examples below assume the interface name is xyz01 and the IP address is 192.168.1.1.

List the available interfaces and bring up the required one.

$ ifconfig -a

$ ifconfig xyz01 192.168.1.1 up

Show the interface status.

$ ethtool xyz01

Show driver information.

$ ethtool -i xyz01

Show adapter statistics.

$ ethtool -S xyz01

LM-SENSORS Test

To install the lm-sensors framework:

 $ sudo apt install lm-sensors

 This installs the 'sensors' application at /usr/bin/sensors.

To enable lm-sensors support in open-nic:

 a. In the file "onic_main.c", enable the macro "CMS_SUPPORT".

 b. Build the open-nic driver as explained above.

To test lm-sensors support in open-nic:

 Note: The CMS IP must be included in the design.

 a. Load the kernel driver as explained above.

 b. Run the sensors application to see the data.

    The output looks like the following:

    $ sensors

     sn1000-onic-isa-0000
     Adapter: ISA adapter
     12V PEX:         +12.22 V  (max = +12.22 V, avg = +12.21 V)
     12V AUX:         +12.26 V  (max = +12.26 V, avg = +12.25 V)
     3V3 PEX:          +3.26 V  (max =  +3.26 V, avg =  +3.26 V)
     1V8 TOP:          +1.80 V  (max =  +1.80 V, avg =  +1.80 V)
     VCC INT:          +0.85 V  (max =  +0.85 V, avg =  +0.85 V)
     VCC 3V3:          +3.27 V  (max =  +3.27 V, avg =  +3.27 V)
     PCB TOP FRONT:    +44.0°C  (highest = +45.0°C)
     PCB TOP REAR:     +47.0°C  (highest = +48.0°C)
     FPGA TEMP:        +58.0°C  (highest = +59.0°C)
     QSPF 0:            +0.0°C  (highest =  +0.0°C)
     QSPF 1:            +0.0°C  (highest =  +0.0°C)
     POWER:            38.66 W  (avg =  40.42 W)
     12V PEX Current:  +2.07 A  (max =  +2.15 A, avg =  +2.08 A)
     12V AUX Current:  +0.76 A  (max =  +0.81 A, avg =  +0.75 A)
     VCC INT Current: +10.00 A  (max = +10.60 A, avg = +10.00 A)
     3V3 PEX Current:  +1.24 A  (max =  +1.29 A, avg =  +1.24 A)

     ... ... ...
     sensor output for other devices
     ... ... ...

Known Issues

Static IP Address

It has been found that in some cases, DHCP clients may cause a kernel panic after the kernel module is inserted. A message similar to the one below shows up in dmesg.

[  224.835445] BUG: unable to handle kernel paging request at ffff9d7f45effa1f

Assigning a static network address seems to solve the issue in most cases. Add the following lines to /etc/network/interfaces with the correct interface name and IP address.

```
auto IF_NAME
iface IF_NAME inet static
      address IP_ADDRESS
```

An alternative is to uninstall DHCP. To do this, find any running DHCP client processes with ps -eF | grep dhclient, kill them, and then disable DHCP.

Machine locks up when installing kernel module

This seems to be related to the DHCP issue mentioned above in "Static IP Address". The recommendation is to disable DHCP.


Copyright Notice and Disclaimer

© Copyright 2020 – 2021 Xilinx, Inc. All rights reserved.

This file contains confidential and proprietary information of Xilinx, Inc. and is protected under U.S. and international copyright and other intellectual property laws.

DISCLAIMER

This disclaimer is not a license and does not grant any rights to the materials distributed herewith. Except as otherwise provided in a valid license issued to you by Xilinx, and to the maximum extent permitted by applicable law: (1) THESE MATERIALS ARE MADE AVAILABLE "AS IS" AND WITH ALL FAULTS, AND XILINX HEREBY DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under or in connection with these materials, including for any direct, or any indirect, special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same.

CRITICAL APPLICATIONS

Xilinx products are not designed or intended to be fail-safe, or for use in any application requiring failsafe performance, such as life-support or safety devices or systems, Class III medical devices, nuclear facilities, applications related to the deployment of airbags, or any other applications that could lead to death, personal injury, or severe property or environmental damage (individually and collectively, "Critical Applications"). Customer assumes the sole risk and liability of any use of Xilinx products in Critical Applications, subject only to applicable laws and regulations governing limitations on product liability.

THIS COPYRIGHT NOTICE AND DISCLAIMER MUST BE RETAINED AS PART OF THIS FILE AT ALL TIMES.

open-nic-driver's People

Contributors

alexbradd, aniltirli, changsu-kimm, cneely-amd, cyberang3l, dsorber, gbrebner, hyunok-kim, kenter, lmunch, marcinwoj, sattili-xlnx, yangji-xlnx, yanz-xlnx

open-nic-driver's Issues

Error running make

I get the following error when I try to run make.

make -C /lib/modules/4.15.0-166-generic/build M=/home/suranga/open-nic-driver modules
make[1]: Entering directory '/usr/src/linux-headers-4.15.0-166-generic'
CC [M] /home/suranga/open-nic-driver/onic_main.o
CC [M] /home/suranga/open-nic-driver/onic_sysfs.o
CC [M] /home/suranga/open-nic-driver/onic_ethtool.o
CC [M] /home/suranga/open-nic-driver/onic_hardware.o
CC [M] /home/suranga/open-nic-driver/onic_lib.o
CC [M] /home/suranga/open-nic-driver/onic_netdev.o
/home/suranga/open-nic-driver/onic_netdev.c: In function ‘onic_xmit_frame’:
/home/suranga/open-nic-driver/onic_netdev.c:731:79: error: missing binary operator before token "("
#elif defined(RHEL_RELEASE_CODE) && (RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(8, 1))
^
scripts/Makefile.build:333: recipe for target '/home/suranga/open-nic-driver/onic_netdev.o' failed
make[2]: *** [/home/suranga/open-nic-driver/onic_netdev.o] Error 1
Makefile:1590: recipe for target 'module/home/suranga/open-nic-driver' failed
make[1]: *** [module/home/suranga/open-nic-driver] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-4.15.0-166-generic'
Makefile:24: recipe for target 'all' failed
make: *** [all] Error 2

I am using Ubuntu 18.04. The kernel version is: 4.15.0-166.

How to fix this? Any help would be greatly appreciated.

-Suranga
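One plausible reading of the error above: on a non-RHEL kernel, RHEL_RELEASE_VERSION is not defined, yet the preprocessor still has to parse the whole #elif expression. An undefined identifier in a preprocessor expression evaluates to 0, so `RHEL_RELEASE_VERSION(8, 1)` turns into `0(8, 1)`, which is not a valid expression. A common way to keep such a check portable is to supply fallback definitions first. The sketch below illustrates that pattern in isolation. The `ONIC_SKETCH_*` names are hypothetical, the fallback formula is an assumption about the shape RHEL kernel headers use, and this is not presented as the fix applied in the driver.

```c
/* On non-RHEL kernels neither macro exists. Provide harmless fallbacks so
 * that the #if expression below stays syntactically valid everywhere. */
#ifndef RHEL_RELEASE_VERSION
#define RHEL_RELEASE_VERSION(a, b) (((a) << 8) + (b))
#endif
#ifndef RHEL_RELEASE_CODE
#define RHEL_RELEASE_CODE 0
#endif

/* Hypothetical feature macro, named here for illustration only. */
#if RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(8, 1)
#define ONIC_SKETCH_RHEL_8_1_OR_LATER 1
#else
#define ONIC_SKETCH_RHEL_8_1_OR_LATER 0
#endif

/* Returns 1 when built against RHEL 8.1 or later headers, otherwise 0. */
int onic_sketch_is_rhel_8_1_or_later(void)
{
    return ONIC_SKETCH_RHEL_8_1_OR_LATER;
}
```

An equivalent alternative is to nest the version comparison inside a separate `#if defined(RHEL_RELEASE_CODE)` block, so the function-like macro is only expanded where it exists.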

open nic driver on Alveo U280, rwaxi ioctl: Operation not supported error

Hi all,
I installed the reference_nic_au280.bit on the Alveo U280 board.
The PC is Ubuntu 22.04, vivado version 2020
The driver is loaded, and two interfaces (onic4s0f0, onic4s0f1) showed up.

When I tried ./rdaxi -a 0x4 (read input arbiter ID), it returned rwaxi ioctl: Operation not supported error.
Some parts of the code:
fd = socket(AF_INET6, SOCK_DGRAM, 0);
....
sifr.addr = addr;
if ((flags & HAVE_VALUE) != 0) {
	sifr.val = value;
	req = NFDP_IOCTL_CMD_WRITE_REG;
}

memset(&ifr, 0, sizeof(ifr));

memcpy(ifr.ifr_name, ifnam, ifnamlen);
ifr.ifr_name[ifnamlen] = '\0';
ifr.ifr_data = (char *)&sifr;

rc = ioctl(fd, req, &ifr);

if (rc == -1)
{
	err(1, "ioctl-");
}

Does anyone know about this issue? Thanks.

open-nic-driver on CentOS?

Hi,

I am new to open-nic, so apologies if this question has already been asked and answered. I tried to build open-nic-driver on a couple of servers running CentOS and all builds failed (please see the log below), but the build works on a server running Ubuntu. Does this driver work on a CentOS server?

Thanks very much for your help!

make
make -C /lib/modules/3.10.0-1160.42.2.el7.x86_64/build M=/home/neil/U55nPlatform/open-nic-driver modules
make[1]: Entering directory `/usr/src/kernels/3.10.0-1160.42.2.el7.x86_64'
CC [M] /home/neil/U55nPlatform/open-nic-driver/onic_main.o
/home/neil/U55nPlatform/open-nic-driver/onic_main.c:132:2: error: unknown field ‘ndo_change_mtu’ specified in initializer
.ndo_change_mtu = onic_change_mtu,
^
/home/neil/U55nPlatform/open-nic-driver/onic_main.c:132:2: error: initialization from incompatible pointer type [-Werror]
/home/neil/U55nPlatform/open-nic-driver/onic_main.c:132:2: error: (near initialization for ‘onic_netdev_ops.ndo_set_config’
cc1: all warnings being treated as errors
make[2]: *** [/home/neil/U55nPlatform/open-nic-driver/onic_main.o] Error 1
make[1]: *** [_module_/home/neil/U55nPlatform/open-nic-driver] Error 2
make[1]: Leaving directory `/usr/src/kernels/3.10.0-1160.42.2.el7.x86_64'
make: *** [all] Error 2

Support for aarch64

Hi. Unlike other boards supported by OpenNIC, the SN1000 and U45N have a built-in ARM core. Can the OpenNIC driver be compiled for aarch64 and used on the built-in ARM core of the SN1000 and U45N?

Installing driver on Ubuntu 20.04

Although the documentation says to install the driver on Ubuntu 18.04 with Linux kernel version 4.15.0, I want to install it on Ubuntu 20.04. (I don't want to install another Linux distribution on my machine.)
Unfortunately, I cannot ping another machine.
I have programmed my U280 with the open-nic-shell design, and this is the output of dmesg after I load onic.

[  487.260359] OpenNIC Linux Kernel Driver 0.21
[  487.260531] onic 0000:17:00.0: enabling device (0000 -> 0002)
[  487.260778] onic 0000:17:00.0 onic23s0f0 (uninitialized): Set MAC address to 00:0a:35:d2:35:5c
[  487.260780] onic 0000:17:00.0: device is a master PF
[  487.261176] onic 0000:17:00.0: Allocated 8 queue vectors
[  487.360488] onic 0000:17:00.0: Number of CMAC instances = 1
[  487.360530] onic 0000:17:00.0: Setup IRQ vector 609 with name onic23s0f0-0
[  487.360553] onic 0000:17:00.0: Setup IRQ vector 610 with name onic23s0f0-1
[  487.360574] onic 0000:17:00.0: Setup IRQ vector 611 with name onic23s0f0-2
[  487.360594] onic 0000:17:00.0: Setup IRQ vector 612 with name onic23s0f0-3
[  487.360615] onic 0000:17:00.0: Setup IRQ vector 613 with name onic23s0f0-4
[  487.360639] onic 0000:17:00.0: Setup IRQ vector 614 with name onic23s0f0-5
[  487.360658] onic 0000:17:00.0: Setup IRQ vector 615 with name onic23s0f0-6
[  487.360691] onic 0000:17:00.0: Setup IRQ vector 616 with name onic23s0f0-7
[  487.368783] onic 0000:17:00.0 ens81: renamed from onic23s0f0

Could you give me some hints on how to debug it?

URGENT: Existing support for linuxptp on open-nic driver

In our project, we have already synchronized the system clock and the shared and SmartNIC clocks in VMs using kvm_ptp on a virtual machine, as shown below. This uses the PTP hardware support on the NICs, both on the actual hardware and in the VMs, to synchronize the system and NIC clocks with PTP. It does so through ptp_daemon and the PCIe buses (/dev/ptpX, /dev/ptpY, etc.). This is possible with hardware devices that already have hardware support for PTP synchronization.
[Image: timing]

Now, we are using Alveo U280 accelerator cards to run a P4 program that controls the data plane. We are trying to get hardware timestamps from the accelerator cards that are PTP-synchronized as described above. The issue is that the U280, unlike an MPSoC, has no ARM core, so the software driver used to control PTP (https://github.com/Xilinx/linux-xlnx/blob/master/drivers/ptp/Kconfig) cannot be used directly, and we don't know how to control PTP from the OS. For the MPSoC, Xilinx developed a specific linux-ptp driver that can be deployed on the MPSoC's ARM core to control the PTP service on the hardware block (CMAC). I don't think we can directly use this driver to control PTP on the CMAC.

So, is there existing support for linuxptp in the open-nic driver? If not, is it possible to do that with the open-nic-shell or the CMAC driver? Is there any support for that in the driver? Can we control it from a Xilinx PTP library, a memory file, or the PCIe bus? How can we control the PTP services on the CMAC hardware IP from our x86-based Linux host through the PCIe bus?

Any sort of help or pointers would be really helpful. Thanks in advance.

insmod onic.ko hanging

insmod onic.ko is hanging. I see the following in the dmesg log.
It used to work fine earlier. Could this be caused by an interaction with an updated Linux kernel?

This is using the vanilla OpenNIC shell bitstream on a U280 FPGA.

dmesg log:

[  427.355719] OpenNIC Linux Kernel Driver 0.21
[  427.356254] onic 0000:86:00.0 onic134s0f0 (uninitialized): Set MAC address to 00:0a:35:11:d0:b0
[  427.356257] onic 0000:86:00.0: device is a master PF
[  427.356538] onic 0000:86:00.0: Allocated 8 queue vectors
[  427.854578] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
[  427.857391] IP: qdma_invalidate_fmap_ctxt+0x11/0x60 [onic]
[  427.860124] PGD 0 P4D 0
[  427.862786] Oops: 0002 [#1] SMP NOPTI
[  427.865416] Modules linked in: onic(OE+) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache ipmi_ssif intel_rapl skx_edac joydev x86_pkg_temp_thermal intel_powerclamp input_leds coretemp ftdi_sio kvm_intel usbserial kvm irqbypass xclmgmt(OE) intel_cstate xocl(OE) intel_rapl_perf fpga_mgr mei_me mei ioatdma lpc_ich shpchp acpi_power_meter ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_pad sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel usbhid hid pcbc aesni_intel
[  427.884024]  aes_x86_64 crypto_simd glue_helper cryptd ast i2c_algo_bit ttm drm_kms_helper ixgbe syscopyarea sysfillrect sysimgblt fb_sys_fops dca ptp drm pps_core mdio ahci libahci wmi [last unloaded: onic]
[  427.889334] CPU: 16 PID: 108 Comm: kworker/16:0 Tainted: G           OE    4.15.0-177-generic #186-Ubuntu
[  427.891975] Hardware name: Supermicro SYS-2029GP-TR/X11DPG-SN, BIOS 3.4 12/18/2020
[  427.894599] Workqueue: events work_for_cpu_fn
[  427.897189] RIP: 0010:qdma_invalidate_fmap_ctxt+0x11/0x60 [onic]
[  427.899749] RSP: 0018:ffffaedb0cc27d58 EFLAGS: 00010282
[  427.902266] RAX: 00000000fffffff0 RBX: 0000000000000000 RCX: 0000000000000000
[  427.904767] RDX: 0000000000000000 RSI: ffffaedb0f311000 RDI: 0000000000000000
[  427.907223] RBP: ffffaedb0cc27d68 R08: ffff9ef7e0718480 R09: ffffaedb0cc27be0
[  427.909656] R10: fffffffffffc0000 R11: ffffcedaffffffff R12: ffff9ef7dd81a8c0
[  427.912049] R13: ffff9ef7f0640000 R14: 0000000000000000 R15: 0000000000000000
[  427.914412] FS:  0000000000000000(0000) GS:ffff9ef7fee00000(0000) knlGS:0000000000000000
[  427.916754] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  427.919057] CR2: 000000000000000a CR3: 000000154d00a006 CR4: 00000000007606e0
[  427.921340] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  427.923578] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  427.925770] PKRU: 55555554
[  427.927913] Call Trace:
[  427.930022]  onic_init_hardware+0x116/0x900 [onic]
[  427.932103]  onic_probe+0x2b8/0x4f0 [onic]
[  427.934142]  local_pci_probe+0x47/0xa0
[  427.936135]  work_for_cpu_fn+0x1a/0x30
[  427.938083]  process_one_work+0x1de/0x420
[  427.939991]  worker_thread+0x228/0x410
[  427.941854]  kthread+0x121/0x140
[  427.943668]  ? process_one_work+0x420/0x420
[  427.945450]  ? kthread_create_worker_on_cpu+0x70/0x70
[  427.947201]  ret_from_fork+0x1f/0x40
[  427.948904] Code: 65 48 33 14 25 28 00 00 00 75 02 c9 c3 e8 e8 57 15 c4 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 31 c9 31 d2 48 89 e5 48 83 ec 10 <c7> 47 0a 00 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31
[  427.952366] RIP: qdma_invalidate_fmap_ctxt+0x11/0x60 [onic] RSP: ffffaedb0cc27d58
[  427.954050] CR2: 000000000000000a
[  427.955688] ---[ end trace 8a5a28677ac3f2ee ]---

Loopback test tcpdump empty

Hi,

I am currently trying to get the loopback test working, however tcpdump on the FPGA NIC iface is not outputting any packets. Here are the steps I followed:

  • Make the kernel driver
  • Connect port 0 on the NIC to a separate machine
  • Insert the kernel driver: insmod onic.ko
  • Verify interfaces appear: ifconfig -a (ens1f0 and ens1f1)
  • Reveal Xilinx devices: sudo lspci -PPd 10ee: (3a:00.0/3b:00.0 and 3a:00.0/3b:00.1)
  • Enable loopback on port 0: sudo ./pcimem /sys/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/resource2 0x8090 w 0x1
  • Assign IP to NIC port 0: ifconfig ens1f0 192.168.1.10 up
  • Run tcpdump to capture packets on the interface: tcpdump -i ens1f0 -xx
  • (from other machine) ping 192.168.1.10

Issue: tcpdump on the FPGA displays no traffic

However, tcpdump on a separate NIC attached to the same server as the FPGA does display traffic between the FPGA (192.168.1.10) and the separate machine (192.168.10.15). tcpdump of the interface on the separate machine displays the same.

Essentially, everything appears to be working, except tcpdump on the FPGA NIC interface.

Xilinx Alveo U280
OpenNIC shell w/ 2 CMAC ports
Ubuntu 20.04

Thanks!

Can the NIC in the U250 be directly connected to a NIC on another host, rather than through another FPGA's NIC?

Hi, we are testing the throughput of the NIC in U250. First, we connected two QSFP ports: one from the U250 and the other from a conventional NIC (40GB/s). Then, we used port 1 on the conventional NIC to send packets, while port 2 on the U250's NIC was used to test the throughput. However, we were unable to establish a successful connection between these two ports.

Considering this, is it possible to directly connect the NIC in the U250 to another NIC on a different host, instead of going through another FPGA's NIC?

If you have any comments, please feel free to discuss them with us. Thank you so much in advance.

Best regards~

[Bug report] OpenNIC not receiving after the first insmod

Hi,

I interconnected U250's two 100G ports and tried to test them by ping from one to another. However, the receiver cannot detect the sender's ARP request packet after the driver is installed "the first time". The packet can be seen in tcpdump -i enp94s0f0 but not tcpdump -i enp94s0f1.

This is what I did

$ sudo insmod onic.ko
$ ping 1.1.1.1 -I enp94s0f0

My Solution

When I remove the module and install it again, that works. (tcpdump can capture the packet on both sides)

$ sudo rmmod onic.ko
$ sudo insmod onic.ko

My two machines, each with a different U250, show the same behavior. Although it works after the second insmod, I still want to report it to find out whether this is a bug.

Thanks,
ChonLam

Driver getting stuck on load against u50

Hi,
I am trying to bring the driver up to do a loopback test on the U50. I followed the steps and, with the firmware loaded, tried to run insmod onic.ko.
The command does not return, and I see the following in dmesg:

[ 1450.565564] onic 0000:82:00.0: device is a master PF
[ 1450.565839] onic 0000:82:00.0: Allocated 8 queue vectors
[ 1451.063047] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 1451.063050] #PF: supervisor read access in kernel mode
[ 1451.063051] #PF: error_code(0x0000) - not-present page
[ 1451.063051] PGD 0 P4D 0
[ 1451.063053] Oops: 0000 [#1] SMP PTI
[ 1451.063056] CPU: 14 PID: 261 Comm: kworker/14:1 Tainted: G          OE    5.4.0-109-generic #123-Ubuntu
[ 1451.063057] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 2.0 12/17/2015
[ 1451.063061] Workqueue: events work_for_cpu_fn
[ 1451.063067] RIP: 0010:qdma_invalidate_fmap_ctxt+0x20/0x60 [onic]
[ 1451.063069] Code: 66 d0 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 31 c9 31 d2 48 89 e5 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 <0f> b7 47 08 c7 47 0a 00 00 00 00 48 8d 75 f4 25 ff 07 00 00 c1 e0
[ 1451.063070] RSP: 0018:ffffa43e8e2dfd38 EFLAGS: 00010246
[ 1451.063071] RAX: 0000000000000000 RBX: ffff8e1e244fc8c0 RCX: 0000000000000000
[ 1451.063072] RDX: 0000000000000000 RSI: ffffa43e8e901000 RDI: 0000000000000000
[ 1451.063072] RBP: ffffa43e8e2dfd48 R08: 0000034bebdb23f0 R09: 000000000000000e
[ 1451.063073] R10: ffffa43ea0c40000 R11: ffff8e1e245f68c0 R12: ffff8e0db3fcb000
[ 1451.063074] R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff8e0db3fcb000
[ 1451.063075] FS: 0000000000000000(0000) GS:ffff8e1e3f800000(0000) knlGS:0000000000000000
[ 1451.063076] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1451.063077] CR2: 0000000000000008 CR3: 0000002ffc60a001 CR4: 00000000003606e0
[ 1451.063078] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1451.063078] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1451.063079] Call Trace:
[ 1451.063083] onic_init_hardware+0x10d/0x820 [onic]
[ 1451.063086] onic_probe+0x22b/0x2c0 [onic]
[ 1451.063089] local_pci_probe+0x48/0x80
[ 1451.063093] ? __schedule+0x2eb/0x740
[ 1451.063095] work_for_cpu_fn+0x1a/0x30
[ 1451.063097] process_one_work+0x1eb/0x3b0
[ 1451.063099] worker_thread+0x21e/0x400
[ 1451.063100] kthread+0x104/0x140
[ 1451.063102] ? process_one_work+0x3b0/0x3b0
[ 1451.063103] ? kthread_park+0x90/0x90
[ 1451.063105] ret_from_fork+0x35/0x40

One of the MACs is not enabled in the dual-MAC case

if (!onic_rx_lane_aligned(hw, cmac_id)) {

If OpenNIC is configured with two ports and a cable connects the two ports together as a loopback test, MAC0 won't be enabled. The reason is that when MAC0 is being enabled, its RX lanes are not aligned because MAC1 is not enabled yet.

Likewise, if OpenNIC is connected to another NIC, whether the OpenNIC MAC is enabled depends on the remote NIC. That is a logical deadlock.

I would suggest removing the RX lane alignment check here.

Please refer to the QEP driver.

https://github.com/Xilinx/qep-drivers/blob/eba9eb7a6880a20e797303aa747f191170e98209/linux-kernel/driver/cmac/xcmac.c#L263
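The circular dependency described in this issue can be illustrated with a toy model. It rests on two simplifying assumptions: an endpoint starts transmitting only once it is enabled, and its RX lanes align only while the peer transmits. Everything below (`toy_link`, `toy_enable_port`, and so on) is hypothetical and models neither the CMAC nor `onic_rx_lane_aligned`.

```c
#include <stdbool.h>

#define NUM_PORTS 2

/* Toy model of two ports cabled back to back. */
struct toy_link {
    bool tx_enabled[NUM_PORTS];
};

/* In this model, a port's RX lanes align only while its peer transmits. */
bool toy_rx_lane_aligned(const struct toy_link *l, int port)
{
    return l->tx_enabled[1 - port];
}

/* Tries to enable `port`. With `require_alignment` set, the port refuses to
 * come up until its RX lanes are aligned. Returns true if the port is up. */
bool toy_enable_port(struct toy_link *l, int port, bool require_alignment)
{
    if (require_alignment && !toy_rx_lane_aligned(l, port))
        return false;
    l->tx_enabled[port] = true;
    return true;
}

/* Brings up port 0, then port 1, and returns how many ports came up. */
int toy_bring_up(bool require_alignment)
{
    struct toy_link l = { { false, false } };
    int up = 0;

    for (int port = 0; port < NUM_PORTS; port++)
        up += toy_enable_port(&l, port, require_alignment) ? 1 : 0;
    return up;
}
```

Under these assumptions, `toy_bring_up(true)` brings up no port because each side waits for the other, while `toy_bring_up(false)` brings up both.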

Ethtool showing link detected even with no cable connected

Hi,

I'm trying this on a U250, which has 2 QSFP+ ports. When I connect only one of the ports, both interfaces show link detected.

$ ethtool enp24s0f0
Settings for enp24s0f0:
Cannot get wake-on-lan settings: Operation not permitted
        Link detected: yes
$ ethtool enp24s0f1
Settings for enp24s0f1:
Cannot get wake-on-lan settings: Operation not permitted
        Link detected: yes

Is this expected or is there anything I did wrong?

Driver works alongside vfio

I was wondering whether it is possible in OpenNIC to use one of the U280 network ports with the open-nic-driver and the other one with DPDK through vfio.
I experimented a bit and limited the driver to use only the first port, but vfio couldn't find the second one.
Is there a workaround to use both at the same time?

eth_hw_addr_set

/data/open-nic/dpdk/open-nic-driver/onic_netdev.c: In function ‘onic_set_mac_address’:
/data/open-nic/dpdk/open-nic-driver/onic_netdev.c:755:2: error: implicit declaration of function ‘eth_hw_addr_set’; did you mean ‘eth_addr_dec’? [-Werror=implicit-function-declaration]
eth_hw_addr_set(dev, dev_addr);
^~~~~~~~~~~~~~~
eth_addr_dec

user-space MMAP not working on Debian 10

Hi!

I know that Debian is not supported yet...
Problem:
MMAP access to OpenNIC via sysfs does not work on Debian when the OpenNIC driver is loaded.
For example, the pcimem (https://github.com/billfarrow/pcimem) mmap fails.

Cause:
Debian kernels ship with the CONFIG_IO_STRICT_DEVMEM option set. This kernel setting prevents a user-space application from accessing device memory while a loaded driver is using it.

I have modified open-nic-driver by adding a char device, which allows user applications to access the BAR2 register space.
Note that this is a quick-and-dirty modification made because we needed it, but you could implement it officially if others are interested too.

Here is my fork:
https://github.com/alacamester/open-nic-driver

L.

Low throughput when using this driver

I tried to use a 100Gbps QSFP28 DAC cable to connect two U50s and ran an iperf speed test, but the result was only about 26Gbps.
Is this expected? What is the maximum speed that the driver can achieve?

Get access to TUSER signals of the user logic box

I am working on instantiating a VitisNetP4 IP in the user logic box. The IP outputs some custom user metadata along with the packets. I have noticed that I could connect it to the TUSER interfaces according to this.

How do I handle the TUSER signals of user logic boxes when using VitisNetP4 (formerly known as SDNet)?
Use an SDNet Tuple to propagate the TUSER signals across different engines.

My question is: how can I access them through the open-nic-driver?

Or should I connect the user metadata through a different interface?

Thanks!
