Help needed: PVE NIC passthrough problem

    CRUD · 2022-12-22 15:39:30 +08:00 · 3208 views

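    The motherboard is a Jingyue (精粤) B660I with one 8125 2.5G port and one 8111 gigabit port. After configuring passthrough, I added the 8111 as a PCI device to the OpenWrt VM through the web UI. The OpenWrt VM then fails on startup with: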

    kvm: ../hw/pci/pci.c:1562: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
    TASK ERROR: start failed: QEMU exited with code 1
    

    pve syslog:

    Dec 22 15:15:13 aio pvedaemon[1399]: <root@pam> starting task UPID:aio:00006CA9:0005885C:63A40401:qmstart:100:root@pam:
    Dec 22 15:15:13 aio pvedaemon[27817]: start VM 100: UPID:aio:00006CA9:0005885C:63A40401:qmstart:100:root@pam:
    Dec 22 15:15:13 aio kernel: vfio-pci 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
    Dec 22 15:15:13 aio kernel: vfio-pci 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
    Dec 22 15:15:14 aio systemd[1]: Started 100.scope.
    Dec 22 15:15:14 aio systemd-udevd[27834]: Using default interface naming scheme 'v247'.
    Dec 22 15:15:14 aio systemd-udevd[27834]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
    Dec 22 15:15:15 aio kernel: device tap100i0 entered promiscuous mode
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
    Dec 22 15:15:15 aio kernel: vmbr0: port 2(fwpr100p0) entered disabled state
    Dec 22 15:15:15 aio kernel: device fwln100i0 left promiscuous mode
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
    Dec 22 15:15:15 aio kernel: device fwpr100p0 left promiscuous mode
    Dec 22 15:15:15 aio kernel: vmbr0: port 2(fwpr100p0) entered disabled state
    Dec 22 15:15:15 aio systemd-udevd[27834]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
    Dec 22 15:15:15 aio systemd-udevd[27834]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
    Dec 22 15:15:15 aio systemd-udevd[27837]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
    Dec 22 15:15:15 aio systemd-udevd[27837]: Using default interface naming scheme 'v247'.
    Dec 22 15:15:15 aio kernel: vmbr0: port 2(fwpr100p0) entered blocking state
    Dec 22 15:15:15 aio kernel: vmbr0: port 2(fwpr100p0) entered disabled state
    Dec 22 15:15:15 aio kernel: device fwpr100p0 entered promiscuous mode
    Dec 22 15:15:15 aio kernel: vmbr0: port 2(fwpr100p0) entered blocking state
    Dec 22 15:15:15 aio kernel: vmbr0: port 2(fwpr100p0) entered forwarding state
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 1(fwln100i0) entered blocking state
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
    Dec 22 15:15:15 aio kernel: device fwln100i0 entered promiscuous mode
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 1(fwln100i0) entered blocking state
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 1(fwln100i0) entered forwarding state
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 2(tap100i0) entered blocking state
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 2(tap100i0) entered disabled state
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 2(tap100i0) entered blocking state
    Dec 22 15:15:15 aio kernel: fwbr100i0: port 2(tap100i0) entered forwarding state
    Dec 22 15:15:15 aio kernel: vfio-pci 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
    Dec 22 15:15:15 aio kernel: vfio-pci 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
    Dec 22 15:15:15 aio kernel: vfio-pci 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: vfio_cap_init: hiding cap 0xff@0xff
    Dec 22 15:15:16 aio kernel: fwbr100i0: port 2(tap100i0) entered disabled state
    Dec 22 15:15:16 aio kernel: fwbr100i0: port 2(tap100i0) entered disabled state
    Dec 22 15:15:16 aio pvedaemon[1400]: VM 100 qmp command failed - VM 100 not running
    Dec 22 15:15:16 aio pvestatd[1370]: VM 100 qmp command failed - VM 100 not running
    Dec 22 15:15:16 aio kernel: vfio-pci 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
    Dec 22 15:15:16 aio pvedaemon[27817]: start failed: QEMU exited with code 1
    Dec 22 15:15:16 aio pvedaemon[1399]: <root@pam> end task UPID:aio:00006CA9:0005885C:63A40401:qmstart:100:root@pam: start failed: QEMU exited with code 1
    


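    0000:04:00 is the gigabit port and 0000:05:00 is the wireless card. If I remove the gigabit port's PCI entry and keep only the wireless card, the VM starts normally.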
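
The syslog above is long, and most of it is one message repeated. A minimal sketch for condensing it follows. It assumes the log text arrives on stdin in the same format as above; `summarize_pci_log` is a hypothetical helper name, not a PVE tool.

```shell
# Sketch: count the distinct kernel messages logged by vfio-pci for one PCI device.
# Reads syslog-style text on stdin; prints "<count>x <message>", most frequent first.
# The parentheses run the function body in a subshell so its variables stay local.
summarize_pci_log() (
    bdf="$1"    # full address as it appears in the log, e.g. 0000:04:00.0
    grep -F "vfio-pci $bdf: " |
        sed "s/.*vfio-pci $bdf: //" |
        sort | uniq -c | sort -rn |
        awk '{ n = $1; $1 = ""; sub(/^ +/, ""); print n "x " $0 }'
)
```

On the host this could be fed from the kernel log of the current boot, for example `journalctl -k -b | summarize_pci_log 0000:04:00.0`. For the log above it would reduce the output to the two vfio-pci messages: the D3cold power-state failure and the `hiding cap 0xff@0xff` line.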

    grub:

    GRUB_DEFAULT=0
    GRUB_TIMEOUT=5
    GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream,multifunction"
    GRUB_CMDLINE_LINUX=""
    

    vm config:

    balloon: 512
    boot: order=scsi0
    cores: 4
    hostpci0: 0000:04:00
    hostpci1: 0000:05:00
    memory: 2048
    meta: creation-qemu=7.1.0,ctime=1671628207
    name: openwrt
    net0: virtio=6E:35:49:91:0D:46,bridge=vmbr0,firewall=1
    numa: 0
    onboot: 1
    ostype: l26
    scsi0: local-lvm:vm-100-disk-0,iothread=1,size=420M
    scsihw: virtio-scsi-single
    smbios1: uuid=a3c5ba9b-a5f7-4f32-b9e3-056f0901cc01
    sockets: 1
    vmgenid: 1aa8b1ee-37e5-4617-9f93-fd2c8fbf98ab
    
    16 replies · 2024-06-04 18:07:53 +08:00
    #1 WinkeyLin · 2022-12-22 15:47:19 +08:00 via Android
    Is this port being used by PVE as its management interface?

    #2 CRUD (OP) · 2022-12-22 15:48:18 +08:00
    @WinkeyLin No. PVE's own interface is on the 2.5G port; the one I want to pass through is the gigabit port.

    #3 onetown · 2022-12-22 16:04:47 +08:00
    Try adding intel_iommu=on iommu=pt to the boot parameters.
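
As a sketch, the suggestion would change the original poster's `/etc/default/grub` line as shown below. This assumes the host boots through GRUB, as the posted grub file suggests; `intel_iommu=on` is already present there, so only `iommu=pt` is new.

```shell
# /etc/default/grub (fragment): the original poster's line with iommu=pt added
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction"

# Then regenerate the boot configuration and reboot (as root on the host):
#   update-grub
# After the reboot, confirm that the running kernel picked up the parameters:
#   cat /proc/cmdline
```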
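
    #4 onetown · 2022-12-22 16:06:35 +08:00
    Also, if the NIC is not onboard, you could try a different PCI slot, though yours is probably onboard.

    This error is really KVM preventing your PCI device from accessing host memory directly. Mainly, if the device is not in an IOMMU group, it will crash.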
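
To check the IOMMU grouping mentioned here, a minimal sketch that lists which group each PCI device belongs to. It assumes the usual sysfs layout under `/sys/kernel/iommu_groups`; `list_iommu_groups` is a hypothetical helper name, and the optional argument exists only so the function can be pointed at a different directory.

```shell
# Sketch: print "group <n>: <pci address>" for every device in every IOMMU group.
# The parentheses run the function body in a subshell so its variables stay local.
list_iommu_groups() (
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing: IOMMU is probably off
        group="${dev#"$root"/}"        # strip the root prefix
        group="${group%%/*}"           # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
)
```

Running `list_iommu_groups | grep -E '0000:0[45]:00'` on the host would show the groups of the two devices in this thread. No output at all usually means the IOMMU is not enabled.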

    #5 CRUD (OP) · 2022-12-22 17:20:30 +08:00
    @onetown Still no luck, same error message, and the NIC is onboard. The clearest hint in the syslog is `vfio-pci 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible`, but I could not find any useful leads from it.
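
One way to follow up on the D3cold message is to read the device's power attributes from sysfs. The sketch below is read-only. It assumes the kernel exposes `power_state` (present on recent kernels) and `d3cold_allowed` under `/sys/bus/pci/devices/<address>/`; `pci_power_info` is a hypothetical helper name.

```shell
# Sketch: print the PCI power attributes of one device, or "unavailable" if absent.
# The parentheses run the function body in a subshell so its variables stay local.
pci_power_info() (
    bdf="$1"                                  # e.g. 0000:04:00.0
    root="${2:-/sys/bus/pci/devices}"
    for attr in power_state d3cold_allowed; do
        file="$root/$bdf/$attr"
        if [ -r "$file" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$file")"
        else
            printf '%s=unavailable\n' "$attr"
        fi
    done
)
```

Calling `pci_power_info 0000:04:00.0` before starting the VM would show whether the port is already in D3cold at that point. Whether changing `d3cold_allowed` helps on this board is not established anywhere in this thread.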
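
    #6 onetown · 2022-12-22 22:03:15 +08:00
    @CRUD Right, judging by the log it is a vfio driver problem. Unless you really need DPDK, consider disabling vfio and trying again.

    The kernel bound the vfio driver directly when loading drivers, and your NIC is probably not 100% vfio-compatible.

    Append this line to /etc/modprobe.d/blacklist.conf:
    blacklist vfio-pci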
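
Before or after changing the blacklist, it can help to see which kernel driver actually holds the device. A minimal sketch, assuming the standard sysfs `driver` symlink under `/sys/bus/pci/devices/<address>/`; `pci_bound_driver` is a hypothetical helper name.

```shell
# Sketch: print the name of the driver bound to a PCI device, or "none".
# The parentheses run the function body in a subshell so its variables stay local.
pci_bound_driver() (
    bdf="$1"                                  # e.g. 0000:04:00.0
    root="${2:-/sys/bus/pci/devices}"
    link="$root/$bdf/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
)
```

`pci_bound_driver 0000:04:00.0` would print `vfio-pci` if the claim above holds, or the name of the host's Realtek driver if the port is still owned by the host. `lspci -nnk -s 04:00.0` shows the same information in its "Kernel driver in use" line.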
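
    #7 waltcow · 2022-12-31 21:07:25 +08:00
    @CRUD Did you ever solve this? I have the same motherboard and hit the same problem passing a PCI NIC through to OpenWrt in a PVE VM.

    #8 CRUD (OP) · 2023-01-03 09:11:59 +08:00
    @waltcow No. I am not using passthrough for now; I could not get it working, so I am sticking with the virtualized NIC setup.

    #9 waltcow · 2023-01-03 09:34:06 +08:00
    @CRUD I tinkered a bit and ran OpenWrt in an LXC container. It seems to work without PCI passthrough.

    #10 CRUD (OP) · 2023-01-03 09:38:54 +08:00
    @waltcow Right, it works without passthrough, so that is what I am using for now. I would still prefer passthrough, but it is hard to get working.

    #11 moli777 · 2023-03-18 18:19:38 +08:00
    @CRUD I bought a 760i and have the same problem. Have you tried anything further?

    #12 CRUD (OP) · 2023-03-20 09:05:44 +08:00
    @moli777 No, I am not using passthrough at the moment.

    #13 moli777 · 2023-03-26 15:22:57 +08:00
    @CRUD OK, I gave up too. A plain virtual bridge is less hassle 😂

    #14 CRUD (OP) · 2023-03-27 11:10:11 +08:00
    @moli777 Yes. I am nowhere near the bandwidth bottleneck, so less hassle wins.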
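
    #15 heider · 2023-11-23 10:51:32 +08:00
    I hit the same problem on a 770i.

    See this article:
    https://docs.opennebula.io/6.4/open_cluster_deployment/kvm_node/pci_passthrough.html

    The section on VFIO device binding looks like it might solve the passthrough problem. I will try it when I have time.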
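
One common way to bind a device to vfio-pci at boot is by its vendor:device ID through a modprobe option. This is not necessarily the method the linked page describes, and nobody in this thread reports that it fixes the problem. The sketch below only generates the option line from sysfs; `vfio_ids_line` is a hypothetical helper name.

```shell
# Sketch: build an "options vfio-pci ids=<vendor>:<device>" line for one PCI device.
# The parentheses run the function body in a subshell so its variables stay local.
vfio_ids_line() (
    bdf="$1"                                  # e.g. 0000:04:00.0
    root="${2:-/sys/bus/pci/devices}"
    vendor="$(cat "$root/$bdf/vendor")" || exit 1   # sysfs stores e.g. 0x1234
    device="$(cat "$root/$bdf/device")" || exit 1
    # modprobe expects the IDs without the 0x prefix
    printf 'options vfio-pci ids=%s:%s\n' "${vendor#0x}" "${device#0x}"
)
```

The output of `vfio_ids_line 0000:04:00.0` could be written to a file such as `/etc/modprobe.d/vfio.conf`, followed by `update-initramfs -u -k all` and a reboot. Binding by ID claims every device that reports the same ID, so it is worth checking first that the 2.5G management port reports a different one.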

    #16 s1e42NxZVE484pwH · 178 days ago
    @heider #15 Did it work?