XenServer 7.0 FCoE on HP ProLiant Gen8

Started by Andrea Martra, 01 June 2016 - 09:58 AM
6 replies to this topic

Andrea Martra Members


Posted 01 June 2016 - 09:58 AM

Hello,

I installed XenServer 7.0 on an HP ProLiant BL660c Gen8 blade, which has six BCM57810 NetXtreme II 10 Gigabit Ethernet network interfaces.
FCoE support is active, but the connection is not negotiated properly with the Brocade switch to which two of the interfaces are connected.
The state of the XenServer interfaces is:

[root@xenblade-04 fcoe]# fcoeadm -i
Description: NetXtreme II BCM57810 10 Gigabit Ethernet
Revision: 10
Manufacturer: Broadcom Corporation
Serial Number: 9CB6548B0050
Driver: bnx2x 1.713.04
Number of Ports: 1

Symbolic Name: bnx2fc (QLogic BCM57810) v2.10.2 over eth2
OS Device Name: host2
Node Name: 0x50060B0000C266D1
Port Name: 0x50060B0000C266D0
FabricName: 0x100000053325F803
Speed: Unknown
Supported Speed: Unknown
MaxFrameSize: 2048
FC-ID (Port ID): 0x01F0DD
State: Offline

Symbolic Name: bnx2fc (QLogic BCM57810) v2.10.2 over eth3
OS Device Name: host3
Node Name: 0x50060B0000C266D3
Port Name: 0x50060B0000C266D2
FabricName: 0x100000053325D201
Speed: Unknown
Supported Speed: Unknown
MaxFrameSize: 2048
FC-ID (Port ID): 0x02F0DB
State: Offline

The switch reports this error: (image attached).

What can I do to fix it?

Thanks

 



Javier Ayllon Members


Posted 08 September 2016 - 11:23 AM

Hi Andrea, I'm facing the same issue. I'm connecting a FlexFabric 630FLB adapter and the negotiation doesn't complete.

 

This is my output of fcoeadm -i:

    Description:      BCM57840 NetXtreme II 10/20-Gigabit Ethernet
    Revision:         11
    Manufacturer:     Broadcom Corporation
    Serial Number:    5820B1E86400
    Driver:           bnx2x 1.713.04
    Number of Ports:  1
 
        Symbolic Name:     bnx2fc (QLogic BCM57840) v2.10.2 over eth2
        OS Device Name:    host1
        Node Name:         0x50060B0000C26285
        Port Name:         0x50060B0000C26284
        FabricName:        0x100000053375A10E
        Speed:             Unknown
        Supported Speed:   Unknown
        MaxFrameSize:      2048
        FC-ID (Port ID):   0x1009E0
        State:             Offline
 

 

Did you solve the problem?



ykim337 Members

Yongjae Kim

Posted 05 January 2017 - 10:18 AM

Dear Andrea Martra and Javier Ayllon,

 

I have the same problem. Did you solve it?

 

 

Here is my server's fcoeadm output:

 
    Description:      BCM57840 NetXtreme II 10/20-Gigabit Ethernet
    Revision:         11
    Manufacturer:     Broadcom Corporation
    Serial Number:    9CDC7165CF90
    Driver:           bnx2x 1.713.04
    Number of Ports:  1
 
        Symbolic Name:     bnx2fc (QLogic BCM57840) v2.10.2 over eth2
        OS Device Name:    host1
        Node Name:         0x50060B0000C29E3D
        Port Name:         0x50060B0000C29E3C
        FabricName:        0x10000027F8C722CA
        Speed:             Unknown
        Supported Speed:   Unknown
        MaxFrameSize:      2048
        FC-ID (Port ID):   0x011844
        State:             Offline
 
        Symbolic Name:     bnx2fc (QLogic BCM57840) v2.10.2 over eth3
        OS Device Name:    host2
        Node Name:         0x50060B0000C29E3F
        Port Name:         0x50060B0000C29E3E
        FabricName:        0x10000027F8C45DB7
        Speed:             Unknown
        Supported Speed:   Unknown
        MaxFrameSize:      2048
        FC-ID (Port ID):   0x021845
        State:             Offline
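
All three hosts in this thread report State: Offline on every port. To pull just that field out of fcoeadm -i on a host, a small awk filter could be used. This is a minimal sketch, assuming the output layout shown in the posts above; the function name fcoe_port_states is made up for illustration.

```shell
# Hypothetical helper: reads `fcoeadm -i` output on stdin and prints
# one line per FCoE port in the form "<netdev> <scsi host> <state>".
fcoe_port_states() {
    awk '
        /Symbolic Name:/   { iface = $NF }   # last word is the netdev, e.g. eth2
        /OS Device Name:/  { host  = $NF }   # e.g. host1
        /^[ \t]*State:/    { print iface, host, $NF }
    '
}
```

On a host it would be used as: fcoeadm -i | fcoe_port_states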


Eric Hosmer Members


Posted 06 January 2017 - 09:21 PM

Hey, I have been fighting something similar with an Intel X540 on a 10GbE network port (I have had a support ticket open for almost 5 months). I have no issue when I plug into a 1GbE port.

If we share the same root cause, the possible fix further down may help. I'm seeing something that may be related to a root issue with the X540s, and only when they are connected to a 10GbE port. It does not matter whether the NICs are non-bonded or bonded (tested with active-active only).
 
Here is what I have figured out so far:
 
It only happens when the X540 is plugged into a 10GbE port. With a 1GbE port there are no issues.
On unpatched XS 7.0 there are no issues.
On XS 7.0 after XS70E004, it takes about 40 to 90 seconds for my storage NIC to come up.
 
Looking at the kernel logs, the X540 shows nothing but "link is up" and "link is down" back to back. It takes a long time to stabilize.
I tried updating the X540 firmware too, but no luck.
Note: The current XS 7 download from Citrix has E004 built in.
 
After 6 months Citrix Support gave me something to try that may have worked. After running all 3 disable commands, the NFS SR no longer has a big red X on reboot.
 
When I disable fcoe, xs-fcoe and lldpad, my networking problem goes away.
 
From what I have learned, there is a driver bug with my Intel X540s.
 
 
--------
From Citrix Support 
 
We are still evaluating the data we received from you but we would like you to try something.  We think there may be a problem with FCoE causing the problem.  We would like you to try and disable FCoE and see if the problem still persists.  The commands to run on your XenServer host console are:
 
# systemctl disable fcoe
# systemctl disable xs-fcoe
# systemctl disable lldpad
 
Please run the disable commands and reboot to see if that makes a difference in your NFS issue.
 
# systemctl enable fcoe
# systemctl enable xs-fcoe
# systemctl enable lldpad
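
To see whether the three services are enabled and running before and after the change, something like the following could be run on the host console. This is a minimal sketch, not something Citrix Support supplied; it assumes the unit names fcoe, xs-fcoe and lldpad quoted above, and the function name is made up for illustration.

```shell
# Hypothetical helper: print the enabled/active state of the three
# FCoE-related units named in the support reply.
fcoe_service_status() {
    for svc in fcoe xs-fcoe lldpad; do
        printf '%s: enabled=%s active=%s\n' "$svc" \
            "$(systemctl is-enabled "$svc" 2>/dev/null)" \
            "$(systemctl is-active "$svc" 2>/dev/null)"
    done
}
```

Calling fcoe_service_status prints one line per unit, so it is easy to compare the state before the disable commands and after the reboot.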


Karl Heller Members


Posted 20 April 2017 - 08:41 PM

Just some information from troubleshooting I did today that may give some insight... or be completely unrelated.

 

I've installed XS 7.1 on similar HP gear and have run into a slew of networking problems with bonding in active/passive mode. As you may know, when a bond is created it takes its MAC address from the first NIC's HWADDR. So in my case, eth0 has MAC A and eth1 has MAC B, so the bond has MAC A. eth0 goes to switch 1 and eth1 goes to switch 2; the two switches are connected to each other.

When I fail over to eth1 for testing purposes, everything in that same IP space that is plugged into the same switch as eth0 can't talk to each other.

 

I discovered that the passive interface was still sending out traffic, which makes no sense. It turns out the FCoE components are sending out FCoE discovery packets using the hardware MAC address on EACH connected NIC.

 

So FCoE is sending packets on eth0 with eth0's MAC and on eth1 with eth1's MAC... but eth0's MAC is also in use on bond0, which is currently active on eth1. The switches get confused seeing the same MAC address on two different ports (in my active/passive configuration).

 

One test I will run is whether I can assign a fake MAC to the bond when creating it, to avoid this out-of-the-box problem.

Or just disable all FCoE components... since they are probing all interfaces every second.
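
One way to check whether FCoE discovery frames are leaving the passive NIC is to capture on it by EtherType. This is a minimal sketch, assuming tcpdump is available on the host and that eth1 is the passive bond member as in the setup described above; 0x8914 is the FIP (FCoE Initialization Protocol) EtherType and 0x8906 is FCoE itself. The function name is made up for illustration.

```shell
# Hypothetical helper: capture FIP/FCoE frames on one NIC and print
# link-level headers (-e) so the source MAC of each frame is visible.
# $1 = interface (e.g. eth1), $2 = number of frames to capture (default 20)
capture_fcoe_frames() {
    tcpdump -e -n -i "$1" -c "${2:-20}" 'ether proto 0x8914 or ether proto 0x8906'
}
```

Run as root, for example: capture_fcoe_frames eth1. If frames show up with the NIC's own hardware MAC while that NIC is the passive member, that would match the behaviour described above.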



Eric Hosmer Members


Posted 27 April 2017 - 08:29 PM

Hi Karl.

 

Wow, this really identifies the issue. Thank you.



Eric Hosmer Members


Posted 01 June 2017 - 05:38 PM

I found one way to fix my issue: I replaced the X540-AT2 NICs with X520-DA2 NICs! With the X520s on the storage network, everything was fine.