After securing HA Netscalers, should LB on secondary be DOWN ?

Started by Jean-Marc Paulin , 01 March 2017 - 12:47 PM
14 replies to this topic

Jean-Marc Paulin Members

Jean-Marc Paulin
  • 5 posts

Posted 01 March 2017 - 12:47 PM

Hi,

 

We have a pair of NetScaler VPX 11.1 appliances (NetScaler NS11.1: Build 50.10.nc) in HA mode with load balancing. We followed some of the steps here (https://support.citrix.com/article/CTX114087) to encrypt the RPC communication.

 

What we noticed is that if we enable RPC secure, the secondary shows the load-balanced virtual servers as DOWN; if we revert to unsecure, the secondary shows them as UP.

 

That does not appear to be the expectation. This thread http://discussions.citrix.com/topic/356551-change-password-on-rpcnode/ seems to confirm this is wrong as well.

 

However, both NetScalers seem to be happy enough and see each other, and a forced HA sync (force ha sync) does not return any errors.

 

Do I have a problem and how do I fix it?

 

Thanks

 

JM


Jens Ostkamp Members

Jens Ostkamp
  • 34 posts

Posted 01 March 2017 - 03:21 PM

This should be expected behaviour: your secondary NetScaler will bring its services (and so the LB vservers) up as soon as it needs to take over the primary role.



Jean-Marc Paulin Members

Jean-Marc Paulin
  • 5 posts

Posted 01 March 2017 - 03:26 PM

That looks odd. We have a different status on the Secondary depending on how rpc secure is set? Is there another way to confirm rpc is working as it should?

 

JM



Paul Blitz Members

Paul Blitz
  • 3,767 posts

Posted 02 March 2017 - 10:20 AM

Are you 100% happy that the HA connection between the two NetScalers is working? (Make a change and see if it propagates.)

 

Secondary can't send monitors, so it pulls the state information from the primary, which is why I'm questioning the state of the HA



Jean-Marc Paulin Members

Jean-Marc Paulin
  • 5 posts

Posted 02 March 2017 - 11:18 AM

This is a fair question. I forced failovers and that worked as expected.

 

Also I just added a dummy syslog action on the primary, and I could see it on the secondary.

I then removed it from the primary, and it got removed from the secondary.

 

On that basis I am assuming HA communication works. I am just surprised that the state of the vserver is not replicated on the secondary.

 

On the Primary:

> show lb vserver test-lb
        test-lb (aaa.bbb.ccc.ddd:443) - TCP       Type: ADDRESS
        State: UP
        Last state change was at Wed Feb 22 18:18:21 2017
        Time since last state change: 7 days, 16:52:51.500
        Effective State: UP
        Client Idle Timeout: 9000 sec
        Down state flush: ENABLED
        Disable Primary Vserver On Down : DISABLED
        Appflow logging: ENABLED
        No. of Bound Services :  3 (Total)       3 (Active)
        Configured Method: LEASTCONNECTION      BackupMethod: ROUNDROBIN
        Mode: IP
        Persistence: NONE
        Connection Failover: STATEFUL
        L2Conn: OFF
        Skip Persistency: None
        Listen Policy: NONE
        IcmpResponse: PASSIVE
        RHIstate: PASSIVE
        New Service Startup Request Rate: 0 PER_SECOND, Increment Interval: 0
        Mac mode Retain Vlan: DISABLED
        DBS_LB: DISABLED
        Process Local: DISABLED
        Traffic Domain: 0

1) test-lb-backends0 (aaa.bbb.ccc.ddd: 10011) - TCP State: UP        Weight: 10
2) test-lb-backends1 (aaa.bbb.ccc.ddd: 10011) - TCP State: UP        Weight: 10
3) test-lb-backends2 (aaa.bbb.ccc.ddd: 10011) - TCP State: UP        Weight: 10
 

 

On the Secondary:

> show lb vserver test-lb
        test-lb (aaa.bbb.ccc.ddd:443) - TCP       Type: ADDRESS
        State: DOWN
        Last state change was at Wed Mar  1 16:06:53 2017
        Time since last state change: 0 days, 19:04:00.30
        Effective State: DOWN
        Client Idle Timeout: 9000 sec
        Down state flush: ENABLED
        Disable Primary Vserver On Down : DISABLED
        Appflow logging: ENABLED
        No. of Bound Services :  3 (Total)       0 (Active)
        Configured Method: LEASTCONNECTION      BackupMethod: ROUNDROBIN
        Mode: IP
        Persistence: NONE
        Connection Failover: STATEFUL
        L2Conn: OFF
        Skip Persistency: None
        Listen Policy: NONE
        IcmpResponse: PASSIVE
        RHIstate: PASSIVE
        New Service Startup Request Rate: 0 PER_SECOND, Increment Interval: 0
        Mac mode Retain Vlan: DISABLED
        DBS_LB: DISABLED
        Process Local: DISABLED
        Traffic Domain: 0

1) test-lb-backends0 (aaa.bbb.ccc.ddd: 10011) - TCP State: DOWN      Weight: 10
2) test-lb-backends1 (aaa.bbb.ccc.ddd: 10011) - TCP State: DOWN      Weight: 10
3) test-lb-backends2 (aaa.bbb.ccc.ddd: 10011) - TCP State: DOWN      Weight: 10
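To compare the two outputs above mechanically rather than by eye, something like the following could be used. It is only a minimal sketch: it assumes the text of `show lb vserver <name>` has been pasted from each node in the format shown above, and the function names `parse_lb_vserver` and `diff_states` are made up for illustration. It does not talk to the NetScaler at all.

```python
import re

# Shortened excerpts of the outputs posted above (illustrative sample data).
PRIMARY_OUTPUT = """\
        test-lb (aaa.bbb.ccc.ddd:443) - TCP       Type: ADDRESS
        State: UP
        Effective State: UP
1) test-lb-backends0 (aaa.bbb.ccc.ddd: 10011) - TCP State: UP        Weight: 10
"""

SECONDARY_OUTPUT = """\
        test-lb (aaa.bbb.ccc.ddd:443) - TCP       Type: ADDRESS
        State: DOWN
        Effective State: DOWN
1) test-lb-backends0 (aaa.bbb.ccc.ddd: 10011) - TCP State: DOWN      Weight: 10
"""


def parse_lb_vserver(text):
    """Parse pasted `show lb vserver` text into states (hypothetical helper)."""
    result = {"state": None, "effective_state": None, "services": {}}
    for raw in text.splitlines():
        line = raw.strip()
        # Bound service lines look like: "1) name (ip: port) - TCP State: UP ..."
        svc = re.match(r"^\d+\)\s+(\S+)\s.*\bState:\s*(\S+)", line)
        if svc:
            result["services"][svc.group(1)] = svc.group(2)
        elif line.startswith("Effective State:"):
            result["effective_state"] = line.split(":", 1)[1].strip()
        elif line.startswith("State:"):
            result["state"] = line.split(":", 1)[1].strip()
    return result


def diff_states(primary, secondary):
    """List the state differences between two parsed outputs."""
    diffs = []
    for key in ("state", "effective_state"):
        if primary[key] != secondary[key]:
            diffs.append(f"{key}: primary={primary[key]} secondary={secondary[key]}")
    names = sorted(set(primary["services"]) | set(secondary["services"]))
    for name in names:
        p = primary["services"].get(name)
        s = secondary["services"].get(name)
        if p != s:
            diffs.append(f"service {name}: primary={p} secondary={s}")
    return diffs


if __name__ == "__main__":
    for line in diff_states(parse_lb_vserver(PRIMARY_OUTPUT),
                            parse_lb_vserver(SECONDARY_OUTPUT)):
        print(line)
```

Running it once with RPC secure enabled and once with it disabled would show whether only the state fields differ between the nodes, or something else as well.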
 

 

Any other ideas ?



Paul Blitz Members

Paul Blitz
  • 3,767 posts

Posted 02 March 2017 - 05:15 PM

I wonder if this is one of those "well, it depends on the version" things.... I'm sure through history I've seen both an UP status on secondary nodes (ie copied state from primary), and a DOWN.

 

Just looking at the GUI on our 2 netscalers (11.0.65.31), and all the services on the secondary are showing as UP



Jean-Marc Paulin Members

Jean-Marc Paulin
  • 5 posts

Posted 03 March 2017 - 08:53 AM

@Paul, do you have RPC configured with Secure=YES? When I have RPC configured with Secure=NO, I see the status as UP on the Secondary. It is only when I enable RPC Secure=YES that the status goes DOWN.



Avinash Piare Members

Avinash Piare
  • 4 posts

Posted 13 April 2017 - 05:50 PM

I just upgraded two SDX 8015 appliances using the single-bundle upgrade, and after that two H/A VPX pairs (running on the SDX appliances).

After upgrading the Secondary of the first H/A pair I ran into the same issue: virtual servers were down on the Secondary because the monitors were down.

I forced a failover and voilà, all services and virtual servers were UP again.

After I upgraded the former Primary (the Secondary at that moment), all services and virtual servers were UP while in Secondary status.

This probably has to do with the difference in version between the two NetScalers in an H/A pair.


Jens Ostkamp Members

Jens Ostkamp
  • 34 posts

Posted 18 April 2017 - 12:33 PM

I'm sorry about my first post. I didn't really think about different versions, because the last two or three HA configurations I set up had this behaviour and all the failover tests went as expected (a node shows UP when it takes over and DOWN while it is secondary), so I didn't consider other possibilities such as differing firmware versions.

 

 

@Avinash: This sounds strange; I can't really confirm this behaviour. All my implementations and configurations had exactly the same firmware on both appliances, yet still showed this UP/DOWN difference between the primary and secondary appliance.

 

@Paul: Are you sure the secondary just pulls the monitor information from the primary? I thought so too, but it looks like Citrix changed this in some version.



Jens Ostkamp Members
  • #10

Jens Ostkamp
  • 34 posts

Posted 18 April 2017 - 12:38 PM

e: sorry for doubleposting, please delete 

 



Paul Blitz Members
  • #11

Paul Blitz
  • 3,767 posts

Posted 18 April 2017 - 04:00 PM

The only IP available to the secondary netscaler is its NSIP, and (apart from perl monitors) monitors are sent from a SNIP, so secondary has no way to send monitors. Thus the only way to allow the secondary to get LB status is from the primary.

 

(Of course, when you have two different versions, there is no synchronization or propagation between the HA Pair, so I could understand if it doesn't also send status info.)



Jean-Marc Paulin Members
  • #12

Jean-Marc Paulin
  • 5 posts

Posted 18 April 2017 - 06:16 PM

I am with you on that one: the secondary cannot monitor anything. However, when secure=false the monitor status is replicated from the primary to the secondary, and when secure=true it is not. This is the odd behavior.

 

However, if I make a configuration change on the primary, it is replicated to the secondary, so I presume (maybe wrongly) that replication between primary and secondary works.

 

Primary and Secondary both have the same version.

 

@Jens, have you configured RPC communication between primary and secondary to be secure?

 

Thanks

 

JM



Jens Ostkamp Members
  • #13

Jens Ostkamp
  • 34 posts

Posted 19 April 2017 - 09:50 AM

You are right: the RPC Secure checkbox is NOT ticked in one HA setup I just checked, and the secondary shows its services/vservers as UP. I will check again later with secure RPC enabled to see whether the services go down.

 

@Paul: Yeah, that makes sense. Still, I can't really figure out why the secondary should show its services as DOWN when it pulls the monitor information from the primary, where they are shown as UP. Even if HA itself seems to work this way, it sounds odd. It obviously has something to do with the secure RPC traffic; can anyone give some detailed insight into that?



Paul Blitz Members
  • #14

Paul Blitz
  • 3,767 posts

Posted 19 April 2017 - 11:12 AM

ok, I just created a secondary netscaler in my lab, and have had a play with rpcnode. I'm running 11.1-51.26

 

Things I learned:

 

- If I run "set rpcnode <pmy ip> -secure yes" on primary, then the command propagated fine to the secondary (checked with the "show rpcnode" command), forced sync worked ok, and vservers on secondary showed as up

 

- If I run "set rpcnode <secy ip> -secure yes" on primary, then the command again propagated fine to the secondary, forced sync worked ok, and vservers on secondary showed as up

 

- "show rpcnode" matched what was set.



Jens Ostkamp Members
  • #15

Jens Ostkamp
  • 34 posts

Posted 19 April 2017 - 02:43 PM

I just did the same in my lab, setting up a secondary NetScaler and playing around with HA and RPC. I couldn't reproduce the state where my secondary shows its services as DOWN for as long as it stays secondary. Checking or unchecking Secure RPC didn't change a thing, and neither did a few failovers. I tested with the latest 11.1 build. I don't expect different behaviour from the 50.10 build, but I will check it to recreate the OP's configuration firmware-wise.

E: I was not able to reproduce the described issue with 50.10 either. The only difference I can think of now is the license (I tested with a partner license, which equals Platinum). I have definitely seen the issue before in my configurations (I even thought it was normal behaviour until I did these lab tests today), so I'm not really sure what the cause might be, as the version in my lab matches the OP's version.