
HighAvailabilityService.exe extreme memory usage


Dennis Parker

Question

I have a ticket open with Citrix on this, but they say they have no other reports of it happening, so I am looking for confirmation from others: is it just my environment?

 

Delivery Controllers running 2012R2 with 20GB RAM assigned.

Hyper-V

Citrix 7.12

LHC enabled

~1000 active brokered connections

 

After restarting the High Availability Service, it starts out using roughly the documented amount of memory, ~1.2 GB. Over the next few days it climbs, and it is currently using ~8GB on each of the Delivery Controllers. I am monitoring to see whether memory usage continues to grow, but based on prior experience it will keep using more memory until all memory on the server is used.

Does it continue to grow until all available memory is used and then just hold steady?

 

Again, just looking for evidence in other environments, because what I see is that it grows to fill available memory, but Citrix docs say it should only use ~1.2GB of memory. 

 

https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-12/manage-deployment/local-host-cache.html

RAM size

The LocalDB service can use approximately 1.2 GB of RAM (up to 1 GB for the database cache, plus 200 MB for running SQL Server Express LocalDB). The High Availability Service can use up to 1 GB of RAM if an outage lasts for an extended interval with many logons occurring (for example, 12 hours with 10K users). These memory requirements are in addition to the normal RAM requirements for the Controller, so you might need to increase the total amount of RAM capacity.
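
For anyone who wants to chart the growth described above, here is a minimal sketch for sampling the service's memory. It assumes the process name is HighAvailabilityService (taken from the executable name in the thread title). The function name and the CSV path in the usage comment are made up for illustration.

```powershell
# Hypothetical helper: returns one memory sample per matching process.
function Get-ProcessMemorySample {
    param(
        # Assumed process name, derived from HighAvailabilityService.exe.
        [string] $Name = 'HighAvailabilityService'
    )

    # SilentlyContinue: return nothing (rather than an error) if the process is not running.
    Get-Process -Name $Name -ErrorAction SilentlyContinue | ForEach-Object {
        [pscustomobject]@{
            Time         = Get-Date
            Id           = $_.Id
            WorkingSetMB = [math]::Round($_.WorkingSet64 / 1MB, 1)
            PrivateMB    = [math]::Round($_.PrivateMemorySize64 / 1MB, 1)
        }
    }
}

# Example usage (path is a placeholder): append a sample every 5 minutes.
# while ($true) {
#     Get-ProcessMemorySample | Export-Csv -Path C:\Temp\ha-memory.csv -Append -NoTypeInformation
#     Start-Sleep -Seconds 300
# }
```

Logging to CSV rather than watching Task Manager makes it easier to show support whether the growth is steady or levels off.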

 


Link to comment


I am fairly certain I have the issue resolved. 

 

I disabled and re-enabled the LHC:

Set-BrokerSite -LocalHostCacheEnabled $false -ConnectionLeasingEnabled $false
 
Set-BrokerSite -LocalHostCacheEnabled $true -ConnectionLeasingEnabled $false
 
That fixed the "Data store has not been configured" message in the CitrixBrokerConfigSyncReport.html report. The report itself is enabled with this command:
New-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -PropertyType DWORD -Value 1
 
OR if the value already exists:
Set-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -Value 1
 

 

 

(and then disabled when things are working of course: Set-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -Value 0)

 

I then started getting this error:

Exception: Citrix.Broker.Admin.SDK.SdkOperationException: Security identifier does not represent a Windows account ---> System.InvalidOperationException: Security identifier does not represent a Windows account --- End of inner exception stack trace --- at System.Management.Automation.MshCommandRuntime.ThrowTerminatingError(ErrorRecord errorRecord)
 

 

Looking up a few lines I see that it was working on importing application information into the LHC database and trying to import a security identifier (user or group) associated with the application that no longer exists in our AD environment. I had to go through the process several times and clean up each application one at a time (I also spot checked a number of applications and did some clean-up to make the process go much faster) as they failed. 

 

Once all of the invalid security identifiers were cleared up, the LHC database was created properly. After I restarted the High Availability Service, the memory usage dropped significantly, down to about 400MB, and remained there on all Delivery Controllers overnight.

 

The moral of the story: Check your event logs for error 505 and follow the troubleshooting steps. 

 

One final comment: Citrix could do a better job of handling and reporting errors. Rather than crashing in the middle of the process, the import could gather the errors and report on them. Admins like me, of course, also need to do our job and properly check the error logs; if I had, this thread most likely would never have been generated. I hope that someone will find a use for it in the future.

 

Thanks again for all of your assistance Martin! I'm sorry that I won't be able to provide meaningful data to Citrix as I already resolved my issue. Perhaps Citrix can attempt to recreate in a lab. 

Link to comment
  • 0

I think continuing via the support process is the right approach, as I expect the next step would be to get a memory dump of the process whilst it is using a lot of memory, so that the dump can be analysed to determine which objects are using the memory.

 

That said, some extra information about your environment would be useful:

a) How many VDAs do you have?

b) Is this memory usage seen in outage mode, or is your database up and XD reporting as connected whilst the memory usage grows?

 

As noted in the docs, LHC uses Microsoft SQL Server Express LocalDB, and is configured such that the maximum memory for the database is limited to 1GB. This limit can be changed by editing the configuration file. For example, to reduce it to 768MB, edit:

C:\Program Files\Citrix\Broker\Service\HighAvailabilityService.exe.config

Look for the <appSettings> section and add an entry: <add key="MaxServerMemoryInMB" value="768" />

 

I believe you'll need to restart the High Availability Service to pick up the change.

Thanks Martin
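
To double-check the edit described above before restarting the service, the setting can be read back from the file. This is only a sketch: the function name is made up, and it assumes the file is a standard .NET config with the <appSettings> section directly under <configuration>. It reads the file and does not modify it.

```powershell
# Hypothetical helper: returns the value of an <appSettings> key, or $null if it is absent.
function Get-AppSettingValue {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Key
    )

    # Parse the .config file as XML.
    [xml] $config = Get-Content -LiteralPath $Path -Raw

    # Each <add key="..." value="..." /> under <appSettings> is one setting.
    $entry = $config.SelectSingleNode("/configuration/appSettings/add[@key='$Key']")
    if ($null -ne $entry) { $entry.GetAttribute('value') } else { $null }
}

# Example usage, with the path and key given in the post above:
# Get-AppSettingValue -Path 'C:\Program Files\Citrix\Broker\Service\HighAvailabilityService.exe.config' -Key 'MaxServerMemoryInMB'
```

If nothing is returned, the key is not in the file and the default limit from the post above presumably applies.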

Link to comment
  • 0

Thanks for the reply Martin.

 

I agree, and I will continue working with support. I am monitoring for a couple of days before we do the memory dump, as I want to see if it continues to grow as expected. I restarted the service on Friday; usage dropped but seems to be growing again at a consistent rate. As I said, I was just wondering if any other environments are seeing the same thing, or if it's just mine. I am not really expecting the solution to come from the discussion board.

 

Answers to questions:

A. 124 VDAs active and registered. These are all 2008R2 or 2012R2 server OS. The underlying hypervisor for these VDAs could be either Hyper-V or XenServer.

B. Memory usage is when DB is up and connected. 

More information:

Reading some of the troubleshooting documentation, I found an error in the event log that I didn't see before for some reason.

 

505 - The Citrix Config Sync Service failed an import.

 

This seems important....

 

So, attempting to troubleshoot, I discovered this error after creating the CSS trace report:

Exception: Citrix.Broker.Admin.SDK.SdkOperationException: Data store has not been configured ---> System.InvalidOperationException: Data store has not been configured --- End of inner exception stack trace --- at System.Management.Automation.MshCommandRuntime.ThrowTerminatingError(ErrorRecord errorRecord)

 

 

This *really* seems important. 

Note: this site went through upgrades from 7.6 to 7.9 to 7.11 to 7.12. Not sure if it is relevant, but the LHC documentation says LocalDB is installed automatically when you install a Controller or upgrade from a version earlier than 7.9. It does not mention an upgrade from 7.11. I do see the SQL Server instance; I am just not certain the database is properly configured at this point.

 

My next step will be to go back to Citrix support and figure out how to configure the data store. I will update this discussion thread when I learn more.

 

Thanks again for the response and for, oddly enough, directing me towards the error, which will likely fix my issues.

Link to comment
  • 0

Dennis, glad you have things working. Shame we couldn't get a dump. We've seen a few occasions during internal testing when memory usage grew like you saw. We believed we had fixed all the ones we were able to reproduce, but clearly there may be something else lurking. A dump would have helped pinpoint it, but oh well...

As for the import failing because the AD account lookup fails, we believe we've already fixed this and it should be resolved in the next release.

If you are unfortunate enough to hit the high memory usage issue again, please get a process dump. Thanks Martin

Link to comment
  • 0

 

 

I'm seeing this in my environment.  HighAvailabilityService.exe was consuming 3.5GB on one DDC and 2.5GB on another.  I disabled LHC and then re-enabled it, we'll see how that goes.

 

I'm still getting 505 errors though.  Any pointers on how you tracked down the app?  I'm seeing some chatter before the 505, but nothing looks like an error.  I don't even see any application names to help point me in the right direction of the one causing the problem!

Link to comment
  • 0

Hi Joe, 

 

Sorry, my team made some adjustments to spam filtering so I didn't get notification of the reply to this topic until this morning.

 

I still have a slower memory leak after fixing all of the 505 error messages. I got a dump to support and they confirmed there is a bug in the HA service around policies. It will be patched in 7.13. Thanks to Martin Rowan for pushing the process through quickly.

 

As for cleaning up the 505 messages, the troubleshooting section of this page has some guidance: https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-12/manage-deployment/local-host-cache.html

 

To expand on that document a little bit, this is the main section that I used to troubleshoot my environment:

---

Report: You can generate and provide a report that details the failure point. This report feature affects synchronization speed, so Citrix recommends disabling it when not in use.

To enable and produce a CSS trace report, enter:

New-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -PropertyType DWORD -Value 1

The HTML report is posted at C:\Windows\NetworkService\LocalService\AppData\Local\Temp\CitrixBrokerConfigSyncReport.html 

After the report is generated, disable the reporting feature:

Set-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -Value 0

---

You enable the trace with the PowerShell command, then wait for the sync to happen. Browse to the report directory, which in my environment was different from the one in the document above: C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp

 

Two files will be created. If there is data from a previous run, it will be backed up and older data removed.

  • CitrixBrokerConfigSyncReport.html
  • CitrixConfigConfigSyncReport.html

For my issue, I was interested in the first one (Broker). I just sat and watched it grow until I got the 505 error, then opened it in a browser and scrolled to the bottom to see the error in that run.

 

For example: 

2017-01-23T16:42:33 ===========================
2017-01-23T16:42:33 Calling Add-BrokerUser:
2017-01-23T16:42:33     Application: Type="Citrix.Broker.Admin.SDK.Application", Value="Citrix.Broker.Admin.SDK.Application"
2017-01-23T16:42:33     Name: Type="System.String", Value="S-1-5-1x-3767472556-3828333994-2192153425-83959"
2017-01-23T16:42:33 Exception: Citrix.Broker.Admin.SDK.SdkOperationException: Security identifier does not represent a Windows account ---> System.InvalidOperationException: Security identifier does not represent a Windows account --- End of inner exception stack trace --- at System.Management.Automation.MshCommandRuntime.ThrowTerminatingError(ErrorRecord errorRecord)

 

I would have to scroll up in the file to see what application it was working on to clean up the accounts. All of the errors in my environment were surrounding deleted AD users or groups.
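
To save some of that scrolling, the last failing call can be pulled out of the report text with a couple of regular expressions. This is a rough sketch, not a Citrix tool: the function name is made up, and it assumes the report contains "Calling <cmdlet>:" and "Exception:" entries shaped like the excerpt above. It only reports the cmdlet and the SID that failed; you still need to look further up in the report for the application being imported.

```powershell
# Hypothetical helper: finds the cmdlet call and SID logged just before the last exception.
function Get-LastFailedImportCall {
    param(
        [Parameter(Mandatory = $true)] [string] $ReportText
    )

    # Last standalone "Exception:" entry; \b skips type names such as "SdkOperationException:".
    $failure = [regex]::Matches($ReportText, '\bException:') | Select-Object -Last 1
    if (-not $failure) { return $null }

    # Only text logged before the failure can describe the call that failed.
    $before = $ReportText.Substring(0, $failure.Index)

    # Most recent "Calling <cmdlet>:" marker before the exception.
    $call = [regex]::Matches($before, 'Calling\s+([\w-]+):') | Select-Object -Last 1
    if (-not $call) { return $null }

    # Look for a SID-shaped argument between that marker and the exception.
    $sid = [regex]::Match($before.Substring($call.Index), '\bS-1-[0-9A-Za-z-]+')

    [pscustomobject]@{
        Cmdlet = $call.Groups[1].Value
        Sid    = if ($sid.Success) { $sid.Value } else { $null }
    }
}

# Example usage, with the report location from my environment:
# $report = 'C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp\CitrixBrokerConfigSyncReport.html'
# Get-LastFailedImportCall -ReportText (Get-Content -LiteralPath $report -Raw)
```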

 

It was a painful process to go through and again, talking with Martin, he says Citrix is aware of the problem and it will be better in 7.13, but I don't have any details on what will be different. 

 

Hope this helps.

Dennis

Link to comment
  • 0

I am using a 2016 O/S with Xenapp 7.13. I have forced use of the LHC by using the OutageForcedMode registry key set to 1.

 

I get the 505 error "The Citrix Config Sync Service failed an import".

I have tried to enable the html report using the following in powershell to troubleshoot the issue:

 

New-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -PropertyType DWORD -Value 1
 
Set-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -Value 1
 

I see no html report in the path provided above:

C:\Windows\NetworkService\LocalService\AppData\Local\Temp\

 

I have waited for the alert to generate after enabling the tracemode for CSS service but still no html report.

Am I missing something?

Link to comment
  • 0

This invalid AD account lookup problem is still not fixed in 7.13. I still have it, even though the error message is a bit different from 7.12:

Exception: Citrix.Broker.Admin.SDK.SdkOperationException: Security identifier is invalid ---> System.InvalidOperationException: Security identifier is invalid --- End of inner exception stack trace --- at System.Management.Automation.MshCommandRuntime.ThrowTerminatingError(ErrorRecord errorRecord)

The way I found the associated VDI or application was to look up the SID in Studio.

Link to comment
Actually, I found I couldn't search by SID for applications; I needed to find the application name from the trace log file. I also found another bug in LHC: when the Citrix Remote PowerShell SDK is installed, the CSS service fails. You don't see errors in the event log, but no LHC DB is created. The PowerShell script in the CSS folder looks for the SDKProxy registry entry and, if it finds it, throws an exception.

Link to comment
  • 0


David, the "Remote PowerShell SDK" shouldn't be installed on an on-prem DDC, if that's what you've done. The Remote SDK is only intended to be used for Cloud-based deployments.

In this thread Citrix Studio was broken by installation of the Remote SDK: https://discussions.citrix.com/topic/383418-xd-76-starting-studio-throws-out-citrix-cloud-popup-and-hangs-on-expand/?do=findComment&comment=1952475

 

I suspect the reason for the incompatibility is that the two SDKs share some of the same commands but may be different versions or have different parameters.

Link to comment
  • 0
On 1/24/2017 at 6:43 PM, Dennis Parker said:

I am fairly certain I have the issue resolved. 

 


Hello Dennis,

 

Did you ever receive a conclusive response from Citrix Support on this? I am having the exact same issue and I need to find out if this is a confirmed bug from the Citrix side. I am running Delivery Controllers LTSR 1912 on Windows Server 2019 with latest Windows updates.

 

Thanks and looking forward to your feedback.

Link to comment
  • 0

There was a memory issue with LHC that was resolved. I don't have exact version numbers, but I know I didn't have the issue with 7.15 LTSR and do not have the memory issue with 1912 LTSR on 2012R2. 

There are still occasional issues with 505 errors that need to be worked through, but internal processes have been cleaned up some and Citrix has done work on the process, which combined have reduced the number of 505 errors significantly. I probably have one about every 2 months now. 

Link to comment
