  • 3

Registration Missing events in Studio console with Fault State "Failed to start", Power State Off


Balaji Muthukrishnan

Question

We have an XD 7.15 LTSR CU1 environment with VDA 7.17 installed.

 

This environment was upgraded from XD 7.8 to 7.15 LTSR CU1. Since the upgrade we have been seeing events in the Registration Missing section of the Studio console.

 

Of those, we were able to understand the cause of the "Stuck on boot" Fault State events, but VDAs repeatedly show up with Fault State "Failed to start". This includes both non-persistent and persistent VMs.

We are using XenServer 7.0 as the hypervisor.

 

Has anyone faced a similar issue, or are there any known issues you have encountered? Please suggest a workaround or solution.

12 answers to this question

  • 3

Fault State "Failed to Start" is due to an issue with the XenDesktop, PVS and XenServer 7.0 combination alone.

 

Fault State "Unregistered" can have various causes:

 

- VM in an unresponsive state due to a hung process

- Sometimes an application can make a VM unresponsive and push it into the unregistered state (in my case, the Jabber application and copying bulk files to the C:\ drive caused this)

- Controller unable to reach/contact the VDA

- Sometimes PVS-streamed targets are unable to boot (stuck at the boot diagnostic screen) because the target device entry is not found in the database

  • 2

--Still observing the issue mentioned above. The outcome of the support case was to update the Nutanix AHV MCS Plugin For Citrix XenDesktop to the latest version (the issue and symptom mentioned in the July 14 post above apply only to workloads hosted on Nutanix, AHV/ESXi).

 

Related to this, I have seen similar FailedToStart symptoms caused by a different issue.

 

-We updated an image and rolled it out with OS updates. After that, users had trouble connecting to their VDAs and saw performance issues as well.

-Finally we had to roll back the update in the MCS machine catalog.

-For some reason the rollback did not go well. The catalog showed the error "This Machine Catalog has Following Errors: A Provisioning Operation failed (The windows event log may contain Further details. See the Action tab for further details)"

-Because of this, machines that were restarted or powered on from Studio (manually or by a Broker action) hit the FailedToStart issue (RegistrationMissing). The only option was to power them on manually from the hypervisor console.

-A CDF trace taken during the power action showed that the Broker passes a null value for Machineid or MasterImageid to the Nutanix AHV MCS Plugin.

-This gave us a clue, and it correlated with the error message shown on the machine catalog ("This Machine Catalog has Following Errors: A Provisioning Operation failed (The windows event log may contain Further details. See the Action tab for further details)")

 

--We suspected a problem with the rollback, so we decided to update the machine catalog again, pointing to the same snapshot it had been rolled back to (no changes as such, just pointed to the same snapshot).

--After the update, there was no error message on the machine catalog.

--As expected, we are now able to restart and power on from the Studio and Director consoles (including the Broker power actions).

  • 2

I have upgraded the Nutanix AHV MCS Plugin For Citrix XenDesktop 7.9 or Later to 2.3.0.0 (from 1.1.2.0), and since then I don't see the VDA FailedToStart issue (Hypervisor Reported Failure, see the Jul 14 post above) reported in the Director dashboard under Failed Desktop OS Machines :)

 

The MCS Plugin upgrade on the Delivery Controller fixed the issue :)

  • 1

https://support.citrix.com/article/CTX225835 (Recommended Hotfixes for XenServer 7.x)

https://support.citrix.com/article/CTX235403 (Updates to Management Agent - For XenServer 7.0 and later). These are the drivers that have to be updated.

 

You have to download "Windows Server 2016 and Later Servicing Drivers" for Windows 10 1709 x64 to get the 64-bit patch. I downloaded the cab files and installed them on a Win 10 1709 x64 OS, and the drivers were updated to the latest versions mentioned in CTX235403.

 

But with the PVS-streamed vDisks I am not able to apply the xenbus driver update; it causes a BSOD (cvhpmp.sys) on reboot. The rest of the drivers don't have any issue.

 

  • 1

In the meantime I took a vDisk that has this issue (Fault State "Failed to Start" and "Unregistered") and used the direct VHD boot method to upgrade to Win 10 1709 along with the Xen driver updates mentioned in CTX235403.

 

Then I exported the VHD back to the PVS datastore and imported the vDisk back in the PVS console.

 

Since then I have been monitoring the environment for the same issue. It has been more than a week now, and I have not seen it in that particular delivery group.

  • 0

Updates on the above issue...

 

We were able to identify the events registered on the DC for the affected VMs:

 

Event ID 3012 and 3016

 

Log Name:      Application
Source:        Citrix Broker Service
Date:          5/30/2018 1:36:55 AM
Event ID:      3012
Task Category: None
Level:         Warning
Keywords:      
User:          NETWORK SERVICE

Description:
The Citrix Broker Service detected that power action 'TurnOn' (origin: IdlePool) on virtual machine 'CSB' failed. 
 
This problem is most likely due to a host issue. Check that the configuration of the virtual machine on the host does not prohibit the requested power action. Verify that there are no storage problems on the host. Check that the host connection information is correct. 
 
Error details: 
Exception 'Failure in PowerOn, SR_BACKEND_FAILURE_453, , tapdisk experienced an error [opterr=Operation not permitted], ' of type 'PluginUtilities.Exceptions.ManagedMachineGeneralException'.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Citrix Broker Service" Guid="{1EC1549E-1762-49AB-B7A8-0DE5CBACA3FB}" />
    <EventID>3012</EventID>
    <Version>0</Version>
    <Level>3</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2018-05-30T05:36:55.592182400Z" />
    <EventRecordID>6337671</EventRecordID>
    <Correlation />
    <Execution ProcessID="2092" ThreadID="28748" />
    <Channel>Application</Channel>
    <Computer>FA09.v.com</Computer>
    <Security UserID="S-1-5-20" />
  </System>
  <EventData>
    <Data Name="action">TurnOn</Data>
    <Data Name="origin">IdlePool</Data>
    <Data Name="machineName">CSB</Data>
    <Data Name="message">Failure in PowerOn, SR_BACKEND_FAILURE_453, , tapdisk experienced an error [opterr=Operation not permitted], </Data>
    <Data Name="type">PluginUtilities.Exceptions.ManagedMachineGeneralException</Data>
  </EventData>
</Event>

 

 

Log Name:      Application
Source:        Citrix Broker Service
Date:          5/29/2018 11:49:01 PM
Event ID:      3016
Task Category: None
Level:         Warning
Keywords:      
User:          NETWORK SERVICE

Description:
The Citrix Broker Service abandoned active power action 'Shutdown' (origin: 'Policy') on virtual machine 'ICI' because no notification of either success or failure was received within the configured timeout period.

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Citrix Broker Service" Guid="{1EC1549E-1762-49AB-B7A8-0DE5CBACA3FB}" />
    <EventID>3016</EventID>
    <Version>0</Version>
    <Level>3</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2018-05-30T03:49:01.740729700Z" />
    <EventRecordID>141941</EventRecordID>
    <Correlation />
    <Execution ProcessID="1556" ThreadID="7308" />
    <Channel>Application</Channel>
    <Computer>A004.v.com</Computer>
    <Security UserID="S-1-5-20" />
  </System>
  <EventData>
    <Data Name="action">Shutdown</Data>
    <Data Name="origin">Policy</Data>
    <Data Name="machineName">ICI</Data>
  </EventData>
</Event>

 

And on the hypervisor side (Citrix XenServer only, not VMware ESXi):

 

Error: Starting VM

Tapdisk experienced an error
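
When there are many of these 3012/3016 events to go through, the named fields in the Event Xml can be pulled out with a few lines of script. This is a minimal Python sketch, not something used in this thread: the function name `summarize_broker_event` is made up for illustration, and the sample is a trimmed copy of the 3012 event above.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Windows event XML shown in the events above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def summarize_broker_event(event_xml: str) -> dict:
    """Return the event ID plus every named <Data> field of one event.

    Hypothetical helper for illustration; it only reads the XML text and
    does not talk to the event log or to the Broker.
    """
    root = ET.fromstring(event_xml)
    summary = {"EventID": root.findtext("e:System/e:EventID", namespaces=NS)}
    for data in root.findall("e:EventData/e:Data", NS):
        summary[data.get("Name")] = (data.text or "").strip()
    return summary


# Trimmed copy of the 3012 event from this post.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System><EventID>3012</EventID></System>
  <EventData>
    <Data Name="action">TurnOn</Data>
    <Data Name="origin">IdlePool</Data>
    <Data Name="machineName">CSB</Data>
  </EventData>
</Event>"""

if __name__ == "__main__":
    print(summarize_broker_event(SAMPLE))
```

Grouping the results by `machineName` and `action` makes it easier to see whether the same VMs keep failing the same power action.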

 

 

  • 0

I worked with Citrix Support, and they said the problem is with the XenDesktop and XenServer combination alone.

 

The workaround is to install/update the Citrix XenServer Windows Management Agent drivers in the VDAs. They are available at https://www.catalog.update.microsoft.com/Search.aspx?q=citrix

 

Path for the driver files: C:\Windows\system32\drivers

xenvbd 8.2.1.203

xennet 8.2.1.102

xenvif 8.2.1.155

xenbus 8.2.1.117

 

On the VDAs of the problematic VMs, the Windows XS management agent is at version 7.1.825 and the driver files are at 8.1.xxx.

They are expected to move to 8.2.xxx after installing the cab files / MS patch.
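
To check a batch of VDAs against the versions listed above, the comparison itself is simple. This is a rough Python sketch under assumptions: the expected versions are the ones from this post, the installed versions would have to be collected separately (for example from the file properties under the driver path above), and the helper names are made up for illustration.

```python
# Expected driver versions as listed in this post.
EXPECTED = {
    "xenvbd": "8.2.1.203",
    "xennet": "8.2.1.102",
    "xenvif": "8.2.1.155",
    "xenbus": "8.2.1.117",
}


def parse_version(text: str) -> tuple:
    """Turn '8.2.1.203' into (8, 2, 1, 203) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def outdated_drivers(installed: dict, expected: dict = EXPECTED) -> list:
    """Return the driver names that are missing or older than expected.

    'installed' maps a driver name to its version string; how that data is
    gathered from the VDA is outside this sketch.
    """
    stale = []
    for name, wanted in expected.items():
        have = installed.get(name)
        if have is None or parse_version(have) < parse_version(wanted):
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical inventory of one VDA; only xenvbd is still on 8.1.x here.
    sample = {
        "xenvbd": "8.1.0.5",
        "xennet": "8.2.1.102",
        "xenvif": "8.2.1.155",
        "xenbus": "8.2.1.117",
    }
    print(outdated_drivers(sample))
```

Comparing as integer tuples avoids the string-comparison trap where "8.10" sorts before "8.2".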

 

One other problem we ran into is that the update provided by MS is for Windows 10 1703 and later, but my VMs are on the Windows 10 1607 build. So right now I am in the process of updating the build to 1703/1709 and still need to test the suggested workaround.

 

Will post updates later...

  • 0

Update - I have upgraded a persistent VDA to version 1709.

 

Applicable patches for 1709 are:

 

Citrix-Net-7-31-201712-00-00AM-8.2.1.102.cab

Citrix-System-1-25-201812-00-00AM-8.2.1.155.cab

Citrix-System-4-5-201812-00-00AM-8.2.1.124.cab

Citrix-System-11-28-201712-00-00AM-8.2.1.117.cab

 

But installing the updates using dism fails:

 

dism.exe /online /add-package /packagepath:c:\tools

 

Error :

 

Dism.exe /online /add-package /packagepath:c:\tools

Deployment Image Servicing and Management tool
Version: 10.0.16299.15

Image Version: 10.0.16299.492

An error occurred trying to open - Citrix-Net-7-31-201712-00-00AM-8.2.1.102_1709.cab Error: 0x80070002
An error occurred trying to open - Citrix-System-1-25-201812-00-00AM-8.2.1.155_1709.cab Error: 0x80070002
An error occurred trying to open - Citrix-System-11-28-201712-00-00AM-8.2.1.117_1709.cab Error: 0x80070002
An error occurred trying to open - Citrix-System-4-5-201812-00-00AM-8.2.1.124_1709.cab Error: 0x80070002

Error: 2

The system cannot find the file specified.

The DISM log file can be found at C:\WINDOWS\Logs\DISM\dism.log
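
Error 0x80070002 is "file not found", and the file names in the DISM output carry a `_1709` suffix that the list of patches above does not, so one cheap first check is whether the expected names really exist in the folder. This is a small Python sketch of that check only; whether a name mismatch is the actual cause here is not established in this thread, and the helper name is made up.

```python
import os


def split_by_presence(expected_names, present_names):
    """Split expected file names into (found, missing), ignoring case.

    'present_names' is a plain directory listing; Windows file names are
    case-insensitive, so both sides are compared in lower case.
    """
    present = {name.lower() for name in present_names}
    found = [n for n in expected_names if n.lower() in present]
    missing = [n for n in expected_names if n.lower() not in present]
    return found, missing


if __name__ == "__main__":
    # Names copied from the DISM error output above.
    expected = [
        "Citrix-Net-7-31-201712-00-00AM-8.2.1.102_1709.cab",
        "Citrix-System-1-25-201812-00-00AM-8.2.1.155_1709.cab",
        "Citrix-System-11-28-201712-00-00AM-8.2.1.117_1709.cab",
        "Citrix-System-4-5-201812-00-00AM-8.2.1.124_1709.cab",
    ]
    folder = r"c:\tools"  # package path used in the dism command above
    listing = os.listdir(folder) if os.path.isdir(folder) else []
    found, missing = split_by_presence(expected, listing)
    print("found:", found)
    print("missing:", missing)
```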

  • 0

I have faced the same issue, but in a different environment with different hypervisors.

 

Fault state: FailedToStart

Failure reason: Hypervisor reported failure

Studio: Registration missing events

 

Issue: Server OS VDAs failed to power on after a scheduled restart, resulting in FailedToStart, and Desktop OS VDAs failed to power on under the controller power management policy, resulting in Failed Desktop. These show up as FailedToStart under Failed Desktop OS or Failed Server OS Machines in Director, and as Registration Missing events in the Studio console. (The actual state of the VDA is powered off in the hypervisor, but Director appears to show it as Turning On.)

 

But the symptoms of this issue are in no way related to the one discussed earlier in this thread.

In this new environment the workloads are hosted on Nutanix AHV and ESXi 6.5. When a VDA restarts due to a controller power action or a scheduled restart, the power-on action fails.

 

XenDesktop 7.17 (OS: Windows Server 2012 R2)
VDA 7.17 (OS : Windows Server 2012 R2/ Win 10 1709/1607) 
Nutanix AHV MCS Plugin For Citrix XenDesktop 7.9 or Later 1.1.2.0 
VMware ESXi, 6.5.0, 13004031 (ServerOS) on Nutanix box
NX-T00-4NL3-G5 (Nutanix) (Desktop OS/ AHV 5.10.4)

 

If I check the reason for failure of these power actions, they are registered as: FailureReason : Failure in ResetVM, GeneralFailure

 

From the Nutanix tasks and events I see that the power actions are timing out.

 

--This seems to be a new issue, and it occurs on a daily basis.

--For now, as a workaround, I have written a PowerShell script that takes a power action whenever a VDA shows up in Studio with Fault State FailedToStart.
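
The script itself is not posted here, so what follows is only a sketch of the selection logic such a workaround needs, written in Python rather than PowerShell. It assumes the machine records were exported beforehand with the `MachineName`, `FaultState` and `PowerState` properties of `Get-BrokerMachine` as plain strings, and it only builds `New-BrokerHostingPowerAction` command lines instead of running anything. The helper names and the sample records are made up.

```python
def machines_to_power_on(machines):
    """Names of machines with Fault State FailedToStart that are powered off.

    'machines' is a list of dicts exported from the Broker (assumption:
    enum values were exported as strings, not numbers).
    """
    return [
        m["MachineName"]
        for m in machines
        if m.get("FaultState") == "FailedToStart" and m.get("PowerState") == "Off"
    ]


def power_on_commands(names):
    """Build one PowerShell command line per machine; nothing is executed."""
    return [
        f"New-BrokerHostingPowerAction -MachineName '{name}' -Action TurnOn"
        for name in names
    ]


if __name__ == "__main__":
    # Hypothetical export; the machine names are placeholders.
    export = [
        {"MachineName": "CSB", "FaultState": "FailedToStart", "PowerState": "Off"},
        {"MachineName": "ICI", "FaultState": "None", "PowerState": "On"},
    ]
    for line in power_on_commands(machines_to_power_on(export)):
        print(line)
```

Filtering on the power state as well as the fault state avoids queuing a second TurnOn for a machine that has already come up by other means.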

 

Working with Nutanix and Citrix support; will post updates once we get a clue or a possible fix.

  • 0

One other possible reason for the FailedToStart issue was VDAs failing to start from their power schedule (specific to VMware hosting connections only).

 

It is worth comparing the thumbprint of the certificate imported on the Delivery Controller with the one written to the database (Site database, table name: HypervisorConnectionSSLThumbprint). In my case it was weird: out of 4 VMware hosting connections, I could see only 2 entries in the table (for the ESXi 5.5 connections), and the entries for the other 2 hosting connections (VMware ESXi 6.0/6.5) were missing.

 

This was identified when we ran Test Connection against the hosting connection in the Studio console. The test came out successful, but the DC was not able to connect to vCenter due to a cert error:

~~~~

Cannot connect to the vCenter server due to a certificate error. Make sure that the appropriate certificates are installed on the VCenter server, and install appropriate certificates on every controller in the site

~~~~

 

But I have the right certificates on all the DCs, in the right stores, and they are trusted. Following Carl's blog post (https://www.carlstalhood.com/delivery-controller-cr-and-licensing/#vcenter - Hosting Resources section, point no. 9), I checked the cert thumbprint in the DB and, surprisingly, the hosting connection entry was missing. If I create a new hosting connection, it is written to the database tables (Test Connection also did not throw any cert errors), but the entry for the existing hosting connection was missing.

 

This was causing VDAs to fail to power on from their daily reboot schedules, resulting in FailedToStart errors. We were not able to power them on from Studio/Director or through PowerShell commands. The only way was to log in to the vSphere web console and power them on there. Rebooting the Delivery Controller resolves the issue temporarily, but after a week or so it comes back.

 

I guess the only option is to create a new hosting connection for the missing hypervisor entries and move the VDAs to the newly created hosting connection.
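
For the thumbprint comparison described above, the fiddly part is that the same thumbprint can be written with spaces, colons or in lower case depending on where it was copied from. This is a minimal Python sketch of computing and comparing thumbprints. It assumes the database stores the SHA-1 thumbprint as hex text, which I have not verified; the host name is a placeholder and the function names are made up.

```python
import hashlib
import ssl


def sha1_thumbprint(der_cert: bytes) -> str:
    """SHA-1 of the DER-encoded certificate, as upper-case hex."""
    return hashlib.sha1(der_cert).hexdigest().upper()


def normalize(thumbprint: str) -> str:
    """Drop spaces, colons and other separators, then upper-case."""
    return "".join(ch for ch in thumbprint if ch.isalnum()).upper()


def thumbprints_match(first: str, second: str) -> bool:
    """Compare two thumbprints regardless of how they were formatted."""
    return normalize(first) == normalize(second)


def fetch_thumbprint(host: str, port: int = 443) -> str:
    """Fetch the certificate a server presents and return its thumbprint."""
    pem = ssl.get_server_certificate((host, port))
    return sha1_thumbprint(ssl.PEM_cert_to_DER_cert(pem))


if __name__ == "__main__":
    # "vcenter.example.local" is a placeholder; the second value stands in
    # for whatever was read from the database table mentioned above.
    from_db = "AB CD EF ..."
    print(thumbprints_match(fetch_thumbprint("vcenter.example.local"), from_db))
```

This only helps when an entry exists in the table; a missing entry, as in my case, still has to be fixed on the hosting connection side.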

 

