Kernel panic on patching to 6.2 SP1

Started by Jamie Knight , 09 June 2014 - 10:58 AM
17 replies to this topic

Jamie Knight Members

Jamie Knight
  • 12 posts

Posted 09 June 2014 - 10:58 AM

Hi,

 

We have a server here with a brand new install of XenServer 6.2. All installed okay and booted into dom0 without any issues.

The first thing we did was use xe patch-upload and xe patch-pool-apply to install XS62ESP1, which appeared to complete successfully. However, on reboot the kernel panics because it cannot find /dev/root. (See attached screenshot.)

 

If I flip back to using the fallback boot option, it boots up correctly. The safe option fails with the same kernel panic message.

 

I had a look around on here for people with similar issues - sadly, nothing suggested in those threads appears to have worked for us. Does anyone have any suggestions as to why this might be, or steps to diagnose it further?

 

Best regards,

Jamie
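For readers following along, the upload-and-apply step described above can be sketched roughly as below. This is only an illustration, not the exact commands the poster ran: the function name apply_pool_patch and the XE override variable are made up for this sketch, and the path to the hotfix file is whatever you downloaded. Note that the xe CLI names the second step patch-pool-apply, and that patch-upload prints the UUID of the uploaded patch.

```shell
# Command used to talk to the pool; overridable so the sketch can be dry-run.
XE="${XE:-xe}"

# Upload a hotfix file and apply it to every host in the pool.
# apply_pool_patch is a hypothetical helper name, not part of XenServer.
apply_pool_patch() {
    patch_file="$1"
    [ -r "$patch_file" ] || { echo "cannot read $patch_file" >&2; return 1; }

    # patch-upload prints the UUID of the uploaded patch on stdout.
    patch_uuid="$("$XE" patch-upload file-name="$patch_file")" || return 1
    echo "uploaded patch, uuid=$patch_uuid" >&2

    # Apply the uploaded patch across the pool.
    "$XE" patch-pool-apply uuid="$patch_uuid"
}
```

Usage would be something like `apply_pool_patch /path/to/hotfix-file`, run from dom0 before rebooting.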



Jamie Knight Members

Jamie Knight
  • 12 posts

Posted 09 June 2014 - 11:17 AM

Hi,

 

I should probably add - the server in question is a white-box Intel one. The board is an Intel S2600GL with two Xeon E5-2630 processors; firmware on the board has been updated to the latest available from Intel.

 

Best regards,

Jamie



konvikkt1 Members

kon vikkt
  • 782 posts

Posted 09 June 2014 - 11:19 AM

What is the kernel panic trace?



Malcolm Crossley Citrix Employees

Malcolm Crossley
  • 34 posts

Posted 09 June 2014 - 11:44 AM

Can you try disabling PCI 64bit BAR support in the BIOS?



Jamie Knight Members

Jamie Knight
  • 12 posts

Posted 09 June 2014 - 11:44 AM

Hi Kon,

 

The only trace I have is the call trace - as it's panicked, I can't find a crash dump anywhere. There should have been a screenshot of it on the first post! (Hopefully attached here.)

 

Thanks,

Jamie

Attached Thumbnails

  • Screenshot from 2014-06-09 09:12:46.png


Jamie Knight Members

Jamie Knight
  • 12 posts

Posted 09 June 2014 - 11:53 AM

Hi Malcolm,

 

There doesn't appear to be an option to turn it on or off on this board.

 

"Maximize Memory below 4GB" is currently disabled as per previous threads. Not sure if that's related?

 

Thanks,

Jamie



Jamie Knight Members

Jamie Knight
  • 12 posts

Posted 09 June 2014 - 12:30 PM

Please find attached the full console output from the crash - hope it's of use.

 

 

Attached Files



konvikkt1 Members

kon vikkt
  • 782 posts

Posted 09 June 2014 - 01:08 PM

Have a look at this, it might help: http://discussions.citrix.com/topic/292864-xenserver-continues-to-reboot-with-kernel-panic-could-not-find-filesystem-devroot/



Jamie Knight Members

Jamie Knight
  • 12 posts

Posted 09 June 2014 - 01:45 PM

Hi Kon,

 

Just tried it - sadly still not able to find the filesystem.

 

Jamie



James Cannon Citrix Employees
  • #10

James Cannon
  • 4,402 posts

Posted 09 June 2014 - 02:15 PM

When you applied SP1 (and before rebooting), did you also apply driver updates? Is the firmware up to date?



konvikkt1 Members
  • #11

kon vikkt
  • 782 posts

Posted 09 June 2014 - 03:10 PM

I believe it's booting from local disk and not a "boot from SAN".

Alright, not sure if it's feasible, but can you try disabling multipath and then installing the hotfix on the server? Maybe on a test server?



James Cannon Citrix Employees
  • #12

James Cannon
  • 4,402 posts

Posted 09 June 2014 - 09:35 PM

It could be that the HPSA driver (if this is HP) needs to be updated.



Jamie Knight Members
  • #13

Jamie Knight
  • 12 posts

Posted 10 June 2014 - 08:06 AM

There were no additional drivers required - at least for the initial install. Can't seem to find any XenServer-specific drivers on Intel's site, plus the HCL suggests support for it has been built in since 6.0.

 

@James it's an Intel board, so hopefully not! :)

 

I had updated the various firmwares to the latest ones provided by Intel before starting the install.

 

Yes, it is indeed booting from local disk. (Connected via AHCI - no RAID or anything special.) The storage controller option in the BIOS is set to Intel RSTe - which is the default. Multipathing hasn't been enabled. (As confirmed by the output of the multipath command: "DM multipath kernel driver not loaded".)
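As a cross-check on the multipath message quoted above, one could also look for the module directly in /proc/modules. A minimal sketch, assuming the device-mapper multipath module appears there as dm_multipath; module_loaded is a hypothetical helper name:

```shell
# Return 0 if the named kernel module is listed in the modules file.
# The second argument defaults to /proc/modules; it exists only so the
# check can be pointed at a saved copy.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example, `module_loaded dm_multipath || echo "multipath module not loaded"`.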



konvikkt1 Members
  • #14

kon vikkt
  • 782 posts

Posted 10 June 2014 - 12:34 PM

I am not really sure, but you might need to review the init script from the initrd image to see which device it is mounting.
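One way to do that review is to unpack both initrd images side by side and diff their init scripts. The sketch below assumes the images are gzip-compressed cpio archives (common for kernels of this era, but not confirmed in this thread); unpack_initrd and diff_init are hypothetical helper names, and the image paths are left for you to fill in.

```shell
# Unpack a gzip-compressed cpio initrd image into a destination directory.
unpack_initrd() {
    img="$1" dest="$2"
    mkdir -p "$dest" || return 1
    # gzip reads the image from the current directory; only cpio runs in $dest.
    gzip -dc "$img" | (cd "$dest" && cpio -idm --quiet)
}

# Diff the init scripts of two unpacked initrd directories.
diff_init() {
    diff "$1/init" "$2/init"
}
```

Typical use would be to call unpack_initrd once for the working (fallback) image and once for the failing one, each into its own scratch directory, and then run diff_init on the two directories.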



Jamie Knight Members
  • #15

Jamie Knight
  • 12 posts

Posted 10 June 2014 - 02:16 PM

Had a quick look at the init file for both initrds - the following is the only difference between them:

 

> echo "Loading dm-mirror.ko module"
> insmod /lib/dm-mirror.ko
> echo "Loading dm-zero.ko module"
> insmod /lib/dm-zero.ko
> echo "Loading dm-snapshot.ko module"
> insmod /lib/dm-snapshot.ko

 

The only structural difference I can see between the two initrd files is the addition of the modules mentioned above.

 

Looking closer at the boot log, I've noticed the following difference between the failed and working boots.

 

< [    7.488468] isci: Intel(R) C600 SAS Controller Driver - version 1.1.0
< [    7.488644] isci 0000:07:00.0: driver configured for rev: 6 silicon
< [    7.488784] isci 0000:07:00.0: OEM parameter table found in OROM
< [    7.488923] isci 0000:07:00.0: OEM SAS parameters (version: 1.1) loaded (platform)
< [    7.489108] isci 0000:07:00.0: can't find IRQ for PCI INT A; probably buggy MP table
< [    7.717950] usb 1-1: new high speed USB device using ehci_hcd and address 2
< [   12.467158] isci: probe of 0000:07:00.0 failed with error -12
---
> [    7.514857] isci: Intel(R) C600 SAS Controller Driver - version 1.1.0
> [    7.515016] isci 0000:07:00.0: driver configured for rev: 6 silicon
> [    7.515157] isci 0000:07:00.0: OEM parameter table found in OROM
> [    7.515295] isci 0000:07:00.0: OEM SAS parameters (version: 1.1) loaded (platform)
> [    7.515481] isci 0000:07:00.0: can't find IRQ for PCI INT A; probably buggy MP table
> [    7.515933] isci 0000:07:00.0: SCU controller 0: phy 3-0 cables: {short, short, short, short}
> [    7.519813] scsi0 : isci
> [    7.520372] isci 0000:07:00.0: get owner: 7ff0
> [    7.520583] isci 0000:07:00.0: get owner: 7ff0
> [    7.731361] usb 1-1: new high speed USB device using ehci_hcd and address 2
> [    7.820677] ata1.00: ATA-8: WDC WD2503ABYZ-011FA0, 01.01S03, max UDMA/133
> [    7.820847] ata1.00: 490350672 sectors, multi 0: LBA48 NCQ (depth 31/32)
> [    7.821704] ata1.00: configured for UDMA/133
> [    7.821889] scsi 0:0:0:0: Direct-Access     ATA      WDC WD2503ABYZ-0 01.0 PQ: 0 ANSI: 5
> [    7.822224] sd 0:0:0:0: [sda] 490350672 512-byte logical blocks: (251 GB/233 GiB)
> [    7.822426] sd 0:0:0:0: [sda] Write Protect is off
> [    7.822588] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    7.822846] sda: detected capacity change from 0 to 251059544064
> [    7.822986]  sda: sda1 sda2
> [    7.962082] sd 0:0:0:0: [sda] Attached SCSI disk

 

So it appears to be failing to detect the disks. Any ideas as to why this might be?
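For anyone repeating this comparison, the kernel timestamps differ on every boot, so a plain diff of two boot logs is noisy. A minimal sketch that strips the timestamps first and picks out probe failures is below; the function names are made up for illustration. As a side note, error -12 corresponds to ENOMEM on Linux, which would be consistent with the controller failing to get a memory mapping, though that is an inference rather than something the log states.

```shell
# Strip the "[    7.488468] " timestamp prefix so two boots can be compared.
strip_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' "$1"
}

# Print driver probe failures, e.g. "isci: probe of ... failed with error -12".
probe_failures() {
    strip_ts "$1" | grep -E 'probe of .* failed with error -?[0-9]+'
}

# Diff two boot logs with timestamps removed; returns diff's exit status.
diff_boots() {
    a="$(mktemp)" b="$(mktemp)"
    strip_ts "$1" > "$a"
    strip_ts "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}
```

Running `probe_failures` on the failing boot's log should surface the isci line quoted above without having to read the whole diff.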



Jamie Knight Members
  • #16

Jamie Knight
  • 12 posts

Posted 10 June 2014 - 02:45 PM

Finally found a fix!

 

Under the BIOS > Advanced > PCI Configuration, disabling "Memory Mapped I/O above 4GB" appears to fix the issue. I suspect it might be related to what Malcolm mentioned?

 

Thanks Kon, James and Malcolm! :)


Best Answer

James Cannon Citrix Employees
  • #17

James Cannon
  • 4,402 posts

Posted 10 June 2014 - 04:44 PM

OK, that may be IOMMU-related perhaps (I'm not familiar with your hardware). Sometimes adding no iommu in /boot/extlinux.conf resolves this. The BIOS setting is the best bet, and I am glad you have the issue resolved.
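Before editing /boot/extlinux.conf, it can help to see which arguments each boot entry (for example the fallback and safe options mentioned earlier in the thread) currently passes. The sketch below only reads the file and prints each label with its append line; show_boot_args is a hypothetical helper name, and the exact IOMMU parameter to add is deliberately not shown because the thread does not spell it out.

```shell
# Print "<label>: append ..." for every boot entry in an extlinux config.
# Defaults to /boot/extlinux.conf; pass another path to inspect a copy.
show_boot_args() {
    awk '
        tolower($1) == "label"  { label = $2 }
        tolower($1) == "append" { sub(/^[ \t]+/, ""); print label ": " $0 }
    ' "${1:-/boot/extlinux.conf}"
}
```

Taking a backup copy of the file before changing any append line is a sensible precaution, since a typo there can leave the default entry unbootable.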



Jamie Knight Members
  • #18

Jamie Knight
  • 12 posts

Posted 11 June 2014 - 09:13 AM

Thanks for the tip - I'll bear that in mind for the future.