Fedora Linux Support Community & Resources Center
  #1  
Old 16th January 2009, 07:00 AM
veritgo Offline
Registered User
 
Join Date: Feb 2007
Posts: 8
f10 x86_64 xen VM guests fail to boot on f8 host

I have two machines running fresh installs of f8 with Xen. The kernel and all software versions are the same on both.
Specifically:

Code:
[root@machineA boot]# uname -a
Linux machineA 2.6.21.7-5.fc8xen #1 SMP Thu Aug 7 12:44:22 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@machineA boot]# virsh version
Compiled against library: libvir 0.4.4
Using library: libvir 0.4.4
Using API: Xen 3.0.1
Running hypervisor: Xen 3.1.0
And
Code:
[root@machineB ~]# uname -a
Linux machineB 2.6.21.7-5.fc8xen #1 SMP Thu Aug 7 12:44:22 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@machineB ~]# virsh version
Compiled against library: libvir 0.4.4
Using library: libvir 0.4.4
Using API: Xen 3.0.1
Running hypervisor: Xen 3.1.0
MachineA has two AMD Opteron 275s. MachineB has four Intel(R) Xeon(TM) CPU 2.80GHz processors.

Both machines are as up to date as possible.

I can boot or create x86_64 f10 guests on MachineA with no trouble whatsoever.

MachineB will not boot/create x86_64 f10 guests.

The configuration files are created in the same manner, but as soon as Xen tries to unpause the newly created domain, it crashes pretty much instantly.



Relevant output from /var/log/xen/xend.log:
Code:
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vtpm.
[2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.
[2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1802) XendDomainInfo.destroy: domid=21
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1821) XendDomainInfo.destroyDomain(21)
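The crash report can be picked out of a longer xend.log mechanically. A minimal sketch in Python, assuming the log keeps the line format shown above; the regular expression and the helper name find_crashes are illustrative and not part of xend:

```python
import re

# Matches xend.log lines such as:
# [2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
CRASH_RE = re.compile(
    r"^\[(?P<when>[\d-]+ [\d:]+) \d+\] WARNING \(XendDomainInfo:\d+\) "
    r"Domain has crashed: name=(?P<name>\S+) id=(?P<domid>\d+)\."
)

def find_crashes(lines):
    """Yield (timestamp, domain name, domain id) for each crash report."""
    for line in lines:
        m = CRASH_RE.match(line)
        if m:
            yield m.group("when"), m.group("name"), int(m.group("domid"))

# Two lines copied from the excerpt above.
sample = [
    "[2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.",
    "[2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.",
]
crashes = list(find_crashes(sample))
print(crashes)
```

To scan the real file, pass an open handle for /var/log/xen/xend.log instead of the sample list.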
I've also tried moving a functional guest from MachineA to MachineB to boot it there, with the same results. Guest will not boot on MachineB.

f8 64bit guests will boot on MachineB with no problems.
f10 32bit guests will boot on MachineB with no problems.

Only 64bit f10 guests seem to be borked.

Any information / help / insight as to why this is happening would be very much appreciated. The machines are pretty similar, and since the guests are paravirtualized it does not really make sense for the processors to be the cause of the problem.

Thanks,
jon
  #2  
Old 16th January 2009, 07:00 AM
veritgo Offline
Registered User
 
Join Date: Feb 2007
Posts: 8
Full corresponding dump of xend.log:

Code:
[2009-01-16 14:45:29 4120] DEBUG (XendDomainInfo:84) XendDomainInfo.create(['vm', ['name', 'f10testB'], ['memory', '512'], ['maxmem', '512'], ['vcpus', '1'], ['uuid', '9e415006-b708-6918-2125-be1459a0c376'], ['on_poweroff', 'destroy'], ['on_reboot', 'destroy'], ['on_crash', 'destroy'], ['image', ['linux', ['kernel', '/var/lib/xen/virtinst-vmlinuz.8-3i5u'], ['ramdisk', '/var/lib/xen/virtinst-initrd.img.tHZaoA'], ['args', 'ks=http://10.0.13.215/xenf10ks.cfg method=http://kickstart.server/fedora/os/10/x86_64/']]], ['device', ['vbd', ['dev', 'xvda'], ['uname', 'file:/opt/vm/f10onf8/f10testB.img'], ['mode', 'w']]], ['device', ['vif', ['mac', '00:16:3e:52:31:eb'], ['bridge', 'eth0']]]])
[2009-01-16 14:45:29 4120] DEBUG (XendDomainInfo:1555) XendDomainInfo.constructDomain
[2009-01-16 14:45:29 4120] DEBUG (balloon:116) Balloon: 528844 KiB free; need 2048; done.
[2009-01-16 14:45:29 4120] DEBUG (XendDomain:443) Adding Domain: 21
[2009-01-16 14:45:29 4120] DEBUG (XendDomainInfo:1609) XendDomainInfo.initDomain: 21 256
[2009-01-16 14:45:29 4120] DEBUG (XendDomainInfo:1640) _initDomain:shadow_memory=0x0, memory_static_max=0x20000000, memory_static_min=0x0.
[2009-01-16 14:45:29 4120] DEBUG (balloon:116) Balloon: 528836 KiB free; need 524288; done.
[2009-01-16 14:45:29 4120] INFO (image:129) buildDomain os=linux dom=21 vcpus=1
[2009-01-16 14:45:29 4120] DEBUG (image:198) domid          = 21
[2009-01-16 14:45:29 4120] DEBUG (image:199) memsize        = 512
[2009-01-16 14:45:29 4120] DEBUG (image:200) image          = /var/lib/xen/virtinst-vmlinuz.8-3i5u
[2009-01-16 14:45:29 4120] DEBUG (image:201) store_evtchn   = 1
[2009-01-16 14:45:29 4120] DEBUG (image:202) console_evtchn = 2
[2009-01-16 14:45:29 4120] DEBUG (image:203) cmdline        = ks=http://10.0.13.215/xenf10ks.cfg method=http://kickstart.sys.intra/fedora/os/10/x86_64/
[2009-01-16 14:45:30 4120] DEBUG (image:204) ramdisk        = /var/lib/xen/virtinst-initrd.img.tHZaoA
[2009-01-16 14:45:30 4120] DEBUG (image:205) vcpus          = 1
[2009-01-16 14:45:30 4120] DEBUG (image:206) features       =
[2009-01-16 14:45:30 4120] INFO (XendDomainInfo:1458) createDevice: vbd : {'uuid': '8e5c6770-489b-97e0-8331-5429648a9d7c', 'bootable': 1, 'driver': 'paravirtualised', 'dev': 'xvda', 'uname': 'file:/opt/vm/f10onf8/f10testB.img', 'mode': 'w'}
[2009-01-16 14:45:30 4120] DEBUG (DevController:117) DevController: writing {'virtual-device': '51712', 'device-type': 'disk', 'protocol': 'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend': '/local/domain/0/backend/vbd/21/51712'} to /local/domain/21/device/vbd/51712.
[2009-01-16 14:45:30 4120] DEBUG (DevController:119) DevController: writing {'domain': 'f10testB', 'frontend': '/local/domain/21/device/vbd/51712', 'uuid': '8e5c6770-489b-97e0-8331-5429648a9d7c', 'format': 'raw', 'dev': 'xvda', 'state': '1', 'params': '/opt/vm/f10onf8/f10testB.img', 'mode': 'w', 'online': '1', 'frontend-id': '21', 'type': 'file'} to /local/domain/0/backend/vbd/21/51712.
[2009-01-16 14:45:31 4120] INFO (XendDomainInfo:1458) createDevice: vif : {'bridge': 'eth0', 'mac': '00:16:3e:52:31:eb', 'uuid': 'a8c5be37-9b04-e638-7541-4c333edd43a2'}
[2009-01-16 14:45:31 4120] DEBUG (DevController:117) DevController: writing {'mac': '00:16:3e:52:31:eb', 'handle': '0', 'protocol': 'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/21/0'} to /local/domain/21/device/vif/0.
[2009-01-16 14:45:31 4120] DEBUG (DevController:119) DevController: writing {'bridge': 'eth0', 'domain': 'f10testB', 'handle': '0', 'uuid': 'a8c5be37-9b04-e638-7541-4c333edd43a2', 'script': '/etc/xen/scripts/vif-bridge', 'state': '1', 'frontend': '/local/domain/21/device/vif/0', 'mac': '00:16:3e:52:31:eb', 'online': '1', 'frontend-id': '21'} to /local/domain/0/backend/vif/21/0.
[2009-01-16 14:45:31 4120] DEBUG (XendDomainInfo:2116) Storing VM details: {'on_xend_stop': 'ignore', 'shadow_memory': '0', 'uuid': '9e415006-b708-6918-2125-be1459a0c376', 'on_reboot': 'destroy', 'start_time': '1232084731.11', 'on_poweroff': 'destroy', 'on_xend_start': 'ignore', 'on_crash': 'destroy', 'xend/restart_count': '0', 'vcpus': '1', 'vcpu_avail': '1', 'image': "(linux (kernel /var/lib/xen/virtinst-vmlinuz.8-3i5u) (ramdisk /var/lib/xen/virtinst-initrd.img.tHZaoA) (args 'ks=http://10.0.13.215/xenf10ks.cfg method=http://kickstart.sys.intra/fedora/os/10/x86_64/') (notes (HV_START_LOW 18446603336221196288) (FEATURES '!writable_page_tables|pae_pgdir_above_4gb') (VIRT_BASE 18446744071562067968) (GUEST_VERSION 2.6) (PADDR_OFFSET 0) (GUEST_OS linux) (HYPERCALL_PAGE 18446744071578882048) (LOADER generic) (SUSPEND_CANCEL 1) (PAE_MODE yes) (ENTRY 18446744071584731648) (XEN_VERSION xen-3.0)))", 'name': 'f10testB'}
[2009-01-16 14:45:31 4120] DEBUG (XendDomainInfo:956) Storing domain details: {'console/ring-ref': '1150214', 'image/entry': '18446744071584731648', 'console/port': '2', 'store/ring-ref': '1150215', 'image/loader': 'generic', 'vm': '/vm/9e415006-b708-6918-2125-be1459a0c376', 'control/platform-feature-multiprocessor-suspend': '1', 'image/hv-start-low': '18446603336221196288', 'image/guest-os': 'linux', 'image/virt-base': '18446744071562067968', 'memory/target': '524288', 'image/guest-version': '2.6', 'image/pae-mode': 'yes', 'console/limit': '1048576', 'image/paddr-offset': '0', 'image/hypercall-page': '18446744071578882048', 'image/suspend-cancel': '1', 'cpu/0/availability': 'online', 'image/features/pae-pgdir-above-4gb': '1', 'image/features/writable-page-tables': '0', 'name': 'f10testB', 'domid': '21', 'image/xen-version': 'xen-3.0', 'store/port': '1'}
[2009-01-16 14:45:31 4120] DEBUG (DevController:117) DevController: writing {'protocol': 'x86_64-abi', 'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/console/21/0'} to /local/domain/21/device/console/0.
[2009-01-16 14:45:31 4120] DEBUG (DevController:119) DevController: writing {'domain': 'f10testB', 'protocol': 'vt100', 'uuid': 'edeeb8b3-fd69-386e-22de-f95302561c05', 'frontend': '/local/domain/21/device/console/0', 'state': '1', 'location': '2', 'online': '1', 'frontend-id': '21'} to /local/domain/0/backend/console/21/0.
[2009-01-16 14:45:31 4120] DEBUG (XendDomainInfo:1040) XendDomainInfo.handleShutdownWatch
[2009-01-16 14:45:31 4120] DEBUG (DevController:150) Waiting for devices vif.
[2009-01-16 14:45:31 4120] DEBUG (DevController:155) Waiting for 0.
[2009-01-16 14:45:31 4120] DEBUG (DevController:594) hotplugStatusCallback /local/domain/0/backend/vif/21/0/hotplug-status.
[2009-01-16 14:45:31 4120] DEBUG (DevController:608) hotplugStatusCallback 1.
[2009-01-16 14:45:31 4120] DEBUG (DevController:150) Waiting for devices usb.
[2009-01-16 14:45:31 4120] DEBUG (DevController:150) Waiting for devices vbd.
[2009-01-16 14:45:31 4120] DEBUG (DevController:155) Waiting for 51712.
[2009-01-16 14:45:31 4120] DEBUG (DevController:594) hotplugStatusCallback /local/domain/0/backend/vbd/21/51712/hotplug-status.
[2009-01-16 14:45:32 4120] DEBUG (DevController:594) hotplugStatusCallback /local/domain/0/backend/vbd/21/51712/hotplug-status.
[2009-01-16 14:45:32 4120] DEBUG (DevController:608) hotplugStatusCallback 1.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices irq.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vkbd.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vfb.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices console.
[2009-01-16 14:45:32 4120] DEBUG (DevController:155) Waiting for 0.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices pci.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices ioports.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices tap.
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vtpm.
[2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.
[2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1802) XendDomainInfo.destroy: domid=21
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1821) XendDomainInfo.destroyDomain(21)
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1479) Removing vif/0
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:569) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1479) Removing vbd/51712
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:569) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51712
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1479) Removing console/0
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:569) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
  #3  
Old 16th January 2009, 07:01 AM
veritgo Offline
Registered User
 
Join Date: Feb 2007
Posts: 8
Creation script with debug:

Code:
[root@MachineB]# virt-install --name=f10testB -r 512 -f /opt/vm/f10onf8/f10testB.img -s 4 --nographics --os-type=linux --os-variant=fedora10 -l  http://kickstart.server/fedora/os/10/x86_64/ -x  ks=http://<ipaddress>/xenf10ks.cfg -d
Fri, 16 Jan 2009 14:45:25 DEBUG    Disk path not found: Assuming file disk type.
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : failed Xen syscall xenDaemonDomainDumpXMLByID failed to find this domain 1361587548
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : failed Xen syscall xenDaemonDomainDumpXMLByID failed to find this domain 12221216


Starting install...
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:26 DEBUG    Fetching URI http://kickstart.server/fedora/os/10/x86_64//.treeinfo
Fri, 16 Jan 2009 14:45:26 DEBUG    Saved file to /var/lib/xen/virtinst-.treeinfo.wsrJek
Retrieving file .treeinfo...                                                                                         | 1.0 kB     00:00
Fri, 16 Jan 2009 14:45:26 DEBUG    Fetching URI http://kickstart.server/fedora/os/10/x86_64//.treeinfo
Fri, 16 Jan 2009 14:45:26 DEBUG    Saved file to /var/lib/xen/virtinst-.treeinfo.GTqc_W
Retrieving file .treeinfo...                                                                                         | 1.0 kB     00:00
Fri, 16 Jan 2009 14:45:26 DEBUG    Detected a valid .treeinfo file
Fri, 16 Jan 2009 14:45:26 DEBUG    Fetching URI http://kickstart.server/fedora/os/10/x86_64//images/pxeboot/vmlinuz
Retrieving file vmlinuz...                              66% [==============================               ]  0.0 B/s | 1.7 MB     --:-- ETA
Fri, 16 Jan 2009 14:45:26 DEBUG    Saved file to /var/lib/xen/virtinst-vmlinuz.8-3i5u
Retrieving file vmlinuz...                                                                                           | 2.5 MB     00:00
Fri, 16 Jan 2009 14:45:26 DEBUG    Fetching URI http://kickstart.server/fedora/os/10/x86_64//images/pxeboot/initrd.img
Retrieving file initrd.img...                           91% [=========================================    ] 2.8 MB/s |  16 MB     00:00 ETA
Fri, 16 Jan 2009 14:45:29 DEBUG    Saved file to /var/lib/xen/virtinst-initrd.img.tHZaoA
Retrieving file initrd.img...                                                                                        |  17 MB     00:03
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Creating storage file...                                                                                             | 4.0 GB     00:00
Fri, 16 Jan 2009 14:45:29 DEBUG    Creating guest from '<domain type='xen'>
  <name>f10testB</name>
  <currentMemory>524288</currentMemory>
  <memory>524288</memory>
  <uuid>9e415006-b708-6918-2125-be1459a0c376</uuid>
  <os>
    <type>linux</type>
    <kernel>/var/lib/xen/virtinst-vmlinuz.8-3i5u</kernel>
    <initrd>/var/lib/xen/virtinst-initrd.img.tHZaoA</initrd>
    <cmdline>ks=http://10.0.13.215/xenf10ks.cfg method=http://kickstart.server/fedora/os/10/x86_64/</cmdline>
  </os>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>destroy</on_reboot>
  <on_crash>destroy</on_crash>
  <vcpu>1</vcpu>
  <devices>
    <disk type='file' device='disk'>
      <source file='/opt/vm/f10onf8/f10testB.img'/>
      <target dev='xvda'/>
    </disk>

    <interface type='bridge'>
      <source bridge='eth0'/>
      <mac address='00:16:3e:52:31:eb'/>
    </interface>

    <input type='mouse' bus='xen'/>

  </devices>
</domain>
'
Creating domain...                                                                                                   |    0 B     00:02
Fri, 16 Jan 2009 14:45:32 DEBUG    Created guest, looking to see if it is running
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:32 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:33 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:33 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:33 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:34 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:34 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:34 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:34 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:35 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:35 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:35 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:35 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:36 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:36 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:36 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:36 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:37 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:37 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:37 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:37 DEBUG    No guest running yet virDomainLookupByName() failed GET operation failed: xend_get: error from xen daemon:
Fri, 16 Jan 2009 14:45:38 DEBUG    Removing /var/lib/xen/virtinst-vmlinuz.8-3i5u
Fri, 16 Jan 2009 14:45:38 DEBUG    Removing /var/lib/xen/virtinst-initrd.img.tHZaoA
ERROR:  It appears that your installation has crashed.  You should be able to find more information in the logs
  #4  
Old 16th January 2009, 08:37 AM
SlowJet Offline
Registered User
 
Join Date: Jan 2005
Posts: 5,048
See the news forum - Jan 11th news issue 158 Virtualization
http://fedoraproject.org/wiki/FWN/Is...Virtualization
New methods and lists for F10, F10+

SJ
__________________
Do the Math
  #5  
Old 16th January 2009, 08:56 AM
veritgo Offline
Registered User
 
Join Date: Feb 2007
Posts: 8
Thank you for the link, SlowJet.

I checked out the virtualization section of the newsletter and have scanned the December/January threads on both the fedora-virt and fedora-xen lists.

I have not gained any fresh insight yet, but I'll try posting the question to fedora-xen.

Thanks again.
  #6  
Old 20th January 2009, 05:06 AM
veritgo Offline
Registered User
 
Join Date: Feb 2007
Posts: 8
Feedback from fedora-xen

Mark at Red Hat gave me a few things to try:

Quote:
Okay, sounds like it might "just" be a F10 kernel bug.

Try doing this to get a stack trace:

1) Set "on_crash=preserve" in your domain config

2) Copy the guest kernel's System.map to the host

3) Once the guest has crashed, run:

/usr/lib/xen/bin/xenctx -s System.map <domid>

Cheers,
Mark.
I copied a functioning guest from MachineA to MachineB and followed the steps above.

The output is as follows:
Code:
/usr/lib64/xen/bin/xenctx -s System.map-2.6.27.5-117.fc10.x86_64 46
rip: ffffffff8100b8a2 set_page_prot+0x6d
rsp: ffffffff81573f08
rax: ffffffea   rbx: 000016e1   rcx: 00000055   rdx: 00000000
rsi: 800000014ffc6061   rdi: ffffffff816e1000   rbp: ffffffff81573f68
 r8: 0000000f    r9: ffffffff817eb450   r10: ffffffff817eb650   r11: 00000010
r12: ffffffff816e1000   r13: 800000014ffc6061   r14: 8000000000000161   r15: 00000016
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
 ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000

Code:
7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e

Call Trace:
  [<ffffffff8100b8a2>] set_page_prot+0x6d <--
  [<ffffffff8100b8a2>] set_page_prot+0x6d
  [<ffffffff8100b89e>] set_page_prot+0x69
  [<ffffffff815a3c60>] xen_start_kernel+0x5dd
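The symbol+offset form in the trace can be reproduced by hand from a System.map. A rough sketch, assuming the usual "address type name" layout of System.map. The two entries below are back-computed from the trace (rip minus 0x6d, and ffffffff815a3c60 minus 0x5dd), the type letters are guesses, and load_symbols and resolve are hypothetical helper names:

```python
import bisect

# Hypothetical System.map excerpt; addresses derived from the trace above,
# type letters (t/T) assumed.
SYSTEM_MAP = """\
ffffffff8100b835 t set_page_prot
ffffffff815a3683 T xen_start_kernel
"""

def load_symbols(text):
    """Parse 'address type name' lines into a list sorted by address."""
    syms = []
    for line in text.splitlines():
        addr, _type, name = line.split()
        syms.append((int(addr, 16), name))
    return sorted(syms)

def resolve(syms, addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr."""
    addrs = [a for a, _ in syms]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return hex(addr)          # address lies below every known symbol
    base, name = syms[i]
    return f"{name}+0x{addr - base:x}"

symbols = load_symbols(SYSTEM_MAP)
print(resolve(symbols, 0xffffffff8100b8a2))   # rip from the dump
print(resolve(symbols, 0xffffffff815a3c60))   # return address on the stack
```

With a full System.map the same lookup works for any address in the stack dump, which is handy when xenctx is not available on the host.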
  #7  
Old 20th January 2009, 05:08 AM
veritgo Offline
Registered User
 
Join Date: Feb 2007
Posts: 8
xm dmesg

In addition to the logs, there seems to be some relevant data in Xen's dmesg, accessed with the command

Code:
xm dmesg
Specifically, this line:
(XEN) traps.c:405:d46 Unhandled invalid opcode fault/trap [#6] in domain 46 on VCPU 0 [ec=0000]

Context:
Code:
(XEN) mm.c:1362:d46 Bad L1 flags 800000
(XEN) traps.c:405:d46 Unhandled invalid opcode fault/trap [#6] in domain 46 on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 46 (vcpu#0) crashed on cpu#2:
(XEN) ----[ Xen-3.1.4  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    2
(XEN) RIP:    e033:[<ffffffff8100b8a2>]
(XEN) RFLAGS: 0000000000000282   CONTEXT: guest
(XEN) rax: 00000000ffffffea   rbx: 00000000000016e1   rcx: 0000000000000055
(XEN) rdx: 0000000000000000   rsi: 800000014ffc6061   rdi: ffffffff816e1000
(XEN) rbp: ffffffff81573f68   rsp: ffffffff81573f08   r8:  000000000000000f
(XEN) r9:  ffffffff817eb450   r10: ffffffff817eb650   r11: 0000000000000010
(XEN) r12: ffffffff816e1000   r13: 800000014ffc6061   r14: 8000000000000161
(XEN) r15: 0000000000000016   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 0000000144f18000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81573f08:
(XEN)    0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
(XEN)    0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
(XEN)    0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
(XEN)    ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffffffff8208b000
(XEN)    0000000000010000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffffffff82008000
(XEN)    ffffffff82009000 ffffffff8200a000 ffffffff8200b000 ffffffff8200c000
(XEN)    ffffffff8200d000 ffffffff8200e000 ffffffff8200f000 ffffffff82010000
(XEN)    ffffffff82011000 ffffffff82012000 ffffffff82013000 ffffffff82014000
(XEN)    ffffffff82015000 ffffffff82016000 ffffffff82017000 ffffffff82018000
(XEN)    ffffffff82019000 ffffffff8201a000 ffffffff8201b000 ffffffff8201c000
(XEN)    ffffffff8201d000 ffffffff8201e000 ffffffff8201f000 ffffffff82020000
(XEN)    ffffffff82021000 ffffffff82022000 ffffffff82023000 ffffffff82024000
(XEN)    ffffffff82025000 ffffffff82026000 ffffffff82027000 ffffffff82028000
(XEN)    ffffffff82029000 ffffffff8202a000 ffffffff8202b000 ffffffff8202c000
(XEN)    ffffffff8202d000 ffffffff8202e000 ffffffff8202f000 ffffffff82030000
(XEN)    ffffffff82031000 ffffffff82032000 ffffffff82033000 ffffffff82034000
(XEN)    ffffffff82035000 ffffffff82036000 ffffffff82037000 ffffffff82038000
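One detail in the register dump may be worth decoding: rax is 00000000ffffffea. A small sketch, assuming rax still holds a 32-bit return value from the call made just before the fault (that is an assumption, the dump itself does not say so); as_signed32 is a hypothetical helper:

```python
import errno

def as_signed32(value):
    """Interpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

rax = 0x00000000FFFFFFEA              # rax from the register dump above
ret = as_signed32(rax)
# Map the negated value to an errno name, if there is one.
name = errno.errorcode.get(-ret, "unknown")
print(ret, name)
```

If the assumption holds, the value reads as a negative errno, which would fit a call that was rejected rather than one that crashed on its own.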
  #8  
Old 21st January 2009, 04:48 AM
veritgo Offline
Registered User
 
Join Date: Feb 2007
Posts: 8
Bugzilla Logged: 480880

https://bugzilla.redhat.com/show_bug.cgi?id=480880

Phill on the dev list had the following comments:
Quote:
Hi list,

From the Intel® Virtualization Technology Specification for the IA-32 Intel® Architecture (2005):

"2.9.2 Information for VM Exits Due to Vectored Events
Event-specific information is provided for VM exits due to the following vectored events:
exceptions (including those generated by the instructions INT3, INTO, BOUND, and UD2); external interrupts that occur while the “acknowledge interrupt on exit” VM-exit control is 1; and non-maskable interrupts (NMIs). This information is provided in the following fields:" ....

The <0f> 0b in the "Code:" section are the UD2 instruction.

Checking through the OpCode map for the Xeon processor, this is an invalid op code. In VT processors the software guide indicates that a program can communicate various events and state information to the underlying virtualization supervisor by executing a UD2 (and some others ops like it).

I think that in a non-VT cpu it's actually a "real" invalid op code. The stuff (hardware) which flips over to the supervisor with all the needed info from the virtual machine isn't there.

KVM uses this, from the patches I've seen Googling around for UD2 (if I understand correctly).

So why a UD2 in the code? It's highly unlikely that it's just some random bytes that happen to be a UD2. Possibly the kernel thinks it's in fully virt mode at some point? The image notes do seem to indicate this.

Cheers
Phill.
Mark on the dev list figured out the following:
Quote:
Here's the important bits:

1) Host kernel is 2.6.21.7-5.fc8xen, that means the hypervisor is
xen-3.1.4

2) The guest kernel is 2.6.27.5-117.fc10.x86_64

3) Phill points out the faulting instruction is UD2. That just means
the guest kernel is hitting a BUG() assertion. See /asm-x86/bug.h:

#define BUG() \
do { \
        asm volatile("ud2"); \
        for (;;) ; \
} while (0)

4) The backtrace shows the fault happens in set_page_prot()

5) Jon's dmesg contains:

(XEN) mm.c:1362:d46 Bad L1 flags 800000

That means the guest is faulting here:

static void set_page_prot(void *addr, pgprot_t prot)
{
        ....
        if (HYPERVISOR_update_va_mapping((unsigned long)addr, pte, 0))
                BUG();
}

because the PTE update is failing in the HV here:

static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
                        unsigned long gl1mfn)
{
    ...
    if ( unlikely(l1e_get_flags(nl1e) & L1_DISALLOW_MASK) )
    {
        MEM_LOG("Bad L1 flags %x",
                l1e_get_flags(nl1e) & L1_DISALLOW_MASK);
        return 0;
    }
    ...
}

The PTE flags are 800000, which corresponds to:

#define _PAGE_NX_BIT (1U<<23)

Jon/Phill - can one of you two file a bug (bugzilla.redhat.com) with all this info?

Thanks,
Mark.
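The BUG() reading can be checked against the "Code:" line from the xenctx output. A minimal sketch, assuming the angle brackets mark the faulting byte; 0f 0b is the x86 encoding of UD2 and eb fe is a short jump to itself, which is what an empty for (;;) loop compiles to. The helper name split_at_fault is made up for illustration:

```python
# The "Code:" line printed by xenctx in this thread; <..> marks the faulting byte.
CODE_LINE = ("7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff "
             "85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e")

UD2 = bytes([0x0F, 0x0B])        # x86 UD2 opcode
JMP_SELF = bytes([0xEB, 0xFE])   # jmp short to itself, i.e. for (;;) ;

def split_at_fault(line):
    """Return (all_bytes, fault_index) for a dump that marks one byte as <xx>."""
    fault_index = None
    raw = bytearray()
    for i, tok in enumerate(line.split()):
        if tok.startswith("<") and tok.endswith(">"):
            fault_index = i
            tok = tok[1:-1]      # strip the marker, keep the hex digits
        raw.append(int(tok, 16))
    return bytes(raw), fault_index

code, at = split_at_fault(CODE_LINE)
is_bug = code[at:at + 2] == UD2 and code[at + 2:at + 4] == JMP_SELF
print("faulting bytes:", code[at:at + 2].hex(), "-> BUG() pattern:", is_bug)
```

The sketch only recognises this one byte pattern; it is not a disassembler, and other kernels may lay out BUG() differently.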
Ian came back saying that the issue is likely due to MachineB being incapable of NX. MachineA is capable of NX, which fits.
Quote:
At least in xen-unstable (and I think for much longer) L1_DISALLOW_MASK contains _PAGE_NX_BIT dynamically depending on the processor capabilities.

#define _PAGE_NX (cpu_has_nx ? _PAGE_NX_BIT : 0)
...
/*
 * Disallow unused flag bits plus PAT/PSE, PCD, PWT and GLOBAL.
 * Permit the NX bit if the hardware supports it.
 */
#define BASE_DISALLOW_MASK (0xFFFFF198U & ~_PAGE_NX)

#define L1_DISALLOW_MASK (BASE_DISALLOW_MASK | _PAGE_GNTTAB)

Does the hardware support NX? What does /proc/cpuinfo in dom0 think?

The guest kernel should be setting up __supported_pte_mask appropriately to match the hardware and hence shouldn't be using NX if it isn't available. There's a command line option to force NX, can you try noexec=off on the guest command line.

My guess would be that the guest is getting a wrong EFER from somewhere...

Ian.
Unfortunately, noexec=off had no effect.
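The mask arithmetic from Mark's and Ian's messages can be worked through with the numbers from the crash dump. A sketch under several assumptions: that rsi/r13 (800000014ffc6061) is the PTE being installed, that the hypervisor folds the top PTE bits down by 40 so that bit 63 becomes flag bit 23 (consistent with _PAGE_NX_BIT being 1U<<23, but not quoted in this thread), and that _PAGE_GNTTAB can be left out because its value is not given here:

```python
PAGE_NX_BIT = 1 << 23            # from Mark's message: _PAGE_NX_BIT (1U<<23)

def base_disallow_mask(cpu_has_nx):
    """BASE_DISALLOW_MASK as quoted by Ian: 0xFFFFF198U & ~_PAGE_NX."""
    page_nx = PAGE_NX_BIT if cpu_has_nx else 0
    return 0xFFFFF198 & ~page_nx & 0xFFFFFFFF

def pte_flags(pte):
    """Assumed flag folding: bits 52-63 of the PTE land in flag bits 12-23."""
    return ((pte >> 40) & ~0xFFF & 0xFFFFFFFF) | (pte & 0xFFF)

pte = 0x800000014FFC6061         # rsi/r13 in the crash dump (assumed new PTE)
flags = pte_flags(pte)

bad = {}
for has_nx in (False, True):
    # Flags the hypervisor would reject for this PTE on each kind of CPU.
    bad[has_nx] = flags & base_disallow_mask(has_nx)
    print(f"cpu_has_nx={has_nx}: bad flags = {bad[has_nx]:x}")
```

Under these assumptions the run without NX reproduces the 800000 from the "Bad L1 flags" message, and the run with NX rejects nothing, which lines up with the guest booting on MachineA but not on MachineB.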