Thursday, 25 August 2016

This is a simple guide to help me next time I need to start VM’s on an ODA.

This post started life as a description of oakcli commands and where to run them. Pretty simple… It ended up with me really trying to understand the virtualised ODA architecture and where it might be letting me down.

All my research came down to a couple of salient facts:

· Start and stop VM’s from oakcli on ODA_BASE

· Start and stop ODA_BASE from Dom0

First, you should do with through EM12C if you can, but I had a pile of problems with that too.

So, you need to use oakcli – nice…  go old school on it

oakcli is a command that seems to do anything for your VM’s on your ODA.

A good reference is here:  https://docs.oracle.com/cd/E22693_01/doc.12/e55580/oakcli.htm

Before you start hacking away, it’s nice to understand what you are dealing with when you have a virtualized ODA.

What are all these machines?

clip_image002

image

Each “Dom X” is a VM, where Dom0 is the management domain and Dom 1 is the ODA_BASE domain (a Dom U with special privs). All of these Dom’s sit above the hypervisor.

clip_image009

Each node is a physical machine

So, you’ve probably called your host – ronin… After the singer!

Dom0 gives you control over the hypervisor (ronin0_net1)

Dom1 is ODA_BASE (which is a glorified Dom U), it has direct access to the storage shelf.

Dom U is user domains, for each VM essentially

clip_image011

Then you have ronin0 and ronin1 which are hostnames for each of the ODA_BASE machines on your physical “ronin” ODA. You then have hypervisors (essentially another VM) known as ronin0_net1 and ronin1_net1.

Oakcli from Dom0

So you need to login as root to ronin0_net1 to use the oakcli command interface:

which oakcli

/opt/oracle/oak/bin/oakcli

if you are in Dom0, you have a limited number of objects for oakcli, but you still have the same command:

[root@ronin1-net1 ~]# oakcli show oda-base Usage: oakcli [] : [show | configure | deploy | start | stop | restart | create | delete]: [firstnet | additionalnet | ib2fiber | oda_base | env_hw | vlan | network] []: [force] []: [-changeNetCard | -publicNet] available with configure network

[root@buttons-net1 ~]# oakcli show oda_base
ODA base domain
ODA base CPU cores      :8
ODA base domain memory  :384
ODA base template       :/OVS/oda_base_12.1.2.5.tar.gz
ODA base vlans          :['net1', 'net2']
ODA base current status :Running

Oakcli from ODA_BASE

You get a few more options from ODA_BASE, you can actually control your VMs. Now, this is strange, as the Dom0 machine actually has them running within it… Very strange to me.

So, when you are on ODA_BASE, you can run vm commands:

[root@ronin0 tmp]# oakcli show vm

         NAME                                  NODENUM         MEMORY          VCPU            STATE           REPOSITORY

        bear                                    1               8192M              2            ONLINE          jderepotst
        chinapeak                               0               8192M              2            ONLINE          jderepotst
        georgetown                              0               8192M              2            ONLINE          jderepotst
        homewood                                0               8192M              2            ONLINE          jderepotst
        jdessosvr                               0               4096M              2            ONLINE          jderepotst
        mountshasta                             1               8192M              2            ONLINE          jderepotst

Note also that the database and grid run in ODA_BASE too:

oracle   57905     1  0 Apr15 ?        00:55:14 ora_ckpt_JDETEST1
oracle   57909     1  0 Apr15 ?        00:18:57 ora_smon_JDETEST1
oracle   57913     1  0 Apr15 ?        00:01:06 ora_reco_JDETEST1
oracle   57917     1  0 Apr15 ?        00:35:03 ora_mmon_JDETEST1
oracle   57921     1  0 Apr15 ?        06:05:21 ora_mmnl_JDETEST1
oracle   57925     1  0 Apr15 ?        00:01:33 ora_d000_JDETEST1
oracle   57929     1  0 Apr15 ?        00:01:27 ora_s000_JDETEST1
oracle   58081     1  0 Apr15 ?        06:44:49 ora_lck0_JDETEST1
oracle   58089     1  0 Apr15 ?        00:06:48 ora_rsmn_JDETEST1
oracle   58287     1  0 Apr15 ?        00:02:24 ora_arc0_JDETEST1
oracle   58291     1  0 Apr15 ?        00:03:50 ora_arc1_JDETEST1

See above there are a heap of processes for the JDETEST1 RAC instance of the database

When logged into Dom0, you actually see the ODA_BASE VM and the other VM’s if you PS for a command.

Note also that your VM’s don’t run in ODA_BASE, they run in Node 0

[root@ronin1-net1 ~]# ps -ef | grep domain
root      6590 10362  0 21:38 pts/2    00:00:00 grep domain
root     15970 15027  0 Mar04 ?        04:58:17 /usr/lib64/xen/bin/qemu-dm -d 1 -domain-name oakDom1 -videoram 4 -vnc 0.0.0.0:0 -vncunused -vcpus 16 -vcpu_avail 0xffff -boot c -serial pty -acpi -net none -M xenfv
root     31107 15027  2 16:26 ?        00:08:16 /usr/lib/xen/bin/qemu-dm -d 7 -domain-name bear -videoram 4 -vnc 0.0.0.0:10 -vncunused -vcpus 2 -vcpu_avail 0x3L -boot c -acpi -usbdevice tablet -net none -M xenfv
root     49452 15027  2 Apr07 ?        3-17:47:50 /usr/lib/xen/bin/qemu-dm -d 5 -domain-name mountshasta -videoram 4 -vnc 0.0.0.0:10 -vncunused -vcpus 2 -vcpu_avail 0x3L -boot c -acpi -usbdevice tablet -net none -M xenfv

Note that an interesting point is that the first vm above is actually ODA_BASE or oakDom1.  So the _net1 is the bare metal host.

If you run this command from the dom0 on the other node,

[root@ronin0-net1 ~]# ps aux |grep domain

root 5674 2.5 0.0 64744 2944 ? SLl Aug01 876:22 /usr/lib/xen/bin/qemu-dm -d 16508 -domain-name chinapeak -videoram 4 -vnc 0.0.0.0:10 -vncunused -vcpus 2 -vcpu_avail 0x3L -boot c -acpi -usbdevice tablet -net none -M xenfv

root 15964 0.2 0.0 56552 996 ? SLl Mar04 573:03 /usr/lib64/xen/bin/qemu-dm -d 1 -domain-name oakDom1 -videoram 4 -vnc 0.0.0.0:0 -vncunused -vcpus 16 -vcpu_avail 0xffff -boot c -serial pty -acpi -net none -M xenfv

root 39236 0.0 0.0 61208 736 pts/5 S+ 22:38 0:00 grep domain

root 60399 4.0 0.1 64744 5100 ? SLl Apr07 8210:40 /usr/lib/xen/bin/qemu-dm -d 12 -domain-name georgetown -videoram 4 -vnc 0.0.0.0:10 -vncunused -vcpus 2 -vcpu_avail 0x3L -boot dc -acpi -usbdevice tablet -net none -M xenfv

root 61443 2.6 0.0 64744 956 ? SLl Apr07 5231:00 /usr/lib/xen/bin/qemu-dm -d 13 -domain-name homewood -videoram 4 -vnc 0.0.0.0:10 -vncunused -vcpus 2 -vcpu_avail 0x3L -boot c -acpi -usbdevice tablet -net none -M xenfv

You see another ODA_BASE (of course, because we are running RAC), but you see a different list of Dom0’s

This is a great diagram of how disk works for OVM guest OS’s on the ODA:

clip_image013

What a mess! The hypervisor exposes disk to ODA_BASE a little more natively, but VM disks are exposed as a NFS share to an ACFS volume. The above is stolen from http://blog.dbi-services.com/oda-vms-possibilities-a-performances/ . So you have VMs running on the hypervisor which are stored on an NFS share itself exported by a VM……not the most efficient architecture.

1 comment:

Anonymous said...

Check out the "driverdomain" feature recently added, it helps to avoid the NFS layer for your domU VMs.

Extending JDE to generative AI