Nova HA
Masakari
Mali Asemani
2022 Nov 23
Abramad Corporate
Masakari Module
Provides VM High Availability (VMHA)
and rescues KVM-based VMs from failure.
It covers:
nova-compute host failure
evacuate all the VMs from failure host according to th
VM process down
restart the VM (using the Nova stop API, then the Nova start API).
Libvirt events will also be emitted by other failures.
Requirements
Communications services
REST API
The user-facing interface.
Typically involves database reads/writes, sending RPC messages to the Masakari engine, and generating responses to the REST calls.
RPC
Internal communications.
Done via the oslo.messaging library, an abstraction on top of message queues.
Masakari-api
Processes REST requests.
Sends them to the masakari-engine over RPC.
Masakari-engine
Runs on the same host where the Masakari API is running.
Listens for RPC messages.
Processes the notifications received from masakari-api.
Executes the recovery workflow asynchronously.
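The hand-off between the two services can be pictured with a toy model. This is only a sketch using the Python standard library: an in-process queue stands in for the RPC transport (the real services use oslo.messaging), and the names `rpc_queue`, `api_receive_notification`, and `engine_worker` are hypothetical, not Masakari code.

```python
import queue
import threading

# Hypothetical stand-in for the RPC transport between the API and the engine.
rpc_queue = queue.Queue()
processed = []


def api_receive_notification(notification):
    """Model of the API side: hand the request off and return immediately."""
    rpc_queue.put(notification)
    return {"status": "accepted"}


def engine_worker():
    """Model of the engine side: consume messages and run the recovery step."""
    while True:
        notification = rpc_queue.get()
        if notification is None:  # sentinel used here to stop the worker
            break
        # A real engine would execute a recovery workflow at this point.
        processed.append(("recovered", notification["type"]))


worker = threading.Thread(target=engine_worker, daemon=True)
worker.start()

response = api_receive_notification({"type": "COMPUTE_HOST", "hostname": "compute-1"})
rpc_queue.put(None)  # ask the worker to stop after draining the queue
worker.join()
```

The API call returns before any recovery work has run, which is the asynchronous behaviour described above.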
Masakari Architecture
When you evacuate an instance, Compute detects whether shared storage is available on the target host.
In Masakari, the compute nodes are grouped into failover segments.
The segment concept enables you to define several segments of hypervisor hosts based on the shared storage boundaries or any other limitations that may be critical for selection of the failover host, and to assign the failover host or hosts for each of them.
In the event of a failure, guests are moved onto other nodes within the same segment.
The recovery method of a segment determines the compute node that houses the evacuated guests.
OpenStack zoned boundaries:
The system can be zoned from top to bottom levels into Regions, Availability Zones, and Host Aggregates (or Cells).
Zones can be managed by the nova scheduler.
An operator may want to use other types of boundaries, such as rack layout and power supply.
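The segment idea can be sketched as a small data model. The classes below are a simplified illustration, not Masakari's own objects; the field names (`recovery_method`, `reserved`, `on_maintenance`) are assumed to mirror the segment and host attributes, and the helper names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    reserved: bool = False
    on_maintenance: bool = False


@dataclass
class FailoverSegment:
    name: str
    recovery_method: str
    hosts: list = field(default_factory=list)

    def candidates(self, failed_host):
        """Hosts in this segment that could receive guests from failed_host."""
        return [
            h.name
            for h in self.hosts
            if h.name != failed_host and not h.on_maintenance
        ]


def segment_of(segments, host_name):
    """Return the segment that contains host_name, or None if there is none."""
    for segment in segments:
        if any(h.name == host_name for h in segment.hosts):
            return segment
    return None


segments = [
    FailoverSegment("rack-a", "auto", [Host("c1"), Host("c2"), Host("c3", reserved=True)]),
    FailoverSegment("rack-b", "auto", [Host("c4")]),
]
targets = segment_of(segments, "c1").candidates("c1")
```

Guests from `c1` are only offered hosts from `rack-a`; `c4` is never considered because it sits in a different segment.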
Recovery Methods
(using the Nova Evacuate API)
auto
The guests are relocated to any of the available nodes in the same segment.
Evacuates all the VMs with no destination node for the nova scheduler.
Disadvantage: no guarantee of resource availability.
reserved_host
Compute hosts are allocated as reserved, which allows an operator to guarantee there is sufficient capacity available for any guests in need of evacuation.
Evacuates all the VMs with reserved hosts as the destination nodes for the nova scheduler.
auto_priority
Chains the previous methods together.
First attempts to move the guest using the auto method; if that fails, tries the reserved_host method.
rh_priority
Chains the previous methods together.
First attempts to move the guest using the reserved_host method; if that fails, tries the auto method.
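The way the four methods relate can be sketched as a fallback chain. This is an illustration under assumptions, not Masakari's implementation: `evacuate` is a hypothetical callable standing in for a Nova evacuate request, a destination of `None` means "let the scheduler choose", and only the first reserved host is tried.

```python
class EvacuationFailed(Exception):
    """Raised by the hypothetical evacuate callable when a move fails."""


# Order in which each recovery method tries the two basic strategies.
STRATEGY_ORDER = {
    "auto": ["auto"],
    "reserved_host": ["reserved_host"],
    "auto_priority": ["auto", "reserved_host"],
    "rh_priority": ["reserved_host", "auto"],
}


def run_strategy(strategy, vms, reserved_hosts, evacuate):
    if strategy == "auto":
        for vm in vms:
            evacuate(vm, None)  # no destination: the scheduler decides
        return
    if not reserved_hosts:
        raise EvacuationFailed("no reserved host available")
    for vm in vms:
        evacuate(vm, reserved_hosts[0])  # simplification: first reserved host only


def recover(method, vms, reserved_hosts, evacuate):
    """Try each strategy of the method in order; return the one that succeeded."""
    last_error = None
    for strategy in STRATEGY_ORDER[method]:
        try:
            run_strategy(strategy, vms, reserved_hosts, evacuate)
            return strategy
        except EvacuationFailed as error:
            last_error = error
    raise last_error
```

With an `evacuate` callable that fails whenever no destination is given, `recover("auto_priority", ...)` falls through to the reserved host, while `recover("auto", ...)` raises `EvacuationFailed`.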
How Does it Work?
How does it work as a whole?
[Figure: Masakari recovery workflow. Labels: Recovery Request; Segments API; Host API; Notifications API; Process failure Notification; Instance failure Notification; Host failure Notification; Pacemaker-remote.]
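The three notification types in the diagram (process, instance, and host failure) could be represented as request bodies along the following lines. This is a hedged sketch: the type names and payload keys reflect my understanding of the Masakari notifications API and should be checked against the API reference, and `build_notification` is a hypothetical helper, not part of any Masakari client.

```python
from datetime import datetime, timezone


def build_notification(kind, hostname, generated_time=None, **details):
    """Build a notification body for one of the three failure types."""
    if kind == "COMPUTE_HOST":
        # Host failure, as reported by the host monitor.
        payload = {"event": "STOPPED", "host_status": "NORMAL", "cluster_status": "OFFLINE"}
    elif kind == "VM":
        # Instance failure, as reported by the instance monitor.
        payload = {
            "event": "LIFECYCLE",
            "vir_domain_event": "STOPPED_FAILED",
            "instance_uuid": details["instance_uuid"],
        }
    elif kind == "PROCESS":
        # Process failure, as reported by the process monitor.
        payload = {"event": "STOPPED", "process_name": details["process_name"]}
    else:
        raise ValueError(f"unknown notification type: {kind}")

    if generated_time is None:
        generated_time = datetime.now(timezone.utc).isoformat()
    return {
        "notification": {
            "type": kind,
            "hostname": hostname,
            "generated_time": generated_time,
            "payload": payload,
        }
    }


example = build_notification("PROCESS", "compute-1", process_name="nova-compute")
```

A monitor would send a body like `example` to the Notifications API; the sketch stops short of the HTTP call so that it stays self-contained.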
Host Monitor
Monitor Driver: Consul
Monitor Driver: Pacemaker
libvirt and the QEMU Guest Agent are used as the underlying protocol for messaging to and from the VM.
The host-side qemu-agent sockets are used to determine whether VMs are configured with the QEMU Guest Agent.
qemu-guest-ping is used as the monitoring heartbeat.
In a future release, arbitrary guest agent commands could be passed through to check the health of the applications inside a VM.
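The heartbeat logic can be sketched as a counter of consecutive missed pings. This is only an illustration: `ping` is a hypothetical callable standing in for a guest-agent ping sent through libvirt, and the threshold and round count are arbitrary example values, not Masakari defaults.

```python
def monitor_heartbeat(ping, max_missed=3, rounds=10):
    """Return "failed" after max_missed consecutive missed pings, else "healthy"."""
    missed = 0
    for _ in range(rounds):
        if ping():
            missed = 0  # any successful reply resets the counter
        else:
            missed += 1
            if missed >= max_missed:
                return "failed"
    return "healthy"


# Example: one reply followed by three consecutive misses.
replies = iter([True, False, False, False])
status = monitor_heartbeat(lambda: next(replies), max_missed=3, rounds=4)
```

A real monitor would wait between pings and raise an instance failure notification instead of returning a string; both are left out to keep the sketch short.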
Process Monitor
Note:
If an OpenStack service runs in a container (pod),
processmonitor will not work as expected.
Any questions?
Thanks All :)