Hosting companies differ in their offers, size and so on, but also in their infrastructure choices. Yet the latter, although a major factor in the efficiency of a hosting offer, are seldom well known. NBS System is therefore launching a series of articles introducing its infrastructure and tools, for more transparency and to shed light on this hidden part of the IT hosting sector. After our article about reverse proxies, followed by the one on firewalls and load balancers, today we focus on hypervisors.
First of all, we must set the stage and define the words and themes that will be used in this article. That is why we will start by presenting the concept of the Cloud, then explain virtualization, and finally end with hypervisors and their characteristics.
The Cloud and its concept
Lately, one word has been on everyone’s lips: the Cloud. The concept of the Cloud is as follows: having access to a potentially worldwide service (data storage, access to new computing resources (CPU), web or mail services…), without caring about the physical location of one’s data at any given moment. To keep it simple, it is about creating new information systems without taking into account the underlying operational and technical constraints (datacenters, logistics, human teams…). Hence the name “Cloud”: in the collective imagination, the data lives in a “cloud”, not attached to any physical place.
Through this characteristic, the Cloud has three major dimensions: flexibility, agility and scalability. Indeed, once these limitations are removed, everything becomes possible… Teams can now focus on evolving and improving their services rather than on maintaining the underlying infrastructure.
To do that, the Cloud relies on several virtualization techniques, which provide it with the agility and scalability level that has made it so successful.
What is virtualization?
Virtualization, in its most general definition, consists in running one or several systems on one or several physical machines, rather than limiting oneself to one system per machine. One can thus have, for instance, three servers on one physical machine. Let us pause here for a second in order to define the vocabulary that will be used in the article from now on:
- Host: a host is a physical system. It is the “original” system of a machine.
- Guest: a guest is a virtual system. It is a system that was virtualized on a host. It can be a virtual machine (VM) such as a server, but also other types of systems (router, switch…)
In order to better understand virtualization, let us take a peek at its history.
The first method allowing one to simulate several systems on one physical machine is called chroot (pronounced C-H-root, for CHange ROOT). Here, we cannot yet use the term “guest”: indeed, this method consists in deploying one or several system processes in an isolated sub-tree of the host system (imagine a “child” operating system living in a directory of the “parent” system), with the following characteristics:
- The host and the chroot system processes use the same kernel
- The host can take action on the chroot processes, but not the other way around
Chroot is thus not, technically speaking, virtualization. Its goal is to run, within a single machine, several processes that should not cohabit; chroot makes this possible by isolating these processes from one another.
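The chroot call itself is a one-liner; everything else is preparing the isolated sub-tree. Below is a minimal sketch in Python (the jail path is an arbitrary example): since chroot(2) is a privileged operation, the confinement is attempted in a forked child and simply reported as failed when run without root.

```python
import os

def run_in_chroot(new_root):
    """Fork a child and confine it to new_root via chroot(2).
    Returns True when the confinement succeeded. chroot is a
    privileged call, so without root it fails cleanly."""
    pid = os.fork()
    if pid == 0:  # child: the process to isolate
        try:
            os.chroot(new_root)  # "/" now points at new_root
            os.chdir("/")        # drop any reference to the old tree
            os._exit(0)
        except OSError:
            os._exit(1)          # not root: chroot(2) refused
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status) == 0

# Prepare a minimal "child" tree inside the "parent" system
jail = "/tmp/minimal-jail"       # arbitrary example path
os.makedirs(jail, exist_ok=True)
print("confined:", run_in_chroot(jail))
```

Note the one-way relationship described above: the host can still inspect and signal the jailed process, while the child only sees its own sub-tree.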
Duplication of network stacks
Another method based on the principle of virtualization is the duplication of network stacks. It consists in creating several networks directly within the kernel. Indeed, technically, cloning a network stack yields distinct network domains, which enables the creation of two separate networks. These network stacks are defined and configured by the userland (everything on the machine that is not the kernel): it is thus possible, using chroot in parallel for instance, to create two stacks, one per userland (host / process). This brings even more flexibility, notably at the routing level.
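On Linux, this stack duplication is what network namespaces provide today. As an illustration (not how our platform does it), the sketch below asks the kernel, via the libc unshare() call, to give a forked child its own freshly cloned network stack; the call is privileged, so it reports failure when run without root.

```python
import ctypes
import os

CLONE_NEWNET = 0x40000000  # from <sched.h>: create a new network namespace

def child_gets_own_stack():
    """Fork a child and detach it into a freshly cloned network
    stack with unshare(CLONE_NEWNET). The parent keeps the host's
    stack; returns True when the duplication succeeded."""
    pid = os.fork()
    if pid == 0:  # child: one userland, one stack
        libc = ctypes.CDLL(None, use_errno=True)
        ret = libc.unshare(CLONE_NEWNET)  # privileged operation
        os._exit(0 if ret == 0 else 1)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status) == 0

print("duplicated stack:", child_gets_own_stack())
```

Combined with chroot (one userland per stack), each confined process ends up with its own interfaces and routing table.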
Chroot and the duplication of network stacks are, however, only the first steps towards virtualization as we know it. Modern virtualization begins with the container method, inaugurated with Linux-VServer, then OpenVZ. Today, Docker is one of the most renowned container systems. This method, as its name suggests, is based on containers. They contain the guest systems; they are turn-key systems that automatically deal with several technical aspects. The isolation between the systems is here very advanced, and the administrator can control each system’s resource consumption, which was not possible with the chroot method. These are not, however, the only differences between these two methods:
- The host and the guests cannot interact: containers bring a real isolation between the different systems. This characteristic is what marks the beginning of the era of modern virtualization.
- Containers are easily deployable on the host: thus, the container method makes virtualization and machines more accessible, notably for people who do not have any “system” technical skills.
- This also implies that a container can be instantly deployed or destroyed, depending on the needs. Containers thus bring an unmatched flexibility of maintenance and handling.
The only characteristic that does not change is the shared use of the kernel; it is thus still not possible to run two different operating systems on a single physical machine.
The method using hypervisors is a little different still, since the host and the guests do not share their kernel. Each system, whether physical or virtual, has its own kernel. The isolation between the systems is thus, here, complete. However, there is a problem: as shown on the diagram, a guest’s kernel is not directly connected to the underlying physical hardware; it thus cannot interact with the machine (CPU, network card…) as it should.
The host’s kernel, however, has this ability. A communication interface must thus be created between the host’s kernel and the guest’s kernel, an intermediary that can forward the guest kernel’s requests to the host’s kernel: this is the hypervisor.
There are two types of virtualization: para-virtualization and HVM (Hardware Virtual Machine).
With the HVM method, the guest’s kernel has to go through the hypervisor in order to access the physical machine. The virtual machine is completely isolated and takes the hypervisor for a physical machine: the hypervisor virtualizes the resources that are needed, and the virtual machine talks to it as if it were talking directly to the hardware. The benefit of this method is that one can use a Linux host and deploy a Windows guest, since the guest’s and host’s kernels have no interaction: they can thus come from different operating systems. The drawback is that the use of an intermediary extends the processing time of each action, which diminishes the performance of the machines. That is what is represented in the diagram above, labeled “hypervisor”.
With para-virtualization, some requests go through the hypervisor, except those addressed to specific components such as the network card, the CPUs or the disks (when the machine has some). Indeed, these requests are sent directly to the hardware, through the host’s kernel, without going through the hypervisor. The benefit and the drawback of this method are exactly the opposite of the HVM method’s: it provides good performance by suppressing the intermediary, but does not allow the use of different operating systems on the guest and the host, since both kernels interact. We represented this method on the diagram to the right, simplified with a single guest and showing only the interactions between the kernels and the hypervisor.
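To make the distinction concrete, here is roughly what the two modes look like in a Xen guest configuration file. This is an illustrative sketch in xl syntax, not our production configuration; the path is invented.

```
# HVM guest: fully virtualized hardware, any OS;
# every request goes through the hypervisor
type   = "hvm"
memory = 4096
vcpus  = 2

# PV guest: the kernel cooperates with the hypervisor,
# so it must be a para-virtualization-aware kernel
type   = "pv"
kernel = "/boot/guest-vmlinuz"   # illustrative path
```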
What is a hypervisor?
As mentioned above, a hypervisor is the part of a host’s kernel that receives the requests of the guests’ kernels and transmits them to the machine. The guests’ performance thus greatly depends on its own.
The hypervisor is not, as opposed to reverse proxies, firewalls and load balancers, a whole machine. It is part of the host: it is on the latter that optimizations are made in order to get the best possible results. We will call these machines hypervisor hosts.
NBS System’s choices
We use para-virtualization on 90% of our park, which runs under Linux. The remaining 10% use HVM and allow us to run some Windows and Unix machines for specific internal uses.
NB: recent versions of Xen offer a para-virtualization / HVM hybrid option (PVHVM). It offers the benefits of both methods without their drawbacks. NBS System is considering choosing this option in the future.
Characteristics of the hypervisor host
Our hosts are machines with 40 CPUs and 192 GB of RAM, and have no disk. This lack of physical storage stems from our desire to give our hosts the best possible resiliency. Indeed, to work, the host only needs access to RAM, CPUs and network cards: the rest being optional, we removed it.
Our most faithful readers will recognize here the choice we made for our reverse proxies!
The creation of a hypervisor host
Before we continue, if you do not know how a machine can work without a disk, we advise you to read the “how our RPs work with no disk” chapter of our article about reverse proxies (10 lines). We notably use, in this article, the terms PXE and InitRD, which are defined in that chapter.
When a machine starts, its network card sends a signal to the PXE server; the latter tells the machine where to find the InitRD used for hardware discovery, as well as the kernel it boots with. In parallel, the PXE server collects the information about the machine contained in the signal: MAC address, serial number, hardware type… This allows it to understand that the machine is physical, which means, in our company, that it can only be a hypervisor host. The PXE server records this new host in the inventories and makes the machine restart.
Thus, when it starts again, the machine sends another signal, still received by the PXE server. The latter recognizes the MAC address it just recorded: it knows it is dealing with a hypervisor host. It then sends the machine the information it needs to retrieve the following elements:
- Kernel and InitRD (these are the same, no matter the machine)
- Specific command line, unique per machine, containing the name of the host, its dedicated IP addresses, and information about its functions as a hypervisor host. It is created when the host is registered.
- Xen, the hypervisor itself, to control the virtualization functions on the machine.
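For readers unfamiliar with PXE configuration, a boot entry delivering these elements could look like the following pxelinux-style sketch; the label, paths, host name and IP address are hypothetical.

```
# Hypothetical boot entry served to a known hypervisor host
LABEL xen-hypervisor-host
    KERNEL mboot.c32
    # Xen itself, then the shared kernel with the host-specific
    # command line, then the shared InitRD
    APPEND xen.gz --- vmlinuz host=hyp-01 ip=192.0.2.10 --- initrd.img
```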
We obtain a host that is automatically ready, which spares us the manual configuration of each new machine that needs to be set up! In parallel, the host is registered as “active” through internal mechanisms, and is then marked as available for production in our park management tools.
Once the host is ready, in order to create a virtual machine, the hypervisor must have access to a configuration file. The administrator can go through the network and write the file directly on the machine, by hand, but this creates risks of human error (a forgotten file, a double deployment…).
On our infrastructure, we use NFS (Network File System): a file storage system, hosted on our NetApps, that can be shared between several machines. It is only one of the numerous available options!
Once connected to the NFS share (the links are made when the machine’s operating system starts), the hypervisor host has access to the files as if they were on its local storage. Logs, virtual machine configuration files and various internal tools are stored on the NFS.
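As an illustration, the link between a host and such a share can be pictured as a classic NFS mount entry; the server name and paths are invented for the example.

```
# Example /etc/fstab-style entry: the shared files appear
# to the hypervisor host as a local directory
filer01:/vol/xen   /srv/xen   nfs   defaults   0 0
```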
Creation of a virtual machine
So how does one deploy a virtual machine?
We use deployment tools such as Salt. Salt is a configuration management and remote execution tool; it performs most of the actions, as the diagram below shows.
- The administrator registers the virtual machine in our internal database (name, resources needed, datacenter…)
- The administrator provides Salt with the command line asking to deploy the virtual machine
- Salt gathers, from the database, all the information entered by the administrator, and generates:
  - a Xen configuration file containing information about the resources dedicated to the machine, as well as the location of a kernel, an InitRD and the dedicated command line for the new machine
  - a File System for the virtual machine, already containing useful files
- Salt sends them to the NFS, which stores the files
- Last step: Salt sends, to the chosen host, the command to start the new virtual machine
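As a rough sketch of the generation step, assuming a hypothetical database record (the field names are illustrative, not our actual schema), the Xen configuration file could be rendered like this:

```python
def render_xen_config(vm):
    """Render a minimal Xen guest configuration from a database
    record, roughly as a Salt state would (field names are
    illustrative, not an actual schema)."""
    return "\n".join([
        'name    = "{}"'.format(vm["name"]),
        "memory  = {}".format(vm["memory_mb"]),
        "vcpus   = {}".format(vm["vcpus"]),
        'kernel  = "{}"'.format(vm["kernel"]),   # shared kernel
        'ramdisk = "{}"'.format(vm["initrd"]),   # shared InitRD
        'extra   = "{}"'.format(vm["cmdline"]),  # per-VM command line
    ])

record = {
    "name": "web-01", "memory_mb": 2048, "vcpus": 2,
    "kernel": "/srv/xen/vmlinuz", "initrd": "/srv/xen/initrd.img",
    "cmdline": "console=hvc0 root=/dev/xvda1",
}
print(render_xen_config(record))
```

The rendered file is then written to the NFS share, where the chosen host picks it up when asked to start the guest.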
Once that is done, the hypervisor starts a guest and sends it the kernel, the InitRD and the command line it received in its configuration file. Once it is running, the virtual machine fetches its File System directly from the NFS. We will not go further, since the inner workings of a virtual machine are beyond this article’s scope.
At the beginning of virtualization, the loss of performance compared to a physical machine was big enough to matter. Today, this loss is very small compared to the numerous benefits of virtualization: there is thus no reason not to use this method! The only obstacles are resources (having an infrastructure for virtualization requires tooling and knowledge) and the use of proprietary products that do not support this method (for instance, Oracle refused for a while to let its products run on anything other than a physical machine, since they did not cope well with virtualization).
Source: Denis Pompilio