
The basics of containerization – how containers work in Linux

This article was created to familiarize readers with the issue of containerization. Apart from discussing the most important concepts, I will also present a historical outline. This will allow you to understand that containers, contrary to appearances, are not a new technology. Their present shape is a derivative of an idea that has literally evolved through generations of enthusiasts and creators of operating systems.
I know from experience that people who talk enthusiastically about containers do not always know what the term means from a technical point of view. So let me informally define containerization. This definition is based on experience and is meant to convey the idea, not to describe any specific technology. Containerization, then, is the practice of running processes in an environment isolated from the rest of the system. A process running in a container “doesn’t know” that other processes, containerized or not, are running alongside it. Containerization also typically involves limiting the system resources (CPU, memory, disk) available to a given container environment. An important feature that distinguishes containerization from full virtualization is that containers share the host system and its resources, virtualizing only selected elements.
Historical outline of containerization
Before Docker, systemd-nspawn or Podman gained popularity, and before today’s frameworks and orchestrators, there were many tools and mechanisms that can without doubt be considered forms of containerization. Let me mention the three most important ones.
Chroot – 1979
Chroot, often called “change root”, is an “ancient” mechanism dating back to Version 7 Unix (V7). It still exists today and allows you to execute commands with a new root directory; children of a process launched in chroot also see their “world” rooted at the new directory. What we get is per-process separation of file access. The mechanism was added to the BSD project in 1981 by Bill Joy (creator of vi, csh and NFS, and co-founder of Sun Microsystems). Today’s containers are the ideological heirs of chroot. As a curiosity, when you build a “modern” container for Docker or Podman (especially from scratch), chroot is one of the main tools involved.
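To make the mechanism concrete, here is a minimal sketch of what chroot does at the system-call level, using Python’s os.chroot wrapper. This is my own illustration, not code from any container tool, and the /srv/minimal-rootfs path is an assumption; actually entering a chroot requires root privileges and a prepared root filesystem:

```python
import os

def enter_chroot(new_root):
    """Confine the current process to new_root via the chroot(2) syscall.

    Requires root (CAP_SYS_CHROOT). Children of this process will
    inherit the new root, just as the article describes.
    """
    os.chroot(new_root)
    # chdir is essential: without it the process keeps a reference to a
    # directory outside the new root and could escape the jail.
    os.chdir("/")

if __name__ == "__main__":
    # Hypothetical prepared root filesystem; only attempt when feasible.
    if os.geteuid() == 0 and os.path.isdir("/srv/minimal-rootfs"):
        enter_chroot("/srv/minimal-rootfs")
        os.execv("/bin/sh", ["/bin/sh"])  # shell now sees the new "/"
    else:
        print("needs root and a prepared rootfs at /srv/minimal-rootfs")
```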
FreeBSD jails – (FreeBSD 4.0) 2000
The jails mechanism implemented in FreeBSD allows you to divide the system into many smaller ones, essentially sharing one system. From the user’s point of view, each microsystem is a full-fledged system. The jails concept is a natural evolution of chroot and extends it to include virtualization:
- access to the file system (already provided by chroot)
- user access (the root user in jail does not have access to the main system)
- network (in chroot, the system uses the same network resources).
I would also like to highlight the following characteristics that define a jail mechanism.
- Entry point (directory tree)
- Hostname
- IP address
- Startup command.
The defining features of a BSD jail are also found in Linux containerization. Obviously, the entry point is an image rather than a path in the file system, but conceptually it is a direct copy of the idea.
Solaris Containers – 2004
Often described as “chroot on steroids”, these were in fact the first fully commercial containers in the form we know today. They combined separation at the level of files (like chroot and jail), users (like FreeBSD jail) and networks (like FreeBSD jail) with the ability to manage resources such as CPU and maximum memory. Thanks to this, one or several heavily loaded containers cannot saturate the server’s resources. The project name is used interchangeably with “Solaris Zones”; zones are collections of containers.
Other projects worth mentioning
Other projects that have contributed to the development of containers, especially under Linux, include:
- Linux-VServer – it offered resource partitioning, but never gained much popularity and required manual kernel patching
- Process Containers – created by Google in 2006, it was used to limit the use of system resources such as CPU, network, disk (including the number of I/O operations) and memory. The project was renamed cgroups (control groups) and is still the basis for effective containerization; it ships as a standard part of the Linux kernel
- LXC (LinuX Containers) – the first complete implementation of containers under Linux, which works to this day. It uses the well-known cgroups and namespaces mechanisms, discussed later in the article.
Docker!
Created in 2013, Docker is responsible for the so-called boom, i.e. the explosion of the container market. It was the first project to gain such a wide group of supporters. From a technical point of view, it is a system daemon together with a client that helps manage containers; in other words, a service operating in the client–server model, responsible for creating, managing and destroying containers. Interestingly, in the first period of its existence Docker used LXC. It then switched to its own libcontainer library, and finally joined the OCI (Open Container Initiative) and adopted the standardized runc runtime.
Docker uses two features of the Linux kernel to create its containers. The first is cgroups, the aforementioned mechanism responsible primarily for limiting resources such as memory, CPU usage, the number of disk operations or the maximum number of processes in a given group. The mechanism also provides accounting, i.e. counting all these kinds of operations.
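Every Linux process can see which cgroups it belongs to by reading /proc/self/cgroup. The following sketch (my own illustration, not Docker’s implementation) parses that file; on a cgroup v2 system there is a single line of the form 0::/path:

```python
def cgroup_memberships(path="/proc/self/cgroup"):
    """Parse 'hierarchy-ID:controller-list:cgroup-path' lines from
    /proc/<pid>/cgroup (see man 7 cgroups for the format)."""
    memberships = []
    try:
        with open(path) as f:
            for line in f:
                hierarchy, controllers, cgroup_path = line.rstrip("\n").split(":", 2)
                memberships.append((hierarchy, controllers, cgroup_path))
    except FileNotFoundError:
        pass  # no procfs, e.g. not running on Linux
    return memberships

if __name__ == "__main__":
    for hierarchy, controllers, cgroup_path in cgroup_memberships():
        # An empty controller list marks the unified cgroup v2 hierarchy.
        print(hierarchy, controllers or "(v2)", cgroup_path)
```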
The other mechanism is namespaces. In short, namespaces isolate processes from one another: processes in a given namespace “appear” to be the only processes on the system. Among other things, namespaces can have their own virtual networks, which is essential for containers.
You will learn more about both mechanisms by reviewing the relevant manuals: man 7 namespaces and man 7 cgroups.
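Namespace membership can also be observed directly: /proc/<pid>/ns contains one symlink per namespace type, and two processes share a namespace exactly when the link targets match. A small sketch of this (again my own illustration, not part of any container runtime):

```python
import os

def list_namespaces(pid="self"):
    """Return a mapping of namespace type -> identifier for a process.

    Each entry in /proc/<pid>/ns is a symlink whose target looks like
    'pid:[4026531836]'; matching identifiers mean a shared namespace.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):  # non-Linux systems lack this procfs dir
        return result
    for name in os.listdir(ns_dir):
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            pass  # entry disappeared or is not readable
    return result

if __name__ == "__main__":
    for ns_type, identifier in sorted(list_namespaces().items()):
        print(f"{ns_type:10s} {identifier}")
```

Comparing the output for two processes (e.g. one inside a container and one outside) shows exactly which namespaces they do and do not share.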
From a developer’s point of view, Docker is responsible for the complete life cycle of a container and allows you to manage it in a fairly simple way. It is this simplicity (a Dockerfile, for example, uses 18 instructions altogether, of which the developer needs to know only about 6) that allowed Docker to dominate the market.
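To illustrate that simplicity, a complete Dockerfile can get by with just a handful of those instructions. The base image, file names and port below are assumptions invented for this example:

```dockerfile
# Start from an existing base image pulled from a registry
FROM python:3.12-slim
# Set the working directory and copy the application into the image
WORKDIR /app
COPY app.py .
# Execute a command at build time (e.g. install dependencies)
RUN pip install --no-cache-dir flask
# Document the port the service listens on
EXPOSE 8000
# Default command run when the container starts
CMD ["python", "app.py"]
```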
Container orchestration
Finally, I would like to present a few problems related to Docker and containers in general:
- by their very nature, containers are ephemeral (short-lived), whereas, for example, databases such as EuroDB (PostgreSQL) placed in a container require persistent storage
- containers receive “random” IP addresses (in theory this can be controlled, but doing so complicates the environment even more), so connecting them, especially when only selected parts need to be scaled, can be troublesome
- to achieve high availability (HA), an application or service must be placed on N hosts (N > 1)
- as the environment grows, it becomes more and more difficult to manage containers.
Container orchestrators were developed to solve these and many other problems. The most important include Kubernetes (very actively developed), Docker Swarm (contrary to common opinion, Docker is not killing this project, as more than 700 Docker customers use it) and finally Apache Mesos, which can orchestrate both containerized and non-containerized workloads. Orchestrators create a layer of abstraction above the containers. This allows, for example, autoscaling of containers, propagating image changes to all nodes, or exposing a service that can be automatically load-balanced. The general idea of a container orchestrator can be summarized as a tool for managing multi-container environments at scale.
More information about this topic will be available soon on our blog.
Summary
The future of Docker itself is uncertain. The company visibly struggles to find its own identity and a way to effectively monetize the world’s most popular container-building platform. Business issues aside, however, one should appreciate the impact Docker has had, and still has, on the wide adoption of containerization, which today is more of a standard than a curiosity available to a chosen few.
Thank you very much for the time you spent reading this article. Although it is not technical at first glance, it should let you speak with more precision when discussing containers. It also helps to understand the context of containerization and the fact that, contrary to appearances, it is not a new idea at all, but rather a natural evolution.