
VDO – optimizing disk space in Enterprise Linux 8

Virtual Data Optimizer (VDO) includes everything you need to create a transparent layer for data compression and deduplication. It reduces disk space usage on block devices, minimises replication, and increases data throughput.
VDO uses three main techniques:
- zero-block elimination – filters out data blocks that contain only zeros and records them in metadata only. The non-zero data blocks are then passed to the next processing phase. This phase is what makes thin provisioning possible on VDO devices
- deduplication – eliminates redundant data blocks. When multiple copies of the same data are written, VDO detects the duplicate blocks and updates the metadata so that the duplicates reference the original block instead of being stored again
- compression – the kvdo kernel module compresses data blocks using the LZ4 algorithm.
With these techniques, VDO can significantly improve both storage efficiency and the use of network bandwidth. The VDO layer is placed on top of an existing block device, such as a RAID device or local disk, and the underlying block device can also be encrypted.
Logical devices created with VDO are named VDO volumes. They are similar to disk partitions – they can be formatted with the desired file system and mounted just like a regular file system. You can also use a VDO volume as an LVM physical volume.
Since the VDO volume is thinly provisioned, the file system and applications see only the logical space in use and are not aware of the actual physical space available.
When hosting virtual machines or containers, it is recommended to provide storage at a 10:1 logical to physical ratio (for example, 1 TB of physical storage is presented as 10 TB of logical storage). For object-based storage platforms such as Ceph, a logical to physical ratio of 3:1 is recommended (meaning 1 TB of physical storage is presented as 3 TB of logical storage).
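The ratios above are simple multiplication, which can be sketched in a few lines of shell. The helper name vdo_logical_size_gb is hypothetical and not part of VDO; it only illustrates how a logical size could be derived from a physical size and a chosen ratio.

```shell
# Hypothetical helper (not part of VDO): logical size in GB for a given
# physical size in GB and a logical-to-physical ratio.
vdo_logical_size_gb() {
    physical_gb=$1
    ratio=$2
    echo $(( physical_gb * ratio ))
}

# 1 TB (1024 GB) of physical storage for virtual machines or containers, 10:1
vdo_logical_size_gb 1024 10
# The same physical storage for object storage such as Ceph, 3:1
vdo_logical_size_gb 1024 3
```

The result is the figure that would be presented as the logical size of the volume, in GB.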
VDO Installation
Installing VDO on Enterprise Linux (CentOS, Red Hat® Enterprise Linux®, Oracle® Linux, Rocky Linux, AlmaLinux) involves running the following command:
[eurolinux@el84 ~]$ sudo dnf install vdo kmod-kvdo
Creating a VDO volume
To create a new VDO volume, prepare the following information:
- the name of the underlying block device
- the name of the optimised block device that will be presented by VDO
- the logical size to be presented to storage layers above the VDO.
Without the latter parameter, VDO will create a volume that provides a 1:1 mapping between the physical and logical blocks. You can later increase the physical and logical size of the volume using the vdo growPhysical and vdo growLogical commands.
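As a rough sketch of how the two grow commands might be invoked, the snippet below wraps them in hypothetical helper functions. The names run, grow_logical and grow_physical are illustrative only, and the volume name vdo1 and the size 100G are assumed values. By default the script only prints the commands; setting DRY_RUN=0 would execute them, which requires root privileges.

```shell
# Print the commands by default; set DRY_RUN=0 to actually run them (as root).
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: grow the logical size presented by a VDO volume.
grow_logical() {
    run vdo growLogical --name="$1" --vdoLogicalSize="$2"
}

# Hypothetical wrapper: let VDO use space added to the underlying device.
grow_physical() {
    run vdo growPhysical --name="$1"
}

grow_logical vdo1 100G
grow_physical vdo1
```

Growing the physical size assumes the underlying block device has already been enlarged; the command itself takes no size argument.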
As a simple example, we will create a VDO volume named vdo1 on the device /dev/vdb with a logical size of 50 GB by running the vdo create command:
[eurolinux@el84 ~]$ sudo vdo create --name=vdo1 \
--device=/dev/vdb \
--vdoLogicalSize=50G
Creating VDO vdo1
The VDO volume can address 2 GB in 1 data slab.
It can grow to address at most 16 TB of physical storage in 8192 slabs.
If a larger maximum size might be needed, use bigger slabs.
Starting VDO vdo1
Starting compression on VDO vdo1
VDO instance 0 volume is ready at /dev/mapper/vdo1
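The slab figures in the output above follow from a fixed limit of 8192 slabs per volume, so the maximum physical size depends on the slab size. The arithmetic can be sketched as follows; the helper name max_physical_tb is hypothetical, and the 32 GB slab size is used here as an assumed example of a larger slab (in the vdo manager the slab size is chosen at creation time, to my knowledge with the --vdoSlabSize option).

```shell
# Slab limit taken from the "8192 slabs" line in the vdo create output.
MAX_SLABS=8192

# Hypothetical helper: maximum addressable physical size in TB
# for a given slab size in GB.
max_physical_tb() {
    slab_gb=$1
    echo $(( MAX_SLABS * slab_gb / 1024 ))
}

max_physical_tb 2     # 2 GB slabs, as in the output above
max_physical_tb 32    # assumed larger slab size of 32 GB
```

With 2 GB slabs the result matches the 16 TB reported by vdo create, which is why the tool suggests bigger slabs when a larger maximum size might be needed.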
VDO Information and Statistics
To analyse a VDO volume, run the vdo status command. It displays a report on the VDO system and the status of the volume in YAML format. We can limit the output to a specific volume by using the --name= option; with only one volume, this option is not necessary.
[eurolinux@el84 ~]$ sudo vdo status
VDO status:
Date: '2021-09-10 13:57:42-04:00'
Node: el84
Kernel module:
Loaded: true
Name: kvdo
Version information:
kvdo version: 6.2.4.26
Configuration:
File: /etc/vdoconf.yml
Last modified: '2021-09-10 12:58:27'
VDOs:
vdo1:
Acknowledgement threads: 1
Activate: enabled
Bio rotation interval: 64
Bio submission threads: 4
Block map cache size: 128M
Block map period: 16380
(...)
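Because the report is plain "key: value" text, a single setting can be pulled out with standard tools. The following is a minimal sketch, not a full YAML parser: the helper name status_field is hypothetical, and the sample input is a shortened copy of the report above with assumed indentation.

```shell
# Hypothetical helper: print the value of the first "key: value" line
# whose key matches $1 exactly. Reads the status report on stdin.
status_field() {
    awk -v key="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }      # strip leading indentation
        index(line, key ":") == 1 {
            sub(/^[^:]*:[ \t]*/, "", line)           # drop "key: " and keep the value
            print line
            exit
        }
    '
}

# Sample input shortened from the report above (indentation assumed)
printf '%s\n' \
    'VDOs:' \
    '  vdo1:' \
    '    Acknowledgement threads: 1' \
    '    Block map cache size: 128M' |
    status_field "Block map cache size"
```

On a live system the same helper could be fed from the real command, for example sudo vdo status --name=vdo1 | status_field "Block map cache size".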
To check the volume, we can use the vdostats command. Since VDO provides thin provisioning, this tool should also be used to determine how much free physical space is left on the underlying storage device:
[eurolinux@el84 ~]$ sudo vdostats --human-readable
Device Size Used Available Use% Space saving%
/dev/mapper/vdo1 5.0G 3.0G 2.0G 60% N/A
The output of the vdostats command displays the VDO volume device name (Device) along with statistics that indicate the total physical size (Size, or 1K-blocks when --human-readable is not used), the space in use (Used), the remaining space (Available), the percentage of total space in use (Use%), and the percentage of space saved (Space saving%).
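Since the file system cannot see how full the physical device is, the Use% column is worth checking regularly. The sketch below shows one way this could be done; the helper name check_vdo_usage and the threshold values are assumptions, and the sample input is copied from the vdostats output above rather than read from a live system.

```shell
# Hypothetical helper: report devices whose physical usage (Use%) is at or
# above the threshold given in $1. Reads vdostats output on stdin.
check_vdo_usage() {
    threshold=$1
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5                  # fifth column is Use%
            sub(/%$/, "", use)        # strip the trailing percent sign
            if (use + 0 >= limit + 0) print $1 " is " use "% full"
        }
    '
}

# Sample input copied from the vdostats output above
printf '%s\n' \
    'Device Size Used Available Use% Space saving%' \
    '/dev/mapper/vdo1 5.0G 3.0G 2.0G 60% N/A' |
    check_vdo_usage 50
```

On a real system the input would come from sudo vdostats --human-readable, and the check could be run periodically, for example from a cron job or a systemd timer.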
File System
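Next, we can format the VDO volume with the XFS file system. The -K option tells mkfs.xfs not to discard blocks while creating the file system, which speeds up formatting on a thinly provisioned device: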
[eurolinux@el84 ~]$ sudo mkfs.xfs -K /dev/mapper/vdo1
meta-data=/dev/mapper/vdo1 isize=512 agcount=4, agsize=3276800 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1
data = bsize=4096 blocks=13107200, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=6400, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
and mount the resource:
[eurolinux@el84 ~]$ sudo mount /dev/mapper/vdo1 /mnt && df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 627M 0 627M 0% /dev
tmpfs 657M 0 657M 0% /dev/shm
tmpfs 657M 9.3M 648M 2% /run
tmpfs 657M 0 657M 0% /sys/fs/cgroup
/dev/mapper/eurolinux-root 21G 4.7G 16G 23% /
/dev/vda1 1014M 246M 769M 25% /boot
tmpfs 132M 4.0K 132M 1% /run/user/1000
/dev/mapper/vdo1 50G 390M 50G 1% /mnt
At system startup, the vdo systemd unit automatically starts all VDO devices that are configured as active. The unit is installed and enabled by default when the vdo package is installed. If the system restarts after an unclean shutdown, VDO rebuilds its metadata to verify consistency and repairs it as needed.
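The mount shown earlier does not survive a reboot on its own. One possible way to make it persistent is an /etc/fstab entry that waits for the vdo unit, sketched below. The helper name vdo_fstab_line is hypothetical, and the x-systemd.requires=vdo.service mount option is given here as an assumption based on how such entries are commonly written for Enterprise Linux 8; verify it against your distribution's documentation before use.

```shell
# Hypothetical helper: print an /etc/fstab line for an XFS file system
# on a VDO volume, ordered after the vdo systemd unit (assumed option).
vdo_fstab_line() {
    volume=$1
    mount_point=$2
    printf '/dev/mapper/%s %s xfs defaults,x-systemd.requires=vdo.service 0 0\n' \
        "$volume" "$mount_point"
}

# Volume name and mount point from the example above
vdo_fstab_line vdo1 /mnt
```

The helper only prints the line; appending it to /etc/fstab is left as a deliberate manual step.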
Summary
VDO is designed to save disk space and reduce costs. Savings can be seen in both traditional data centres and cloud-based deployments. Depending on your needs, this can translate into lower costs per compute instance, lower costs of external block storage, and lower costs of long-term snapshot storage.
The degree of data reduction that can be observed using VDO will vary depending on the type of data stored. Compressed video or audio files will not take full advantage of this technology, but backups, virtual machines and container deployments will provide very tangible savings.