The software development manager of Western Digital has proposed a new file system, Zonefs, on the Linux kernel developer mailing list, with the goal of simplifying low-level work with zoned storage devices. Zonefs associates each zone on the drive with a separate file that can be used to store data in raw mode without manipulating sectors and blocks directly.
Zonefs is not a POSIX-compliant file system and is deliberately limited in scope: it allows applications to use the file API instead of accessing a block device directly with ioctl calls. Files associated with zones require sequential writes that start at the end of the file (append-mode writing).
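As a rough illustration of what "using the file API" means here, the following Python sketch appends a record at the current end of a file and reports the offset at which it landed. The function name `append_to_zone_file` is hypothetical, and the sketch treats a zone file like any pre-existing regular file opened in append mode; a real Zonefs mount may impose additional requirements (for example on write size and alignment) that are not modeled here.

```python
import os


def append_to_zone_file(path: str, payload: bytes) -> int:
    """Append payload at the end of an existing file; return the start offset.

    Hypothetical helper: the file is assumed to exist already, as a zone
    file would, so O_CREAT is deliberately not used.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        # The current file size is where the next sequential write goes.
        offset = os.fstat(fd).st_size
        written = os.write(fd, payload)
        if written != len(payload):
            raise OSError("short write while appending")
        return offset
    finally:
        os.close(fd)
```

The caller never computes sector numbers or issues ioctl calls; it only needs a path and the data to append.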
The files provided by Zonefs can be used to place databases that use log-structured merge (LSM) trees on top of zoned drives, building on the concept of one file per storage zone.
For example, such structures are used in the RocksDB and LevelDB databases. The proposed approach reduces the cost of porting code that was originally designed to work with files rather than block devices, and makes low-level work with zoned drives possible from applications written in programming languages other than C.
Zoned drives are HDDs or NVMe SSDs whose storage space is divided into zones, groups of blocks or sectors to which data can only be added sequentially and which can only be updated by rewriting the whole group of blocks.
For example, zoned recording is used on devices with Shingled Magnetic Recording (SMR), in which the track width is smaller than the width of the magnetic head and each track is written with a partial overlap of the neighboring track, so any overwrite makes it necessary to rewrite the entire group of tracks.
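The write constraint described above can be sketched as a small in-memory model. This is only a toy under stated assumptions: the class `Zone`, its `write_pointer` property, and the `overwrite` helper are hypothetical names, and sizes are counted in bytes rather than device blocks.

```python
class Zone:
    """Toy model of one zone: data may only be written at the write pointer."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.data = bytearray()

    @property
    def write_pointer(self) -> int:
        # Everything before this offset has been written.
        return len(self.data)

    def write(self, offset: int, payload: bytes) -> None:
        if offset != self.write_pointer:
            raise ValueError("non-sequential write rejected")
        if offset + len(payload) > self.capacity:
            raise ValueError("zone is full")
        self.data += payload

    def reset(self) -> None:
        # Discard the zone contents and rewind the write pointer to 0.
        self.data.clear()

    def overwrite(self, offset: int, payload: bytes) -> None:
        """Change already written bytes by rewriting the whole zone."""
        if offset + len(payload) > self.write_pointer:
            raise ValueError("range not written yet")
        image = bytearray(self.data)                    # read the full zone
        image[offset:offset + len(payload)] = payload   # patch the copy
        self.reset()                                    # rewind the zone
        self.write(0, bytes(image))                     # rewrite sequentially
```

Changing two bytes in the middle of a zone therefore costs a full read, reset, and rewrite, which is why applications on zoned drives prefer append-only data layouts.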
Western Digital's Damien Le Moal describes Zonefs as follows:
Zonefs is not a POSIX-compliant file system. Its goal is to simplify the implementation of zoned block device support in applications by replacing raw block device file accesses with a richer file-based API, avoiding reliance on direct block device file ioctls, which may be more obscure to developers.
One example of this approach is the implementation of LSM tree structures on zoned block devices, allowing SSTables to be stored in a zone file, much as in a regular file system, rather than as a range of sectors of a zoned device.
As for SSDs, they are inherently tied to sequential write operations preceded by erasing data, but these operations are hidden at the level of the controller and the FTL (Flash Translation Layer). To increase efficiency under certain types of load, NVMe has standardized the ZNS (Zoned Namespaces) interface, which allows direct access to zones, bypassing the FTL.
For zoned hard drives, Linux has offered ZBC (SCSI) and ZAC (ATA) zoned block device support since kernel 4.10, and version 4.13 added the dm-zoned module, which presents a zoned drive as a regular block device and hides its write restrictions.
At the file system level, zoning support has already been integrated into F2FS, and a set of patches is being developed for Btrfs, whose adaptation to zoned drives is simplified by its copy-on-write (CoW) mode of operation. Ext4 and XFS can be run on zoned drives by using dm-zoned.
To simplify adapting file systems, the ZBD interface is proposed, which translates random writes to files into streams of sequential write operations.
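The general idea of turning random writes into sequential ones can be sketched with a generic log-structured remapping. This is not the ZBD interface itself, whose details the article does not give; the class `SequentialLog` and its methods are hypothetical names chosen for illustration.

```python
class SequentialLog:
    """Toy translator: random block writes become appends to a sequential log."""

    def __init__(self) -> None:
        self.log = []    # append-only list of payloads (the sequential stream)
        self.index = {}  # logical block number -> position of latest copy in log

    def write_block(self, block_no: int, payload: bytes) -> None:
        # Never update in place: append the new version and repoint the index.
        self.log.append(payload)
        self.index[block_no] = len(self.log) - 1

    def read_block(self, block_no: int) -> bytes:
        # Follow the index to the most recent version of the block.
        return self.log[self.index[block_no]]
```

Rewriting a block leaves its old copy behind in the log, so a real translation layer also needs some form of garbage collection to reclaim stale entries, which this sketch omits.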