Exploring Flat Binary Formats: A Journey into COM Support for Linux

Have you ever wanted to create your own private binary format for Linux? Developers chasing the smallest, most efficient binary for a project often find the Executable and Linkable Format (ELF) cumbersome. That frustration led Brian Raiter to dig into the intricacies of binary formats, and he concluded that flat binary formats are the best way to get sleek, streamlined binaries.
Flat binary formats such as COM are well known from the MS-DOS days, and the concept goes back even further, to CP/M. The term 'flat' means that the entire binary file is loaded directly into RAM, with no headers or other preamble to process first, which makes it appealing to developers who want minimal overhead.
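To make "loaded directly into RAM" concrete, here is a minimal userspace sketch in C. It assumes the classic DOS convention of placing a COM image at offset 0x100 of its memory block, with the 256 bytes below it reserved for the Program Segment Prefix. The helper name load_flat_image is hypothetical and is not taken from Brian's module; it only illustrates that loading a flat binary amounts to a bounds check and a copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset at which DOS places a COM image inside its memory block.
 * The 256 bytes below it hold the Program Segment Prefix. */
#define COM_LOAD_OFFSET 0x100u

/* Hypothetical helper: copy a flat image verbatim into `mem`, starting at
 * COM_LOAD_OFFSET. There is no header to parse and nothing to relocate.
 * Returns the entry offset (the first byte of the image), or -1 if the
 * image does not fit in the block. */
long load_flat_image(unsigned char *mem, size_t mem_size,
                     const unsigned char *image, size_t image_len)
{
    assert(mem != NULL && image != NULL);

    if (mem_size < COM_LOAD_OFFSET || image_len > mem_size - COM_LOAD_OFFSET)
        return -1;

    memcpy(mem + COM_LOAD_OFFSET, image, image_len);
    return (long)COM_LOAD_OFFSET;
}
```

A caller would hand this function a zeroed block and the raw file contents, then begin execution at the returned offset. An ELF loader, by contrast, has to read program headers and map each segment separately before it can do the same.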
Linux does not currently support this binary format, but the good news is that adding it is a fine way to learn how to write kernel modules. In a recent article, Brian walks readers through setting up a kernel module development environment and then implementing a custom binary file format that brings COM support to the Linux kernel. Much of this will be familiar to anyone who has looked at how the Linux kernel handles the shebang (#!) and miscellaneous (binfmt_misc) formats.
On Windows, the kernel identifies a COM file by its extension, then gives it 640 KB of memory and an interrupt table to work with. The kernel module does something similar, although it takes a fair amount of code to get there.
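The recognition step can be sketched in userspace C. The Linux kernel tries its registered binary format handlers in turn, and a handler that does not recognize a file returns -ENOEXEC so the next one gets a chance. The sketch below imitates that convention with three hypothetical probe functions; the names, the fixed ordering, and the lowercase ".com" suffix check are assumptions made for illustration, not code from Brian's module. Because a COM file has no magic number, the last probe can only go by the file name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a format handler: returns 0 if it accepts the
 * file, or -ENOEXEC so that the next handler can try. */
typedef int (*format_probe)(const char *path,
                            const unsigned char *head, size_t head_len);

/* Scripts are recognized by the two-byte "#!" prefix. */
static int probe_script(const char *path,
                        const unsigned char *head, size_t head_len)
{
    (void)path;
    return (head_len >= 2 && head[0] == '#' && head[1] == '!') ? 0 : -ENOEXEC;
}

/* ELF files are recognized by the four-byte magic 0x7f 'E' 'L' 'F'. */
static int probe_elf(const char *path,
                     const unsigned char *head, size_t head_len)
{
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    (void)path;
    return (head_len >= 4 && memcmp(head, magic, 4) == 0) ? 0 : -ENOEXEC;
}

/* A flat COM file has no magic number, so this sketch uses the file name. */
static int probe_com(const char *path,
                     const unsigned char *head, size_t head_len)
{
    size_t n;
    assert(path != NULL);
    (void)head;
    (void)head_len;
    n = strlen(path);
    return (n > 4 && strcmp(path + n - 4, ".com") == 0) ? 0 : -ENOEXEC;
}

/* Try each probe in order. Returns the index of the first probe that
 * accepts the file, or -ENOEXEC if none does. */
int pick_handler(const char *path,
                 const unsigned char *head, size_t head_len)
{
    static const format_probe probes[] = {probe_script, probe_elf, probe_com};
    size_t i;

    for (i = 0; i < sizeof probes / sizeof probes[0]; i++) {
        if (probes[i](path, head, head_len) == 0)
            return (int)i;
    }
    return -ENOEXEC;
}
```

Placing the name-based probe last means files with a recognizable signature are claimed by their own handlers first, and only otherwise unidentified files fall through to the extension check.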
The exploration does not stop there: the COM format has also been extended into a new format named ♚ (Unicode U+265A), on the grounds that all those Unicode glyphs ought to be used for something. The new format adds features such as automatically terminating after execution, much as a crash would.
By the end, developers will not only know how to write kernel modules and add new binary file formats to Linux, they will also have learned to appreciate the Unicode glyph space and to free themselves from the constraints of plain ASCII.
In summary, the journey into flat binary formats and COM support shows how efficiency and creativity can meet in a single Linux kernel module.
Top image: Illustration of Brian Raiter surveying the fruits of his labor by Bomberanian.