Comparing the OpenVMS ODS and the Linux ods5 file systems

ODS stands for On-Disk Structure and names a file system that was first used in RSX on PDPs. That was level 1. ODS-2 designates level 2, which is used in VMS, nowadays named OpenVMS, on all its platforms: VAX, Alpha and I64 (1). It is the standard file system for VMS.

ODS-5 (On-Disk Structure level 5) is the successor to ODS-2. It was introduced with the NT-Affinity program to store Windows NT file names in a VMS file system. ODS-2 restricts file names in several ways (length of the file name, restriction to uppercase characters, etc.); ODS-5 removes or relaxes these restrictions. It supports more valid characters in file names and presents itself as case preserving but case blind. Later, ODS-5 was enhanced for the VMS COE version to meet POSIX requirements: again more valid filename characters, now case sensitive, with support for POSIX hard links, soft links and timestamps. Internally ODS-5 is a "superset" of ODS-2.

The ods5 file system for Linux supports level 2 and all level 5 variants.

The ODS level 2 and 5 file system is not very different from other hierarchical file systems. The only big difference is how users look at it. VMS users usually look at it through the Record Management Services (RMS). Unix users usually don't know what records are. A Unix user who asks a VMS user how the file system works is confronted with two different worlds: RMS and ODS.

As on other file systems, ODS files are organized in directories. On the ODS disk there is a root directory, called the master file directory (MFD). The name does not mean that the master knows everything: it is just the entry point into the directory tree and lists the files and directories at its level.

Compared with a Unix file system, the main differences are the filename space and, on ODS-2, the time stamps and the lack of symbolic links.

Additionally, ODS directories maintain file versions. If you list a directory with the VMS DIRECTORY command, you will see the version as part of the file name: a number at the end of the file name, after a ';'. VMS users usually don't have to specify the version. If it is omitted, the latest version of the file, that is the one with the highest version number, is assumed automatically. Creating a file usually means creating a new version of that file, with the version number incremented by one.

The Linux ods5 file system does not assume the latest version of a file; it treats the version as part of the file name. In fact, the ODS file system itself stores the version number as part of the file name string in most of its data structures. Only the directory can hold a single file name string together with multiple version numbers.
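Because the version is just part of the name on Linux, a program that wants the VMS behaviour has to pick the highest version itself. The following is a minimal sketch of that selection, not part of the ods5 driver. The function names split_version and latest_versions are hypothetical, and the sketch assumes plain names of the form "NAME.TYPE;n" without ODS-5 escape characters and compares base names exactly (no case folding).

```python
def split_version(name):
    """Split 'FILE.TYPE;3' into ('FILE.TYPE', 3)."""
    base, sep, version = name.rpartition(";")
    if not sep or not version.isdigit():
        raise ValueError(f"no version number in {name!r}")
    return base, int(version)


def latest_versions(names):
    """Map each base name to the full name carrying its highest version."""
    best = {}
    for name in names:
        base, version = split_version(name)
        # Keep the entry only if it is the first or a higher version.
        if base not in best or version > best[base][0]:
            best[base] = (version, name)
    return {base: full for base, (_, full) in best.items()}


if __name__ == "__main__":
    listing = ["LOGIN.COM;1", "LOGIN.COM;12", "LOGIN.COM;3", "NOTES.TXT;2"]
    print(latest_versions(listing))
```

The versions are compared as integers, so "LOGIN.COM;12" is correctly preferred over "LOGIN.COM;3", which a plain string sort would get wrong.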

The directory structure on ODS is similar to Unix file systems. However, naming a file is different. An absolute path to a file (2) consists of a directory path, a file name, a file type and a file version number: "[DIRECTORY.SUBDIRECTORY]FILE.TYPE;1". A directory is a file with file type DIR and version number 1.

The current Linux ods5 file system maps such a path to a Unix path with all the full intermediate names: "DIRECTORY.DIR;1/SUBDIRECTORY.DIR;1/FILE.TYPE;1"
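The mapping can be sketched in a few lines of Python. This is only an illustration of the naming rule described here, not code taken from the driver. The function name vms_to_ods5_path is hypothetical, and the sketch assumes a simple "[A.B]NAME.TYPE;n" specification without a device prefix and without ODS-5 escape characters.

```python
def vms_to_ods5_path(spec):
    """Map '[DIR.SUB]FILE.TYPE;1' to 'DIR.DIR;1/SUB.DIR;1/FILE.TYPE;1'."""
    if not spec.startswith("[") or "]" not in spec:
        raise ValueError(f"expected '[directory]file' in {spec!r}")
    dirs, filename = spec[1:].split("]", 1)
    names = dirs.split(".")
    if not filename or any(not d for d in names):
        raise ValueError(f"empty path component in {spec!r}")
    # Every directory is a file of type DIR with version number 1.
    parts = [f"{d}.DIR;1" for d in names]
    parts.append(filename)
    return "/".join(parts)


if __name__ == "__main__":
    print(vms_to_ods5_path("[DIRECTORY.SUBDIRECTORY]FILE.TYPE;1"))
```

The resulting path is relative to the mount point of the ods5 file system.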

VMS uses an additional software layer between user programs and the file system: RMS, the Record Management Services. RMS understands a great variety of file structures, that is, record formats, including ISAM, which usually aren't available on standard Unix systems. The most often used record format is the variable length record, where each record is prefixed by two bytes specifying the length of the data within the record. For example, the VMS editor creates files in this format. RMS also knows a stream LF record format, which exactly matches Unix text files as created by Unix editors.
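To make the variable length record format concrete, here is a minimal sketch that decodes such data and rewrites it as stream LF. It goes beyond what is stated above in two assumptions: the two-byte length is taken to be little-endian, and a record with an odd length is taken to be followed by one pad byte so that the next length field starts on an even offset. The function names are hypothetical, and the caller is expected to pass only the bytes up to the end of file as recorded in the file attributes.

```python
import struct


def decode_variable_records(data):
    """Split raw file data into records using the 2-byte length prefix."""
    records = []
    pos = 0
    while pos + 2 <= len(data):
        # Assumption: length prefix is a little-endian 16-bit integer.
        (length,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if pos + length > len(data):
            raise ValueError("record runs past end of data")
        records.append(data[pos:pos + length])
        # Assumption: odd-length records are padded to an even boundary.
        pos += length + (length & 1)
    return records


def to_stream_lf(records):
    """Join records the way a Unix text file stores lines."""
    return b"".join(record + b"\n" for record in records)


if __name__ == "__main__":
    raw = b"\x05\x00HELLO\x00\x02\x00HI"
    print(to_stream_lf(decode_variable_records(raw)))
```

A tool like this would be run on the Linux side after copying the raw file data, since the file system itself does not interpret records.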

Some programmers don't see RMS at all, because they use the run-time environment of the programming language and the language constructs to manipulate files. In C they usually use fopen or open, which call SYS$OPEN (3), the RMS service to open a file. Although C doesn't know about records, the run-time environment helps to read and write text files which were created with the VMS editor.

Programmers can call RMS services directly: SYS$OPEN, SYS$GET, and SYS$CLOSE. Again, this is not the file system. RMS calls SYS$QIO with function codes to access, read and deaccess the file. These QIO (Queued I/O) services are the interface to the file system. At that level, similar to Unix systems, there is no record structure, only disk blocks. However, the file system provides some data cells in which RMS can save the file's internal structure on the disk: the file attributes. Although this looks like mixing file system data with RMS data, it isn't a bad thing. It should be seen as additional information that RMS or the application can retrieve about a file without opening the file itself and reading any of its data.

The Linux ods5 file system supplies an ioctl function to retrieve RMS attributes.

Looking at C functions like open, it seems that they could be mapped directly to a SYS$QIO to access the file. This would work in a C-only or Unix-type environment within VMS, but it would fail to interact with standard VMS tools or applications.

A Linux file system for ODS can only implement the Unix open and related calls, that is, the equivalent of the QIO interface. It cannot integrate the RMS features. A cp from or to an ods5 file system will only copy the file data, no matter what the RMS structure was or should be. As with other Unix files, the application is expected to know the internal (record) structure. Any file on an ods5 file system which has the Linux-compatible RMS stream LF record format can be interpreted immediately as if it were a Linux file. Such a text file appears as a text file created by a Unix editor. All other files need to be understood by an application.

Notes:

1) VAX is the processor architecture and the VAX systems have different implementations of that architecture. Alpha started as EVAX and was AXP in between. Alpha is the processor architecture and there were chips implementing it, starting with EV4 or 21064. I64 systems are Integrity Servers based on Intel's 64-bit processors, the Itanium ones. Itanium is the processor architecture, which is sometimes called IA64 or IPF.

2) VMS has a multiple root file system. The roots are the disk devices. They are part of a fully specified filename and are a prefix to the absolute path to a file as explained above: "DEVICE:[DIRECTORY.SUBDIRECTORY]FILE.TYPE;1". With DECnet and several nodes in a network, the roots are the nodes. They are part of the fully specified filename and are a prefix to the device: "NODE::DEVICE:[DIRECTORY.SUBDIRECTORY]FILE.TYPE;1".

3) The RMS functions have uppercase names. However, almost all compilers in VMS uppercase external names by default, so programmers got used to writing lowercase names, even in languages which are case sensitive. You may therefore see the RMS open function referenced as "sys$open".
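As an illustration of the fully specified filename in note 2, the following sketch splits such a string into its parts with a regular expression. It is a simplified reading of the syntax shown there, not a complete parser for VMS file specifications: it assumes plain names without ODS-5 escape characters, and the names FILE_SPEC and parse_file_spec are hypothetical.

```python
import re

# Optional NODE::, optional DEVICE:, optional [DIR.SUB], then NAME.TYPE;VERSION.
FILE_SPEC = re.compile(
    r"^(?:(?P<node>[^:\[\]]+)::)?"
    r"(?:(?P<device>[^:\[\]]+):)?"
    r"(?:\[(?P<directory>[^\]]*)\])?"
    r"(?P<name>[^.;:\[\]]*)"
    r"(?:\.(?P<type>[^.;]*))?"
    r"(?:;(?P<version>\d+))?$"
)


def parse_file_spec(spec):
    """Return the parts of a file specification as a dictionary."""
    match = FILE_SPEC.match(spec)
    if match is None:
        raise ValueError(f"cannot parse {spec!r}")
    parts = match.groupdict()
    # Directory levels are separated by dots inside the brackets.
    if parts["directory"] is not None:
        parts["directory"] = parts["directory"].split(".")
    if parts["version"] is not None:
        parts["version"] = int(parts["version"])
    return parts


if __name__ == "__main__":
    print(parse_file_spec("NODE::DEVICE:[DIRECTORY.SUBDIRECTORY]FILE.TYPE;1"))
```

Parts that are missing from the input, such as the node or the device, are returned as None.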