CMS (Code Management System) is a non-distributed Version Control System
(VCS) developed and maintained at DEC/Compaq/HP/VSI as part of the DECset
collection of tools. It runs on OpenVMS VAX, Alpha, I64 and x86. A VCS is
also known as revision control or source code management system. To name a
few of the currently popular ones: CVS (concurrent versions system), SVN
("Subversion"), Mercurial and Git.
The base objects in a CMS library are elements. An element consists of all
versions of a (source) file. As there are file versions on OpenVMS, here the
element versions are named generations. A generation reflects a status of
development in the (source) file. For the main line of development the
generations are numbers, starting with 1 and increasing. For a side line the
generation is called a variant. A variant's generation is specified by the
main line number, from which it is derived, and a single letter, specified
at reservation time, plus a number, again starting with 1, automatically
assigned at checkin/replacement time. For example, an element FOO.C in a
library may have generations 1, 2 and 3. When generation 3 exists and
generation 2 needs a change, one creates a variant of generation 2, which
opens a side line. For example, it may be reserved as a variant with the letter T.
When the modified file is put back into CMS with the replace command, CMS
creates a variant of FOO.C as generation 2T1.
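The naming scheme described above can be sketched in a few lines of Python. This is only an illustration and not part of the import script: the names `Generation` and `parse_generation` are made up here, and the sketch assumes a single variant level (main line number, one letter, one number) as described in the text.

```python
import re
from typing import NamedTuple, Optional


class Generation(NamedTuple):
    main: int                # main line generation number
    letter: Optional[str]    # variant letter, None on the main line
    number: Optional[int]    # variant number, None on the main line


# Assumption: digits, optionally followed by one letter and more digits.
_GEN_RE = re.compile(r"^(\d+)(?:([A-Z])(\d+))?$")


def parse_generation(text: str) -> Generation:
    """Split a generation such as '3' or '2T1' into its parts."""
    match = _GEN_RE.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a generation: {text!r}")
    main, letter, number = match.groups()
    return Generation(int(main), letter, int(number) if number else None)


main_line = parse_generation("3")
variant = parse_generation("2T1")
print(main_line.main, main_line.letter)                # main line: no letter
print(variant.main, variant.letter, variant.number)    # derived from generation 2
```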
With variant letters ranging only from 'A' to 'Z', it is very likely that
big projects reuse them. That is, a variant A of one element may not have
any relationship to variant A of another element. Also, from looking at CMS
libraries, it seems that some projects had no guidelines on how to use
variant letters: the same letter may be used for different purposes.
The other important object in a CMS library is a class. It describes a
development or project status of the software. A class is defined as a set
of particular generations of elements. Not all elements of the library need
to be in a class, and a class can contain at most one generation of an
element; that generation may be a variant.
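As a small model of this rule, a class can be pictured as a mapping from element name to exactly one generation. The sketch below is a hypothetical Python illustration, not CMS behaviour in detail: the class name `CmsClass`, its methods and the class name DEMO are all invented for this example.

```python
class CmsClass:
    """A named set of element generations, at most one per element."""

    def __init__(self, name):
        self.name = name
        self.members = {}    # element name -> generation, e.g. "2T1"

    def insert(self, element, generation):
        """Add an element; refuse a second generation of the same element."""
        if element in self.members:
            raise ValueError(
                f"{element} is already in class {self.name} "
                f"as generation {self.members[element]}"
            )
        self.members[element] = generation

    def remove(self, element):
        """Take an element out of the class again."""
        del self.members[element]


demo = CmsClass("DEMO")
demo.insert("FOO.C", "2T1")      # a variant generation is allowed
try:
    demo.insert("FOO.C", "3")    # a second generation of FOO.C is not
except ValueError as error:
    print(error)
```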
To compare with Git, the base objects are files as well. However the other
important "object" is the current collection of the files known to Git, which
describes a development or project status of the software. This "object"
essentially is a snapshot of all files known to Git. Files do not have a
version or generation identifier attached. A version of a file is defined by
the snapshot to which it belongs. So the main development line is a sequence
of snapshots. A side line is a branch, which can be created from any
snapshot and can have a user defined, descriptive name. A branch again is a
sequence of snapshots. In CMS terms, it very likely contains at least one
variant. Git snapshots can be tagged, simply said they can have names. All
tags are - more or less - a subset of all snapshots. A tag, that is a named
snapshot, can be compared to a CMS class. The big difference is that a CMS
class can be defined at any time, independent of replacing CMS elements.
Replacing can be compared to committing files in Git, which creates a
snapshot. But as defining the members of a class is not a single CMS
command, and as the generation of an element that is a member of a class can
be changed at any time as well, mapping a CMS class to a Git tag/snapshot is
not straightforward.
The following example shows a possible scheme for importing a CMS library.
The CMS library view:

    BAR.C(1)     FOO.C(1)     MAIN.C(1)
     |            |            |
    -&------------&------------&--------- Class V1.0
     |\           |            |
    BAR.C(2)      |           MAIN.C(2)
     |  |         |            |
    BAR.C(3)      |            |
     |  |         |            |
    -&------------&------------&--------- Class V2.0
     |  |         |            |
     | BAR.C(1A1) |            |
     |  |         |            |
    ----&---------&------------&--------- Class ECO 1.1
     |  |         |            |
     | /          |            |
    BAR.C(4)      |            |
     |            |            |
    -&------------&------------&--------- Class V2.1
     |            |            |
    BAR.C(5)     FOO.C(2)      |
     |            |            |

Obviously, the BAR.C(1A1) was created after BAR.C(2), with the merge being
done into BAR.C(4). The diagram indicates that variant 1A1 was created after
Class V2.0 was created and its members were defined.
Ideally, classes should be converted to tags, variants should be in branches
and classes with variants should be tagged in branches as well.
That is, the result of an ideal import should look like

    master
    |
    BAR.C(1) FOO.C(1) MAIN.C(1)
    |
    + <--- tag V1.0
    |`---------------------------+ <--- branch ECO 1.1
    |                            |
    BAR.C(2) MAIN.C(2) FOO.C(1)  |
    |                            BAR.C(1A1) FOO.C(1) MAIN.C(1)
    BAR.C(3) MAIN.C(2) FOO.C(1)  |
    |                            + <--- tag ECO 1.1
    + <--- tag V2.0
    |
    BAR.C(4) MAIN.C(2) FOO.C(1)
    |
    + <--- tag V2.1
    |
    BAR.C(5) MAIN.C(2) FOO.C(2)
    |

However, that is impossible due to the design of CMS, because an
    INSERT GENERATION <element> <class>
can insert any generation of an element into a class.
There is no guarantee that the previous generations of all the elements
that make up a class were ever together in one Git snapshot. Such a snapshot
would have to be the base for the new branch of the CMS class.
In this small, artificial example such a snapshot exists, but things can
be, and probably are, different in the general case.
Also, to find and use a specific version of an element in Git in order to "add it to a class", each version has to be tagged; that is, each element version has to go into its own commit.
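The snapshot condition can be checked mechanically. The following Python sketch is an illustration only and not taken from the import script: it models the main line of the example library as a list of snapshots (dictionaries from element name to main line generation number) and looks for one that could serve as the base of a class branch. The function names and the class `mixed` are hypothetical.

```python
import re


def main_line_number(generation):
    """Return the main line number a generation starts with ('1A1' -> 1)."""
    return int(re.match(r"\d+", generation).group())


def find_base_snapshot(snapshots, class_members):
    """Return the index of the first snapshot that holds the main line
    generation of every class member, or None if there is no such snapshot."""
    wanted = {name: main_line_number(gen) for name, gen in class_members.items()}
    for index, snapshot in enumerate(snapshots):
        if all(snapshot.get(name) == number for name, number in wanted.items()):
            return index
    return None


# Main line snapshots of the example library, oldest first.
snapshots = [
    {"BAR.C": 1, "FOO.C": 1, "MAIN.C": 1},
    {"BAR.C": 2, "FOO.C": 1, "MAIN.C": 2},
    {"BAR.C": 3, "FOO.C": 1, "MAIN.C": 2},
    {"BAR.C": 4, "FOO.C": 1, "MAIN.C": 2},
    {"BAR.C": 5, "FOO.C": 2, "MAIN.C": 2},
]

eco_1_1 = {"BAR.C": "1A1", "FOO.C": "1", "MAIN.C": "1"}
mixed = {"BAR.C": "5", "FOO.C": "1"}    # hypothetical class, not in the example

print(find_base_snapshot(snapshots, eco_1_1))   # a base exists: the first snapshot
print(find_base_snapshot(snapshots, mixed))     # these two were never together
```

For class ECO 1.1 the first snapshot qualifies, which is why the ideal import works in the small example. A class like `mixed`, which CMS would accept without complaint, has no base snapshot at all.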
The current approach is to ignore variants when importing the main line and to import a CMS class into a branch that is based on the root of the repository. This reflects the common practice that variants are inserted only into classes. It also allows future changes, and therefore Git commits, in existing classes, that is, in Git branches. This approach may lose older variant generations of an element if more than one of them was inserted into an existing class. Consider a variant that implements a fix and is inserted into a class. If the fix doesn't work, another fix with a new variant generation is inserted into the same class. The only way to ensure that both variant generations are in the Git branch is to commit each element after it is retrieved.
So at the moment the CMS library shown in the first diagram is imported
into Git as follows (again, showing the CMS generations in parentheses
so that the CMS elements can be identified in the diagram):

    master
    |`--------------------------------------------------
    |           ` V1.0      ` V2.0      ` ECO 1.1   ` V2.1
    |           |           |           |           |
    BAR.C(1)    BAR.C(1)    BAR.C(3)    BAR.C(1A1)  BAR.C(4)
    FOO.C(1)    FOO.C(1)    FOO.C(1)    FOO.C(1)    FOO.C(1)
    MAIN.C(1)   MAIN.C(1)   MAIN.C(2)   MAIN.C(1)   MAIN.C(2)
    BAR.C(2)
    MAIN.C(2)
    BAR.C(3)
    BAR.C(4)
    BAR.C(5)
    FOO.C(2)

There is a client, which runs on the system with Git, and a server, which runs on the system with CMS. The client is a perl script that uses a Git perl module. The server is an HTTP server that understands CMS commands and simple VMS file commands. The perl script sends CMS commands to the HTTP server. The server executes the CMS commands in its environment, which creates VMS files. The client fetches the files and requests their deletion. The client then adds the fetched files to the Git repository.
This enables importing of
The third import option can only be combined with one of the other options.
Options two and three will not work for all CMS libraries: CMS accepts commands that rename (or even delete) an element, that is, the file(s) on disk, and then it is impossible to find the needed elements.

    USAGE: ./git-cmsimport.pl [OPTION]... LIBRARY URL

    Retrieve the latest generation of all elements from the CMS LIBRARY
    (in VMS syntax) located by the URL. Create a Git repository and add
    the retrieved CMS elements to the Git master.

    IMPORT MODE OPTIONS
      -C CLASS       Import all the elements of the CMS class CLASS, which
                     are added to a Git branch CLASS starting at the root
                     of the Git repository. If a Git repository for this
                     CMS library already exists and the CLASS is not
                     already in the repository, it is added.
      -h             Import the full main line of the CMS LIBRARY. Creates
                     a Git repository with the elements in the master.

    OPTIONS
      -f FILE        Use the retrieval information in FILE to retrieve the
                     CMS elements; without an IMPORT MODE OPTION or with
                     -c CLASS a list of elements is expected, with the
                     IMPORT MODE OPTION -h a history is expected; it's an
                     error if the expected header is not contained in the
                     FILE.
      -F             Do not import, only save the retrieval information
                     into a file; without an IMPORT MODE OPTION or with
                     -c CLASS a list of elements is saved into
                     ./cms-elements.txt, with the IMPORT MODE OPTION -h the
                     history is saved into ./cms-history.txt. The history
                     is filtered for CREATE, COPY and REPLACE commands.
      -k             Keep the files on the server side: the client does not
                     send a delete request and the server does not delete
                     any fetched file. This can speed up importing the
                     files into Git but requires manual cleanup at the
                     server side.
      -l             Locally lowercase all VMS names: library, user and
                     elements.
      -r REPOSITORY  Name of the to be created repository; default is the
                     last subdirectory in the specified LIBRARY argument.
      -T             Commit every element imported and tag each with the
                     CMS file and generation. This can be useful, to map
                     CMS objects to Git objects and vice versa. In case of
                     importing a CLASS, its name is prepended to the tag.
      -t OFFSET      4 digit time zone offset from UTC (rfc2822)
      -v LEVEL       Verbose, log original CMS commands, ...

Examples:

    ./git-cmsimport.pl -F -h [.cmsdemo] http://eisner.encompasserve.org:8080
    ./git-cmsimport.pl -f cms-history.txt -h -l -t -0600 [.cmsdemo] http://eisner.encompasserve.org:8080

The http server should be started from an empty directory. To fetch CMS
elements from the CMS library, the server creates files in this
(default) directory. After transmission of the content to the perl script,
the file is deleted. Starting from an empty directory makes it easier to
clean up in case of errors.
CMS remarks will be used as Git commit messages. The remark
is used as it was formatted by CMS. That is, it may consist of several lines.
As already indicated, the CMS commands (visible in a CMS SHOW HISTORY)
    DELETE ELEMENT name
    MODIFY ELEMENT oldname newname
create some problems.
DELETE deletes the whole element from the CMS library. This means that there
was a "CREATE ELEMENT name" in the history, maybe followed by some "REPLACE
name" entries, and finally this "DELETE ELEMENT name". With the element gone
and all traces removed, the element can't be found and can't be imported at
the time the perl script sees and processes the "CREATE ELEMENT name" entry
(or any of the subsequent REPLACE entries) in the CMS history. The perl
script doesn't look ahead to learn about the deletion. So the CMS command is
processed, but the script will very likely abort with an error message.
However, it may even import something if the deleted element was re-created
later with the very same name. This is bad, very bad!
The same is true for a "MODIFY ELEMENT oldname newname", which essentially
renames an element. Again, anything in the history referencing "oldname"
cannot be imported by this script. And again, if "oldname" is reused, the
trouble doubles, or worse.
What can be done here is to save the history with -F, check the history
entries for such commands and preprocess the file. That is, remove all
entries of a deleted element, but only those prior to the actual deletion.
And rename all entries of a renamed element, again only those prior to the
actual rename. Then an import with -f should work.
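The preprocessing can be sketched as a single backward pass over the history. The Python sketch below is an illustration under loud assumptions: it does not read the real ./cms-history.txt format, which is not described here, but works on a simplified, hypothetical list of tuples such as ("CREATE", name), ("REPLACE", name), ("DELETE", name) and ("MODIFY", oldname, newname). It also drops the DELETE and MODIFY entries themselves, which is a choice made for this example.

```python
def preprocess(history):
    """Drop entries of elements that are deleted later and rename entries
    of elements that are renamed later. Walks the history backwards so
    that a reused name only affects its own incarnation."""
    pending = {}    # name -> later name, or None if the element is deleted later
    kept = []
    for entry in reversed(history):
        command, name = entry[0], entry[1]
        if command == "DELETE":
            pending[name] = None
        elif command == "MODIFY":
            new_name = entry[2]
            # Earlier entries for the old name follow the fate of the new name.
            pending[name] = pending.pop(new_name, new_name)
        else:
            target = pending.get(name, name)
            if target is not None:
                kept.append((command, target))
            if command == "CREATE":
                # Anything older with this name is a different element.
                pending.pop(name, None)
    kept.reverse()
    return kept


history = [
    ("CREATE", "OLD.C"),
    ("REPLACE", "OLD.C"),
    ("MODIFY", "OLD.C", "NEW.C"),
    ("REPLACE", "NEW.C"),
    ("CREATE", "TMP.C"),
    ("DELETE", "TMP.C"),
    ("CREATE", "TMP.C"),    # same name, re-created after the deletion
]

for entry in preprocess(history):
    print(entry)
```

In this example the entries for OLD.C come out under the name NEW.C, the first TMP.C disappears, and the re-created TMP.C is kept.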
By the way, deleting a class is OK, and it is "imported". In Git, the
associated branch is gone, but with empty commits, there are traces of
the class.