SRFI 176


Version flag


Lassi Kortela


This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to the SRFI's mailing list. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.


This SRFI defines a standard command-line flag to get version information from a Scheme implementation. The output is line-oriented S-expressions, which are easy to parse from Scheme, C, and shell scripts and can co-exist with non-S-expression output. A standard vocabulary is defined; extensions are easy to make.


The implementation of this SRFI boils down to a few write calls. But careful planning has gone into the details of those calls.


There is a long tradition of complex command line programs having a version flag. This flag skips the normal operation of the program; instead, it writes version information to standard output and immediately exits. The flag is useful for:

Part 1: Which flag

Survey of existing version flags

Existing Scheme implementations with an upper-case -V version flag:

Existing Scheme implementations with a conflicting use for upper-case -V:

Existing Scheme implementations with a lower-case -v version flag:

Existing Scheme implementations with a conflicting use for lower-case -v:

Other version flags in existing Scheme implementations:

Additionally, a few Schemes have flags to change the version by loading another version of the same Scheme. Changing the version is not addressed by this SRFI.

Which flag to choose

From the above survey we note that the most popular flags are lower-case -v, as well as --version with two dashes. Upper-case -V and -version with one dash are also each supported by several implementations.

The first problem is that long options (-version and --version) do not have a universally agreed-upon syntax. Perhaps most programs now adhere to the GNU-style two-dash syntax. However, many prominent ones such as C compilers and fundamental X Window System utilities continue to use the one-dash syntax. There are many Schemes supporting only one of those variants, and many that do not have long options at all.

One-letter options are much more standardized; so much so, in fact, that almost all Unix programs have supported them ever since Unix was first published. While programs disagree on whether to interpret a group of letters after one dash as a single one-word option or as multiple one-letter options, there is no ambiguity when only one letter follows the dash. The convention is especially strong when the flag is the only argument after the program name.

So we are left with upper-case -V and lower-case -v as candidates. Lower-case is more popular among Schemes, but upper-case is a stronger standard among Unix programs in general. Upper-case is also much less ambiguous. Lower-case -v is very often used as the "verbose" flag to make a program’s output more detailed. Some Scheme implementations also adopt that usage. By contrast, upper-case -V does not have any other standard meaning besides version information, and has no known conflicting uses among Scheme implementations.

Finally, The Art of Unix Programming, Chapter 10 has this to say about the upper-case -V flag:

-V: Version (without argument). Display program’s version on standard output and exit (often also prints compiled-in configuration details as well). Examples: gcc(1), flex(1), hostname(1), many others. It would be quite surprising for this switch to be used in any other way.

Hence this SRFI uses the upper-case -V flag.

Other flags (e.g. lower-case -v, the word -version with one dash, and/or the word --version with two dashes) may be implemented as well, but are not required.

Parser-friendly output format

Many Scheme/Lisp implementations and other Unix tools output version information in a format that is quite stable. The idea is that the information can be parsed by other programs and scripts. Often the output format is almost regular but not quite. Some of the more complex formats, while stable, are not self-consistent since they evolved over time from an ad hoc syntax; outside of Scheme, clisp --version and gcc -v are good examples.

This SRFI mandates a very simple S-expression syntax that is a subset of Scheme’s standard syntax. Implementations can easily write out the information using the standard write procedure as long as the expressions given to write are suitably constrained. Version output is naturally represented as an association list of properties and their values. Each association shall be written as a separate S-expression; the full list is implicit. The precise output format is slightly unconventional and is thoroughly presented in the next part.

Part 2: Line-oriented S-expressions (LOSE)


The Lisp tradition is gifted with the easy handling of nested data. Most languages are otherwise.

In particular, Unix shell scripts generally parse their input using the traditional tools awk, grep, and sed that are based around regular expressions. Regexps are notoriously unable to handle nested structure. Perhaps for that reason, Unix shells also make it difficult to store nested data in variables. Even string list handling is clumsy and error-prone.

The other classic Unix programming environment, the C language, makes it easy to represent nested data. But handling it is difficult due to the lack of standard tools. Almost all of the data processing functions in the standard library are string functions. Thus we come back to the same situation as with shell scripts: only strings are easy to process from standard C.


The following S-expression:

(version "1.2.3")

can be parsed easily even with ancient versions of grep and sed:

grep '^(version ".*".*)$'      \
    | sed -e 's/^(version "//' \
          -e 's/".*//'

The sed command s/whatever// replaces whatever with the empty string. whatever is a regular expression.

An equivalent parser can be written in awk. It’s a bit more verbose but avoids using a shell pipeline:

awk -F '[()]' \
    '/^\(version .*\)$/ {
        sub(/^[a-z-]+ /, "", $2);
        gsub(/"/, "", $2);
        print $2
    }'

Again, the Awk command sub(/whatever/, "", $variable) replaces the regular expression whatever with the empty string. gsub replaces all occurrences on the line whereas sub replaces only the first. Awk splits each line into numbered input fields at the separator [()].

These commands should work with the POSIX standard versions of these tools. They have been tested with the bare-bones versions in BusyBox, which gives confidence that they are quite conservative.
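The pipeline above hard-codes the property name. As a minimal sketch, the same grep and sed steps can be wrapped in a function that takes the name as an argument. The function name lose_string is hypothetical, and the sketch assumes the property holds exactly one string with no escapes and a name free of regexp metacharacters.

```shell
#!/bin/sh
# lose_string NAME: read LOSE text on stdin and print the value of the
# first line of the form (NAME "value"). Hypothetical helper, not part
# of this SRFI.
lose_string() {
    grep "^($1 \".*\")\$" \
        | sed -e "s/^($1 \"//" -e 's/".*//' \
        | sed -n '1p'
}

# Example input, inlined so the sketch runs on its own.
sample='Fantastic Scheme version 2.95
(command "fantastic-scheme")
(version "2.96_pre1")
(release "2.95")
(release-date "2003-06-24")'

printf '%s\n' "$sample" | lose_string version
printf '%s\n' "$sample" | lose_string release
```

Because the pattern requires a space right after the name, asking for release does not accidentally match the release-date line.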

Parsing from C is surprisingly easy as well:

#include <stdio.h>
#include <string.h>

static char output[1024];

static const char *parse_version(void)
{
    const char prefix[] = "\n(version \"";
    char *start;
    char *limit;

    if ((start = strstr(output, prefix)) == NULL) {
        return "";
    }
    start += strlen(prefix);
    for (limit = start; *limit != '"'; limit++) {
        if ((*limit == '\0') || (*limit == '\n') || (*limit == '\\')) {
            return "";
        }
    }
    *limit = '\0';
    return start;
}

int main(void)
{
    output[0] = '\n';
    fread(&output[1], 1, sizeof(output) - 2, stdin);
    printf("%s\n", parse_version());
    return 0;
}

When reading this code, note that C implicitly initializes the output buffer to all zero bytes. In order to find (version "1.2.3") when it is the first line of output, a newline character \n is artificially prepended to the real output. When reading we additionally leave one zero byte at the end of the output buffer to ensure the buffer is null-terminated. The parse_version function mutates the output buffer by replacing the closing double-quote in (version "1.2.3") with a null byte. Then we can easily handle the substring 1.2.3 as a null-terminated string.

Assumptions made

In order for the above S-expression parsing to be robust, it relies on some concessions from the program writing the output:

At first glance, this looks like a big list of restrictions. But in practice, they are not hard to conform to for simple data.

Hacks for lists

We noted above that writing nested data to be read by these tools is pretty much out of the question. C can represent a true recursive-descent S-expression parser, but writing such elaborate code in a low-level language is an ordeal and does not help the shell scripts that remain stuck with their regexp-based parsing aids.

The classic Unix tools awk, grep and sed are not only regexp-based but line-oriented. If something is easy to do to one line, it is just as easy to do to many lines. Conversely, limiting an action to only one line is difficult.

So the natural way to represent lists to these tools is to have one list entry per line. S-expressions naturally represent a list between one pair of parentheses. We could write the list

(cities london milan paris tokyo)

split to lines like this:


But the Unix tools prefer to treat each line identically, storing different types of data in whole different files instead. Since we have to parse output from one stream and would prefer to avoid resorting to temporary files, we have to think of something else.

Luckily, grep and awk can easily filter lines to keep only the interesting ones. So if we break from the usual Lisp customs and write one list as multiple S-expressions, we can arrive at a format that is still S-expression-based, but where it is also easy for line-oriented tools to parse heterogeneous data from one stream. We essentially have a traditional Lisp association list, but with duplicate properties. And their values have to be merged to arrive at the complete values. Thankfully a merging S-expression reader is easy to express in Scheme:

(define (read-merged-alist port)
  (let loop ((alist '()))
    (let ((new (read port)))
      (cond ((eof-object? new) (reverse alist))
            ((not (pair? new)) (loop alist))
            (else (let ((old (assoc (car new) alist)))
                    (cond (old  (set-cdr! old (append (cdr old) (cdr new)))
                                (loop alist))
                          (else (loop (cons new alist))))))))))

Wrapping up

So we have arrived at an association list of property-value pairs:

(sweets chocolate)
(sweets candy)
(sweets cake)

(covers fabric)
(covers cellophane)
(covers tinfoil)
(covers paper)

Within each property, we can group the values to arbitrary lists:

(sweets chocolate)
(covers fabric)
(covers cellophane tinfoil paper)
(sweets candy cake)

And still end up with the same merged association list:

((sweets chocolate candy cake)
 (covers fabric cellophane tinfoil paper))
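The same merge can also be approximated from a shell script. The following is a sketch only, assuming every entry fits on one line and contains no nested lists; merge_lose is a hypothetical name. It keeps properties in first-seen order and appends values in the order they appear.

```shell
#!/bin/sh
# merge_lose: merge single-line LOSE entries that share a property name.
# Hypothetical helper; nested lists and multi-line entries are not handled.
merge_lose() {
    awk '
        /^\(.*\)$/ {
            body = substr($0, 2, length($0) - 2)    # drop the outer parentheses
            space = index(body, " ")
            if (space == 0) {
                key = body
                vals = ""
            } else {
                key = substr(body, 1, space - 1)
                vals = substr(body, space + 1)
            }
            if (!(key in merged)) {
                order[++count] = key                # remember first-seen order
                merged[key] = vals
            } else if (merged[key] == "") {
                merged[key] = vals
            } else if (vals != "") {
                merged[key] = merged[key] " " vals
            }
        }
        END {
            for (i = 1; i <= count; i++) {
                line = "(" order[i]
                if (merged[order[i]] != "") line = line " " merged[order[i]]
                print line ")"
            }
        }'
}

merge_lose <<'EOF'
(sweets chocolate)
(covers fabric)
(covers cellophane tinfoil paper)
(sweets candy cake)
EOF
```

Run on the regrouped input above, this prints one line for sweets and one for covers, with the values in the same order as the merged association list.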

Part 3: Backward compatibility

To let implementations keep their existing version output for backward-compatibility, LOSE parsing starts at the first line with a left parenthesis ( at the first column. This means that any amount of other text can come before the S-expression part. Some Unix programs write a multi-paragraph copyright, warranty and version message; all of that can be preserved if desired.
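In a shell script, the rule of starting at the first line with a left parenthesis in the first column can be sketched with a sed address range. The function name lose_part is hypothetical; the sample text is inlined only so the sketch runs on its own.

```shell
#!/bin/sh
# lose_part: print everything from the first line that starts with "("
# in column one through the end of input. Hypothetical helper name.
lose_part() {
    sed -n '/^(/,$p'
}

lose_part <<'EOF'
Fantastic Scheme version 2.95
Copyright (C) 2003 Pyrrhic Ventures

"Shoot for the moon. Even if you miss, you'll crash on impact."

(command "fantastic-scheme")
(version "2.96_pre1")
EOF
```

The parenthesis in the copyright line is not in the first column, so the banner is skipped and only the two S-expression lines are printed. If no line starts with a left parenthesis, nothing is printed.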

This simple arrangement makes all of the following work naturally:


The best argument for using S-expressions generally is that people keep re-inventing them in less consistent and flexible formats without an objective reason. We will save time and effort by using time-tested syntax from the beginning. Easy interoperability with Scheme/Lisp is an obvious plus.

The main arguments against S-expressions are that they look foreign to non-Lisp programmers and require too many parentheses. The nesting implied by the parentheses makes them a poor fit for line-oriented tools. All the classic Unix text processing utilities are line-oriented. To interoperate with these tools we need a compromise. The easiest compromise is to write each association list entry on its own line, which leaves most lines with only one pair of parentheses and no nesting. Nested lists in the output will be rare. To allow multi-value properties, and to allow long lines to be broken into multiple lines, the easiest thing is to merge duplicate properties.

We could take this even further by using implied parentheses around each line of text, so that no parentheses are needed for most output lines. This should make the output completely un-scary even for Unix programmers who know nothing about Lisp. Unfortunately this syntax would make it hard to provide backward-compatibility with the many existing output formats for version info. Having a left parenthesis in the first column is a very simple and unambiguous rule. If there are no such syntactic markers, parsing will be a lot harder. Most natural candidates for a syntactic marker are also more ambiguous than a left parenthesis. For example, property: value pairs are harder to detect and easier to confuse with other things.


Character set and encoding

The version output should be in an ASCII superset character encoding, so that bytes #x01..#x7e correspond to those ASCII codepoints. The encoding of bytes outside this range is unspecified; UTF-8 is recommended where possible.

ASCII space (#x20), carriage return (#x0d) and line feed (#x0a) characters are recognized as whitespace. Only line feed is recognized as a newline character. But since carriage return is whitespace, CR/LF newlines work as well as LF newlines.
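Line-anchored patterns in grep, sed and awk treat a trailing carriage return as part of the line, so a script that may receive CR/LF output can delete carriage returns first. This is a sketch under the assumption that no property value legitimately contains a carriage return; strip_cr is a hypothetical name.

```shell
#!/bin/sh
# strip_cr: delete carriage returns so CR/LF output can be matched by
# the same line-anchored patterns as LF output. Hypothetical helper.
strip_cr() {
    tr -d '\r'
}

printf '(version "1.2.3")\r\n' | strip_cr | grep '^(version ".*")$'
```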

Line-oriented S-expressions (LOSE)

The version output shall conform to the following subset of Scheme syntax:

There shall be no whitespace between parentheses and the things they wrap on the same line. There shall be exactly one space between list elements on the same line.

Top-level S-expressions must all be lists, and the opening parenthesis must fall on the first column of a line. Multi-line S-expressions should generally be avoided; if required, then continuation lines of nested expressions must start with one space.

Missing features:

It is recommended to work around the lack of booleans by using enumerations or sets. For example, instead of (linux? #t) we recommend (platform-os linux) or (features ...​ linux ...​).

Version flag

For a Scheme invoked as fantastic-scheme, the command line fantastic-scheme -V (i.e. upper-case V preceded by one dash) must conform to the version output format in this SRFI. Specifically, LOSE parsing starts at the first output line that has a left parenthesis ( in the first column with no preceding whitespace characters. Parsing continues from that line until the end of the output. The parser merges every top-level S-expression that represents a (proper) list into one big association list, using the car of each list as the key and appending the cdrs in the order they appear. If there is no line starting with a left parenthesis, an empty association list is returned.

This SRFI guarantees only that the above simple command invocation, with -V as the first command-line argument and no other arguments, has the intended effect. The implementation should also support the -V flag in other argument positions if it makes sense, and it should have the same output format as when it is the only command line argument, but neither of those is required.

If the implementation has separate interpreter and compiler commands, both of those must support the -V flag as the only argument. Any other commands supplied by the implementation are also encouraged to support the flag, but are not required to. The version output is allowed to differ between all these commands.

When the -V flag is used as above, the command shall exit with a success exit code if it:

Otherwise it shall exit with a failure exit code. On Unix and Windows, exit code 0 means success and codes 1..100 are safe to use for indicating failure.
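A calling script can use the exit code to decide whether the output is worth parsing. The sketch below uses two shell functions, good_scheme and bad_scheme, as stand-ins for real implementations; those names and version_output are hypothetical.

```shell
#!/bin/sh
# version_output CMD: run "CMD -V" and print its standard output only
# when the command exits with status 0; otherwise print nothing and fail.
version_output() {
    if out=$("$1" -V 2>/dev/null); then
        printf '%s\n' "$out"
    else
        return 1
    fi
}

# Stand-ins for real implementations so the sketch is self-contained.
good_scheme() { printf '(version "1.2.3")\n'; }
bad_scheme()  { printf '(version "9.9.9")\n'; return 3; }

version_output good_scheme
version_output bad_scheme || echo "bad_scheme failed; output ignored"
```

With a real implementation, the argument would be the command name, for example version_output fantastic-scheme.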

Effect of other flags on version output

The -V output may change if other flags are also given on the command line. For example, fantastic-scheme -V -r r6rs and fantastic-scheme -V -r r7rs could give different output describing the R6RS and R7RS modes of Fantastic Scheme, and fantastic-scheme -V could give yet different output describing both of them or have less information.

Use of color and other display attributes

ANSI, HTML or other in-band color and text attribute markup shall not be used in the S-expression part, since it will confuse parsers.

Out-of-band markup (e.g. Windows console character attributes) may be used.

Standard properties

Below is a large set of proposed standard properties. This set was designed based on actual information currently reported by various Scheme implementations in their version output.

All properties are optional. This implies that any Scheme implementation whose -V version flag writes only to standard output, exits with code zero, and produces no output lines starting with ( conforms to this SRFI.

Identification properties

(command string…​)

The command names for some Schemes differ on different operating systems and installations. Implementors typically desire a canonical command name for each command shipping with their implementation, but compromises sometimes need to be made due to name conflicts or multiple versions of the same command that need to be able to coexist. This property gives the canonical name suggested by the implementor without any optional version number.

If the executable being invoked is a multi-call binary (i.e. it can behave like more than one program depending on which argv[0] is given) or otherwise is known by more than one canonical name, then more than one string may be given.


(command "csi")
(command "gsc")
(command "isc")
(command "scheme" "mit-scheme")

(scheme-id symbol)

A symbol identifying which Scheme implementation provides this executable. Together with command this can be used to figure out which command of which implementation was invoked, even in cases where two implementations use the same command name.

At the time of writing, there is no central registry for scheme-id symbols.


(scheme-id fantastic-scheme)

(language symbol…​)

The set of programming languages supported by the implementation. Symbols denote set membership.

The distinctions between language, language standard, language family and dialect are muddy. For the purposes of this property, they are all equivalent, and any of them may be represented by a symbol in this property. For example, r7rs is a member of scheme and both should be given. Non-Scheme languages could also be listed. If a unified Scheme and Common Lisp implementation is ever made, it would list both languages.

This property means that the implementation aspires to conform to these languages to a useful degree, and if it does not, you can open issues in the issue tracker to discuss it. Guarantees about conformance and pedantry about language definitions are not the point. In particular, any executable usefully characterized as a Scheme implementation should list scheme even if it does not fully conform to any RnRS report.

Standard symbols include r3rs, r4rs, r5rs, r6rs, r7rs, r7rs-large, scheme. Please coordinate with other implementors about coining symbols for other Scheme derivatives and non-Scheme languages.


(language scheme r6rs r7rs)

(website string)

A URL that starts with http:// or https://.

Version properties

(version string)

A free-form version string in the native format preferred by the implementor. No portable information can be reliably parsed from the string, but version strings should be sortable in order from oldest to newest using typical "version sort" algorithms.

In practice, most Scheme implementations use release version numbers in major.minor.patch format. Other information such as distributor patchlevel or version control commit may be appended.


(version "1.2.3")
(version "1.11.6")
(version "0.9.9_pre1")
(version "1.0.188-a4a79d5")
(version "4.3f")
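One possible "version sort" for strings in major.minor.patch format can be sketched with POSIX sort and numeric keys. This is an illustration, not a recommendation of this SRFI: it compares only the leading digits of the first three dot-separated fields, so suffixes such as _pre1 are not ordered meaningfully. The name sort_versions is hypothetical.

```shell
#!/bin/sh
# sort_versions: order dotted version strings from oldest to newest by
# comparing up to three dot-separated fields numerically.
sort_versions() {
    sort -t . -k 1,1n -k 2,2n -k 3,3n
}

printf '%s\n' 1.11.6 1.2.3 1.2.10 | sort_versions
```

A plain lexical sort would place 1.11.6 before 1.2.3; the numeric keys put it last.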

(release string)

The most recent released version of the implementation on which this build is based. If this build is exactly that release, then version is identical to release. If this build has patches on top, then the two differ.

(release-date iso-date-string)

The YYYY-MM-DD date on which the release was made.


(release-date "2019-08-06")

(release-name string)

A codename for the release.


(release-name "Grain Alcohol and Rainwater")
(release-name "oxygen")

(revision string)

A free-form revision string that can be used to check out the version control commit on which this build is based. For Git, we recommend the output of git describe --dirty --always which includes the latest tag, the commit hash in case commits have been made on top of that tag, and a mark if uncommitted changes have been made on top of the mentioned commit.


(revision "8e62f718")
(revision "3.0-0-g39797ea94")
(revision "v0.16-DEV-166-g6574540c")

Runtime properties

(features symbol…​)

The symbols should correspond to the feature list for cond-expand.


(features 64bit dload ptables)
(features utf-8 pthreads linux r7rs)

(features chibi r7rs ratios complex uvector threads full-unicode)
(features modules dynamic-loading darwin bsd macosx little-endian)

Build properties

(configure string…​)

Command line arguments given to the configure script before building this Scheme implementation. A configure script is a very common means for build-time configuration of programs on Unix-like operating systems. It is useful to save the options given to that script for run time; it helps in replicating builds and debugging problems with the implementation if the options are known.

Each command line argument is given as one string. S-expression string escapes are used; since the double-quoted string syntax used with S-expressions is largely compatible with Unix shells, the resulting syntax can generally be pasted to a shell with no changes.

The name of the configure script is not given. It is almost always configure, though that is not required.


(configure "--enable-single-host")
(configure "--prefix=/home/wiley/.local" "CC=gcc-9")
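Since the strings can generally be pasted into a shell unchanged, a script can reassemble a command line from the entries. The sketch below only prints the command line and does not run it. It assumes single-line entries and assumes the script is invoked as ./configure, which the property itself does not record; configure_command is a hypothetical name.

```shell
#!/bin/sh
# configure_command: join the arguments of all single-line
# (configure ...) entries on stdin into one printed command line.
configure_command() {
    sed -n 's/^(configure \(.*\))$/\1/p' \
        | awk 'BEGIN { line = "./configure" }
               { line = line " " $0 }
               END { print line }'
}

configure_command <<'EOF'
(configure "--prefix=/home/wiley/.local")
(version "2.96_pre1")
(configure "--with-space-ship-control-system")
EOF
```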

(platform string)

A free-form string identifying the computer architecture, operating system, and/or other aspects of the computing platform for which the executable was built. This is the platform string in the implementation’s native format; there is no portable information that can be reliably parsed. Often this is a GNU-style computer-kernel-userland triple; just as often it is not.


(platform "DarwinX8664")
(platform "x86_64-apple-darwin18.7.0")
(platform "macosx-unix-clang-x86-64")

(build-date iso-date-string [iso-time-string])

The date, with optional time-of-day, when this executable was built.

iso-date-string is always a string in ISO YYYY-MM-DD format.

iso-time-string (when given) is always a string that starts with HH:MM. It may contain more stuff for extra precision, according to the ISO 8601 time format, but it’s questionable whether a precision exceeding one minute is useful.

These should be UTC, not local time. It is implementation-dependent whether the timestamp is nearer to the start or end of the build.

Unix commands to generate this property:

date -u '+(build-date "%Y-%m-%d")'
date -u '+(build-date "%Y-%m-%d" "%H:%M")'


(build-date "2018-09-30")
(build-date "2018-09-30" "02:07")
(build-date "2018-09-30" "02:07:07.1234")
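A build script might check the shape of the generated line before embedding it. The following sketch uses an extended regular expression for that purpose; it checks only the layout of digits, not whether the date or time is real. The name valid_build_date is hypothetical.

```shell
#!/bin/sh
# valid_build_date: succeed if the single line on stdin is a build-date
# entry with a YYYY-MM-DD date and an optional time starting with HH:MM.
valid_build_date() {
    grep -E -q '^\(build-date "[0-9]{4}-[0-9]{2}-[0-9]{2}"( "[0-9]{2}:[0-9]{2}[^"]*")?\)$'
}

for line in \
    '(build-date "2018-09-30")' \
    '(build-date "2018-09-30" "02:07:07.1234")' \
    '(build-date "30.9.2018")'
do
    if printf '%s\n' "$line" | valid_build_date; then
        echo "ok:  $line"
    else
        echo "bad: $line"
    fi
done

# The date commands shown earlier should pass the same check.
date -u '+(build-date "%Y-%m-%d" "%H:%M")' | valid_build_date && echo "date output ok"
```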

Boot image properties

(image-date iso-date-string [iso-time-string])

If this is an image-based Scheme system, the date and time when the active boot image was saved. Details as for build-date.

This may vary by command line options and environment variables if those can be used to select a different boot image.

(image-file filename)

If images can be loaded by filename, this gives the filename that is used to load the active boot image.

Pathname properties

(install-dir string)

Root directory of the Scheme installation, if it has one. Typically, this is the directory that has bin and lib subdirectories, but the meaning is implementation-dependent.

(library-path string…​)

List of directories to search for imported libraries.

Platform properties

(platform-os symbol…​)

The operating system(s) for which the executable was built. Symbols denote set membership.


(platform-os aros amigaos)
(platform-os dragonflybsd bsd unix)
(platform-os freebsd bsd unix)
(platform-os haiku beos)
(platform-os linux unix)
(platform-os netbsd bsd unix)
(platform-os openbsd bsd unix)
(platform-os solaris unix)
(platform-os windows)
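Because the symbols denote set membership, a script can ask whether a given symbol is present rather than comparing whole lines. This is a sketch assuming single-line entries; lose_has is a hypothetical helper name.

```shell
#!/bin/sh
# lose_has PROPERTY SYMBOL: succeed if SYMBOL is listed in any
# single-line (PROPERTY ...) entry on stdin.
lose_has() {
    awk -v prop="$1" -v sym="$2" '
        /^\(.*\)$/ {
            n = split(substr($0, 2, length($0) - 2), item, " ")
            if (item[1] != prop) next
            for (i = 2; i <= n; i++) if (item[i] == sym) found = 1
        }
        END { exit !found }'
}

sample='(platform-os freebsd bsd unix)
(platform-computer x86 x86-64 amd)'

if printf '%s\n' "$sample" | lose_has platform-os bsd; then
    echo "BSD family"
fi
if ! printf '%s\n' "$sample" | lose_has platform-os linux; then
    echo "not Linux"
fi
```

Symbols are compared as whole words, so asking for x86 under platform-os does not match the x86 listed under platform-computer.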

(platform-computer symbol…​)

The computer architecture(s) and CPUs for which the executable was built. Symbols denote set membership.


(platform-computer x86 x86-64 intel)
(platform-computer x86 x86-64 amd)
(platform-computer arm32)
(platform-computer arm64)
(platform-computer ppc)
(platform-computer mips)
(platform-computer sparc)

(platform-bits integer…​)

integer is a positive exact integer giving the address width of the host computer in bits; almost always 32 or 64.

Implementation-defined properties

The names of implementation-defined properties should start with the implementation’s scheme-id and a dash. For example, if Fantastic Scheme builds varied by the phase of the moon, it could have:

  (fantastic-scheme-phase-of-the-moon waxing-crescent)
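The naming convention makes implementation-defined properties easy to pick out. As a sketch, a script can read the scheme-id and then keep only the lines whose property name starts with that symbol and a dash. The function name implementation_properties is hypothetical, and the sketch assumes the scheme-id contains no regexp metacharacters.

```shell
#!/bin/sh
# implementation_properties: print the entries whose property name
# starts with the scheme-id found in the same input, followed by a dash.
# Reads stdin into a variable because the text is scanned twice.
implementation_properties() {
    text=$(cat)
    id=$(printf '%s\n' "$text" \
        | sed -n 's/^(scheme-id \([^ ()]*\))$/\1/p' \
        | sed -n '1p')
    [ -n "$id" ] || return 1
    printf '%s\n' "$text" | grep "^($id-"
}

implementation_properties <<'EOF'
(scheme-id fantastic-scheme)
(version "2.96_pre1")
(fantastic-scheme-phase-of-the-moon waxing-crescent)
EOF
```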

Complete example

If fantastic-scheme-2.95 -V gives the following output:

Fantastic Scheme version 2.95
Copyright (C) 2003 Pyrrhic Ventures
This is free software; always read the label. There is NO warranty;
not even for buoyancy or fitness for high-velocity landings at sea.

"Shoot for the moon. Even if you miss, you'll crash on impact."

(command "fantastic-scheme")
(scheme-id fantastic-scheme)
(language scheme r6rs r7rs)
(website "")
(version "2.96_pre1")
(release "2.95")
(release-date "2003-06-24")
(release-name "Sheer lunacy")
(revision "2.95-23-gc0f6340c")
(features r7rs ratios exact-complex full-unicode gnu-linux little-endian)
(features fantastic-scheme fantastic-scheme-1.0 space-ship-control-system)
(configure "--prefix=/home/wiley/.local")
(configure "--with-space-ship-control-system")
(platform "aarch64_be-linux-gnu_ilp32")
(platform-os ubuntu linux unix)
(platform-computer arm arm64)
(platform-bits 64)
(build-date "2019-10-05" "13:52:01")
(image-date "2019-10-05" "17:10:00")
(image-file "/home/wiley/fantastic.image")
(install-dir "/home/wiley/.local")
(library-path "/home/wiley/.local/share/fantastic")
(library-path "/home/wiley/fantastic")
(fantastic-scheme-phase-of-the-moon waxing-crescent)

It is parsed into the following merged association list:

((command "fantastic-scheme")
 (scheme-id fantastic-scheme)
 (language scheme r6rs r7rs)
 (website "")
 (version "2.96_pre1")
 (release "2.95")
 (release-date "2003-06-24")
 (release-name "Sheer lunacy")
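 (revision "2.95-23-gc0f6340c")
 (features r7rs ratios exact-complex full-unicode gnu-linux little-endian fantastic-scheme fantastic-scheme-1.0 space-ship-control-system)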
 (configure "--prefix=/home/wiley/.local" "--with-space-ship-control-system")
 (platform "aarch64_be-linux-gnu_ilp32")
 (platform-os ubuntu linux unix)
 (platform-computer arm arm64)
 (platform-bits 64)
 (build-date "2019-10-05" "13:52:01")
 (image-date "2019-10-05" "17:10:00")
 (image-file "/home/wiley/fantastic.image")
 (install-dir "/home/wiley/.local")
 (library-path "/home/wiley/.local/share/fantastic" "/home/wiley/fantastic")
 (fantastic-scheme-phase-of-the-moon waxing-crescent))


Writing the version information is as simple as calling write or equivalent with suitable input.

The following is a fully functional parser. read-version-alist-from-command is specific to Gambit and needs to be rewritten for other Schemes.

(define (read-merged-alist in)
  (let loop ((alist '()))
    (let ((new (read in)))
      (cond ((eof-object? new) (reverse alist))
            ((not (pair? new)) (loop alist))
            (else (let ((old (assoc (car new) alist)))
                    (cond (old  (set-cdr! old (append (cdr old) (cdr new)))
                                (loop alist))
                          (else (loop (cons new alist))))))))))

(define (skip-to-line-starting char in)
  (let loop ((prev #\newline))
    (let ((c (peek-char in)))
      (unless (or (eof-object? c)
                  (and (char=? #\newline prev) (char=? char c)))
        (loop (read-char in))))))

(define (read-version-alist in)
  (skip-to-line-starting #\( in)
  (read-merged-alist in))

(define (read-version-alist-from-string string)
  (call-with-port (open-input-string string) read-version-alist))

(define (read-version-alist-from-command command)
  (let* ((result (shell-command (string-append command " -V") #t))
         (status (car result))
         (output (cdr result)))
    (read-version-alist-from-string (if (= status 0) output ""))))


Thanks to Marc Feeley for discussing S-expressions that can be parsed from portable shell scripts. Thanks to Arthur Gleckler for questioning why the output of existing version flags is extended instead of making a new flag exclusively for machine-parseable output.

This SRFI started off as one of those "what if we made this simple tweak" hunches. It has now reached a ludicrous length considering the triviality of the topic. I am grateful to anyone who may want to use LOSE for another application. It will do at least a little to justify the effort spent.


Copyright © Lassi Kortela (2019)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.


Editor: Arthur A. Gleckler