12 April 2006

Generator syntax for state machines

I have a real-world interface that looks like this:

interface ITableScanner {
    void open(Path filename);
    void onRecord(TableRecord record);
    void onEndOfFile(int status, Exception error);
    void close();
}

It's actually a state machine. The methods are only to be called in a very specific order: open(), then zero or more calls to onRecord(), then onEndOfFile(), then close().
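To make the required call order concrete, here is a rough sketch of what a caller might look like. Everything in it is an assumption for illustration: SimpleTableScanner is a cut-down stand-in for the real interface (String in place of Path and TableRecord), and ScanDriver and CallLog are hypothetical names, not part of the real code.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for ITableScanner; String replaces Path and TableRecord (assumption).
interface SimpleTableScanner {
    void open(String filename);
    void onRecord(String record);
    void onEndOfFile(int status, Exception error);
    void close();
}

// Hypothetical driver: calls the scanner methods in the one permitted order.
final class ScanDriver {
    static final int OK = 0;

    static void scan(String filename, List<String> records, SimpleTableScanner scanner) {
        scanner.open(filename);            // first, exactly once
        for (String record : records) {
            scanner.onRecord(record);      // zero or more times
        }
        scanner.onEndOfFile(OK, null);     // once, after the last record
        scanner.close();                   // last, exactly once
    }
}

// Hypothetical scanner that only records which methods were called, in order.
final class CallLog implements SimpleTableScanner {
    final List<String> calls = new ArrayList<>();

    public void open(String filename) { calls.add("open"); }
    public void onRecord(String record) { calls.add("onRecord"); }
    public void onEndOfFile(int status, Exception error) { calls.add("onEndOfFile"); }
    public void close() { calls.add("close"); }
}
```

Passing a CallLog to ScanDriver.scan() and inspecting its calls list shows the sequence the interface expects.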

Here's the simplest nontrivial implementation of this interface I can come up with:

class DebugTableScanner implements ITableScanner {
    public DebugTableScanner(Writer out) {
        m_out = out;
    }

    public void open(Path filename) {
        m_out.writeln("contents of file " + filename);
    }

    public void onRecord(TableRecord record) {
        if (!isNoise(record))
            m_out.writeln("  - " + record);
    }

    public void onEndOfFile(int status, Exception error) {
        if (status != OK)
            m_out.writeln("  * End of file status: " + status);
        if (error != null)
            error.printStackTrace(m_out);
    }

    public void close() {
    }

    private Writer m_out;
}

I'm going to propose a different syntax for writing that class. Just as you can use generator syntax instead of explicitly writing an Iterator class, you could use this special syntax instead of explicitly implementing any state machine interface.

The special syntax is receive methodName(parameters). It means roughly “yield, then resume here when the appropriate method call is received”. But if this syntax occurs in an if or while test-expression, it also means “skip to the next receive statement if a different method is called instead”. Informal, I know, but I think formal semantics would be easy to hash out.

The above class, implemented using this syntax, looks like this:

ITableScanner debugTableScanner(Writer out) {
    receive open(Path filename) {
        out.writeln("contents of file " + filename);
    }

    while (receive onRecord(TableRecord record)) {
        if (!isNoise(record))
            out.writeln("  - " + record);
    }

    receive onEndOfFile(int status, Exception error) {
        if (status != OK)
            out.writeln("  * End of file status: " + status);
        if (error != null)
            error.printStackTrace(out);
    }

    receive close();
}

You would call debugTableScanner(myOut) instead of new DebugTableScanner(myOut) to create an instance.
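For comparison, plain Java can already get the same call shape (a factory method instead of a constructor) with an anonymous class that captures its argument. This is only a sketch under assumptions: RecordSink is a simplified stand-in for ITableScanner, Scanners is a hypothetical holder class, and a StringBuilder stands in for the Writer. Unlike the proposed syntax, it does nothing to enforce the call order.

```java
// Stand-in for ITableScanner; String replaces Path and TableRecord (assumption).
interface RecordSink {
    void open(String filename);
    void onRecord(String record);
    void onEndOfFile(int status, Exception error);
    void close();
}

// Hypothetical holder for the factory method.
final class Scanners {
    static final int OK = 0;

    // Callers write Scanners.debugTableScanner(out) rather than using "new".
    // The anonymous class captures "out" lexically, so no field is needed.
    static RecordSink debugTableScanner(StringBuilder out) {
        return new RecordSink() {
            public void open(String filename) {
                out.append("contents of file ").append(filename).append('\n');
            }

            public void onRecord(String record) {
                out.append("  - ").append(record).append('\n');
            }

            public void onEndOfFile(int status, Exception error) {
                if (status != OK)
                    out.append("  * End of file status: ").append(status).append('\n');
            }

            public void close() {
            }
        };
    }
}
```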

Nice things about this:

  • The code appears in the same order that it'll execute at run time.
  • The code and its caller look the same. You could put the two side by side and see the method calls and receive statements line up.
  • No need to explicitly check the object state and throw IllegalStateException (or whatever) if stuff hasn't been called in the right order; the compiler can generate the checks.
  • You could declare variables anywhere in the body, and they'd be visible only to the receive blocks that come later.
  • Similarly, there's no need to stick the Writer into a member variable. It's lexically scoped. (Yeah, lexical scoping is frequently the highlight of my day...)
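To show what the third point saves you, here is roughly what the hand-written order checks look like today. It's a sketch, not what a compiler for the proposed syntax would necessarily emit: TableScannerProtocol is a simplified stand-in interface, CheckedScanner and its State enum are hypothetical, and a StringBuilder stands in for the Writer.

```java
// Stand-in for ITableScanner; String replaces Path and TableRecord (assumption).
interface TableScannerProtocol {
    void open(String filename);
    void onRecord(String record);
    void onEndOfFile(int status, Exception error);
    void close();
}

// Hypothetical scanner that tracks its own state and rejects out-of-order calls.
final class CheckedScanner implements TableScannerProtocol {
    private enum State { NEW, OPEN, AT_EOF, CLOSED }

    private State state = State.NEW;
    private final StringBuilder out;

    CheckedScanner(StringBuilder out) {
        this.out = out;
    }

    // Throws if the method named "method" is not allowed in the current state.
    private void require(boolean allowed, String method) {
        if (!allowed)
            throw new IllegalStateException(method + "() called in state " + state);
    }

    public void open(String filename) {
        require(state == State.NEW, "open");
        state = State.OPEN;
        out.append("contents of file ").append(filename).append('\n');
    }

    public void onRecord(String record) {
        require(state == State.OPEN, "onRecord");
        out.append("  - ").append(record).append('\n');
    }

    public void onEndOfFile(int status, Exception error) {
        require(state == State.OPEN, "onEndOfFile");
        state = State.AT_EOF;
        if (status != 0)
            out.append("  * End of file status: ").append(status).append('\n');
    }

    public void close() {
        require(state == State.AT_EOF, "close");
        state = State.CLOSED;
    }
}
```

Every method starts with a check that has nothing to do with its real job, and the state field has to be kept in sync by hand.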

Here's a funny real-world example of why my proposed syntax is nice. I had to modify the real-world version of this example to cope with errors. Here's what I ended up with:

class DebugTableScanner implements ITableScanner {
    public DebugTableScanner(Writer out) {
        m_out = out;
    }

    public void open(Path filename) {
        m_out.writeln("contents of file " + filename);
    }

    public void onRecord(TableRecord record) {
        try {
            if (!isNoise(record))
                m_out.writeln("  - " + record);
        } catch (Exception exc) {
            onError(exc);
            throw exc;
        }
    }

    public void onEndOfFile(int status, Exception error) {
        try {
            if (status != OK)
                m_out.writeln("  * End of file status: " + status);
            if (error != null)
                error.printStackTrace(m_out);
        } catch (Exception exc) {
            onError(exc);
            throw exc;
        }
    }

    public void close() {
    }

    private void onError(Exception exc) {
        reportError(
            new Exception(INTERNAL_ERROR_IN_SCANNER, exc));
    }

    private Writer m_out;
}

...and actually considerably worse, since I had code in close() which also had to be wrapped. In my proposed syntax, it would look like this, instead:

ITableScanner debugTableScanner(Writer out) {
    receive open(Path filename) {
        out.writeln("contents of file " + filename);
    }

    try {
        while (receive onRecord(TableRecord record)) {
            if (!isNoise(record))
                out.writeln("  - " + record);
        }

        receive onEndOfFile(int status, Exception error) {
            if (status != OK)
                out.writeln("  * End of file status: " + status);
            if (error != null)
                error.printStackTrace(out);
        }

        receive close();
    } catch (Exception exc) {
        reportError(
            new Exception(INTERNAL_ERROR_IN_SCANNER, exc));
        throw exc;
    }
}
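Without new syntax, the closest thing I know of in plain Java to "one catch block around every method" is a dynamic proxy. The sketch below is an approximation under assumptions: ScannerCallbacks is a simplified stand-in for ITableScanner, ErrorReporting.wrap() is a hypothetical helper, and the reportError callback stands in for the real reportError(). It also wraps open(), which the versions above leave outside the try block.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.Consumer;

// Stand-in for ITableScanner; String replaces Path and TableRecord (assumption).
interface ScannerCallbacks {
    void open(String filename);
    void onRecord(String record);
    void onEndOfFile(int status, Exception error);
    void close();
}

// Hypothetical helper: routes every call on the interface through one handler.
final class ErrorReporting {
    static ScannerCallbacks wrap(ScannerCallbacks target, Consumer<Throwable> reportError) {
        return (ScannerCallbacks) Proxy.newProxyInstance(
            ScannerCallbacks.class.getClassLoader(),
            new Class<?>[] { ScannerCallbacks.class },
            (proxy, method, args) -> {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    // Reflection wraps whatever the target threw; unwrap it,
                    // report it once, then rethrow the original exception.
                    Throwable cause = e.getCause();
                    reportError.accept(
                        new RuntimeException("internal error in scanner", cause));
                    throw cause;
                }
            });
    }
}
```

The error handling lives in one place, but the scanner body itself is still split across four methods, which is the part the receive syntax is meant to fix.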

2 comments:

Mattie said...

Off-topic request, but this post made me think of it: I'd like a Tumbolia review/consideration of http://www.codeworker.org/ if you deem it worth your time.

jto said...

It looks interesting. Synchronicity... I was just thinking about source-transforming JavaScript; I bet this could do it.