Writing a JavaScript parser using Jison

Earlier this year I went to JSConf 2014, where Neil Green gave a talk on "Writing custom DSLs". I recommend watching his talk before reading this post. The parser we'll build is based on the code example at the end of his talk.

tl;dr take a look at the code and demo.

Ever wonder how Google parses search queries like this:
telnet AND ssh ext:log intext:password
They certainly don't rely on regular expressions alone. They most likely have a parser that turns queries into an Abstract Syntax Tree, like this:

[
    {
        'type': 'AND',
        'values': ['telnet', 'ssh']
    },
    {
        'type': 'MATCH',
        'values': ['ext', 'log']
    },
    {
        'type': 'MATCH',
        'values': ['intext', 'password']
    }
]

A structure like this is much easier for a program to work with than the raw query string.
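To make that concrete, here is a minimal sketch of code that consumes the tree for the example query. The document shape ({ ext, intext }) and the meaning given to AND and MATCH are my own assumptions for illustration; this is not how Google evaluates queries.

```javascript
// The example query's AST, written as an array of nodes.
const ast = [
  { type: 'AND', values: ['telnet', 'ssh'] },
  { type: 'MATCH', values: ['ext', 'log'] },
  { type: 'MATCH', values: ['intext', 'password'] }
];

// Hypothetical document shape: { ext: 'log', intext: 'file contents...' }.
function nodeMatches(node, doc) {
  switch (node.type) {
    case 'AND':
      // Assumed semantics: every term must appear in the document text.
      return node.values.every(term => doc.intext.includes(term));
    case 'MATCH': {
      // Assumed semantics: the named field must contain the expected value.
      const [field, expected] = node.values;
      return String(doc[field] || '').includes(expected);
    }
    default:
      throw new Error('Unknown node type: ' + node.type);
  }
}

// A document matches the query when every node in the tree matches.
function queryMatches(nodes, doc) {
  return nodes.every(node => nodeMatches(node, doc));
}
```

Because each node carries its type, supporting a new operator only means adding another case to the switch, with no string handling involved.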

Jison generates bottom-up parsers in JavaScript. Its API is similar to Bison’s, hence the name. If you are new to parser generators such as Bison, or to context-free grammars in general, the Bison manual is a good introduction.

Jison lets us conveniently declare our lexical and language grammar all in one file. If you haven't taken a look at the grammar I wrote, now would be a good time.

Defining our lexical grammar is simple.
For example:

domestically|internationally    return 'LOCATION'

When the lexer finds the string domestically or internationally, it emits a LOCATION token.
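If it helps to see what a rule like that amounts to, here is a rough hand-written tokenizer in plain JavaScript. It is only a sketch of the idea, not the code Jison generates; the whitespace and WORD rules are assumptions I added so that it can run on a whole sentence.

```javascript
// Each rule pairs a pattern with the token name to return.
const rules = [
  // Mirrors the lexer rule above.
  { pattern: /^(?:domestically|internationally)\b/, token: 'LOCATION' },
  // Assumed: whitespace is skipped (token is null).
  { pattern: /^\s+/, token: null },
  // Assumed: any other word becomes a generic WORD token.
  { pattern: /^\w+/, token: 'WORD' }
];

function tokenize(input) {
  const tokens = [];
  let rest = input;
  while (rest.length > 0) {
    // In this sketch the first rule that matches wins.
    const rule = rules.find(r => r.pattern.test(rest));
    if (!rule) throw new Error('Unexpected input: ' + rest);
    const text = rest.match(rule.pattern)[0];
    if (rule.token) tokens.push({ type: rule.token, text });
    // Consume the matched text and continue with the remainder.
    rest = rest.slice(text.length);
  }
  return tokens;
}
```

Calling tokenize('ships internationally') yields a WORD token for ships followed by a LOCATION token for internationally.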

The language grammar can be more complex.
Take this rule for example:

conditions : conditions condition
        {$$ = merge($1, $2);}
    | condition
    ;

Here, conditions is the nonterminal symbol that this rule describes.
Everything after the colon is a component of the rule, and alternative components are separated by |.
The first component, conditions condition, is recursive: it lets the rule match any number of conditions in a row.
It also has an action, {$$ = merge($1, $2);}, that merges the conditions matched so far ($1) with the newly matched condition ($2).
The second component, | condition, lets the rule match a single condition on its own.
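One way to picture the recursion is as a fold over the matched conditions. The sketch below imitates the order in which the parser would apply the rule's actions. The merge function here is a hypothetical stand-in (a shallow object merge); the helper used by the real grammar may behave differently.

```javascript
// Hypothetical stand-in for the grammar's merge helper.
function merge(left, right) {
  return Object.assign({}, left, right);
}

// Imitates the reductions for:
//   conditions : conditions condition | condition
// The grammar requires at least one condition, so the input is assumed non-empty.
function reduceConditions(conditionValues) {
  // `| condition`: the first condition becomes the initial value of `conditions`.
  let conditions = conditionValues[0];
  for (const condition of conditionValues.slice(1)) {
    // `conditions condition`: $$ = merge($1, $2)
    conditions = merge(conditions, condition);
  }
  return conditions;
}
```

With three conditions a, b and c, this computes merge(merge(a, b), c), which is the left-to-right order that a left-recursive rule produces.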

Easy, right? ʘ‿ʘ

Jison can be easily installed using npm.

$ npm install -g jison

Compiling a Jison grammar file is just as simple:

$ jison grammar.jison -o ./parser.js
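The generated file is a regular CommonJS module that exports a parser object with a parse method, so using it from Node could look roughly like the sketch below. The safeParse wrapper and the stub parser are hypothetical names of my own; the stub only stands in for the generated module so that the snippet runs by itself.

```javascript
// Wraps parser.parse so that a syntax error becomes a value instead of an exception.
function safeParse(parser, input) {
  try {
    return { ok: true, ast: parser.parse(input) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// With the real generated file you would load the parser like this:
//   const parser = require('./parser').parser;
// The stub below only imitates the parse(input) interface for illustration.
const stubParser = {
  parse(input) {
    if (!input) throw new Error('empty input');
    return [{ type: 'MATCH', values: input.split(':') }];
  }
};

const result = safeParse(stubParser, 'ext:log');
console.log(result.ok ? result.ast : result.error);
```

Swapping stubParser for the generated parser is the only change needed to try this against your own grammar.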

Now you too can write your own parser!
To learn more about Jison, check out the docs.

Happy parsing!