Hello.  It gives me great pleasure to announce my COBOL parser.  This is a fixed-format COBOL parser.  I expect that it could be extended to work with free-format COBOL, but I have no need for that.

Code is located at:
    http://www.smalltalkhub.com/#!/~cbc/PetitCobol

To invoke the parser, evaluate:
    CobolProg parseCobolCodingForm: <fileName>

This 'parser' actually contains 4 parsers, plus a fair amount of additional logic to prepare the files for the parsers (and to prepare the output of earlier parsers for later ones).  The rough outline of what happens:

1) File is read line by line.  Each line is parsed as a formatted card.
2) Take these cards, and format them into sentences.
3) Parse the coding structure.  (Parse it out into the various divisions, and parse out the level 01 data).
4) Aggregate the structure into segments.
5) Finally, parse the actual code, division by division.
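
To illustrate step 1, here is a minimal sketch of splitting one fixed-format line into its card areas. It is written in Python purely as a language-neutral illustration and is not the PetitCobol code; the names Card, parse_card and is_comment are made up for this example. The column layout (1-6 sequence, 7 indicator, 8-11 Area A, 12-72 Area B, 73-80 identification) is the standard fixed-format layout.

```python
from typing import NamedTuple


class Card(NamedTuple):
    """One fixed-format source line, split by column (hypothetical helper type)."""
    sequence: str   # columns 1-6: sequence number area
    indicator: str  # column 7: '*' or '/' for comments, '-' for continuation
    area_a: str     # columns 8-11: division/section/paragraph headers, 01 levels
    area_b: str     # columns 12-72: statements
    ident: str      # columns 73-80: program identification area


def parse_card(line: str) -> Card:
    # Pad short lines to 80 columns so every slice is well defined.
    padded = line.rstrip("\r\n").ljust(80)
    return Card(padded[0:6], padded[6], padded[7:11], padded[11:72], padded[72:80])


def is_comment(card: Card) -> bool:
    # A '*' or '/' in the indicator column marks a comment line.
    return card.indicator in ("*", "/")


card = parse_card("000100 IDENTIFICATION DIVISION.")
```

Note that a header such as IDENTIFICATION DIVISION starts in Area A and runs on into Area B, which is one reason the cards need to be joined into sentences (step 2) before any real parsing is done.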

The parser includes a full AST representation, along with a visitor that you can subclass to help handle the resulting AST.

The parser is not complete - it should parse any fixed-format COBOL program file, but not all commands are implemented.  I have implemented a way to develop the parser iteratively.  It will parse each sentence up to the point where it cannot continue - at that point, it will parse the remainder into a CDJunk (for data division unknowns) or a CobolStatement (for procedure division unknowns).  These nodes point out any missing commands (which do exist) or possibly incomplete commands (which may exist); a simple visitor over the AST that traps those nodes should find them.
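
The idea of trapping the fallback nodes with a visitor can be sketched roughly as follows. This is again Python used only as an illustration, not the Smalltalk visitor that ships with the package: only the names CDJunk and CobolStatement come from the parser, while Node, Visitor, UnparsedCollector and the visit_ naming scheme are assumptions made up for this example.

```python
class Node:
    """Stand-in AST node (hypothetical; the real nodes are Smalltalk classes)."""
    def __init__(self, text="", children=()):
        self.text = text
        self.children = list(children)


class CDJunk(Node):
    """Stand-in for an unparsed data division entry."""


class CobolStatement(Node):
    """Stand-in for an unparsed procedure division statement."""


class Visitor:
    def visit(self, node):
        # Dispatch on the node's class name, falling back to a plain traversal.
        method = getattr(self, "visit_" + type(node).__name__, self.visit_default)
        return method(node)

    def visit_default(self, node):
        for child in node.children:
            self.visit(child)


class UnparsedCollector(Visitor):
    """Collects every fallback node so missing commands can be listed."""
    def __init__(self):
        self.found = []

    def visit_CDJunk(self, node):
        self.found.append(node)

    def visit_CobolStatement(self, node):
        self.found.append(node)


tree = Node("program", [
    Node("data", [CDJunk("05 FOO ???")]),
    Node("procedure", [Node("MOVE A TO B"), CobolStatement("EXAMINE X")]),
])
collector = UnparsedCollector()
collector.visit(tree)
unparsed = [node.text for node in collector.found]
```

Each text collected this way is a sentence the parser could not finish, which gives a natural to-do list for the next command to implement.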

The parse leaves you with a CobolProg instance containing the final parsed AST in the variable formattedStructure.  Comments in the code will be in the variable comments (along with the line number that each originated from).  In addition, most of the interim steps will also be present in the CobolProg instance, should you be interested in them.  If not, you can send #cleanup to the instance to get rid of everything but the final AST nodes.

Thanks,
cbc