For my parsing project, I decided to write an LL (top down left recursive) parser for a simplified version of SQL. Apparently, the specification for SQL is not freely available, but a professor at another university uses a simplified example of the grammar to demonstrate parsing, so I have used that example, which can be found here.
I'll copy the grammar here, too:
<Query> ::= SELECT <SelList>
FROM <FromList>
WHERE <Condition>
<SelList> ::= <Attribute> |
<Attribute> , <SelList>
<FromList> ::= <Relation> |
<Relation> , <FromList>
<Condition> ::= <Condition> AND <Condition> |
<Attribute> IN ( <Query> ) |
<Attribute> = <Attribute> |
<Attribute> LIKE <Pattern>
While not explicitly stated in the grammar, I inferred that tokens like
Because I have worked with SLY previously, I decided to use it for this project as well. SLY has a builtin Lexer and Parser class, but I decided only to use the builtin Lexer class, and to write the parser by hand. We have to eat our vegetables sometimes...
I'm a little stuck in the implementation of the subroutine that is responsible for parsing conditions: I probably need to rewrite the grammar so that the AND case is handled without causing an infinite loop. I think I can do so if I use lookahead to determine which of the terminal cases applies without too much trouble...I'm going to keep hacking at it but upload what I have so far just in case. (I need to update the readme to indicate that this program is both a lexer AND a parser, too...)
last modified | size | ||
create.sql | Sun Jan 05 2025 08:20 am | 397B | |
nick.md | Sun Jan 05 2025 08:20 am | 1.2K | |
populate.sql | Sun Jan 05 2025 08:20 am | 1.4K | |
sqllexer.py | Sun Jan 05 2025 08:20 am | 5.4K |