Saturday, July 19, 2014

Golang parser generator [ebnf,yacc,lex]

Example project: https://github.com/noypi/schedparser

1) Create your EBNF file, for example:

 decimal_digit     = .  
 ordinal = .  
 day  = .  
 month     = .  
 every = .  
 hrs = .  
 mins = .  
 secs = .  
 time = .  
 comma = .  
 newline = .  
 Days = day { comma day } .  
 Months = month { comma month }.  
 RecurByTime = every decimal_digit (hrs|mins|secs) ["from" time "to" time] .  
 RecurByDay = (every|ordinal) Days ["of" (Months)] (time) .  
 SchedLine = newline | RecurByTime | RecurByDay .  
 Sched = SchedLine { SchedLine } .  

2) Convert the EBNF to yacc

Download and build "https://github.com/cznic/ebnf2y", then run:

 ebnf2y -pkg gen -start Sched sched.ebnf > sched.y  

3) Modify sched.y

3.1) Remove the demo section.
3.2) Update the data structures; see the example at the link below:
https://github.com/noypi/schedparser/blob/36e3882a18bc2d09bf77ca19eded978e321c7af0/gen/sched.y#L219-L250

3.3) Fill in the TODOs; see the example at the link below:
https://github.com/noypi/schedparser/blob/36e3882a18bc2d09bf77ca19eded978e321c7af0/gen/sched.y#L187-L216

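4) Convert sched.y to Go source (Go 1.8 and later no longer ship "go tool yacc"; there, install goyacc from golang.org/x/tools/cmd/goyacc and run it with the same flags)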


 go tool yacc -p Sched -o sched.y.go sched.y  

5) Create your lex file

In sched.ebnf, some productions are empty ("= ."). These are the tokens that the lexer, not the grammar, has to recognize; see the example at the link below:
https://github.com/noypi/schedparser/blob/36e3882a18bc2d09bf77ca19eded978e321c7af0/gen/sched.ebnf#L1-L11

The lexer tells yacc whether the next run of characters is a "decimal_digit", an "ordinal", a "day", a "month", and so on. Example:
https://github.com/noypi/schedparser/blob/36e3882a18bc2d09bf77ca19eded978e321c7af0/gen/sched.l#L42-L67

Our lex file inserts some code inside Lex(). Note that our SchedLex satisfies the interface that yacc requires:

 type SchedLexer interface {  
      Lex(lval *SchedSymType) int  
      Error(s string)  
 }  

See where this code starts in the sample:
https://github.com/noypi/schedparser/blob/36e3882a18bc2d09bf77ca19eded978e321c7af0/gen/sched.l#L7-L23
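
To make the interface above concrete, here is a minimal hand-written lexer that satisfies it. This is only a sketch and not the golex-generated code from the example project: the "text" field of SchedSymType, the token constants and their values, and the wordLexer type are all hypothetical. In a real build, SchedSymType and the token codes come from the generated sched.y.go.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// SchedSymType stands in for the struct yacc generates from %union.
// The single field here is hypothetical.
type SchedSymType struct {
	text string
}

// SchedLexer is the interface the generated parser expects.
type SchedLexer interface {
	Lex(lval *SchedSymType) int
	Error(s string)
}

// Placeholder token codes; the real ones are generated into sched.y.go.
const (
	DECIMAL_DIGIT = 57346 + iota
	EVERY
	MINS
	WORD
)

// wordLexer is a toy lexer that splits its input on whitespace.
type wordLexer struct {
	words []string
	pos   int
	errs  []string
}

func newWordLexer(input string) *wordLexer {
	return &wordLexer{words: strings.Fields(input)}
}

// Lex returns the code of the next token and stores its text in lval.
// Returning 0 tells the parser that the input has ended.
func (l *wordLexer) Lex(lval *SchedSymType) int {
	if l.pos >= len(l.words) {
		return 0
	}
	w := l.words[l.pos]
	l.pos++
	lval.text = w
	switch {
	case isNumber(w):
		return DECIMAL_DIGIT
	case w == "every":
		return EVERY
	case w == "mins":
		return MINS
	}
	return WORD
}

// Error is called by the parser when it hits a syntax error.
func (l *wordLexer) Error(s string) {
	l.errs = append(l.errs, s)
}

// isNumber reports whether s is a non-empty run of digits.
func isNumber(s string) bool {
	for _, r := range s {
		if !unicode.IsDigit(r) {
			return false
		}
	}
	return s != ""
}

func main() {
	var lx SchedLexer = newWordLexer("every 5 mins")
	var lval SchedSymType
	// Pull tokens the way a parser would, until Lex reports end of input.
	for tok := lx.Lex(&lval); tok != 0; tok = lx.Lex(&lval) {
		fmt.Println(tok, lval.text)
	}
}
```

The generated parser calls Lex() once per token, so the lexer only has to classify the next chunk of input and hand its text over through lval.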
 

6) Convert the lex file to Go source

6.1) Download and build "https://github.com/cznic/golex"
6.2) Run the command:

 golex -t sched.l | gofmt > sched.yy.go


6.3) Parsing starts by calling the generated function SchedParse() with your lexer.
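
The generated parser reports its outcome only as an int (0 on success, 1 on a syntax error), so a small wrapper that turns that into a Go error can be handy. The sketch below shows one way to do it. To keep it runnable on its own, SchedSymType is an empty stub and SchedParse is a stand-in that merely drains the lexer and treats a negative token code as a syntax error; in a real project you would delete both and use the ones from sched.y.go. The recLexer type and the parse() helper are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SchedSymType is an empty stub for the yacc-generated type.
type SchedSymType struct{}

// SchedLexer is the interface the generated parser expects.
type SchedLexer interface {
	Lex(lval *SchedSymType) int
	Error(s string)
}

// SchedParse is a STAND-IN for the generated function in sched.y.go.
// It reads tokens until Lex returns 0 and, purely for this demo,
// treats a negative token code as a syntax error.
func SchedParse(lx SchedLexer) int {
	var lval SchedSymType
	for {
		tok := lx.Lex(&lval)
		if tok == 0 {
			return 0
		}
		if tok < 0 {
			lx.Error("syntax error")
			return 1
		}
	}
}

// recLexer replays a fixed list of token codes and records errors.
type recLexer struct {
	toks []int
	errs []string
}

func (l *recLexer) Lex(lval *SchedSymType) int {
	if len(l.toks) == 0 {
		return 0
	}
	t := l.toks[0]
	l.toks = l.toks[1:]
	return t
}

func (l *recLexer) Error(s string) {
	l.errs = append(l.errs, s)
}

// parse runs the parser and converts its int result into an error
// built from the messages the lexer collected.
func parse(lx *recLexer) error {
	if SchedParse(lx) != 0 {
		return errors.New(strings.Join(lx.errs, "; "))
	}
	return nil
}

func main() {
	fmt.Println(parse(&recLexer{toks: []int{1, 2, 3}}))
	fmt.Println(parse(&recLexer{toks: []int{1, -1}}))
}
```

Collecting the messages in the lexer works because the parser reports syntax errors by calling Error() on the lexer it was given.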

7) Done.