Back in 2011, Rob Pike gave a talk on the topic of Lexical Scanning in Golang, where he presented an interesting approach to lexical analysis with coroutines (implemented via goroutines communicating over a channel). Since the author has been pondering the connection between coroutines and state machines in the past, Rob’s talk inspired the author to try approximating this approach in Python using sub-generators.
Since 2011, the author has seen this talk and its technique mentioned many times, both in Golang forums and general programming communities. There’s something in this approach that feels elegant – it’s a problem very well suited for coroutines. However, there was always a small nagging voice in the back of my head doubting the efficiency of the approach.
Since the Author has seen recently found myself playing with lexers again, this seemed like a good opportunity to revisit Rob Pike’s lexer and compare its performance to other approaches, using the same problem and benchmark for fairness.