I’m playing with Leetcode 65 - Valid Number. I prefer some solution that is more generic. Hence, I made a parser with Python. It’s based on the idea of monadic parsing, common in functional languages. The source code can be found in this Gist.
Roughly speaking, a parser is a stream processor that consumes a stream, and emits the remainder of the stream and a parsing result. It can also return None if the parsing fails. We define a Parser class. In its abstract form, it has a function called parse:
Then we have a Many combinator, which greedily consumes the stream with a specific parser for 0 or more times, until it can no longer consume the stream. This parser will never fail:
The Many1 combinator is very similar to Many, except that it requires the parser to succe ed at least once.
Then we can have a Char parser. It requires a specific character to appear in the front of the stream. Otherwise, it fails:
We have a Satisfy parser combinator, which generates a parser that succeeds if the front of the stream satisfies the condition supplied as the lambda function.
We also have a Return parser that “lifts” a normal value into the parser context (exactly the return operator defined for Monad).
A Bind combinator will sends the result of first parser to the function, which processes the result and generates a new parser.
The Sequential parser combinator takes in 2 parsers. It parses the stream with the first parser. If it succeeds, it ignores the result and proceeds with the second parser. If either of the parsers fails, it fails. The Sequential parser combinator can be defined in terms of Bind combinator:
The Try parser tries a parser without consuming the stream. If it suceeds, it returns the result. Otherwise, it returns the original stream without signaling failure.
The Alternative parser combinator represents the selection between 2 parsers. If 1 fails, it tries the second.
Now let’s see how a correct number should be defined. A parsable number consists of zero or more spaces, followed by an optional sign , a numeric part, an optional exponential part and many spaces. The numeric part can be one of: pure integers (-1, 3, +2), a float starting with a point and no fraction part (1., -100.), a float with only fraction part (.3) and a float with both part (1.5). The exponential part should start with a letter “e” or “E” followed by an optional sign and an integer:
number ::= space* (+|-)? numeric exponential? space* numeric ::= digit+.?digit* | .digit+ exponential ::= E|e (+|-)? digit+
The above rule can be expressed clearly with parser combinators:
And the validation function just tries to parse the number and tests if the whole stream is consumed (required by the question):
Since the grammar is unambigourous, and every starting symbol is different in the rule, it only takes linear time to validate the number.
Currently the Python parser does not handle parsing results. But with “Bind”, it’s possible to do so. The parser can be extended to fully functional recursive descendant parser easily.