8 Commits

Author SHA1 Message Date
Damien George 617c7dba3b py/lexer: Use null char as lexer EOF sentinel.
The null byte cannot exist in source code (per CPython), so use it to
indicate the end of the input stream (instead of `(mp_uint_t)-1`).  This
allows the cache chars (chr0/1/2 and their saved versions) to be 8-bit
bytes, making it clear that they are not `unichar` values.  It also saves a
bit of memory in the `mp_lexer_t` data structure.  (And in a future commit
allows the saved cache chars to be eliminated entirely by storing them in
a vstr instead.)

In order to keep code size down, the frequently used `chr0` is still of
type `uint32_t`.  Having it 32-bit means that machine instructions to load
it are smaller (it adds about +80 bytes to Thumb code if `chr0` is changed
to `uint8_t`).

Also add tests for invalid bytes in the input stream to make sure there are
no regressions in this regard.

Signed-off-by: Damien George <damien@micropython.org>
2026-02-04 23:19:09 +11:00
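
The premise of the py/lexer commit above, that a null byte cannot appear in source code, can be checked against CPython with a small sketch. The helper name `fails_to_compile` is hypothetical, and both exception types are caught on the assumption that the type CPython raises for a null byte depends on the interpreter version.

```python
def fails_to_compile(source: str) -> bool:
    """Return True if `source` is rejected by compile()."""
    try:
        compile(source, "<sketch>", "exec")
    except (ValueError, SyntaxError):
        # Assumption: a null byte is reported as one of these two types,
        # depending on the CPython version.
        return True
    return False


print(fails_to_compile("x = 1\n"))      # ordinary source compiles
print(fails_to_compile("x = 1\x00\n"))  # embedded null byte is rejected
```

Because a null byte is never valid input, a lexer can reserve it as an end-of-input marker without losing any legal character value.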
Jeff Epler 13b13d1fdd py/parsenum: Throw an exception for invalid int literals like "01".
This includes making int("01") parse in base 10 like standard Python.
When a base of 0 is specified it means auto-detect based on the prefix, and
literals beginning with 0 (except when the literal is all 0's) like "01" are
then invalid and now throw an exception.

The new error message differs from CPython's. For example:
`SyntaxError: invalid syntax for integer with base 0: '09'`

Additional test cases were added to cover the changed & added code.

Co-authored-by: Damien George <damien@micropython.org>
Signed-off-by: Jeff Epler <jepler@gmail.com>
2025-01-26 22:54:58 +11:00
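
The base handling described in the py/parsenum commit above can be illustrated with a short sketch. It shows the standard Python behaviour the commit aligns with; `parse_int` is a hypothetical helper, and the exact exception type and message on MicroPython may differ from CPython, as the commit message notes.

```python
def parse_int(text, base=10):
    """Return the parsed integer, or None if the literal is rejected."""
    try:
        return int(text, base)
    except (ValueError, SyntaxError):
        # Assumption: the rejection surfaces as one of these exception types.
        return None


print(parse_int("01"))       # base 10: leading zero is accepted, gives 1
print(parse_int("0x1f", 0))  # base 0: prefix selects base 16, gives 31
print(parse_int("000", 0))   # base 0: an all-zero literal is accepted, gives 0
print(parse_int("01", 0))    # base 0: leading zero is rejected, gives None
```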
Damien George 6031957473 tests: Automatically skip tests that require eval, exec or frozenset. 2018-02-14 16:46:44 +11:00
Tom Collins 760aa0996f tests/basics/lexer: Add line continuation tests for lexer.
Tests for an issue with line continuation failing in paste mode due to the
lexer only checking for \n in the "following" character position, before
next_char() has had a chance to convert \r and \r\n to \n.
2017-05-12 15:14:25 +10:00
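
The line continuation case in the commit above can be sketched as follows: a backslash followed by each of the three newline conventions should continue the logical line. This is an illustrative check, not a copy of the test file in the repository; the `results` name is hypothetical.

```python
# Evaluate two adjacent string literals joined by a backslash line
# continuation, once for each newline convention.
results = {}
for newline in ("\n", "\r", "\r\n"):
    source = "'123' \\" + newline + "'456'"
    results[newline] = eval(source)
    print(repr(newline), results[newline])
```

On CPython each case evaluates to `'123456'`, since `\r` and `\r\n` are normalised to `\n` before the continuation is processed.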
Tom Collins d00d062af2 tests/basics/lexer: Add lexer tests for input starting with newlines. 2017-05-09 14:48:00 +10:00
Damien George adccafb42a tests/basics/lexer: Add a test for newline-escaping within a string. 2016-12-22 10:32:06 +11:00
Damien George d241c2a592 py/lexer: Raise SyntaxError when str hex escape sequence is malformed.
Addresses issue #1390.
2015-07-23 23:20:37 +01:00
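
A minimal sketch of the behaviour described in the commit above: a `\x` escape in a string literal needs exactly two hex digits, and anything else should raise SyntaxError. The helper name `raises_syntax_error` is hypothetical.

```python
def raises_syntax_error(source: str) -> bool:
    """Return True if evaluating `source` raises SyntaxError."""
    try:
        eval(source)
    except SyntaxError:
        return True
    return False


print(raises_syntax_error("'\\x41'"))  # well-formed escape
print(raises_syntax_error("'\\x4'"))   # only one hex digit
print(raises_syntax_error("'\\xg1'"))  # non-hex character
```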
Damien George 97abe22963 tests: Add tests to exercise lexer; and some more complex number tests. 2015-04-04 23:16:22 +01:00