If a non-string buffer was passed to execfile, then it would be passed
as a non-null-terminated char* to mp_lexer_new_from_file.
This changes mp_lexer_new_from_file to take a qstr instead (as in almost
all cases a qstr will be created from this input anyway to set the
`__file__` attribute on the module).
This now makes execfile require a string (not generic buffer) argument,
which is probably a good fix to make anyway.
Fixes issue #12522.
This work was funded through GitHub Sponsors.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
mp_reader_new_file() is used to read in files for importing, either .py or
.mpy files, for the lexer and persistent code loader respectively. In both
cases the file should be opened in raw bytes mode: the lexer handles
unicode characters itself, and .mpy files contain 8-bit bytes by nature.
Before this commit importing was working correctly because, although the
file was opened in text mode, all native filesystem implementations (POSIX,
FAT, LFS) would access the file in raw bytes mode via mp_stream_rw()
calling mp_stream_p_t.read(). So it was only an issue for non-native
filesystems, such as those implemented in Python. For Python-based
filesystem implementations, a call to mp_stream_rw() would go via IOBase
and then to readinto() at the Python level, and readinto() is only defined
on files opened in raw bytes mode.
Signed-off-by: Damien George <damien@micropython.org>
This patch simplifies the str creation API to favour the common case of
creating a str object that is not forced to be interned. To force
interning of a new str the new mp_obj_new_str_via_qstr function is added,
and should only be used if warranted.
Apart from simplifying the mp_obj_new_str function (and making it have the
same signature as mp_obj_new_bytes), this patch also reduces code size by a
bit (-16 bytes for bare-arm and roughly -40 bytes on the bare-metal archs).
This patch refactors the error handling in the lexer, to simplify it (ie
reduce code size).
A long time ago, when the lexer/parser/compiler were first written, the
lexer and parser were designed so they didn't use exceptions (ie nlr) to
report errors but rather returned an error code. Over time that has
gradually changed, the parser in particular has more and more ways of
raising exceptions. Also, the lexer never really handled all errors without
raising, eg there were some memory errors which could raise an exception
(and in these rare cases one would get a fatal nlr-not-handled fault).
This patch accepts the fact that the lexer can raise exceptions in some
cases and allows it to raise exceptions to handle all its errors, which are
for the most part just out-of-memory errors during construction of the
lexer. This makes the lexer a bit simpler, and also the persistent code
stuff is simplified.
What this means for users of the lexer is that calls to it must be wrapped
in a nlr handler. But all uses of the lexer already have such an nlr
handler for the parser (and compiler) so that doesn't put any extra burden
on the callers.
This provides mp_vfs_XXX functions (eg mount, open, listdir) which are
agnostic to the underlying filesystem type, and just require an object with
the relevant filesystem-like methods (eg .mount, .open, .listidr) which can
then be mounted.
These mp_vfs_XXX functions would typically be used by a port to implement
the "uos" module, and mp_vfs_open would be the builtin open function.
This feature is controlled by MICROPY_VFS, disabled by default.