With this patch parse nodes are allocated sequentially in chunks. This
reduces fragmentation of the heap and prevents waste at the end of
individually allocated parse nodes.
Saves roughly 20% of RAM during parse stage.
This patch adds more fine grained error message control for errors when
parsing integers (now has terse, normal and detailed). When detailed is
enabled, the error now escapes bytes when printing them so they can be
more easily seen.
When creating constant mpz's, the length of the mpz must be exactly how
many digits are used (not allocated) otherwise these numbers are not
compatible with dynamically allocated numbers.
Addresses issue #1448.
4 spaces are added at start of line to match previous indent, and if
previous line ended in colon.
Backspace deletes 4 space if only spaces begin a line.
Configurable via MICROPY_REPL_AUTO_INDENT. Disabled by default.
This optimises (in speed and code size) for the common case where the
binary op for the bool object is supported. Unsupported binary ops
still behave the same.
Function annotations are only needed when the native emitter is enabled
and when the current scope is emitted in viper mode. All other times
the annotations can be skipped completely.
Fetch the current usb mode and return a string representation when
pyb.usb_mode() is called with no args. The possible string values are interned
as qstr's. None will be returned if an incorrect mode is set.
Indeed, this flag efectively selects architecture target, and must
consistently apply to all compiles and links, including 3rd-party
libraries, unlike CFLAGS, which have MicroPython-specific setting.
unix-cpy was originally written to get semantic equivalent with CPython
without writing functional tests. When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct. The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly. And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.
But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:
1. It outputs CPython 3.3 compatible bytecode. CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5. There's no point in changing the bytecode output to match
CPython anymore.
2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.
3. The bytecode tests are not run. They were never part of Travis and
are not run locally anymore.
4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.
5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode. Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
Previous to this patch there were some cases where line numbers for
errors were 0 (unknown). Now the compiler attempts to give a better
line number where possible, in some cases giving the line number of the
closest statement, and other cases the line number of the inner-most
scope of the error (eg the line number of the start of the function).
This helps to give good (and sometimes exact) line numbers for
ViperTypeError exceptions.
This patch also makes sure that the first compile error (eg SyntaxError)
that is encountered is reported (previously it was the last one that was
reported).
When looking to see if the REPL input needs to be continued on the next
line, don't look inside strings for unmatched ()[]{} ''' or """.
Addresses issue #1387.
ViperTypeError now includes filename and function name where the error
occurred. The line number is the line number of the start of the
function definition, which is the best that can be done without a lot
more work.
Partially addresses issue #1381.
This patch makes configurable, via MICROPY_QSTR_BYTES_IN_HASH, the
number of bytes used for a qstr hash. It was originally fixed at 2
bytes, and now defaults to 2 bytes. Setting it to 1 byte will save
ROM and RAM at a small expense of hash collisions.
Previous to this patch all interned strings lived in their own malloc'd
chunk. On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).
With this patch interned strings are concatenated into the same malloc'd
chunk when possible. Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.
RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned). New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.
Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
I checked the entire codebase, and every place that vstr_init_len
was called, there was a call to mp_obj_new_str_from_vstr after it.
mp_obj_new_str_from_vstr always tries to reallocate a new buffer
1 byte larger than the original to store the terminating null
character.
In many cases, if we allocated the initial buffer to be 1 byte
longer, we can prevent this extra allocation, and just reuse
the originally allocated buffer.
Asking to read 256 bytes and only getting 100 will still cause
the extra allocation, but if you ask to read 256 and get 256
then the extra allocation will be optimized away.
Yes - the reallocation is optimized in the heap to try and reuse
the buffer if it can, but it takes quite a few cycles to figure
this out.
Note by Damien: vstr_init_len should now be considered as a
string-init convenience function and used only when creating
null-terminated objects.
Previous to this patch, if "abcd" and "ab" were possible completions
to tab-completing "a", then tab would expand to "abcd" straight away
if this identifier appeared first in the dict.
The TimeoutError is useful for some modules, specially the the
socket module. TimeoutError can then be alised to socket.timeout
and then Python code can differentiate between socket.error and
socket.timeout.
When "micropython -m pkg.mod" command was used, relative imports in pkg.mod
didn't work, because pkg.mod.__name__ was set to __main__, and the fact that
it's a package submodule was missed. This is an original workaround to this
issue. TODO: investigate and compare how CPython deals with this issue.
Previous to this patch each time a bytes object was referenced a new
instance (with the same data) was created. With this patch a single
bytes object is created in the compiler and is loaded directly at execute
time as a true constant (similar to loading bignum and float objects).
This saves on allocating RAM and means that bytes objects can now be
used when the memory manager is locked (eg in interrupts).
The MP_BC_LOAD_CONST_BYTES bytecode was removed as part of this.
Generated bytecode is slightly larger due to storing a pointer to the
bytes object instead of the qstr identifier.
Code size is reduced by about 60 bytes on Thumb2 architectures.
Previous to this patch a call such as list.append(1, 2) would lead to a
seg fault. This is because list.append is a builtin method and the first
argument to such methods is always assumed to have the correct type.
Now, when a builtin method is extracted like this it is wrapped in a
checker object which checks the the type of the first argument before
calling the builtin function.
This feature is contrelled by MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG and
is enabled by default.
See issue #1216.
mpconfigport.mk contains configuration options which affect the way
MicroPython is linked. In this regard, it's "stronger" configuration
dependency than even mpconfigport.h, so if we rebuild everything on
mpconfigport.h change, we certianly should of that on mpconfigport.mk
change too.
If heap allocation for the Python-stack of a function fails then we may
as well allocate the Python-stack on the C stack. This will allow to
run more code without using the heap.
This allows to do "ar[i]" and "ar[i] = val" in viper when ar is a Python
object and i and/or val are native viper types (eg ints).
Patch also includes tests for this feature.
This patch converts Q(abc) to "Q(abc)" to protect the abc from the
C preprocessor, then converts back after the preprocessor is finished.
So now we can safely put includes in mpconfig(port).h, and also
preprocess qstrdefsport.h (latter is now done also in this patch).
Addresses issue #1252.
C's printf will pad nan/inf differently to CPython. Our implementation
originally conformed to C, now it conforms to CPython's way.
Tests for this are also added in this patch.
This drops the size of unicode_isxdigit from 0x1e + 0x02 filler to
0x14 bytes (so net code reduction of 12 bytes) and will make
unicode_is_xdigit perform slightly faster.
This allows using (almost) the same code for printing floats everywhere,
removes the dependency on sprintf and uses just snprintf and
applies an msvc-specific fix for snprintf in a single place so
nan/inf are now printed correctly.
mp_obj_get_int_truncated will raise a TypeError if the argument is not
an integral type. Use mp_obj_int_get_truncated only when you know the
argument is a small or big int.
Hashing is now done using mp_unary_op function with MP_UNARY_OP_HASH as
the operator argument. Hashing for int, str and bytes still go via
fast-path in mp_unary_op since they are the most common objects which
need to be hashed.
This lead to quite a bit of code cleanup, and should be more efficient
if anything. It saves 176 bytes code space on Thumb2, and 360 bytes on
x86.
The only loss is that the error message "unhashable type" is now the
more generic "unsupported type for __hash__".
Unfortunately, MP_OBJ_STOP_ITERATION doesn't have means to pass an associated
value, so we can't optimize StopIteration exception with (non-None) argument
to MP_OBJ_STOP_ITERATION.
When generator raises exception, it is automatically terminated (by setting
its code_state.ip to 0), which interferes with this check.
Triggered in particular by CPython's test_pep380.py.
Exceptions in .close() should be ignored (dumped to sys.stderr, not
propagated), but in uPy, they are propagated. Fix would require
nlr-wrapping .close() call, which is expensive. Bu on the other hand,
.close() is not called often, so maybe that's not too bad (depends,
if it's finally called and that causes stack overflow, there's nothing
good in that). And yet on another hand, .close() can be implemented to
catch exceptions on its side, and that should be the right choice.
The code was apparently broken after 9988618e0e
"py: Implement full func arg passing for native emitter.". This attempts to
propagate those changes to ARM emitter.
User instances are hashable by default (using __hash__ inherited from
"object"). But if __eq__ is defined and __hash__ not defined in particular
class, instance is not hashable.
Having NotImplemented as MP_OBJ_SENTINEL turned out to be problematic
(it needs to be checked for in a lot of places, otherwise it'll crash
as would pass MP_OBJ_IS_OBJ()), so made a proper singleton value like
Ellipsis, both of them sharing the same type.
From https://docs.python.org/3/library/constants.html#NotImplemented :
"Special value which should be returned by the binary special methods
(e.g. __eq__(), __lt__(), __add__(), __rsub__(), etc.) to indicate
that the operation is not implemented with respect to the other type;
may be returned by the in-place binary special methods (e.g. __imul__(),
__iand__(), etc.) for the same purpose. Its truth value is true."
Some people however appear to abuse it to mean "no value" when None is
a legitimate value (don't do that).
Can complete names in the global namespace, as well as a chain of
attributes, eg pyb.Pin.board.<tab> will give a list of all board pins.
Costs 700 bytes ROM on Thumb2 arch, but greatly increases usability of
REPL prompt.
This doesn't handle case fo enclosed except blocks, but once again,
sys.exc_info() support is a workaround for software which uses it
instead of properly catching exceptions via variable in except clause.