The goal here is to remove a slot (making way to turn make_new into a slot)
as well as reduce code size by the ~40 references to mp_identity_getiter
and mp_stream_unbuffered_iter.
This introduces two new type flags:
- MP_TYPE_FLAG_ITER_IS_ITERNEXT: This means that the "iter" slot in the
type is "iternext", and should use the identity getiter.
- MP_TYPE_FLAG_ITER_IS_CUSTOM: This means that the "iter" slot is a pointer
to a mp_getiter_iternext_custom_t instance, which then defines both
getiter and iternext.
And a third flag that is the OR of both, MP_TYPE_FLAG_ITER_IS_STREAM: This
means that the type should use the identity getiter, and
mp_stream_unbuffered_iter as iternext.
Finally, MP_TYPE_FLAG_ITER_IS_GETITER is defined as a no-op flag to give
the default case where "iter" is "getiter".
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
The existings mp_obj_type_t uses a sparse representation for slots for the
capability methods of the type (eg print, make_new). This commit adds a
compact slot-index representation. The basic idea is that where the
mp_obj_type_t struct used to have 12 pointer fields, it now has 12 uint8_t
indices, and a variable-length array of pointers. So in the best case (no
fields used) it saves 12x4-12=36 bytes (on a 32-bit machine) and in the
common case (three fields used) it saves 9x4-12=24 bytes.
Overall with all associated changes, this slot-index representation reduces
code size by 1000 to 3000 bytes on bare-metal ports. Performance is
marginally better on a few tests (eg about 1% better on misc_pystone.py and
misc_raytrace.py on PYBv1.1), but overall marginally worse by a percent or
so.
See issue #7542 for further analysis and discussion.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This will always have the maximum/minimum size of a mp_obj_type_t
representation and can be used as a member in other structs.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This will allow the structure of mp_obj_type_t to change while keeping the
definition code the same.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
The buffer protocol type only has a single member, and this existing layout
creates problems for the upcoming split/slot-index mp_obj_type_t layout
optimisations.
If we need to make the buffer protocol more sophisticated in the future
either we can rely on the mp_obj_type_t optimisations to just add
additional slots to mp_obj_type_t or re-visit the buffer protocol then.
This change is a no-op in terms of generated code.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
All uses of this are either tiny strings or not-known-to-be-safe.
Update comments for mp_obj_new_str_copy and mp_obj_new_str_of_type.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
The new `mp_obj_new_str_from_utf8_vstr` can be used when you know you
already have a unicode-safe string.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Previously the desired output type was specified. Now make the type part
of the function name. Because this function is used in a few places this
saves code size due to smaller call-site.
This makes `mp_obj_new_str_type_from_vstr` a private function of objstr.c
(which is almost the only place where the output type isn't a compile-time
constant).
This saves ~140 bytes on PYBV11.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This commit adds the bytes methods to bytearray, matching CPython. The
existing implementations of these methods for str/bytes are reused for
bytearray with minor updates to match CPython return types.
For details on the CPython behaviour see
https://docs.python.org/3/library/stdtypes.html#bytes-and-bytearray-operations
The work to merge locals tables for str/bytes/bytearray/array was done by
@jimmo. Because of this merging of locals the change in code size for this
commit is mostly negative:
bare-arm: +0 +0.000%
minimal x86: +29 +0.018%
unix x64: -792 -0.128% standard[incl -448(data)]
unix nanbox: -436 -0.078% nanbox[incl -448(data)]
stm32: -40 -0.010% PYBV10
cc3200: -32 -0.017%
esp8266: -28 -0.004% GENERIC
esp32: -72 -0.005% GENERIC[incl -200(data)]
mimxrt: -40 -0.011% TEENSY40
renesas-ra: -40 -0.006% RA6M2_EK
nrf: -16 -0.009% pca10040
rp2: -64 -0.013% PICO
samd: +148 +0.105% ADAFRUIT_ITSYBITSY_M4_EXPRESS
This adds new compile-time infrastructure to parse source code files for
`MP_REGISTER_ROOT_POINTER()` and generates a new `root_pointers.h` header
file containing the collected declarations. This works the same as the
existing `MP_REGISTER_MODULE()` feature.
Signed-off-by: David Lechner <david@pybricks.com>
Zero effect on non debug builds, and also usually optimized out even in
debug builds as mp_obj_is_type() is called with a compile-time known type.
I'm not sure we even have dynamic uses of mp_obj_is_type() at the moment,
but if we ever will they will be protected from now on.
Signed-off-by: Yonatan Goldschmidt <yon.goldschmidt@gmail.com>
Commit d96cfd13e3 introduced a regression by breaking existing
users of mp_obj_is_type(.., &mp_obj_bool). This function (and associated
helpers like mp_obj_is_int()) have some specific nuances, and mistakes like
this one can happen again.
This commit adds mp_obj_is_exact_type() which behaves like the the old
mp_obj_is_type(). The new mp_obj_is_type() has the same prototype but it
attempts to statically assert that it's not called with types which should
be checked using mp_obj_is_type(). If called with any of these types: int,
str, bool, NoneType - it will cause a compilation error. Additional
checked types (e.g function types) can be added in the future.
Existing users of mp_obj_is_type() with the now "invalid" types, were
translated to use mp_obj_is_exact_type().
The use of MP_STATIC_ASSERT() is not bulletproof - usually GCC (and other
compilers) can't statically check conditions that are only known during
link-time (like variables' addresses comparison). However, in this case,
GCC is able to statically detect these conditions, probably because it's
the exact same object - `&mp_type_int == &mp_type_int` is detected.
Misuses of this function with runtime-chosen types (e.g:
`mp_obj_type_t *x = ...; mp_obj_is_type(..., x);` won't be detected. MSC
is unable to detect this, so we use MP_STATIC_ASSERT_NOT_MSC().
Compiling with this commit and without the fix for d96cfd13e3 shows
that it detects the problem.
Signed-off-by: Yonatan Goldschmidt <yon.goldschmidt@gmail.com>
It's no longer needed because this macro is now processed after
preprocessing the source code via cpp (in the qstr extraction stage), which
means unused MP_REGISTER_MODULE's are filtered out by the preprocessor.
Signed-off-by: Damien George <damien@micropython.org>
This cleans up the parsing of MP_REGISTER_MODULE() and generation of
genhdr/moduledefs.h so that it uses the same process as compressed error
string messages, using the output of qstr extraction.
This makes sure all MP_REGISTER_MODULE()'s that are part of the build are
correctly picked up. Previously the extraction would miss some (eg if you
had a mod.c file in the board directory for an stm32 board).
Build speed is more or less unchanged.
Thanks to @stinos for the ports/windows/msvc/genhdr.targets changes.
Signed-off-by: Damien George <damien@micropython.org>
This is to replace the following:
mp_foo_obj_t *self = m_new_obj(mp_foo_obj_t);
self->base.type = &mp_type_foo;
with:
mp_foo_obj_t *self = mp_obj_malloc(mp_foo_obj_t, &mp_type_foo);
Calling the function is less code than inlining setting the type
everywhere, adds up to ~100 bytes on PYBV11.
It also helps to avoid an easy mistake of forgetting to set the type.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
Make it possible to proceed to a regular lookup in locals_dict if the
custom type->attr fails. This allows type->attr to extend rather than
completely replace the lookup in locals_dict.
This is useful for custom builtin classes that have mostly regular methods
but just a few special attributes/properties. This way, type->attr needs
to deal with the special cases only and the default lookup will be used for
generic methods.
Signed-off-by: Laurens Valk <laurens@pybricks.com>
This introduces a new option, MICROPY_ERROR_REPORTING_NONE, which
completely disables all error messages. To be used in cases where
MicroPython needs to fit in very limited systems.
Signed-off-by: Damien George <damien@micropython.org>
Because the argument arrays may overlap, as show by the new tests in this
commit.
Also remove the debugging comments for these macros, add a new comment
about overlapping regions, and separate the macros by blank lines to make
them easier to read.
Fixes issue #6244.
Signed-off-by: Damien George <damien@micropython.org>
This commit fixes lookups of class members to make it so that built-in
functions that are used as methods/functions of a class work correctly.
The mp_convert_member_lookup() function is pretty much completely changed
by this commit, but for the most part it's just reorganised and the
indenting changed. The functional changes are:
- staticmethod and classmethod checks moved to later in the if-logic,
because they are less common and so should be checked after the more
common cases.
- The explicit mp_obj_is_type(member, &mp_type_type) check is removed
because it's now subsumed by other, more general tests in this function.
- MP_TYPE_FLAG_BINDS_SELF and MP_TYPE_FLAG_BUILTIN_FUN type flags added to
make the checks in this function much simpler (now they just test this
bit in type->flags).
- An extra check is made for mp_obj_is_instance_type(type) to fix lookup of
built-in functions.
Fixes#1326 and #6198.
Signed-off-by: Damien George <damien@micropython.org>
This allows complex binary operations to fail gracefully with unsupported
operation rather than raising an exception, so that special methods work
correctly.
Signed-off-by: Damien George <damien@micropython.org>
Long ago, prior to 0ef01d0a75, fixed and
ordered maps were the same setting with the "table_is_fixed_array" member
of mp_map_t. But these settings are actually independent, and it is
possible to have is_fixed=1, is_ordered=0 (although this can currently
only be done by tools/cc1). So update the comments to reflect this.
The behavior mirrors the instance object dict attribute where a copy of the
local attributes are provided (unless the dict is read-only, then that dict
itself is returned, as an optimisation). MicroPython does not support
modifying this dict because the changes will not be reflected in the class.
The feature is only enabled if MICROPY_CPYTHON_COMPAT is set, the same as
the instance version.
Initially some of these were found building the unix coverage variant on
MacOS because that build uses clang and has -Wdouble-promotion enabled, and
clang performs more vigorous promotion checks than gcc. Additionally the
codebase has been compiled with clang and msvc (the latter with warning
level 3), and with MICROPY_FLOAT_IMPL_FLOAT to find the rest of the
conversions.
Fixes are implemented either as explicit casts, or by using the correct
type, or by using one of the utility functions to handle floating point
casting; these have been moved from nativeglue.c to the public API.
TimeoutError was added back in 077812b2ab for
the cc3200 port. In f522849a4d the cc3200
port enabled use of it in the socket module aliased to socket.timeout. So
it was never added to the builtins. Then it was replaced by
OSError(ETIMEDOUT) in 047af9b10b.
The esp32 port enables this exception, since the very beginning of that
port, but it could never be accessed because it's not in builtins.
It's being removed: 1) to not encourage its use; 2) because there are a lot
of other OSError subclasses which are not defined at all, and having
TimeoutError is a bit inconsistent.
Note that ports can add anything to the builtins via MICROPY_PORT_BUILTINS.
And they can also define their own exceptions using the
MP_DEFINE_EXCEPTION() macro.
The decompression of error-strings is only done if the string is accessed
via printing or via er.args. Tests are added for this feature to ensure
the decompression works.
And rename it to mp_obj_cast_to_native_base() to indicate this. This
allows users of this function to easily support native and native-subclass
objects in the same way (by just passing the object through this function).
Follow up to recent commit ad7213d3c3, the
name "varg2" is misleading, vlist describes better that the argument is a
va_list. This name also matches CircuitPython, which already has such
helper functions.
Both bool and namedtuple will check against other types for equality; int,
float and complex for bool, and tuple for namedtuple. So to make them work
after the recent commit 3aab54bf43 they would
need MP_TYPE_FLAG_NEEDS_FULL_EQ_TEST set. But that makes all bool and
namedtuple equality checks less efficient because mp_obj_equal_not_equal()
could no longer short-cut x==x, and would need to try __ne__. To improve
this, this commit splits the MP_TYPE_FLAG_NEEDS_FULL_EQ_TEST flags into 3
separate flags to give types more fine-grained control over how their
equality behaves. These new flags are then used to fix bool and namedtuple
equality.
Fixes issue #5615 and #5620.
This commit implements a more complete replication of CPython's behaviour
for equality and inequality testing of objects. This addresses the issues
discussed in #5382 and a few other inconsistencies. Improvements over the
old code include:
- Support for returning non-boolean results from comparisons (as used by
numpy and others).
- Support for non-reflexive equality tests.
- Preferential use of __ne__ methods and MP_BINARY_OP_NOT_EQUAL binary
operators for inequality tests, when available.
- Fallback to op2 == op1 or op2 != op1 when op1 does not implement the
(in)equality operators.
The scheme here makes use of a new flag, MP_TYPE_FLAG_NEEDS_FULL_EQ_TEST,
in the flags word of mp_obj_type_t to indicate if various shortcuts can or
cannot be used when performing equality and inequality tests. Currently
four built-in classes have the flag set: float and complex are
non-reflexive (since nan != nan) while bytearray and frozenszet instances
can equal other builtin class instances (bytes and set respectively). The
flag is also set for any new class defined by the user.
This commit also includes a more comprehensive set of tests for the
behaviour of (in)equality operators implemented in special methods.
Commit d96cfd13e3 introduced a regression in
testing for bool objects, that such objects were in some cases no longer
recognised and bools, eg when using mp_obj_is_type(o, &mp_type_bool), or
mp_obj_is_integer(o).
This commit fixes that problem by adding mp_obj_is_bool(o). Builds with
MICROPY_OBJ_IMMEDIATE_OBJS enabled check if the object is any of the const
True or False objects. Builds without it use the old method of ->type
checking, which compiles to smaller code (compared with the former
mentioned method).
Fixes#5538.
Can be used where mp_obj_int_get_checked() will overflow due to the
sign-bit solely. This returns an mp_uint_t, so it also verifies the given
integer is not negative.
Currently implemented only for mpz configurations.
This option (enabled by default for object representation A, B, C) makes
None/False/True objects immediate objects, ie they are no longer a concrete
object in ROM but are rather just values, eg None=0x6 for representation A.
Doing this saves a considerable amount of code size, due to these objects
being widely used:
bare-arm: -392 -0.591%
minimal x86: -252 -0.170% [incl +52(data)]
unix x64: -624 -0.125% [incl -128(data)]
unix nanbox: +0 +0.000%
stm32: -1940 -0.510% PYBV10
cc3200: -1216 -0.659%
esp8266: -404 -0.062% GENERIC
esp32: -732 -0.064% GENERIC[incl +48(data)]
nrf: -988 -0.675% pca10040
samd: -564 -0.556% ADAFRUIT_ITSYBITSY_M4_EXPRESS
Thanks go to @Jongy aka Yonatan Goldschmidt for the idea.
This commit adjusts the definition of qstr encoding in all object
representations by taking a single bit from the qstr space and using it to
distinguish between qstrs and a new kind of literal object: immediate
objects. In other words, the qstr space is divided in two pieces, one half
for qstrs and the other half for immediate objects.
There is still enough room for qstr values (29 bits in representation A on
a 32-bit architecture, and 19 bits in representation C) and the new
immediate objects can be used for things like None, False and True.