/*
 * This file is part of the MicroPython project, http://micropython.org/
 *
 * The MIT License (MIT)
 *
 * Copyright (c) 2013-2020 Damien P. George
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

#include "py/scope.h"
#include "py/emit.h"
#include "py/compile.h"
#include "py/runtime.h"
#include "py/asmbase.h"
py: Rework bytecode and .mpy file format to be mostly static data.

Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.

But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).

The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).

This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.

With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).

The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).

In more detail, in the VM what used to be (schematically):

    qst = DECODE_QSTR_VALUE;

is now (schematically):

    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];

That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.

Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.

The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before

The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:

    diff of scores (higher is better)
    N=100 M=100              baseline -> this-commit  diff   diff% (error%)
    bm_chaos.py                371.07 ->  357.39 :  -13.68 = -3.687% (+/-0.02%)
    bm_fannkuch.py              78.72 ->   77.49 :   -1.23 = -1.563% (+/-0.01%)
    bm_fft.py                 2591.73 -> 2539.28 :  -52.45 = -2.024% (+/-0.00%)
    bm_float.py               6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
    bm_hexiom.py                48.96 ->   47.93 :   -1.03 = -2.104% (+/-0.00%)
    bm_nqueens.py             4510.63 -> 4459.94 :  -50.69 = -1.124% (+/-0.00%)
    bm_pidigits.py             650.28 ->  644.96 :   -5.32 = -0.818% (+/-0.23%)
    core_import_mpy_multi.py   564.77 ->  581.49 :  +16.72 = +2.960% (+/-0.01%)
    core_import_mpy_single.py   68.67 ->   67.16 :   -1.51 = -2.199% (+/-0.01%)
    core_qstr.py                64.16 ->   64.12 :   -0.04 = -0.062% (+/-0.00%)
    core_yield_from.py         362.58 ->  354.50 :   -8.08 = -2.228% (+/-0.00%)
    misc_aes.py                429.69 ->  405.59 :  -24.10 = -5.609% (+/-0.01%)
    misc_mandel.py            3485.13 -> 3416.51 :  -68.62 = -1.969% (+/-0.00%)
    misc_pystone.py           2496.53 -> 2405.56 :  -90.97 = -3.644% (+/-0.01%)
    misc_raytrace.py           381.47 ->  374.01 :   -7.46 = -1.956% (+/-0.01%)
    viper_call0.py             576.73 ->  572.49 :   -4.24 = -0.735% (+/-0.04%)
    viper_call1a.py            550.37 ->  546.21 :   -4.16 = -0.756% (+/-0.09%)
    viper_call1b.py            438.23 ->  435.68 :   -2.55 = -0.582% (+/-0.06%)
    viper_call1c.py            442.84 ->  440.04 :   -2.80 = -0.632% (+/-0.08%)
    viper_call2a.py            536.31 ->  532.35 :   -3.96 = -0.738% (+/-0.06%)
    viper_call2b.py            382.34 ->  377.07 :   -5.27 = -1.378% (+/-0.03%)

And for unix on x64:

    diff of scores (higher is better)
    N=2000 M=2000     baseline -> this-commit   diff    diff% (error%)
    bm_chaos.py       13594.20 ->  13073.84 :  -520.36 = -3.828% (+/-5.44%)
    bm_fannkuch.py       60.63 ->     59.58 :    -1.05 = -1.732% (+/-3.01%)
    bm_fft.py        112009.15 -> 111603.32 :  -405.83 = -0.362% (+/-4.03%)
    bm_float.py      246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
    bm_hexiom.py        615.65 ->    617.21 :    +1.56 = +0.253% (+/-1.64%)
    bm_nqueens.py    215807.95 -> 215600.96 :  -206.99 = -0.096% (+/-3.52%)
    bm_pidigits.py     8246.74 ->   8422.82 :  +176.08 = +2.135% (+/-3.64%)
    misc_aes.py       16133.00 ->  16452.74 :  +319.74 = +1.982% (+/-1.50%)
    misc_mandel.py   128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
    misc_pystone.py   83811.49 ->  83124.85 :  -686.64 = -0.819% (+/-1.03%)
    misc_raytrace.py  21688.02 ->  21385.10 :  -302.92 = -1.397% (+/-3.20%)

The code size change is (firmware with a lot of frozen code benefits the
most):

       bare-arm:   +396 +0.697%
    minimal x86:  +1595 +0.979% [incl +32(data)]
       unix x64:  +2408 +0.470% [incl +800(data)]
    unix nanbox:  +1396 +0.309% [incl -96(data)]
          stm32:  -1256 -0.318% PYBV10
         cc3200:   +288 +0.157%
        esp8266:   -260 -0.037% GENERIC
          esp32:   -216 -0.014% GENERIC[incl -1072(data)]
            nrf:   +116 +0.067% pca10040
            rp2:   -664 -0.135% PICO
           samd:   +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS

As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.

In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.

Signed-off-by: Damien George <damien@micropython.org>
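
The qstr indirection described in the commit message can be illustrated with a small standalone sketch. It is not the VM's actual decoder: the names `global_pool`, `qstr_t`, `module_context_t`, `decode_qstr` and `qstr_to_str` are hypothetical, and the index is assumed to be exactly one byte, whereas the commit only says indices are "mostly" 1 byte. The point it shows is that the bytecode array stays constant while only the small per-module table maps local indices to global qstr numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for the firmware's global pool of interned strings.
static const char *const global_pool[] = { "", "print", "len", "foo", "bar" };

typedef uint16_t qstr_t; // global qstr number (an index into global_pool)

// Per-module context: the table is built once at import time and maps a
// local qstr index (as stored in the bytecode) to a global qstr number.
typedef struct {
    const qstr_t *qstr_table;
    size_t n_qstr;
} module_context_t;

// Read a one-byte local index from the bytecode stream, advance the
// instruction pointer, and translate the index through the module's table.
// Assumption: a fixed one-byte encoding, chosen to keep the sketch short.
static qstr_t decode_qstr(const module_context_t *ctx, const uint8_t **ip) {
    uint8_t idx = *(*ip)++;
    assert(idx < ctx->n_qstr);
    return ctx->qstr_table[idx];
}

// Look up the string data for a global qstr number.
static const char *qstr_to_str(qstr_t q) {
    return global_pool[q];
}
```

Because the bytecode only ever holds local indices, the same byte sequence could sit in read-only memory and be shared unchanged across firmware builds whose global qstr numbering differs; only `qstr_table` would need to be rebuilt on import.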
#include "py/nativeglue.h"
#include "py/persistentcode.h"

#if MICROPY_ENABLE_COMPILER

// TODO need to mangle __attr names

#define INVALID_LABEL (0xffff)

typedef enum {
// define rules with a compile function
#define DEF_RULE(rule, comp, kind, ...) PN_##rule,
#define DEF_RULE_NC(rule, kind, ...)
#include "py/grammar.h"
#undef DEF_RULE
#undef DEF_RULE_NC
    PN_const_object, // special node for a constant, generic Python object
// define rules without a compile function
#define DEF_RULE(rule, comp, kind, ...)
#define DEF_RULE_NC(rule, kind, ...) PN_##rule,
#include "py/grammar.h"
#undef DEF_RULE
#undef DEF_RULE_NC
} pn_kind_t;
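
The enum above is built by expanding the same rule list twice with different macro definitions (often called the X-macro pattern). The following standalone sketch shows the idea under simplifying assumptions: the rule list is a hypothetical three-entry `GRAMMAR_RULES` macro instead of the real `py/grammar.h` include, and the macros take a fixed number of arguments rather than `...`.

```c
#include <assert.h>

// Hypothetical rule list standing in for py/grammar.h.
// DEF_RULE entries have a compile function, DEF_RULE_NC entries do not.
#define GRAMMAR_RULES \
    DEF_RULE(file_input, compile_file_input, 0) \
    DEF_RULE_NC(atom_paren, 0) \
    DEF_RULE(pass_stmt, compile_pass_stmt, 0)

typedef enum {
// first expansion: only rules with a compile function produce an enumerator
#define DEF_RULE(rule, comp, kind) PN_##rule,
#define DEF_RULE_NC(rule, kind)
    GRAMMAR_RULES
#undef DEF_RULE
#undef DEF_RULE_NC
    PN_const_object, // marks the boundary between the two groups
// second expansion: only rules without a compile function produce an enumerator
#define DEF_RULE(rule, comp, kind)
#define DEF_RULE_NC(rule, kind) PN_##rule,
    GRAMMAR_RULES
#undef DEF_RULE
#undef DEF_RULE_NC
} pn_kind_t;
```

With this ordering every rule that has a compile function receives a value below `PN_const_object`, so in this sketch a table of compile functions could be indexed directly by the enum value; whether the real compiler relies on exactly that property is not shown in this excerpt.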

// Whether a mp_parse_node_struct_t that has pns->kind == PN_testlist_comp
// corresponds to a list comprehension or generator.
#define MP_PARSE_NODE_TESTLIST_COMP_HAS_COMP_FOR(pns) \
    (MP_PARSE_NODE_STRUCT_NUM_NODES(pns) == 2 && \
    MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[1], PN_comp_for))

#define NEED_METHOD_TABLE MICROPY_EMIT_NATIVE

#if NEED_METHOD_TABLE

// we need a method table to do the lookup for the emitter functions
#define EMIT(fun) (comp->emit_method_table->fun(comp->emit))
#define EMIT_ARG(fun, ...) (comp->emit_method_table->fun(comp->emit, __VA_ARGS__))
#define EMIT_LOAD_FAST(qst, local_num) (comp->emit_method_table->load_id.local(comp->emit, qst, local_num, MP_EMIT_IDOP_LOCAL_FAST))
#define EMIT_LOAD_GLOBAL(qst) (comp->emit_method_table->load_id.global(comp->emit, qst, MP_EMIT_IDOP_GLOBAL_GLOBAL))

#else

// if we only have the bytecode emitter enabled then we can do a direct call to the functions
#define EMIT(fun) (mp_emit_bc_##fun(comp->emit))
#define EMIT_ARG(fun, ...) (mp_emit_bc_##fun(comp->emit, __VA_ARGS__))
#define EMIT_LOAD_FAST(qst, local_num) (mp_emit_bc_load_local(comp->emit, qst, local_num, MP_EMIT_IDOP_LOCAL_FAST))
#define EMIT_LOAD_GLOBAL(qst) (mp_emit_bc_load_global(comp->emit, qst, MP_EMIT_IDOP_GLOBAL_GLOBAL))

#endif
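
The two `EMIT` variants above trade an indirect call through a function-pointer table for a direct call whose name is formed by token pasting. A minimal standalone sketch of that choice follows. Everything in it is hypothetical (`emit_t`, `emit_bc_pop_top`, `USE_METHOD_TABLE`, `compile_two_pops`, and the cut-down `compiler_t` and `emit_method_table_t`); it only mirrors the shape of the macros, not the real emitter API.

```c
#include <assert.h>

// Hypothetical emitter state with a single operation that counts calls.
typedef struct {
    int n_ops;
} emit_t;

static void emit_bc_pop_top(emit_t *emit) {
    emit->n_ops += 1;
}

// Hypothetical method table with one entry per emitter operation.
typedef struct {
    void (*pop_top)(emit_t *emit);
} emit_method_table_t;

static const emit_method_table_t emit_bc_method_table = { emit_bc_pop_top };

typedef struct {
    emit_t *emit;
    const emit_method_table_t *emit_method_table;
} compiler_t;

#ifndef USE_METHOD_TABLE
#define USE_METHOD_TABLE 1 // set to 0 to select the direct-call variant
#endif

#if USE_METHOD_TABLE
// several emitters may exist, so dispatch through the table at run time
#define EMIT(fun) (comp->emit_method_table->fun(comp->emit))
#else
// a single emitter exists, so paste its prefix and call it directly
#define EMIT(fun) (emit_bc_##fun(comp->emit))
#endif

// Code written against EMIT() is identical in both configurations.
static int compile_two_pops(compiler_t *comp) {
    EMIT(pop_top);
    EMIT(pop_top);
    return comp->emit->n_ops;
}
```

Both configurations give the same result here; the direct-call variant simply avoids the pointer load and lets the C compiler see the callee.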

#if MICROPY_EMIT_NATIVE && MICROPY_DYNAMIC_COMPILER

#define NATIVE_EMITTER(f) emit_native_table[mp_dynamic_compiler.native_arch]->emit_##f
#define NATIVE_EMITTER_TABLE (emit_native_table[mp_dynamic_compiler.native_arch])

STATIC const emit_method_table_t *emit_native_table[] = {
    NULL,
    &emit_native_x86_method_table,
    &emit_native_x64_method_table,
    &emit_native_arm_method_table,
    &emit_native_thumb_method_table,
    &emit_native_thumb_method_table,
    &emit_native_thumb_method_table,
    &emit_native_thumb_method_table,
    &emit_native_thumb_method_table,
    &emit_native_xtensa_method_table,
    &emit_native_xtensawin_method_table,
};

#elif MICROPY_EMIT_NATIVE
// define a macro to access external native emitter
#if MICROPY_EMIT_X64
#define NATIVE_EMITTER(f) emit_native_x64_##f
#elif MICROPY_EMIT_X86
#define NATIVE_EMITTER(f) emit_native_x86_##f
#elif MICROPY_EMIT_THUMB
#define NATIVE_EMITTER(f) emit_native_thumb_##f
#elif MICROPY_EMIT_ARM
#define NATIVE_EMITTER(f) emit_native_arm_##f
#elif MICROPY_EMIT_XTENSA
#define NATIVE_EMITTER(f) emit_native_xtensa_##f
#elif MICROPY_EMIT_XTENSAWIN
#define NATIVE_EMITTER(f) emit_native_xtensawin_##f
#else
#error "unknown native emitter"
#endif
#define NATIVE_EMITTER_TABLE (&NATIVE_EMITTER(method_table))
#endif

#if MICROPY_EMIT_INLINE_ASM && MICROPY_DYNAMIC_COMPILER

#define ASM_EMITTER(f) emit_asm_table[mp_dynamic_compiler.native_arch]->asm_##f
#define ASM_EMITTER_TABLE emit_asm_table[mp_dynamic_compiler.native_arch]

STATIC const emit_inline_asm_method_table_t *emit_asm_table[] = {
    NULL,
    NULL,
    NULL,
    &emit_inline_thumb_method_table,
    &emit_inline_thumb_method_table,
    &emit_inline_thumb_method_table,
    &emit_inline_thumb_method_table,
    &emit_inline_thumb_method_table,
    &emit_inline_thumb_method_table,
    &emit_inline_xtensa_method_table,
    NULL,
};

#elif MICROPY_EMIT_INLINE_ASM
// define macros for inline assembler
#if MICROPY_EMIT_INLINE_THUMB
#define ASM_DECORATOR_QSTR MP_QSTR_asm_thumb
#define ASM_EMITTER(f) emit_inline_thumb_##f
py: Add inline Xtensa assembler.

This patch adds the MICROPY_EMIT_INLINE_XTENSA option, which, when
enabled, allows the @micropython.asm_xtensa decorator to be used.

The following opcodes are currently supported (ax is a register, a0-a15):

    ret_n()
    callx0(ax)
    j(label)
    jx(ax)
    beqz(ax, label)
    bnez(ax, label)
    mov(ax, ay)
    movi(ax, imm) # imm can be full 32-bit, uses l32r if needed
    and_(ax, ay, az)
    or_(ax, ay, az)
    xor(ax, ay, az)
    add(ax, ay, az)
    sub(ax, ay, az)
    mull(ax, ay, az)
    l8ui(ax, ay, imm)
    l16ui(ax, ay, imm)
    l32i(ax, ay, imm)
    s8i(ax, ay, imm)
    s16i(ax, ay, imm)
    s32i(ax, ay, imm)
    l16si(ax, ay, imm)
    addi(ax, ay, imm)
    ball(ax, ay, label)
    bany(ax, ay, label)
    bbc(ax, ay, label)
    bbs(ax, ay, label)
    beq(ax, ay, label)
    bge(ax, ay, label)
    bgeu(ax, ay, label)
    blt(ax, ay, label)
    bnall(ax, ay, label)
    bne(ax, ay, label)
    bnone(ax, ay, label)

Upon entry to the assembly function the registers a0, a12, a13, a14 are
pushed to the stack and the stack pointer (a1) decreased by 16. Upon
exit, these registers and the stack pointer are restored, and ret.n is
executed to return to the caller (caller address is in a0).

Note that the ABI for the Xtensa emitters is non-windowing.
#elif MICROPY_EMIT_INLINE_XTENSA
#define ASM_DECORATOR_QSTR MP_QSTR_asm_xtensa
#define ASM_EMITTER(f) emit_inline_xtensa_##f
#else
#error "unknown asm emitter"
#endif
#define ASM_EMITTER_TABLE &ASM_EMITTER(method_table)
#endif

#define EMIT_INLINE_ASM(fun) (comp->emit_inline_asm_method_table->fun(comp->emit_inline_asm))
#define EMIT_INLINE_ASM_ARG(fun, ...) (comp->emit_inline_asm_method_table->fun(comp->emit_inline_asm, __VA_ARGS__))

// elements in this struct are ordered to make it compact
typedef struct _compiler_t {
    uint8_t is_repl;
    uint8_t pass; // holds enum type pass_kind_t
    uint8_t have_star;

    // try to keep compiler clean from nlr
    mp_obj_t compile_error; // set to an exception object if there's an error
    size_t compile_error_line; // set to best guess of line of error

    uint next_label;

    uint16_t num_dict_params;
    uint16_t num_default_params;

    uint16_t break_label; // highest bit set indicates we are breaking out of a for loop
    uint16_t continue_label;
    uint16_t cur_except_level; // increased for SETUP_EXCEPT, SETUP_FINALLY; decreased for POP_BLOCK, POP_EXCEPT
    uint16_t break_continue_except_level;

    scope_t *scope_head;
    scope_t *scope_cur;

in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
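As a rough illustration of the compile-side half of this scheme, the sketch below hands out module-local indices for global qstr numbers on first use. It is not the MicroPython implementation, which keeps this mapping in emit->qstr_map via mp_map_lookup. The names module_qstr_table_t, module_qstr_index and MAX_LOCAL_QSTRS are hypothetical, and a linear search over a fixed-size array is assumed purely for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCAL_QSTRS 16 // hypothetical fixed capacity, for the sketch only

// One table per module: maps local index -> global qstr number.
typedef struct {
    size_t used;
    unsigned global_qstr[MAX_LOCAL_QSTRS];
} module_qstr_table_t;

// Return the module-local index of a global qstr, appending it if it has
// not been seen in this module yet.
static size_t module_qstr_index(module_qstr_table_t *t, unsigned qst) {
    for (size_t i = 0; i < t->used; ++i) {
        if (t->global_qstr[i] == qst) {
            return i; // already known: reuse the existing local index
        }
    }
    assert(t->used < MAX_LOCAL_QSTRS);
    t->global_qstr[t->used] = qst;
    return t->used++;
}
```

Only the table (not the bytecode) depends on the global qstr numbers, which is why it is the only part that needs to be built at import time.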
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
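The two decode schemes can be sketched side by side as follows. This is a simplified model under stated assumptions: the old value is taken to be a fixed 2-byte little-endian field and the new index a single byte, whereas the commit only says indices are "(mostly)" 1 byte. The names qstr_t, decode_qstr_value and decode_qstr_indexed are hypothetical and do not correspond to the real VM macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef size_t qstr_t; // hypothetical stand-in for a global qstr number

// Old scheme (schematic): the global qstr number is embedded in the
// bytecode, so the bytecode must be rewritten when the module is loaded.
static qstr_t decode_qstr_value(const uint8_t **ip) {
    qstr_t q = (qstr_t)(*ip)[0] | ((qstr_t)(*ip)[1] << 8);
    *ip += 2;
    return q;
}

// New scheme (schematic): the bytecode holds a module-local index and the
// global qstr number is fetched from a per-module table.
static qstr_t decode_qstr_indexed(const uint8_t **ip, const qstr_t *qstr_table) {
    assert(qstr_table != NULL);
    size_t idx = *(*ip)++; // assumed 1-byte index
    return qstr_table[idx];
}
```

In this model each reference shrinks from 2 bytes to 1, at the cost of one table entry per distinct qstr and one extra memory load per decode.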
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
|
|
|
mp_emit_common_t emit_common;
|
|
|
|
|
2013-10-05 18:08:26 +01:00
|
|
|
emit_t *emit; // current emitter
|
2021-10-22 12:22:47 +01:00
|
|
|
emit_t *emit_bc;
|
2015-03-26 16:44:14 +00:00
|
|
|
#if NEED_METHOD_TABLE
|
2013-10-05 18:08:26 +01:00
|
|
|
const emit_method_table_t *emit_method_table; // current emit method table
|
2015-03-26 16:44:14 +00:00
|
|
|
#endif
|
2013-10-05 23:17:28 +01:00
|
|
|
|
2016-12-09 02:17:49 +00:00
|
|
|
#if MICROPY_EMIT_INLINE_ASM
|
2013-10-05 23:17:28 +01:00
|
|
|
emit_inline_asm_t *emit_inline_asm; // current emitter for inline asm
|
|
|
|
const emit_inline_asm_method_table_t *emit_inline_asm_method_table; // current emit method table for inline asm
|
2015-03-14 12:59:31 +00:00
|
|
|
#endif
|
2013-10-04 19:53:11 +01:00
|
|
|
} compiler_t;
|
|
|
|
|
2021-10-22 12:22:47 +01:00
/******************************************************************************/
// mp_emit_common_t helper functions
// These are defined here so they can be inlined, to reduce code size.

STATIC void mp_emit_common_init(mp_emit_common_t *emit, qstr source_file) {
    #if MICROPY_EMIT_BYTECODE_USES_QSTR_TABLE
    mp_map_init(&emit->qstr_map, 1);

    // add the source file as the first entry in the qstr table
    mp_map_elem_t *elem = mp_map_lookup(&emit->qstr_map, MP_OBJ_NEW_QSTR(source_file), MP_MAP_LOOKUP_ADD_IF_NOT_FOUND);
    elem->value = MP_OBJ_NEW_SMALL_INT(0);
    #endif
}

STATIC void mp_emit_common_start_pass(mp_emit_common_t *emit, pass_kind_t pass) {
    emit->pass = pass;
    if (pass == MP_PASS_STACK_SIZE) {
        emit->ct_cur_obj_base = emit->ct_cur_obj;
    } else if (pass > MP_PASS_STACK_SIZE) {
        emit->ct_cur_obj = emit->ct_cur_obj_base;
    }
    if (pass == MP_PASS_EMIT) {
        if (emit->ct_cur_child == 0) {
            emit->children = NULL;
        } else {
            emit->children = m_new0(mp_raw_code_t *, emit->ct_cur_child);
        }
    }
    emit->ct_cur_child = 0;
}

STATIC void mp_emit_common_finalise(mp_emit_common_t *emit, bool has_native_code) {
    emit->ct_cur_obj += has_native_code; // allocate an additional slot for &mp_fun_table
    emit->const_table = m_new0(mp_uint_t, emit->ct_cur_obj);
    emit->ct_cur_obj = has_native_code; // reserve slot 0 for &mp_fun_table
    #if MICROPY_EMIT_NATIVE
    if (has_native_code) {
        // store mp_fun_table pointer at the start of the constant table
        emit->const_table[0] = (mp_uint_t)(uintptr_t)&mp_fun_table;
    }
    #endif
}

STATIC void mp_emit_common_populate_module_context(mp_emit_common_t *emit, qstr source_file, mp_module_context_t *context) {
    #if MICROPY_EMIT_BYTECODE_USES_QSTR_TABLE
    size_t qstr_map_used = emit->qstr_map.used;
    mp_module_context_alloc_tables(context, qstr_map_used, emit->ct_cur_obj);
    for (size_t i = 0; i < emit->qstr_map.alloc; ++i) {
        if (mp_map_slot_is_filled(&emit->qstr_map, i)) {
            size_t idx = MP_OBJ_SMALL_INT_VALUE(emit->qstr_map.table[i].value);
            qstr qst = MP_OBJ_QSTR_VALUE(emit->qstr_map.table[i].key);
            context->constants.qstr_table[idx] = qst;
        }
    }
    #else
    mp_module_context_alloc_tables(context, 0, emit->ct_cur_obj);
    context->constants.source_file = source_file;
    #endif

    if (emit->ct_cur_obj > 0) {
        memcpy(context->constants.obj_table, emit->const_table, emit->ct_cur_obj * sizeof(mp_uint_t));
    }
}

/******************************************************************************/

2015-07-29 23:16:01 +01:00
|
|
|
STATIC void compile_error_set_line(compiler_t *comp, mp_parse_node_t pn) {
|
|
|
|
// if the line of the error is unknown then try to update it from the pn
|
|
|
|
if (comp->compile_error_line == 0 && MP_PARSE_NODE_IS_STRUCT(pn)) {
|
2015-12-17 13:13:18 +00:00
|
|
|
comp->compile_error_line = ((mp_parse_node_struct_t *)pn)->source_line;
|
2014-04-08 16:41:02 +01:00
|
|
|
}
|
2015-07-27 22:20:00 +01:00
|
|
|
}
|
|
|
|
|
2020-01-29 03:27:33 +00:00
|
|
|
STATIC void compile_syntax_error(compiler_t *comp, mp_parse_node_t pn, mp_rom_error_text_t msg) {
|
2015-07-29 23:16:01 +01:00
|
|
|
// only register the error if there has been no other error
|
|
|
|
if (comp->compile_error == MP_OBJ_NULL) {
|
|
|
|
comp->compile_error = mp_obj_new_exception_msg(&mp_type_SyntaxError, msg);
|
|
|
|
compile_error_set_line(comp, pn);
|
|
|
|
}
|
2014-03-03 23:19:11 +00:00
|
|
|
}
|
|
|
|
|
2014-02-12 16:31:30 +00:00
|
|
|
STATIC void compile_trailer_paren_helper(compiler_t *comp, mp_parse_node_t pn_arglist, bool is_method_call, int n_positional_extra);
|
2014-08-15 14:30:52 +01:00
|
|
|
STATIC void compile_comprehension(compiler_t *comp, mp_parse_node_struct_t *pns, scope_kind_t kind);
|
2019-02-26 13:10:04 +00:00
|
|
|
STATIC void compile_atom_brace_helper(compiler_t *comp, mp_parse_node_struct_t *pns, bool create_map);
|
2014-03-27 10:55:21 +00:00
|
|
|
STATIC void compile_node(compiler_t *comp, mp_parse_node_t pn);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2014-04-10 14:11:31 +01:00
|
|
|
STATIC uint comp_next_label(compiler_t *comp) {
|
2013-10-05 13:37:10 +01:00
|
|
|
return comp->next_label++;
|
|
|
|
}
|
|
|
|
|
py/emitnative: Optimise and improve exception handling in native code.
Prior to this patch, native code would use a full nlr_buf_t for each
exception handler (try-except, try-finally, with). For nested exception
handlers this would use a lot of C stack and be rather inefficient.
This patch changes how exceptions are handled in native code by setting up
only a single nlr_buf_t context for the entire function, and then manages a
state machine (using the PC) to work out which exception handler to run
when an exception is raised by an nlr_jump. This keeps the C stack usage
at a constant level regardless of the depth of Python exception blocks.
The patch also fixes an existing bug where local variables written to within
an exception handler had their values incorrectly restored if an exception
was raised (since the nlr_jump would restore register values back to the
point of the nlr_push).
And it also gets nested try-finally+with working with the viper emitter.
Broadly speaking, efficiency of executing native code that doesn't use
any exception blocks is unchanged, and emitted code size is only slightly
increased for such functions. C stack usage of all native functions is
either equal to or less than before. Emitted code size for native functions
that use exception blocks is increased by roughly 10% (due in part to
fixing of the above-mentioned bugs).
But, most importantly, this patch makes it possible to implement more Python
features in native code, like unwind jumps and yielding from within nested
exception blocks.
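The idea of one saved context per function plus a state variable that selects the handler can be sketched in portable C as below. This is only an analogy under stated assumptions: setjmp/longjmp stand in for nlr_push/nlr_jump, an integer `handler` variable stands in for the PC-based state described above, and the names ctx, raise_error and run_nested are hypothetical.

```c
#include <setjmp.h>

static jmp_buf ctx; // the single saved context shared by all handlers

// Stand-in for raising an exception: unwind to the saved context.
static void raise_error(void) {
    longjmp(ctx, 1);
}

// Models two nested try blocks. Returns 0 if nothing was raised, 10 if the
// inner handler ran, and 1 if the outer handler ran.
int run_nested(int fail_outer, int fail_inner) {
    // volatile so the values survive the longjmp back into this frame
    volatile int handler = 0; // 0 = none, 1 = outer try, 2 = inner try
    volatile int trace = 0;

    if (setjmp(ctx) != 0) {
        // An "exception" arrived: dispatch on the recorded state.
        if (handler == 2) {
            goto inner_handler;
        }
        if (handler == 1) {
            goto outer_handler;
        }
        return -1; // raised outside any try block
    }

    handler = 1; // enter outer try
    if (fail_outer) {
        raise_error();
    }
    handler = 2; // enter inner try
    if (fail_inner) {
        raise_error();
    }
    handler = 1; // leave inner try normally
    goto after_inner;

inner_handler:
    handler = 1; // the inner handler itself runs inside the outer try
    trace += 10;
after_inner:
    handler = 0; // leave outer try
    return trace;

outer_handler:
    handler = 0;
    trace += 1;
    return trace;
}
```

However deeply the try blocks nest, only one jmp_buf is used, which mirrors the constant C stack usage claimed for the single nlr_buf_t. The volatile locals also reflect the bug described above: values changed after the context was saved are not reliably preserved across the jump unless they live in memory.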
2018-08-16 04:56:36 +01:00
#if MICROPY_EMIT_NATIVE
STATIC void reserve_labels_for_native(compiler_t *comp, int n) {
    if (comp->scope_cur->emit_options != MP_EMIT_OPT_BYTECODE) {
        comp->next_label += n;
    }
}
#else
#define reserve_labels_for_native(comp, n)
#endif

2018-09-04 06:34:51 +01:00
|
|
|
STATIC void compile_increase_except_level(compiler_t *comp, uint label, int kind) {
|
|
|
|
EMIT_ARG(setup_block, label, kind);
|
2014-03-27 10:55:21 +00:00
|
|
|
comp->cur_except_level += 1;
|
|
|
|
if (comp->cur_except_level > comp->scope_cur->exc_stack_size) {
|
|
|
|
comp->scope_cur->exc_stack_size = comp->cur_except_level;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
STATIC void compile_decrease_except_level(compiler_t *comp) {
|
|
|
|
assert(comp->cur_except_level > 0);
|
|
|
|
comp->cur_except_level -= 1;
|
2018-09-04 06:34:51 +01:00
|
|
|
EMIT(end_finally);
|
|
|
|
reserve_labels_for_native(comp, 1);
|
2014-03-27 10:55:21 +00:00
|
|
|
}
|
|
|
|
|
2014-02-12 16:31:30 +00:00
|
|
|
STATIC scope_t *scope_new_and_link(compiler_t *comp, scope_kind_t kind, mp_parse_node_t pn, uint emit_options) {
|
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
    scope_t *scope = scope_new(kind, pn, emit_options);
    scope->parent = comp->scope_cur;
    scope->next = NULL;
    if (comp->scope_head == NULL) {
        comp->scope_head = scope;
    } else {
        scope_t *s = comp->scope_head;
        while (s->next != NULL) {
            s = s->next;
        }
        s->next = scope;
    }
    return scope;
}

typedef void (*apply_list_fun_t)(compiler_t *comp, mp_parse_node_t pn);

STATIC void apply_to_single_or_list(compiler_t *comp, mp_parse_node_t pn, pn_kind_t pn_list_kind, apply_list_fun_t f) {
    if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, pn_list_kind)) {
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
        int num_nodes = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
        for (int i = 0; i < num_nodes; i++) {
            f(comp, pns->nodes[i]);
        }
    } else if (!MP_PARSE_NODE_IS_NULL(pn)) {
        f(comp, pn);
    }
}

STATIC void compile_generic_all_nodes(compiler_t *comp, mp_parse_node_struct_t *pns) {
    int num_nodes = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
    for (int i = 0; i < num_nodes; i++) {
        compile_node(comp, pns->nodes[i]);
        if (comp->compile_error != MP_OBJ_NULL) {
            // add line info for the error in case it didn't have a line number
            compile_error_set_line(comp, pns->nodes[i]);
            return;
        }
    }
}

STATIC void compile_load_id(compiler_t *comp, qstr qst) {
    if (comp->pass == MP_PASS_SCOPE) {
        mp_emit_common_get_id_for_load(comp->scope_cur, qst);
    }
    {
        #if NEED_METHOD_TABLE
        mp_emit_common_id_op(comp->emit, &comp->emit_method_table->load_id, comp->scope_cur, qst);
        #else
        mp_emit_common_id_op(comp->emit, &mp_emit_bc_method_table_load_id_ops, comp->scope_cur, qst);
        #endif
    }
}

STATIC void compile_store_id(compiler_t *comp, qstr qst) {
    if (comp->pass == MP_PASS_SCOPE) {
        mp_emit_common_get_id_for_modification(comp->scope_cur, qst);
    }
    {
        #if NEED_METHOD_TABLE
        mp_emit_common_id_op(comp->emit, &comp->emit_method_table->store_id, comp->scope_cur, qst);
        #else
        mp_emit_common_id_op(comp->emit, &mp_emit_bc_method_table_store_id_ops, comp->scope_cur, qst);
        #endif
    }
}

STATIC void compile_delete_id(compiler_t *comp, qstr qst) {
    if (comp->pass == MP_PASS_SCOPE) {
        mp_emit_common_get_id_for_modification(comp->scope_cur, qst);
    }
    {
        #if NEED_METHOD_TABLE
        mp_emit_common_id_op(comp->emit, &comp->emit_method_table->delete_id, comp->scope_cur, qst);
        #else
        mp_emit_common_id_op(comp->emit, &mp_emit_bc_method_table_delete_id_ops, comp->scope_cur, qst);
        #endif
    }
}

STATIC void compile_generic_tuple(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // a simple tuple expression
    size_t num_nodes = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
    for (size_t i = 0; i < num_nodes; i++) {
        compile_node(comp, pns->nodes[i]);
    }
    EMIT_ARG(build, num_nodes, MP_EMIT_BUILD_TUPLE);
}

STATIC void c_if_cond(compiler_t *comp, mp_parse_node_t pn, bool jump_if, int label) {
    if (mp_parse_node_is_const_false(pn)) {
        if (jump_if == false) {
            EMIT_ARG(jump, label);
        }
        return;
    } else if (mp_parse_node_is_const_true(pn)) {
        if (jump_if == true) {
            EMIT_ARG(jump, label);
        }
        return;
    } else if (MP_PARSE_NODE_IS_STRUCT(pn)) {
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
        int n = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
        if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_or_test) {
            if (jump_if == false) {
            and_or_logic1:;
                uint label2 = comp_next_label(comp);
                for (int i = 0; i < n - 1; i++) {
                    c_if_cond(comp, pns->nodes[i], !jump_if, label2);
                }
                c_if_cond(comp, pns->nodes[n - 1], jump_if, label);
                EMIT_ARG(label_assign, label2);
            } else {
            and_or_logic2:
                for (int i = 0; i < n; i++) {
                    c_if_cond(comp, pns->nodes[i], jump_if, label);
                }
            }
            return;
        } else if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_and_test) {
            if (jump_if == false) {
                goto and_or_logic2;
            } else {
                goto and_or_logic1;
            }
        } else if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_not_test_2) {
            c_if_cond(comp, pns->nodes[0], !jump_if, label);
            return;
        } else if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_atom_paren) {
            // cond is something in parenthesis
            if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
                // empty tuple, acts as false for the condition
                if (jump_if == false) {
                    EMIT_ARG(jump, label);
                }
            } else {
                assert(MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_testlist_comp));
                // non-empty tuple, acts as true for the condition
                if (jump_if == true) {
                    EMIT_ARG(jump, label);
                }
            }
            return;
        }
    }

    // nothing special, fall back to default compiling for node and jump
    compile_node(comp, pn);
    EMIT_ARG(pop_jump_if, jump_if, label);
}

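The short-circuit strategy used by `c_if_cond` can be tried out on a toy model. The sketch below is not part of MicroPython: the `toy_*` types, the text-based emitter and the instruction spellings are all hypothetical, and it covers only constants, plain variables, `not`, `and` and `or`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Toy condition tree; hypothetical types, not MicroPython parse nodes.
typedef enum { TOY_CONST, TOY_VAR, TOY_NOT, TOY_AND, TOY_OR } toy_kind_t;

typedef struct toy_node {
    toy_kind_t kind;
    bool value;                     // used by TOY_CONST
    char name;                      // used by TOY_VAR
    int n;                          // number of children
    const struct toy_node *kids[3]; // used by TOY_NOT/TOY_AND/TOY_OR
} toy_node_t;

// Toy emitter: appends one text line per emitted instruction.
typedef struct {
    char out[256];
    int next_label;
} toy_emit_t;

static void toy_emit(toy_emit_t *e, const char *line) {
    assert(strlen(e->out) + strlen(line) < sizeof(e->out));
    strcat(e->out, line);
}

// Emit code that jumps to `label` when the condition equals `jump_if`,
// and falls through otherwise.
static void toy_if_cond(toy_emit_t *e, const toy_node_t *pn, bool jump_if, int label) {
    char line[32];
    switch (pn->kind) {
        case TOY_CONST:
            // Constant condition: an unconditional jump, or nothing at all.
            if (pn->value == jump_if) {
                snprintf(line, sizeof(line), "jump L%d\n", label);
                toy_emit(e, line);
            }
            return;
        case TOY_NOT:
            // Negation only flips the sense of the jump.
            toy_if_cond(e, pn->kids[0], !jump_if, label);
            return;
        case TOY_AND:
        case TOY_OR:
            if ((pn->kind == TOY_OR) == jump_if) {
                // "or" jumping on true, or "and" jumping on false:
                // any one operand can decide, so each jumps to `label`.
                for (int i = 0; i < pn->n; i++) {
                    toy_if_cond(e, pn->kids[i], jump_if, label);
                }
            } else {
                // Otherwise an early operand can only rule the jump out,
                // so it skips past the remaining tests.
                int skip = e->next_label++;
                for (int i = 0; i < pn->n - 1; i++) {
                    toy_if_cond(e, pn->kids[i], !jump_if, skip);
                }
                toy_if_cond(e, pn->kids[pn->n - 1], jump_if, label);
                snprintf(line, sizeof(line), "L%d:\n", skip);
                toy_emit(e, line);
            }
            return;
        case TOY_VAR:
            break;
    }
    // Plain value: test it at run time.
    snprintf(line, sizeof(line), "pop_jump_if_%s %c L%d\n",
        jump_if ? "true" : "false", pn->name, label);
    toy_emit(e, line);
}
```

For `a or b` with `jump_if` false, target label 0 and `next_label` starting at 1, this sketch emits `pop_jump_if_true a L1`, then `pop_jump_if_false b L0`, then `L1:`. With `jump_if` true it emits two `pop_jump_if_true` instructions that both target `L0` and allocates no extra label.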
typedef enum { ASSIGN_STORE, ASSIGN_AUG_LOAD, ASSIGN_AUG_STORE } assign_kind_t;
STATIC void c_assign(compiler_t *comp, mp_parse_node_t pn, assign_kind_t kind);

STATIC void c_assign_atom_expr(compiler_t *comp, mp_parse_node_struct_t *pns, assign_kind_t assign_kind) {
    if (assign_kind != ASSIGN_AUG_STORE) {
        compile_node(comp, pns->nodes[0]);
    }

    if (MP_PARSE_NODE_IS_STRUCT(pns->nodes[1])) {
        mp_parse_node_struct_t *pns1 = (mp_parse_node_struct_t *)pns->nodes[1];
        if (MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_atom_expr_trailers) {
            int n = MP_PARSE_NODE_STRUCT_NUM_NODES(pns1);
            if (assign_kind != ASSIGN_AUG_STORE) {
                for (int i = 0; i < n - 1; i++) {
                    compile_node(comp, pns1->nodes[i]);
                }
            }
            assert(MP_PARSE_NODE_IS_STRUCT(pns1->nodes[n - 1]));
            pns1 = (mp_parse_node_struct_t *)pns1->nodes[n - 1];
        }
        if (MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_trailer_bracket) {
            if (assign_kind == ASSIGN_AUG_STORE) {
                EMIT(rot_three);
                EMIT_ARG(subscr, MP_EMIT_SUBSCR_STORE);
            } else {
                compile_node(comp, pns1->nodes[0]);
                if (assign_kind == ASSIGN_AUG_LOAD) {
                    EMIT(dup_top_two);
                    EMIT_ARG(subscr, MP_EMIT_SUBSCR_LOAD);
                } else {
                    EMIT_ARG(subscr, MP_EMIT_SUBSCR_STORE);
                }
            }
            return;
        } else if (MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_trailer_period) {
            assert(MP_PARSE_NODE_IS_ID(pns1->nodes[0]));
            if (assign_kind == ASSIGN_AUG_LOAD) {
                EMIT(dup_top);
                EMIT_ARG(attr, MP_PARSE_NODE_LEAF_ARG(pns1->nodes[0]), MP_EMIT_ATTR_LOAD);
            } else {
                if (assign_kind == ASSIGN_AUG_STORE) {
                    EMIT(rot_two);
                }
                EMIT_ARG(attr, MP_PARSE_NODE_LEAF_ARG(pns1->nodes[0]), MP_EMIT_ATTR_STORE);
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2018-02-24 12:03:17 +00:00
|
|
|
return;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("can't assign to expression"));
|
2013-10-04 19:53:11 +01:00
|
|
|
}
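// Illustrative sketch (not part of compile.c): the subscript branch above relies on
// dup_top_two and rot_three to keep the container and index available across the
// load and the store of an augmented assignment such as a[i] += v. The tiny integer
// stack below traces that ordering. The names demo_stack_t, demo_push, demo_pop and
// aug_subscr_stack_trace are hypothetical, and the assumption that the store expects
// [value, container, index] with the index on top is stated here rather than taken
// from the code above.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical fixed-size stack of integer tokens, for illustration only.
typedef struct {
    int items[8];
    size_t len;
} demo_stack_t;

static void demo_push(demo_stack_t *s, int v) {
    assert(s->len < 8);
    s->items[s->len++] = v;
}

static int demo_pop(demo_stack_t *s) {
    assert(s->len > 0);
    return s->items[--s->len];
}

// Duplicate the top two entries, preserving their order.
static void dup_top_two(demo_stack_t *s) {
    assert(s->len >= 2);
    int a = s->items[s->len - 2];
    int b = s->items[s->len - 1];
    demo_push(s, a);
    demo_push(s, b);
}

// Move the top entry down to third position; the other two move up by one.
static void rot_three(demo_stack_t *s) {
    assert(s->len >= 3);
    int top = s->items[s->len - 1];
    s->items[s->len - 1] = s->items[s->len - 2];
    s->items[s->len - 2] = s->items[s->len - 3];
    s->items[s->len - 3] = top;
}

// Trace the stack shape for "container[index] += rhs", where `loaded`
// stands in for the value the subscript load would produce.
static void aug_subscr_stack_trace(demo_stack_t *s, int container, int index, int loaded, int rhs) {
    demo_push(s, container);
    demo_push(s, index);          // [c, i]
    dup_top_two(s);               // [c, i, c, i]
    (void)demo_pop(s);
    (void)demo_pop(s);
    demo_push(s, loaded);         // load subscr: [c, i, old]
    demo_push(s, rhs);
    int r = demo_pop(s);
    int l = demo_pop(s);
    demo_push(s, l + r);          // binary op: [c, i, new]
    rot_three(s);                 // [new, c, i], ready for the store
}
```

// After the trace the stack holds the new value beneath the container and index,
// which is the layout the sketch assumes the subscript store consumes.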
|
|
|
|
|
2021-03-23 01:48:35 +00:00
|
|
|
STATIC void c_assign_tuple(compiler_t *comp, uint num_tail, mp_parse_node_t *nodes_tail) {
|
2014-04-11 12:53:00 +01:00
|
|
|
// look for star expression
|
2015-01-16 17:47:07 +00:00
|
|
|
uint have_star_index = -1;
|
|
|
|
for (uint i = 0; i < num_tail; i++) {
|
2014-04-11 12:53:00 +01:00
|
|
|
if (MP_PARSE_NODE_IS_STRUCT_KIND(nodes_tail[i], PN_star_expr)) {
|
2015-01-16 17:47:07 +00:00
|
|
|
if (have_star_index == (uint)-1) {
|
2021-03-23 01:48:35 +00:00
|
|
|
EMIT_ARG(unpack_ex, i, num_tail - i - 1);
|
|
|
|
have_star_index = i;
|
2013-10-04 19:53:11 +01:00
|
|
|
} else {
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, nodes_tail[i], MP_ERROR_TEXT("multiple *x in assignment"));
|
2013-10-04 19:53:11 +01:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2015-01-16 17:47:07 +00:00
|
|
|
if (have_star_index == (uint)-1) {
|
2021-03-23 01:48:35 +00:00
|
|
|
EMIT_ARG(unpack_sequence, num_tail);
|
2014-04-11 12:53:00 +01:00
|
|
|
}
|
2015-01-16 17:47:07 +00:00
|
|
|
for (uint i = 0; i < num_tail; i++) {
|
2021-03-23 01:48:35 +00:00
|
|
|
if (i == have_star_index) {
|
2014-04-11 12:53:00 +01:00
|
|
|
c_assign(comp, ((mp_parse_node_struct_t *)nodes_tail[i])->nodes[0], ASSIGN_STORE);
|
|
|
|
} else {
|
|
|
|
c_assign(comp, nodes_tail[i], ASSIGN_STORE);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
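// Illustrative sketch (not part of compile.c): c_assign_tuple scans the targets for
// a single starred entry and passes the counts on either side of it to unpack_ex.
// The standalone function below reproduces only that counting and the
// "multiple *x" check. unpack_plan_t and plan_unpack are hypothetical names, and
// the targets are reduced to an array of booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical result type; not a MicroPython structure.
typedef struct {
    bool ok;          // false if more than one starred target was found
    bool has_star;    // true once a starred target has been seen
    size_t n_before;  // number of targets to the left of the star
    size_t n_after;   // number of targets to the right of the star
} unpack_plan_t;

// is_star[i] is true when the i-th assignment target is written as *name.
static unpack_plan_t plan_unpack(const bool *is_star, size_t n) {
    assert(is_star != NULL || n == 0);
    unpack_plan_t plan = { true, false, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (!is_star[i]) {
            continue;
        }
        if (plan.has_star) {
            // corresponds to the "multiple *x in assignment" error path
            plan.ok = false;
            return plan;
        }
        plan.has_star = true;
        plan.n_before = i;
        plan.n_after = n - i - 1;
    }
    return plan;
}
```

// For targets shaped like "a, *b, c, d" this yields n_before == 1 and n_after == 2;
// with no starred target, has_star stays false and a plain sequence unpack of n
// items would apply.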
|
|
|
|
|
|
|
|
// assigns top of stack to pn
|
2014-08-15 14:30:52 +01:00
|
|
|
STATIC void c_assign(compiler_t *comp, mp_parse_node_t pn, assign_kind_t assign_kind) {
|
2015-03-25 22:06:47 +00:00
|
|
|
assert(!MP_PARSE_NODE_IS_NULL(pn));
|
|
|
|
if (MP_PARSE_NODE_IS_LEAF(pn)) {
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_ID(pn)) {
|
2014-10-03 18:44:14 +01:00
|
|
|
qstr arg = MP_PARSE_NODE_LEAF_ARG(pn);
|
2013-10-04 19:53:11 +01:00
|
|
|
switch (assign_kind) {
|
|
|
|
case ASSIGN_STORE:
|
|
|
|
case ASSIGN_AUG_STORE:
|
2015-03-26 14:42:40 +00:00
|
|
|
compile_store_id(comp, arg);
|
2013-10-04 19:53:11 +01:00
|
|
|
break;
|
|
|
|
case ASSIGN_AUG_LOAD:
|
2015-01-14 21:32:42 +00:00
|
|
|
default:
|
2015-03-26 14:42:40 +00:00
|
|
|
compile_load_id(comp, arg);
|
2013-10-04 19:53:11 +01:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
} else {
|
2017-03-29 02:28:33 +01:00
|
|
|
goto cannot_assign;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
} else {
|
2015-03-25 22:06:47 +00:00
|
|
|
// pn must be a struct
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
|
|
|
|
switch (MP_PARSE_NODE_STRUCT_KIND(pns)) {
|
2016-01-27 20:23:11 +00:00
|
|
|
case PN_atom_expr_normal:
|
2013-10-04 19:53:11 +01:00
|
|
|
// lhs is an index or attribute
|
2016-01-27 20:23:11 +00:00
|
|
|
c_assign_atom_expr(comp, pns, assign_kind);
|
2013-10-04 19:53:11 +01:00
|
|
|
break;
|
|
|
|
|
|
|
|
case PN_testlist_star_expr:
|
|
|
|
case PN_exprlist:
|
|
|
|
// lhs is a tuple
|
|
|
|
if (assign_kind != ASSIGN_STORE) {
|
2017-03-29 02:28:33 +01:00
|
|
|
goto cannot_assign;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2021-03-23 01:48:35 +00:00
|
|
|
c_assign_tuple(comp, MP_PARSE_NODE_STRUCT_NUM_NODES(pns), pns->nodes);
|
2013-10-04 19:53:11 +01:00
|
|
|
break;
|
|
|
|
|
|
|
|
case PN_atom_paren:
|
|
|
|
// lhs is something in parentheses
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// empty tuple
|
2014-04-27 16:55:27 +01:00
|
|
|
goto cannot_assign;
|
2016-01-07 13:07:52 +00:00
|
|
|
} else {
|
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_testlist_comp));
|
2015-03-25 23:06:48 +00:00
|
|
|
if (assign_kind != ASSIGN_STORE) {
|
2017-03-29 02:28:33 +01:00
|
|
|
goto cannot_assign;
|
2015-03-25 23:06:48 +00:00
|
|
|
}
|
2013-12-21 18:17:45 +00:00
|
|
|
pns = (mp_parse_node_struct_t *)pns->nodes[0];
|
2013-10-04 19:53:11 +01:00
|
|
|
goto testlist_comp;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
case PN_atom_bracket:
|
|
|
|
// lhs is something in brackets
|
|
|
|
if (assign_kind != ASSIGN_STORE) {
|
2017-03-29 02:28:33 +01:00
|
|
|
goto cannot_assign;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// empty list, assignment allowed
|
2021-03-23 01:48:35 +00:00
|
|
|
c_assign_tuple(comp, 0, NULL);
|
2013-12-21 18:17:45 +00:00
|
|
|
} else if (MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_testlist_comp)) {
|
|
|
|
pns = (mp_parse_node_struct_t *)pns->nodes[0];
|
2013-10-04 19:53:11 +01:00
|
|
|
goto testlist_comp;
|
|
|
|
} else {
|
|
|
|
// brackets around 1 item
|
2021-03-23 01:48:35 +00:00
|
|
|
c_assign_tuple(comp, 1, pns->nodes);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
2014-04-27 16:55:27 +01:00
|
|
|
goto cannot_assign;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
return;
|
|
|
|
|
|
|
|
testlist_comp:
|
|
|
|
// lhs is a sequence
|
2021-03-23 01:48:35 +00:00
|
|
|
if (MP_PARSE_NODE_TESTLIST_COMP_HAS_COMP_FOR(pns)) {
|
|
|
|
goto cannot_assign;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2021-03-23 01:48:35 +00:00
|
|
|
c_assign_tuple(comp, MP_PARSE_NODE_STRUCT_NUM_NODES(pns), pns->nodes);
|
2013-10-04 19:53:11 +01:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
return;
|
|
|
|
|
2014-04-27 16:55:27 +01:00
|
|
|
cannot_assign:
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, pn, MP_ERROR_TEXT("can't assign to expression"));
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2015-08-14 12:24:11 +01:00
|
|
|
// stuff for lambda and comprehensions and generators:
|
2014-03-31 15:18:37 +01:00
|
|
|
// if n_pos_defaults > 0 then there is a tuple on the stack with the positional defaults
|
|
|
|
// if n_kw_defaults > 0 then there is a dictionary on the stack with the keyword defaults
|
|
|
|
// if both exist, the tuple is above the dictionary (ie the first pop gets the tuple)
|
2014-08-15 14:30:52 +01:00
|
|
|
STATIC void close_over_variables_etc(compiler_t *comp, scope_t *this_scope, int n_pos_defaults, int n_kw_defaults) {
|
2014-03-31 11:30:17 +01:00
|
|
|
assert(n_pos_defaults >= 0);
|
|
|
|
assert(n_kw_defaults >= 0);
|
|
|
|
|
2015-10-22 23:45:37 +01:00
|
|
|
// set flags
|
|
|
|
if (n_kw_defaults > 0) {
|
|
|
|
this_scope->scope_flags |= MP_SCOPE_FLAG_DEFKWARGS;
|
|
|
|
}
|
|
|
|
this_scope->num_def_pos_args = n_pos_defaults;
|
|
|
|
|
py: Fix native functions so they run with their correct globals context.
Prior to this commit a function compiled with the native decorator
@micropython.native would not work correctly when accessing global
variables, because the globals dict was not being set upon function entry.
This commit fixes this problem by, upon function entry, setting as the
current globals dict the globals dict context the function was defined
within, as per normal Python semantics, and as bytecode does. Upon
function exit the original globals dict is restored.
In order to restore the globals dict when an exception is raised the native
function must guard its internals with an nlr_push/nlr_pop pair. Because
this push/pop is relatively expensive, in both C stack usage for the
nlr_buf_t and CPU execution time, the implementation here optimises things
as much as possible. First, the compiler keeps track of whether a function
even needs to access global variables. Using this information the native
emitter then generates three different kinds of code:
1. no globals used, no exception handlers: no nlr handling code and no
setting of the globals dict.
2. globals used, no exception handlers: an nlr_buf_t is allocated on the
C stack but it is not used if the globals dict is unchanged, saving
execution time because nlr_push/nlr_pop don't need to run.
3. function has exception handlers, may use globals: an nlr_buf_t is
allocated and nlr_push/nlr_pop are always called.
In the end, native functions that don't access globals and don't have
exception handlers will run more efficiently than those that do.
Fixes issue #1573.
2018-09-13 13:03:48 +01:00
|
|
|
#if MICROPY_EMIT_NATIVE
|
|
|
|
// When creating a function/closure it will take a reference to the current globals
|
2018-09-27 14:27:53 +01:00
|
|
|
comp->scope_cur->scope_flags |= MP_SCOPE_FLAG_REFGLOBALS | MP_SCOPE_FLAG_HASCONSTS;
|
2018-09-13 13:03:48 +01:00
|
|
|
#endif
|
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
// make closed over variables, if any
|
2013-12-10 18:28:17 +00:00
|
|
|
// ensure they are closed over in the order defined in the outer scope (mainly to agree with CPython)
|
2013-10-04 19:53:11 +01:00
|
|
|
int nfree = 0;
|
|
|
|
if (comp->scope_cur->kind != SCOPE_MODULE) {
|
2013-12-10 18:28:17 +00:00
|
|
|
for (int i = 0; i < comp->scope_cur->id_info_len; i++) {
|
|
|
|
id_info_t *id = &comp->scope_cur->id_info[i];
|
|
|
|
if (id->kind == ID_INFO_KIND_CELL || id->kind == ID_INFO_KIND_FREE) {
|
|
|
|
for (int j = 0; j < this_scope->id_info_len; j++) {
|
|
|
|
id_info_t *id2 = &this_scope->id_info[j];
|
2014-09-08 23:05:16 +01:00
|
|
|
if (id2->kind == ID_INFO_KIND_FREE && id->qst == id2->qst) {
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython we load closures using LOAD_FAST
|
2015-03-26 14:42:40 +00:00
|
|
|
EMIT_LOAD_FAST(id->qst, id->local_num);
|
2013-12-10 18:28:17 +00:00
|
|
|
nfree += 1;
|
|
|
|
}
|
|
|
|
}
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// make the function/closure
|
|
|
|
if (nfree == 0) {
|
2014-03-31 11:30:17 +01:00
|
|
|
EMIT_ARG(make_function, this_scope, n_pos_defaults, n_kw_defaults);
|
2013-10-04 19:53:11 +01:00
|
|
|
} else {
|
2014-04-20 17:50:40 +01:00
|
|
|
EMIT_ARG(make_closure, this_scope, nfree, n_pos_defaults, n_kw_defaults);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
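// Illustrative sketch (not part of compile.c): close_over_variables_etc walks the
// outer scope's identifiers in their declared order and, for each cell or free
// variable, checks whether the inner scope lists the same name as free. The
// function below shows that double loop on plain arrays. var_kind_t, var_info_t
// and collect_closed_over are hypothetical names, and names are modelled as
// integers rather than qstrs.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KIND_LOCAL, KIND_CELL, KIND_FREE } var_kind_t;

typedef struct {
    int name;         // stand-in for an interned name
    var_kind_t kind;
} var_info_t;

// Writes the names to close over into out[], in outer-scope order,
// and returns how many were found.
static size_t collect_closed_over(const var_info_t *outer, size_t n_outer,
    const var_info_t *inner, size_t n_inner, int *out, size_t out_cap) {
    size_t nfree = 0;
    for (size_t i = 0; i < n_outer; i++) {
        if (outer[i].kind != KIND_CELL && outer[i].kind != KIND_FREE) {
            continue;
        }
        for (size_t j = 0; j < n_inner; j++) {
            if (inner[j].kind == KIND_FREE && inner[j].name == outer[i].name) {
                assert(nfree < out_cap);
                out[nfree++] = outer[i].name;
            }
        }
    }
    return nfree;
}
```

// A result of zero corresponds to the make_function path above; any other count
// corresponds to make_closure.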
|
|
|
|
|
2015-11-17 14:00:14 +00:00
|
|
|
STATIC void compile_funcdef_lambdef_param(compiler_t *comp, mp_parse_node_t pn) {
|
2017-04-22 05:23:47 +01:00
|
|
|
// For efficiency of the code below we extract the parse-node kind here
|
|
|
|
int pn_kind;
|
|
|
|
if (MP_PARSE_NODE_IS_ID(pn)) {
|
|
|
|
pn_kind = -1;
|
|
|
|
} else {
|
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT(pn));
|
|
|
|
pn_kind = MP_PARSE_NODE_STRUCT_KIND((mp_parse_node_struct_t *)pn);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (pn_kind == PN_typedargslist_star || pn_kind == PN_varargslist_star) {
|
2014-04-11 23:25:34 +01:00
|
|
|
comp->have_star = true;
|
|
|
|
/* don't need to distinguish bare from named star
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t*)pn;
|
2014-03-03 23:19:11 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
|
|
|
|
// bare star
|
2014-04-11 23:25:34 +01:00
|
|
|
} else {
|
|
|
|
// named star
|
2014-03-03 23:19:11 +00:00
|
|
|
}
|
2014-04-11 23:25:34 +01:00
|
|
|
*/
|
2014-03-03 23:19:11 +00:00
|
|
|
|
2017-04-22 05:23:47 +01:00
|
|
|
} else if (pn_kind == PN_typedargslist_dbl_star || pn_kind == PN_varargslist_dbl_star) {
|
2014-04-11 23:25:34 +01:00
|
|
|
// named double star
|
2014-03-03 23:19:11 +00:00
|
|
|
// TODO do we need to do anything with this?
|
|
|
|
|
|
|
|
} else {
|
|
|
|
mp_parse_node_t pn_id;
|
|
|
|
mp_parse_node_t pn_equal;
|
2017-04-22 05:23:47 +01:00
|
|
|
if (pn_kind == -1) {
|
2014-03-03 23:19:11 +00:00
|
|
|
// this parameter is just an id
|
|
|
|
|
|
|
|
pn_id = pn;
|
|
|
|
pn_equal = MP_PARSE_NODE_NULL;
|
|
|
|
|
2017-04-22 05:23:47 +01:00
|
|
|
} else if (pn_kind == PN_typedargslist_name) {
|
2014-03-03 23:19:11 +00:00
|
|
|
// this parameter has a colon and/or equal specifier
|
|
|
|
|
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
|
|
|
|
pn_id = pns->nodes[0];
|
2017-08-21 13:00:34 +01:00
|
|
|
// pn_colon = pns->nodes[1]; // unused
|
2014-03-03 23:19:11 +00:00
|
|
|
pn_equal = pns->nodes[2];
|
2015-11-17 14:00:14 +00:00
|
|
|
|
|
|
|
} else {
|
2017-04-22 05:23:47 +01:00
|
|
|
assert(pn_kind == PN_varargslist_name); // should be
|
2015-11-17 14:00:14 +00:00
|
|
|
// this parameter has an equal specifier
|
|
|
|
|
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
|
|
|
|
pn_id = pns->nodes[0];
|
|
|
|
pn_equal = pns->nodes[1];
|
2014-03-03 23:19:11 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
if (MP_PARSE_NODE_IS_NULL(pn_equal)) {
|
|
|
|
// this parameter does not have a default value
|
|
|
|
|
|
|
|
// check for non-default parameters given after default parameters (allowed by parser, but not syntactically valid)
|
2014-04-11 23:25:34 +01:00
|
|
|
if (!comp->have_star && comp->num_default_params != 0) {
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, pn, MP_ERROR_TEXT("non-default argument follows default argument"));
|
2014-03-03 23:19:11 +00:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
} else {
|
2013-10-04 19:53:11 +01:00
|
|
|
// this parameter has a default value
|
|
|
|
// in CPython, None (and True, False?) as default parameters are loaded with LOAD_NAME; don't understand why
|
2014-03-03 23:19:11 +00:00
|
|
|
|
2014-04-11 23:25:34 +01:00
|
|
|
if (comp->have_star) {
|
2014-04-11 14:38:30 +01:00
|
|
|
comp->num_dict_params += 1;
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython we put the default dict parameters into a dictionary using the bytecode
|
2014-04-11 14:38:30 +01:00
|
|
|
if (comp->num_dict_params == 1) {
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython we put the default positional parameters into a tuple using the bytecode
|
2014-04-12 00:05:49 +01:00
|
|
|
// we need to do this here before we start building the map for the default keywords
|
|
|
|
if (comp->num_default_params > 0) {
|
2018-05-18 15:41:40 +01:00
|
|
|
EMIT_ARG(build, comp->num_default_params, MP_EMIT_BUILD_TUPLE);
|
2014-04-20 17:50:40 +01:00
|
|
|
} else {
|
|
|
|
EMIT(load_null); // sentinel indicating empty default positional args
|
2014-04-12 00:05:49 +01:00
|
|
|
}
|
2014-04-11 14:38:30 +01:00
|
|
|
// first default dict param, so make the map
|
2018-05-18 15:41:40 +01:00
|
|
|
EMIT_ARG(build, 0, MP_EMIT_BUILD_MAP);
|
2014-04-11 14:38:30 +01:00
|
|
|
}
|
2014-06-07 22:01:00 +01:00
|
|
|
|
|
|
|
// compile value then key, then store it to the dict
|
2014-04-11 14:38:30 +01:00
|
|
|
compile_node(comp, pn_equal);
|
2015-06-25 15:42:13 +01:00
|
|
|
EMIT_ARG(load_const_str, MP_PARSE_NODE_LEAF_ARG(pn_id));
|
2014-04-11 14:38:30 +01:00
|
|
|
EMIT(store_map);
|
2013-10-04 19:53:11 +01:00
|
|
|
} else {
|
2014-04-11 14:38:30 +01:00
|
|
|
comp->num_default_params += 1;
|
|
|
|
compile_node(comp, pn_equal);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
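// Illustrative sketch (not part of compile.c): compile_funcdef_lambdef_param counts
// positional defaults before a star, counts keyword defaults after it, and rejects
// a parameter without a default that follows a positional default. The function
// below applies the same rules to a list of parameter kinds. param_kind_t,
// param_summary_t and summarise_params are hypothetical names; no bytecode is
// emitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { PARAM_PLAIN, PARAM_DEFAULT, PARAM_STAR, PARAM_DBL_STAR } param_kind_t;

typedef struct {
    bool ok;                  // false for "non-default argument follows default argument"
    unsigned n_pos_defaults;  // defaults seen before any star
    unsigned n_kw_defaults;   // defaults seen after a star
} param_summary_t;

static param_summary_t summarise_params(const param_kind_t *params, size_t n) {
    assert(params != NULL || n == 0);
    param_summary_t s = { true, 0, 0 };
    bool have_star = false;
    for (size_t i = 0; i < n; i++) {
        switch (params[i]) {
            case PARAM_STAR:
                have_star = true;
                break;
            case PARAM_DBL_STAR:
                // nothing to count for **kwargs
                break;
            case PARAM_PLAIN:
                if (!have_star && s.n_pos_defaults != 0) {
                    s.ok = false;
                    return s;
                }
                break;
            case PARAM_DEFAULT:
                if (have_star) {
                    s.n_kw_defaults += 1;
                } else {
                    s.n_pos_defaults += 1;
                }
                break;
        }
    }
    return s;
}
```

// A signature shaped like "def f(a, b=1, *, c=2, d)" is accepted with one positional
// and one keyword default, while "def f(a=1, b)" is rejected.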
|
|
|
|
|
2015-11-17 14:00:14 +00:00
|
|
|
STATIC void compile_funcdef_lambdef(compiler_t *comp, scope_t *scope, mp_parse_node_t pn_params, pn_kind_t pn_list_kind) {
|
2015-12-12 13:42:51 +00:00
|
|
|
// When we call compile_funcdef_lambdef_param below it can compile an arbitrary
|
|
|
|
// expression for default arguments, which may contain a lambda. The lambda will
|
|
|
|
// call here in a nested way, so we must save and restore the relevant state.
|
|
|
|
bool orig_have_star = comp->have_star;
|
|
|
|
uint16_t orig_num_dict_params = comp->num_dict_params;
|
|
|
|
uint16_t orig_num_default_params = comp->num_default_params;
|
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
// compile default parameters
|
2014-04-11 23:25:34 +01:00
|
|
|
comp->have_star = false;
|
2014-04-11 14:38:30 +01:00
|
|
|
comp->num_dict_params = 0;
|
|
|
|
comp->num_default_params = 0;
|
2015-11-17 14:00:14 +00:00
|
|
|
apply_to_single_or_list(comp, pn_params, pn_list_kind, compile_funcdef_lambdef_param);
|
2014-03-03 23:19:11 +00:00
|
|
|
|
2014-10-05 19:01:34 +01:00
|
|
|
if (comp->compile_error != MP_OBJ_NULL) {
|
2015-11-17 14:00:14 +00:00
|
|
|
return;
|
2014-03-03 23:19:11 +00:00
|
|
|
}
|
|
|
|
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython we put the default positional parameters into a tuple using the bytecode
|
2014-04-12 00:05:49 +01:00
|
|
|
// the default keywords args may have already made the tuple; if not, do it now
|
|
|
|
if (comp->num_default_params > 0 && comp->num_dict_params == 0) {
|
2018-05-18 15:41:40 +01:00
|
|
|
EMIT_ARG(build, comp->num_default_params, MP_EMIT_BUILD_TUPLE);
|
2014-04-20 17:50:40 +01:00
|
|
|
EMIT(load_null); // sentinel indicating empty default keyword args
|
2014-03-31 15:18:37 +01:00
|
|
|
}
|
|
|
|
|
2015-11-17 14:00:14 +00:00
|
|
|
// make the function
|
|
|
|
close_over_variables_etc(comp, scope, comp->num_default_params, comp->num_dict_params);
|
2015-12-12 13:42:51 +00:00
|
|
|
|
|
|
|
// restore state
|
|
|
|
comp->have_star = orig_have_star;
|
|
|
|
comp->num_dict_params = orig_num_dict_params;
|
|
|
|
comp->num_default_params = orig_num_default_params;
|
2015-11-17 14:00:14 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
// leaves function object on stack
|
|
|
|
// returns function name
|
|
|
|
STATIC qstr compile_funcdef_helper(compiler_t *comp, mp_parse_node_struct_t *pns, uint emit_options) {
|
|
|
|
if (comp->pass == MP_PASS_SCOPE) {
|
|
|
|
// create a new scope for this function
|
|
|
|
scope_t *s = scope_new_and_link(comp, SCOPE_FUNCTION, (mp_parse_node_t)pns, emit_options);
|
|
|
|
// store the function scope so the compiling function can use it at each pass
|
|
|
|
pns->nodes[4] = (mp_parse_node_t)s;
|
|
|
|
}
|
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
// get the scope for this function
|
|
|
|
scope_t *fscope = (scope_t *)pns->nodes[4];
|
|
|
|
|
2015-11-17 14:00:14 +00:00
|
|
|
// compile the function definition
|
|
|
|
compile_funcdef_lambdef(comp, fscope, pns->nodes[1], PN_typedargslist);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
// return its name (the 'f' in "def f(...):")
|
|
|
|
return fscope->simple_name;
|
|
|
|
}
|
|
|
|
|
|
|
|
// leaves class object on stack
|
|
|
|
// returns class name
|
2014-12-10 22:07:04 +00:00
|
|
|
STATIC qstr compile_classdef_helper(compiler_t *comp, mp_parse_node_struct_t *pns, uint emit_options) {
|
2014-05-07 17:24:22 +01:00
|
|
|
if (comp->pass == MP_PASS_SCOPE) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// create a new scope for this class
|
2013-12-21 18:17:45 +00:00
|
|
|
scope_t *s = scope_new_and_link(comp, SCOPE_CLASS, (mp_parse_node_t)pns, emit_options);
|
2013-10-04 19:53:11 +01:00
|
|
|
// store the class scope so the compiling function can use it at each pass
|
2013-12-21 18:17:45 +00:00
|
|
|
pns->nodes[3] = (mp_parse_node_t)s;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
EMIT(load_build_class);
|
|
|
|
|
|
|
|
// scope for this class
|
|
|
|
scope_t *cscope = (scope_t *)pns->nodes[3];
|
|
|
|
|
|
|
|
// compile the class
|
|
|
|
close_over_variables_etc(comp, cscope, 0, 0);
|
|
|
|
|
|
|
|
// get its name
|
2015-06-25 15:42:13 +01:00
|
|
|
EMIT_ARG(load_const_str, cscope->simple_name);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
// nodes[1] has parent classes, if any
|
2014-03-30 23:06:37 +01:00
|
|
|
// empty parentheses (e.g. class C():) get here as an empty PN_classdef_2 and need special handling
|
|
|
|
mp_parse_node_t parents = pns->nodes[1];
|
|
|
|
if (MP_PARSE_NODE_IS_STRUCT_KIND(parents, PN_classdef_2)) {
|
|
|
|
parents = MP_PARSE_NODE_NULL;
|
|
|
|
}
|
|
|
|
compile_trailer_paren_helper(comp, parents, false, 2);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
// return its name (the 'C' in "class C(...):")
|
|
|
|
return cscope->simple_name;
|
|
|
|
}
|
|
|
|
|
2013-10-05 18:08:26 +01:00
|
|
|
// returns true if it was a built-in decorator (even if the built-in had an error)
|
2020-05-04 13:11:44 +01:00
|
|
|
STATIC bool compile_built_in_decorator(compiler_t *comp, size_t name_len, mp_parse_node_t *name_nodes, uint *emit_options) {
|
2014-01-04 15:57:35 +00:00
|
|
|
if (MP_PARSE_NODE_LEAF_ARG(name_nodes[0]) != MP_QSTR_micropython) {
|
2013-10-05 18:08:26 +01:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (name_len != 2) {
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, name_nodes[0], MP_ERROR_TEXT("invalid micropython decorator"));
|
2013-10-05 18:08:26 +01:00
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2013-12-21 18:17:45 +00:00
|
|
|
qstr attr = MP_PARSE_NODE_LEAF_ARG(name_nodes[1]);
|
2014-05-10 10:36:38 +01:00
|
|
|
if (attr == MP_QSTR_bytecode) {
|
2014-05-12 22:35:37 +01:00
|
|
|
*emit_options = MP_EMIT_OPT_BYTECODE;
|
2013-10-15 22:25:17 +01:00
|
|
|
#if MICROPY_EMIT_NATIVE
|
2014-01-04 15:57:35 +00:00
|
|
|
} else if (attr == MP_QSTR_native) {
|
2014-04-06 11:48:15 +01:00
|
|
|
*emit_options = MP_EMIT_OPT_NATIVE_PYTHON;
|
2014-01-04 15:57:35 +00:00
|
|
|
} else if (attr == MP_QSTR_viper) {
|
2014-04-06 11:48:15 +01:00
|
|
|
*emit_options = MP_EMIT_OPT_VIPER;
|
2013-10-15 22:25:17 +01:00
|
|
|
#endif
|
2016-12-09 02:17:49 +00:00
|
|
|
#if MICROPY_EMIT_INLINE_ASM
|
2019-03-09 01:32:09 +00:00
|
|
|
#if MICROPY_DYNAMIC_COMPILER
|
|
|
|
} else if (attr == MP_QSTR_asm_thumb) {
|
|
|
|
*emit_options = MP_EMIT_OPT_ASM;
|
|
|
|
} else if (attr == MP_QSTR_asm_xtensa) {
|
|
|
|
*emit_options = MP_EMIT_OPT_ASM;
|
|
|
|
#else
|
2016-12-09 02:17:49 +00:00
|
|
|
} else if (attr == ASM_DECORATOR_QSTR) {
|
|
|
|
*emit_options = MP_EMIT_OPT_ASM;
|
|
|
|
#endif
|
2019-03-09 01:32:09 +00:00
|
|
|
#endif
|
2013-10-05 18:08:26 +01:00
|
|
|
} else {
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, name_nodes[1], MP_ERROR_TEXT("invalid micropython decorator"));
|
2013-10-05 18:08:26 +01:00
|
|
|
}
|
|
|
|
|
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
    #if MICROPY_EMIT_NATIVE && MICROPY_DYNAMIC_COMPILER
    if (*emit_options == MP_EMIT_OPT_NATIVE_PYTHON || *emit_options == MP_EMIT_OPT_VIPER) {
        if (emit_native_table[mp_dynamic_compiler.native_arch] == NULL) {
            compile_syntax_error(comp, name_nodes[1], MP_ERROR_TEXT("invalid arch"));
        }
    } else if (*emit_options == MP_EMIT_OPT_ASM) {
        if (emit_asm_table[mp_dynamic_compiler.native_arch] == NULL) {
            compile_syntax_error(comp, name_nodes[1], MP_ERROR_TEXT("invalid arch"));
        }
    }
    #endif

    return true;
}

STATIC void compile_decorated(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // get the list of decorators
    mp_parse_node_t *nodes;
    size_t n = mp_parse_node_extract_list(&pns->nodes[0], PN_decorators, &nodes);

    // inherit emit options for this function/class definition
    uint emit_options = comp->scope_cur->emit_options;

    // compile each decorator
    size_t num_built_in_decorators = 0;
    for (size_t i = 0; i < n; i++) {
        assert(MP_PARSE_NODE_IS_STRUCT_KIND(nodes[i], PN_decorator)); // should be
        mp_parse_node_struct_t *pns_decorator = (mp_parse_node_struct_t *)nodes[i];

        // nodes[0] contains the decorator function, which is a dotted name
        mp_parse_node_t *name_nodes;
        size_t name_len = mp_parse_node_extract_list(&pns_decorator->nodes[0], PN_dotted_name, &name_nodes);

        // check for built-in decorators
        if (compile_built_in_decorator(comp, name_len, name_nodes, &emit_options)) {
            // this was a built-in
            num_built_in_decorators += 1;

        } else {
            // not a built-in, compile normally

            // compile the decorator function
            compile_node(comp, name_nodes[0]);
            for (size_t j = 1; j < name_len; j++) {
                assert(MP_PARSE_NODE_IS_ID(name_nodes[j])); // should be
                EMIT_ARG(attr, MP_PARSE_NODE_LEAF_ARG(name_nodes[j]), MP_EMIT_ATTR_LOAD);
            }

            // nodes[1] contains arguments to the decorator function, if any
            if (!MP_PARSE_NODE_IS_NULL(pns_decorator->nodes[1])) {
                // call the decorator function with the arguments in nodes[1]
                compile_node(comp, pns_decorator->nodes[1]);
            }
        }
    }

    // compile the body (funcdef, async funcdef or classdef) and get its name
    mp_parse_node_struct_t *pns_body = (mp_parse_node_struct_t *)pns->nodes[1];
    qstr body_name = 0;
    if (MP_PARSE_NODE_STRUCT_KIND(pns_body) == PN_funcdef) {
        body_name = compile_funcdef_helper(comp, pns_body, emit_options);
    #if MICROPY_PY_ASYNC_AWAIT
    } else if (MP_PARSE_NODE_STRUCT_KIND(pns_body) == PN_async_funcdef) {
        assert(MP_PARSE_NODE_IS_STRUCT(pns_body->nodes[0]));
        mp_parse_node_struct_t *pns0 = (mp_parse_node_struct_t *)pns_body->nodes[0];
        body_name = compile_funcdef_helper(comp, pns0, emit_options);
        scope_t *fscope = (scope_t *)pns0->nodes[4];
        fscope->scope_flags |= MP_SCOPE_FLAG_GENERATOR;
    #endif
    } else {
        assert(MP_PARSE_NODE_STRUCT_KIND(pns_body) == PN_classdef); // should be
        body_name = compile_classdef_helper(comp, pns_body, emit_options);
    }

    // call each decorator
    for (size_t i = 0; i < n - num_built_in_decorators; i++) {
        EMIT_ARG(call_function, 1, 0, 0);
    }

    // store func/class object into name
    compile_store_id(comp, body_name);
}

STATIC void compile_funcdef(compiler_t *comp, mp_parse_node_struct_t *pns) {
    qstr fname = compile_funcdef_helper(comp, pns, comp->scope_cur->emit_options);
    // store function object into function name
    compile_store_id(comp, fname);
}

STATIC void c_del_stmt(compiler_t *comp, mp_parse_node_t pn) {
    if (MP_PARSE_NODE_IS_ID(pn)) {
        compile_delete_id(comp, MP_PARSE_NODE_LEAF_ARG(pn));
    } else if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_atom_expr_normal)) {
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;

        compile_node(comp, pns->nodes[0]); // base of the atom_expr_normal node

        if (MP_PARSE_NODE_IS_STRUCT(pns->nodes[1])) {
            mp_parse_node_struct_t *pns1 = (mp_parse_node_struct_t *)pns->nodes[1];
            if (MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_atom_expr_trailers) {
                int n = MP_PARSE_NODE_STRUCT_NUM_NODES(pns1);
                for (int i = 0; i < n - 1; i++) {
                    compile_node(comp, pns1->nodes[i]);
                }
                assert(MP_PARSE_NODE_IS_STRUCT(pns1->nodes[n - 1]));
                pns1 = (mp_parse_node_struct_t *)pns1->nodes[n - 1];
            }
            if (MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_trailer_bracket) {
                compile_node(comp, pns1->nodes[0]);
                EMIT_ARG(subscr, MP_EMIT_SUBSCR_DELETE);
            } else if (MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_trailer_period) {
                assert(MP_PARSE_NODE_IS_ID(pns1->nodes[0]));
                EMIT_ARG(attr, MP_PARSE_NODE_LEAF_ARG(pns1->nodes[0]), MP_EMIT_ATTR_DELETE);
            } else {
                goto cannot_delete;
            }
        } else {
            goto cannot_delete;
        }

    } else if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_atom_paren)) {
        pn = ((mp_parse_node_struct_t *)pn)->nodes[0];
        if (MP_PARSE_NODE_IS_NULL(pn)) {
            goto cannot_delete;
        } else {
            assert(MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_testlist_comp));
            mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
            if (MP_PARSE_NODE_TESTLIST_COMP_HAS_COMP_FOR(pns)) {
                goto cannot_delete;
            }
            for (size_t i = 0; i < MP_PARSE_NODE_STRUCT_NUM_NODES(pns); ++i) {
                c_del_stmt(comp, pns->nodes[i]);
            }
        }
    } else {
        // some arbitrary statement that we can't delete (eg del 1)
        goto cannot_delete;
    }

    return;

cannot_delete:
    compile_syntax_error(comp, (mp_parse_node_t)pn, MP_ERROR_TEXT("can't delete expression"));
}

STATIC void compile_del_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    apply_to_single_or_list(comp, pns->nodes[0], PN_exprlist, c_del_stmt);
}

STATIC void compile_break_cont_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    uint16_t label;
    if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_break_stmt) {
        label = comp->break_label;
    } else {
        label = comp->continue_label;
    }
    if (label == INVALID_LABEL) {
        compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("'break'/'continue' outside loop"));
    }
    assert(comp->cur_except_level >= comp->break_continue_except_level);
    EMIT_ARG(unwind_jump, label, comp->cur_except_level - comp->break_continue_except_level);
}

STATIC void compile_return_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    #if MICROPY_CPYTHON_COMPAT
    if (comp->scope_cur->kind != SCOPE_FUNCTION) {
        compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("'return' outside function"));
        return;
    }
    #endif
    if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
        // no argument to 'return', so return None
        EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
    } else if (MICROPY_COMP_RETURN_IF_EXPR
               && MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_test_if_expr)) {
        // special case when returning an if-expression; to match CPython optimisation
        mp_parse_node_struct_t *pns_test_if_expr = (mp_parse_node_struct_t *)pns->nodes[0];
        mp_parse_node_struct_t *pns_test_if_else = (mp_parse_node_struct_t *)pns_test_if_expr->nodes[1];

        uint l_fail = comp_next_label(comp);
        c_if_cond(comp, pns_test_if_else->nodes[0], false, l_fail); // condition
        compile_node(comp, pns_test_if_expr->nodes[0]); // success value
        EMIT(return_value);
        EMIT_ARG(label_assign, l_fail);
        compile_node(comp, pns_test_if_else->nodes[1]); // failure value
    } else {
        compile_node(comp, pns->nodes[0]);
    }
    EMIT(return_value);
}

STATIC void compile_yield_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_node(comp, pns->nodes[0]);
    EMIT(pop_top);
}

STATIC void compile_raise_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
        // raise
        EMIT_ARG(raise_varargs, 0);
    } else if (MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_raise_stmt_arg)) {
        // raise x from y
        pns = (mp_parse_node_struct_t *)pns->nodes[0];
        compile_node(comp, pns->nodes[0]);
        compile_node(comp, pns->nodes[1]);
        EMIT_ARG(raise_varargs, 2);
    } else {
        // raise x
        compile_node(comp, pns->nodes[0]);
        EMIT_ARG(raise_varargs, 1);
    }
}

// q_base holds the base of the name
// eg   a -> q_base=a
//      a.b.c -> q_base=a
STATIC void do_import_name(compiler_t *comp, mp_parse_node_t pn, qstr *q_base) {
    bool is_as = false;
    if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_dotted_as_name)) {
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
        // a name of the form x as y; unwrap it
        *q_base = MP_PARSE_NODE_LEAF_ARG(pns->nodes[1]);
        pn = pns->nodes[0];
        is_as = true;
    }
    if (MP_PARSE_NODE_IS_NULL(pn)) {
        // empty name (eg, from . import x)
        *q_base = MP_QSTR_;
        EMIT_ARG(import, MP_QSTR_, MP_EMIT_IMPORT_NAME); // import the empty string
    } else if (MP_PARSE_NODE_IS_ID(pn)) {
        // just a simple name
        qstr q_full = MP_PARSE_NODE_LEAF_ARG(pn);
        if (!is_as) {
            *q_base = q_full;
        }
        EMIT_ARG(import, q_full, MP_EMIT_IMPORT_NAME);
    } else {
        assert(MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_dotted_name)); // should be
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
        {
            // a name of the form a.b.c
            if (!is_as) {
                *q_base = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]);
            }
            int n = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
            int len = n - 1;
            for (int i = 0; i < n; i++) {
                len += qstr_len(MP_PARSE_NODE_LEAF_ARG(pns->nodes[i]));
            }
            char *q_ptr = mp_local_alloc(len);
            char *str_dest = q_ptr;
            for (int i = 0; i < n; i++) {
                if (i > 0) {
                    *str_dest++ = '.';
                }
                size_t str_src_len;
                const byte *str_src = qstr_data(MP_PARSE_NODE_LEAF_ARG(pns->nodes[i]), &str_src_len);
                memcpy(str_dest, str_src, str_src_len);
                str_dest += str_src_len;
            }
            qstr q_full = qstr_from_strn(q_ptr, len);
            mp_local_free(q_ptr);
            EMIT_ARG(import, q_full, MP_EMIT_IMPORT_NAME);
            if (is_as) {
                for (int i = 1; i < n; i++) {
                    EMIT_ARG(attr, MP_PARSE_NODE_LEAF_ARG(pns->nodes[i]), MP_EMIT_ATTR_LOAD);
                }
            }
        }
    }
}
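
// Illustrative sketch (not part of compile.c): the dotted-name branch of
// do_import_name() above sizes the joined name as (n - 1) separators plus the
// length of every component, then copies the pieces with a '.' between them.
// The standalone helper below shows the same two-pass technique on plain C
// strings. The name join_dotted and its buffer-based signature are
// hypothetical; the real code works on qstrs and uses mp_local_alloc().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Join n name components with '.' into out, which has room for out_cap bytes
// including the terminating NUL. Returns the joined length, or (size_t)-1 if
// the result does not fit.
static size_t join_dotted(const char *const *parts, size_t n, char *out, size_t out_cap) {
    assert(parts != NULL && out != NULL);
    if (n == 0) {
        if (out_cap == 0) {
            return (size_t)-1;
        }
        out[0] = '\0';
        return 0;
    }
    // First pass: n - 1 separators plus the length of each component.
    size_t len = n - 1;
    for (size_t i = 0; i < n; i++) {
        len += strlen(parts[i]);
    }
    if (len + 1 > out_cap) {
        return (size_t)-1;
    }
    // Second pass: copy each component, preceded by '.' except for the first.
    char *dest = out;
    for (size_t i = 0; i < n; i++) {
        if (i > 0) {
            *dest++ = '.';
        }
        size_t part_len = strlen(parts[i]);
        memcpy(dest, parts[i], part_len);
        dest += part_len;
    }
    *dest = '\0';
    return len;
}
```

// Computing the exact length first means a single allocation (or a single
// capacity check, as here) is enough before any bytes are copied.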

STATIC void compile_dotted_as_name(compiler_t *comp, mp_parse_node_t pn) {
    EMIT_ARG(load_const_small_int, 0); // level 0 import
    EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE); // not importing from anything
    qstr q_base;
    do_import_name(comp, pn, &q_base);
    compile_store_id(comp, q_base);
}

STATIC void compile_import_name(compiler_t *comp, mp_parse_node_struct_t *pns) {
    apply_to_single_or_list(comp, pns->nodes[0], PN_dotted_as_names, compile_dotted_as_name);
}

STATIC void compile_import_from(compiler_t *comp, mp_parse_node_struct_t *pns) {
    mp_parse_node_t pn_import_source = pns->nodes[0];

    // extract the preceding .'s (if any) for a relative import, to compute the import level
    uint import_level = 0;
    do {
        mp_parse_node_t pn_rel;
        if (MP_PARSE_NODE_IS_TOKEN(pn_import_source) || MP_PARSE_NODE_IS_STRUCT_KIND(pn_import_source, PN_one_or_more_period_or_ellipsis)) {
            // This covers relative imports with dots only like "from .. import"
            pn_rel = pn_import_source;
            pn_import_source = MP_PARSE_NODE_NULL;
        } else if (MP_PARSE_NODE_IS_STRUCT_KIND(pn_import_source, PN_import_from_2b)) {
            // This covers relative imports starting with dot(s) like "from .foo import"
            mp_parse_node_struct_t *pns_2b = (mp_parse_node_struct_t *)pn_import_source;
            pn_rel = pns_2b->nodes[0];
            pn_import_source = pns_2b->nodes[1];
            assert(!MP_PARSE_NODE_IS_NULL(pn_import_source)); // should not be
        } else {
            // Not a relative import
            break;
        }

        // get the list of . and/or ...'s
        mp_parse_node_t *nodes;
        size_t n = mp_parse_node_extract_list(&pn_rel, PN_one_or_more_period_or_ellipsis, &nodes);

        // count the total number of .'s
        for (size_t i = 0; i < n; i++) {
            if (MP_PARSE_NODE_IS_TOKEN_KIND(nodes[i], MP_TOKEN_DEL_PERIOD)) {
                import_level++;
            } else {
                // should be an MP_TOKEN_ELLIPSIS
                import_level += 3;
            }
        }
    } while (0);

    if (MP_PARSE_NODE_IS_TOKEN_KIND(pns->nodes[1], MP_TOKEN_OP_STAR)) {
        #if MICROPY_CPYTHON_COMPAT
        if (comp->scope_cur->kind != SCOPE_MODULE) {
            compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("import * not at module level"));
            return;
        }
        #endif

        EMIT_ARG(load_const_small_int, import_level);

        // build the "fromlist" tuple
        EMIT_ARG(load_const_str, MP_QSTR__star_);
        EMIT_ARG(build, 1, MP_EMIT_BUILD_TUPLE);

        // do the import
        qstr dummy_q;
        do_import_name(comp, pn_import_source, &dummy_q);
        EMIT_ARG(import, MP_QSTRnull, MP_EMIT_IMPORT_STAR);

    } else {
        EMIT_ARG(load_const_small_int, import_level);

        // build the "fromlist" tuple
        mp_parse_node_t *pn_nodes;
        size_t n = mp_parse_node_extract_list(&pns->nodes[1], PN_import_as_names, &pn_nodes);
        for (size_t i = 0; i < n; i++) {
            assert(MP_PARSE_NODE_IS_STRUCT_KIND(pn_nodes[i], PN_import_as_name));
            mp_parse_node_struct_t *pns3 = (mp_parse_node_struct_t *)pn_nodes[i];
            qstr id2 = MP_PARSE_NODE_LEAF_ARG(pns3->nodes[0]); // should be id
            EMIT_ARG(load_const_str, id2);
        }
        EMIT_ARG(build, n, MP_EMIT_BUILD_TUPLE);

        // do the import
        qstr dummy_q;
        do_import_name(comp, pn_import_source, &dummy_q);
        for (size_t i = 0; i < n; i++) {
            assert(MP_PARSE_NODE_IS_STRUCT_KIND(pn_nodes[i], PN_import_as_name));
            mp_parse_node_struct_t *pns3 = (mp_parse_node_struct_t *)pn_nodes[i];
            qstr id2 = MP_PARSE_NODE_LEAF_ARG(pns3->nodes[0]); // should be id
            EMIT_ARG(import, id2, MP_EMIT_IMPORT_FROM);
            if (MP_PARSE_NODE_IS_NULL(pns3->nodes[1])) {
                compile_store_id(comp, id2);
            } else {
                compile_store_id(comp, MP_PARSE_NODE_LEAF_ARG(pns3->nodes[1]));
            }
        }
        EMIT(pop_top);
    }
}
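
// Illustrative sketch (not part of compile.c): compile_import_from() above
// computes the relative import level by walking the leading dot tokens, adding
// 1 for each '.' token and 3 for each '...' token, because the lexer may
// deliver three consecutive dots as a single ellipsis. The standalone version
// below shows just that counting rule. The enum dot_token_t and the function
// relative_import_level are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical token kinds standing in for the period and ellipsis tokens.
typedef enum { DOT_TOKEN_PERIOD, DOT_TOKEN_ELLIPSIS } dot_token_t;

// Return the relative import level encoded by a run of dot tokens.
static unsigned relative_import_level(const dot_token_t *tokens, size_t n) {
    assert(tokens != NULL || n == 0);
    unsigned level = 0;
    for (size_t i = 0; i < n; i++) {
        // A single '.' contributes one level; '...' contributes three.
        level += (tokens[i] == DOT_TOKEN_PERIOD) ? 1u : 3u;
    }
    return level;
}
```

// Under this rule "from ....pkg import x", tokenised as an ellipsis followed by
// a period, yields level 4, and an absolute import (no dot tokens) yields 0.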

STATIC void compile_declare_global(compiler_t *comp, mp_parse_node_t pn, id_info_t *id_info) {
    if (id_info->kind != ID_INFO_KIND_UNDECIDED && id_info->kind != ID_INFO_KIND_GLOBAL_EXPLICIT) {
        compile_syntax_error(comp, pn, MP_ERROR_TEXT("identifier redefined as global"));
        return;
    }
    id_info->kind = ID_INFO_KIND_GLOBAL_EXPLICIT;

    // if the id exists in the global scope, set its kind to EXPLICIT_GLOBAL
    id_info = scope_find_global(comp->scope_cur, id_info->qst);
    if (id_info != NULL) {
        id_info->kind = ID_INFO_KIND_GLOBAL_EXPLICIT;
    }
}

STATIC void compile_declare_nonlocal(compiler_t *comp, mp_parse_node_t pn, id_info_t *id_info) {
    if (id_info->kind == ID_INFO_KIND_UNDECIDED) {
        id_info->kind = ID_INFO_KIND_GLOBAL_IMPLICIT;
        scope_check_to_close_over(comp->scope_cur, id_info);
        if (id_info->kind == ID_INFO_KIND_GLOBAL_IMPLICIT) {
            compile_syntax_error(comp, pn, MP_ERROR_TEXT("no binding for nonlocal found"));
        }
    } else if (id_info->kind != ID_INFO_KIND_FREE) {
        compile_syntax_error(comp, pn, MP_ERROR_TEXT("identifier redefined as nonlocal"));
    }
}

STATIC void compile_global_nonlocal_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    if (comp->pass == MP_PASS_SCOPE) {
        bool is_global = MP_PARSE_NODE_STRUCT_KIND(pns) == PN_global_stmt;

        if (!is_global && comp->scope_cur->kind == SCOPE_MODULE) {
            compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("can't declare nonlocal in outer code"));
            return;
        }

        mp_parse_node_t *nodes;
        size_t n = mp_parse_node_extract_list(&pns->nodes[0], PN_name_list, &nodes);
        for (size_t i = 0; i < n; i++) {
            qstr qst = MP_PARSE_NODE_LEAF_ARG(nodes[i]);
            id_info_t *id_info = scope_find_or_add_id(comp->scope_cur, qst, ID_INFO_KIND_UNDECIDED);
            if (is_global) {
                compile_declare_global(comp, (mp_parse_node_t)pns, id_info);
            } else {
                compile_declare_nonlocal(comp, (mp_parse_node_t)pns, id_info);
            }
        }
    }
}

STATIC void compile_assert_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // with optimisations enabled we don't compile assertions
    if (MP_STATE_VM(mp_optimise_value) != 0) {
        return;
    }

    uint l_end = comp_next_label(comp);
    c_if_cond(comp, pns->nodes[0], true, l_end);
    EMIT_LOAD_GLOBAL(MP_QSTR_AssertionError); // we load_global instead of load_id, to be consistent with CPython
    if (!MP_PARSE_NODE_IS_NULL(pns->nodes[1])) {
        // assertion message
        compile_node(comp, pns->nodes[1]);
        EMIT_ARG(call_function, 1, 0, 0);
    }
    EMIT_ARG(raise_varargs, 1);
    EMIT_ARG(label_assign, l_end);
}

STATIC void compile_if_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    uint l_end = comp_next_label(comp);

    // optimisation: don't emit anything when "if False"
    if (!mp_parse_node_is_const_false(pns->nodes[0])) {
        uint l_fail = comp_next_label(comp);
        c_if_cond(comp, pns->nodes[0], false, l_fail); // if condition

        compile_node(comp, pns->nodes[1]); // if block

        // optimisation: skip everything else when "if True"
        if (mp_parse_node_is_const_true(pns->nodes[0])) {
            goto done;
        }

        if (
            // optimisation: don't jump over non-existent elif/else blocks
            !(MP_PARSE_NODE_IS_NULL(pns->nodes[2]) && MP_PARSE_NODE_IS_NULL(pns->nodes[3]))
            // optimisation: don't jump if last instruction was return
            && !EMIT(last_emit_was_return_value)
            ) {
            // jump over elif/else blocks
            EMIT_ARG(jump, l_end);
        }

        EMIT_ARG(label_assign, l_fail);
    }

    // compile elif blocks (if any)
    mp_parse_node_t *pn_elif;
    size_t n_elif = mp_parse_node_extract_list(&pns->nodes[2], PN_if_stmt_elif_list, &pn_elif);
    for (size_t i = 0; i < n_elif; i++) {
        assert(MP_PARSE_NODE_IS_STRUCT_KIND(pn_elif[i], PN_if_stmt_elif)); // should be
        mp_parse_node_struct_t *pns_elif = (mp_parse_node_struct_t *)pn_elif[i];

        // optimisation: don't emit anything when "if False"
        if (!mp_parse_node_is_const_false(pns_elif->nodes[0])) {
            uint l_fail = comp_next_label(comp);
            c_if_cond(comp, pns_elif->nodes[0], false, l_fail); // elif condition

            compile_node(comp, pns_elif->nodes[1]); // elif block

            // optimisation: skip everything else when "elif True"
            if (mp_parse_node_is_const_true(pns_elif->nodes[0])) {
                goto done;
            }

            // optimisation: don't jump if last instruction was return
            if (!EMIT(last_emit_was_return_value)) {
                EMIT_ARG(jump, l_end);
            }
            EMIT_ARG(label_assign, l_fail);
        }
    }

    // compile else block
    compile_node(comp, pns->nodes[3]); // can be null

done:
    EMIT_ARG(label_assign, l_end);
}

#define START_BREAK_CONTINUE_BLOCK \
    uint16_t old_break_label = comp->break_label; \
    uint16_t old_continue_label = comp->continue_label; \
    uint16_t old_break_continue_except_level = comp->break_continue_except_level; \
    uint break_label = comp_next_label(comp); \
    uint continue_label = comp_next_label(comp); \
    comp->break_label = break_label; \
    comp->continue_label = continue_label; \
    comp->break_continue_except_level = comp->cur_except_level;

#define END_BREAK_CONTINUE_BLOCK \
    comp->break_label = old_break_label; \
    comp->continue_label = old_continue_label; \
    comp->break_continue_except_level = old_break_continue_except_level;

STATIC void compile_while_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    START_BREAK_CONTINUE_BLOCK

    if (!mp_parse_node_is_const_false(pns->nodes[0])) { // optimisation: don't emit anything for "while False"
        uint top_label = comp_next_label(comp);
        if (!mp_parse_node_is_const_true(pns->nodes[0])) { // optimisation: don't jump to cond for "while True"
            EMIT_ARG(jump, continue_label);
        }
        EMIT_ARG(label_assign, top_label);
        compile_node(comp, pns->nodes[1]); // body
        EMIT_ARG(label_assign, continue_label);
        c_if_cond(comp, pns->nodes[0], true, top_label); // condition
    }

    // break/continue apply to outer loop (if any) in the else block
    END_BREAK_CONTINUE_BLOCK

    compile_node(comp, pns->nodes[2]); // else

    EMIT_ARG(label_assign, break_label);
}

// This function compiles an optimised for-loop of the form:
//      for <var> in range(<start>, <end>, <step>):
//          <body>
//      else:
//          <else>
// <var> must be an identifier and <step> must be a small-int.
//
// Semantics of for-loop require:
//  - final failing value should not be stored in the loop variable
//  - if the loop never runs, the loop variable should never be assigned
//  - assignments to <var>, <end> or <step> in the body do not alter the loop
//    (<step> is a constant for us, so no need to worry about it changing)
//
// If <end> is a small-int, then the stack during the for-loop contains just
// the current value of <var>.  Otherwise, the stack contains <end> then the
// current value of <var>.
STATIC void compile_for_stmt_optimised_range(compiler_t *comp, mp_parse_node_t pn_var, mp_parse_node_t pn_start, mp_parse_node_t pn_end, mp_parse_node_t pn_step, mp_parse_node_t pn_body, mp_parse_node_t pn_else) {
    START_BREAK_CONTINUE_BLOCK

    uint top_label = comp_next_label(comp);
    uint entry_label = comp_next_label(comp);

    // put the end value on the stack if it's not a small-int constant
    bool end_on_stack = !MP_PARSE_NODE_IS_SMALL_INT(pn_end);
    if (end_on_stack) {
        compile_node(comp, pn_end);
    }

    // compile: start
    compile_node(comp, pn_start);

    EMIT_ARG(jump, entry_label);
    EMIT_ARG(label_assign, top_label);

    // duplicate next value and store it to var
    EMIT(dup_top);
    c_assign(comp, pn_var, ASSIGN_STORE);

    // compile body
    compile_node(comp, pn_body);

    EMIT_ARG(label_assign, continue_label);

    // compile: var + step
    compile_node(comp, pn_step);
    EMIT_ARG(binary_op, MP_BINARY_OP_INPLACE_ADD);

    EMIT_ARG(label_assign, entry_label);

    // compile: if var <cond> end: goto top
    if (end_on_stack) {
        EMIT(dup_top_two);
        EMIT(rot_two);
    } else {
        EMIT(dup_top);
        compile_node(comp, pn_end);
    }
    assert(MP_PARSE_NODE_IS_SMALL_INT(pn_step));
    if (MP_PARSE_NODE_LEAF_SMALL_INT(pn_step) >= 0) {
        EMIT_ARG(binary_op, MP_BINARY_OP_LESS);
    } else {
        EMIT_ARG(binary_op, MP_BINARY_OP_MORE);
    }
    EMIT_ARG(pop_jump_if, true, top_label);

    // break/continue apply to outer loop (if any) in the else block
    END_BREAK_CONTINUE_BLOCK

    // Compile the else block.  We must pop the iterator variables before
    // executing the else code because it may contain break/continue statements.
|
|
|
|
uint end_label = 0;
|
|
|
|
if (!MP_PARSE_NODE_IS_NULL(pn_else)) {
|
|
|
|
// discard final value of "var", and possible "end" value
|
|
|
|
EMIT(pop_top);
|
|
|
|
if (end_on_stack) {
|
|
|
|
EMIT(pop_top);
|
|
|
|
}
|
|
|
|
compile_node(comp, pn_else);
|
|
|
|
end_label = comp_next_label(comp);
|
|
|
|
EMIT_ARG(jump, end_label);
|
|
|
|
EMIT_ARG(adjust_stack_size, 1 + end_on_stack);
|
|
|
|
}
|
2013-11-06 20:20:49 +00:00
|
|
|
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(label_assign, break_label);
|
2014-12-12 17:19:56 +00:00
|
|
|
|
|
|
|
// discard final value of var that failed the loop condition
|
|
|
|
EMIT(pop_top);
|
|
|
|
|
|
|
|
// discard <end> value if it's on the stack
|
|
|
|
if (end_on_stack) {
|
|
|
|
EMIT(pop_top);
|
|
|
|
}
|
2017-06-22 04:50:33 +01:00
|
|
|
|
|
|
|
if (!MP_PARSE_NODE_IS_NULL(pn_else)) {
|
|
|
|
EMIT_ARG(label_assign, end_label);
|
|
|
|
}
|
2013-11-06 20:20:49 +00:00
|
|
|
}
|
|
|
|
|
2014-12-10 22:07:04 +00:00
|
|
|
STATIC void compile_for_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
|
2013-11-06 20:20:49 +00:00
|
|
|
// this bit optimises: for <x> in range(...), turning it into an explicitly incremented variable
|
|
|
|
// in the bytecode VM this is actually slower than iterating a range object, but it uses no heap memory
|
|
|
|
// for viper it will be much, much faster
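// For example (a sketch derived from the checks below), these loops take the
// optimised path:
//     for i in range(n): ...
//     for i in range(a, b): ...
//     for i in range(a, b, 2): ...
// while these fall back to the generic iterator-based loop further down:
//     for i in range(a, b, c): ...      (step is not a constant small-int)
//     for i in range(a, b, 0): ...      (step is zero)
//     for i in range(*args): ...        (star or keyword arguments)
//     for a.x in range(n): ...          (loop target is not a plain identifier)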
|
2016-01-27 20:23:11 +00:00
|
|
|
if (/*comp->scope_cur->emit_options == MP_EMIT_OPT_VIPER &&*/ MP_PARSE_NODE_IS_ID(pns->nodes[0]) && MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[1], PN_atom_expr_normal)) {
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns_it = (mp_parse_node_struct_t *)pns->nodes[1];
|
2016-01-27 20:23:11 +00:00
|
|
|
if (MP_PARSE_NODE_IS_ID(pns_it->nodes[0])
|
2014-01-10 18:38:57 +00:00
|
|
|
&& MP_PARSE_NODE_LEAF_ARG(pns_it->nodes[0]) == MP_QSTR_range
|
2017-04-22 05:13:37 +01:00
|
|
|
&& MP_PARSE_NODE_STRUCT_KIND((mp_parse_node_struct_t *)pns_it->nodes[1]) == PN_trailer_paren) {
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_t pn_range_args = ((mp_parse_node_struct_t *)pns_it->nodes[1])->nodes[0];
|
|
|
|
mp_parse_node_t *args;
|
2020-05-04 13:11:44 +01:00
|
|
|
size_t n_args = mp_parse_node_extract_list(&pn_range_args, PN_arglist, &args);
|
2014-01-10 18:38:57 +00:00
|
|
|
mp_parse_node_t pn_range_start;
|
|
|
|
mp_parse_node_t pn_range_end;
|
|
|
|
mp_parse_node_t pn_range_step;
|
|
|
|
bool optimize = false;
|
2013-11-06 20:20:49 +00:00
|
|
|
if (1 <= n_args && n_args <= 3) {
|
2014-01-10 18:38:57 +00:00
|
|
|
optimize = true;
|
2013-11-06 20:20:49 +00:00
|
|
|
if (n_args == 1) {
|
2016-11-13 04:35:11 +00:00
|
|
|
pn_range_start = mp_parse_node_new_small_int(0);
|
2013-11-06 20:20:49 +00:00
|
|
|
pn_range_end = args[0];
|
2016-11-13 04:35:11 +00:00
|
|
|
pn_range_step = mp_parse_node_new_small_int(1);
|
2013-11-06 20:20:49 +00:00
|
|
|
} else if (n_args == 2) {
|
|
|
|
pn_range_start = args[0];
|
|
|
|
pn_range_end = args[1];
|
2016-11-13 04:35:11 +00:00
|
|
|
pn_range_step = mp_parse_node_new_small_int(1);
|
2013-11-06 20:20:49 +00:00
|
|
|
} else {
|
|
|
|
pn_range_start = args[0];
|
|
|
|
pn_range_end = args[1];
|
|
|
|
pn_range_step = args[2];
|
2017-04-05 01:50:26 +01:00
|
|
|
// the step must be a non-zero constant integer to do the optimisation
|
|
|
|
if (!MP_PARSE_NODE_IS_SMALL_INT(pn_range_step)
|
|
|
|
|| MP_PARSE_NODE_LEAF_SMALL_INT(pn_range_step) == 0) {
|
2014-01-10 18:38:57 +00:00
|
|
|
optimize = false;
|
|
|
|
}
|
2013-11-06 20:20:49 +00:00
|
|
|
}
|
2015-12-08 21:05:14 +00:00
|
|
|
// arguments must be able to be compiled as standard expressions
|
|
|
|
if (optimize && MP_PARSE_NODE_IS_STRUCT(pn_range_start)) {
|
|
|
|
int k = MP_PARSE_NODE_STRUCT_KIND((mp_parse_node_struct_t *)pn_range_start);
|
|
|
|
if (k == PN_arglist_star || k == PN_arglist_dbl_star || k == PN_argument) {
|
|
|
|
optimize = false;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (optimize && MP_PARSE_NODE_IS_STRUCT(pn_range_end)) {
|
|
|
|
int k = MP_PARSE_NODE_STRUCT_KIND((mp_parse_node_struct_t *)pn_range_end);
|
|
|
|
if (k == PN_arglist_star || k == PN_arglist_dbl_star || k == PN_argument) {
|
|
|
|
optimize = false;
|
|
|
|
}
|
|
|
|
}
|
2014-01-10 18:38:57 +00:00
|
|
|
}
|
|
|
|
if (optimize) {
|
2013-11-06 20:20:49 +00:00
|
|
|
compile_for_stmt_optimised_range(comp, pns->nodes[0], pn_range_start, pn_range_end, pn_range_step, pns->nodes[2], pns->nodes[3]);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-02-01 20:08:18 +00:00
|
|
|
START_BREAK_CONTINUE_BLOCK
|
2014-05-30 15:20:41 +01:00
|
|
|
comp->break_label |= MP_EMIT_BREAK_FROM_FOR;
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2014-04-10 14:11:31 +01:00
|
|
|
uint pop_label = comp_next_label(comp);
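// Shape of the code emitted for this generic case (a sketch, labels named
// after the variables used here):
//         <iterator>; get_iter
//     continue:
//         for_iter pop          (jumps to pop once the iterator is exhausted)
//         store <var>
//         <body>
//         jump continue         (omitted if the body ended with a return)
//     pop:
//         for_iter_end
//         <else>
//     break: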
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
compile_node(comp, pns->nodes[1]); // iterator
|
2016-01-09 23:59:52 +00:00
|
|
|
EMIT_ARG(get_iter, true);
|
2014-02-01 20:08:18 +00:00
|
|
|
EMIT_ARG(label_assign, continue_label);
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(for_iter, pop_label);
|
2013-10-04 19:53:11 +01:00
|
|
|
c_assign(comp, pns->nodes[0], ASSIGN_STORE); // variable
|
|
|
|
compile_node(comp, pns->nodes[2]); // body
|
2013-10-05 12:19:06 +01:00
|
|
|
if (!EMIT(last_emit_was_return_value)) {
|
2014-02-01 20:08:18 +00:00
|
|
|
EMIT_ARG(jump, continue_label);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(label_assign, pop_label);
|
2017-01-17 04:30:18 +00:00
|
|
|
EMIT(for_iter_end);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
// break/continue apply to outer loop (if any) in the else block
|
2014-02-01 20:08:18 +00:00
|
|
|
END_BREAK_CONTINUE_BLOCK
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2017-06-22 04:50:33 +01:00
|
|
|
compile_node(comp, pns->nodes[3]); // else (may be empty)
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(label_assign, break_label);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2014-08-15 14:30:52 +01:00
|
|
|
STATIC void compile_try_except(compiler_t *comp, mp_parse_node_t pn_body, int n_except, mp_parse_node_t *pn_excepts, mp_parse_node_t pn_else) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// setup code
|
2014-04-10 14:11:31 +01:00
|
|
|
uint l1 = comp_next_label(comp);
|
|
|
|
uint success_label = comp_next_label(comp);
|
2014-02-01 20:08:18 +00:00
|
|
|
|
2018-09-04 06:34:51 +01:00
|
|
|
compile_increase_except_level(comp, l1, MP_EMIT_SETUP_BLOCK_EXCEPT);
|
2014-02-01 20:08:18 +00:00
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
compile_node(comp, pn_body); // body
|
2019-02-15 01:18:59 +00:00
|
|
|
EMIT_ARG(pop_except_jump, success_label, false); // jump over exception handler
|
2014-04-10 18:22:19 +01:00
|
|
|
|
|
|
|
EMIT_ARG(label_assign, l1); // start of exception handler
|
2014-06-30 05:17:25 +01:00
|
|
|
EMIT(start_except_handler);
|
2014-04-10 18:22:19 +01:00
|
|
|
|
2016-09-27 03:37:21 +01:00
|
|
|
// at this point the top of the stack contains the exception instance that was raised
|
|
|
|
|
2014-04-10 14:11:31 +01:00
|
|
|
uint l2 = comp_next_label(comp);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
for (int i = 0; i < n_except; i++) {
|
2013-12-21 18:17:45 +00:00
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT_KIND(pn_excepts[i], PN_try_stmt_except)); // should be
|
|
|
|
mp_parse_node_struct_t *pns_except = (mp_parse_node_struct_t *)pn_excepts[i];
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
qstr qstr_exception_local = 0;
|
2014-04-10 14:11:31 +01:00
|
|
|
uint end_finally_label = comp_next_label(comp);
|
2019-08-14 15:09:36 +01:00
|
|
|
#if MICROPY_PY_SYS_SETTRACE
|
|
|
|
EMIT_ARG(set_source_line, pns_except->source_line);
|
|
|
|
#endif
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pns_except->nodes[0])) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// this is a catch all exception handler
|
|
|
|
if (i + 1 != n_except) {
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, pn_excepts[i], MP_ERROR_TEXT("default 'except' must be last"));
|
2015-01-13 23:33:16 +00:00
|
|
|
compile_decrease_except_level(comp);
|
2013-10-04 19:53:11 +01:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
// this exception handler requires a match to a certain type of exception
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_t pns_exception_expr = pns_except->nodes[0];
|
|
|
|
if (MP_PARSE_NODE_IS_STRUCT(pns_exception_expr)) {
|
|
|
|
mp_parse_node_struct_t *pns3 = (mp_parse_node_struct_t *)pns_exception_expr;
|
|
|
|
if (MP_PARSE_NODE_STRUCT_KIND(pns3) == PN_try_stmt_as_name) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// handler binds the exception to a local
|
|
|
|
pns_exception_expr = pns3->nodes[0];
|
2013-12-21 18:17:45 +00:00
|
|
|
qstr_exception_local = MP_PARSE_NODE_LEAF_ARG(pns3->nodes[1]);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
EMIT(dup_top);
|
|
|
|
compile_node(comp, pns_exception_expr);
|
2014-03-30 13:35:08 +01:00
|
|
|
EMIT_ARG(binary_op, MP_BINARY_OP_EXCEPTION_MATCH);
|
2015-02-28 15:04:06 +00:00
|
|
|
EMIT_ARG(pop_jump_if, false, end_finally_label);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2016-09-27 03:37:21 +01:00
|
|
|
// either discard or store the exception instance
|
2013-10-04 19:53:11 +01:00
|
|
|
if (qstr_exception_local == 0) {
|
|
|
|
EMIT(pop_top);
|
|
|
|
} else {
|
2015-03-26 14:42:40 +00:00
|
|
|
compile_store_id(comp, qstr_exception_local);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2019-01-01 23:46:51 +00:00
|
|
|
// If the exception is bound to a variable <e> then the <body> of the
|
|
|
|
// exception handler is wrapped in a try-finally so that the name <e> can
|
|
|
|
// be deleted (per Python semantics) even if the <body> has an exception.
|
|
|
|
// In such a case the generated code for the exception handler is:
|
|
|
|
// try:
|
|
|
|
// <body>
|
|
|
|
// finally:
|
|
|
|
// <e> = None
|
|
|
|
// del <e>
|
2014-04-10 14:11:31 +01:00
|
|
|
uint l3 = 0;
|
2013-10-04 19:53:11 +01:00
|
|
|
if (qstr_exception_local != 0) {
|
2013-10-05 13:37:10 +01:00
|
|
|
l3 = comp_next_label(comp);
|
2018-09-04 06:34:51 +01:00
|
|
|
compile_increase_except_level(comp, l3, MP_EMIT_SETUP_BLOCK_FINALLY);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2019-01-01 23:46:51 +00:00
|
|
|
compile_node(comp, pns_except->nodes[1]); // the <body>
|
2013-10-04 19:53:11 +01:00
|
|
|
if (qstr_exception_local != 0) {
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
|
|
|
EMIT_ARG(label_assign, l3);
|
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
2015-03-26 14:42:40 +00:00
|
|
|
compile_store_id(comp, qstr_exception_local);
|
|
|
|
compile_delete_id(comp, qstr_exception_local);
|
2014-03-27 10:55:21 +00:00
|
|
|
compile_decrease_except_level(comp);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2019-01-01 23:46:51 +00:00
|
|
|
|
2019-02-15 01:18:59 +00:00
|
|
|
EMIT_ARG(pop_except_jump, l2, true);
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(label_assign, end_finally_label);
|
2016-09-27 03:37:21 +01:00
|
|
|
EMIT_ARG(adjust_stack_size, 1); // stack adjust for the exception instance
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2014-03-27 10:55:21 +00:00
|
|
|
compile_decrease_except_level(comp);
|
2014-06-30 05:17:25 +01:00
|
|
|
EMIT(end_except_handler);
|
2014-02-01 20:08:18 +00:00
|
|
|
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(label_assign, success_label);
|
2013-10-04 19:53:11 +01:00
|
|
|
compile_node(comp, pn_else); // else block, can be null
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(label_assign, l2);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2014-08-15 14:30:52 +01:00
|
|
|
STATIC void compile_try_finally(compiler_t *comp, mp_parse_node_t pn_body, int n_except, mp_parse_node_t *pn_except, mp_parse_node_t pn_else, mp_parse_node_t pn_finally) {
|
2014-04-10 14:11:31 +01:00
|
|
|
uint l_finally_block = comp_next_label(comp);
|
2014-02-01 20:08:18 +00:00
|
|
|
|
2018-09-04 06:34:51 +01:00
|
|
|
compile_increase_except_level(comp, l_finally_block, MP_EMIT_SETUP_BLOCK_FINALLY);
|
2014-02-01 20:08:18 +00:00
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
if (n_except == 0) {
|
2013-12-21 18:17:45 +00:00
|
|
|
assert(MP_PARSE_NODE_IS_NULL(pn_else));
|
2014-04-10 18:28:54 +01:00
|
|
|
EMIT_ARG(adjust_stack_size, 3); // stack adjust for possible UNWIND_JUMP state
|
2013-10-04 19:53:11 +01:00
|
|
|
compile_node(comp, pn_body);
|
2014-04-10 18:28:54 +01:00
|
|
|
EMIT_ARG(adjust_stack_size, -3);
|
2013-10-04 19:53:11 +01:00
|
|
|
} else {
|
|
|
|
compile_try_except(comp, pn_body, n_except, pn_except, pn_else);
|
|
|
|
}
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
|
|
|
EMIT_ARG(label_assign, l_finally_block);
|
2013-10-04 19:53:11 +01:00
|
|
|
compile_node(comp, pn_finally);
|
2014-02-01 20:08:18 +00:00
|
|
|
|
2014-03-27 10:55:21 +00:00
|
|
|
compile_decrease_except_level(comp);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2014-12-10 22:07:04 +00:00
|
|
|
STATIC void compile_try_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
|
2015-02-27 14:25:47 +00:00
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT(pns->nodes[1])); // should be
|
|
|
|
{
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns2 = (mp_parse_node_struct_t *)pns->nodes[1];
|
|
|
|
if (MP_PARSE_NODE_STRUCT_KIND(pns2) == PN_try_stmt_finally) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// just try-finally
|
2013-12-21 18:17:45 +00:00
|
|
|
compile_try_finally(comp, pns->nodes[0], 0, NULL, MP_PARSE_NODE_NULL, pns2->nodes[0]);
|
|
|
|
} else if (MP_PARSE_NODE_STRUCT_KIND(pns2) == PN_try_stmt_except_and_more) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// try-except and possibly else and/or finally
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_t *pn_excepts;
|
2020-05-04 13:11:44 +01:00
|
|
|
size_t n_except = mp_parse_node_extract_list(&pns2->nodes[0], PN_try_stmt_except_list, &pn_excepts);
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pns2->nodes[2])) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// no finally
|
|
|
|
compile_try_except(comp, pns->nodes[0], n_except, pn_excepts, pns2->nodes[1]);
|
|
|
|
} else {
|
|
|
|
// have finally
|
2013-12-21 18:17:45 +00:00
|
|
|
compile_try_finally(comp, pns->nodes[0], n_except, pn_excepts, pns2->nodes[1], ((mp_parse_node_struct_t *)pns2->nodes[2])->nodes[0]);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
} else {
|
|
|
|
// just try-except
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_t *pn_excepts;
|
2020-05-04 13:11:44 +01:00
|
|
|
size_t n_except = mp_parse_node_extract_list(&pns->nodes[1], PN_try_stmt_except_list, &pn_excepts);
|
2013-12-21 18:17:45 +00:00
|
|
|
compile_try_except(comp, pns->nodes[0], n_except, pn_excepts, MP_PARSE_NODE_NULL);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-05-04 13:11:44 +01:00
|
|
|
STATIC void compile_with_stmt_helper(compiler_t *comp, size_t n, mp_parse_node_t *nodes, mp_parse_node_t body) {
|
2013-10-04 19:53:11 +01:00
|
|
|
if (n == 0) {
|
|
|
|
// no more pre-bits, compile the body of the with
|
|
|
|
compile_node(comp, body);
|
|
|
|
} else {
|
2014-04-10 14:11:31 +01:00
|
|
|
uint l_end = comp_next_label(comp);
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_STRUCT_KIND(nodes[0], PN_with_item)) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// this pre-bit is of the form "a as b"
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)nodes[0];
|
2013-10-04 19:53:11 +01:00
|
|
|
compile_node(comp, pns->nodes[0]);
|
2018-09-04 06:34:51 +01:00
|
|
|
compile_increase_except_level(comp, l_end, MP_EMIT_SETUP_BLOCK_WITH);
|
2013-10-04 19:53:11 +01:00
|
|
|
c_assign(comp, pns->nodes[1], ASSIGN_STORE);
|
|
|
|
} else {
|
|
|
|
// this pre-bit is just an expression
|
|
|
|
compile_node(comp, nodes[0]);
|
2018-09-04 06:34:51 +01:00
|
|
|
compile_increase_except_level(comp, l_end, MP_EMIT_SETUP_BLOCK_WITH);
|
2013-10-04 19:53:11 +01:00
|
|
|
EMIT(pop_top);
|
|
|
|
}
|
|
|
|
// compile additional pre-bits and the body
|
|
|
|
compile_with_stmt_helper(comp, n - 1, nodes + 1, body);
|
|
|
|
// finish this with block
|
2016-04-07 08:50:38 +01:00
|
|
|
EMIT_ARG(with_cleanup, l_end);
|
2018-09-04 05:31:28 +01:00
|
|
|
reserve_labels_for_native(comp, 3); // used by native's with_cleanup
|
2014-03-29 02:10:11 +00:00
|
|
|
compile_decrease_except_level(comp);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-12-10 22:07:04 +00:00
|
|
|
STATIC void compile_with_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// get the nodes for the pre-bit of the with (the a as b, c as d, ... bit)
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_t *nodes;
|
2020-05-04 13:11:44 +01:00
|
|
|
size_t n = mp_parse_node_extract_list(&pns->nodes[0], PN_with_stmt_list, &nodes);
|
2013-10-04 19:53:11 +01:00
|
|
|
assert(n > 0);
|
|
|
|
|
|
|
|
// compile in a nested fashion
|
|
|
|
compile_with_stmt_helper(comp, n, nodes, pns->nodes[1]);
|
|
|
|
}
|
|
|
|
|
2016-01-27 20:23:11 +00:00
|
|
|
STATIC void compile_yield_from(compiler_t *comp) {
|
2016-01-09 23:59:52 +00:00
|
|
|
EMIT_ARG(get_iter, false);
|
2016-01-27 20:23:11 +00:00
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
2018-05-18 15:30:42 +01:00
|
|
|
EMIT_ARG(yield, MP_EMIT_YIELD_FROM);
|
2018-10-01 04:07:04 +01:00
|
|
|
reserve_labels_for_native(comp, 3);
|
2016-01-27 20:23:11 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
#if MICROPY_PY_ASYNC_AWAIT
|
|
|
|
STATIC void compile_await_object_method(compiler_t *comp, qstr method) {
|
2017-04-19 00:45:59 +01:00
|
|
|
EMIT_ARG(load_method, method, false);
|
2016-01-27 20:23:11 +00:00
|
|
|
EMIT_ARG(call_method, 0, 0, 0);
|
|
|
|
compile_yield_from(comp);
|
|
|
|
}
|
|
|
|
|
|
|
|
STATIC void compile_async_for_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
|
|
|
|
// comp->break_label |= MP_EMIT_BREAK_FROM_FOR;
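// In Python-like terms the code emitted below behaves roughly as follows
// (a sketch; <it> stands for the name held in "context", and "await" stands
// for the load_method/call_method/yield-from sequence):
//     <it> = <iterator>.__aiter__()
//   continue:
//     try:
//         <var> = await <it>.__anext__()
//     except StopAsyncIteration:
//         goto while_else
//     <body>
//     goto continue
//   while_else:
//     <else>
//   break:
// Any exception other than StopAsyncIteration is re-raised by the handler.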
|
|
|
|
|
|
|
|
qstr context = MP_PARSE_NODE_LEAF_ARG(pns->nodes[1]);
|
|
|
|
uint while_else_label = comp_next_label(comp);
|
|
|
|
uint try_exception_label = comp_next_label(comp);
|
|
|
|
uint try_else_label = comp_next_label(comp);
|
|
|
|
uint try_finally_label = comp_next_label(comp);
|
|
|
|
|
|
|
|
compile_node(comp, pns->nodes[1]); // iterator
|
2020-07-21 18:47:28 +01:00
|
|
|
EMIT_ARG(load_method, MP_QSTR___aiter__, false);
|
|
|
|
EMIT_ARG(call_method, 0, 0, 0);
|
2016-01-27 20:23:11 +00:00
|
|
|
compile_store_id(comp, context);
|
|
|
|
|
|
|
|
START_BREAK_CONTINUE_BLOCK
|
|
|
|
|
|
|
|
EMIT_ARG(label_assign, continue_label);
|
|
|
|
|
2018-09-04 06:34:51 +01:00
|
|
|
compile_increase_except_level(comp, try_exception_label, MP_EMIT_SETUP_BLOCK_EXCEPT);
|
2016-01-27 20:23:11 +00:00
|
|
|
|
|
|
|
compile_load_id(comp, context);
|
|
|
|
compile_await_object_method(comp, MP_QSTR___anext__);
|
|
|
|
c_assign(comp, pns->nodes[0], ASSIGN_STORE); // variable
|
2019-02-15 01:18:59 +00:00
|
|
|
EMIT_ARG(pop_except_jump, try_else_label, false);
|
2016-01-27 20:23:11 +00:00
|
|
|
|
|
|
|
EMIT_ARG(label_assign, try_exception_label);
|
|
|
|
EMIT(start_except_handler);
|
|
|
|
EMIT(dup_top);
|
|
|
|
EMIT_LOAD_GLOBAL(MP_QSTR_StopAsyncIteration);
|
|
|
|
EMIT_ARG(binary_op, MP_BINARY_OP_EXCEPTION_MATCH);
|
|
|
|
EMIT_ARG(pop_jump_if, false, try_finally_label);
|
2016-09-28 02:52:13 +01:00
|
|
|
EMIT(pop_top); // pop exception instance
|
2019-02-15 01:18:59 +00:00
|
|
|
EMIT_ARG(pop_except_jump, while_else_label, true);
|
2016-01-27 20:23:11 +00:00
|
|
|
|
|
|
|
EMIT_ARG(label_assign, try_finally_label);
|
2016-09-28 02:52:13 +01:00
|
|
|
EMIT_ARG(adjust_stack_size, 1); // if we jump here, the exc is on the stack
|
2016-01-27 20:23:11 +00:00
|
|
|
compile_decrease_except_level(comp);
|
|
|
|
EMIT(end_except_handler);
|
|
|
|
|
|
|
|
EMIT_ARG(label_assign, try_else_label);
|
|
|
|
compile_node(comp, pns->nodes[2]); // body
|
|
|
|
|
|
|
|
EMIT_ARG(jump, continue_label);
|
|
|
|
// break/continue apply to outer loop (if any) in the else block
|
|
|
|
END_BREAK_CONTINUE_BLOCK
|
|
|
|
|
|
|
|
EMIT_ARG(label_assign, while_else_label);
|
|
|
|
compile_node(comp, pns->nodes[3]); // else
|
|
|
|
|
|
|
|
EMIT_ARG(label_assign, break_label);
|
|
|
|
}
|
|
|
|
|
2020-05-04 13:11:44 +01:00
|
|
|
STATIC void compile_async_with_stmt_helper(compiler_t *comp, size_t n, mp_parse_node_t *nodes, mp_parse_node_t body) {
|
2016-01-27 20:23:11 +00:00
|
|
|
if (n == 0) {
|
|
|
|
// no more pre-bits, compile the body of the with
|
|
|
|
compile_node(comp, body);
|
|
|
|
} else {
|
2018-06-23 13:32:09 +01:00
|
|
|
uint l_finally_block = comp_next_label(comp);
|
|
|
|
uint l_aexit_no_exc = comp_next_label(comp);
|
|
|
|
uint l_ret_unwind_jump = comp_next_label(comp);
|
|
|
|
uint l_end = comp_next_label(comp);
|
2016-01-27 20:23:11 +00:00
|
|
|
|
|
|
|
if (MP_PARSE_NODE_IS_STRUCT_KIND(nodes[0], PN_with_item)) {
|
|
|
|
// this pre-bit is of the form "a as b"
|
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)nodes[0];
|
|
|
|
compile_node(comp, pns->nodes[0]);
|
2018-06-23 13:32:09 +01:00
|
|
|
EMIT(dup_top);
|
2016-01-27 20:23:11 +00:00
|
|
|
compile_await_object_method(comp, MP_QSTR___aenter__);
|
|
|
|
c_assign(comp, pns->nodes[1], ASSIGN_STORE);
|
|
|
|
} else {
|
|
|
|
// this pre-bit is just an expression
|
|
|
|
compile_node(comp, nodes[0]);
|
2018-06-23 13:32:09 +01:00
|
|
|
EMIT(dup_top);
|
2016-01-27 20:23:11 +00:00
|
|
|
compile_await_object_method(comp, MP_QSTR___aenter__);
|
|
|
|
EMIT(pop_top);
|
|
|
|
}
|
|
|
|
|
2018-06-23 13:32:09 +01:00
|
|
|
// To keep the Python stack size down, and because we can't access values on
|
|
|
|
// this stack further down than 3 elements (via rot_three), we don't preload
|
|
|
|
// __aexit__ (as per normal with) but rather wait until we need it below.
|
2016-01-27 20:23:11 +00:00
|
|
|
|
2018-06-23 13:32:09 +01:00
|
|
|
// Start the try-finally statement
|
2018-09-04 06:34:51 +01:00
|
|
|
compile_increase_except_level(comp, l_finally_block, MP_EMIT_SETUP_BLOCK_FINALLY);
|
2018-06-23 13:32:09 +01:00
|
|
|
|
|
|
|
// Compile any additional pre-bits of the "async with", and also the body
|
|
|
|
EMIT_ARG(adjust_stack_size, 3); // stack adjust for possible UNWIND_JUMP state
|
2016-01-27 20:23:11 +00:00
|
|
|
compile_async_with_stmt_helper(comp, n - 1, nodes + 1, body);
|
2018-06-23 13:32:09 +01:00
|
|
|
EMIT_ARG(adjust_stack_size, -3);
|
|
|
|
|
py: Fix VM crash with unwinding jump out of a finally block.
This patch fixes a bug in the VM when breaking within a try-finally. The
bug has to do with executing a break within the finally block of a
try-finally statement. For example:
def f():
for x in (1,):
print('a', x)
try:
raise Exception
finally:
print(1)
break
print('b', x)
f()
Currently in uPy the above code will print:
a 1
1
1
segmentation fault (core dumped) micropython
Not only is there a seg fault, but the "1" in the finally block is printed
twice. This is because when the VM executes a finally block it doesn't
really know if that block was executed due to a fall-through of the try (no
exception raised), or because an exception is active. In particular, for
nested finallys the VM has no idea which of the nested ones have active
exceptions and which are just fall-throughs. So when a break (or continue)
is executed it tries to unwind all of the finallys, when in fact only some
may be active.
It's questionable whether break (or return or continue) should be allowed
within a finally block, because they implicitly swallow any active
exception, but nevertheless it's allowed by CPython (although almost never
used in the standard library). And uPy should at least not crash in such a
case.
The solution here relies on the fact that exception and finally handlers
always appear in the bytecode after the try body.
Note: there was a similar bug with a return in a finally block, but that
was previously fixed in b735208403a54774f9fd3d966f7c1a194c41870f
2019-01-02 06:48:43 +00:00
|
|
|
// We have now finished the "try" block and fall through to the "finally"
|
2016-01-27 20:23:11 +00:00
|
|
|
|
2018-06-23 13:32:09 +01:00
|
|
|
// At this point, after the with body has executed, we have 3 cases:
|
|
|
|
// 1. no exception, we just fall through to this point; stack: (..., ctx_mgr)
|
|
|
|
// 2. exception propagating out, we get to the finally block; stack: (..., ctx_mgr, exc)
|
|
|
|
// 3. return or unwind jump, we get to the finally block; stack: (..., ctx_mgr, X, INT)
|
|
|
|
|
|
|
|
// Handle case 1: call __aexit__
|
|
|
|
// Stack: (..., ctx_mgr)
|
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE); // to tell end_finally there's no exception
|
|
|
|
EMIT(rot_two);
|
|
|
|
EMIT_ARG(jump, l_aexit_no_exc); // jump to code below to call __aexit__
|
|
|
|
|
|
|
|
// Start of "finally" block
|
|
|
|
// At this point we have case 2 or 3, we detect which one by the TOS being an exception or not
|
|
|
|
EMIT_ARG(label_assign, l_finally_block);
|
2016-09-28 02:52:13 +01:00
|
|
|
|
2018-06-23 13:32:09 +01:00
|
|
|
// Detect if TOS is an exception or not
|
|
|
|
EMIT(dup_top);
|
2019-02-26 02:47:14 +00:00
|
|
|
EMIT_LOAD_GLOBAL(MP_QSTR_BaseException);
|
2018-06-23 13:32:09 +01:00
|
|
|
EMIT_ARG(binary_op, MP_BINARY_OP_EXCEPTION_MATCH);
|
|
|
|
EMIT_ARG(pop_jump_if, false, l_ret_unwind_jump); // if not an exception then we have case 3
|
|
|
|
|
|
|
|
// Handle case 2: call __aexit__ and either swallow or re-raise the exception
|
|
|
|
// Stack: (..., ctx_mgr, exc)
|
|
|
|
EMIT(dup_top);
|
|
|
|
EMIT(rot_three);
|
|
|
|
EMIT(rot_two);
|
|
|
|
EMIT_ARG(load_method, MP_QSTR___aexit__, false);
|
|
|
|
EMIT(rot_three);
|
|
|
|
EMIT(rot_three);
|
2016-09-28 02:52:13 +01:00
|
|
|
EMIT(dup_top);
|
|
|
|
#if MICROPY_CPYTHON_COMPAT
|
2018-05-22 12:43:41 +01:00
|
|
|
EMIT_ARG(attr, MP_QSTR___class__, MP_EMIT_ATTR_LOAD); // get type(exc)
|
2016-09-28 02:52:13 +01:00
|
|
|
#else
|
|
|
|
compile_load_id(comp, MP_QSTR_type);
|
|
|
|
EMIT(rot_two);
|
|
|
|
EMIT_ARG(call_function, 1, 0, 0); // get type(exc)
|
|
|
|
#endif
|
2016-01-27 20:23:11 +00:00
|
|
|
EMIT(rot_two);
|
2016-09-28 02:52:13 +01:00
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE); // dummy traceback value
|
2018-06-23 13:32:09 +01:00
|
|
|
// Stack: (..., exc, __aexit__, ctx_mgr, type(exc), exc, None)
|
2016-01-27 20:23:11 +00:00
|
|
|
EMIT_ARG(call_method, 3, 0, 0);
|
|
|
|
compile_yield_from(comp);
|
2018-06-23 13:32:09 +01:00
|
|
|
EMIT_ARG(pop_jump_if, false, l_end);
|
|
|
|
EMIT(pop_top); // pop exception
|
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE); // replace with None to swallow exception
|
|
|
|
EMIT_ARG(jump, l_end);
|
|
|
|
EMIT_ARG(adjust_stack_size, 2);
|
|
|
|
|
|
|
|
// Handle case 3: call __aexit__
|
|
|
|
// Stack: (..., ctx_mgr, X, INT)
|
|
|
|
EMIT_ARG(label_assign, l_ret_unwind_jump);
|
|
|
|
EMIT(rot_three);
|
|
|
|
EMIT(rot_three);
|
|
|
|
EMIT_ARG(label_assign, l_aexit_no_exc);
|
|
|
|
EMIT_ARG(load_method, MP_QSTR___aexit__, false);
|
2016-01-27 20:23:11 +00:00
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
|
|
|
EMIT(dup_top);
|
|
|
|
EMIT(dup_top);
|
|
|
|
EMIT_ARG(call_method, 3, 0, 0);
|
|
|
|
compile_yield_from(comp);
|
|
|
|
EMIT(pop_top);
|
2018-06-23 13:32:09 +01:00
|
|
|
EMIT_ARG(adjust_stack_size, -1);
|
|
|
|
|
|
|
|
// End of "finally" block
|
|
|
|
// Stack can have one of three configurations:
|
|
|
|
// a. (..., None) - from either case 1, or case 2 with swallowed exception
|
|
|
|
// b. (..., exc) - from case 2 with re-raised exception
|
|
|
|
// c. (..., X, INT) - from case 3
|
|
|
|
EMIT_ARG(label_assign, l_end);
|
|
|
|
compile_decrease_except_level(comp);
|
2016-01-27 20:23:11 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
STATIC void compile_async_with_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
|
|
|
|
// get the nodes for the pre-bit of the with (the a as b, c as d, ... bit)
|
|
|
|
mp_parse_node_t *nodes;
|
2020-05-04 13:11:44 +01:00
|
|
|
size_t n = mp_parse_node_extract_list(&pns->nodes[0], PN_with_stmt_list, &nodes);
|
2016-01-27 20:23:11 +00:00
|
|
|
assert(n > 0);
|
|
|
|
|
|
|
|
// compile in a nested fashion
|
|
|
|
compile_async_with_stmt_helper(comp, n, nodes, pns->nodes[1]);
|
|
|
|
}
|
|
|
|
|
|
|
|
STATIC void compile_async_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
|
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT(pns->nodes[0]));
|
|
|
|
mp_parse_node_struct_t *pns0 = (mp_parse_node_struct_t *)pns->nodes[0];
|
|
|
|
if (MP_PARSE_NODE_STRUCT_KIND(pns0) == PN_funcdef) {
|
|
|
|
// async def
|
|
|
|
compile_funcdef(comp, pns0);
|
|
|
|
scope_t *fscope = (scope_t *)pns0->nodes[4];
|
|
|
|
fscope->scope_flags |= MP_SCOPE_FLAG_GENERATOR;
|
|
|
|
} else {
|
2020-03-01 15:40:43 +00:00
|
|
|
// async for/with; first verify the scope is a generator
|
|
|
|
int scope_flags = comp->scope_cur->scope_flags;
|
|
|
|
if (!(scope_flags & MP_SCOPE_FLAG_GENERATOR)) {
|
|
|
|
compile_syntax_error(comp, (mp_parse_node_t)pns0,
|
|
|
|
MP_ERROR_TEXT("async for/with outside async function"));
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (MP_PARSE_NODE_STRUCT_KIND(pns0) == PN_for_stmt) {
|
|
|
|
// async for
|
|
|
|
compile_async_for_stmt(comp, pns0);
|
|
|
|
} else {
|
|
|
|
// async with
|
|
|
|
assert(MP_PARSE_NODE_STRUCT_KIND(pns0) == PN_with_stmt);
|
|
|
|
compile_async_with_stmt(comp, pns0);
|
|
|
|
}
|
2016-01-27 20:23:11 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2014-12-10 22:07:04 +00:00
|
|
|
STATIC void compile_expr_stmt(compiler_t *comp, mp_parse_node_struct_t *pns) {
    mp_parse_node_t pn_rhs = pns->nodes[1];
    if (MP_PARSE_NODE_IS_NULL(pn_rhs)) {
        if (comp->is_repl && comp->scope_cur->kind == SCOPE_MODULE) {
            // for REPL, evaluate then print the expression
            compile_load_id(comp, MP_QSTR___repl_print__);
            compile_node(comp, pns->nodes[0]);
            EMIT_ARG(call_function, 1, 0, 0);
            EMIT(pop_top);

        } else {
            // for non-REPL, evaluate then discard the expression
            if ((MP_PARSE_NODE_IS_LEAF(pns->nodes[0]) && !MP_PARSE_NODE_IS_ID(pns->nodes[0]))
                || MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_const_object)) {
                // do nothing with a lonely constant
            } else {
                compile_node(comp, pns->nodes[0]); // just an expression
                EMIT(pop_top); // discard last result since this is a statement and leaves nothing on the stack
            }
        }
    } else if (MP_PARSE_NODE_IS_STRUCT(pn_rhs)) {
        mp_parse_node_struct_t *pns1 = (mp_parse_node_struct_t *)pn_rhs;
        int kind = MP_PARSE_NODE_STRUCT_KIND(pns1);
        if (kind == PN_annassign) {
            // the annotation is in pns1->nodes[0] and is ignored
            if (MP_PARSE_NODE_IS_NULL(pns1->nodes[1])) {
                // an annotation of the form "x: y"
                // inside a function this declares "x" as a local
                if (comp->scope_cur->kind == SCOPE_FUNCTION) {
                    if (MP_PARSE_NODE_IS_ID(pns->nodes[0])) {
                        qstr lhs = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]);
                        scope_find_or_add_id(comp->scope_cur, lhs, ID_INFO_KIND_LOCAL);
                    }
                }
            } else {
                // an assigned annotation of the form "x: y = z"
                pn_rhs = pns1->nodes[1];
                goto plain_assign;
            }
        } else if (kind == PN_expr_stmt_augassign) {
            c_assign(comp, pns->nodes[0], ASSIGN_AUG_LOAD); // lhs load for aug assign
            compile_node(comp, pns1->nodes[1]); // rhs
            assert(MP_PARSE_NODE_IS_TOKEN(pns1->nodes[0]));
            mp_token_kind_t tok = MP_PARSE_NODE_LEAF_ARG(pns1->nodes[0]);
            mp_binary_op_t op = MP_BINARY_OP_INPLACE_OR + (tok - MP_TOKEN_DEL_PIPE_EQUAL);
            EMIT_ARG(binary_op, op);
            c_assign(comp, pns->nodes[0], ASSIGN_AUG_STORE); // lhs store for aug assign
        } else if (kind == PN_expr_stmt_assign_list) {
            int rhs = MP_PARSE_NODE_STRUCT_NUM_NODES(pns1) - 1;
            compile_node(comp, pns1->nodes[rhs]); // rhs
            // following CPython, we store left-most first
            if (rhs > 0) {
                EMIT(dup_top);
            }
            c_assign(comp, pns->nodes[0], ASSIGN_STORE); // lhs store
            for (int i = 0; i < rhs; i++) {
                if (i + 1 < rhs) {
                    EMIT(dup_top);
                }
                c_assign(comp, pns1->nodes[i], ASSIGN_STORE); // middle store
            }
        } else {
        plain_assign:
            #if MICROPY_COMP_DOUBLE_TUPLE_ASSIGN
            if (MP_PARSE_NODE_IS_STRUCT_KIND(pn_rhs, PN_testlist_star_expr)
                && MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_testlist_star_expr)) {
                mp_parse_node_struct_t *pns0 = (mp_parse_node_struct_t *)pns->nodes[0];
                pns1 = (mp_parse_node_struct_t *)pn_rhs;
                uint32_t n_pns0 = MP_PARSE_NODE_STRUCT_NUM_NODES(pns0);
                // Can only optimise a tuple-to-tuple assignment when all of the following hold:
                // - equal number of items in LHS and RHS tuples
                // - 2 or 3 items in the tuples
                // - there are no star expressions in the LHS tuple
                if (n_pns0 == MP_PARSE_NODE_STRUCT_NUM_NODES(pns1)
                    && (n_pns0 == 2
                        #if MICROPY_COMP_TRIPLE_TUPLE_ASSIGN
                        || n_pns0 == 3
                        #endif
                        )
                    && !MP_PARSE_NODE_IS_STRUCT_KIND(pns0->nodes[0], PN_star_expr)
                    && !MP_PARSE_NODE_IS_STRUCT_KIND(pns0->nodes[1], PN_star_expr)
                    #if MICROPY_COMP_TRIPLE_TUPLE_ASSIGN
                    && (n_pns0 == 2 || !MP_PARSE_NODE_IS_STRUCT_KIND(pns0->nodes[2], PN_star_expr))
                    #endif
                    ) {
                    // Optimisation for a, b = c, d or a, b, c = d, e, f
                    compile_node(comp, pns1->nodes[0]); // rhs
                    compile_node(comp, pns1->nodes[1]); // rhs
                    #if MICROPY_COMP_TRIPLE_TUPLE_ASSIGN
                    if (n_pns0 == 3) {
                        compile_node(comp, pns1->nodes[2]); // rhs
                        EMIT(rot_three);
                    }
                    #endif
                    EMIT(rot_two);
                    c_assign(comp, pns0->nodes[0], ASSIGN_STORE); // lhs store
                    c_assign(comp, pns0->nodes[1], ASSIGN_STORE); // lhs store
                    #if MICROPY_COMP_TRIPLE_TUPLE_ASSIGN
                    if (n_pns0 == 3) {
                        c_assign(comp, pns0->nodes[2], ASSIGN_STORE); // lhs store
                    }
                    #endif
                    return;
                }
            }
            #endif

            compile_node(comp, pn_rhs); // rhs
            c_assign(comp, pns->nodes[0], ASSIGN_STORE); // lhs store
        }
    } else {
        goto plain_assign;
    }
}

STATIC void compile_test_if_expr(compiler_t *comp, mp_parse_node_struct_t *pns) {
    assert(MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[1], PN_test_if_else));
    mp_parse_node_struct_t *pns_test_if_else = (mp_parse_node_struct_t *)pns->nodes[1];

    uint l_fail = comp_next_label(comp);
    uint l_end = comp_next_label(comp);
    c_if_cond(comp, pns_test_if_else->nodes[0], false, l_fail); // condition
    compile_node(comp, pns->nodes[0]); // success value
    EMIT_ARG(jump, l_end);
    EMIT_ARG(label_assign, l_fail);
    EMIT_ARG(adjust_stack_size, -1); // the success value is not on the stack on the failure path
    compile_node(comp, pns_test_if_else->nodes[1]); // failure value
    EMIT_ARG(label_assign, l_end);
}

STATIC void compile_lambdef(compiler_t *comp, mp_parse_node_struct_t *pns) {
    if (comp->pass == MP_PASS_SCOPE) {
        // create a new scope for this lambda
        scope_t *s = scope_new_and_link(comp, SCOPE_LAMBDA, (mp_parse_node_t)pns, comp->scope_cur->emit_options);
        // store the lambda scope so the compiling function (this one) can use it at each pass
        pns->nodes[2] = (mp_parse_node_t)s;
    }

    // get the scope for this lambda
    scope_t *this_scope = (scope_t *)pns->nodes[2];

    // compile the lambda definition
    compile_funcdef_lambdef(comp, this_scope, pns->nodes[0], PN_varargslist);
}

#if MICROPY_PY_ASSIGN_EXPR
STATIC void compile_namedexpr_helper(compiler_t *comp, mp_parse_node_t pn_name, mp_parse_node_t pn_expr) {
    if (!MP_PARSE_NODE_IS_ID(pn_name)) {
        compile_syntax_error(comp, (mp_parse_node_t)pn_name, MP_ERROR_TEXT("can't assign to expression"));
        return; // because pn_name is not a valid qstr (in compile_store_id below)
    }
    compile_node(comp, pn_expr);
    EMIT(dup_top);
    scope_t *old_scope = comp->scope_cur;
    if (SCOPE_IS_COMP_LIKE(comp->scope_cur->kind)) {
        // Use parent's scope for assigned value so it can "escape"
        comp->scope_cur = comp->scope_cur->parent;
    }
    compile_store_id(comp, MP_PARSE_NODE_LEAF_ARG(pn_name));
    comp->scope_cur = old_scope;
}

STATIC void compile_namedexpr(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_namedexpr_helper(comp, pns->nodes[0], pns->nodes[1]);
}
#endif

STATIC void compile_or_and_test(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // cond is the truth value that short-circuits: true for "or", false for "and"
    bool cond = MP_PARSE_NODE_STRUCT_KIND(pns) == PN_or_test;
    uint l_end = comp_next_label(comp);
    int n = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
    for (int i = 0; i < n; i += 1) {
        compile_node(comp, pns->nodes[i]);
        if (i + 1 < n) {
            // keep the value and jump to the end if it decides the result, else pop it
            EMIT_ARG(jump_if_or_pop, cond, l_end);
        }
    }
    EMIT_ARG(label_assign, l_end);
}

STATIC void compile_not_test_2(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_node(comp, pns->nodes[0]);
    EMIT_ARG(unary_op, MP_UNARY_OP_NOT);
}

STATIC void compile_comparison(compiler_t *comp, mp_parse_node_struct_t *pns) {
    int num_nodes = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
    compile_node(comp, pns->nodes[0]);
    // more than 3 nodes means a chained comparison such as a < b < c
    bool multi = (num_nodes > 3);
    uint l_fail = 0;
    if (multi) {
        l_fail = comp_next_label(comp);
    }
    for (int i = 1; i + 1 < num_nodes; i += 2) {
        compile_node(comp, pns->nodes[i + 1]);
        if (i + 2 < num_nodes) {
            // keep a copy of the right operand beneath the pair for the next comparison
            EMIT(dup_top);
            EMIT(rot_three);
        }
        if (MP_PARSE_NODE_IS_TOKEN(pns->nodes[i])) {
            mp_token_kind_t tok = MP_PARSE_NODE_LEAF_ARG(pns->nodes[i]);
            mp_binary_op_t op;
            if (tok == MP_TOKEN_KW_IN) {
                op = MP_BINARY_OP_IN;
            } else {
                op = MP_BINARY_OP_LESS + (tok - MP_TOKEN_OP_LESS);
            }
            EMIT_ARG(binary_op, op);
        } else {
            assert(MP_PARSE_NODE_IS_STRUCT(pns->nodes[i])); // should be
            mp_parse_node_struct_t *pns2 = (mp_parse_node_struct_t *)pns->nodes[i];
            int kind = MP_PARSE_NODE_STRUCT_KIND(pns2);
            if (kind == PN_comp_op_not_in) {
                EMIT_ARG(binary_op, MP_BINARY_OP_NOT_IN);
            } else {
                assert(kind == PN_comp_op_is); // should be
                if (MP_PARSE_NODE_IS_NULL(pns2->nodes[0])) {
                    EMIT_ARG(binary_op, MP_BINARY_OP_IS);
                } else {
                    EMIT_ARG(binary_op, MP_BINARY_OP_IS_NOT);
                }
            }
        }
        if (i + 2 < num_nodes) {
            EMIT_ARG(jump_if_or_pop, false, l_fail);
        }
    }
    if (multi) {
        uint l_end = comp_next_label(comp);
        EMIT_ARG(jump, l_end);
        EMIT_ARG(label_assign, l_fail);
        // on the failure path the saved operand is still beneath the false result; drop it
        EMIT_ARG(adjust_stack_size, 1);
        EMIT(rot_two);
        EMIT(pop_top);
        EMIT_ARG(label_assign, l_end);
    }
}

STATIC void compile_star_expr(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("*x must be assignment target"));
}

STATIC void compile_binary_op(compiler_t *comp, mp_parse_node_struct_t *pns) {
    MP_STATIC_ASSERT(MP_BINARY_OP_OR + PN_xor_expr - PN_expr == MP_BINARY_OP_XOR);
    MP_STATIC_ASSERT(MP_BINARY_OP_OR + PN_and_expr - PN_expr == MP_BINARY_OP_AND);
    mp_binary_op_t binary_op = MP_BINARY_OP_OR + MP_PARSE_NODE_STRUCT_KIND(pns) - PN_expr;
    int num_nodes = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
    compile_node(comp, pns->nodes[0]);
    for (int i = 1; i < num_nodes; ++i) {
        compile_node(comp, pns->nodes[i]);
        EMIT_ARG(binary_op, binary_op);
    }
}

STATIC void compile_term(compiler_t *comp, mp_parse_node_struct_t *pns) {
    int num_nodes = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
    compile_node(comp, pns->nodes[0]);
    for (int i = 1; i + 1 < num_nodes; i += 2) {
        compile_node(comp, pns->nodes[i + 1]);
        mp_token_kind_t tok = MP_PARSE_NODE_LEAF_ARG(pns->nodes[i]);
        mp_binary_op_t op = MP_BINARY_OP_LSHIFT + (tok - MP_TOKEN_OP_DBL_LESS);
        EMIT_ARG(binary_op, op);
    }
}

STATIC void compile_factor_2(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_node(comp, pns->nodes[1]);
    mp_token_kind_t tok = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]);
    mp_unary_op_t op;
    if (tok == MP_TOKEN_OP_TILDE) {
        op = MP_UNARY_OP_INVERT;
    } else {
        assert(tok == MP_TOKEN_OP_PLUS || tok == MP_TOKEN_OP_MINUS);
        op = MP_UNARY_OP_POSITIVE + (tok - MP_TOKEN_OP_PLUS);
    }
    EMIT_ARG(unary_op, op);
}

STATIC void compile_atom_expr_normal(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // compile the subject of the expression
    compile_node(comp, pns->nodes[0]);

    // compile_atom_expr_await may call us with a NULL node
    if (MP_PARSE_NODE_IS_NULL(pns->nodes[1])) {
        return;
    }

    // get the array of trailers (known to be an array of PARSE_NODE_STRUCT)
    size_t num_trail = 1;
    mp_parse_node_struct_t **pns_trail = (mp_parse_node_struct_t **)&pns->nodes[1];
    if (MP_PARSE_NODE_STRUCT_KIND(pns_trail[0]) == PN_atom_expr_trailers) {
        num_trail = MP_PARSE_NODE_STRUCT_NUM_NODES(pns_trail[0]);
        pns_trail = (mp_parse_node_struct_t **)&pns_trail[0]->nodes[0];
    }

    // the current index into the array of trailers
    size_t i = 0;

    // handle special super() call
    if (comp->scope_cur->kind == SCOPE_FUNCTION
        && MP_PARSE_NODE_IS_ID(pns->nodes[0])
        && MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]) == MP_QSTR_super
        && MP_PARSE_NODE_STRUCT_KIND(pns_trail[0]) == PN_trailer_paren
        && MP_PARSE_NODE_IS_NULL(pns_trail[0]->nodes[0])) {
        // at this point we have matched "super()" within a function

        // load the class for super to search for a parent
        compile_load_id(comp, MP_QSTR___class__);

        // look for first argument to function (assumes it's "self")
        bool found = false;
        id_info_t *id = &comp->scope_cur->id_info[0];
        for (size_t n = comp->scope_cur->id_info_len; n > 0; --n, ++id) {
            if (id->flags & ID_FLAG_IS_PARAM) {
                // first argument found; load it
                compile_load_id(comp, id->qst);
                found = true;
                break;
            }
        }
        if (!found) {
            compile_syntax_error(comp, (mp_parse_node_t)pns_trail[0],
                MP_ERROR_TEXT("super() can't find self")); // really a TypeError
            return;
        }

        if (num_trail >= 3
            && MP_PARSE_NODE_STRUCT_KIND(pns_trail[1]) == PN_trailer_period
            && MP_PARSE_NODE_STRUCT_KIND(pns_trail[2]) == PN_trailer_paren) {
            // optimisation for method calls super().f(...), to eliminate heap allocation
            mp_parse_node_struct_t *pns_period = pns_trail[1];
            mp_parse_node_struct_t *pns_paren = pns_trail[2];
            EMIT_ARG(load_method, MP_PARSE_NODE_LEAF_ARG(pns_period->nodes[0]), true);
            compile_trailer_paren_helper(comp, pns_paren->nodes[0], true, 0);
            i = 3;
        } else {
            // a super() call
            EMIT_ARG(call_function, 2, 0, 0);
            i = 1;
        }

        #if MICROPY_COMP_CONST_LITERAL && MICROPY_PY_COLLECTIONS_ORDEREDDICT
        // handle special OrderedDict constructor
    } else if (MP_PARSE_NODE_IS_ID(pns->nodes[0])
               && MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]) == MP_QSTR_OrderedDict
               && MP_PARSE_NODE_STRUCT_KIND(pns_trail[0]) == PN_trailer_paren
               && MP_PARSE_NODE_IS_STRUCT_KIND(pns_trail[0]->nodes[0], PN_atom_brace)) {
        // at this point we have matched "OrderedDict({...})"

        EMIT_ARG(call_function, 0, 0, 0);
        mp_parse_node_struct_t *pns_dict = (mp_parse_node_struct_t *)pns_trail[0]->nodes[0];
        compile_atom_brace_helper(comp, pns_dict, false);
        i = 1;
        #endif
    }

    // compile the remaining trailers
    for (; i < num_trail; i++) {
        if (i + 1 < num_trail
            && MP_PARSE_NODE_STRUCT_KIND(pns_trail[i]) == PN_trailer_period
            && MP_PARSE_NODE_STRUCT_KIND(pns_trail[i + 1]) == PN_trailer_paren) {
            // optimisation for method calls a.f(...), following PyPy
            mp_parse_node_struct_t *pns_period = pns_trail[i];
            mp_parse_node_struct_t *pns_paren = pns_trail[i + 1];
            EMIT_ARG(load_method, MP_PARSE_NODE_LEAF_ARG(pns_period->nodes[0]), false);
            compile_trailer_paren_helper(comp, pns_paren->nodes[0], true, 0);
            i += 1;
        } else {
            // node is one of: trailer_paren, trailer_bracket, trailer_period
            compile_node(comp, (mp_parse_node_t)pns_trail[i]);
        }
    }
}

STATIC void compile_power(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_generic_all_nodes(comp, pns); // 2 nodes, arguments of power
    EMIT_ARG(binary_op, MP_BINARY_OP_POWER);
}

STATIC void compile_trailer_paren_helper(compiler_t *comp, mp_parse_node_t pn_arglist, bool is_method_call, int n_positional_extra) {
    // function to call is on top of stack

    // get the list of arguments
    mp_parse_node_t *args;
    size_t n_args = mp_parse_node_extract_list(&pn_arglist, PN_arglist, &args);

    // compile the arguments
    // Rather than calling compile_node on the list, we go through the list of args
    // explicitly here so that we can count the number of arguments and give sensible
    // error messages.
    int n_positional = n_positional_extra;
    uint n_keyword = 0;
    uint star_flags = 0;
    mp_parse_node_struct_t *star_args_node = NULL, *dblstar_args_node = NULL;
    for (size_t i = 0; i < n_args; i++) {
        if (MP_PARSE_NODE_IS_STRUCT(args[i])) {
            mp_parse_node_struct_t *pns_arg = (mp_parse_node_struct_t *)args[i];
            if (MP_PARSE_NODE_STRUCT_KIND(pns_arg) == PN_arglist_star) {
                if (star_flags & MP_EMIT_STAR_FLAG_SINGLE) {
                    compile_syntax_error(comp, (mp_parse_node_t)pns_arg, MP_ERROR_TEXT("can't have multiple *x"));
                    return;
                }
                star_flags |= MP_EMIT_STAR_FLAG_SINGLE;
                star_args_node = pns_arg;
            } else if (MP_PARSE_NODE_STRUCT_KIND(pns_arg) == PN_arglist_dbl_star) {
                if (star_flags & MP_EMIT_STAR_FLAG_DOUBLE) {
                    compile_syntax_error(comp, (mp_parse_node_t)pns_arg, MP_ERROR_TEXT("can't have multiple **x"));
                    return;
                }
                star_flags |= MP_EMIT_STAR_FLAG_DOUBLE;
                dblstar_args_node = pns_arg;
            } else if (MP_PARSE_NODE_STRUCT_KIND(pns_arg) == PN_argument) {
                #if MICROPY_PY_ASSIGN_EXPR
                if (MP_PARSE_NODE_IS_STRUCT_KIND(pns_arg->nodes[1], PN_argument_3)) {
                    compile_namedexpr_helper(comp, pns_arg->nodes[0], ((mp_parse_node_struct_t *)pns_arg->nodes[1])->nodes[0]);
                    n_positional++;
                } else
                #endif
                if (!MP_PARSE_NODE_IS_STRUCT_KIND(pns_arg->nodes[1], PN_comp_for)) {
                    if (!MP_PARSE_NODE_IS_ID(pns_arg->nodes[0])) {
                        compile_syntax_error(comp, (mp_parse_node_t)pns_arg, MP_ERROR_TEXT("LHS of keyword arg must be an id"));
                        return;
                    }
                    EMIT_ARG(load_const_str, MP_PARSE_NODE_LEAF_ARG(pns_arg->nodes[0]));
                    compile_node(comp, pns_arg->nodes[1]);
                    n_keyword += 1;
                } else {
                    compile_comprehension(comp, pns_arg, SCOPE_GEN_EXPR);
                    n_positional++;
                }
            } else {
                goto normal_argument;
            }
        } else {
        normal_argument:
            if (star_flags) {
                compile_syntax_error(comp, args[i], MP_ERROR_TEXT("non-keyword arg after */**"));
                return;
            }
            if (n_keyword > 0) {
                compile_syntax_error(comp, args[i], MP_ERROR_TEXT("non-keyword arg after keyword arg"));
                return;
            }
            compile_node(comp, args[i]);
            n_positional++;
        }
    }

    // compile the star/double-star arguments if we had them
    // if we had one but not the other then we load "null" as a place holder
    if (star_flags != 0) {
        if (star_args_node == NULL) {
            EMIT(load_null);
        } else {
            compile_node(comp, star_args_node->nodes[0]);
        }
        if (dblstar_args_node == NULL) {
            EMIT(load_null);
        } else {
            compile_node(comp, dblstar_args_node->nodes[0]);
        }
    }

    // emit the function/method call
    if (is_method_call) {
        EMIT_ARG(call_method, n_positional, n_keyword, star_flags);
    } else {
        EMIT_ARG(call_function, n_positional, n_keyword, star_flags);
    }
}

// pns needs to have 2 nodes, first is lhs of comprehension, second is PN_comp_for node
STATIC void compile_comprehension(compiler_t *comp, mp_parse_node_struct_t *pns, scope_kind_t kind) {
    assert(MP_PARSE_NODE_STRUCT_NUM_NODES(pns) == 2);
    assert(MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[1], PN_comp_for));
    mp_parse_node_struct_t *pns_comp_for = (mp_parse_node_struct_t *)pns->nodes[1];

    if (comp->pass == MP_PASS_SCOPE) {
        // create a new scope for this comprehension
        scope_t *s = scope_new_and_link(comp, kind, (mp_parse_node_t)pns, comp->scope_cur->emit_options);
        // store the comprehension scope so the compiling function (this one) can use it at each pass
        pns_comp_for->nodes[3] = (mp_parse_node_t)s;
    }

    // get the scope for this comprehension
    scope_t *this_scope = (scope_t *)pns_comp_for->nodes[3];

    // compile the comprehension
    close_over_variables_etc(comp, this_scope, 0, 0);

    compile_node(comp, pns_comp_for->nodes[1]); // source of the iterator
    if (kind == SCOPE_GEN_EXPR) {
        EMIT_ARG(get_iter, false);
    }
    EMIT_ARG(call_function, 1, 0, 0);
}

STATIC void compile_atom_paren(compiler_t *comp, mp_parse_node_struct_t *pns) {
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// an empty tuple
|
2021-03-23 01:48:35 +00:00
|
|
|
EMIT_ARG(build, 0, MP_EMIT_BUILD_TUPLE);
|
2016-01-07 13:07:52 +00:00
|
|
|
} else {
|
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_testlist_comp));
|
2013-12-21 18:17:45 +00:00
|
|
|
pns = (mp_parse_node_struct_t *)pns->nodes[0];
|
2021-03-23 01:48:35 +00:00
|
|
|
if (MP_PARSE_NODE_TESTLIST_COMP_HAS_COMP_FOR(pns)) {
|
|
|
|
// generator expression
|
|
|
|
compile_comprehension(comp, pns, SCOPE_GEN_EXPR);
|
2013-10-04 19:53:11 +01:00
|
|
|
} else {
|
2021-03-23 01:48:35 +00:00
|
|
|
// tuple with N items
|
|
|
|
compile_generic_tuple(comp, pns);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}

STATIC void compile_atom_bracket(compiler_t *comp, mp_parse_node_struct_t *pns) {
    if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
        // empty list
        EMIT_ARG(build, 0, MP_EMIT_BUILD_LIST);
    } else if (MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_testlist_comp)) {
        mp_parse_node_struct_t *pns2 = (mp_parse_node_struct_t *)pns->nodes[0];
        if (MP_PARSE_NODE_TESTLIST_COMP_HAS_COMP_FOR(pns2)) {
            // list comprehension
            compile_comprehension(comp, pns2, SCOPE_LIST_COMP);
        } else {
            // list with N items
            compile_generic_all_nodes(comp, pns2);
            EMIT_ARG(build, MP_PARSE_NODE_STRUCT_NUM_NODES(pns2), MP_EMIT_BUILD_LIST);
        }
    } else {
        // list with 1 item
        compile_node(comp, pns->nodes[0]);
        EMIT_ARG(build, 1, MP_EMIT_BUILD_LIST);
    }
}

STATIC void compile_atom_brace_helper(compiler_t *comp, mp_parse_node_struct_t *pns, bool create_map) {
    mp_parse_node_t pn = pns->nodes[0];
    if (MP_PARSE_NODE_IS_NULL(pn)) {
        // empty dict
        if (create_map) {
            EMIT_ARG(build, 0, MP_EMIT_BUILD_MAP);
        }
    } else if (MP_PARSE_NODE_IS_STRUCT(pn)) {
        pns = (mp_parse_node_struct_t *)pn;
        if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_dictorsetmaker_item) {
            // dict with one element
            if (create_map) {
                EMIT_ARG(build, 1, MP_EMIT_BUILD_MAP);
            }
            compile_node(comp, pn);
            EMIT(store_map);
        } else if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_dictorsetmaker) {
            assert(MP_PARSE_NODE_IS_STRUCT(pns->nodes[1])); // should succeed
            mp_parse_node_struct_t *pns1 = (mp_parse_node_struct_t *)pns->nodes[1];
            if (MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_dictorsetmaker_list) {
                // dict/set with multiple elements

                // get tail elements (2nd, 3rd, ...)
                mp_parse_node_t *nodes;
                size_t n = mp_parse_node_extract_list(&pns1->nodes[0], PN_dictorsetmaker_list2, &nodes);

                // first element sets whether it's a dict or set
                bool is_dict;
                if (!MICROPY_PY_BUILTINS_SET || MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_dictorsetmaker_item)) {
                    // a dictionary
                    if (create_map) {
                        EMIT_ARG(build, 1 + n, MP_EMIT_BUILD_MAP);
                    }
                    compile_node(comp, pns->nodes[0]);
                    EMIT(store_map);
                    is_dict = true;
                } else {
                    // a set
                    compile_node(comp, pns->nodes[0]); // 1st value of set
                    is_dict = false;
                }

                // process rest of elements
                for (size_t i = 0; i < n; i++) {
                    mp_parse_node_t pn_i = nodes[i];
                    bool is_key_value = MP_PARSE_NODE_IS_STRUCT_KIND(pn_i, PN_dictorsetmaker_item);
                    compile_node(comp, pn_i);
                    if (is_dict) {
                        if (!is_key_value) {
                            #if MICROPY_ERROR_REPORTING <= MICROPY_ERROR_REPORTING_TERSE
                            compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("invalid syntax"));
                            #else
                            compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("expecting key:value for dict"));
                            #endif
                            return;
                        }
                        EMIT(store_map);
                    } else {
                        if (is_key_value) {
                            #if MICROPY_ERROR_REPORTING <= MICROPY_ERROR_REPORTING_TERSE
                            compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("invalid syntax"));
                            #else
                            compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("expecting just a value for set"));
                            #endif
                            return;
                        }
                    }
                }

                #if MICROPY_PY_BUILTINS_SET
                // if it's a set, build it
                if (!is_dict) {
                    EMIT_ARG(build, 1 + n, MP_EMIT_BUILD_SET);
                }
                #endif
            } else {
                assert(MP_PARSE_NODE_STRUCT_KIND(pns1) == PN_comp_for); // should be
                // dict/set comprehension
                if (!MICROPY_PY_BUILTINS_SET || MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_dictorsetmaker_item)) {
                    // a dictionary comprehension
                    compile_comprehension(comp, pns, SCOPE_DICT_COMP);
                } else {
                    // a set comprehension
                    compile_comprehension(comp, pns, SCOPE_SET_COMP);
                }
            }
        } else {
            // set with one element
            goto set_with_one_element;
        }
    } else {
        // set with one element
    set_with_one_element:
        #if MICROPY_PY_BUILTINS_SET
        compile_node(comp, pn);
        EMIT_ARG(build, 1, MP_EMIT_BUILD_SET);
        #else
        assert(0);
        #endif
    }
}

STATIC void compile_atom_brace(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_atom_brace_helper(comp, pns, true);
}

STATIC void compile_trailer_paren(compiler_t *comp, mp_parse_node_struct_t *pns) {
    compile_trailer_paren_helper(comp, pns->nodes[0], false, 0);
}

STATIC void compile_trailer_bracket(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // object whose index we want is on top of stack
    compile_node(comp, pns->nodes[0]); // the index
    EMIT_ARG(subscr, MP_EMIT_SUBSCR_LOAD);
}

STATIC void compile_trailer_period(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // object whose attribute we want is on top of stack
    EMIT_ARG(attr, MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]), MP_EMIT_ATTR_LOAD); // attribute to get
}

#if MICROPY_PY_BUILTINS_SLICE
STATIC void compile_subscript(compiler_t *comp, mp_parse_node_struct_t *pns) {
    if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_subscript_2) {
        compile_node(comp, pns->nodes[0]); // start of slice
        assert(MP_PARSE_NODE_IS_STRUCT(pns->nodes[1])); // should always be
        pns = (mp_parse_node_struct_t *)pns->nodes[1];
    } else {
        // pns is a PN_subscript_3, load None for start of slice
        EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
    }

    assert(MP_PARSE_NODE_STRUCT_KIND(pns) == PN_subscript_3); // should always be
    mp_parse_node_t pn = pns->nodes[0];
    if (MP_PARSE_NODE_IS_NULL(pn)) {
        // [?:]
        EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
        EMIT_ARG(build, 2, MP_EMIT_BUILD_SLICE);
    } else if (MP_PARSE_NODE_IS_STRUCT(pn)) {
        pns = (mp_parse_node_struct_t *)pn;
        if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_subscript_3c) {
            EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
            pn = pns->nodes[0];
            if (MP_PARSE_NODE_IS_NULL(pn)) {
                // [?::]
                EMIT_ARG(build, 2, MP_EMIT_BUILD_SLICE);
            } else {
                // [?::x]
                compile_node(comp, pn);
                EMIT_ARG(build, 3, MP_EMIT_BUILD_SLICE);
            }
        } else if (MP_PARSE_NODE_STRUCT_KIND(pns) == PN_subscript_3d) {
            compile_node(comp, pns->nodes[0]);
            assert(MP_PARSE_NODE_IS_STRUCT(pns->nodes[1])); // should always be
            pns = (mp_parse_node_struct_t *)pns->nodes[1];
            assert(MP_PARSE_NODE_STRUCT_KIND(pns) == PN_sliceop); // should always be
            if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
                // [?:x:]
                EMIT_ARG(build, 2, MP_EMIT_BUILD_SLICE);
            } else {
                // [?:x:x]
                compile_node(comp, pns->nodes[0]);
                EMIT_ARG(build, 3, MP_EMIT_BUILD_SLICE);
            }
        } else {
            // [?:x]
            compile_node(comp, pn);
            EMIT_ARG(build, 2, MP_EMIT_BUILD_SLICE);
        }
    } else {
        // [?:x]
        compile_node(comp, pn);
        EMIT_ARG(build, 2, MP_EMIT_BUILD_SLICE);
    }
}
#endif // MICROPY_PY_BUILTINS_SLICE

STATIC void compile_dictorsetmaker_item(compiler_t *comp, mp_parse_node_struct_t *pns) {
    // if this is called then we are compiling a dict key:value pair
    compile_node(comp, pns->nodes[1]); // value
    compile_node(comp, pns->nodes[0]); // key
}

STATIC void compile_classdef(compiler_t *comp, mp_parse_node_struct_t *pns) {
    qstr cname = compile_classdef_helper(comp, pns, comp->scope_cur->emit_options);
    // store class object into class name
    compile_store_id(comp, cname);
}

STATIC void compile_yield_expr(compiler_t *comp, mp_parse_node_struct_t *pns) {
    if (comp->scope_cur->kind != SCOPE_FUNCTION && comp->scope_cur->kind != SCOPE_LAMBDA) {
        compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("'yield' outside function"));
        return;
    }
    if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
        EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
        EMIT_ARG(yield, MP_EMIT_YIELD_VALUE);
        reserve_labels_for_native(comp, 1);
    } else if (MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_yield_arg_from)) {
        pns = (mp_parse_node_struct_t *)pns->nodes[0];
        compile_node(comp, pns->nodes[0]);
        compile_yield_from(comp);
    } else {
        compile_node(comp, pns->nodes[0]);
        EMIT_ARG(yield, MP_EMIT_YIELD_VALUE);
        reserve_labels_for_native(comp, 1);
    }
}

#if MICROPY_PY_ASYNC_AWAIT
STATIC void compile_atom_expr_await(compiler_t *comp, mp_parse_node_struct_t *pns) {
    if (comp->scope_cur->kind != SCOPE_FUNCTION && comp->scope_cur->kind != SCOPE_LAMBDA) {
        compile_syntax_error(comp, (mp_parse_node_t)pns, MP_ERROR_TEXT("'await' outside function"));
        return;
    }
    compile_atom_expr_normal(comp, pns);
    compile_yield_from(comp);
}
#endif

STATIC mp_obj_t get_const_object(mp_parse_node_struct_t *pns) {
    #if MICROPY_OBJ_REPR == MICROPY_OBJ_REPR_D
    // nodes are 32-bit pointers, but need to extract 64-bit object
    return (uint64_t)pns->nodes[0] | ((uint64_t)pns->nodes[1] << 32);
    #else
    return (mp_obj_t)pns->nodes[0];
    #endif
}

STATIC void compile_const_object(compiler_t *comp, mp_parse_node_struct_t *pns) {
    EMIT_ARG(load_const_obj, get_const_object(pns));
}

typedef void (*compile_function_t)(compiler_t *, mp_parse_node_struct_t *);
STATIC const compile_function_t compile_function[] = {
// only define rules with a compile function
#define c(f) compile_##f
#define DEF_RULE(rule, comp, kind, ...) comp,
#define DEF_RULE_NC(rule, kind, ...)
    #include "py/grammar.h"
#undef c
#undef DEF_RULE
#undef DEF_RULE_NC
    compile_const_object,
};

STATIC void compile_node(compiler_t *comp, mp_parse_node_t pn) {
    if (MP_PARSE_NODE_IS_NULL(pn)) {
        // pass
    } else if (MP_PARSE_NODE_IS_SMALL_INT(pn)) {
        mp_int_t arg = MP_PARSE_NODE_LEAF_SMALL_INT(pn);
        #if MICROPY_DYNAMIC_COMPILER
        mp_uint_t sign_mask = -((mp_uint_t)1 << (mp_dynamic_compiler.small_int_bits - 1));
        if ((arg & sign_mask) == 0 || (arg & sign_mask) == sign_mask) {
            // integer fits in target runtime's small-int
            EMIT_ARG(load_const_small_int, arg);
        } else {
            // integer doesn't fit, so create a multi-precision int object
            // (but only create the actual object on the last pass)
            if (comp->pass != MP_PASS_EMIT) {
                EMIT_ARG(load_const_obj, mp_const_none);
            } else {
                EMIT_ARG(load_const_obj, mp_obj_new_int_from_ll(arg));
            }
        }
        #else
        EMIT_ARG(load_const_small_int, arg);
        #endif
    } else if (MP_PARSE_NODE_IS_LEAF(pn)) {
        uintptr_t arg = MP_PARSE_NODE_LEAF_ARG(pn);
        switch (MP_PARSE_NODE_LEAF_KIND(pn)) {
            case MP_PARSE_NODE_ID:
                compile_load_id(comp, arg);
                break;
            case MP_PARSE_NODE_STRING:
                EMIT_ARG(load_const_str, arg);
                break;
            case MP_PARSE_NODE_BYTES:
                // only create and load the actual bytes object on the last pass
                if (comp->pass != MP_PASS_EMIT) {
                    EMIT_ARG(load_const_obj, mp_const_none);
                } else {
                    size_t len;
                    const byte *data = qstr_data(arg, &len);
                    EMIT_ARG(load_const_obj, mp_obj_new_bytes(data, len));
                }
                break;
            case MP_PARSE_NODE_TOKEN:
            default:
                if (arg == MP_TOKEN_NEWLINE) {
                    // this can occur when file_input lets through a NEWLINE (eg if file starts with a newline)
                    // or when single_input lets through a NEWLINE (user enters a blank line)
                    // do nothing
                } else {
                    EMIT_ARG(load_const_tok, arg);
                }
                break;
        }
    } else {
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
        EMIT_ARG(set_source_line, pns->source_line);
        assert(MP_PARSE_NODE_STRUCT_KIND(pns) <= PN_const_object);
        compile_function_t f = compile_function[MP_PARSE_NODE_STRUCT_KIND(pns)];
        f(comp, pns);
    }
}

#if MICROPY_EMIT_NATIVE
STATIC int compile_viper_type_annotation(compiler_t *comp, mp_parse_node_t pn_annotation) {
    int native_type = MP_NATIVE_TYPE_OBJ;
    if (MP_PARSE_NODE_IS_NULL(pn_annotation)) {
        // No annotation, type defaults to object
    } else if (MP_PARSE_NODE_IS_ID(pn_annotation)) {
        qstr type_name = MP_PARSE_NODE_LEAF_ARG(pn_annotation);
        native_type = mp_native_type_from_qstr(type_name);
        if (native_type < 0) {
            comp->compile_error = mp_obj_new_exception_msg_varg(&mp_type_ViperTypeError, MP_ERROR_TEXT("unknown type '%q'"), type_name);
            native_type = 0;
        }
    } else {
        compile_syntax_error(comp, pn_annotation, MP_ERROR_TEXT("annotation must be an identifier"));
    }
    return native_type;
}
#endif

STATIC void compile_scope_func_lambda_param(compiler_t *comp, mp_parse_node_t pn, pn_kind_t pn_name, pn_kind_t pn_star, pn_kind_t pn_dbl_star) {
    (void)pn_dbl_star;

    // check that **kw is last
    if ((comp->scope_cur->scope_flags & MP_SCOPE_FLAG_VARKEYWORDS) != 0) {
        compile_syntax_error(comp, pn, MP_ERROR_TEXT("invalid syntax"));
        return;
    }

    qstr param_name = MP_QSTRnull;
    uint param_flag = ID_FLAG_IS_PARAM;
    mp_parse_node_struct_t *pns = NULL;
    if (MP_PARSE_NODE_IS_ID(pn)) {
        param_name = MP_PARSE_NODE_LEAF_ARG(pn);
        if (comp->have_star) {
            // comes after a star, so counts as a keyword-only parameter
            comp->scope_cur->num_kwonly_args += 1;
        } else {
            // comes before a star, so counts as a positional parameter
            comp->scope_cur->num_pos_args += 1;
        }
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):

    qst = DECODE_QSTR_VALUE;

is now (schematically):

    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];

That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
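
As a rough illustration of the indirection described above (a sketch only, not the actual VM code): the names `qstr_table_t`, `decode_qstr_index`, `decode_qstr` and `decode_all` are hypothetical, and a fixed one-byte index per operand is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical per-module table, built in RAM at import time: it maps a
// module-local qstr index to the firmware's global qstr number.
typedef struct {
    const uint16_t *table;
    size_t len;
} qstr_table_t;

// Read a one-byte local index from constant bytecode and advance ip.
// (One byte per index is a simplification made for this sketch.)
static uint8_t decode_qstr_index(const uint8_t **ip) {
    return *(*ip)++;
}

// Resolve the next qstr operand through the module's table.
static uint16_t decode_qstr(const uint8_t **ip, const qstr_table_t *qt) {
    uint8_t idx = decode_qstr_index(ip);
    assert(idx < qt->len);
    return qt->table[idx];
}

// Decode n consecutive qstr operands from bytecode into out[].
static void decode_all(const uint8_t *bytecode, size_t n, const qstr_table_t *qt, uint16_t *out) {
    const uint8_t *ip = bytecode;
    for (size_t i = 0; i < n; i++) {
        out[i] = decode_qstr(&ip, qt);
    }
}
```

In this sketch the bytecode array is never modified after compilation; only the contents of the table depend on the firmware the module is loaded into.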
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
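
The break-even point can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that every reference used to cost 2 bytes, that every reference now costs 1 byte, and that each distinct qstr adds one 2-byte table entry; the function names are hypothetical and the real encoding may differ.

```c
#include <stddef.h>

// Bytes used by n_uses references to one qstr when each reference embeds
// a 2-byte qstr value directly in the bytecode (the old scheme).
static size_t bytes_direct(size_t n_uses) {
    return 2 * n_uses;
}

// Bytes used when each reference is a 1-byte index and the qstr also needs
// one table entry (assumed here to cost 2 bytes).
static size_t bytes_indexed(size_t n_uses) {
    return n_uses + 2;
}
```

Under these assumptions one use costs a byte more, two uses cost the same, and three or more uses come out smaller, which is consistent with the statement above.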
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100               baseline -> this-commit    diff      diff% (error%)
bm_chaos.py                 371.07 ->   357.39 :   -13.68 =  -3.687% (+/-0.02%)
bm_fannkuch.py               78.72 ->    77.49 :    -1.23 =  -1.563% (+/-0.01%)
bm_fft.py                  2591.73 ->  2539.28 :   -52.45 =  -2.024% (+/-0.00%)
bm_float.py                6034.93 ->  5908.30 :  -126.63 =  -2.098% (+/-0.01%)
bm_hexiom.py                 48.96 ->    47.93 :    -1.03 =  -2.104% (+/-0.00%)
bm_nqueens.py              4510.63 ->  4459.94 :   -50.69 =  -1.124% (+/-0.00%)
bm_pidigits.py              650.28 ->   644.96 :    -5.32 =  -0.818% (+/-0.23%)
core_import_mpy_multi.py    564.77 ->   581.49 :   +16.72 =  +2.960% (+/-0.01%)
core_import_mpy_single.py    68.67 ->    67.16 :    -1.51 =  -2.199% (+/-0.01%)
core_qstr.py                 64.16 ->    64.12 :    -0.04 =  -0.062% (+/-0.00%)
core_yield_from.py          362.58 ->   354.50 :    -8.08 =  -2.228% (+/-0.00%)
misc_aes.py                 429.69 ->   405.59 :   -24.10 =  -5.609% (+/-0.01%)
misc_mandel.py             3485.13 ->  3416.51 :   -68.62 =  -1.969% (+/-0.00%)
misc_pystone.py            2496.53 ->  2405.56 :   -90.97 =  -3.644% (+/-0.01%)
misc_raytrace.py            381.47 ->   374.01 :    -7.46 =  -1.956% (+/-0.01%)
viper_call0.py              576.73 ->   572.49 :    -4.24 =  -0.735% (+/-0.04%)
viper_call1a.py             550.37 ->   546.21 :    -4.16 =  -0.756% (+/-0.09%)
viper_call1b.py             438.23 ->   435.68 :    -2.55 =  -0.582% (+/-0.06%)
viper_call1c.py             442.84 ->   440.04 :    -2.80 =  -0.632% (+/-0.08%)
viper_call2a.py             536.31 ->   532.35 :    -3.96 =  -0.738% (+/-0.06%)
viper_call2b.py             382.34 ->   377.07 :    -5.27 =  -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000       baseline -> this-commit       diff      diff% (error%)
bm_chaos.py         13594.20 ->   13073.84 :   -520.36 =  -3.828% (+/-5.44%)
bm_fannkuch.py         60.63 ->      59.58 :     -1.05 =  -1.732% (+/-3.01%)
bm_fft.py          112009.15 ->  111603.32 :   -405.83 =  -0.362% (+/-4.03%)
bm_float.py        246202.55 ->  247923.81 :  +1721.26 =  +0.699% (+/-2.79%)
bm_hexiom.py          615.65 ->     617.21 :     +1.56 =  +0.253% (+/-1.64%)
bm_nqueens.py      215807.95 ->  215600.96 :   -206.99 =  -0.096% (+/-3.52%)
bm_pidigits.py       8246.74 ->    8422.82 :   +176.08 =  +2.135% (+/-3.64%)
misc_aes.py         16133.00 ->   16452.74 :   +319.74 =  +1.982% (+/-1.50%)
misc_mandel.py     128146.69 ->  130796.43 :  +2649.74 =  +2.068% (+/-3.18%)
misc_pystone.py     83811.49 ->   83124.85 :   -686.64 =  -0.819% (+/-1.03%)
misc_raytrace.py    21688.02 ->   21385.10 :   -302.92 =  -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
   bare-arm:   +396 +0.697%
minimal x86:  +1595 +0.979% [incl +32(data)]
   unix x64:  +2408 +0.470% [incl +800(data)]
unix nanbox:  +1396 +0.309% [incl -96(data)]
      stm32:  -1256 -0.318% PYBV10
     cc3200:   +288 +0.157%
    esp8266:   -260 -0.037% GENERIC
      esp32:   -216 -0.014% GENERIC[incl -1072(data)]
        nrf:   +116 +0.067% pca10040
        rp2:   -664 -0.135% PICO
       samd:   +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
        mp_emit_common_use_qstr(&comp->emit_common, param_name);
    } else {
        assert(MP_PARSE_NODE_IS_STRUCT(pn));
        pns = (mp_parse_node_struct_t *)pn;
        if (MP_PARSE_NODE_STRUCT_KIND(pns) == pn_name) {
            // named parameter with possible annotation
            param_name = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]);
            if (comp->have_star) {
                // comes after a star, so counts as a keyword-only parameter
                comp->scope_cur->num_kwonly_args += 1;
            } else {
                // comes before a star, so counts as a positional parameter
                comp->scope_cur->num_pos_args += 1;
            }
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
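The indirection described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from this commit: `global_pool`, `global_qstr_find`, `link_qstr_table` and `decode_qstr` are hypothetical names, the pool is a fixed array rather than a real interning table, and every index is taken to be exactly one byte (the commit says indices are only "mostly" 1 byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for the firmware's global qstr pool.
static const char *const global_pool[] = { "", "print", "len", "value" };
#define GLOBAL_POOL_LEN (sizeof(global_pool) / sizeof(global_pool[0]))

// Look up a string in the global pool and return its global qstr number.
static uint16_t global_qstr_find(const char *s) {
    for (size_t i = 0; i < GLOBAL_POOL_LEN; i++) {
        if (strcmp(global_pool[i], s) == 0) {
            return (uint16_t)i;
        }
    }
    // Assumption: every name is already in the pool; a real runtime would intern it.
    assert(!"string not found in pool");
    return 0;
}

// Import-time step: build the per-module table (in RAM) that maps a
// module-local qstr index to a global qstr number.
static void link_qstr_table(const char *const *local_names, size_t n, uint16_t *qstr_table) {
    for (size_t i = 0; i < n; i++) {
        qstr_table[i] = global_qstr_find(local_names[i]);
    }
}

// VM-side step: read a 1-byte local index from the constant bytecode and
// translate it through the table. The bytecode itself is never modified.
static uint16_t decode_qstr(const uint8_t **ip, const uint16_t *qstr_table) {
    uint8_t idx = *(*ip)++;   // idx = DECODE_QSTR_INDEX
    return qstr_table[idx];   // qst = qstr_table[idx]
}
```

In this sketch only `qstr_table` is written at import time; the bytecode array can be `const` and could therefore live in ROM.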
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
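The break-even point can be checked with a small calculation. The sketch below assumes that each qstr costs one 2-byte entry in the module's table and that every reference is a 1-byte index; both figures are assumptions chosen to match the "more than two times" statement, and `size_direct` and `size_indirect` are hypothetical helpers.

```c
#include <stddef.h>

// Bytes used when every reference embeds a 2-byte qstr value directly.
static size_t size_direct(size_t uses) {
    return 2 * uses;
}

// Bytes used with 1-byte indices plus one table entry (assumed to be 2 bytes).
static size_t size_indirect(size_t uses) {
    return uses + 2;
}
```

Under these assumptions a qstr used twice costs the same either way, and each further use saves one byte.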
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC [incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
|
|
|
mp_emit_common_use_qstr(&comp->emit_common, param_name);
|
2013-12-21 18:17:45 +00:00
|
|
|
} else if (MP_PARSE_NODE_STRUCT_KIND(pns) == pn_star) {
|
2015-11-23 16:50:42 +00:00
|
|
|
if (comp->have_star) {
|
|
|
|
// more than one star
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, pn, MP_ERROR_TEXT("invalid syntax"));
|
2015-11-23 16:50:42 +00:00
|
|
|
return;
|
|
|
|
}
|
2014-04-11 23:25:34 +01:00
|
|
|
comp->have_star = true;
|
2014-04-27 15:50:52 +01:00
|
|
|
param_flag = ID_FLAG_IS_PARAM | ID_FLAG_IS_STAR_PARAM;
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pns->nodes[0])) {
|
2013-10-06 00:28:28 +01:00
|
|
|
// bare star
|
|
|
|
// TODO see http://www.python.org/dev/peps/pep-3102/
|
|
|
|
// assert(comp->scope_cur->num_dict_params == 0);
|
2018-09-15 04:20:54 +01:00
|
|
|
pns = NULL;
|
2013-12-21 18:17:45 +00:00
|
|
|
} else if (MP_PARSE_NODE_IS_ID(pns->nodes[0])) {
|
2013-10-06 00:28:28 +01:00
|
|
|
// named star
|
2014-02-15 19:33:11 +00:00
|
|
|
comp->scope_cur->scope_flags |= MP_SCOPE_FLAG_VARARGS;
|
2013-12-21 18:17:45 +00:00
|
|
|
param_name = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]);
|
2018-09-15 04:20:54 +01:00
|
|
|
pns = NULL;
|
2015-02-27 14:25:47 +00:00
|
|
|
} else {
|
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_tfpdef)); // should be
|
2014-04-11 23:25:34 +01:00
|
|
|
// named star with possible annotation
|
2014-02-15 19:33:11 +00:00
|
|
|
comp->scope_cur->scope_flags |= MP_SCOPE_FLAG_VARARGS;
|
2013-12-21 18:17:45 +00:00
|
|
|
pns = (mp_parse_node_struct_t *)pns->nodes[0];
|
|
|
|
param_name = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]);
|
2013-10-06 00:28:28 +01:00
|
|
|
}
|
2015-02-27 14:25:47 +00:00
|
|
|
} else {
|
2018-09-15 04:20:54 +01:00
|
|
|
// double star with possible annotation
|
2015-02-27 14:25:47 +00:00
|
|
|
assert(MP_PARSE_NODE_STRUCT_KIND(pns) == pn_dbl_star); // should be
|
2013-12-21 18:17:45 +00:00
|
|
|
param_name = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]);
|
2014-04-27 15:50:52 +01:00
|
|
|
param_flag = ID_FLAG_IS_PARAM | ID_FLAG_IS_DBL_STAR_PARAM;
|
2014-02-15 19:33:11 +00:00
|
|
|
comp->scope_cur->scope_flags |= MP_SCOPE_FLAG_VARKEYWORDS;
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-09-25 06:53:30 +01:00
|
|
|
if (param_name != MP_QSTRnull) {
|
2018-10-27 12:41:21 +01:00
|
|
|
id_info_t *id_info = scope_find_or_add_id(comp->scope_cur, param_name, ID_INFO_KIND_UNDECIDED);
|
|
|
|
if (id_info->kind != ID_INFO_KIND_UNDECIDED) {
|
2020-03-02 11:35:22 +00:00
|
|
|
compile_syntax_error(comp, pn, MP_ERROR_TEXT("argument name reused"));
|
2013-10-04 19:53:11 +01:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
id_info->kind = ID_INFO_KIND_LOCAL;
|
2014-04-27 15:50:52 +01:00
|
|
|
id_info->flags = param_flag;
|
2018-09-15 04:20:54 +01:00
|
|
|
|
|
|
|
#if MICROPY_EMIT_NATIVE
|
|
|
|
if (comp->scope_cur->emit_options == MP_EMIT_OPT_VIPER && pn_name == PN_typedargslist_name && pns != NULL) {
|
2018-09-15 04:43:49 +01:00
|
|
|
id_info->flags |= compile_viper_type_annotation(comp, pns->nodes[1]) << ID_FLAG_VIPER_TYPE_POS;
|
2018-09-15 04:20:54 +01:00
|
|
|
}
|
|
|
|
#else
|
|
|
|
(void)pns;
|
|
|
|
#endif
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-04-27 15:50:52 +01:00
|
|
|
STATIC void compile_scope_func_param(compiler_t *comp, mp_parse_node_t pn) {
|
2014-08-15 16:45:41 +01:00
|
|
|
compile_scope_func_lambda_param(comp, pn, PN_typedargslist_name, PN_typedargslist_star, PN_typedargslist_dbl_star);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2014-04-27 15:50:52 +01:00
|
|
|
STATIC void compile_scope_lambda_param(compiler_t *comp, mp_parse_node_t pn) {
|
2014-08-15 16:45:41 +01:00
|
|
|
compile_scope_func_lambda_param(comp, pn, PN_varargslist_name, PN_varargslist_star, PN_varargslist_dbl_star);
|
|
|
|
}
|
|
|
|
|
2015-12-18 01:37:55 +00:00
|
|
|
STATIC void compile_scope_comp_iter(compiler_t *comp, mp_parse_node_struct_t *pns_comp_for, mp_parse_node_t pn_inner_expr, int for_depth) {
|
|
|
|
uint l_top = comp_next_label(comp);
|
|
|
|
uint l_end = comp_next_label(comp);
|
|
|
|
EMIT_ARG(label_assign, l_top);
|
|
|
|
EMIT_ARG(for_iter, l_end);
|
|
|
|
c_assign(comp, pns_comp_for->nodes[0], ASSIGN_STORE);
|
|
|
|
mp_parse_node_t pn_iter = pns_comp_for->nodes[2];
|
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
tail_recursion:
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_NULL(pn_iter)) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// no more nested if/for; compile inner expression
|
|
|
|
compile_node(comp, pn_inner_expr);
|
2016-09-18 14:59:47 +01:00
|
|
|
if (comp->scope_cur->kind == SCOPE_GEN_EXPR) {
|
2018-05-18 15:30:42 +01:00
|
|
|
EMIT_ARG(yield, MP_EMIT_YIELD_VALUE);
|
2018-10-01 04:07:04 +01:00
|
|
|
reserve_labels_for_native(comp, 1);
|
2013-10-04 19:53:11 +01:00
|
|
|
EMIT(pop_top);
|
2016-09-18 14:59:47 +01:00
|
|
|
} else {
|
2017-01-17 04:27:37 +00:00
|
|
|
EMIT_ARG(store_comp, comp->scope_cur->kind, 4 * for_depth + 5);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2017-04-22 12:43:42 +01:00
|
|
|
} else if (MP_PARSE_NODE_STRUCT_KIND((mp_parse_node_struct_t *)pn_iter) == PN_comp_if) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// if condition
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns_comp_if = (mp_parse_node_struct_t *)pn_iter;
|
2013-10-04 19:53:11 +01:00
|
|
|
c_if_cond(comp, pns_comp_if->nodes[0], false, l_top);
|
|
|
|
pn_iter = pns_comp_if->nodes[1];
|
|
|
|
goto tail_recursion;
|
2015-02-27 14:25:47 +00:00
|
|
|
} else {
|
2017-04-22 12:43:42 +01:00
|
|
|
assert(MP_PARSE_NODE_STRUCT_KIND((mp_parse_node_struct_t *)pn_iter) == PN_comp_for); // should be
|
2013-10-04 19:53:11 +01:00
|
|
|
// for loop
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns_comp_for2 = (mp_parse_node_struct_t *)pn_iter;
|
2013-10-04 19:53:11 +01:00
|
|
|
compile_node(comp, pns_comp_for2->nodes[1]);
|
2017-01-17 03:32:50 +00:00
|
|
|
EMIT_ARG(get_iter, true);
|
2015-12-18 01:37:55 +00:00
|
|
|
compile_scope_comp_iter(comp, pns_comp_for2, pn_inner_expr, for_depth + 1);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2015-12-18 01:37:55 +00:00
|
|
|
|
|
|
|
EMIT_ARG(jump, l_top);
|
|
|
|
EMIT_ARG(label_assign, l_end);
|
2017-01-17 04:30:18 +00:00
|
|
|
EMIT(for_iter_end);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2014-04-25 23:52:57 +01:00
|
|
|
STATIC void check_for_doc_string(compiler_t *comp, mp_parse_node_t pn) {
|
2015-08-14 12:24:11 +01:00
|
|
|
#if MICROPY_ENABLE_DOC_STRING
|
2013-10-04 19:53:11 +01:00
|
|
|
// see http://www.python.org/dev/peps/pep-0257/
|
|
|
|
|
|
|
|
// look for the first statement
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_expr_stmt)) {
|
2013-12-12 15:24:38 +00:00
|
|
|
// a statement; fall through
|
2013-12-21 18:17:45 +00:00
|
|
|
} else if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_file_input_2)) {
|
2013-12-12 15:24:38 +00:00
|
|
|
// file input; find the first non-newline node
|
2013-12-21 18:17:45 +00:00
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
|
|
|
|
int num_nodes = MP_PARSE_NODE_STRUCT_NUM_NODES(pns);
|
2013-12-12 15:24:38 +00:00
|
|
|
for (int i = 0; i < num_nodes; i++) {
|
|
|
|
pn = pns->nodes[i];
|
2013-12-21 18:17:45 +00:00
|
|
|
if (!(MP_PARSE_NODE_IS_LEAF(pn) && MP_PARSE_NODE_LEAF_KIND(pn) == MP_PARSE_NODE_TOKEN && MP_PARSE_NODE_LEAF_ARG(pn) == MP_TOKEN_NEWLINE)) {
|
2013-12-12 15:24:38 +00:00
|
|
|
// not a newline, so this is the first statement; finish search
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
// if we didn't find a non-newline then it's okay to fall through; pn will be a newline and so doc-string test below will fail gracefully
|
2013-12-21 18:17:45 +00:00
|
|
|
} else if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_suite_block_stmts)) {
|
2013-12-12 15:24:38 +00:00
|
|
|
// a list of statements; get the first one
|
2013-12-21 18:17:45 +00:00
|
|
|
pn = ((mp_parse_node_struct_t *)pn)->nodes[0];
|
2013-10-04 19:53:11 +01:00
|
|
|
} else {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
// check the first statement for a doc string
|
2013-12-21 18:17:45 +00:00
|
|
|
if (MP_PARSE_NODE_IS_STRUCT_KIND(pn, PN_expr_stmt)) {
|
2015-04-09 16:29:54 +01:00
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)pn;
|
2014-05-25 22:06:06 +01:00
|
|
|
if ((MP_PARSE_NODE_IS_LEAF(pns->nodes[0])
|
|
|
|
&& MP_PARSE_NODE_LEAF_KIND(pns->nodes[0]) == MP_PARSE_NODE_STRING)
|
2017-02-24 02:43:43 +00:00
|
|
|
|| (MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[0], PN_const_object)
|
2019-01-30 07:49:52 +00:00
|
|
|
&& mp_obj_is_str(get_const_object((mp_parse_node_struct_t *)pns->nodes[0])))) {
|
2014-05-25 22:06:06 +01:00
|
|
|
// compile the doc string
|
|
|
|
compile_node(comp, pns->nodes[0]);
|
|
|
|
// store the doc string
|
2015-03-26 14:42:40 +00:00
|
|
|
compile_store_id(comp, MP_QSTR___doc__);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
|
2015-01-20 12:47:20 +00:00
|
|
|
#else
|
|
|
|
(void)comp;
|
|
|
|
(void)pn;
|
2014-04-25 23:52:57 +01:00
|
|
|
#endif
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2014-04-25 23:52:57 +01:00
|
|
|
STATIC void compile_scope(compiler_t *comp, scope_t *scope, pass_kind_t pass) {
|
2013-10-04 19:53:11 +01:00
|
|
|
comp->pass = pass;
|
|
|
|
comp->scope_cur = scope;
|
2017-06-22 06:05:58 +01:00
|
|
|
comp->next_label = 0;
|
2021-10-22 12:22:47 +01:00
|
|
|
mp_emit_common_start_pass(&comp->emit_common, pass);
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(start_pass, pass, scope);
|
2018-09-14 08:40:59 +01:00
|
|
|
reserve_labels_for_native(comp, 6); // used by native's start_pass
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2014-05-07 17:24:22 +01:00
|
|
|
if (comp->pass == MP_PASS_SCOPE) {
|
2014-03-27 10:55:21 +00:00
|
|
|
// reset maximum stack sizes in scope
|
|
|
|
// they will be computed in this first pass
|
2013-10-04 19:53:11 +01:00
|
|
|
scope->stack_size = 0;
|
2014-03-27 10:55:21 +00:00
|
|
|
scope->exc_stack_size = 0;
|
2021-10-22 12:22:47 +01:00
|
|
|
|
|
|
|
#if MICROPY_EMIT_NATIVE
|
|
|
|
if (scope->emit_options == MP_EMIT_OPT_NATIVE_PYTHON || scope->emit_options == MP_EMIT_OPT_VIPER) {
|
|
|
|
// allow native code to perform basic tasks during the pass scope
|
|
|
|
// note: the first argument passed here is mp_emit_common_t, not the native emitter context
|
|
|
|
NATIVE_EMITTER_TABLE->start_pass((void *)&comp->emit_common, comp->pass, scope);
|
|
|
|
}
|
|
|
|
#endif
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
// compile
|
2014-01-15 22:14:03 +00:00
|
|
|
if (MP_PARSE_NODE_IS_STRUCT_KIND(scope->pn, PN_eval_input)) {
|
|
|
|
assert(scope->kind == SCOPE_MODULE);
|
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)scope->pn;
|
|
|
|
compile_node(comp, pns->nodes[0]); // compile the expression
|
|
|
|
EMIT(return_value);
|
|
|
|
} else if (scope->kind == SCOPE_MODULE) {
|
2013-10-18 19:58:12 +01:00
|
|
|
if (!comp->is_repl) {
|
|
|
|
check_for_doc_string(comp, scope->pn);
|
|
|
|
}
|
2013-10-04 19:53:11 +01:00
|
|
|
compile_node(comp, scope->pn);
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
2013-10-04 19:53:11 +01:00
|
|
|
EMIT(return_value);
|
|
|
|
} else if (scope->kind == SCOPE_FUNCTION) {
|
2013-12-21 18:17:45 +00:00
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT(scope->pn));
|
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)scope->pn;
|
|
|
|
assert(MP_PARSE_NODE_STRUCT_KIND(pns) == PN_funcdef);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
|
|
|
// work out number of parameters, keywords and default parameters, and add them to the id_info array
|
2013-10-05 18:08:26 +01:00
|
|
|
// must be done before compiling the body so that arguments are numbered first (for LOAD_FAST etc)
|
2014-05-07 17:24:22 +01:00
|
|
|
if (comp->pass == MP_PASS_SCOPE) {
|
2014-04-11 23:25:34 +01:00
|
|
|
comp->have_star = false;
|
2013-10-04 19:53:11 +01:00
|
|
|
apply_to_single_or_list(comp, pns->nodes[1], PN_typedargslist, compile_scope_func_param);
|
2018-09-15 04:43:49 +01:00
|
|
|
|
|
|
|
#if MICROPY_EMIT_NATIVE
|
|
|
|
if (scope->emit_options == MP_EMIT_OPT_VIPER) {
|
|
|
|
// Compile return type; pns->nodes[2] is return/whole function annotation
|
|
|
|
scope->scope_flags |= compile_viper_type_annotation(comp, pns->nodes[2]) << MP_SCOPE_FLAG_VIPERRET_POS;
|
2014-08-15 16:45:41 +01:00
|
|
|
}
|
2018-09-15 04:43:49 +01:00
|
|
|
#endif // MICROPY_EMIT_NATIVE
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
compile_node(comp, pns->nodes[3]); // 3 is function body
|
|
|
|
// emit return if it wasn't the last opcode
|
2013-10-05 12:19:06 +01:00
|
|
|
if (!EMIT(last_emit_was_return_value)) {
|
2014-01-23 00:34:21 +00:00
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
2013-10-04 19:53:11 +01:00
|
|
|
EMIT(return_value);
|
|
|
|
}
|
|
|
|
} else if (scope->kind == SCOPE_LAMBDA) {
|
2013-12-21 18:17:45 +00:00
|
|
|
assert(MP_PARSE_NODE_IS_STRUCT(scope->pn));
|
|
|
|
mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)scope->pn;
|
|
|
|
assert(MP_PARSE_NODE_STRUCT_NUM_NODES(pns) == 3);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2019-08-28 02:37:43 +01:00
|
|
|
// Set the source line number for the start of the lambda
|
|
|
|
EMIT_ARG(set_source_line, pns->source_line);
|
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
// work out number of parameters, keywords and default parameters, and add them to the id_info array
|
2013-10-05 18:08:26 +01:00
|
|
|
// must be done before compiling the body so that arguments are numbered first (for LOAD_FAST etc)
|
2014-05-07 17:24:22 +01:00
|
|
|
if (comp->pass == MP_PASS_SCOPE) {
|
2014-04-11 23:25:34 +01:00
|
|
|
comp->have_star = false;
|
2013-10-04 19:53:11 +01:00
|
|
|
apply_to_single_or_list(comp, pns->nodes[0], PN_varargslist, compile_scope_lambda_param);
|
|
|
|
}
|
|
|
|
|
|
|
|
compile_node(comp, pns->nodes[1]); // 1 is lambda body
|
2014-04-11 14:10:21 +01:00
|
|
|
|
|
|
|
// if the lambda is a generator, then we return None, not the result of the expression of the lambda
|
|
|
|
if (scope->scope_flags & MP_SCOPE_FLAG_GENERATOR) {
|
|
|
|
EMIT(pop_top);
|
|
|
|
EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
|
|
|
|
}
|
2013-10-04 19:53:11 +01:00
|
|
|
EMIT(return_value);
|
2020-06-16 12:42:37 +01:00
|
|
|
    } else if (SCOPE_IS_COMP_LIKE(scope->kind)) {
        // a bit of a hack at the moment

        assert(MP_PARSE_NODE_IS_STRUCT(scope->pn));
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)scope->pn;
        assert(MP_PARSE_NODE_STRUCT_NUM_NODES(pns) == 2);
        assert(MP_PARSE_NODE_IS_STRUCT_KIND(pns->nodes[1], PN_comp_for));
        mp_parse_node_struct_t *pns_comp_for = (mp_parse_node_struct_t *)pns->nodes[1];

        // We need a unique name for the comprehension argument (the iterator).
        // CPython uses .0, but we should be able to use anything that won't
        // clash with a user defined variable. Best to use an existing qstr,
        // so we use the blank qstr.
        qstr qstr_arg = MP_QSTR_;
        if (comp->pass == MP_PASS_SCOPE) {
            scope_find_or_add_id(comp->scope_cur, qstr_arg, ID_INFO_KIND_LOCAL);
            scope->num_pos_args = 1;
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
    qst = DECODE_QSTR_VALUE;
is now (schematically):
    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
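As a rough illustration of the two decode schemes described above, the
following C sketch contrasts them. It is not the VM's actual code: qstr_t,
module_ctx_t, decode_qstr_value and decode_qstr_via_table are hypothetical
names, and the index is assumed to always fit in one byte (the real encoding
may use more bytes for large indices).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the VM's qstr type (a small integer id).
typedef uint16_t qstr_t;

// Hypothetical per-module context: maps module-local qstr indices to
// global qstr values. Only this table is built (in RAM) at import time.
typedef struct {
    const qstr_t *qstr_table;
    size_t qstr_table_len;
} module_ctx_t;

// Old scheme (schematic): a 2-byte global qstr value is embedded in the
// bytecode, so the bytecode itself has to be rewritten when it is loaded.
qstr_t decode_qstr_value(const uint8_t **ip) {
    qstr_t q = (qstr_t)((*ip)[0] | ((*ip)[1] << 8));
    *ip += 2;
    return q;
}

// New scheme (schematic): the bytecode holds a module-local index and the
// global qstr is looked up in the table, so the bytecode stays constant.
// Assumption: a one-byte index.
qstr_t decode_qstr_via_table(const module_ctx_t *ctx, const uint8_t **ip) {
    size_t idx = *(*ip)++;
    assert(idx < ctx->qstr_table_len);
    return ctx->qstr_table[idx];
}
```

In this sketch the bytes that ip walks over are never modified, which is what
lets them stay in ROM; only qstr_table depends on the firmware the module is
loaded into.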
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
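The break-even point can be sketched with a small size model. The 2-byte and
1-byte per-use costs come from the text above; the fixed 2-byte cost per
distinct qstr (QSTR_TABLE_ENTRY_BYTES) is an assumption chosen so that the
model reproduces the stated "more than two times" threshold. The function
names are hypothetical.

```c
#include <stddef.h>

// Assumption: each distinct qstr costs a fixed 2 bytes outside the bytecode.
#define QSTR_TABLE_ENTRY_BYTES 2

// Old format: a 2-byte qstr value at every use.
size_t qstr_bytes_old(size_t n_uses) {
    return 2 * n_uses;
}

// New format: a 1-byte index at every use, plus one entry for the qstr.
size_t qstr_bytes_new(size_t n_uses) {
    return n_uses + QSTR_TABLE_ENTRY_BYTES;
}
```

Under this model a qstr used once costs more than before, a qstr used twice
costs the same, and any qstr used three or more times saves one byte per use
beyond the second.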
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
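For reading the two tables above, the diff% column appears to be the change
in score relative to the baseline. A minimal sketch, assuming that formula
(score_diff_percent is a hypothetical name and not part of the benchmark
runner):

```c
// Percentage change of a benchmark score relative to the baseline score.
// Scores are "higher is better", so a negative result is a slowdown.
double score_diff_percent(double baseline, double this_commit) {
    return (this_commit - baseline) / baseline * 100.0;
}
```

For the bm_chaos.py row on PYBv1.0 (371.07 -> 357.39) this gives roughly
-3.687%, which matches the table.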
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
            mp_emit_common_use_qstr(&comp->emit_common, MP_QSTR__star_);
        }

        // Set the source line number for the start of the comprehension
        EMIT_ARG(set_source_line, pns->source_line);

        if (scope->kind == SCOPE_LIST_COMP) {
            EMIT_ARG(build, 0, MP_EMIT_BUILD_LIST);
        } else if (scope->kind == SCOPE_DICT_COMP) {
            EMIT_ARG(build, 0, MP_EMIT_BUILD_MAP);
        #if MICROPY_PY_BUILTINS_SET
        } else if (scope->kind == SCOPE_SET_COMP) {
            EMIT_ARG(build, 0, MP_EMIT_BUILD_SET);
        #endif
        }

        // There are 4 slots on the stack for the iterator, and the first one is
        // NULL to indicate that the second one points to the iterator object.
        if (scope->kind == SCOPE_GEN_EXPR) {
            MP_STATIC_ASSERT(MP_OBJ_ITER_BUF_NSLOTS == 4);
            EMIT(load_null);
            compile_load_id(comp, qstr_arg);
            EMIT(load_null);
            EMIT(load_null);
        } else {
            compile_load_id(comp, qstr_arg);
            EMIT_ARG(get_iter, true);
        }

        compile_scope_comp_iter(comp, pns_comp_for, pns->nodes[0], 0);

        if (scope->kind == SCOPE_GEN_EXPR) {
            EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
        }
        EMIT(return_value);
    } else {
        assert(scope->kind == SCOPE_CLASS);
        assert(MP_PARSE_NODE_IS_STRUCT(scope->pn));
        mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)scope->pn;
        assert(MP_PARSE_NODE_STRUCT_KIND(pns) == PN_classdef);

        if (comp->pass == MP_PASS_SCOPE) {
            scope_find_or_add_id(scope, MP_QSTR___class__, ID_INFO_KIND_LOCAL);
        }

        #if MICROPY_PY_SYS_SETTRACE
        EMIT_ARG(set_source_line, pns->source_line);
        #endif
        compile_load_id(comp, MP_QSTR___name__);
        compile_store_id(comp, MP_QSTR___module__);
        EMIT_ARG(load_const_str, MP_PARSE_NODE_LEAF_ARG(pns->nodes[0])); // 0 is class name
        compile_store_id(comp, MP_QSTR___qualname__);

        check_for_doc_string(comp, pns->nodes[2]);
        compile_node(comp, pns->nodes[2]); // 2 is class body

        id_info_t *id = scope_find(scope, MP_QSTR___class__);
        assert(id != NULL);
        if (id->kind == ID_INFO_KIND_LOCAL) {
            EMIT_ARG(load_const_tok, MP_TOKEN_KW_NONE);
        } else {
            EMIT_LOAD_FAST(MP_QSTR___class__, id->local_num);
        }
        EMIT(return_value);
    }

    EMIT(end_pass);

    // make sure we match all the exception levels
    assert(comp->cur_except_level == 0);
}

#if MICROPY_EMIT_INLINE_ASM
// requires 3 passes: SCOPE, CODE_SIZE, EMIT
STATIC void compile_scope_inline_asm(compiler_t *comp, scope_t *scope, pass_kind_t pass) {
    comp->pass = pass;
    comp->scope_cur = scope;
    comp->next_label = 0;

    if (scope->kind != SCOPE_FUNCTION) {
        compile_syntax_error(comp, MP_PARSE_NODE_NULL, MP_ERROR_TEXT("inline assembler must be a function"));
        return;
    }

    if (comp->pass > MP_PASS_SCOPE) {
        EMIT_INLINE_ASM_ARG(start_pass, comp->pass, &comp->compile_error);
    }

    // get the function definition parse node
    assert(MP_PARSE_NODE_IS_STRUCT(scope->pn));
    mp_parse_node_struct_t *pns = (mp_parse_node_struct_t *)scope->pn;
    assert(MP_PARSE_NODE_STRUCT_KIND(pns) == PN_funcdef);

    // qstr f_id = MP_PARSE_NODE_LEAF_ARG(pns->nodes[0]); // function name

    // parameters are in pns->nodes[1]
    if (comp->pass == MP_PASS_CODE_SIZE) {
        mp_parse_node_t *pn_params;
        size_t n_params = mp_parse_node_extract_list(&pns->nodes[1], PN_typedargslist, &pn_params);
        scope->num_pos_args = EMIT_INLINE_ASM_ARG(count_params, n_params, pn_params);
        if (comp->compile_error != MP_OBJ_NULL) {
            goto inline_asm_error;
        }
    }

    // pns->nodes[2] is function return annotation
    mp_uint_t type_sig = MP_NATIVE_TYPE_INT;
    mp_parse_node_t pn_annotation = pns->nodes[2];
    if (!MP_PARSE_NODE_IS_NULL(pn_annotation)) {
        // nodes[2] can be null or a test-expr
        if (MP_PARSE_NODE_IS_ID(pn_annotation)) {
            qstr ret_type = MP_PARSE_NODE_LEAF_ARG(pn_annotation);
            switch (ret_type) {
                case MP_QSTR_object:
                    type_sig = MP_NATIVE_TYPE_OBJ;
                    break;
                case MP_QSTR_bool:
                    type_sig = MP_NATIVE_TYPE_BOOL;
                    break;
                case MP_QSTR_int:
                    type_sig = MP_NATIVE_TYPE_INT;
                    break;
                case MP_QSTR_uint:
                    type_sig = MP_NATIVE_TYPE_UINT;
                    break;
                default:
                    compile_syntax_error(comp, pn_annotation, MP_ERROR_TEXT("unknown type"));
                    return;
            }
        } else {
            compile_syntax_error(comp, pn_annotation, MP_ERROR_TEXT("return annotation must be an identifier"));
        }
    }

    mp_parse_node_t pn_body = pns->nodes[3]; // body
    mp_parse_node_t *nodes;
    size_t num = mp_parse_node_extract_list(&pn_body, PN_suite_block_stmts, &nodes);

    for (size_t i = 0; i < num; i++) {
        assert(MP_PARSE_NODE_IS_STRUCT(nodes[i]));
        mp_parse_node_struct_t *pns2 = (mp_parse_node_struct_t *)nodes[i];
        if (MP_PARSE_NODE_STRUCT_KIND(pns2) == PN_pass_stmt) {
            // no instructions
            continue;
        } else if (MP_PARSE_NODE_STRUCT_KIND(pns2) != PN_expr_stmt) {
            // not an instruction; error
        not_an_instruction:
            compile_syntax_error(comp, nodes[i], MP_ERROR_TEXT("expecting an assembler instruction"));
            return;
        }

        // check structure of parse node
        assert(MP_PARSE_NODE_IS_STRUCT(pns2->nodes[0]));
        if (!MP_PARSE_NODE_IS_NULL(pns2->nodes[1])) {
            goto not_an_instruction;
        }
        pns2 = (mp_parse_node_struct_t *)pns2->nodes[0];
        if (MP_PARSE_NODE_STRUCT_KIND(pns2) != PN_atom_expr_normal) {
            goto not_an_instruction;
        }
        if (!MP_PARSE_NODE_IS_ID(pns2->nodes[0])) {
            goto not_an_instruction;
        }
        if (!MP_PARSE_NODE_IS_STRUCT_KIND(pns2->nodes[1], PN_trailer_paren)) {
            goto not_an_instruction;
        }

        // parse node looks like an instruction
        // get instruction name and args
        qstr op = MP_PARSE_NODE_LEAF_ARG(pns2->nodes[0]);
        pns2 = (mp_parse_node_struct_t *)pns2->nodes[1]; // PN_trailer_paren
        mp_parse_node_t *pn_arg;
        size_t n_args = mp_parse_node_extract_list(&pns2->nodes[0], PN_arglist, &pn_arg);

        // emit instructions
        if (op == MP_QSTR_label) {
            if (!(n_args == 1 && MP_PARSE_NODE_IS_ID(pn_arg[0]))) {
                compile_syntax_error(comp, nodes[i], MP_ERROR_TEXT("'label' requires 1 argument"));
                return;
            }
            uint lab = comp_next_label(comp);
            if (pass > MP_PASS_SCOPE) {
                if (!EMIT_INLINE_ASM_ARG(label, lab, MP_PARSE_NODE_LEAF_ARG(pn_arg[0]))) {
                    compile_syntax_error(comp, nodes[i], MP_ERROR_TEXT("label redefined"));
                    return;
                }
            }
        } else if (op == MP_QSTR_align) {
            if (!(n_args == 1 && MP_PARSE_NODE_IS_SMALL_INT(pn_arg[0]))) {
                compile_syntax_error(comp, nodes[i], MP_ERROR_TEXT("'align' requires 1 argument"));
                return;
            }
            if (pass > MP_PASS_SCOPE) {
                mp_asm_base_align((mp_asm_base_t *)comp->emit_inline_asm,
                    MP_PARSE_NODE_LEAF_SMALL_INT(pn_arg[0]));
            }
        } else if (op == MP_QSTR_data) {
            if (!(n_args >= 2 && MP_PARSE_NODE_IS_SMALL_INT(pn_arg[0]))) {
                compile_syntax_error(comp, nodes[i], MP_ERROR_TEXT("'data' requires at least 2 arguments"));
                return;
            }
            if (pass > MP_PASS_SCOPE) {
                mp_int_t bytesize = MP_PARSE_NODE_LEAF_SMALL_INT(pn_arg[0]);
                for (uint j = 1; j < n_args; j++) {
                    if (!MP_PARSE_NODE_IS_SMALL_INT(pn_arg[j])) {
                        compile_syntax_error(comp, nodes[i], MP_ERROR_TEXT("'data' requires integer arguments"));
                        return;
                    }
                    mp_asm_base_data((mp_asm_base_t *)comp->emit_inline_asm,
                        bytesize, MP_PARSE_NODE_LEAF_SMALL_INT(pn_arg[j]));
                }
            }
        } else {
            if (pass > MP_PASS_SCOPE) {
                EMIT_INLINE_ASM_ARG(op, op, n_args, pn_arg);
            }
        }

        if (comp->compile_error != MP_OBJ_NULL) {
            pns = pns2; // this is the parse node that had the error
            goto inline_asm_error;
        }
    }

    if (comp->pass > MP_PASS_SCOPE) {
        EMIT_INLINE_ASM_ARG(end_pass, type_sig);

        if (comp->pass == MP_PASS_EMIT) {
            void *f = mp_asm_base_get_code((mp_asm_base_t *)comp->emit_inline_asm);
            mp_emit_glue_assign_native(comp->scope_cur->raw_code, MP_CODE_NATIVE_ASM,
                f, mp_asm_base_get_code_size((mp_asm_base_t *)comp->emit_inline_asm),
py: Add support to save native, viper and asm code to .mpy files.
This commit adds support for saving and loading .mpy files that contain
native code (native, viper and inline-asm). A lot of the ground work was
already done for this in the form of removing pointers from generated
native code. The changes here are mainly to link in qstr values to the
native code, and change the format of .mpy files to contain native code
blocks (possibly mixed with bytecode).
A top-level summary:
- @micropython.native, @micropython.viper and @micropython.asm_thumb/
asm_xtensa are now allowed in .py files when compiling to .mpy, and they
work transparently to the user.
- Entire .py files can be compiled to native via mpy-cross -X emit=native
and for the most part the generated .mpy files should work the same as
their bytecode version.
- The .mpy file format is changed to 1) specify in the header if the file
contains native code and if so the architecture (eg x86, ARMV7M, Xtensa);
2) for each function block the kind of code is specified (bytecode,
native, viper, asm).
- When native code is loaded from a .mpy file the native code must be
modified (in place) to link qstr values in, just like bytecode (see
py/persistentcode.c:arch_link_qstr() function).
In addition, this now defines a public, native ABI for dynamically loadable
native code generated by other languages, like C.
                NULL,
                #if MICROPY_PERSISTENT_CODE_SAVE
                0,
                0, 0, NULL,
                #endif
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
    qst = DECODE_QSTR_VALUE;
is now (schematically):
    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
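The indirection described above can be sketched in plain C. This is only an illustration under stated assumptions, not the VM's actual code: `qstr_id_t`, `module_ctx_t`, `decode_qstr_index` and `decode_qstr` are hypothetical names, and the index is assumed to always fit in one byte (the message says indices are only "mostly" 1 byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a firmware-global qstr number. */
typedef uint16_t qstr_id_t;

/* Hypothetical per-module context: one table shared by every function
 * in the module, built in RAM when the module is imported. */
typedef struct {
    const qstr_id_t *qstr_table;
    size_t qstr_table_len;
} module_ctx_t;

/* Read a module-local index from the instruction stream and advance ip.
 * Assumption: the index is a single byte. */
static size_t decode_qstr_index(const uint8_t **ip) {
    return *(*ip)++;
}

/* Translate the module-local index into a global qstr number.
 * The bytecode is only read, never rewritten. */
static qstr_id_t decode_qstr(const module_ctx_t *ctx, const uint8_t **ip) {
    size_t idx = decode_qstr_index(ip);
    assert(idx < ctx->qstr_table_len);
    return ctx->qstr_table[idx];
}
```

Because only `qstr_table` depends on the running firmware, the byte array holding the instructions can stay in read-only memory.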
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
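One way to read the break-even point is as simple byte counting. The sketch below assumes that every use of a qstr costs 2 bytes in the old bytecode and 1 byte in the new, and that the new scheme adds one 2-byte table entry per distinct qstr. These costs are assumptions chosen to reproduce the stated threshold, not figures taken from the sources, and `bytes_saved` is a hypothetical helper.

```c
#include <assert.h>

/* Assumed costs in bytes (illustrative only). */
enum {
    OLD_BYTES_PER_USE = 2, /* qstr value embedded in the bytecode */
    NEW_BYTES_PER_USE = 1, /* index into the module's qstr table */
    TABLE_ENTRY_BYTES = 2  /* one table entry per distinct qstr */
};

/* Net bytes saved for one qstr that is used `uses` times.
 * A negative result means the new encoding is larger. */
static long bytes_saved(long uses) {
    assert(uses >= 1);
    long old_size = OLD_BYTES_PER_USE * uses;
    long new_size = NEW_BYTES_PER_USE * uses + TABLE_ENTRY_BYTES;
    return old_size - new_size;
}
```

Under these assumptions a qstr used once costs one extra byte, a qstr used twice breaks even, and every use beyond the second saves one byte.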
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
    diff of scores (higher is better)
    N=100 M=100               baseline -> this-commit     diff     diff% (error%)
    bm_chaos.py                 371.07 ->   357.39 :   -13.68 =  -3.687% (+/-0.02%)
    bm_fannkuch.py               78.72 ->    77.49 :    -1.23 =  -1.563% (+/-0.01%)
    bm_fft.py                  2591.73 ->  2539.28 :   -52.45 =  -2.024% (+/-0.00%)
    bm_float.py                6034.93 ->  5908.30 :  -126.63 =  -2.098% (+/-0.01%)
    bm_hexiom.py                 48.96 ->    47.93 :    -1.03 =  -2.104% (+/-0.00%)
    bm_nqueens.py              4510.63 ->  4459.94 :   -50.69 =  -1.124% (+/-0.00%)
    bm_pidigits.py              650.28 ->   644.96 :    -5.32 =  -0.818% (+/-0.23%)
    core_import_mpy_multi.py    564.77 ->   581.49 :   +16.72 =  +2.960% (+/-0.01%)
    core_import_mpy_single.py    68.67 ->    67.16 :    -1.51 =  -2.199% (+/-0.01%)
    core_qstr.py                 64.16 ->    64.12 :    -0.04 =  -0.062% (+/-0.00%)
    core_yield_from.py          362.58 ->   354.50 :    -8.08 =  -2.228% (+/-0.00%)
    misc_aes.py                 429.69 ->   405.59 :   -24.10 =  -5.609% (+/-0.01%)
    misc_mandel.py             3485.13 ->  3416.51 :   -68.62 =  -1.969% (+/-0.00%)
    misc_pystone.py            2496.53 ->  2405.56 :   -90.97 =  -3.644% (+/-0.01%)
    misc_raytrace.py            381.47 ->   374.01 :    -7.46 =  -1.956% (+/-0.01%)
    viper_call0.py              576.73 ->   572.49 :    -4.24 =  -0.735% (+/-0.04%)
    viper_call1a.py             550.37 ->   546.21 :    -4.16 =  -0.756% (+/-0.09%)
    viper_call1b.py             438.23 ->   435.68 :    -2.55 =  -0.582% (+/-0.06%)
    viper_call1c.py             442.84 ->   440.04 :    -2.80 =  -0.632% (+/-0.08%)
    viper_call2a.py             536.31 ->   532.35 :    -3.96 =  -0.738% (+/-0.06%)
    viper_call2b.py             382.34 ->   377.07 :    -5.27 =  -1.378% (+/-0.03%)
And for unix on x64:
    diff of scores (higher is better)
    N=2000 M=2000     baseline -> this-commit       diff     diff% (error%)
    bm_chaos.py        13594.20 ->   13073.84 :   -520.36 =  -3.828% (+/-5.44%)
    bm_fannkuch.py        60.63 ->      59.58 :     -1.05 =  -1.732% (+/-3.01%)
    bm_fft.py         112009.15 ->  111603.32 :   -405.83 =  -0.362% (+/-4.03%)
    bm_float.py       246202.55 ->  247923.81 :  +1721.26 =  +0.699% (+/-2.79%)
    bm_hexiom.py         615.65 ->     617.21 :     +1.56 =  +0.253% (+/-1.64%)
    bm_nqueens.py     215807.95 ->  215600.96 :   -206.99 =  -0.096% (+/-3.52%)
    bm_pidigits.py      8246.74 ->    8422.82 :   +176.08 =  +2.135% (+/-3.64%)
    misc_aes.py        16133.00 ->   16452.74 :   +319.74 =  +1.982% (+/-1.50%)
    misc_mandel.py    128146.69 ->  130796.43 :  +2649.74 =  +2.068% (+/-3.18%)
    misc_pystone.py    83811.49 ->   83124.85 :   -686.64 =  -0.819% (+/-1.03%)
    misc_raytrace.py   21688.02 ->   21385.10 :   -302.92 =  -1.397% (+/-3.20%)
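Each diff% value in the two tables above is the score change relative to the baseline score. A minimal sketch of that arithmetic, with `diff_percent` as a hypothetical helper name:

```c
#include <assert.h>

/* Relative change of a benchmark score, in percent of the baseline.
 * Negative values mean the score dropped. */
static double diff_percent(double baseline, double this_commit) {
    assert(baseline != 0.0);
    return (this_commit - baseline) / baseline * 100.0;
}
```

For example, `diff_percent(371.07, 357.39)` gives roughly -3.687, matching the stm32 row for bm_chaos.py.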
The code size change is (firmware with a lot of frozen code benefits the
most):
       bare-arm:  +396 +0.697%
    minimal x86: +1595 +0.979% [incl +32(data)]
       unix x64: +2408 +0.470% [incl +800(data)]
    unix nanbox: +1396 +0.309% [incl -96(data)]
          stm32: -1256 -0.318% PYBV10
         cc3200:  +288 +0.157%
        esp8266:  -260 -0.037% GENERIC
          esp32:  -216 -0.014% GENERIC[incl -1072(data)]
            nrf:  +116 +0.067% pca10040
            rp2:  -664 -0.135% PICO
           samd:  +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
|
|
|
0, comp->scope_cur->num_pos_args, type_sig);
|
2016-12-09 10:23:17 +00:00
|
|
|
}
|
2015-02-13 01:00:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
if (comp->compile_error != MP_OBJ_NULL) {
|
2015-07-29 23:16:01 +01:00
|
|
|
// inline assembler had an error; set line for its exception
|
2015-02-13 01:00:51 +00:00
|
|
|
inline_asm_error:
|
2015-07-29 23:16:01 +01:00
|
|
|
comp->compile_error_line = pns->source_line;
|
2013-10-05 13:37:10 +01:00
|
|
|
}
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2014-04-02 17:31:27 +01:00
|
|
|
#endif
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2021-10-22 12:22:47 +01:00
|
|
|
STATIC void scope_compute_things(scope_t *scope, mp_emit_common_t *emit_common) {
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython we put the *x parameter after all other parameters (except **y)
|
2014-04-27 15:50:52 +01:00
|
|
|
if (scope->scope_flags & MP_SCOPE_FLAG_VARARGS) {
|
|
|
|
id_info_t *id_param = NULL;
|
|
|
|
for (int i = scope->id_info_len - 1; i >= 0; i--) {
|
|
|
|
id_info_t *id = &scope->id_info[i];
|
|
|
|
if (id->flags & ID_FLAG_IS_STAR_PARAM) {
|
|
|
|
if (id_param != NULL) {
|
|
|
|
// swap star param with last param
|
|
|
|
id_info_t temp = *id_param;
|
|
|
|
*id_param = *id;
|
|
|
|
*id = temp;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
} else if (id_param == NULL && id->flags == ID_FLAG_IS_PARAM) {
|
|
|
|
id_param = id;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-10-04 19:53:11 +01:00
|
|
|
// in functions, turn implicit globals into explicit globals
|
2013-12-30 22:32:17 +00:00
|
|
|
// compute the index of each local
|
2013-10-04 19:53:11 +01:00
|
|
|
scope->num_locals = 0;
|
|
|
|
for (int i = 0; i < scope->id_info_len; i++) {
|
|
|
|
id_info_t *id = &scope->id_info[i];
|
2014-09-08 23:05:16 +01:00
|
|
|
if (scope->kind == SCOPE_CLASS && id->qst == MP_QSTR___class__) {
|
2013-10-04 19:53:11 +01:00
|
|
|
// __class__ is not counted as a local; if it's used then it becomes an ID_INFO_KIND_CELL
|
|
|
|
continue;
|
|
|
|
}
|
2016-09-30 03:34:05 +01:00
|
|
|
if (SCOPE_IS_FUNC_LIKE(scope->kind) && id->kind == ID_INFO_KIND_GLOBAL_IMPLICIT) {
|
2013-10-04 19:53:11 +01:00
|
|
|
id->kind = ID_INFO_KIND_GLOBAL_EXPLICIT;
|
|
|
|
}
|
py: Fix native functions so they run with their correct globals context.
Prior to this commit a function compiled with the native decorator
@micropython.native would not work correctly when accessing global
variables, because the globals dict was not being set upon function entry.
This commit fixes this problem by, upon function entry, setting as the
current globals dict the globals dict context the function was defined
within, as per normal Python semantics, and as bytecode does. Upon
function exit the original globals dict is restored.
In order to restore the globals dict when an exception is raised the native
function must guard its internals with an nlr_push/nlr_pop pair. Because
this push/pop is relatively expensive, in both C stack usage for the
nlr_buf_t and CPU execution time, the implementation here optimises things
as much as possible. First, the compiler keeps track of whether a function
even needs to access global variables. Using this information the native
emitter then generates three different kinds of code:
1. no globals used, no exception handlers: no nlr handling code and no
setting of the globals dict.
2. globals used, no exception handlers: an nlr_buf_t is allocated on the
C stack but it is not used if the globals dict is unchanged, saving
execution time because nlr_push/nlr_pop don't need to run.
3. function has exception handlers, may use globals: an nlr_buf_t is
allocated and nlr_push/nlr_pop are always called.
In the end, native functions that don't access globals and don't have
exception handlers will run more efficiently than those that do.
Fixes issue #1573.
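The three cases listed above amount to a small decision over two facts the compiler knows about a function. The following sketch is an illustration only: `nlr_strategy_t`, `choose_nlr_strategy` and `runs_nlr_push` are hypothetical names and are not part of the native emitter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three kinds of generated code. */
typedef enum {
    NLR_NONE,   /* case 1: no nlr code, globals dict never set */
    NLR_LAZY,   /* case 2: nlr_buf_t reserved, used only if globals change */
    NLR_ALWAYS  /* case 3: nlr_push/nlr_pop run on every call */
} nlr_strategy_t;

/* Pick the kind of code to generate for a native function. */
static nlr_strategy_t choose_nlr_strategy(bool refs_globals, bool has_exc_handlers) {
    if (has_exc_handlers) {
        return NLR_ALWAYS;
    }
    return refs_globals ? NLR_LAZY : NLR_NONE;
}

/* Whether a call actually pays for nlr_push/nlr_pop.
 * globals_differ: the function's globals dict is not the current one. */
static bool runs_nlr_push(nlr_strategy_t s, bool globals_differ) {
    switch (s) {
    case NLR_NONE:
        return false;
    case NLR_LAZY:
        return globals_differ;
    default:
        assert(s == NLR_ALWAYS);
        return true;
    }
}
```

The point of case 2 is visible in `runs_nlr_push`: the stack space is reserved up front, but the expensive push/pop is skipped when the function is called from the module in which it was defined.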
2018-09-13 13:03:48 +01:00
|
|
|
#if MICROPY_EMIT_NATIVE
|
|
|
|
if (id->kind == ID_INFO_KIND_GLOBAL_EXPLICIT) {
|
|
|
|
// This function makes a reference to a global variable
|
2018-09-15 13:37:07 +01:00
|
|
|
if (scope->emit_options == MP_EMIT_OPT_VIPER
|
|
|
|
&& mp_native_type_from_qstr(id->qst) >= MP_NATIVE_TYPE_INT) {
|
|
|
|
// A casting operator in viper mode, not a real global reference
|
|
|
|
} else {
|
|
|
|
scope->scope_flags |= MP_SCOPE_FLAG_REFGLOBALS;
|
|
|
|
}
|
2018-09-13 13:03:48 +01:00
|
|
|
}
|
|
|
|
#endif
|
2014-04-27 15:50:52 +01:00
|
|
|
// params always count for 1 local, even if they are a cell
|
2014-04-09 14:42:51 +01:00
|
|
|
if (id->kind == ID_INFO_KIND_LOCAL || (id->flags & ID_FLAG_IS_PARAM)) {
|
2014-04-27 15:50:52 +01:00
|
|
|
id->local_num = scope->num_locals++;
|
2013-12-11 00:41:43 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-08-14 12:24:11 +01:00
|
|
|
// compute the index of cell vars
|
2013-12-11 00:41:43 +00:00
|
|
|
for (int i = 0; i < scope->id_info_len; i++) {
|
|
|
|
id_info_t *id = &scope->id_info[i];
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython the cells come right after the fast locals
|
2013-12-30 22:32:17 +00:00
|
|
|
// parameters are not counted here, since they remain at the start
|
|
|
|
// of the locals, even if they are cell vars
|
2014-04-09 14:42:51 +01:00
|
|
|
if (id->kind == ID_INFO_KIND_CELL && !(id->flags & ID_FLAG_IS_PARAM)) {
|
2013-12-30 22:32:17 +00:00
|
|
|
id->local_num = scope->num_locals;
|
|
|
|
scope->num_locals += 1;
|
2013-12-11 00:41:43 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-08-14 12:24:11 +01:00
|
|
|
// compute the index of free vars
|
2013-12-11 00:41:43 +00:00
|
|
|
// make sure they are in the order of the parent scope
|
|
|
|
if (scope->parent != NULL) {
|
|
|
|
int num_free = 0;
|
|
|
|
for (int i = 0; i < scope->parent->id_info_len; i++) {
|
|
|
|
id_info_t *id = &scope->parent->id_info[i];
|
|
|
|
if (id->kind == ID_INFO_KIND_CELL || id->kind == ID_INFO_KIND_FREE) {
|
|
|
|
for (int j = 0; j < scope->id_info_len; j++) {
|
|
|
|
id_info_t *id2 = &scope->id_info[j];
|
2014-09-08 23:05:16 +01:00
|
|
|
if (id2->kind == ID_INFO_KIND_FREE && id->qst == id2->qst) {
|
2014-04-09 14:42:51 +01:00
|
|
|
assert(!(id2->flags & ID_FLAG_IS_PARAM)); // free vars should not be params
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython the frees come first, before the params
|
2013-12-30 22:32:17 +00:00
|
|
|
id2->local_num = num_free;
|
2013-12-11 00:41:43 +00:00
|
|
|
num_free += 1;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
2017-06-30 08:22:17 +01:00
|
|
|
// in MicroPython shift all other locals after the free locals
|
2013-12-30 22:32:17 +00:00
|
|
|
if (num_free > 0) {
|
|
|
|
for (int i = 0; i < scope->id_info_len; i++) {
|
|
|
|
id_info_t *id = &scope->id_info[i];
|
2014-04-09 15:26:46 +01:00
|
|
|
if (id->kind != ID_INFO_KIND_FREE || (id->flags & ID_FLAG_IS_PARAM)) {
|
2013-12-30 22:32:17 +00:00
|
|
|
id->local_num += num_free;
|
|
|
|
}
|
|
|
|
}
|
2014-04-27 15:50:52 +01:00
|
|
|
scope->num_pos_args += num_free; // free vars are counted as params for passing them into the function
|
2013-12-30 22:32:17 +00:00
|
|
|
scope->num_locals += num_free;
|
2021-10-22 12:22:47 +01:00
|
|
|
mp_emit_common_use_qstr(emit_common, MP_QSTR__star_);
|
2013-12-30 22:32:17 +00:00
|
|
|
}
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
}
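The slot layout that scope_compute_things produces (free variables first, then parameters and plain locals, then cells that are not parameters) can be shown with a simplified stand-alone model. This is a sketch under assumptions, not the compiler's code: `var_t`, `var_kind_t` and `assign_slots` are hypothetical, free variables are numbered in their own declaration order rather than the parent scope's order, and the `*x` parameter swap is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { KIND_LOCAL, KIND_CELL, KIND_FREE, KIND_GLOBAL } var_kind_t;

typedef struct {
    var_kind_t kind;
    bool is_param;
    int slot; /* assigned slot, or -1 if the variable has none */
} var_t;

/* Assign local slots and return the total number of slots used. */
static int assign_slots(var_t *vars, size_t n) {
    int next = 0;
    /* Free variables come first; they are never parameters. */
    for (size_t i = 0; i < n; i++) {
        vars[i].slot = -1;
        if (vars[i].kind == KIND_FREE) {
            assert(!vars[i].is_param);
            vars[i].slot = next++;
        }
    }
    /* Parameters and plain locals follow; a parameter takes a slot
     * here even if it is also a cell. */
    for (size_t i = 0; i < n; i++) {
        if (vars[i].kind == KIND_LOCAL || vars[i].is_param) {
            vars[i].slot = next++;
        }
    }
    /* Cells that are not parameters come last. */
    for (size_t i = 0; i < n; i++) {
        if (vars[i].kind == KIND_CELL && !vars[i].is_param) {
            vars[i].slot = next++;
        }
    }
    return next;
}
```

The real function reaches the same layout in a different order: it numbers locals and cells first and then shifts them up by the number of free variables.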
|
|
|
|
|
2015-11-13 13:38:28 +00:00
|
|
|
#if !MICROPY_PERSISTENT_CODE_SAVE
|
|
|
|
STATIC
|
|
|
|
#endif
|
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
|
|
|
mp_compiled_module_t mp_compile_to_raw_code(mp_parse_tree_t *parse_tree, qstr source_file, bool is_repl, mp_module_context_t *context) {
|
2015-09-24 13:15:57 +01:00
|
|
|
// put compiler state on the stack, it's relatively small
|
|
|
|
compiler_t comp_state = {0};
|
|
|
|
compiler_t *comp = &comp_state;
|
|
|
|
|
2013-10-18 19:58:12 +01:00
|
|
|
comp->is_repl = is_repl;
|
2017-06-22 06:05:58 +01:00
|
|
|
comp->break_label = INVALID_LABEL;
|
|
|
|
comp->continue_label = INVALID_LABEL;
|
2021-10-22 12:22:47 +01:00
|
|
|
mp_emit_common_init(&comp->emit_common, source_file);
|
2013-10-04 19:53:11 +01:00
|
|
|
|
2015-03-01 12:04:05 +00:00
|
|
|
// create the module scope
|
2019-08-23 02:20:50 +01:00
|
|
|
#if MICROPY_EMIT_NATIVE
|
|
|
|
const uint emit_opt = MP_STATE_VM(default_emit_opt);
|
|
|
|
#else
|
|
|
|
const uint emit_opt = MP_EMIT_OPT_NONE;
|
|
|
|
#endif
|
2015-09-23 10:50:43 +01:00
|
|
|
scope_t *module_scope = scope_new_and_link(comp, SCOPE_MODULE, parse_tree->root, emit_opt);
|
2015-03-01 12:04:05 +00:00
|
|
|
|
2015-03-26 15:49:53 +00:00
|
|
|
// create standard emitter; it's used at least for MP_PASS_SCOPE
|
2021-10-22 12:22:47 +01:00
|
|
|
emit_t *emit_bc = emit_bc_new(&comp->emit_common);
|
2015-03-26 15:49:53 +00:00
|
|
|
|
2021-10-22 12:22:47 +01:00
|
|
|
// compile MP_PASS_SCOPE
|
2015-03-26 15:49:53 +00:00
|
|
|
comp->emit = emit_bc;
|
2021-10-22 12:22:47 +01:00
|
|
|
comp->emit_bc = emit_bc;
|
2015-03-26 16:44:14 +00:00
|
|
|
#if MICROPY_EMIT_NATIVE
|
2015-03-26 15:49:53 +00:00
|
|
|
comp->emit_method_table = &emit_bc_method_table;
|
|
|
|
#endif
|
2013-10-05 23:17:28 +01:00
|
|
|
uint max_num_labels = 0;
|
2014-10-05 19:01:34 +01:00
|
|
|
for (scope_t *s = comp->scope_head; s != NULL && comp->compile_error == MP_OBJ_NULL; s = s->next) {
|
2016-12-09 02:17:49 +00:00
|
|
|
#if MICROPY_EMIT_INLINE_ASM
|
2019-05-05 04:29:43 +01:00
|
|
|
if (s->emit_options == MP_EMIT_OPT_ASM) {
|
2014-05-07 17:24:22 +01:00
|
|
|
compile_scope_inline_asm(comp, s, MP_PASS_SCOPE);
|
2019-05-05 04:29:43 +01:00
|
|
|
} else
|
2016-12-09 02:17:49 +00:00
|
|
|
#endif
|
2019-05-05 04:29:43 +01:00
|
|
|
{
|
2014-05-07 17:24:22 +01:00
|
|
|
compile_scope(comp, s, MP_PASS_SCOPE);
|
2018-10-26 06:48:07 +01:00
|
|
|
|
|
|
|
// Check if any implicitly declared variables should be closed over
|
|
|
|
for (size_t i = 0; i < s->id_info_len; ++i) {
|
|
|
|
id_info_t *id = &s->id_info[i];
|
|
|
|
if (id->kind == ID_INFO_KIND_GLOBAL_IMPLICIT) {
|
|
|
|
scope_check_to_close_over(s, id);
|
|
|
|
}
|
|
|
|
}
|
2013-10-05 23:17:28 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
// update maximum number of labels needed
|
|
|
|
if (comp->next_label > max_num_labels) {
|
|
|
|
max_num_labels = comp->next_label;
|
|
|
|
}
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2013-10-05 23:17:28 +01:00
|
|
|
// compute some things related to scope and identifiers
|
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
    qst = DECODE_QSTR_VALUE;
is now (schematically):
    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
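The difference between the two decoding schemes can be sketched in plain C. This is only an illustration and not the actual VM code: the names qstr_t, module_qstrs_t, decode_qstr_value and decode_qstr_via_table are hypothetical, and the index is assumed to fit in a single byte (the real bytecode may use a longer encoding for large indices).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for a global qstr number in the firmware.
typedef uint16_t qstr_t;

// Hypothetical per-module table, built in RAM at import time. It maps the
// module-local qstr index stored in the bytecode to the global qstr number.
typedef struct {
    const qstr_t *table;
    size_t len;
} module_qstrs_t;

// Old scheme (schematic): the bytecode stores the global qstr value itself,
// here assumed to be 2 little-endian bytes, so the bytecode has to be
// rewritten whenever the global numbering differs.
static qstr_t decode_qstr_value(const uint8_t **ip) {
    qstr_t q = (qstr_t)((*ip)[0] | ((*ip)[1] << 8));
    *ip += 2;
    return q;
}

// New scheme (schematic): the bytecode stores a module-local index, assumed
// here to be 1 byte, and the table lookup yields the global qstr. The
// bytecode stays constant; only the table is filled in at load time.
static qstr_t decode_qstr_via_table(const uint8_t **ip, const module_qstrs_t *m) {
    size_t idx = *(*ip)++;
    assert(idx < m->len);
    return m->table[idx];
}
```

With this layout the bytes pointed to by ip can live in read-only memory, and only the small table in module_qstrs_t has to be built per import.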
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
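The break-even point can be checked with a little arithmetic. The sketch below assumes a cost model that is not spelled out in this message: 2 bytes per qstr reference in the old bytecode, and in the new format 1 byte per reference plus one 2-byte table entry per distinct qstr. The helper names old_cost and new_cost are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Assumed old cost: every reference to a qstr stores a 2-byte value.
static size_t old_cost(size_t uses) {
    return 2 * uses;
}

// Assumed new cost: every reference stores a 1-byte index, plus a single
// 2-byte table entry shared by all references to the same qstr.
static size_t new_cost(size_t uses) {
    return uses + 2;
}
```

Under these assumptions a qstr used once costs 3 bytes instead of 2, a qstr used twice costs 4 bytes either way, and a qstr used three or more times is cheaper in the new format, which matches the "more than two times" threshold above.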
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
|
|
|
bool has_native_code = false;
|
2014-10-05 19:01:34 +01:00
|
|
|
for (scope_t *s = comp->scope_head; s != NULL && comp->compile_error == MP_OBJ_NULL; s = s->next) {
|
2021-10-22 12:22:47 +01:00
|
|
|
#if MICROPY_EMIT_NATIVE
|
|
|
|
if (s->emit_options == MP_EMIT_OPT_NATIVE_PYTHON || s->emit_options == MP_EMIT_OPT_VIPER) {
|
|
|
|
has_native_code = true;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
|
|
|
scope_compute_things(s, &comp->emit_common);
|
2013-10-04 19:53:11 +01:00
|
|
|
}
|
|
|
|
|
2015-03-26 15:49:53 +00:00
|
|
|
// set max number of labels now that it's calculated
|
|
|
|
emit_bc_set_max_num_labels(emit_bc, max_num_labels);
|
|
|
|
|
2021-10-22 12:22:47 +01:00
|
|
|
// finalise and allocate the constant table
|
|
|
|
mp_emit_common_finalise(&comp->emit_common, has_native_code);
|
|
|
|
|
|
|
|
// compile MP_PASS_STACK_SIZE, MP_PASS_CODE_SIZE, MP_PASS_EMIT
|
2014-01-04 13:55:24 +00:00
|
|
|
#if MICROPY_EMIT_NATIVE
|
2013-10-06 01:01:01 +01:00
|
|
|
emit_t *emit_native = NULL;
|
2013-10-12 14:30:21 +01:00
|
|
|
#endif
|
2014-10-05 19:01:34 +01:00
|
|
|
for (scope_t *s = comp->scope_head; s != NULL && comp->compile_error == MP_OBJ_NULL; s = s->next) {
|
2016-12-09 02:17:49 +00:00
|
|
|
#if MICROPY_EMIT_INLINE_ASM
|
2019-05-05 04:29:43 +01:00
|
|
|
if (s->emit_options == MP_EMIT_OPT_ASM) {
|
2016-12-09 02:17:49 +00:00
|
|
|
// inline assembly
|
|
|
|
if (comp->emit_inline_asm == NULL) {
|
|
|
|
comp->emit_inline_asm = ASM_EMITTER(new)(max_num_labels);
|
2013-10-05 23:17:28 +01:00
|
|
|
}
|
|
|
|
comp->emit = NULL;
|
2019-03-09 01:32:09 +00:00
|
|
|
comp->emit_inline_asm_method_table = ASM_EMITTER_TABLE;
|
2014-05-07 17:24:22 +01:00
|
|
|
compile_scope_inline_asm(comp, s, MP_PASS_CODE_SIZE);
|
2016-12-19 06:42:25 +00:00
|
|
|
#if MICROPY_EMIT_INLINE_XTENSA
|
|
|
|
// Xtensa requires an extra pass to compute size of l32r const table
|
|
|
|
// TODO this can be improved by calculating it during SCOPE pass
|
|
|
|
// but that requires some other structural changes to the asm emitters
|
2019-03-09 01:32:09 +00:00
|
|
|
#if MICROPY_DYNAMIC_COMPILER
|
|
|
|
if (mp_dynamic_compiler.native_arch == MP_NATIVE_ARCH_XTENSA)
|
|
|
|
#endif
|
|
|
|
{
|
|
|
|
compile_scope_inline_asm(comp, s, MP_PASS_CODE_SIZE);
|
|
|
|
}
|
2016-12-19 06:42:25 +00:00
|
|
|
#endif
|
2014-10-05 19:01:34 +01:00
|
|
|
if (comp->compile_error == MP_OBJ_NULL) {
|
2014-05-07 17:24:22 +01:00
|
|
|
compile_scope_inline_asm(comp, s, MP_PASS_EMIT);
|
2014-04-12 17:54:52 +01:00
|
|
|
}
|
2019-05-05 04:29:43 +01:00
|
|
|
} else
|
2016-12-09 02:17:49 +00:00
|
|
|
#endif
|
2019-05-05 04:29:43 +01:00
|
|
|
{
|
2013-10-12 14:30:21 +01:00
|
|
|
|
|
|
|
// choose the emit type
|
|
|
|
|
2013-10-05 23:17:28 +01:00
|
|
|
switch (s->emit_options) {
|
2014-01-04 13:55:24 +00:00
|
|
|
|
|
|
|
#if MICROPY_EMIT_NATIVE
|
2014-04-06 11:48:15 +01:00
|
|
|
case MP_EMIT_OPT_NATIVE_PYTHON:
|
|
|
|
case MP_EMIT_OPT_VIPER:
|
2014-09-06 23:06:36 +01:00
|
|
|
if (emit_native == NULL) {
|
2021-10-22 12:22:47 +01:00
                        emit_native = NATIVE_EMITTER(new)(&comp->emit_common, &comp->compile_error, &comp->next_label, max_num_labels);
2014-09-06 23:06:36 +01:00
                    }
2019-03-08 23:59:25 +00:00
                    comp->emit_method_table = NATIVE_EMITTER_TABLE;
2013-10-12 14:30:21 +01:00
                    comp->emit = emit_native;
2013-10-07 00:02:49 +01:00
                    break;
2014-01-04 13:55:24 +00:00
                #endif // MICROPY_EMIT_NATIVE
2013-10-07 00:02:49 +01:00

2013-10-05 23:17:28 +01:00
                default:
                    comp->emit = emit_bc;
2015-03-26 16:44:14 +00:00
                    #if MICROPY_EMIT_NATIVE
2013-10-05 23:17:28 +01:00
                    comp->emit_method_table = &emit_bc_method_table;
2015-03-26 16:44:14 +00:00
                    #endif
2013-10-05 23:17:28 +01:00
                    break;
            }
2013-10-12 14:30:21 +01:00

2015-01-14 00:20:28 +00:00
            // need a pass to compute stack size
            compile_scope(comp, s, MP_PASS_STACK_SIZE);

2014-05-07 17:24:22 +01:00
            // second last pass: compute code size
2014-10-05 19:01:34 +01:00
            if (comp->compile_error == MP_OBJ_NULL) {
2014-05-07 17:24:22 +01:00
                compile_scope(comp, s, MP_PASS_CODE_SIZE);
            }

            // final pass: emit code
2014-10-05 19:01:34 +01:00
            if (comp->compile_error == MP_OBJ_NULL) {
2014-05-07 17:24:22 +01:00
                compile_scope(comp, s, MP_PASS_EMIT);
2014-04-12 17:54:52 +01:00
            }
2013-10-05 18:08:26 +01:00
        }
2013-10-04 19:53:11 +01:00
    }

2015-07-29 23:16:01 +01:00
    if (comp->compile_error != MP_OBJ_NULL) {
        // if there is no line number for the error then use the line
        // number for the start of this scope
        compile_error_set_line(comp, comp->scope_cur->pn);
        // add a traceback to the exception using relevant source info
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, i.e. directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (i.e. not per function).
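As a rough illustration of the import-time step described above, the
following C sketch builds a per-module table that maps local qstr indices
to global qstr numbers. It is not MicroPython's implementation: the names
intern(), link_qstr_table(), pool and POOL_MAX are hypothetical, and the
linear-search pool merely stands in for the firmware's real qstr pool.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_MAX 16

// Hypothetical global intern pool standing in for the firmware's qstr pool.
static const char *pool[POOL_MAX];
static size_t pool_len;

// Return the global number for str, adding it to the pool if it is new.
// Returns UINT16_MAX if the (tiny, illustrative) pool is full.
static uint16_t intern(const char *str) {
    for (size_t i = 0; i < pool_len; i++) {
        if (strcmp(pool[i], str) == 0) {
            return (uint16_t)i;
        }
    }
    if (pool_len == POOL_MAX) {
        return UINT16_MAX;
    }
    pool[pool_len] = str; // the caller must keep str alive
    return (uint16_t)pool_len++;
}

// Build the per-module table: table[local index] = global qstr number.
// Only this table is written at import time; the bytecode is untouched.
static void link_qstr_table(const char *const *names, size_t n, uint16_t *table) {
    for (size_t i = 0; i < n; i++) {
        table[i] = intern(names[i]);
    }
}
```

In this sketch the strings named by the module are looked up once per
module, not once per use, which is why the work done at import time stays
small.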
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
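The schematic above can be expanded into a small self-contained C sketch.
The types global_qstr_t and module_context_t and the function decode_qstr()
are hypothetical names chosen for illustration, and the sketch assumes that
every index fits in one byte (the real encoding only "mostly" does).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical types for illustration; not MicroPython's definitions.
typedef uint16_t global_qstr_t;

typedef struct {
    const global_qstr_t *qstr_table; // built in RAM when the module is loaded
    size_t n_qstr;                   // number of entries in qstr_table
} module_context_t;

// Read a one-byte local index from read-only bytecode, advance the
// instruction pointer, and map the index to a global qstr number.
static global_qstr_t decode_qstr(const uint8_t **ip, const module_context_t *ctx) {
    size_t idx = *(*ip)++;
    assert(idx < ctx->n_qstr);
    return ctx->qstr_table[idx];
}
```

The bytecode is only ever read here, so it could live in flash/ROM; the
only per-import state is the table referenced by the context.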
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
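The break-even point can be checked with a small C sketch. The per-use
costs (2 bytes before, 1 byte after) come from the text above; the 2-byte
cost of a table entry is an assumption made here because it is consistent
with the "more than two times" threshold, and the names old_cost() and
new_cost() are hypothetical.

```c
#include <stddef.h>

// Costs in bytes. TABLE_ENTRY_BYTES is an assumption, not from the source.
enum { OLD_BYTES_PER_USE = 2, NEW_BYTES_PER_USE = 1, TABLE_ENTRY_BYTES = 2 };

// Bytes spent on one qstr that is referenced `uses` times, old scheme.
static size_t old_cost(size_t uses) {
    return uses * OLD_BYTES_PER_USE;
}

// Same, new scheme: one index byte per use plus one shared table entry.
static size_t new_cost(size_t uses) {
    return uses * NEW_BYTES_PER_USE + TABLE_ENTRY_BYTES;
}
```

Under these assumptions two uses cost the same in both schemes (4 bytes),
and every use beyond the second saves one byte.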
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100 baseline -> this-commit diff diff% (error%)
bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%)
bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%)
bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%)
bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%)
bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%)
bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%)
bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%)
core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%)
core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%)
core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%)
core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%)
misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%)
misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%)
misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%)
misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%)
viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%)
viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%)
viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%)
viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%)
viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%)
viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000 baseline -> this-commit diff diff% (error%)
bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%)
bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%)
bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%)
bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%)
bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%)
bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%)
bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%)
misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%)
misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%)
misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%)
misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%)
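The diff and diff% columns in the tables above are consistent with a plain
relative change against the baseline score. The C sketch below shows that
arithmetic; the function names are hypothetical and this is not the
benchmark tool's own code.

```c
// Absolute change in score (higher is better, so negative is a slowdown).
static double score_diff(double baseline, double current) {
    return current - baseline;
}

// Relative change in percent, measured against the baseline score.
static double score_diff_percent(double baseline, double current) {
    return (current - baseline) / baseline * 100.0;
}
```

For example, bm_chaos.py on stm32 goes from 371.07 to 357.39, giving a
diff of about -13.68 and a diff% of about -3.687%.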
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm: +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32: -1256 -0.318% PYBV10
cc3200: +288 +0.157%
esp8266: -260 -0.037% GENERIC
esp32: -216 -0.014% GENERIC[incl -1072(data)]
nrf: +116 +0.067% pca10040
rp2: -664 -0.135% PICO
samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
        mp_obj_exception_add_traceback(comp->compile_error, source_file,
2015-07-29 23:16:01 +01:00
            comp->compile_error_line, comp->scope_cur->simple_name);
    }

2021-10-22 12:22:47 +01:00
    // construct the global qstr/const table for this module
    mp_compiled_module_t cm;
    cm.rc = module_scope->raw_code;
    cm.context = context;
    #if MICROPY_PERSISTENT_CODE_SAVE
    cm.has_native = has_native_code;
    cm.n_qstr = comp->emit_common.qstr_map.used;
    cm.n_obj = comp->emit_common.ct_cur_obj;
    #endif
    if (comp->compile_error == MP_OBJ_NULL) {
        mp_emit_common_populate_module_context(&comp->emit_common, source_file, context);

        #if MICROPY_DEBUG_PRINTERS
        // now that the module context is valid, the raw codes can be printed
        if (mp_verbose_flag >= 2) {
            for (scope_t *s = comp->scope_head; s != NULL; s = s->next) {
                mp_raw_code_t *rc = s->raw_code;
                mp_bytecode_print(&mp_plat_print, rc, rc->fun_data, rc->fun_data_len, &cm.context->constants);
            }
        }
        #endif
    }

2014-01-24 22:42:28 +00:00
    // free the emitters
2015-03-26 15:49:53 +00:00

    emit_bc_free(emit_bc);
2014-01-24 22:42:28 +00:00
    #if MICROPY_EMIT_NATIVE
    if (emit_native != NULL) {
2016-12-07 00:17:17 +00:00
        NATIVE_EMITTER(free)(emit_native);
2014-01-24 22:42:28 +00:00
    }
    #endif
2016-12-09 02:17:49 +00:00
    #if MICROPY_EMIT_INLINE_ASM
    if (comp->emit_inline_asm != NULL) {
        ASM_EMITTER(free)(comp->emit_inline_asm);
2014-01-24 22:42:28 +00:00
    }
2016-12-09 02:17:49 +00:00
    #endif
2014-01-24 22:42:28 +00:00

2014-09-23 16:31:56 +01:00
    // free the parse tree
2015-09-23 10:50:43 +01:00
    mp_parse_tree_clear(parse_tree);
2014-09-23 16:31:56 +01:00

2014-01-24 22:42:28 +00:00
    // free the scopes
2014-01-23 21:05:47 +00:00
    for (scope_t *s = module_scope; s;) {
        scope_t *next = s->next;
        scope_free(s);
        s = next;
    }
2013-10-18 19:58:12 +01:00

2015-09-24 13:15:57 +01:00
    if (comp->compile_error != MP_OBJ_NULL) {
        nlr_raise(comp->compile_error);
2014-01-03 14:22:03 +00:00
    }
2021-10-22 12:22:47 +01:00

    return cm;
2013-10-04 19:53:11 +01:00
}
2015-11-13 13:38:28 +00:00

2019-08-23 02:20:50 +01:00
mp_obj_t mp_compile(mp_parse_tree_t *parse_tree, qstr source_file, bool is_repl) {
2021-10-22 12:22:47 +01:00
    mp_module_context_t *context = m_new_obj(mp_module_context_t);
    context->module.globals = mp_globals_get();
    mp_compiled_module_t cm = mp_compile_to_raw_code(parse_tree, source_file, is_repl, context);
2015-11-13 13:38:28 +00:00
    // return function that executes the outer module
py: Rework bytecode and .mpy file format to be mostly static data.
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using .mpy files over .py files is that they are
faster to import and take less memory when importing. They are also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, i.e. directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (i.e. not per function).
In more detail, in the VM what used to be (schematically):
    qst = DECODE_QSTR_VALUE;
is now (schematically):
    idx = DECODE_QSTR_INDEX;
    qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
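The lookup described above can be sketched in C as follows. This is only an
illustration and not the actual VM code: the names global_qstr_t, decode_uint
and decode_qstr are hypothetical, and the variable-length integer encoding
(7 bits per byte, high bit set on every byte except the last) is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for a firmware-wide (global) qstr number.
typedef uint16_t global_qstr_t;

// Decode an unsigned integer stored 7 bits per byte, most significant
// group first, with the high bit set on every byte except the last.
// ASSUMPTION: this encoding is chosen for the sketch only.
static size_t decode_uint(const uint8_t **ip) {
    size_t value = 0;
    uint8_t byte;
    do {
        byte = *(*ip)++;
        value = (value << 7) | (byte & 0x7f);
    } while (byte & 0x80);
    return value;
}

// The bytecode holds a module-local index. The per-module table, built in
// RAM at import time, maps that index to the global qstr number, so the
// bytecode itself never needs rewriting.
static global_qstr_t decode_qstr(const uint8_t **ip,
                                 const global_qstr_t *qstr_table,
                                 size_t table_len) {
    size_t idx = decode_uint(ip);
    assert(idx < table_len); // a well-formed module never indexes past its table
    return qstr_table[idx];
}
```

With a table of {417, 52, 903}, the one-byte operand 0x02 resolves to qstr
903. Only the table contents differ between firmware builds, which is what
lets the bytecode stay in flash/ROM.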
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
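The break-even point can be checked with a little arithmetic. The sketch
below assumes that each qstr costs one 2-byte table entry in the new scheme
and that every reference fits in a 1-byte index. The entry size is inferred
from the stated break-even of two uses and is not taken from the file format
itself. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define QSTR_VALUE_BYTES 2        // old scheme: qstr value embedded in the bytecode
#define QSTR_INDEX_BYTES 1        // new scheme: index in the bytecode (the common case)
#define QSTR_TABLE_ENTRY_BYTES 2  // ASSUMPTION: one table entry per distinct qstr

// Bytes spent on one qstr referenced `uses` times, old scheme.
static size_t bytes_direct(size_t uses) {
    return QSTR_VALUE_BYTES * uses;
}

// Bytes spent on one qstr referenced `uses` times, new scheme.
static size_t bytes_indirect(size_t uses) {
    assert(uses > 0); // a qstr that is never referenced needs no table entry
    return QSTR_INDEX_BYTES * uses + QSTR_TABLE_ENTRY_BYTES;
}
```

Under these assumptions one use costs 2 bytes directly versus 3 indirectly,
two uses cost 4 bytes either way, and three uses cost 6 versus 5, so the
indirect scheme is smaller from the third use onwards.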
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100               baseline -> this-commit       diff      diff% (error%)
bm_chaos.py                 371.07 ->      357.39 :   -13.68 =  -3.687% (+/-0.02%)
bm_fannkuch.py               78.72 ->       77.49 :    -1.23 =  -1.563% (+/-0.01%)
bm_fft.py                  2591.73 ->     2539.28 :   -52.45 =  -2.024% (+/-0.00%)
bm_float.py                6034.93 ->     5908.30 :  -126.63 =  -2.098% (+/-0.01%)
bm_hexiom.py                 48.96 ->       47.93 :    -1.03 =  -2.104% (+/-0.00%)
bm_nqueens.py              4510.63 ->     4459.94 :   -50.69 =  -1.124% (+/-0.00%)
bm_pidigits.py              650.28 ->      644.96 :    -5.32 =  -0.818% (+/-0.23%)
core_import_mpy_multi.py    564.77 ->      581.49 :   +16.72 =  +2.960% (+/-0.01%)
core_import_mpy_single.py    68.67 ->       67.16 :    -1.51 =  -2.199% (+/-0.01%)
core_qstr.py                 64.16 ->       64.12 :    -0.04 =  -0.062% (+/-0.00%)
core_yield_from.py          362.58 ->      354.50 :    -8.08 =  -2.228% (+/-0.00%)
misc_aes.py                 429.69 ->      405.59 :   -24.10 =  -5.609% (+/-0.01%)
misc_mandel.py             3485.13 ->     3416.51 :   -68.62 =  -1.969% (+/-0.00%)
misc_pystone.py            2496.53 ->     2405.56 :   -90.97 =  -3.644% (+/-0.01%)
misc_raytrace.py            381.47 ->      374.01 :    -7.46 =  -1.956% (+/-0.01%)
viper_call0.py              576.73 ->      572.49 :    -4.24 =  -0.735% (+/-0.04%)
viper_call1a.py             550.37 ->      546.21 :    -4.16 =  -0.756% (+/-0.09%)
viper_call1b.py             438.23 ->      435.68 :    -2.55 =  -0.582% (+/-0.06%)
viper_call1c.py             442.84 ->      440.04 :    -2.80 =  -0.632% (+/-0.08%)
viper_call2a.py             536.31 ->      532.35 :    -3.96 =  -0.738% (+/-0.06%)
viper_call2b.py             382.34 ->      377.07 :    -5.27 =  -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000       baseline -> this-commit        diff      diff% (error%)
bm_chaos.py         13594.20 ->    13073.84 :   -520.36 =  -3.828% (+/-5.44%)
bm_fannkuch.py         60.63 ->       59.58 :     -1.05 =  -1.732% (+/-3.01%)
bm_fft.py          112009.15 ->   111603.32 :   -405.83 =  -0.362% (+/-4.03%)
bm_float.py        246202.55 ->   247923.81 :  +1721.26 =  +0.699% (+/-2.79%)
bm_hexiom.py          615.65 ->      617.21 :     +1.56 =  +0.253% (+/-1.64%)
bm_nqueens.py      215807.95 ->   215600.96 :   -206.99 =  -0.096% (+/-3.52%)
bm_pidigits.py       8246.74 ->     8422.82 :   +176.08 =  +2.135% (+/-3.64%)
misc_aes.py         16133.00 ->    16452.74 :   +319.74 =  +1.982% (+/-1.50%)
misc_mandel.py     128146.69 ->   130796.43 :  +2649.74 =  +2.068% (+/-3.18%)
misc_pystone.py     83811.49 ->    83124.85 :   -686.64 =  -0.819% (+/-1.03%)
misc_raytrace.py    21688.02 ->    21385.10 :   -302.92 =  -1.397% (+/-3.20%)
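For reference, the diff and diff% columns in the two tables above follow
directly from the baseline and this-commit scores. A minimal sketch of that
arithmetic, with a hypothetical helper name and struct (this is not the
benchmark tool's own code):

```c
#include <assert.h>

// Hypothetical result type holding the two derived columns.
typedef struct {
    double diff;          // this-commit score minus baseline score
    double diff_percent;  // diff expressed as a percentage of the baseline
} score_change_t;

// Scores are "higher is better", so a negative diff means a slowdown.
static score_change_t score_change(double baseline, double this_commit) {
    assert(baseline != 0.0); // the percentage is undefined for a zero baseline
    score_change_t change;
    change.diff = this_commit - baseline;
    change.diff_percent = 100.0 * change.diff / baseline;
    return change;
}
```

For the stm32 bm_chaos.py row, score_change(371.07, 357.39) gives a diff of
about -13.68 and a diff% of about -3.687, matching the table.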
The code size change is (firmware with a lot of frozen code benefits the
most):
   bare-arm:  +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
   unix x64: +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
      stm32: -1256 -0.318% PYBV10
     cc3200:  +288 +0.157%
    esp8266:  -260 -0.037% GENERIC
      esp32:  -216 -0.014% GENERIC [incl -1072(data)]
        nrf:  +116 +0.067% pca10040
        rp2:  -664 -0.135% PICO
       samd:  +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. VM performance drops by a few percent on stm32 and is
mostly within measurement error on unix x64. Eventually it will be possible
to store such .mpy files in a linear, read-only, memory-mappable filesystem
so they can be executed from flash/ROM. This will essentially be able to
replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-22 12:22:47 +01:00
return mp_make_function_from_raw_code(cm.rc, cm.context, NULL);
2015-11-13 13:38:28 +00:00
}
2015-12-18 12:35:44 +00:00
#endif // MICROPY_ENABLE_COMPILER