Commit Graph

320 Commits

Author SHA1 Message Date
Damien George 07caf4f969 py/emit: Remove need to call set_native_type to set viper return type.
Instead this return type is now stored in the scope_flags.
2018-09-15 12:41:25 +10:00
Damien George 1d7c221b30 py/emit: Remove need to call set_native_type to set native/viper mode.
The native emitter can easily determine the mode via scope->emit_options.
2018-09-15 12:17:14 +10:00
Damien George 4f3d9429b5 py: Fix native functions so they run with their correct globals context.
Prior to this commit a function compiled with the native decorator
@micropython.native would not work correctly when accessing global
variables, because the globals dict was not being set upon function entry.

This commit fixes this problem by, upon function entry, setting as the
current globals dict the globals dict context the function was defined
within, as per normal Python semantics, and as bytecode does.  Upon
function exit the original globals dict is restored.

In order to restore the globals dict when an exception is raised the native
function must guard its internals with an nlr_push/nlr_pop pair.  Because
this push/pop is relatively expensive, in both C stack usage for the
nlr_buf_t and CPU execution time, the implementation here optimises things
as much as possible.  First, the compiler keeps track of whether a function
even needs to access global variables.  Using this information the native
emitter then generates three different kinds of code:

1. no globals used, no exception handlers: no nlr handling code and no
   setting of the globals dict.

2. globals used, no exception handlers: an nlr_buf_t is allocated on the
   C stack but it is not used if the globals dict is unchanged, saving
   execution time because nlr_push/nlr_pop don't need to run.

3. function has exception handlers, may use globals: an nlr_buf_t is
   allocated and nlr_push/nlr_pop are always called.

In the end, native functions that don't access globals and don't have
exception handlers will run more efficiently than those that do.

Fixes issue #1573.
2018-09-13 22:47:20 +10:00
Damien George 8014e7f15f py/compile: Factor code that compiles start/end of exception handler. 2018-09-04 16:06:22 +10:00
Damien George 4ae7111573 py/emitnative: Add support for return/break/continue in try and with.
This patch adds full support for unwinding jumps to the native emitter.
This means that return/break/continue can be used in try-except,
try-finally and with statements.  For code that doesn't use unwinding jumps
there is almost no overhead added to the generated code.
2018-09-04 14:31:28 +10:00
Damien George a3de776486 py/emitnative: Optimise and improve exception handling in native code.
Prior to this patch, native code would use a full nlr_buf_t for each
exception handler (try-except, try-finally, with).  For nested exception
handlers this would use a lot of C stack and be rather inefficient.

This patch changes how exceptions are handled in native code by setting up
only a single nlr_buf_t context for the entire function, and then manages a
state machine (using the PC) to work out which exception handler to run
when an exception is raised by an nlr_jump.  This keeps the C stack usage
at a constant level regardless of the depth of Python exception blocks.

The patch also fixes an existing bug when local variables are written to
within an exception handler, then their value was incorrectly restored if
an exception was raised (since the nlr_jump would restore register values,
back to the point of the nlr_push).

And it also gets nested try-finally+with working with the viper emitter.

Broadly speaking, efficiency of executing native code that doesn't use
any exception blocks is unchanged, and emitted code size is only slightly
increased for such function.  C stack usage of all native functions is
either equal or less than before.  Emitted code size for native functions
that use exception blocks is increased by roughly 10% (due in part to
fixing of above-mentioned bugs).

But, most importantly, this patch allows to implement more Python features
in native code, like unwind jumps and yielding from within nested exception
blocks.
2018-08-16 13:56:36 +10:00
Damien George cbec17f2cd py/compile: For dynamic compiler, widen literal 1 to get correct shift.
Without this patch, on 64-bit architectures the "1 << (small_int_bits - 1)"
is computed using only 32-bit values (since small_int_bits is a uint8_t)
and so will overflow (and give the wrong result) if small_int_bits is
larger than 32.
2018-08-13 23:34:47 +10:00
Damien George d8dc918deb py/compile: Handle return/break/continue correctly in async with.
Before this patch the context manager's __aexit__() method would not be
executed if a return/break/continue statement was used to exit an async
with block.  async with now has the same semantics as normal with.

The fix here applies purely to the compiler, and does not modify the
runtime at all. It might (eventually) be better to define new bytecode(s)
to handle async with (and maybe other async constructs) in a cleaner, more
efficient way.

One minor drawback with addressing this issue purely in the compiler is
that it wasn't possible to get 100% CPython semantics.  The thing that is
different here to CPython is that the __aexit__ method is not looked up in
the context manager until it is needed, which is after the body of the
async with statement has executed.  So if a context manager doesn't have
__aexit__ then CPython raises an exception before the async with is
executed, whereas uPy will raise it after it is executed.  Note that
__aenter__ is looked up at the beginning in uPy because it needs to be
called straightaway, so if the context manager isn't a context manager then
it'll still raise an exception at the same location as CPython.  The only
difference is if the context manager has the __aenter__ method but not the
__aexit__ method, then in that case uPy has different behaviour.  But this
is a very minor, and acceptable, difference.
2018-06-27 16:57:42 +10:00
Damien George 25ae98f07c py/compile: Combine expr, xor_expr and and_expr into one function.
This and the previous 4 commits combined have change in code size of:

   bare-arm:   -92
minimal x86:  -544
   unix x64:  -544
unix nanbox:  -712
      stm32:  -116
     cc3200:  -128
    esp8266:  -348
      esp32:  -232
2018-06-22 17:00:29 +10:00
Damien George 36e474e83f py/compile: Combine or_test and and_test compile functions. 2018-06-22 17:00:29 +10:00
Damien George 1a7109d65a py/compile: Combine global and nonlocal statement compile functions. 2018-06-22 17:00:29 +10:00
Damien George d23bec3fc8 py/compile: Combine subscript_2 and subscript_3 into one function. 2018-06-22 17:00:29 +10:00
Damien George c149197928 py/compile: Combine break and continue compile functions. 2018-06-22 17:00:29 +10:00
Damien George 18e6358480 py/emit: Combine setup with/except/finally into one emit function.
This patch reduces code size by:

   bare-arm:   -16
minimal x86:  -156
   unix x64:  -288
unix nanbox:  -184
      stm32:   -48
     cc3200:   -16
    esp8266:   -96
      esp32:   -16

The last 10 patches combined reduce code size by:

   bare-arm:  -164
minimal x86: -1260
   unix x64: -3416
unix nanbox: -1616
      stm32:  -676
     cc3200:  -232
    esp8266: -1144
      esp32:  -268
2018-05-23 00:35:16 +10:00
Damien George 436e0d4c54 py/emit: Merge build set/slice into existing build emit function.
Reduces code size by:

   bare-arm:    +0
minimal x86:    +0
   unix x64:  -368
unix nanbox:  -248
      stm32:  -128
     cc3200:   -48
    esp8266:  -184
      esp32:   -40
2018-05-23 00:23:36 +10:00
Damien George d97906ca9a py/emit: Combine import from/name/star into one emit function.
Change in code size is:

   bare-arm:    +4
minimal x86:   -88
   unix x64:  -456
unix nanbox:   -88
      stm32:   -44
     cc3200:    +0
    esp8266:  -104
      esp32:    +8
2018-05-23 00:23:08 +10:00
Damien George 8a513da5a5 py/emit: Combine break_loop and continue_loop into one emit function.
Reduces code size by:

   bare-arm:    +0
minimal x86:    +0
   unix x64:   -80
unix nanbox:    +0
      stm32:   -12
     cc3200:    +0
    esp8266:   -28
      esp32:    +0
2018-05-23 00:23:04 +10:00
Damien George 6211d979ee py/emit: Combine load/store/delete attr into one emit function.
Reduces code size by:

   bare-arm:   -20
minimal x86:  -140
   unix x64:  -408
unix nanbox:  -140
      stm32:   -68
     cc3200:   -16
    esp8266:   -80
      esp32:   -32
2018-05-23 00:22:59 +10:00
Damien George a4941a8ba4 py/emit: Combine load/store/delete subscr into one emit function.
Reduces code size by:

   bare-arm:    -8
minimal x86:  -104
   unix x64:  -312
unix nanbox:  -120
      stm32:   -60
     cc3200:   -16
    esp8266:   -92
      esp32:   -24
2018-05-23 00:22:55 +10:00
Damien George d298013939 py/emit: Combine name and global into one func for load/store/delete.
Reduces code size by:

   bare-arm:   -56
minimal x86:  -300
   unix x64:  -576
unix nanbox:  -300
      stm32:  -164
     cc3200:   -56
    esp8266:  -236
      esp32:   -76
2018-05-23 00:22:47 +10:00
Damien George 26b5754092 py/emit: Combine build tuple/list/map emit funcs into one.
Reduces code size by:

   bare-arm:   -24
minimal x86:  -192
   unix x64:  -288
unix nanbox:  -184
      stm32:   -72
     cc3200:   -16
    esp8266:  -148
      esp32:   -32
2018-05-23 00:22:44 +10:00
Damien George e686c94052 py/emit: Combine yield value and yield-from emit funcs into one.
Reduces code size by:

   bare-arm:   -24
minimal x86:   -72
   unix x64:  -200
unix nanbox:   -72
      stm32:   -52
     cc3200:   -32
    esp8266:   -84
      esp32:   -24
2018-05-23 00:22:35 +10:00
Damien George 0a25fff956 py/emit: Combine fast and deref into one function for load/store/delete.
Reduces code size by:

   bare-arm:   -16
minimal x86:  -208
   unix x64:  -408
unix nanbox:  -248
      stm32:   -12
     cc3200:   -24
    esp8266:   -96
      esp32:   -44
2018-05-23 00:22:20 +10:00
Damien George 828ce16dc8 py/compile: Change comment about ITER_BUF_NSLOTS to a static assertion. 2018-05-18 23:31:00 +10:00
Damien George 7dfa56e40e py/compile: Adjust c_assign_atom_expr() to use return instead of goto.
Makes the flow of the function a little more obvious, and allows to reach
100% coverage of compile.c when using gcov.
2018-02-24 23:03:17 +11:00
Damien George 253f2bd7be py/compile: Combine compiler-opt of 2 and 3 tuple-to-tuple assignment.
This patch combines the compiler optimisation code for double and triple
tuple-to-tuple assignment, taking it from two separate if-blocks to one
combined if-block.  This can be done because the code for both of these
optimisations has a lot in common.  Combining them together reduces code
size for ports that have the triple-tuple optimisation enabled (and doesn't
change code size for ports that have it disabled).
2018-02-04 13:35:21 +11:00
Damien George 1e5a33df41 py: Convert all uses of alloca() to use new scoped allocation API. 2017-12-11 13:49:09 +11:00
Damien George 487dbdb267 py/compile: Use alloca instead of qstr_build when compiling import name.
The technique of using alloca is how dotted import names are composed in
mp_import_from and mp_builtin___import__, so use the same technique in the
compiler.  This puts less pressure on the heap (only the stack is used if
the qstr already exists, and if it doesn't exist then the standard qstr
block memory is used for the new qstr rather than a separate chunk of the
heap) and reduces overall code size.
2017-11-01 13:16:16 +11:00
Damien George ad6aae13a4 py/compile: Remove unused pn_colon code when compiling func params. 2017-08-21 22:00:34 +10:00
Alexander Steffen 55f33240f3 all: Use the name MicroPython consistently in comments
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Damien George 0291a624cb py/compile: Fix enum variable declaration. 2017-07-09 13:18:14 +10:00
Krzysztof Blazewicz 91a385db98 py/compile: Use switch-case to match token and operator.
Reduces code size.
2017-07-05 15:50:36 +10:00
Krzysztof Blazewicz a040fb89e7 py/compile: Combine arith and bit-shift ops into 1 compile routine.
This refactoring saves code space.
2017-07-05 15:49:00 +10:00
Damien George d94bc675e8 py/compile: Optimise emitter label indices to save a word of heap.
Previous to this patch, a label with value "0" was used to indicate an
invalid label, but that meant a wasted word (at slot 0) in the array of
label offsets.  This patch adjusts the label indices so the first one
starts at 0, and the maximum value indicates an invalid label.
2017-06-22 15:05:58 +10:00
Damien George 4c5f108321 py/compile: Fix bug with break/continue in else of optimised for-range.
This patch fixes a bug whereby the Python stack was not correctly reset if
there was a break/continue statement in the else black of an optimised
for-range loop.

For example, in the following code the "j" variable from the inner for loop
was not being popped off the Python stack:

    for i in range(4):
        for j in range(4):
            pass
        else:
            continue

This is now fixed with this patch.
2017-06-22 13:50:33 +10:00
Damien George 1e70fda69f py/compile: Raise SyntaxError if positional args are given after */**.
In CPython 3.4 this raises a SyntaxError.  In CPython 3.5+ having a
positional after * is allowed but uPy has the wrong semantics and passes
the arguments in the incorrect order.  To prevent incorrect use of a
function going unnoticed it is important to raise the SyntaxError in uPy,
until the behaviour is fixed to follow CPython 3.5+.
2017-06-14 18:18:01 +10:00
Ville Skyttä ca16c38210 various: Spelling fixes 2017-05-29 11:36:05 +03:00
Damien George dd11af209d py: Add LOAD_SUPER_METHOD bytecode to allow heap-free super meth calls.
This patch allows the following code to run without allocating on the heap:

    super().foo(...)

Before this patch such a call would allocate a super object on the heap and
then load the foo method and call it right away.  The super object is only
needed to perform the lookup of the method and not needed after that.  This
patch makes an optimisation to allocate the super object on the C stack and
discard it right after use.

Changes in code size due to this patch are:

   bare-arm: +128
    minimal: +232
   unix x64: +416
unix nanbox: +364
     stmhal: +184
    esp8266: +340
     cc3200: +128
2017-04-22 23:39:20 +10:00
Damien George 5335942b59 py/compile: Refactor handling of special super() call.
This patch refactors the handling of the special super() call within the
compiler.  It removes the need for a global (to the compiler) state variable
which keeps track of whether the subject of an expression is super.  The
handling of super() is now done entirely within one function, which makes
the compiler a bit cleaner and allows to easily add more optimisations to
super calls.

Changes to the code size are:

   bare-arm: +12
    minimal:  +0
   unix x64: +48
unix nanbox: -16
     stmhal:  +4
     cc3200:  +0
    esp8266: -56
2017-04-22 21:46:32 +10:00
Damien George 0dd6a59c89 py/compile: Don't do unnecessary check if iter parse node is a struct.
If we get to this point in the code then pn_iter is guaranteed to be a
struct.
2017-04-22 21:43:42 +10:00
Damien George ae54fbf166 py/compile: Add COMP_RETURN_IF_EXPR option to enable return-if-else opt.
With this optimisation enabled the compiler optimises the if-else
expression within a return statement.  The optimisation reduces bytecode
size by 2 bytes for each use of such a return-if-else statement.  Since
such a statement is not often used, and costs bytes for the code, the
feature is disabled by default.

For example the following code:

    def f(x):
        return 1 if x else 2

compiles to this bytecode with the optimisation disabled (left column is
bytecode offset in bytes):

    00 LOAD_FAST 0
    01 POP_JUMP_IF_FALSE 8
    04 LOAD_CONST_SMALL_INT 1
    05 JUMP 9
    08 LOAD_CONST_SMALL_INT 2
    09 RETURN_VALUE

and to this bytecode with the optimisation enabled:

    00 LOAD_FAST 0
    01 POP_JUMP_IF_FALSE 6
    04 LOAD_CONST_SMALL_INT 1
    05 RETURN_VALUE
    06 LOAD_CONST_SMALL_INT 2
    07 RETURN_VALUE

So the JUMP to RETURN_VALUE is optimised and replaced by RETURN_VALUE,
saving 2 bytes and making the code a bit faster.
2017-04-22 14:58:01 +10:00
Damien George 40b40ffc98 py/compile: Extract parse-node kind at start of func for efficiency.
Otherwise the type of parse-node and its kind has to be re-extracted
multiple times.  This optimisation reduces code size by a bit (16 bytes on
bare-arm).
2017-04-22 14:23:47 +10:00
Damien George fa03bbf0fd py/compile: Don't do unnecessary check if parse node is a struct.
PN_atom_expr_normal parse nodes always have structs for their second
sub-node, so simplify the check for the sub-node kind to save code size.
2017-04-22 14:13:37 +10:00
Damien George de9b53695d py: Raise a ValueError if range() step is zero.
Following CPython.  Otherwise one gets either an infinite loop (if code is
optimised by the uPy compiler) or possibly a divide-by-zero CPU exception.
2017-04-05 10:50:26 +10:00
Damien George f9b0e644e5 py/compile: Provide terse error message for invalid dict/set literals. 2017-03-29 12:44:27 +11:00
Damien George 18c059febf py: Shorten a couple of error messages. 2017-03-29 12:36:46 +11:00
Damien George f55a059e7a py/compile: Simplify syntax-error messages for illegal assignments.
With this patch all illegal assignments are reported as "can't assign to
expression".  Before the patch there were special cases for a literal on
the LHS, and for augmented assignments (eg +=), but it seems a waste of
bytes (and there are lots of bytes used in error messages) to spend on
distinguishing such errors which a user will rarely encounter.
2017-03-29 12:28:33 +11:00
Damien George 40c1272e55 py/compile: When compiling super(), handle closed-over self variable.
The self variable may be closed-over in the function, and in that case the
call to super() should load the contents of the closure cell using
LOAD_DEREF (before this patch it would just load the cell directly).
2017-03-27 11:27:08 +11:00
Damien George 60656eaea4 py: Define and use MP_OBJ_ITER_BUF_NSLOTS to get size of stack iter buf.
It improves readability of code and reduces the chance to make a mistake.

This patch also fixes a bug with nan-boxing builds by rounding up the
calculation of the new NSLOTS variable, giving the correct number of slots
(being 4) even if mp_obj_t is larger than the native machine size.
2017-03-23 16:36:08 +11:00
Damien George 5255255fb9 py: Create str/bytes objects in the parser, not the compiler.
Previous to this patch any non-interned str/bytes objects would create a
special parse node that held a copy of the str/bytes data.  Then in the
compiler this data would be turned into a str/bytes object.  This actually
lead to 2 copies of the data, one in the parse node and one in the object.
The parse node's copy of the data would be freed at the end of the compile
stage but nevertheless it meant that the peak memory usage of the
parse/compile stage was higher than it needed to be (by an amount equal to
the number of bytes in all the non-interned str/bytes objects).

This patch changes the behaviour so that str/bytes objects are created
directly in the parser and the object stored in a const-object parse node
(which already exists for bignum, float and complex const objects).  This
reduces peak RAM usage of the parse/compile stage, simplifies the parser
and compiler, and reduces code size by about 170 bytes on Thumb2 archs,
and by about 300 bytes on Xtensa archs.
2017-02-24 13:43:43 +11:00