Commit Graph

136 Commits

Author SHA1 Message Date
Damien George 0c1de1cdee py: Simplify "and" action within parser by making ident-rules explicit.
Most grammar rules can optimise to the identity if they only have a single
argument, saving a lot of RAM building the parse tree.  Previous to this
patch, whether a given grammar rule could be optimised was defined (mostly
implicitly) by a complicated set of logic rules.  With this patch the
definition is always specified explicitly by using "and_ident" in the rule
definition in the grammar.  This simplifies the logic of the parser,
making it a bit smaller and faster.  RAM usage in unaffected.
2016-04-14 13:49:23 +01:00
Damien George eacbd7aeba py: Fix constant folding and inline-asm to work with new async grammar. 2016-04-13 15:26:39 +01:00
Damien George 8d4d6731f5 py/parse: When looking up consts, check they exist before checking type. 2016-03-19 21:36:32 +00:00
Damien George d6c558c0aa py/parse: Use m_renew_maybe to ensure that memory is shrunk in-place.
The chunks of memory that the parser allocates contain parse nodes and
are pointed to from many places, so these chunks cannot be relocated
by the memory manager.  This patch makes it so that when a chunk is
shrunk to fit, it is not relocated.
2016-02-23 13:44:29 +00:00
Antonin ENFRUN efc971e8f9 py: unary_op enum type fix, and a cast to remove clang warning 2016-01-12 22:06:39 +01:00
Damien George 7dbf74c5b9 py/parse: Include unistd.h for ssize_t definition.
In some cases ssize_t is not defined by already included headers.
2016-01-08 13:42:00 +00:00
Damien George 22b2265053 py/parse: Improve constant folding to operate on small and big ints.
Constant folding in the parser can now operate on big ints, whatever
their representation.  This is now possible because the parser can create
parse nodes holding arbitrary objects.  For the case of small ints the
folding is still efficient in RAM because the folded small int is stored
inplace in the parse node.

Adds 48 bytes to code size on Thumb2 architecture.  Helps reduce heap
usage because more constants can be computed at compile time, leading to
a smaller parse tree, and most importantly means that the constants don't
have to be computed at runtime (perhaps more than once).  Parser will now
be a little slower when folding due to calls to runtime to do the
arithmetic.
2016-01-07 14:40:35 +00:00
Damien George 93b3726240 py/parse: Optimise away parse node that's just parenthesis around expr.
Before this patch, (x+y)*z would be parsed to a tree that contained a
redundant identity parse node corresponding to the parenthesis.  With
this patch such nodes are optimised away, which reduces memory
requirements for expressions with parenthesis, and simplifies the
compiler because it doesn't need to handle this identity case.

A parenthesis parse node is still needed for tuples.
2016-01-07 13:07:52 +00:00
Damien George dd5353a405 py: Add MICROPY_ENABLE_COMPILER and MICROPY_PY_BUILTINS_EVAL_EXEC opts.
MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler,
which is useful when only loading of pre-compiled bytecode is supported.
It is enabled by default.

MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin
functions.  By default they are only included if MICROPY_ENABLE_COMPILER
is enabled.

Disabling both options saves about 40k of code size on 32-bit x86.
2015-12-18 12:35:44 +00:00
Damien George 16a6a47a7b py/parse: Replace mp_int_t/mp_uint_t with size_t etc, where appropriate. 2015-12-17 13:06:05 +00:00
Damien George b8cfb0d7b2 py: Add support for 64-bit NaN-boxing object model, on 32-bit machine.
To use, put the following in mpconfigport.h:

    #define MICROPY_OBJ_REPR (MICROPY_OBJ_REPR_D)
    #define MICROPY_FLOAT_IMPL (MICROPY_FLOAT_IMPL_DOUBLE)
    typedef int64_t mp_int_t;
    typedef uint64_t mp_uint_t;
    #define UINT_FMT "%llu"
    #define INT_FMT "%lld"

Currently does not work with native emitter enabled.
2015-11-29 14:25:36 +00:00
Damien George 999cedb90f py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.

This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
2015-11-29 14:25:35 +00:00
Damien George cbf7674025 py: Add MP_ROM_* macros and mp_rom_* types and use them. 2015-11-29 14:25:04 +00:00
Damien George 2c83894257 py: Implement default and star args for lambdas. 2015-11-17 14:00:14 +00:00
Damien George fdfcee7b1e py/parse: Make parser error handling cleaner, less spaghetti-like. 2015-10-12 12:59:18 +01:00
Damien George 64f2b213bb py: Move constant folding from compiler to parser.
It makes much more sense to do constant folding in the parser while the
parse tree is being built.  This eliminates the need to create parse
nodes that will just be folded away.  The code is slightly simpler and a
bit smaller as well.

Constant folding now has a configuration option,
MICROPY_COMP_CONST_FOLDING, which is enabled by default.
2015-10-12 12:58:45 +01:00
Damien George 366239b8b9 py/parse: Factor logic when creating parse node from and-rule. 2015-10-08 23:13:18 +01:00
Damien George 58e0f4ac50 py: Allocate parse nodes in chunks to reduce fragmentation and RAM use.
With this patch parse nodes are allocated sequentially in chunks.  This
reduces fragmentation of the heap and prevents waste at the end of
individually allocated parse nodes.

Saves roughly 20% of RAM during parse stage.
2015-10-02 00:11:11 +01:00
Damien George 65dc960e3b unix-cpy: Remove unix-cpy. It's no longer needed.
unix-cpy was originally written to get semantic equivalent with CPython
without writing functional tests.  When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct.  The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly.  And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.

But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:

1. It outputs CPython 3.3 compatible bytecode.  CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5.  There's no point in changing the bytecode output to match
CPython anymore.

2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.

3. The bytecode tests are not run.  They were never part of Travis and
are not run locally anymore.

4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.

5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode.  Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
2015-08-17 12:51:26 +01:00
Damien George 96f0dd3cbc py/parse: Fix handling of empty input so it raises an exception. 2015-07-24 15:05:56 +00:00
Damien George fa7c61dfab py/parse: De-duplicate and simplify code for parser "or" rule. 2015-07-24 14:35:57 +00:00
Damien George ade9a05236 py: Improve allocation policy of qstr data.
Previous to this patch all interned strings lived in their own malloc'd
chunk.  On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).

With this patch interned strings are concatenated into the same malloc'd
chunk when possible.  Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.

RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned).  New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.

Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
2015-07-14 22:56:32 +01:00
Damien George 4735c45c51 py: Clean up some bits and pieces in parser, grammar. 2015-04-21 16:43:18 +00:00
nhtshot 5d323defe4 py: Update parse.c&mpconfig.h to reflect rename of mp_lexer_show_token.
This function is only used when DEBUG_PRINTERS and USE_RULE_NAME are
enabled.
2015-02-23 21:36:05 +00:00
Damien George dfe944c3e5 py: Expose compile.c:list_get as mp_parse_node_extract_list. 2015-02-13 02:29:46 +00:00
Damien George f804833a97 py: Initialise variables in mp_parse correctly, to satisfy gcc warning. 2015-02-08 13:40:20 +00:00
Damien George 7d414a1b52 py: Parse big-int/float/imag constants directly in parser.
Previous to this patch, a big-int, float or imag constant was interned
(made into a qstr) and then parsed at runtime to create an object each
time it was needed.  This is wasteful in RAM and not efficient.  Now,
these constants are parsed straight away in the parser and turned into
objects.  This allows constants with large numbers of digits (so
addresses issue #1103) and takes us a step closer to #722.
2015-02-08 01:57:40 +00:00
Damien George 0bfc7638ba py: Protect mp_parse and mp_compile with nlr push/pop block.
To enable parsing constants more efficiently, mp_parse should be allowed
to raise an exception, and mp_compile can already raise a MemoryError.
So these functions need to be protected by an nlr push/pop block.

This patch adds that feature in all places.  This allows to simplify how
mp_parse and mp_compile are called: they now raise an exception if they
have an error and so explicit checking is not needed anymore.
2015-02-07 18:33:58 +00:00
Damien George 5c670acb1f py: Be more machine-portable with size of bit fields. 2015-01-24 23:12:58 +00:00
Damien George 50912e7f5d py, unix, stmhal: Allow to compile with -Wshadow.
See issue #699.
2015-01-20 11:55:10 +00:00
Damien George 963a5a3e82 py, unix: Allow to compile with -Wsign-compare.
See issue #699.
2015-01-16 17:47:07 +00:00
Damien George d2d64f00fb py: Add "default" to switches to allow better code flow analysis.
This helps compiler produce smaller code.  Saves 124 bytes on stmhal and
bare-arm.
2015-01-14 21:32:42 +00:00
Damien George 4c81ba8015 py: Never intern data of large string/bytes object; add relevant tests.
Previously to this patch all constant string/bytes objects were
interned by the compiler, and this lead to crashes when the qstr was too
long (noticeable now that qstr length storage defaults to 1 byte).

With this patch, long string/bytes objects are never interned, and are
referenced directly as constant objects within generated code using
load_const_obj.
2015-01-13 16:21:23 +00:00
Damien George 51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George 6efa66f125 py: Remove unnecessary RULE_none and PN_none from parser. 2014-12-20 18:41:59 +00:00
Damien George b47ea4eadd py: Add blank and ident flags to grammar rules to simplify parser.
This saves around 100 bytes code space on stmhal, more on unix.
2014-12-20 18:37:50 +00:00
Damien George 2870d85a11 py: Save a few code bytes in parser; make vars local where possible. 2014-12-20 18:06:08 +00:00
Damien George a4c52c5a3d py: Optimise lexer by exposing lexer type.
mp_lexer_t type is exposed, mp_token_t type is removed, and simple lexer
functions (like checking current token kind) are now inlined.

This saves 784 bytes ROM on 32-bit unix, 348 bytes on stmhal, and 460
bytes on bare-arm.  It also saves a tiny bit of RAM since mp_lexer_t
is a bit smaller.  Also will run a bit more efficiently.
2014-12-05 19:35:18 +00:00
Damien George e7bb0443cd py: Properly free string parse-node; add assertion to gc_free. 2014-10-23 14:13:05 +01:00
Damien George 52b5d76a6b py: Free non-interned strings in the parser when not needed.
mp_parse_node_free now frees the memory associated with non-interned
strings.  And the parser calls mp_parse_node_free when discarding a
non-used node (such as a doc string).

Also, the compiler now frees the parse tree explicitly just before it
exits (as opposed to relying on the caller to do this).

Addresses issue #708 as best we can.
2014-09-23 15:31:56 +00:00
Damien George 2ac4af6946 py: Allow viper to have type annotations.
Viper functions can now be annotated with the type of their arguments
and return value.  Eg:

@micropython.viper
def f(x:int) -> int:
    return x + 1
2014-08-15 16:45:41 +01:00
Damien George 381618269a parser: Convert (u)int to mp_(u)int_t. 2014-07-03 14:13:33 +01:00
Damien George 40f3c02682 Rename machine_(u)int_t to mp_(u)int_t.
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Paul Sokolovsky 59c675a64c py: Include mpconfig.h before all other includes.
It defines types used by all other headers.

Fixes #691.
2014-06-21 22:43:22 +03:00
Damien George d1e355ea8e py: Fix check of small-int overflow when parsing ints.
Also unifies use of SMALL_INT_FITS macro across parser and runtime.
2014-05-28 14:51:12 +01:00
Damien George 2617eebf2f Change const byte* to const char* where sensible.
This removes need for some casts (at least, more than it adds need
for new casts!).
2014-05-25 22:27:57 +01:00
Damien George 5042bce8fb py: Don't automatically intern strings in parser.
This completes non-automatic interning of strings in the parser, so that
doc strings don't take up RAM.  It complicates the parser and compiler,
and bloats stmhal by about 300 bytes.  It's complicated because now
there are 2 kinds of parse-nodes that can be strings: interned leaves
and non-interned structs.
2014-05-25 22:06:06 +01:00
Damien George 3aaabd11a0 Merge branch 'keep-strings-uninterned' of github.com:pfalcon/micropython into pfalcon-keep-strings-uninterned
Conflicts:
	py/parse.c
2014-05-25 13:19:31 +01:00
Damien George 58ebde4664 Tidy up some configuration options.
MP_ALLOC_* -> MICROPY_ALLOC_*
MICROPY_PATH_MAX -> MICROPY_ALLOC_PATH_MAX
MICROPY_ENABLE_REPL_HELPERS -> MICROPY_HELPER_REPL
MICROPY_ENABLE_LEXER_UNIX -> MICROPY_HELPER_LEXER_UNIX
MICROPY_EXTRA_* -> MICROPY_PORT_*

See issue #35.
2014-05-21 20:32:59 +01:00
Damien George 1b82e9af5c py: Improve handling of memory error in parser.
Parser shouldn't raise exceptions, so needs to check when memory
allocation fails.  This patch does that for the initial set up of the
parser state.

Also, we now put the parser object on the stack.  It's small enough to
go there instead of on the heap.

This partially addresses issue #558.
2014-05-10 17:36:41 +01:00
Paul Sokolovsky 9e76b1181b Draft approach towards resolving https://github.com/micropython/micropython/issues/560#issuecomment-42213955 2014-05-08 22:43:46 +03:00
Damien George 93afa230a4 py, parser: Add commented-out code to discard doc strings.
Doesn't help with RAM reduction because doc strings are interned as soon
as they are encountered, which is too soon to do any optimisations on
them.
2014-05-06 21:44:11 +01:00
Damien George 66e18f04d8 py: Turn down amount of RAM parser and compiler use.
There are 2 locations in parser, and 1 in compiler, where memory
allocation is not precise.  In the parser it's the rule stack and result
stack, in the compiler it's the array for the identifiers in the current
scope.  All other mallocs are exact (ie they don't allocate more than is
needed).

This patch adds tuning options (MP_ALLOC_*) to mpconfig.h for these 3
inexact allocations.

The inexact allocations in the parser should actually be close to
logarithmic: you need an exponentially larger script (absent pathological
cases) to use up more room on the rule and result stacks.  As such, the
default allocation policy for these is now to start with a modest sized
stack, but grow only in small increments.

For the identifier arrays in the compiler, these now start out quite
small (4 entries, since most functions don't have that many ids), and
grow incrementally by 6 (since if you have more ids than 4, you probably
have quite a few more, but it wouldn't be exponentially more).

Partially addresses issue #560.
2014-05-05 13:19:03 +01:00
Damien George 04b9147e15 Add license header to (almost) all files.
Blanket wide to all .c and .h files.  Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.

Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-05-03 23:27:38 +01:00
Damien George 58ba4c3b4c py: Check explicitly for memory allocation failure in parser.
Previously, a failed malloc/realloc would throw an exception, which was
not caught.  I think it's better to keep the parser free from NLR
(exception throwing), hence this patch.
2014-04-10 14:27:31 +00:00
xbe efe3422394 py: Clean up includes.
Remove unnecessary includes. Add includes that improve portability.
2014-03-17 02:43:40 -07:00
Damien George 06201ff3d6 py: Implement bit-shift and not operations for mpz.
Implement not, shl and shr in mpz library.  Add function to create mpzs
on the stack, used for memory efficiency when rhs is a small int.
Factor out code to parse base-prefix of number into a dedicated function.
2014-03-01 19:50:50 +00:00
Paul Sokolovsky 56e5ef203b parse: Refactor parse node encoding to support full range of small ints.
Based on suggestion by @dpgeorge at
https://github.com/micropython/micropython/pull/313
2014-02-22 16:39:45 +02:00
Paul Sokolovsky bbf0e2fe12 parse: Note that fact that parser's small ints are different than VM small int.
Specifically, VM's small ints are 31 bit, while parser's only 28. There's already
MP_OBJ_FITS_SMALL_INT(), so, for clarity, rename MP_FIT_SMALL_INT() to
MP_PARSE_FITS_SMALL_INT().
2014-02-21 03:27:09 +02:00
Damien George c5966128c7 Implement proper exception type hierarchy.
Each built-in exception is now a type, with base type BaseException.
C exceptions are created by passing a pointer to the exception type to
make an instance of.  When raising an exception from the VM, an
instance is created automatically if an exception type is raised (as
opposed to an exception instance).

Exception matching (RT_BINARY_OP_EXCEPTION_MATCH) is now proper.

Handling of parse error changed to match new exceptions.

mp_const_type renamed to mp_type_type for consistency.
2014-02-15 16:10:44 +00:00
Paul Sokolovsky 520e2f58a5 Replace global "static" -> "STATIC", to allow "analysis builds". Part 2. 2014-02-12 18:31:30 +02:00
Damien George 08d075592f py: Fix bug with LOAD_METHOD; fix int->machine_int_t for small int.
LOAD_METHOD bug was: emitbc did not correctly calculate the amount of
stack usage for a LOAD_METHOD operation.

small int bug was: int was being used to pass small ints, when it should
have been machine_int_t.
2014-01-29 18:58:52 +00:00
Damien George b829b5caec Implement mp_parse_node_free; print properly repr(string). 2014-01-25 13:51:19 +00:00
Paul Sokolovsky aee2ba70de Add parse_node_free_struct() and use it to free parse tree after compilation.
TODO: Check lexer/parse/compile error path for leaks too.
2014-01-25 02:11:59 +02:00
Damien George 00208ce194 py: Change macro var args in parser to be C99 compliant. 2014-01-23 00:00:53 +00:00
Damien George 55baff4c9b Revamp qstrs: they now include length and hash.
Can now have null bytes in strings.  Can define ROM qstrs per port using
qstrdefsport.h
2014-01-21 21:40:13 +00:00
Damien George cbd2f7482c py: Add module/function/class name to exceptions.
Exceptions know source file, line and block name.

Also tidy up some debug printing functions and provide a global
flag to enable/disable them.
2014-01-19 11:48:48 +00:00
Damien George 08335004cf Add source file name and line number to error messages.
Byte code has a map from byte-code offset to source-code line number,
used to give better error messages.
2014-01-18 23:24:36 +00:00
Damien George d02c6d8962 Implement eval. 2014-01-15 22:14:03 +00:00
Damien George 9528cd66d7 Convert parse errors to exceptions.
Parser no longer prints an error, but instead returns an exception ID
and message.
2014-01-15 21:23:31 +00:00
Paul Sokolovsky 80f60e1aee Parse long Python ints properly.
Long int is something which doesn't fit into SMALL_INT partion of
machine_int_t. But it's also something which doesn't fit into
machine_int_t in the first place.
2014-01-12 22:04:21 +02:00
Damien George 69a818d418 py: Improve memory management for parser; add lexer error for bad line cont. 2014-01-12 13:55:24 +00:00
Damien George 8cc96a35e5 Put unicode functions in unicode.c, and tidy their names. 2013-12-30 18:23:50 +00:00
Damien 732407f1bf Change memory allocation API to require size for free and realloc. 2013-12-29 19:33:23 +00:00
Damien dd12d1378f Parse upper-case hex numbers correctly. 2013-12-29 13:03:49 +00:00
Damien d99b05282d Change object representation from 1 big union to individual structs.
A big change.  Micro Python objects are allocated as individual structs
with the first element being a pointer to the type information (which
is itself an object).  This scheme follows CPython.  Much more flexible,
not necessarily slower, uses same heap memory, and can allocate objects
statically.

Also change name prefix, from py_ to mp_ (mp for Micro Python).
2013-12-21 18:17:45 +00:00
Damien 7410e440ab Add basic complex number support. 2013-11-02 19:47:57 +00:00
Damien df4b4f31ef Make grammar rules const so the go in .text section. 2013-10-19 18:28:01 +01:00
Damien 5ac1b2efbd Implement REPL. 2013-10-18 19:58:12 +01:00
Damien 0efb3a1b66 Tidy up SMALL_INT optimisations and CPython compatibility. 2013-10-12 16:16:56 +01:00
Damien c025ebb2dc Separate out mpy core and unix version. 2013-10-12 14:30:21 +01:00
Damien 91d387de7d Improve indent/dedent error checking and reporting. 2013-10-09 15:09:52 +01:00
Damien b14de21fc8 Optimise typedargslist_name to not create a node if just an id. 2013-10-06 00:28:28 +01:00
Damien 5fa5ae40be Make range of small int 24 bits. 2013-10-06 00:13:15 +01:00
Damien 826005c60b Add support for inline thumb assembly. 2013-10-05 23:17:28 +01:00
Damien 429d71943d Initial commit. 2013-10-04 19:53:11 +01:00