68 Commits

Author SHA1 Message Date
Damien George 65dc960e3b unix-cpy: Remove unix-cpy. It's no longer needed.
unix-cpy was originally written to get semantic equivalence with CPython
without writing functional tests.  When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct.  The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly.  And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.

But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:

1. It outputs CPython 3.3 compatible bytecode.  CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5.  There's no point in changing the bytecode output to match
CPython anymore.

2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.

3. The bytecode tests are not run.  They were never part of Travis and
are not run locally anymore.

4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.

5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode.  Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
2015-08-17 12:51:26 +01:00
Damien George 96f0dd3cbc py/parse: Fix handling of empty input so it raises an exception. 2015-07-24 15:05:56 +00:00
Damien George fa7c61dfab py/parse: De-duplicate and simplify code for parser "or" rule. 2015-07-24 14:35:57 +00:00
Damien George ade9a05236 py: Improve allocation policy of qstr data.
Prior to this patch all interned strings lived in their own malloc'd
chunk.  On average this wastes N/2 bytes per interned string, where N is
the number of bytes in a quantum of the memory allocator (16 bytes on
32-bit archs).

With this patch interned strings are concatenated into the same malloc'd
chunk when possible.  Such chunks are enlarged in place when possible,
and shrunk to fit when a new chunk is needed.

RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned).  New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.

Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
2015-07-14 22:56:32 +01:00
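
The rounding waste described in the commit above can be illustrated with a little arithmetic. This is only a sketch: the helper names and the string lengths are made up for illustration, it assumes a 16-byte allocation quantum as stated in the message, and it ignores any per-string header and the fact that the real patch may use several chunks.

```python
def rounded_up(nbytes, quantum=16):
    """Round a request up to a whole number of allocator quanta."""
    return -(-nbytes // quantum) * quantum


def separate_chunks(lengths, quantum=16):
    """Heap bytes used when every string gets its own allocation."""
    return sum(rounded_up(n, quantum) for n in lengths)


def shared_chunk(lengths, quantum=16):
    """Heap bytes used when all strings are packed into one allocation."""
    return rounded_up(sum(lengths), quantum)


# Hypothetical sizes (in bytes) of six interned strings.
lengths = [5, 9, 12, 3, 7, 20]

print(separate_chunks(lengths))  # one allocation per string
print(shared_chunk(lengths))     # strings concatenated into one chunk
```

With separate allocations each string pays its own rounding loss; packed together, only the final chunk does.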
Damien George 4735c45c51 py: Clean up some bits and pieces in parser, grammar. 2015-04-21 16:43:18 +00:00
nhtshot 5d323defe4 py: Update parse.c and mpconfig.h to reflect rename of mp_lexer_show_token.
This function is only used when DEBUG_PRINTERS and USE_RULE_NAME are
enabled.
2015-02-23 21:36:05 +00:00
Damien George dfe944c3e5 py: Expose compile.c:list_get as mp_parse_node_extract_list. 2015-02-13 02:29:46 +00:00
Damien George f804833a97 py: Initialise variables in mp_parse correctly, to satisfy gcc warning. 2015-02-08 13:40:20 +00:00
Damien George 7d414a1b52 py: Parse big-int/float/imag constants directly in parser.
Previous to this patch, a big-int, float or imag constant was interned
(made into a qstr) and then parsed at runtime to create an object each
time it was needed.  This is wasteful in RAM and not efficient.  Now,
these constants are parsed straight away in the parser and turned into
objects.  This allows constants with large numbers of digits (so
addresses issue #1103) and takes us a step closer to #722.
2015-02-08 01:57:40 +00:00
Damien George 0bfc7638ba py: Protect mp_parse and mp_compile with nlr push/pop block.
To enable parsing constants more efficiently, mp_parse should be allowed
to raise an exception, and mp_compile can already raise a MemoryError.
So these functions need to be protected by an nlr push/pop block.

This patch adds that feature in all places.  This simplifies how
mp_parse and mp_compile are called: they now raise an exception if they
have an error and so explicit checking is not needed anymore.
2015-02-07 18:33:58 +00:00
Damien George 5c670acb1f py: Be more machine-portable with size of bit fields. 2015-01-24 23:12:58 +00:00
Damien George 50912e7f5d py, unix, stmhal: Allow to compile with -Wshadow.
See issue #699.
2015-01-20 11:55:10 +00:00
Damien George 963a5a3e82 py, unix: Allow to compile with -Wsign-compare.
See issue #699.
2015-01-16 17:47:07 +00:00
Damien George d2d64f00fb py: Add "default" to switches to allow better code flow analysis.
This helps the compiler produce smaller code.  Saves 124 bytes on stmhal and
bare-arm.
2015-01-14 21:32:42 +00:00
Damien George 4c81ba8015 py: Never intern data of large string/bytes object; add relevant tests.
Previous to this patch all constant string/bytes objects were
interned by the compiler, and this led to crashes when the qstr was too
long (noticeable now that qstr length storage defaults to 1 byte).

With this patch, long string/bytes objects are never interned, and are
referenced directly as constant objects within generated code using
load_const_obj.
2015-01-13 16:21:23 +00:00
Damien George 51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George 6efa66f125 py: Remove unnecessary RULE_none and PN_none from parser. 2014-12-20 18:41:59 +00:00
Damien George b47ea4eadd py: Add blank and ident flags to grammar rules to simplify parser.
This saves around 100 bytes code space on stmhal, more on unix.
2014-12-20 18:37:50 +00:00
Damien George 2870d85a11 py: Save a few code bytes in parser; make vars local where possible. 2014-12-20 18:06:08 +00:00
Damien George a4c52c5a3d py: Optimise lexer by exposing lexer type.
mp_lexer_t type is exposed, mp_token_t type is removed, and simple lexer
functions (like checking current token kind) are now inlined.

This saves 784 bytes ROM on 32-bit unix, 348 bytes on stmhal, and 460
bytes on bare-arm.  It also saves a tiny bit of RAM since mp_lexer_t
is a bit smaller.  It will also run a bit more efficiently.
2014-12-05 19:35:18 +00:00
Damien George e7bb0443cd py: Properly free string parse-node; add assertion to gc_free. 2014-10-23 14:13:05 +01:00
Damien George 52b5d76a6b py: Free non-interned strings in the parser when not needed.
mp_parse_node_free now frees the memory associated with non-interned
strings.  And the parser calls mp_parse_node_free when discarding a
non-used node (such as a doc string).

Also, the compiler now frees the parse tree explicitly just before it
exits (as opposed to relying on the caller to do this).

Addresses issue #708 as best we can.
2014-09-23 15:31:56 +00:00
Damien George 2ac4af6946 py: Allow viper to have type annotations.
Viper functions can now be annotated with the type of their arguments
and return value.  Eg:

@micropython.viper
def f(x:int) -> int:
    return x + 1
2014-08-15 16:45:41 +01:00
Damien George 381618269a parser: Convert (u)int to mp_(u)int_t. 2014-07-03 14:13:33 +01:00
Damien George 40f3c02682 Rename machine_(u)int_t to mp_(u)int_t.
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Paul Sokolovsky 59c675a64c py: Include mpconfig.h before all other includes.
It defines types used by all other headers.

Fixes #691.
2014-06-21 22:43:22 +03:00
Damien George d1e355ea8e py: Fix check of small-int overflow when parsing ints.
Also unifies use of SMALL_INT_FITS macro across parser and runtime.
2014-05-28 14:51:12 +01:00
Damien George 2617eebf2f Change const byte* to const char* where sensible.
This removes need for some casts (at least, more than it adds need
for new casts!).
2014-05-25 22:27:57 +01:00
Damien George 5042bce8fb py: Don't automatically intern strings in parser.
This completes non-automatic interning of strings in the parser, so that
doc strings don't take up RAM.  It complicates the parser and compiler,
and bloats stmhal by about 300 bytes.  It's complicated because now
there are 2 kinds of parse-nodes that can be strings: interned leaves
and non-interned structs.
2014-05-25 22:06:06 +01:00
Damien George 3aaabd11a0 Merge branch 'keep-strings-uninterned' of github.com:pfalcon/micropython into pfalcon-keep-strings-uninterned
Conflicts:
	py/parse.c
2014-05-25 13:19:31 +01:00
Damien George 58ebde4664 Tidy up some configuration options.
MP_ALLOC_* -> MICROPY_ALLOC_*
MICROPY_PATH_MAX -> MICROPY_ALLOC_PATH_MAX
MICROPY_ENABLE_REPL_HELPERS -> MICROPY_HELPER_REPL
MICROPY_ENABLE_LEXER_UNIX -> MICROPY_HELPER_LEXER_UNIX
MICROPY_EXTRA_* -> MICROPY_PORT_*

See issue #35.
2014-05-21 20:32:59 +01:00
Damien George 1b82e9af5c py: Improve handling of memory error in parser.
Parser shouldn't raise exceptions, so needs to check when memory
allocation fails.  This patch does that for the initial set up of the
parser state.

Also, we now put the parser object on the stack.  It's small enough to
go there instead of on the heap.

This partially addresses issue #558.
2014-05-10 17:36:41 +01:00
Paul Sokolovsky 9e76b1181b Draft approach towards resolving https://github.com/micropython/micropython/issues/560#issuecomment-42213955 2014-05-08 22:43:46 +03:00
Damien George 93afa230a4 py, parser: Add commented-out code to discard doc strings.
Doesn't help with RAM reduction because doc strings are interned as soon
as they are encountered, which is too soon to do any optimisations on
them.
2014-05-06 21:44:11 +01:00
Damien George 66e18f04d8 py: Turn down amount of RAM parser and compiler use.
There are 2 locations in parser, and 1 in compiler, where memory
allocation is not precise.  In the parser it's the rule stack and result
stack, in the compiler it's the array for the identifiers in the current
scope.  All other mallocs are exact (ie they don't allocate more than is
needed).

This patch adds tuning options (MP_ALLOC_*) to mpconfig.h for these 3
inexact allocations.

The inexact allocations in the parser should actually be close to
logarithmic: you need an exponentially larger script (absent pathological
cases) to use up more room on the rule and result stacks.  As such, the
default allocation policy for these is now to start with a modest sized
stack, but grow only in small increments.

For the identifier arrays in the compiler, these now start out quite
small (4 entries, since most functions don't have that many ids), and
grow incrementally by 6 (since if you have more ids than 4, you probably
have quite a few more, but it wouldn't be exponentially more).

Partially addresses issue #560.
2014-05-05 13:19:03 +01:00
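
The incremental growth policy for the identifier array (start at 4 entries, grow by 6) can be sketched as follows. The function and parameter names are hypothetical; the actual tuning options are the MP_ALLOC_* settings mentioned in the commit, and this sketch only models the resulting capacity, not the real C code.

```python
def capacity_for(n_ids, initial=4, increment=6):
    """Capacity of an array that starts at `initial` entries and grows by
    a fixed `increment` whenever it is too small to hold `n_ids` entries."""
    capacity = initial
    while capacity < n_ids:
        capacity += increment
    return capacity


for n in (3, 4, 5, 11, 17):
    print(n, capacity_for(n))
```

Fixed-size increments keep the over-allocation bounded by the increment, at the cost of more reallocations for unusually large scopes.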
Damien George 04b9147e15 Add license header to (almost) all files.
Applied across the board to all .c and .h files.  Some files originating
from ST are difficult to deal with (license-wise) so the header was left
out of those.

Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-05-03 23:27:38 +01:00
Damien George 58ba4c3b4c py: Check explicitly for memory allocation failure in parser.
Previously, a failed malloc/realloc would throw an exception, which was
not caught.  I think it's better to keep the parser free from NLR
(exception throwing), hence this patch.
2014-04-10 14:27:31 +00:00
xbe efe3422394 py: Clean up includes.
Remove unnecessary includes. Add includes that improve portability.
2014-03-17 02:43:40 -07:00
Damien George 06201ff3d6 py: Implement bit-shift and not operations for mpz.
Implement not, shl and shr in mpz library.  Add function to create mpzs
on the stack, used for memory efficiency when rhs is a small int.
Factor out code to parse base-prefix of number into a dedicated function.
2014-03-01 19:50:50 +00:00
Paul Sokolovsky 56e5ef203b parse: Refactor parse node encoding to support full range of small ints.
Based on suggestion by @dpgeorge at
https://github.com/micropython/micropython/pull/313
2014-02-22 16:39:45 +02:00
Paul Sokolovsky bbf0e2fe12 parse: Note the fact that parser's small ints are different than VM small ints.
Specifically, VM's small ints are 31 bit, while the parser's are only 28. There's already
MP_OBJ_FITS_SMALL_INT(), so, for clarity, rename MP_FIT_SMALL_INT() to
MP_PARSE_FITS_SMALL_INT().
2014-02-21 03:27:09 +02:00
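
The difference between the two ranges can be shown with a small check. This sketch assumes both kinds of small int are signed two's-complement values of the stated width (31 and 28 bits); the function name is hypothetical and is not the MP_OBJ_FITS_SMALL_INT() or MP_PARSE_FITS_SMALL_INT() macro itself.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a signed two's-complement
    integer that is `bits` wide."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit


value = 1 << 27  # just outside a 28-bit signed range
print(fits_signed(value, 28))  # parser-sized small int
print(fits_signed(value, 31))  # VM-sized small int
```

Under these assumptions, a value like this one fits a VM small int but not a parser small int, which is why the two checks need distinct names.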
Damien George c5966128c7 Implement proper exception type hierarchy.
Each built-in exception is now a type, with base type BaseException.
C exceptions are created by passing a pointer to the exception type to
make an instance of.  When raising an exception from the VM, an
instance is created automatically if an exception type is raised (as
opposed to an exception instance).

Exception matching (RT_BINARY_OP_EXCEPTION_MATCH) is now proper.

Handling of parse error changed to match new exceptions.

mp_const_type renamed to mp_type_type for consistency.
2014-02-15 16:10:44 +00:00
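
The Python-level behaviour that the commit above implements, where raising an exception type creates an instance automatically, can be observed with standard Python. The helper name below is made up for illustration.

```python
def raised_object(thing):
    """Raise `thing` and return the exception object that gets caught."""
    try:
        raise thing
    except BaseException as exc:
        return exc


from_type = raised_object(ValueError)                   # a type is raised
from_instance = raised_object(ValueError("bad value"))  # an instance is raised

print(type(from_type).__name__, from_type.args)
print(type(from_instance).__name__, from_instance.args)
```

In both cases the caught object is an instance of ValueError, which itself derives from BaseException.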
Paul Sokolovsky 520e2f58a5 Replace global "static" -> "STATIC", to allow "analysis builds". Part 2. 2014-02-12 18:31:30 +02:00
Damien George 08d075592f py: Fix bug with LOAD_METHOD; fix int->machine_int_t for small int.
LOAD_METHOD bug was: emitbc did not correctly calculate the amount of
stack usage for a LOAD_METHOD operation.

small int bug was: int was being used to pass small ints, when it should
have been machine_int_t.
2014-01-29 18:58:52 +00:00
Damien George b829b5caec Implement mp_parse_node_free; print properly repr(string). 2014-01-25 13:51:19 +00:00
Paul Sokolovsky aee2ba70de Add parse_node_free_struct() and use it to free parse tree after compilation.
TODO: Check lexer/parse/compile error path for leaks too.
2014-01-25 02:11:59 +02:00
Damien George 00208ce194 py: Change macro var args in parser to be C99 compliant. 2014-01-23 00:00:53 +00:00
Damien George 55baff4c9b Revamp qstrs: they now include length and hash.
Can now have null bytes in strings.  Can define ROM qstrs per port using
qstrdefsport.h.
2014-01-21 21:40:13 +00:00
Damien George cbd2f7482c py: Add module/function/class name to exceptions.
Exceptions know source file, line and block name.

Also tidy up some debug printing functions and provide a global
flag to enable/disable them.
2014-01-19 11:48:48 +00:00
Damien George 08335004cf Add source file name and line number to error messages.
Byte code has a map from byte-code offset to source-code line number,
used to give better error messages.
2014-01-18 23:24:36 +00:00
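
A map from byte-code offset to source line can be sketched as a sorted table with a binary search. The commit does not describe the real encoding, so the table layout, the sample data and the function name here are all assumptions made for illustration.

```python
import bisect

# Hypothetical table: (first byte-code offset, source line), sorted by offset.
line_map = [(0, 1), (6, 2), (14, 4), (20, 5)]


def line_for_offset(table, offset):
    """Return the source line whose byte-code range contains `offset`."""
    starts = [start for start, _ in table]
    index = bisect.bisect_right(starts, offset) - 1
    if index < 0:
        raise ValueError("offset precedes the first table entry")
    return table[index][1]


for offset in (0, 5, 6, 19, 100):
    print(offset, line_for_offset(line_map, offset))
```

An error raised at a given byte-code offset can then be reported against the corresponding source line.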