Features inline get/put operations for the highest performance. Locking
is not part of implementation, operation should be wrapped with locking
externally as needed.
When taking the logarithm of the float to determine the exponent, there
are some edge cases that finish the log loop too large. Eg for an
input value of 1e32-epsilon, this is actually less than 1e32 from the
log-loop table and finishes as 10.0e31 when it should be 1.0e32. It
is thus rendered as :e32 (: comes after 9 in ascii).
There was the same problem with numbers less than 1.
Previous to this patch, the "**b" in "a**b" had its own parse node with
just one item (the "b"). Now, the "b" is just the last element of the
power parse-node. This saves (a tiny bit of) RAM when compiling.
Passing an mp_uint_t to a %d printf format is incorrect for builds where
mp_uint_t is larger than word size (eg a nanboxing build). This patch
adds some simple casting to int in these cases.
If the heap is locked, or memory allocation fails, then calling a bound
method will still succeed by allocating the argument state on the stack.
The new code also allocates less stack than before if less than 4
arguments are passed. It's also a tiny bit smaller in code size.
This was done as part of the ESA project.
This new compile-time option allows to make the bytecode compiler
configurable at runtime by setting the fields in the mp_dynamic_compiler
structure. By using this feature, the compiler can generate bytecode
that targets any MicroPython runtime/VM, regardless of the host and
target compile-time settings.
Options so far that fall under this dynamic setting are:
- maximum number of bits that a small int can hold;
- whether caching of lookups is used in the bytecode;
- whether to use unicode strings or not (lexer behaviour differs, and
therefore generated string constants differ).
Reduces code size by 112 bytes on Thumb2 arch, and makes assembler faster
because comparison can be a simple equals instead of a string compare.
Not all ops have been converted, only those that were simple to convert
and reduced code size.
The chunks of memory that the parser allocates contain parse nodes and
are pointed to from many places, so these chunks cannot be relocated
by the memory manager. This patch makes it so that when a chunk is
shrunk to fit, it is not relocated.
These can be used to insert arbitrary checks, polling, etc into the VM.
They are left general because the VM is a highly tuned loop and it should
be up to a given port how that port wants to modify the VM internals.
One common use would be to insert a polling check, but only done after
a certain number of opcodes were executed, so as not to slow down the VM
too much. For example:
#define MICROPY_VM_HOOK_COUNT (30)
#define MICROPY_VM_HOOK_INIT static uint vm_hook_divisor = MICROPY_VM_HOOK_COUNT
#define MICROPY_VM_HOOK_POLL if (--vm_hook_divisor == 0) { \
vm_hook_divisor = MICROPY_VM_HOOK_COUNT;
extern void vm_hook_function(void);
vm_hook_function();
}
#define MICROPY_VM_HOOK_LOOP MICROPY_VM_HOOK_POLL
#define MICROPY_VM_HOOK_RETURN MICROPY_VM_HOOK_POLL
The new block protocol is:
- readblocks(self, n, buf)
- writeblocks(self, n, buf)
- ioctl(self, cmd, arg)
The new ioctl method handles the old sync and count methods, as well as
a new "get sector size" method.
The old protocol is still supported, and used if the device doesn't have
the ioctl method.
This allows you to pass a number (being an address) to a viper function
that expects a pointer, and also allows casting of integers to pointers
within viper functions.
This was actually the original behaviour, but it regressed due to native
type identifiers being promoted to 4 bits in width.
This function computes (x**y)%z in an efficient way. For large arguments
this operation is otherwise not computable by doing x**y and then %z.
It's currently not used, but is added in case it's useful one day.
For these 3 bitwise operations there are now fast functions for
positive-only arguments, and general functions for arbitrary sign
arguments (the fast functions are the existing implementation).
By default the fast functions are not used (to save space) and instead
the general functions are used for all operations.
Enable MICROPY_OPT_MPZ_BITWISE to use the fast functions for positive
arguments.
Before this patch, the native types for uint and ptr/ptr8/ptr16/ptr32
all overlapped and it was possible to make a mistake in casting. Now,
these types are all separate and any coding mistakes will be raised
as runtime errors.
Eg: '{:{}}'.format(123, '>20')
@pohmelie was the original author of this patch, but @dpgeorge made
significant changes to reduce code size and improve efficiency.
For single prec, exponents never get larger than about 37. For double
prec, exponents can be larger than 99 and need 3 bytes to format. This
patch makes the number of bytes needed configurable.
Addresses issue #1772.
Calling it from mp_init() is too late for some ports (like Unix), and leads
to incomplete stack frame being captured, with following GC issues. So, now
each port should call mp_stack_ctrl_init() on its own, ASAP after startup,
and taking special precautions so it really was called before stack variables
get allocated (because if such variable with a pointer is missed, it may lead
to over-collecting (typical symptom is segfaulting)).
MP_BC_NOT was removed and the "not" operation made a proper unary
operator, and the opcode format table needs to be updated to reflect
this change (but actually the change is only cosmetic).
Functions added are:
- randint
- randrange
- choice
- random
- uniform
They are enabled with configuration variable
MICROPY_PY_URANDOM_EXTRA_FUNCS, which is disabled by default. It is
enabled for unix coverage build and stmhal.
SHA1 is used in a number of protocols and algorithm originated 5 years ago
or so, in other words, it's in "wide use", and only newer protocols use
SHA2.
The implementation depends on axTLS enabled. TODO: Make separate config
option specifically for sha1().
micropython.stack_use() returns an integer being the number of bytes used
on the stack.
micropython.heap_lock() and heap_unlock() can be used to prevent the
memory manager from allocating anything on the heap. Calls to these are
allowed to be nested.
Seedable and reproducible pseudo-random number generator. Implemented
functions are getrandbits(n) (n <= 32) and seed().
The algorithm used is Yasmarang by Ilya Levin:
http://www.literatecode.com/yasmarang
this allows python code to use property(lambda:..., doc=...) idiom.
named versions for the fget, fset and fdel arguments are left out in the
interest of saving space; they are rarely used and easy to enable when
actually needed.
a test case is included.
The first argument to the type.make_new method is naturally a uPy type,
and all uses of this argument cast it directly to a pointer to a type
structure. So it makes sense to just have it a pointer to a type from
the very beginning (and a const pointer at that). This patch makes
such a change, and removes all unnecessary casting to/from mp_obj_t.
This patch changes the type signature of .make_new and .call object method
slots to use size_t for n_args and n_kw (was mp_uint_t. Makes code more
efficient when mp_uint_t is larger than a machine word. Doesn't affect
ports when size_t and mp_uint_t have the same size.
Constant folding in the parser can now operate on big ints, whatever
their representation. This is now possible because the parser can create
parse nodes holding arbitrary objects. For the case of small ints the
folding is still efficient in RAM because the folded small int is stored
inplace in the parse node.
Adds 48 bytes to code size on Thumb2 architecture. Helps reduce heap
usage because more constants can be computed at compile time, leading to
a smaller parse tree, and most importantly means that the constants don't
have to be computed at runtime (perhaps more than once). Parser will now
be a little slower when folding due to calls to runtime to do the
arithmetic.
Before this patch, (x+y)*z would be parsed to a tree that contained a
redundant identity parse node corresponding to the parenthesis. With
this patch such nodes are optimised away, which reduces memory
requirements for expressions with parenthesis, and simplifies the
compiler because it doesn't need to handle this identity case.
A parenthesis parse node is still needed for tuples.
Note that even though wrapped in MICROPY_CPYTHON_COMPAT, it is not
fully compatible because the modifications to the dictionary do not
propagate to the actual instance members.
Only types whose iterator instances still fit in 4 machine words have
been changed to use the polymorphic iterator.
Reduces Thumb2 arch code size by 264 bytes.
Previously, mark operation weren't logged at all, while it's quite useful
to see cascade of marks in case of over-marking (and in other cases too).
Previously, sweep was logged for each block of object in memory, but that
doesn't make much sense and just lead to longer output, harder to parse
by a human. Instead, log sweep only once per object. This is similar to
other memory manager operations, e.g. an object is allocated, then freed.
Or object is allocated, then marked, otherwise swept (one log entry per
operation, with the same memory address in each case).
Map indicies are most commonly a qstr, and adding a fast-path for hashing
of a qstr increases overall performance of the runtime.
On pyboard there is a 4% improvement in the pystone benchmark for a cost
of 20 bytes of code size. It's about a 2% improvement on unix.
When looking up and extracting an attribute of an instance, some
attributes must bind self as the first argument to make a working method
call. Previously to this patch, any attribute that was callable had self
bound as the first argument. But Python specs require the check to be
more restrictive, and only functions, closures and generators should have
self bound as the first argument
Addresses issue #1675.
POSIX doesn't guarantee something like that to work, but it works on any
system with careful signal implementation. Roughly, the requirement is
that signal handler is executed in the context of the process, its main
thread, etc. This is true for Linux. Also tested to work without issues
on MacOSX.
This makes all tests pass again for 64bit windows builds which would
previously fail for anything printing ranges (builtin_range/unpack1)
because they were printed as range( ld, ld ).
This is done by reusing the mp_vprintf implementation for MICROPY_OBJ_REPR_D
for 64bit windows builds (both msvc and mingw-w64) since the format specifier
used for 64bit integers is also %lld, or %llu for the unsigned version.
Note these specifiers used to be fetched from inttypes.h, which is the
C99 way of working with printf/scanf in a portable way, but mingw-w64
wants to be backwards compatible with older MS C runtimes and uses
the non-portable %I64i instead of %lld in inttypes.h, so remove the use
of said header again in mpconfig.h and define the specifiers manually.
Ideally we'd use %zu for size_t args, but that's unlikely to be supported
by all runtimes, and we would then need to implement it in mp_printf.
So simplest and most portable option is to use %u and cast the argument
to uint(=unsigned int).
Note: reason for the change is that UINT_FMT can be %llu (size suitable
for mp_uint_t) which is wider than size_t and prints incorrect results.
MICROPY_ENABLE_COMPILER can be used to enable/disable the entire compiler,
which is useful when only loading of pre-compiled bytecode is supported.
It is enabled by default.
MICROPY_PY_BUILTINS_EVAL_EXEC controls support of eval and exec builtin
functions. By default they are only included if MICROPY_ENABLE_COMPILER
is enabled.
Disabling both options saves about 40k of code size on 32-bit x86.
To let unix port implement "machine" functionality on Python level, and
keep consistent naming in other ports (baremetal ports will use magic
module "symlinking" to still load it on "import machine").
Fixes#1701.
For builds where mp_uint_t is larger than size_t, it doesn't make
sense to use such a wide type for qstrs. There can only be as many
qstrs as there is address space on the machine, so size_t is the correct
type to use.
Saves about 3000 bytes of code size when building unix/ port with
MICROPY_OBJ_REPR_D.
size_t is the correct type to use to count things related to the size of
the address space. Using size_t (instead of mp_uint_t) is important for
the efficiency of ports that configure mp_uint_t to larger than the
machine word size.
This allows to have single itertaor type for various internal iterator
types (save rodata space by not having repeating almost-empty type
structures). It works by looking "iternext" method stored in particular
object instance (should be first object field after "base").
Fixes#1684 and makes "not" match Python semantics. The code is also
simplified (the separate MP_BC_NOT opcode is removed) and the patch saves
68 bytes for bare-arm/ and 52 bytes for minimal/.
Previously "not x" was implemented as !mp_unary_op(x, MP_UNARY_OP_BOOL),
so any given object only needs to implement MP_UNARY_OP_BOOL (and the VM
had a special opcode to do the ! bit).
With this patch "not x" is implemented as mp_unary_op(x, MP_UNARY_OP_NOT),
but this operation is caught at the start of mp_unary_op and dispatched as
!mp_obj_is_true(x). mp_obj_is_true has special logic to test for
truthness, and is the correct way to handle the not operation.
Oftentimes, libc, libm, etc. don't come compiled with CPU compressed code
option (Thumb, MIPS16, etc.), but we may still want to use such compressed
code for MicroPython itself.
To use, put the following in mpconfigport.h:
#define MICROPY_OBJ_REPR (MICROPY_OBJ_REPR_D)
#define MICROPY_FLOAT_IMPL (MICROPY_FLOAT_IMPL_DOUBLE)
typedef int64_t mp_int_t;
typedef uint64_t mp_uint_t;
#define UINT_FMT "%llu"
#define INT_FMT "%lld"
Currently does not work with native emitter enabled.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.
This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
- add mp_int_t/mp_uint_t typedefs in mpconfigport.h
- fix integer suffixes/formatting in mpconfig.h and mpz.h
- use MICROPY_NLR_SETJMP=1 in Makefile since the current nlrx64.S
implementation causes segfaults in gc_free()
- update README
This takes previous IEEE-754 single precision float implementation, and
converts it to fully portable parametrizable implementation using C99
functions like signbit(), isnan(), isinf(). As long as those functions
are available (they can be defined in adhoc manner of course), and
compiler can perform standard arithmetic and comparison operations on a
float type, this implementation will work with any underlying float type
(including types whose mantissa is larger than available intergral integer
type).
This change makes the code behave how it was supposed to work when first
written. The avail_slot variable is set to the first free slot when
looking for a key (which would come from deleting an entry). So it's
more efficient (for subsequent lookups) to insert a new key into such a
slot, rather than the very last slot that was searched.
MICROPY_PERSISTENT_CODE must be enabled, and then enabling
MICROPY_PERSISTENT_CODE_LOAD/SAVE (either or both) will allow loading
and/or saving of code (at the moment just bytecode) from/to a .mpy file.
Main changes when MICROPY_PERSISTENT_CODE is enabled are:
- qstrs are encoded as 2-byte fixed width in the bytecode
- all pointers are removed from bytecode and put in const_table (this
includes const objects and raw code pointers)
Ultimately this option will enable persistence for not just bytecode but
also native code.
Currently, the only place that clears the bit is in gc_collect.
So if a block with a finalizer is allocated, and subsequently
freed, and then the block is reallocated with no finalizer then
the bit remains set.
This could also be fixed by having gc_alloc clear the bit, but
I'm pretty sure that free is called way less than alloc, so doing
it in free is more efficient.
This patch adds/subtracts a constant from the 30-bit float representation
so that str/qstr representations are favoured: they now have all the high
bits set to zero. This makes encoding/decoding qstr strings more
efficient (and they are used more often than floats, which are now
slightly less efficient to encode/decode).
Saves about 300 bytes of code space on Thumb 2 arch.
py/mphal.h contains declarations for generic mp_hal_XXX functions, such
as stdio and delay/ticks, which ports should provide definitions for. A
port will also provide mphalport.h with further HAL declarations.
This makes format specifiers ~ fully compatible with CPython.
Adds 24 bytes for stmhal port (because previosuly we had to catch and report
it's unsupported to user).
Scenario: module1 depends on some common file from lib/, so specifies it
in its SRC_MOD, and the same situation with module2, then common file
from lib/ eventually ends up listed twice in $(OBJ), which leads to link
errors.
Make is equipped to deal with such situation easily, quoting the manual:
"The value of $^ omits duplicate prerequisites, while $+ retains them and
preserves their order." So, just use $^ consistently in all link targets.
This saves around 1000 bytes (Thumb2 arch) because in repr "C" it is
costly to check and extract a qstr. So making such check/extract a
function instead of a macro saves lots of code space.
This new object representation puts floats into the object word instead
of on the heap, at the expense of reducing their precision to 30 bits.
It only makes sense when the word size is 32-bits.
Cortex-M0, M0+ and M1 only have ARMv6-M Thumb/Thumb2 instructions. M3,
M4 and M7 have a superset of these, named ARMv7-M. This patch adds a
config option to enable support of the superset of instructions.
It makes much more sense to do constant folding in the parser while the
parse tree is being built. This eliminates the need to create parse
nodes that will just be folded away. The code is slightly simpler and a
bit smaller as well.
Constant folding now has a configuration option,
MICROPY_COMP_CONST_FOLDING, which is enabled by default.
With this patch parse nodes are allocated sequentially in chunks. This
reduces fragmentation of the heap and prevents waste at the end of
individually allocated parse nodes.
Saves roughly 20% of RAM during parse stage.
This patch adds more fine grained error message control for errors when
parsing integers (now has terse, normal and detailed). When detailed is
enabled, the error now escapes bytes when printing them so they can be
more easily seen.
When creating constant mpz's, the length of the mpz must be exactly how
many digits are used (not allocated) otherwise these numbers are not
compatible with dynamically allocated numbers.
Addresses issue #1448.
4 spaces are added at start of line to match previous indent, and if
previous line ended in colon.
Backspace deletes 4 space if only spaces begin a line.
Configurable via MICROPY_REPL_AUTO_INDENT. Disabled by default.
This optimises (in speed and code size) for the common case where the
binary op for the bool object is supported. Unsupported binary ops
still behave the same.
Function annotations are only needed when the native emitter is enabled
and when the current scope is emitted in viper mode. All other times
the annotations can be skipped completely.
Fetch the current usb mode and return a string representation when
pyb.usb_mode() is called with no args. The possible string values are interned
as qstr's. None will be returned if an incorrect mode is set.
Indeed, this flag efectively selects architecture target, and must
consistently apply to all compiles and links, including 3rd-party
libraries, unlike CFLAGS, which have MicroPython-specific setting.
unix-cpy was originally written to get semantic equivalent with CPython
without writing functional tests. When writing the initial
implementation of uPy it was a long way between lexer and functional
tests, so the half-way test was to make sure that the bytecode was
correct. The idea was that if the uPy bytecode matched CPython 1-1 then
uPy would be proper Python if the bytecodes acted correctly. And having
matching bytecode meant that it was less likely to miss some deep
subtlety in the Python semantics that would require an architectural
change later on.
But that is all history and it no longer makes sense to retain the
ability to output CPython bytecode, because:
1. It outputs CPython 3.3 compatible bytecode. CPython's bytecode
changes from version to version, and seems to have changed quite a bit
in 3.5. There's no point in changing the bytecode output to match
CPython anymore.
2. uPy and CPy do different optimisations to the bytecode which makes it
harder to match.
3. The bytecode tests are not run. They were never part of Travis and
are not run locally anymore.
4. The EMIT_CPYTHON option needs a lot of extra source code which adds
heaps of noise, especially in compile.c.
5. Now that there is an extensive test suite (which tests functionality)
there is no need to match the bytecode. Some very subtle behaviour is
tested with the test suite and passing these tests is a much better
way to stay Python-language compliant, rather than trying to match
CPy bytecode.
Previous to this patch there were some cases where line numbers for
errors were 0 (unknown). Now the compiler attempts to give a better
line number where possible, in some cases giving the line number of the
closest statement, and other cases the line number of the inner-most
scope of the error (eg the line number of the start of the function).
This helps to give good (and sometimes exact) line numbers for
ViperTypeError exceptions.
This patch also makes sure that the first compile error (eg SyntaxError)
that is encountered is reported (previously it was the last one that was
reported).
When looking to see if the REPL input needs to be continued on the next
line, don't look inside strings for unmatched ()[]{} ''' or """.
Addresses issue #1387.
ViperTypeError now includes filename and function name where the error
occurred. The line number is the line number of the start of the
function definition, which is the best that can be done without a lot
more work.
Partially addresses issue #1381.
This patch makes configurable, via MICROPY_QSTR_BYTES_IN_HASH, the
number of bytes used for a qstr hash. It was originally fixed at 2
bytes, and now defaults to 2 bytes. Setting it to 1 byte will save
ROM and RAM at a small expense of hash collisions.
Previous to this patch all interned strings lived in their own malloc'd
chunk. On average this wastes N/2 bytes per interned string, where N is
the number-of-bytes for a quanta of the memory allocator (16 bytes on 32
bit archs).
With this patch interned strings are concatenated into the same malloc'd
chunk when possible. Such chunks are enlarged inplace when possible,
and shrunk to fit when a new chunk is needed.
RAM savings with this patch are highly varied, but should always show an
improvement (unless only 3 or 4 strings are interned). New version
typically uses about 70% of previous memory for the qstr data, and can
lead to savings of around 10% of total memory footprint of a running
script.
Costs about 120 bytes code size on Thumb2 archs (depends on how many
calls to gc_realloc are made).
I checked the entire codebase, and every place that vstr_init_len
was called, there was a call to mp_obj_new_str_from_vstr after it.
mp_obj_new_str_from_vstr always tries to reallocate a new buffer
1 byte larger than the original to store the terminating null
character.
In many cases, if we allocated the initial buffer to be 1 byte
longer, we can prevent this extra allocation, and just reuse
the originally allocated buffer.
Asking to read 256 bytes and only getting 100 will still cause
the extra allocation, but if you ask to read 256 and get 256
then the extra allocation will be optimized away.
Yes - the reallocation is optimized in the heap to try and reuse
the buffer if it can, but it takes quite a few cycles to figure
this out.
Note by Damien: vstr_init_len should now be considered as a
string-init convenience function and used only when creating
null-terminated objects.
Previous to this patch, if "abcd" and "ab" were possible completions
to tab-completing "a", then tab would expand to "abcd" straight away
if this identifier appeared first in the dict.
The TimeoutError is useful for some modules, specially the the
socket module. TimeoutError can then be alised to socket.timeout
and then Python code can differentiate between socket.error and
socket.timeout.
When "micropython -m pkg.mod" command was used, relative imports in pkg.mod
didn't work, because pkg.mod.__name__ was set to __main__, and the fact that
it's a package submodule was missed. This is an original workaround to this
issue. TODO: investigate and compare how CPython deals with this issue.
Previous to this patch each time a bytes object was referenced a new
instance (with the same data) was created. With this patch a single
bytes object is created in the compiler and is loaded directly at execute
time as a true constant (similar to loading bignum and float objects).
This saves on allocating RAM and means that bytes objects can now be
used when the memory manager is locked (eg in interrupts).
The MP_BC_LOAD_CONST_BYTES bytecode was removed as part of this.
Generated bytecode is slightly larger due to storing a pointer to the
bytes object instead of the qstr identifier.
Code size is reduced by about 60 bytes on Thumb2 architectures.
Previous to this patch a call such as list.append(1, 2) would lead to a
seg fault. This is because list.append is a builtin method and the first
argument to such methods is always assumed to have the correct type.
Now, when a builtin method is extracted like this it is wrapped in a
checker object which checks the the type of the first argument before
calling the builtin function.
This feature is contrelled by MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG and
is enabled by default.
See issue #1216.
mpconfigport.mk contains configuration options which affect the way
MicroPython is linked. In this regard, it's "stronger" configuration
dependency than even mpconfigport.h, so if we rebuild everything on
mpconfigport.h change, we certianly should of that on mpconfigport.mk
change too.
If heap allocation for the Python-stack of a function fails then we may
as well allocate the Python-stack on the C stack. This will allow to
run more code without using the heap.
This allows to do "ar[i]" and "ar[i] = val" in viper when ar is a Python
object and i and/or val are native viper types (eg ints).
Patch also includes tests for this feature.
This patch converts Q(abc) to "Q(abc)" to protect the abc from the
C preprocessor, then converts back after the preprocessor is finished.
So now we can safely put includes in mpconfig(port).h, and also
preprocess qstrdefsport.h (latter is now done also in this patch).
Addresses issue #1252.
C's printf will pad nan/inf differently to CPython. Our implementation
originally conformed to C, now it conforms to CPython's way.
Tests for this are also added in this patch.