Prior to this commit, pyboard.py used eval() to "parse" file data received
from the board. Using eval() on received data from a device is dangerous,
because a malicious device may inject arbitrary code execution on the PC
that is doing the operation.
Consider the following scenario:
Eve may write a malicious script to Bob's board in his absence. On return
Bob notices that something is wrong with the board, because it doesn't work
as expected anymore. He wants to read out boot.py (or any other file) to
see what is wrong. What he gets is a remote code execution on his PC.
Proof of concept:
Eve:
$ cat boot.py
_print = print
print = lambda *x, **y: _print("os.system('ls /; echo Pwned!')", end="\r\n\x04")
$ ./pyboard.py -f cp boot.py :
cp boot.py :boot.py
Bob:
$ ./pyboard.py -f cp :boot.py /tmp/foo
cp :boot.py /tmp/foo
bin chroot dev home lib32 media opt root sbin sys usr
boot config etc lib lib64 mnt proc run srv tmp var
Pwned!
There's also the possibility that the device is malfunctioning and sends
random and possibly dangerous data back to the PC, to be eval'd.
Fix this problem by using ast.literal_eval() to parse the received bytes,
instead of eval().
Signed-off-by: Michael Buesch <m@bues.ch>
If mpy-cross exits with an error be sure to print that error in a way that
is readable, instead of a long bytes object.
Signed-off-by: Damien George <damien@micropython.org>
The file `mbedtls_errors/mp_mbedtls_errors.c` can be used instead of
`mbedtls/library/error.c` to give shorter error strings, reducing the build
size of the error strings from about 12-16kB down to about 2-5kB.
This reverts commit 4d6f60d428.
This implementation used the timeout as a maximum amount of time needed for
the operation, when actually the spec and other tools suggest that it's the
minumum delay needed between subsequent USB transfers.
This adds support for freezing an entire directory while keeping the
directory as part of the import path. For example
freeze("path/to/library", "module")
will recursively freeze all scripts in "path/to/library/module" and have
them importable as "from module import ...".
Signed-off-by: Damien George <damien@micropython.org>
With only `sp_func_proto_paren = remove` set there are some cases where
uncrustify misses removing a space between the function name and the
opening '('. This sets all of the related options to `force` as well.
This is the result of running...
uncrustify -c tools/uncrustify.cfg --update-config-with-doc -o tools/uncrustify.cfg
...with some manual fixups to correct places where it changed things it
should not have.
Essentially it just adds new config parameters introduced in v0.71.0
with their default values.
Signed-off-by: David Lechner <david@pybricks.com>
Formatting for `* sizeof` was fixed in uncrustify v0.71, so we no longer
need the fixups for it. Also, there was one file where the updated
uncrustify caught a problem that the regex didn't pick up, which is updated
in this commit.
Signed-off-by: David Lechner <david@pybricks.com>
This adds a new command line option `-v` to `tools/codeformat.py` to enable
verbose printing of all files that are scanned.
Normally `uncrustify` and `black` are called with the `-q` option so
setting verbose suppresses the `-q` option and on `black` also enables the
`-v` option which makes it print out all file names matching the filter
similar to how `uncrustify` does by default.
This suppresses the Parsing: <file> as language C lines. This makes
parsing run a bit faster and on CI it makes for less scrolling through logs
(and black already uses the -q option).
Note: the uncrustify configuration is explicitly set to 'add' instead of
'force' in order not to alter the comments which use extra spaces after //
as a means of indenting text for clarity.
The decompression of error-strings is only done if the string is accessed
via printing or via er.args. Tests are added for this feature to ensure
the decompression works.
This adds the Python files in the tests/ directory to be formatted with
./tools/codeformat.py. The basics/ subdirectory is excluded for now so we
aren't changing too much at once.
In a few places `# fmt: off`/`# fmt: on` was used where the code had
special formatting for readability or where the test was actually testing
the specific formatting.
This eliminates the need for the sizeof regex fixup by rearranging things a
bit. All other bitfields already use the parentheses around expressions
with sizeof, so one case is fixed by following this convention.
VM_MAX_STATE_ON_STACK is the only remaining problem and it can be worked
around by changing the order of the operands.
When using a manifest on Windows the reference to mpy-cross compiler was
missing the .exe file extension, so add it when appropriate.
Also allow the default path to mpy-cross to be overridden by the (optional)
MICROPY_MPYCROSS environment variable, to allow full flexibility on any OS.
This commit adds a tool, codeformat.py, which will reformat C and Python
code to fit a certain style. By default the tool will reformat (almost)
all the original (ie not 3rd-party) .c, .h and .py files in this
repository. Passing filenames on the command-line to codeformat.py will
reformat only those. Reformatting is done in-place.
uncrustify is used for C reformatting, which is available for many
platforms and can be easily built from source, see
https://github.com/uncrustify/uncrustify. The configuration for uncrustify
is also added in this commit and values are chosen to best match the
existing code style. A small post-processing stage on .c and .h files is
done by codeformat.py (after running uncrustify) to fix up some minor
items:
- space inserted after * when used as multiplication with sizeof
- #if/ifdef/ifndef/elif/else/endif are dedented by one level when they are
configuring if-blocks and case-blocks.
For Python code, the formatter used is black, which can be pip-installed;
see https://github.com/psf/black. The defaults are used, except for line-
length which is set at 99 characters to match the "about 100" line-length
limit used in C code.
The formatting tools used and their configuration were chosen to strike a
balance between keeping existing style and not changing too many lines of
code, and enforcing a relatively strict style (especially for Python code).
This should help to keep the code consistent across everything, and reduce
cognitive load when writing new code to match the style.
This option makes pyboard.py exit as soon as the script/command is
successfully sent to the device, ie it does not wait for any output. This
can help to avoid hangs if the board is being rebooted with --comman (for
example).
Example usage:
$ python3 ./tools/pyboard.py --device /dev/ttyUSB0 --no-follow \
--command 'import machine; machine.reset()'
Some parts of code have been aligned to increase readability. In general
'' instead of "" were used wherever possible to keep the same convention
for entire file. Import inspect line has been moved to the top according
to hints reported by pep8 tools. A few extra spaces were removed, a few
missing spaces were added. Comments have been updated, mostly in
"read_dfu_file" function. Some other comments have been capitalized and/or
slightly updated. A few docstrings were fixed as well. No real code
changes intended.
This is an alternative to f4ed2df that adds a newline so that the output of
the test starts on a new line and the result of the test is prefixed with
"result: " to distinguish it from the test output.
Suggested-by: @dpgeorge
This patch allows executing .mpy files (including native ones) directly on
a target, eg a board over a serial connection. So there's no need to copy
the file to its filesystem to test it.
For example:
$ mpy-cross foo.py
$ pyboard.py foo.mpy
This makes the loading of viper-code-with-relocations a bit neater and
easier to understand, by treating the rodata/bss like a special object to
be loaded into the constant table (which is how it behaves).
We don't want to add a feature flag to .mpy files that indicate float
support because it will get complex and difficult to use. Instead the .mpy
is built using whatever precision it chooses (float or double) and the
native glue API will convert between this choice and what the host runtime
actually uses.
This commit adds a new tool called mpy_ld.py which is essentially a linker
that builds .mpy files directly from .o files. A new header file
(dynruntime.h) and makefile fragment (dynruntime.mk) are also included
which allow building .mpy files from C source code. Such .mpy files can
then be dynamically imported as though they were a normal Python module,
even though they are implemented in C.
Converting .o files directly (rather than pre-linked .elf files) allows the
resulting .mpy to be more efficient because it has more control over the
relocations; for example it can skip PLT indirection. Doing it this way
also allows supporting more architectures, such as Xtensa which has
specific needs for position-independent code and the GOT.
The tool supports targets of x86, x86-64, ARM Thumb and Xtensa (windowed
and non-windowed). BSS, text and rodata sections are supported, with
relocations to all internal sections and symbols, as well as relocations to
some external symbols (defined by dynruntime.h), and linking of qstrs.
Usage:
mpy-tool.py -o merged.mpy --merge mod1.mpy mod2.mpy
The constituent .mpy files are executed sequentially when the merged file
is imported, and they all use the same global namespace.
While the new manifest.py style got introduced for freezing python code
into the resulting binary, the old way - where files and modules within
ports/*/modules where baked into the resulting binary - was still
supported via `freeze('$(PORT_DIR)/modules')` within manifest.py.
However behaviour changed for symlinked directories (=modules), as those
links weren't followed anymore.
This commit restores the original behaviour by explicitly following
symlinks within a modules/ directory
When loading a manifest file, e.g. by include(), it will chdir first to the
directory of that manifest. This means that all file operations within a
manifest are relative to that manifest's location.
As a consequence of this, additional environment variables are needed to
find absolute paths, so the following are added: $(MPY_LIB_DIR),
$(PORT_DIR), $(BOARD_DIR). And rename $(MPY) to $(MPY_DIR) to be
consistent.
Existing manifests are updated to match.
This introduces a new build variable FROZEN_MANIFEST which can be set to a
manifest listing (written in Python) that describes the set of files to be
frozen in to the firmware.
Instead of encoding 4 zero bytes as placeholders for the simple_name and
source_file qstrs, and storing the qstrs after the bytecode, store the
qstrs at the location of these 4 bytes. This saves 4 bytes per bytecode
function stored in a .mpy file (for example lcd160cr.mpy drops by 232
bytes, 4x 58 functions). And resulting code size is slightly reduced on
ports that use this feature.
This patch compresses the second part of the bytecode prelude which
contains the source file name, function name, source-line-number mapping
and cell closure information. This part of the prelude now begins with a
single varible length unsigned integer which encodes 2 numbers, being the
byte-size of the following 2 sections in the header: the "source info
section" and the "closure section". After decoding this variable unsigned
integer it's possible to skip over one or both of these sections very
easily.
This scheme saves about 2 bytes for most functions compared to the original
format: one in the case that there are no closure cells, and one because
padding was eliminated.
The start of the bytecode prelude contains 6 numbers telling the amount of
stack needed for the Python values and exceptions, and the signature of the
function. Prior to this patch these numbers were all encoded one after the
other (2x variable unsigned integers, then 4x bytes), but using so many
bytes is unnecessary.
An entropy analysis of around 150,000 bytecode functions from the CPython
standard library showed that the optimal Shannon coding would need about
7.1 bits on average to encode these 6 numbers, compared to the existing 48
bits.
This patch attempts to get close to this optimal value by packing the 6
numbers into a single, varible-length unsigned integer via bit-wise
interleaving. The interleaving scheme is chosen to minimise the average
number of bytes needed, and at the same time keep the scheme simple enough
so it can be implemented without too much overhead in code size or speed.
The scheme requires about 10.5 bits on average to store the 6 numbers.
As a result most functions which originally took 6 bytes to encode these 6
numbers now need only 1 byte (in 80% of cases).
Prior to this patch mp_opcode_format would calculate the incorrect size of
the MP_BC_UNWIND_JUMP opcode, missing the additional byte. But, because
opcodes below 0x10 are unused and treated as bytes in the .mpy load/save
and freezing code, this bug did not show any symptoms, since nested unwind
jumps would rarely (if ever) reach a depth of 16 (so the extra byte of this
opcode would be between 0x01 and 0x0f and be correctly loaded/saved/frozen
simply as an undefined opcode).
This patch fixes this bug by correctly accounting for the additional byte.
.
- Split 'qemu-arm' from 'unix' for generating tests.
- Add frozen module to the qemu-arm test build.
- Add test that reproduces the requirement to half-word align native
function data.
Use "-f" to select filesystem mode, followed by the command to execute.
Optionally put ":" at the start of a filename to indicate that it's on the
remote device, if it would otherwise be ambiguous.
Examples:
$ pyboard.py -f ls
$ pyboard.py -f cat main.py
$ pyboard.py -f cp :main.py . # get from device
$ pyboard.py -f cp main.py : # put to device
$ pyboard.py -f rm main.py
Previously, when linking qstr objects in native code for ARM Thumb, the
index into the machine code was being incremented by 4, not 8. It should
be 8 to account for the size of the two machine instructions movw and movt.
This patch makes sure the index into the machine code is incremented by the
correct amount for all variations of qstr linking.
See issue #4829.
Fixes errors in the tool when 1) linking qstrs in native ARM-M code; 2)
freezing multiple files some of which use native code and some which don't.
Fixes issue #4829.
The user can now select their own package index by either passing the "-i"
command line option, or setting the upip.index_urls variable (before doing
an install).
The https://micropython.org/pi package index hosts packages from
micropython-lib and will be searched first when installing a package. If a
package is not found here then it will fallback to PyPI.
Prior to this patch, when a lot of data was output by a running script
pyboard.py would try to capture all of this output into the "data"
variable, which would gradually slow down pyboard.py to the point where it
would have large CPU and memory usage (on the host) and potentially lose
data.
This patch fixes this problem by not accumulating the data in the case that
the data is not needed, which is when "data_consumer" is used.
The qstr window size is not log-2 encoded, it's just the actual number (but
in mpy-tool.py this didn't lead to an error because the size is just used
to truncate the window so it doesn't grow arbitrarily large in memory).
Addresses issue #4635.
This system makes it a lot easier to include external libraries as static,
native modules in MicroPython. Simply pass USER_C_MODULES (like
FROZEN_MPY_DIR) as a make parameter.
When encoded in the mpy file, if qstr <= QSTR_LAST_STATIC then store two
bytes: 0, static_qstr_id. Otherwise encode the qstr as usual (either with
string data or a reference into the qstr window).
Reduces mpy file size by about 5%.
Instead of emitting two bytes in the bytecode for where the linked qstr
should be written to, it is now replaced by the actual qstr data, or a
reference into the qstr window.
Reduces mpy file size by about 10%.
This is an implementation of a sliding qstr window used to reduce the
number of qstrs stored in a .mpy file. The window size is configured to 32
entries which takes a fixed 64 bytes (16-bits each) on the C stack when
loading/saving a .mpy file. It allows to remember the most recent 32 qstrs
so they don't need to be stored again in the .mpy file. The qstr window
uses a simple least-recently-used mechanism to discard the least recently
used qstr when the window overflows (similar to dictionary compression).
This scheme only needs a single pass to save/load the .mpy file.
Reduces mpy file size by about 25% with a window size of 32.
POP_BLOCK and POP_EXCEPT are now the same, and are always followed by a
JUMP. So this optimisation reduces code size, and RAM usage of bytecode by
two bytes for each try-except handler.
Under python3 (tested with 3.6.7) bytes with a list of integers as an
argument returns a different result than under python 2.7 (tested with
2.7.15rc1) which causes pydfu.py to fail when run under 2.7. Changing
bytes to bytearray makes pydfu work properly under both Python 2.7 and
Python 3.6.
If you happen to only have a really simple frozen file that doesn't contain
any new qstrs then the generated frozen_mpy.c file contains an empty
enumeration which causes a C compile time error.
Following an equivalent fix to py/bc.c. The reason the incorrect values
for the opcode constants were not previously causing a bug is because they
were never being used: these opcodes always have qstr arguments so the part
of the code that was comparing them would never be reached.
Thanks to @malinah for finding the problem and providing the initial patch.
A DFU device must be in the idle state before it can be programmed, and
this requires either clearing the status or aborting, depending on its
current state. Code is added to do this. And the USB transfer size is now
automatically detected so devices with a size less than 2048 bytes work
correctly.
Some Python linters don't like unconditional except clauses because they
catch SystemExit and KeyboardInterrupt, which usually is not the intended
behaviour.
There appears to be an issue on Windows with CPython >= 3.6,
sys.stdout.flush() raises an exception:
OSError: [WinError 87] The parameter is incorrect
It works fine to just catch and ignore the error on the flush line. Tested
on Windows 10 x64 1803 (Build 17134.228), Python 3.6.4 amd64.
The Python documentation recommends to pass the command as a string when
using Popen(..., shell=True). This is because "sh -c <string>" is used to
execute the command and additional arguments after the command string are
passed to the shell itself (not the executing command).
https://docs.python.org/3.5/library/subprocess.html#subprocess.Popen
The first dynamic qstr pool is double the size of the 'alloc' field of
the last const qstr pool. The built in const qstr pool
(mp_qstr_const_pool) has a hardcoded alloc size of 10, meaning that the
first dynamic pool is allocated space for 20 entries. The alloc size
must be less than or equal to the actual number of qstrs in the pool
(the 'len' field) to ensure that the first dynamically created qstr
triggers the creation of a new pool.
When modules are frozen a second const pool is created (generally
mp_qstr_frozen_const_pool) and linked to the built in pool. However,
this second const pool had its 'alloc' field set to the number of qstrs
in the pool. When freezing a large quantity of modules this can result
in thousands of qstrs being in the pool. This means that the first
dynamically created qstr results in a massive allocation. This commit
sets the alloc size of the frozen qstr pool to 10 or less (if the number
of qstrs in the pool is less than 10). The result of this is that the
allocation behaviour when a dynamic qstr is created is identical with an
without frozen code.
Note that there is the potential for a slight memory inefficiency if the
frozen modules have less than 10 qstrs, as the first few dynamic
allocations will have quite a large overhead, but the geometric growth
soon deals with this.
The ST DFU bootloader supports a transfer size up to 2048 bytes, so send
that much data on each download (to device) packet. This almost halves
total download time.
Instead of passing thru more and more options from tinytest-codegen to
run-tests --list-tests, pipe output of run-tests --list-tests into
tinytest-codegen.
Gets passed to run-tests --list-tests to get actual list of tests to use.
If --target= is not given, legacy set hardcoded in tinytest-codegen itself
is used.
Also, get rid of tinytest test groups - they aren't really used for
anything, and only complicate processing. Besides, one of the next
step is to limit number of tests per a generated file to control
the binary size, which also will require "flat" list of tests.
The way tinytest was used in qemu-arm test target is that it didn't test
much. MicroPython tests are based on matching the test output against
reference output, but qemu-arm's implementation didn't do that, it
effectively tested just that there was no exception during test
execution. "upytesthelper" wrapper was introduce to fix it, and so
test generator is now switched to generate test code for it.
Also, fix PEP8 and other codestyle issues.
This patch allows the following code to run without allocating on the heap:
super().foo(...)
Before this patch such a call would allocate a super object on the heap and
then load the foo method and call it right away. The super object is only
needed to perform the lookup of the method and not needed after that. This
patch makes an optimisation to allocate the super object on the C stack and
discard it right after use.
Changes in code size due to this patch are:
bare-arm: +128
minimal: +232
unix x64: +416
unix nanbox: +364
stmhal: +184
esp8266: +340
cc3200: +128
This allows to execute a command and communicate with its stdin/stdout
via pipes ("exec") or with command-created pseudo-terminal ("execpty"),
to emulate serial access. Immediate usecase is controlling a QEMU process
which emulates board's serial via normal console, but it could be used
e.g. with helper binaries to access real board over other hadware
protocols, etc.
An example of device specification for these cases is:
--device exec:../zephyr/qemu.sh
--device execpty:../zephyr/qemu2.sh
Where qemu.sh contains long-long qemu startup line, or calls another
command. There's a special support in this patch for running the command
in a new terminal session, to support shell wrappers like that (without
new terminal session, only wrapper script would be terminated, but its
child processes would continue to run).