Commit Graph

12748 Commits

Author SHA1 Message Date
Mike Wadsten c3c2c37fbc tests/basics: Add tests for type-checking subclassed exc instances. 2021-10-21 12:42:48 +11:00
Mike Wadsten fe2bc92b4d py/runtime: Fix crash when exc __new__ doesn't return an exc instance.
See CPython bug https://bugs.python.org/issue39091 for more details.
2021-10-21 12:32:16 +11:00
Damien George 30268c93dc stm32/pendsv: Allow a board to add entries for pendsv_schedule_dispatch.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-20 21:20:18 +11:00
Damien George 69522822de stm32/mpbthciport: Allow a board to hook BT HCI poll functions.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-20 21:20:18 +11:00
Damien George 5f2f9044ff stm32/usbd_cdc_interface: Allow a board to hook into USBD CDC RX events.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-20 21:20:18 +11:00
Andrew Leech cc42b7c88b unix/modusocket: Support MP_STREAM_POLL in unix socket_ioctl.
Allows asyncio reading of network sockets when MICROPY_PY_USELECT is used
in the build configuration.
2021-10-19 22:48:10 +11:00
Andrew Leech 2ceeabf180 extmod/vfs_posix_file: Support MP_STREAM_POLL in vfs_posix_file_ioctl.
Allows asyncio reading of sys.stdin when MICROPY_PY_USELECT is used in the
build configuration.
2021-10-19 22:47:18 +11:00
Damien George ba940250a5 esp32/usb: Improve speed of USB CDC output.
Signed-off-by: Damien George <damien@micropython.org>
2021-10-16 00:23:59 +11:00
Damien George 549448e8bb esp32: Enable optimisations and move code to iRAM to boost performance.
This commit enables some significant optimisations for esp32:
- move the VM to iRAM
- move hot parts of the runtime to iRAM (map lookup, load global/name,
  mp_obj_get_type)
- enable MICROPY_OPT_LOAD_ATTR_FAST_PATH
- enable MICROPY_OPT_MAP_LOOKUP_CACHE
- disable assertions
- change from -Os to -O2 for compilation

It's hard to measure performance on esp32 due to external flash and
hardware caching.  But this set of changes improves performance compared to
master by (on a TinyPICO with the GENERIC build, using IDF 4.2.2, running
at 160MHz):

diff of scores (higher is better)
N=100 M=100    esp32-master -> esp32-perf       diff      diff% (error%)
bm_chaos.py           71.28 ->     268.08 :  +196.80 = +276.094% (+/-0.04%)
bm_fannkuch.py        44.10 ->      69.31 :   +25.21 = +57.166% (+/-0.01%)
bm_fft.py           1385.27 ->    2538.23 : +1152.96 = +83.230% (+/-0.01%)
bm_float.py         1060.94 ->    3900.62 : +2839.68 = +267.657% (+/-0.03%)
bm_hexiom.py          10.90 ->      32.79 :   +21.89 = +200.826% (+/-0.02%)
bm_nqueens.py       1000.83 ->    2372.87 : +1372.04 = +137.090% (+/-0.01%)
bm_pidigits.py       288.13 ->     664.40 :  +376.27 = +130.590% (+/-0.46%)
misc_aes.py          102.45 ->     345.69 :  +243.24 = +237.423% (+/-0.01%)
misc_mandel.py      1016.58 ->    2121.92 : +1105.34 = +108.731% (+/-0.01%)
misc_pystone.py      632.91 ->    1801.87 : +1168.96 = +184.696% (+/-0.08%)
misc_raytrace.py      76.66 ->     281.78 :  +205.12 = +267.571% (+/-0.05%)
viper_call0.py       210.63 ->     273.17 :   +62.54 = +29.692% (+/-0.01%)
viper_call1a.py      208.45 ->     269.51 :   +61.06 = +29.292% (+/-0.00%)
viper_call1b.py      185.44 ->     228.25 :   +42.81 = +23.086% (+/-0.01%)
viper_call1c.py      185.86 ->     228.90 :   +43.04 = +23.157% (+/-0.01%)
viper_call2a.py      207.10 ->     267.25 :   +60.15 = +29.044% (+/-0.00%)
viper_call2b.py      173.76 ->     209.42 :   +35.66 = +20.523% (+/-0.00%)

Five tests have more than 3x speed up (200%+).

The performance of the tests bm_fft, bm_pidigits and misc_aes now scale
with CPU frequency (eg changing frequency to 240MHz boosts the performance
of these by 50%), which means they are no longer influenced by timing of
external flash access.  (The viper_call* tests did previously scale with
CPU frequency, and they still do.)

Turning off assertions reduces code size by about 80k, and going from -Os
to -O2 costs about 100k, so the net change in code size (for the GENERIC
board) is about +20k.

If a board wants to enable assertions, or use -Os instead of -O2, that's
still possible by overriding the sdkconfig parameters.

Signed-off-by: Damien George <damien@micropython.org>
2021-10-16 00:07:11 +11:00
Damien George 8412568e7b py: Add wrapper macros so hot VM functions can go in fast code location.
For example, on esp32 they can go in iRAM to improve performance.

Signed-off-by: Damien George <damien@micropython.org>
2021-10-15 23:31:19 +11:00
iabdalkader eea6cd85b3 stm32/sdram: Enforce gcc opt, and use volatile and DSB in sdram_test.
Ensures consistent behaviour and resolves the D-Cache bug (the "exhaustive"
argument being lost due to cache being turned off) when O0 is used.

The changes in this commit are:

- Change -O0 to -Os because "gcc is considered broken at -O0" according to
  https://github.com/ARM-software/CMSIS_5/issues/620#issuecomment-550235656

- Use volatile for mem_base so the compiler doesn't optimise away reads or
  writes to the SDRAM, which is being tested.

- Use DSB to prevent any other compiler optimisations that would change the
  testing logic.

- Use alternating pattern/antipattern in exhaustive test to catch more
  hardware/configuration errors.

Implementation adapted by @andrewleech, taken directly from investigation
by @iabdalkader and @dpgeorge.

See #7841 and #7869 for further discussion.
2021-10-15 17:59:31 +11:00
NitiKaur 4c9e17e0a1 docs/esp32/tutorial: Add an example of peripheral control via regs. 2021-10-14 23:31:45 +11:00
NitiKaur 763042a287 docs/library/stm.rst: Document the stm module. 2021-10-14 23:19:08 +11:00
NitiKaur e87b2e8bfa docs/reference/manifest.rst: Add docs for manifest.py files. 2021-10-14 14:03:03 +11:00
NitiKaur 135339ce3a docs/reference/mpremote.rst: Add docs for mpremote. 2021-10-14 13:09:51 +11:00
NitiKaur c42c1c8718 docs/library/random.rst: Document the random module. 2021-10-13 16:56:37 +11:00
NitiKaur baa5a76fc0 docs/rp2: Add reference for PIO assembly instructions, and PIO tutorial. 2021-10-13 15:54:49 +11:00
stijn d42cba0d22 extmod/moduplatform: Improve implementation for PC ports.
Fix identification of 32/64 bit and of the Windows platform and add a
platform string mimicking CPython for the latter.
2021-09-24 13:51:39 +10:00
stijn ea880d5674 py/builtinimport: Forward all debug printing to MICROPY_DEBUG_PRINTER. 2021-09-24 13:17:19 +10:00
Damien George ea186de4c5 esp32: Split out WLAN code from modnetwork.c to network_wlan.c.
To match network_lan.c and network_ppp.c, and make it clear what code is
specifically for WLAN support.

Also provide a configuration option MICROPY_PY_NETWORK_WLAN which can be
used to fully disable network.WLAN (it's enabled by default).

Signed-off-by: Damien George <damien@micropython.org>
2021-09-24 12:41:35 +10:00
Damien George f046b50ca5 esp32/main: Add option for a board to hook code into startup sequence.
To do this the board must define MICROPY_BOARD_STARTUP, set
MICROPY_SOURCE_BOARD then define the new start-up code.

For example, in mpconfigboard.h:

    #define MICROPY_BOARD_STARTUP board_startup
    void board_startup(void);

in mpconfigboard.cmake:

    set(MICROPY_SOURCE_BOARD
        ${MICROPY_BOARD_DIR}/board.c
    )

and in a new board.c file in the board directory:

    #include "py/mpconfig.h"

    void board_startup(void) {
        boardctrl_startup();
        // extra custom startup
    }

This follows stm32's boardctrl facilities.

Signed-off-by: Damien George <damien@micropython.org>
2021-09-24 12:23:14 +10:00
leo chung 4fdf795efa esp32/mpthreadport: Fix TCB cleanup function so thread_mutex is ready.
Because vPortCleanUpTCB is called by the FreeRTOS idle task, and it checks
thread, but didn't check the thread_mutex.

And if thread is not NULL, but thread_mutex not ready then it will crash
with an error when calling mp_thread_mutex_lock(&thread_mutex, 1).

As suggested by @dpgeorge, move the thread = &thread_entry0 line to the end
of mp_thread_init().

Signed-off-by: leo chung <gewalalb@gmail.com>
2021-09-24 12:12:03 +10:00
Seon Rozenblum a39a596b79 esp32/machine_pin: Block out IO16 and IO17 when using SPIRAM on ESP32.
Fixes issue #7819.
2021-09-24 10:56:09 +10:00
Seon Rozenblum 35fb90bd57 esp32/usb: Add USB host connection detection for CDC serial output.
This callback allows detecting if there is a USB host connected to the CDC
or not, in which case the stdout_tx should skip CDC TX writing and
flushing or the system will block.

Fixes issue #7820.
2021-09-22 00:42:20 +10:00
Seon Rozenblum 7bf466a281 esp32/README: Updated readme with req IDF vers for ESP32-S2, C3 and S3. 2021-09-22 00:41:02 +10:00
IhorNehrutsa 71111cffba docs/esp32: Explain ESP32 PWM modes, timers, and channels. 2021-09-21 23:28:16 +10:00
IhorNehrutsa 52636fa692 esp32/machine_pwm: Add support for all PWM timers and channels.
This commit allows using all the available PWM timers (up to 8) and
channels (up to 16), without affecting the PWM API.

If a new frequency is set, first it checks if another timer is using the
same frequency.  If yes, then it uses this timer, otherwise, it creates a
new one.  If all timers are used, the user should set an already used
frequency, or de-init a channel.

This work is based on #6276 and #3608.
2021-09-21 23:18:09 +10:00
Stewart Bonnick 0d9429f44c esp32/boards: Add LOLIN_S2_MINI ESP32-S2 board.
To support Lolin S2 Mini ESP32-S2 Variant board.  More information about
this board can be found at https://www.wemos.cc/en/latest/s2/s2_mini.html
2021-09-21 22:49:51 +10:00
iabdalkader 67d1dca9c2 stm32/machine_i2c: Use hardware I2C for STM32H7. 2021-09-21 18:13:28 +10:00
roland van straten 9eff4029ab stm32/boards: Add PF11-BOOT0 to stm32f091_af.csv.
PF11 is added so it can be used as GPIO.
2021-09-21 18:11:42 +10:00
Ned Konz 99bb52047c stm32/boards/NUCLEO_H743ZI: Enable VfsLfs2 on NUCLEO_H743ZI(2) boards. 2021-09-21 18:02:19 +10:00
Ned Konz 8c214ed200 stm32: Extended flash filesystem space to 512K on H743 boards.
The H743 has equal sized pages of 128k, which means the filesystem doesn't
need to be near the beginning.  This commit moves the filesystem to the
very end of flash, and extends it to 512k (4 pages).

Signed-off-by: Damien George <damien@micropython.org>
2021-09-21 18:02:14 +10:00
iabdalkader 782d5b2e53 stm32: Enable platform module.
The HAL version is based on the stm32lib version.
2021-09-19 23:35:44 +10:00
iabdalkader 2c5e9bbdfa extmod: Add platform module.
It contains the compiler version, and underlying system HAL/SDK version.
2021-09-19 23:35:10 +10:00
iabdalkader 38f8e852e0 rp2: Add framework for networking.
MICROPY_PY_NETWORK and MICROPY_PY_USOCKET need to be enabled by a board to
get networking.  No NICs have yet been defined.
2021-09-19 23:20:13 +10:00
iabdalkader c973cfd2f3 rp2: Add support for bluetooth module using NimBLE. 2021-09-19 23:09:59 +10:00
iabdalkader 8064c3bf9c extmod/nimble: Add nimble CMake fragment file. 2021-09-19 23:02:16 +10:00
iabdalkader 80f2c794e6 extmod/mpbthci.h: Add mp_bluetooth_hci_uart_any prototype.
This allows drivers to use mpbthciport functions to read/write/poll UART.
2021-09-19 23:00:39 +10:00
Chris Fiege 6e39f2cc1e stm32/boards: Add OLIMEX H407 board definition.
This change adds the OLIMEX H407 support to the STM32 port.  The H407
(https://www.olimex.com/Products/ARM/ST/STM32-H407/) is simliar to the
already existing E407
(https://www.olimex.com/Products/ARM/ST/STM32-E407) but does not support
Ethernet and has a full-size USB-A port instead of a Mini-USB socket.

Both boards use the STM32F407ZGT6 CPU.

This port is basically a copy of the E407 but with changed pinmux:
* Removed Ethernet pin definition
* Removed UART1 (pins are used for other functions)
* Removed UART3 flow control pins (pins are used for other functions)
* Removed SD-Card detect pin (since it is not connected on the H407)

A REPL on UART3 is connected to the U3BOOT-header, a 3-pin header with RX,
TX and GND that is intended for the serial terminal.

Tested:
* Micro-SD Card is detected when inserted on RESET
* REPL on UART3 works
* Serial port on the mini USB socket

Signed-off-by: Chris Fiege <cfi@pengutronix.de>
2021-09-19 16:58:58 +10:00
patrick 4cfd85eb4a esp32/boards: Add board definition for ESP32-S2-WROVER module. 2021-09-19 16:49:35 +10:00
Seon Rozenblum 13e6e0d7f5 esp32/machine_hw_spi: Fix hardware SPI DMA channels for S2/S3. 2021-09-19 10:23:12 +10:00
Damien George da4593f937 tools/ci.sh: Use IDF v4.4 as part of esp32 CI and build GENERIC_S3.
IDF v4.4 does not have an official release so for now use the latest
master.  Also remove building GENERIC with no options (all the other boards
are no-option builds), to keep CI time reasonable.

Signed-off-by: Damien George <damien@micropython.org>
2021-09-16 22:59:05 +10:00
Damien George 80fe25689f esp32/boards: Add new GENERIC_S3 board definition.
Thanks to Seon Rozenblum aka @UnexpectedMaker for the work.

Signed-off-by: Damien George <damien@micropython.org>
2021-09-16 22:58:47 +10:00
Damien George 54d33b266c esp32: Add support for ESP32-S3 SoCs.
Thanks to Seon Rozenblum aka @UnexpectedMaker for the work.

Signed-off-by: Damien George <damien@micropython.org>
2021-09-16 22:58:47 +10:00
Jim Mussared b326edf68c all: Remove MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE.
This commit removes all parts of code associated with the existing
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the
-mcache-lookup-bc option to mpy-cross.

This feature originally provided a significant performance boost for Unix,
but wasn't able to be enabled for MCU targets (due to frozen bytecode), and
added significant extra complexity to generating and distributing .mpy
files.

The equivalent performance gain is now provided by the combination of
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has
been enabled on the unix port in the previous commit).

It's hard to provide precise performance numbers, but tests have been run
on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V,
xtensa) and they all generally agree on the qualitative improvements seen
by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and
MICROPY_OPT_MAP_LOOKUP_CACHE.

For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the
change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined
with MAP_LOOKUP_CACHE is:

diff of scores (higher is better)
N=2000 M=2000       bccache -> attrmapcache      diff      diff% (error%)
bm_chaos.py        13742.56 ->   13905.67 :   +163.11 =  +1.187% (+/-3.75%)
bm_fannkuch.py        60.13 ->      61.34 :     +1.21 =  +2.012% (+/-2.11%)
bm_fft.py         113083.20 ->  114793.68 :  +1710.48 =  +1.513% (+/-1.57%)
bm_float.py       256552.80 ->  243908.29 : -12644.51 =  -4.929% (+/-1.90%)
bm_hexiom.py         521.93 ->     625.41 :   +103.48 = +19.826% (+/-0.40%)
bm_nqueens.py     197544.25 ->  217713.12 : +20168.87 = +10.210% (+/-3.01%)
bm_pidigits.py      8072.98 ->    8198.75 :   +125.77 =  +1.558% (+/-3.22%)
misc_aes.py        17283.45 ->   16480.52 :   -802.93 =  -4.646% (+/-0.82%)
misc_mandel.py     99083.99 ->  128939.84 : +29855.85 = +30.132% (+/-5.88%)
misc_pystone.py    83860.10 ->   82592.56 :  -1267.54 =  -1.511% (+/-2.27%)
misc_raytrace.py   21490.40 ->   22227.23 :   +736.83 =  +3.429% (+/-1.88%)

This shows that the new optimisations are at least as good as the existing
inline-bytecode-caching, and are sometimes much better (because the new
ones apply caching to a wider variety of map lookups).

The new optimisations can also benefit code generated by the native
emitter, because they apply to the runtime rather than the generated code.
The improvement for the native emitter when LOAD_ATTR_FAST_PATH and
MAP_LOOKUP_CACHE are enabled is (same Linux environment as above):

diff of scores (higher is better)
N=2000 M=2000        native -> nat-attrmapcache  diff      diff% (error%)
bm_chaos.py        14130.62 ->   15464.68 :  +1334.06 =  +9.441% (+/-7.11%)
bm_fannkuch.py        74.96 ->      76.16 :     +1.20 =  +1.601% (+/-1.80%)
bm_fft.py         166682.99 ->  168221.86 :  +1538.87 =  +0.923% (+/-4.20%)
bm_float.py       233415.23 ->  265524.90 : +32109.67 = +13.756% (+/-2.57%)
bm_hexiom.py         628.59 ->     734.17 :   +105.58 = +16.796% (+/-1.39%)
bm_nqueens.py     225418.44 ->  232926.45 :  +7508.01 =  +3.331% (+/-3.10%)
bm_pidigits.py      6322.00 ->    6379.52 :    +57.52 =  +0.910% (+/-5.62%)
misc_aes.py        20670.10 ->   27223.18 :  +6553.08 = +31.703% (+/-1.56%)
misc_mandel.py    138221.11 ->  152014.01 : +13792.90 =  +9.979% (+/-2.46%)
misc_pystone.py    85032.14 ->  105681.44 : +20649.30 = +24.284% (+/-2.25%)
misc_raytrace.py   19800.01 ->   23350.73 :  +3550.72 = +17.933% (+/-2.79%)

In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options:
- are simpler;
- take less code size;
- are faster (generally);
- work with code generated by the native emitter;
- can be used on embedded targets with a small and constant RAM overhead;
- allow the same .mpy bytecode to run on all targets.

See #7680 for further discussion.  And see also #7653 for a discussion
about simplifying mpy-cross options.

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2021-09-16 16:04:03 +10:00
Jim Mussared 60c6d5594f unix: Enable LOAD_ATTR fast path, and map lookup caching.
Enabled for all variants except minimal.

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2021-09-16 16:02:19 +10:00
Jim Mussared 68219a295c stm32: Enable LOAD_ATTR fast path, and map lookup caching on >M0.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2021-09-16 16:02:19 +10:00
Jim Mussared 11ef8f22fe py/map: Add an optional cache of (map+index) to speed up map lookups.
The existing inline bytecode caching optimisation, selected by
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, reserves an extra byte in the
bytecode after certain opcodes, which at runtime stores a map index of the
likely location of this field when looking up the qstr.  This scheme is
incompatible with bytecode-in-ROM, and doesn't work with native generated
code.  It also stores bytecode in .mpy files which is of a different format
to when the feature is disabled, making generation of .mpy files more
complex.

This commit provides an alternative optimisation via an approach that adds
a global cache for map offsets, then all mp_map_lookup operations use it.
It's less precise than bytecode caching, but allows the cache to be
independent and external to the bytecode that is executing.  It also works
for the native emitter and adds a similar performance boost on top of the
gain already provided by the native emitter.

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2021-09-16 16:02:19 +10:00
Jim Mussared 7b89ad8dbf py/vm: Add a fast path for LOAD_ATTR on instance types.
When the LOAD_ATTR opcode is executed there are quite a few different cases
that have to be handled, but the common case is accessing a member on an
instance type.  Typically, built-in types provide methods which is why this
is common.

Fortunately, for this specific case, if the member is found in the member
map then there's no further processing.

This optimisation does a relatively cheap check (type is instance) and then
forwards directly to the member map lookup, falling back to the regular
path if necessary.

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2021-09-16 16:02:15 +10:00
Jim Mussared 910e060f93 minimal/mpconfigport.h: Use MICROPY_CONFIG_ROM_LEVEL_MINIMUM.
Update minimal port to use the new "minimal" rom level config (this is a
no-op change, the binary is the same size and contains the exact same
symbols).

Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2021-09-16 13:24:33 +10:00