Frozen modules will be searched preferentially, but gives the user the
ability to override this behavior.
This matches the previous behavior where "" was implicitly the frozen
search path, but the frozen list was checked before the filesystem.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This feature is not enabled on any port, it's not in CPython's io module,
and functionality is better suited to the micropython-lib implementation of
pkg_resources.
Default SPI pins are now correctly assigned by machine_hw_spi.c even for S2
and S3. mpconfigboard.h files define defaults with flipped SPI(1) and
SPI(2) to workaround a bug in machine_hw_spi.c - the bug is fixed.
Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
Use IO_MUX pins as defined by ESP IDF in soc/esp32/include/soc/spi_pins.h
ESP32S2 and S3 don't have IO_MUX pins for SPI3, GPIO matrix is always used.
Choose suitable defaults for S2 and S3.
ESP32C3 does not have SPI3 at all. Don't define pin mappings for it.
Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
Use IO_MUX pins as defined by ESP IDF in soc/esp32*/include/soc/spi_pins.h
Alternatively use now deprecated HSPI_IOMUX_PIN_NUM_xxx
(or FSPI_IOMUX_PIN_NUM_xxx for ESP32S2) for compatibility with IDF 4.2
and older.
Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
The index of machine_hw_spi_obj and machine_hw_spi_default_pins arrays is
assigned to 0 for ARG_id==HSPI_HOST and 1 for another SPI. On ESP32S2 and
S3 HSPI_HOST=2 so the first set (idx=0) of default pins is used for
SPI(id=2) aka HSPI/SPI3 and the second set (idx=1) for SPI(id=1) aka
FSPI/SPI2. This makes a misleading mess in MICROPY_HW_SPIxxxx definitions
and it is also in contradiction to the comments around the definitions.
Change the test of ARG_id to fix the order of machine_hw_spi_default_pins.
This change might require adjusting MICROPY_HW_SPIxxxx definitions in
mpconfigboard.h of S2/S3 based boards.
Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
- Move the qspi_xxxx_flash_config.c files to hal.
It turned out that they are less board than flash type specific.
- Change to a common flexspi_flash_config.h header file.
Thanks for the hint, Damien. The DEBUG build got very large recently.
The major difference is, that inline function are now inlined and
not included as a function. That's good and maybe bad. The good thing is,
that the code speed si now close to the final code. It could be worse
in single step debugging. I'll see.
Setting this option caused a new warning and a formatting error
to pop up at different places. Fixed as well.
The ID is read in a single function and used for:
- machine.unique_id()
- Ethernet MAC addresses.
- ...
That facilitates use of other MCU using a different access method for
the ID (e.g. i.MX RT1176).
Just another choice for the PHY interface.
Added: Keyword option phy_clock=LAN.IN or LAN.OUT
to define the source of the 50MHZ clock for the PHY
interface. The RMII clock is not enabled if it
is generated by a PYH board. Constants:
LAN.IN The clock is provided by the PHY board.
LAN.OUT The clock is provided by the MCU board.
The default is LAN.OUT or the value set in mpconfigboard.h, which
is currently set to IN only for the SEEED ARCH MIX board. Usage etc:
lan = LAN(phy_type=LAN.PHY_DP83848, phy_clock=LAN.IN)
The initial problem with a wrong ICMP checksum was caused by
the test code setting a checksum and the HW taking that probably as
the start value and ending up with 0xffff. With a checksum field of 0
set by the test code the HW creates the proper checksum.
Useful for boards without a PHY interface, where that has to be
attached. Like the Seed ARCH MIX board or Vision SOM. Phy drivers
supported so far are:
- KSZ8081
- DP83825
- LAN8720
More to come. Usage e.g.:
lan = LAN(phy_type=LAN.PHY_LAN8720, phy_addr=1)
The default values are those set in mpconfigboard.h.
UART 0 is attached to the Debug USB port. The settings are
115200 Baud, 8N1.
For MIMXRT1010_EVK this is identical to UART1. For the other boards,
this is an additional UART.
The pico-sdk 1.3.0 update in 97a7cc243b
introduced a change that broke RP2 Bluetooth UART, and possibly UART in
general, which stops working right after UART is initialized. The commit
raspberrypi/pico-sdk@2622e9b enables the UART receive timeout (RTIM) IRQ,
which is asserted when the receive FIFO is not empty, and no more
characters are received for a period of time.
This commit makes sure the RTIM IRQ is handled and cleared in
uart_service_interrupt.
The current ST HAL does not support reading the extended CSD so cannot
correctly detect the capacity of high-capacity cards. As a workaround, the
capacity can be forced via the MICROPY_HW_MMCARD_LOG_BLOCK_NBR config
option.
Signed-off-by: Damien George <damien@micropython.org>
These were commented correctly by their colour, but in the wrong order with
respect to the PCB silkscreen.
Fixes issue #8054.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
For the coverage build this reduces the binary size to about 1/4 of its
size, and seems to help gcov/lcov coverage analysis so that it doesn't miss
lines.
Signed-off-by: Damien George <damien@micropython.org>
The methods duty_u16() and duty_ns() are implemented to match the existing
docs. The duty will remain the same when the frequency is changed.
Standard ESP32 as well as S2, S3 and C3 are supported.
Thanks to @kdschlosser for the fix for rounding in resolution calculation.
Documentation is updated and examples expanded for esp32, including the
quickref and tutorial. Additional notes are added to the machine.PWM docs
regarding limitations of hardware PWM.
Instead of board pins, so that pins which have only the CPU specified in
pins.csv can still be used with mp_hal_pin_config_alt_static().
Signed-off-by: Damien George <damien@micropython.org>
A board can now define the following linker symbols to configure its flash
storage layout:
_micropy_hw_internal_flash_storage_start
_micropy_hw_internal_flash_storage_end
_micropy_hw_internal_flash_storage_ram_cache_start
_micropy_hw_internal_flash_storage_ram_cache_end
And optionally have a second flash segment by configuring
MICROPY_HW_ENABLE_INTERNAL_FLASH_STORAGE_SEGMENT2 to 1 and defining:
_micropy_hw_internal_flash_storage2_start
_micropy_hw_internal_flash_storage2_end
Signed-off-by: Damien George <damien@micropython.org>
This reduces code size and code duplication, and fixes `pyb.usb_mode()` so
that it now returns the correct string when in multi-VCP mode (before, it
would return None when in one of these modes).
Signed-off-by: Damien George <damien@micropython.org>
Frequency range 15Hz/18Hz to > 1 MHz, with decreasing resolution of the
duty cycle. The basic API is supported as documentated, except that
keyword parameters are accepted for both the instatiaton and the
PWM.init() call.
Extensions: support PWM for channel pairs. Channel pairs are declared by
supplying 2-element tuples for the pins. The two channels of a pair must
be the A/B channel of a FLEXPWM module. These form than a complementary
pair.
Additional supported keyword arguments:
- center=value Defines the center position of a pulse within the pulse
cycle. The align keyword is actually shortcut for center.
- sync=True|False: If set to True, the channels will be synchronized to a
submodule 0 channel, which has already to be enabled.
- align=PWM.MIDDLE | PMW.BEGIN | PWM.END. It defines, whether synchronized
channels are Center-Aligned or Edge-aligned. The channels must be either
complementary a channel pair or a group of synchronized channels. It may
as well be applied to a single channel, but withiout any benefit.
- invert= 0..3. Controls ouput inversion of the pins. Bit 0 controls the
first pin, bit 1 the second.
- deadtime=time_ns time of complementary channels for delaying the rising
slope.
- xor=0|1|2 xor causes the output of channel A and B to be xored. If
applied to a X channel, it shows the value oif A ^ B. If applied to an A
or B channel, both channel show the xored signal for xor=1. For xor=2,
the xored signal is split between channels A and B. See also the
Reference Manual, chapter about double pulses. The behavior of xor=2 can
also be achieved using the center method for locating a pulse within a
clock period.
The output is enabled for board pins only.
CPU pins may still be used for FLEXPWM, e.g. as sync source, but the signal
will not be routed to the output. That applies only to FLEXPWM pins. The
use of QTMR pins which are not board pins will be rejected.
As part of this commit, the _WFE() statement is removed from
ticks_delay_us64() to prevent PWM glitching during calls to sleep().
This is to make the builds for all nucleo/discovery boards uniform, so they
can be treated the same by the auto build scripts.
The CI script is updated to explicitly enable mboot and packing, to test
these features.
Signed-off-by: Damien George <damien@micropython.org>
This prevents SPI4/5 from being used if SDIO and CYW43 are enabled, because
the DMA for the SDIO is used on an IRQ and must be exclusivly available for
use by the SDIO peripheral.
Signed-off-by: Damien George <damien@micropython.org>
Because DMA2 may be in use by other peripherals, eg SPI1.
On PYBD-SF6 it's possible to trigger a bug in the existing code by turning
on WLAN and connecting to an AP, pinging the IP address from a PC and
running the following code on the PYBD:
def spi_test(s):
while 1:
s.write('test')
s.read(4)
spi_test(machine.SPI(1,100000000))
This will eventually fail with `OSError: [Errno 110] ETIMEDOUT` because
DMA2 was turned off by the CYW43 driver during the SPI1 transfer.
This commit fixes the bug by removing the code that explicitly disables
DMA2. Instead DMA2 will be automatically disabled after an inactivity
timeout, see commit a96afae90f
Signed-off-by: Damien George <damien@micropython.org>
Quail (https://www.mikroe.com/quail, PID: MIKROE-1793) is based on an
STM32F427VI CPU, featuring 2048 kB of Flash memory and 192 kB of RAM. An
on-board Cypress S25FL164K adds 8 MB of SPI Flash.
Quail has 4 mikroBUS(TM) sockets for Mikroe click(TM) board connectivity,
along with 24 screw terminals for connecting additional electronics and two
USB ports (one for programming, the other for external mass storage).
4 UARTs, 2 SPIs and 1 I2C bus are available for communication.
Signed-off-by: Lorenzo Cappelletti <lorenzo.cappelletti@gmail.com>
Was incorrectly added as 7MB for an 8MB SPI flash, but this board has a
16MB chip, not 8MB, so it should be 15MB leaving 1MB for MicroPython.
Thanks to @robert-hh
The rp2.StateMachine.exec errors when supplying a sideset action. This
commit passes the sideset_opt from the StateMachine though to the parser.
It also adds some value validation to the sideset operator.
Additionally, the "word" method is added to the exec to allow any other
unsupported opcodes.
Fixes issue #7924.
- Makefile: update to use new ASF4 files, support frozen manifest, and
include source files in upcoming commits
- boards/manifest.py: add files to freeze
- boards/samd51p19a.ld: add linker script for this MCU
- help.c: add custom help text
- main.c: execute _boot.py, boot.py and main.py on start-up
- modules/_boot.py: startup file to freeze
- modutime.c: add gmtime, localtime, mktime, time functions
- mpconfigport.h: enabled more features for sys and io and modules
- mphalport.h: add mp_hal_pin_xxx macros
- mphalport.c: add mp_hal_stdio_poll
Don't force the 'HAL' string to be part of the platform string because
it doesn't have a sensible meaning for all possible platforms, and
swap it with the PLATFORM_ARCH string so the strings which most platforms
have come first.
Although the pyboard has only 4 LEDs, there are some boards that (may) have
more. This commit adds 2 more LEDs to the led.c file that if defined in
the board-specific config file will be compiled in.
- Add board.md files for MIMXRT1060_EVK and MIMXRT1064_EVK warning about
their experimental state.
- Add separate deploy_teensy.md and deploy_mimxrt.md files.
The ARCH MIX board exposes the Ethernet Pins at it's connectors. Therefore
the software is configured for using a LAN8720 PHY device. Breakout boards
with the LAN8720 are easily available.
This commit adds I2S protocol support for the rp2 port:
- I2S API is consistent with STM32 and ESP32 ports
- I2S configurations supported:
- master transmit and master receive
- 16-bit and 32-bit sample sizes
- mono and stereo formats
- sampling frequency
- 3 modes of operation:
- blocking
- non-blocking with callback
- uasyncio
- internal ring buffer size can be tuned
- DMA IRQs are managed on an I2S object basis, allowing other
RP2 entities to use DMA IRQs when I2S is not being used
- MicroPython documentation
- tested on Raspberry Pi Pico development board
- build metric changes for this commit: text(+4552), data(0), bss(+8)
Signed-off-by: Mike Teachman <mike.teachman@gmail.com>
Eliminate noise data from being sent to the I2S peripheral when the
transmitted sample stream is stopped.
Signed-off-by: Mike Teachman <mike.teachman@gmail.com>
Now that there are feature levels, and that this port uses
MICROPY_CONFIG_ROM_LEVEL_MINIMUM, it's easy to see what optional features
can be disabled. And this commit disables them.
Signed-off-by: Damien George <damien@micropython.org>
Word-size specific configuration is now done automatically, so it no longer
requires this to match the ARM configuration.
Also it's less common to have 32-bit compilation support installed, so this
will make it work "out of the box" for more people.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This is an stm32-specific feature that's accessed via the pyb module, so
not something that will be widely enabled.
Signed-off-by: Damien George <damien@micropython.org>
This commit is a no-op change. Future improvements can come from making
individual boards use CORE or BASIC.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Prior to this commit IRQs on STM32F4 could be lost because SR is cleared by
reading SR then reading DR. For example, if both RXNE and IDLE IRQs were
active upon entry to the IRQ handler, then IDLE is lost because the code
that handles RXNE comes first and accidentally clears SR (by reading SR
then DR to get the incoming character).
This commit fixes this problem by making the IRQ handler more atomic in the
following operations:
- get current IRQ status flags
- deal with RX character
- clear remaining status flags
- call user handler
On the STM32F4 it's very hard to get this right because the only way to
clear IRQ status flags is to read SR then DR, but the read of DR may read
some data which should remain in the register until the user wants to read
it. And it won't work to cache the read because RTS/CTS flow control will
then not work. So instead the new code disables interrupts if the DR is
full and waits for the user to read it before reenabling the interrupts.
Fixes issue mentioned in #4599 and #6082.
Signed-off-by: Damien George <damien@micropython.org>
Following on from ba940250a5, the change here
makes output about 15 times faster (now up to about 550 kbytes/sec).
tinyusb_cdcacm_write_queue will return the number of bytes written, so
there's no need to use tud_cdc_n_write_available.
Signed-off-by: Damien George <damien@micropython.org>
This will be used by https://micropython.org/download/ to generate the
full listing of boards and firmware files.
Optionally supports a board.md for additional customisation of the
download page, as well as deploy.md for flashing instructions.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
By moving code to ITCM, like vm, gc, parse, runtime. The change affects
mostly the execution speed of MicroPython code. The speed is increased by
up to a factor of 6, especially for MCU with small cache.
Prior to this commit mp_hal_ticks_cpu() was not started properly. It only
started when the code was executed with a debugger attached, except for the
Teensy (i.MXRT1062) boards. As an additional fix, the CYYCNT timer is now
started at boot time.
Also rename mp_hal_ticks_cpu_init() to mp_hal_ticks_cpu_enable().
The API follows that of rp2, stm32, esp32, and the docs.
wdt=machine.WDT(0, timeout)
Timeout is given in ms. The valid range is 500 to 128000 (128
seconds) with 500 ms granularity. Values outside of that range will
be silently aligned.
wdt.feed()
Resets the watchdog timer (feeding).
wdt.timeout_ms(value)
Sets a new timeout and feeds the watchdog.
This is a new, preliminary method which is not yet documented.
reset_cause = machine.reset_cause()
Values returned:
1 Power On reset
3 Watchdog reset
5 Software reset: state after calling machine.reset()
More elaborate API functions are supported by the MCU, like an interrupt
called a certain time after feeding. But for port cosistency that is not
implemented.
This commit implements 10/100 Mbit Ethernet support in the mimxrt port.
The following boards are configured without ETH network:
- MIMXRT1010_EVK
- Teensy 4.0
The following boards are configured with ETH network:
- MIMXRT1020_EVK
- MIMXRT1050_EVK
- MIMXRT1060_EVK
- MIMXRT1064_EVK
- Teensy 4.1
Ethernet support tested with TEENSY 4.1, MIMRTX1020_EVK and MIMXRT1050_EVK.
Build tested with Teensy 4.0 and MIMXRT1010_EVK to be still working.
Compiles and builds properly for MIMXRT1060_EVK and MIMXRT1064_EVK, but not
tested lacking suitable boards.
Tested functions are:
- ping works bothway
- simple UDP transfer works bothway
- ntptime works
- the ftp server works
- secure socker works
- telnet and webrepl works
The MAC address is 0x02 plus 5 bytes from the manifacturing info field,
which can be considered as unique per device.
Some boards do not wire the RESET and INT pin of the PHY transceiver. For
operation, these are not required. If they are defined, they will be used.
Adds support for SDRAM via `SEMC` peripheral. SDRAM support can be
enabled in the mpconfigboard.mk file by setting `MICROPY_HW_SDRAM_AVAIL`
to `1` and poviding the size of the RAM via `MICROPY_HW_FLASH_SIZE`.
When SDRAM support is enabled the whole SDRAM is currently used used
for MicroPython heap.
Signed-off-by: Philipp Ebensberger
This commit enables some significant optimisations for esp32:
- move the VM to iRAM
- move hot parts of the runtime to iRAM (map lookup, load global/name,
mp_obj_get_type)
- enable MICROPY_OPT_LOAD_ATTR_FAST_PATH
- enable MICROPY_OPT_MAP_LOOKUP_CACHE
- disable assertions
- change from -Os to -O2 for compilation
It's hard to measure performance on esp32 due to external flash and
hardware caching. But this set of changes improves performance compared to
master by (on a TinyPICO with the GENERIC build, using IDF 4.2.2, running
at 160MHz):
diff of scores (higher is better)
N=100 M=100 esp32-master -> esp32-perf diff diff% (error%)
bm_chaos.py 71.28 -> 268.08 : +196.80 = +276.094% (+/-0.04%)
bm_fannkuch.py 44.10 -> 69.31 : +25.21 = +57.166% (+/-0.01%)
bm_fft.py 1385.27 -> 2538.23 : +1152.96 = +83.230% (+/-0.01%)
bm_float.py 1060.94 -> 3900.62 : +2839.68 = +267.657% (+/-0.03%)
bm_hexiom.py 10.90 -> 32.79 : +21.89 = +200.826% (+/-0.02%)
bm_nqueens.py 1000.83 -> 2372.87 : +1372.04 = +137.090% (+/-0.01%)
bm_pidigits.py 288.13 -> 664.40 : +376.27 = +130.590% (+/-0.46%)
misc_aes.py 102.45 -> 345.69 : +243.24 = +237.423% (+/-0.01%)
misc_mandel.py 1016.58 -> 2121.92 : +1105.34 = +108.731% (+/-0.01%)
misc_pystone.py 632.91 -> 1801.87 : +1168.96 = +184.696% (+/-0.08%)
misc_raytrace.py 76.66 -> 281.78 : +205.12 = +267.571% (+/-0.05%)
viper_call0.py 210.63 -> 273.17 : +62.54 = +29.692% (+/-0.01%)
viper_call1a.py 208.45 -> 269.51 : +61.06 = +29.292% (+/-0.00%)
viper_call1b.py 185.44 -> 228.25 : +42.81 = +23.086% (+/-0.01%)
viper_call1c.py 185.86 -> 228.90 : +43.04 = +23.157% (+/-0.01%)
viper_call2a.py 207.10 -> 267.25 : +60.15 = +29.044% (+/-0.00%)
viper_call2b.py 173.76 -> 209.42 : +35.66 = +20.523% (+/-0.00%)
Five tests have more than 3x speed up (200%+).
The performance of the tests bm_fft, bm_pidigits and misc_aes now scale
with CPU frequency (eg changing frequency to 240MHz boosts the performance
of these by 50%), which means they are no longer influenced by timing of
external flash access. (The viper_call* tests did previously scale with
CPU frequency, and they still do.)
Turning off assertions reduces code size by about 80k, and going from -Os
to -O2 costs about 100k, so the net change in code size (for the GENERIC
board) is about +20k.
If a board wants to enable assertions, or use -Os instead of -O2, that's
still possible by overriding the sdkconfig parameters.
Signed-off-by: Damien George <damien@micropython.org>
Ensures consistent behaviour and resolves the D-Cache bug (the "exhaustive"
argument being lost due to cache being turned off) when O0 is used.
The changes in this commit are:
- Change -O0 to -Os because "gcc is considered broken at -O0" according to
https://github.com/ARM-software/CMSIS_5/issues/620#issuecomment-550235656
- Use volatile for mem_base so the compiler doesn't optimise away reads or
writes to the SDRAM, which is being tested.
- Use DSB to prevent any other compiler optimisations that would change the
testing logic.
- Use alternating pattern/antipattern in exhaustive test to catch more
hardware/configuration errors.
Implementation adapted by @andrewleech, taken directly from investigation
by @iabdalkader and @dpgeorge.
See #7841 and #7869 for further discussion.
To match network_lan.c and network_ppp.c, and make it clear what code is
specifically for WLAN support.
Also provide a configuration option MICROPY_PY_NETWORK_WLAN which can be
used to fully disable network.WLAN (it's enabled by default).
Signed-off-by: Damien George <damien@micropython.org>
To do this the board must define MICROPY_BOARD_STARTUP, set
MICROPY_SOURCE_BOARD then define the new start-up code.
For example, in mpconfigboard.h:
#define MICROPY_BOARD_STARTUP board_startup
void board_startup(void);
in mpconfigboard.cmake:
set(MICROPY_SOURCE_BOARD
${MICROPY_BOARD_DIR}/board.c
)
and in a new board.c file in the board directory:
#include "py/mpconfig.h"
void board_startup(void) {
boardctrl_startup();
// extra custom startup
}
This follows stm32's boardctrl facilities.
Signed-off-by: Damien George <damien@micropython.org>
Because vPortCleanUpTCB is called by the FreeRTOS idle task, and it checks
thread, but didn't check the thread_mutex.
And if thread is not NULL, but thread_mutex not ready then it will crash
with an error when calling mp_thread_mutex_lock(&thread_mutex, 1).
As suggested by @dpgeorge, move the thread = &thread_entry0 line to the end
of mp_thread_init().
Signed-off-by: leo chung <gewalalb@gmail.com>
This callback allows detecting if there is a USB host connected to the CDC
or not, in which case the stdout_tx should skip CDC TX writing and
flushing or the system will block.
Fixes issue #7820.
This commit allows using all the available PWM timers (up to 8) and
channels (up to 16), without affecting the PWM API.
If a new frequency is set, first it checks if another timer is using the
same frequency. If yes, then it uses this timer, otherwise, it creates a
new one. If all timers are used, the user should set an already used
frequency, or de-init a channel.
This work is based on #6276 and #3608.
The H743 has equal sized pages of 128k, which means the filesystem doesn't
need to be near the beginning. This commit moves the filesystem to the
very end of flash, and extends it to 512k (4 pages).
Signed-off-by: Damien George <damien@micropython.org>
This change adds the OLIMEX H407 support to the STM32 port. The H407
(https://www.olimex.com/Products/ARM/ST/STM32-H407/) is simliar to the
already existing E407
(https://www.olimex.com/Products/ARM/ST/STM32-E407) but does not support
Ethernet and has a full-size USB-A port instead of a Mini-USB socket.
Both boards use the STM32F407ZGT6 CPU.
This port is basically a copy of the E407 but with changed pinmux:
* Removed Ethernet pin definition
* Removed UART1 (pins are used for other functions)
* Removed UART3 flow control pins (pins are used for other functions)
* Removed SD-Card detect pin (since it is not connected on the H407)
A REPL on UART3 is connected to the U3BOOT-header, a 3-pin header with RX,
TX and GND that is intended for the serial terminal.
Tested:
* Micro-SD Card is detected when inserted on RESET
* REPL on UART3 works
* Serial port on the mini USB socket
Signed-off-by: Chris Fiege <cfi@pengutronix.de>
This commit removes all parts of code associated with the existing
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the
-mcache-lookup-bc option to mpy-cross.
This feature originally provided a significant performance boost for Unix,
but wasn't able to be enabled for MCU targets (due to frozen bytecode), and
added significant extra complexity to generating and distributing .mpy
files.
The equivalent performance gain is now provided by the combination of
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has
been enabled on the unix port in the previous commit).
It's hard to provide precise performance numbers, but tests have been run
on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V,
xtensa) and they all generally agree on the qualitative improvements seen
by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and
MICROPY_OPT_MAP_LOOKUP_CACHE.
For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the
change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined
with MAP_LOOKUP_CACHE is:
diff of scores (higher is better)
N=2000 M=2000 bccache -> attrmapcache diff diff% (error%)
bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%)
bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%)
bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%)
bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%)
bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%)
bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%)
bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%)
misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%)
misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%)
misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%)
misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%)
This shows that the new optimisations are at least as good as the existing
inline-bytecode-caching, and are sometimes much better (because the new
ones apply caching to a wider variety of map lookups).
The new optimisations can also benefit code generated by the native
emitter, because they apply to the runtime rather than the generated code.
The improvement for the native emitter when LOAD_ATTR_FAST_PATH and
MAP_LOOKUP_CACHE are enabled is (same Linux environment as above):
diff of scores (higher is better)
N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%)
bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%)
bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%)
bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%)
bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%)
bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%)
bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%)
bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%)
misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%)
misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%)
misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%)
misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%)
In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options:
- are simpler;
- take less code size;
- are faster (generally);
- work with code generated by the native emitter;
- can be used on embedded targets with a small and constant RAM overhead;
- allow the same .mpy bytecode to run on all targets.
See #7680 for further discussion. And see also #7653 for a discussion
about simplifying mpy-cross options.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Update minimal port to use the new "minimal" rom level config (this is a
no-op change, the binary is the same size and contains the exact same
symbols).
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Any external user of DMA (eg a board with a custom DMA driver) must call
dma_external_acquire() for their DMA controller/stream to ensure that the
DMA clock is not automatically turned off while it's still being used
externally.
Signed-off-by: Damien George <damien@micropython.org>