Commit Graph

60 Commits

Author SHA1 Message Date
Andrew Leech f7f56d4285 py/objstr: Consolidate methods for str/bytes/bytearray/array.
This commit adds the bytes methods to bytearray, matching CPython.  The
existing implementations of these methods for str/bytes are reused for
bytearray with minor updates to match CPython return types.

For details on the CPython behaviour see
https://docs.python.org/3/library/stdtypes.html#bytes-and-bytearray-operations

The work to merge locals tables for str/bytes/bytearray/array was done by
@jimmo.  Because of this merging of locals the change in code size for this
commit is mostly negative:

       bare-arm:    +0 +0.000%
    minimal x86:   +29 +0.018%
       unix x64:  -792 -0.128% standard[incl -448(data)]
    unix nanbox:  -436 -0.078% nanbox[incl -448(data)]
          stm32:   -40 -0.010% PYBV10
         cc3200:   -32 -0.017%
        esp8266:   -28 -0.004% GENERIC
          esp32:   -72 -0.005% GENERIC[incl -200(data)]
         mimxrt:   -40 -0.011% TEENSY40
     renesas-ra:   -40 -0.006% RA6M2_EK
            nrf:   -16 -0.009% pca10040
            rp2:   -64 -0.013% PICO
           samd:  +148 +0.105% ADAFRUIT_ITSYBITSY_M4_EXPRESS
2022-08-11 23:18:02 +10:00
Joris Peeraer 5020b14d54 py/mpprint: Fix length calculation for strings with precision-modifier.
Two issues are tackled:

1. The calculation of the correct length to print is fixed to treat the
   precision as a maximum length instead as the exact length.
   This is done for both qstr (%q) and for regular str (%s).

2. Fix the incorrect use of mp_printf("%.*s") to mp_print_strn().

   Because of the fix of above issue, some testcases that would print
   an embedded null-byte (^@ in test-output) would now fail.
   The bug here is that "%s" was used to print null-bytes. Instead,
   mp_print_strn is used to make sure all bytes are outputted and the
   exact length is respected.

Test-cases are added for both %s and %q with a combination of precision
and padding specifiers.
2020-12-07 23:32:06 +11:00
Jim Mussared def76fe4d9 all: Use MP_ERROR_TEXT for all error messages. 2020-04-05 15:02:06 +10:00
Damien George 69661f3343 all: Reformat C and Python source code with tools/codeformat.py.
This is run with uncrustify 0.70.1, and black 19.10b0.
2020-02-28 10:33:03 +11:00
Damien George ad7213d3c3 py: Add mp_raise_msg_varg helper and use it where appropriate.
This commit adds mp_raise_msg_varg(type, fmt, ...) as a helper for
nlr_raise(mp_obj_new_exception_msg_varg(type, fmt, ...)).  It makes the
C-level API for raising exceptions more consistent, and reduces code size
on most ports:

   bare-arm:   +28 +0.042%
minimal x86:  +100 +0.067%
   unix x64:   -56 -0.011%
unix nanbox:  -300 -0.068%
      stm32:  -204 -0.054% PYBV10
     cc3200:    +0 +0.000%
    esp8266:   -64 -0.010% GENERIC
      esp32:  -104 -0.007% GENERIC
        nrf:  -136 -0.094% pca10040
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
2020-02-13 11:52:40 +11:00
Damien George bfbd94401d py: Make mp_obj_get_type() return a const ptr to mp_obj_type_t.
Most types are in rodata/ROM, and mp_obj_base_t.type is a constant pointer,
so enforce this const-ness throughout the code base.  If a type ever needs
to be modified (eg a user type) then a simple cast can be used.
2020-01-09 11:25:26 +11:00
Nicko van Someren 10709846f3 py/objslice: Inline fetching of slice paramters in str_subscr().
To reduce code size.
2019-12-29 00:06:02 +11:00
Damien George eee1e8841a py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API.
These macros could in principle be (inline) functions so it makes sense to
have them lower case, to match the other C API functions.

The remaining macros that are upper case are:
- MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR
- MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE
- MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE
- MP_OBJ_FUN_MAKE_SIG
- MP_DECLARE_CONST_xxx
- MP_DEFINE_CONST_xxx

These must remain macros because they are used when defining const data (at
least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have
MP_OBJ_SMALL_INT_VALUE also a macro).

For those macros that have been made lower case, compatibility macros are
provided for the old names so that users do not need to change their code
immediately.
2019-02-12 14:54:51 +11:00
Paul Sokolovsky 8fea833e3f py: Update my copyright info on some files.
Based on git history.
2019-02-06 00:19:00 +11:00
Paul Sokolovsky 5a91fce9f8 py/objstr: Make str.count() method configurable.
Configurable via MICROPY_PY_BUILTINS_STR_COUNT.  Default is enabled.
Disabled for bare-arm, minimal, unix-minimal and zephyr ports.  Disabling
it saves 408 bytes on x86.
2018-10-22 22:49:05 +11:00
Damien George 19aee9438a py/unicode: Clean up utf8 funcs and provide non-utf8 inline versions.
This patch provides inline versions of the utf8 helper functions for the
case when unicode is disabled (MICROPY_PY_BUILTINS_STR_UNICODE set to 0).
This saves code size.

The unichar_charlen function is also renamed to utf8_charlen to match the
other utf8 helper functions, and the signature of this function is adjusted
for consistency (const char* -> const byte*, mp_uint_t -> size_t).
2018-02-14 18:19:22 +11:00
Damien George 4601759bf5 py/objstr: Remove "make_qstr_if_not_already" arg from mp_obj_new_str.
This patch simplifies the str creation API to favour the common case of
creating a str object that is not forced to be interned.  To force
interning of a new str the new mp_obj_new_str_via_qstr function is added,
and should only be used if warranted.

Apart from simplifying the mp_obj_new_str function (and making it have the
same signature as mp_obj_new_bytes), this patch also reduces code size by a
bit (-16 bytes for bare-arm and roughly -40 bytes on the bare-metal archs).
2017-11-16 13:17:51 +11:00
Damien George a3dc1b1957 all: Remove inclusion of internal py header files.
Header files that are considered internal to the py core and should not
normally be included directly are:
    py/nlr.h - internal nlr configuration and declarations
    py/bc0.h - contains bytecode macro definitions
    py/runtime0.h - contains basic runtime enums

Instead, the top-level header files to include are one of:
    py/obj.h - includes runtime0.h and defines everything to use the
        mp_obj_t type
    py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
        and defines everything to use the general runtime support functions

Additional, specific headers (eg py/objlist.h) can be included if needed.
2017-10-04 12:37:50 +11:00
Damien George 58321dd985 all: Convert mp_uint_t to mp_unary_op_t/mp_binary_op_t where appropriate
The unary-op/binary-op enums are already defined, and there are no
arithmetic tricks used with these types, so it makes sense to use the
correct enum type for arguments that take these values.  It also reduces
code size quite a bit for nan-boxing builds.
2017-08-29 13:16:30 +10:00
Javier Candeira 35a1fea90b all: Raise exceptions via mp_raise_XXX
- Changed: ValueError, TypeError, NotImplementedError
  - OSError invocations unchanged, because the corresponding utility
    function takes ints, not strings like the long form invocation.
  - OverflowError, IndexError and RuntimeError etc. not changed for now
    until we decide whether to add new utility functions.
2017-08-13 22:52:33 +10:00
Alexander Steffen 55f33240f3 all: Use the name MicroPython consistently in comments
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Damien George 48d867b4a6 all: Make more use of mp_raise_{msg,TypeError,ValueError} helpers. 2017-06-15 11:54:41 +10:00
Damien George c88cfe165b py: Use size_t as len argument and return type of mp_get_index.
These values are used to compute memory addresses and so size_t is the
more appropriate type to use.
2017-03-23 16:17:40 +11:00
Damien George ae8d867586 py: Add iter_buf to getiter type method.
Allows to iterate over the following without allocating on the heap:
- tuple
- list
- string, bytes
- bytearray, array
- dict (not dict.keys, dict.values, dict.items)
- set, frozenset

Allows to call the following without heap memory:
- all, any, min, max, sum

TODO: still need to allocate stack memory in bytecode for iter_buf.
2017-02-16 18:38:06 +11:00
Damien George c0d9500eee py/objstr: Convert mp_uint_t to size_t (and use int) where appropriate. 2017-02-16 16:51:16 +11:00
Paul Sokolovsky 1563388001 py/objstr,objstrunicode: Fix inconistent #if indentation. 2016-08-07 15:24:57 +03:00
Paul Sokolovsky 56eb25f049 py/objstr: Make .partition()/.rpartition() methods configurable.
Default is disabled, enabled for unix port. Saves 600 bytes on x86.
2016-08-07 06:46:55 +03:00
Paul Sokolovsky ed1c194ebf py/objstrunicode: str_index_to_ptr: Implement positive indexing properly.
Order out-of-bounds check, completion check, and increment in the right way.
2016-07-25 19:28:04 +03:00
Paul Sokolovsky 6af90b2972 py/objstrunicode: str_index_to_ptr: Should handle bytes too.
There's single str_index_to_ptr() function, called for both bytes and
unicode objects, so should handle each properly.
2016-07-25 14:45:08 +03:00
Dave Hylands 6a60fb3cf4 py/objstr*: Properly ifdef str.center(). 2016-05-22 01:54:41 +03:00
Paul Sokolovsky 1b5abfcaae py/objstr: Implement str.center().
Disabled by default, enabled in unix port. Need for this method easily
pops up when working with text UI/reporting, and coding workalike
manually again and again counter-productive.
2016-05-22 00:13:44 +03:00
Damien George 8212d97317 py: Use polymorphic iterator type where possible to reduce code size.
Only types whose iterator instances still fit in 4 machine words have
been changed to use the polymorphic iterator.

Reduces Thumb2 arch code size by 264 bytes.
2016-01-03 16:27:55 +00:00
Damien George 999cedb90f py: Wrap all obj-ptr conversions in MP_OBJ_TO_PTR/MP_OBJ_FROM_PTR.
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.

This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
2015-11-29 14:25:35 +00:00
Damien George cbf7674025 py: Add MP_ROM_* macros and mp_rom_* types and use them. 2015-11-29 14:25:04 +00:00
Paul Sokolovsky 1b586f3a73 py: Rename MP_BOOL() to mp_obj_new_bool() for consistency in naming. 2015-10-11 15:18:15 +03:00
Damien George 821b7f22fe py: Use mp_not_implemented consistently for not implemented features. 2015-09-03 23:14:06 +01:00
Damien George 44e7cbf019 py: Clean up declarations of str type/funcs that are also in unicode.
Background: trying to make an amalgamation of all the code gave some
errors with redefined types and inconsistent use of static.
2015-05-17 16:44:24 +01:00
Damien George 7f9d1d6ab9 py: Overhaul and simplify printf/pfenv mechanism.
Previous to this patch the printing mechanism was a bit of a tangled
mess.  This patch attempts to consolidate printing into one interface.

All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf.  All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.

Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined pert port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how string are printed
on the platform.  The former is only used when MICROPY_PY_IO is defined.

With this new scheme printing is generally more efficient (less layers
to go through, less arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...).  Code size is also reduced by around 200 bytes on
Thumb2 archs.
2015-04-16 14:30:16 +00:00
Damien George 0528c5a22a py: In str unicode, str_subscr will never be passed a bytes object. 2015-04-04 19:42:03 +01:00
Paul Sokolovsky ac2f7a7f6a objstr: Add .splitlines() method.
splitlines() occurs ~179 times in CPython3 standard library, so was
deemed worthy to implement. The method has subtle semantic differences
from just .split("\n"). It is also defined as working for any end-of-line
combination, but this is currently not implemented - it works only with
LF line-endings (which should be OK for text strings on any platforms,
but not OK for bytes).
2015-04-04 00:09:48 +03:00
Damien George 2e2e404ff7 py: Allow to compile with extra warnings (sign-compare, unused-param). 2015-03-19 00:25:33 +00:00
Damien George 98e3a64694 py: Remove duplicated mp_obj_str_make_new function from objstrunicode.c. 2015-01-28 14:14:57 +00:00
Paul Sokolovsky 344e15b1ae objstr: Remove code duplication and unbreak Windows build.
There was really weird warning (promoted to error) when building Windows
port. Exact cause is still unknown, but it uncovered another issue:
8-bit and unicode str_make_new implementations should be mutually exclusive,
and not built at the same time. What we had is that bytes_decode() pulled
8-bit str_make_new() even for unicode build.
2015-01-23 02:15:56 +02:00
Paul Sokolovsky 6113eb2f33 objstr*: Use separate names for locals_dict of 8-bit and unicode str's.
To somewhat unbreak -DSTATIC="" compile.
2015-01-23 02:05:58 +02:00
Damien George 0b9ee86133 py: Add mp_obj_new_str_from_vstr, and use it where relevant.
This patch allows to reuse vstr memory when creating str/bytes object.
This improves memory usage.

Also saves code ROM: 128 bytes on stmhal, 92 bytes on bare-arm, and 88
bytes on unix x64.
2015-01-21 23:17:27 +00:00
Damien George ff8dd3f486 py, unix: Allow to compile with -Wunused-parameter.
See issue #699.
2015-01-20 12:47:20 +00:00
Damien George 51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Paul Sokolovsky e62a0fe367 objstr: Allow to convert any buffer proto object to str.
Original motivation is to support converting bytearrays, but easier to just
support buffer protocol at all.
2014-10-31 00:03:53 +02:00
Damien George cde0ca21bf py: Simplify JSON str printing (while still conforming to JSON spec).
The JSON specs are relatively flexible and allow us to use one function
to print strings, be they ascii, bytes or utf-8 encoded.
2014-09-25 17:35:56 +01:00
Damien George 612045f53f py: Add native json printing using existing print framework.
Also add start of ujson module with dumps implemented.  Enabled in unix
and stmhal ports.  Test passes on both.
2014-09-17 22:56:34 +01:00
Damien George 4abff7500f py: Change uint to mp_uint_t in runtime.h, stackctrl.h, binary.h.
Part of code cleanup, working towards resolving issue #50.
2014-08-30 14:59:21 +01:00
Damien George ecc88e949c Change some parts of the core API to use mp_uint_t instead of uint/int.
Addressing issue #50, still some way to go yet.
2014-08-30 00:35:11 +01:00
Damien George bb4c6f35c6 py: Make MP_OBJ_NEW_SMALL_INT cast arg to mp_int_t itself.
Addresses issue #724.
2014-07-31 10:49:14 +01:00
Damien George 40f3c02682 Rename machine_(u)int_t to mp_(u)int_t.
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Paul Sokolovsky 9e215fa4c2 py: Make unichar_charlen() accept/return machine_uint_t. 2014-06-28 23:15:29 +03:00