Commit Graph

27 Commits

Author SHA1 Message Date
Yonatan Goldschmidt df5c3bd976 py/unicode: Add unichar_isalnum(). 2020-01-12 13:03:57 +11:00
Damien George 7c85c7c210 py/unicode: Fix check for valid utf8 being stricter about contn chars. 2018-11-26 16:13:08 +11:00
Damien George 19aee9438a py/unicode: Clean up utf8 funcs and provide non-utf8 inline versions.
This patch provides inline versions of the utf8 helper functions for the
case when unicode is disabled (MICROPY_PY_BUILTINS_STR_UNICODE set to 0).
This saves code size.

The unichar_charlen function is also renamed to utf8_charlen to match the
other utf8 helper functions, and the signature of this function is adjusted
for consistency (const char* -> const byte*, mp_uint_t -> size_t).
2018-02-14 18:19:22 +11:00
tll 68c28174d0 py/objstr: Add check for valid UTF-8 when making a str from bytes.
This patch adds a function utf8_check() to check for a valid UTF-8 encoded
string, and calls it when constructing a str from raw bytes.  The feature
is selectable at compile time via MICROPY_PY_BUILTINS_STR_UNICODE_CHECK and
is enabled if unicode is enabled.  It costs about 110 bytes on Thumb-2, 150
bytes on Xtensa and 170 bytes on x86-64.
2017-09-06 16:43:09 +10:00
Alexander Steffen 55f33240f3 all: Use the name MicroPython consistently in comments
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Damien George afc5063539 py/unicode: Comment-out unused function unichar_isprint. 2016-12-28 17:50:10 +11:00
Alex March 69d9e7d27d py/repl: Check for an identifier char after the keyword.
- As described in the #1850.
- Add cmdline tests.
2016-02-17 08:56:15 +00:00
Dave Hylands afaa66b657 py: Minor improvement to unichar_isxdigit
This drops the size of unicode_isxdigit from 0x1e + 0x02 filler to
0x14 bytes (so net code reduction of 12 bytes) and will make
unicode_is_xdigit perform slightly faster.
2015-05-20 09:31:22 +01:00
Dave Hylands 3ad94d6072 extmod: Add ubinascii.unhexlify
This also pulls out hex_digit from py/lexer.c and makes unichar_hex_digit
2015-05-20 09:29:22 +01:00
Damien George 4dea922610 py: Adjust some spaces in code style/format, purely for consistency. 2015-04-09 15:29:54 +00:00
Damien George 51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George 5318cc028a py: Tidy up a few function declarations. 2014-12-10 22:37:07 +00:00
Damien George 40f3c02682 Rename machine_(u)int_t to mp_(u)int_t.
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Paul Sokolovsky 9e215fa4c2 py: Make unichar_charlen() accept/return machine_uint_t. 2014-06-28 23:15:29 +03:00
Damien George e04a44e2f6 py: Small comments, name changes, use of machine_int_t. 2014-06-28 10:27:23 +01:00
Paul Sokolovsky 1044c3dfe6 unicode: Make get_char()/next_char()/charlen() be 8-bit compatible.
Based on config define.
2014-06-27 00:04:19 +03:00
Paul Sokolovsky 46d31e9ca9 unicode: Add utf8_ptr_to_index().
Useful when we have pointer to char inside string, but need to return char
index. (E.g. str.find()).
2014-06-27 00:04:19 +03:00
Chris Angelico c88987c1af py: Implement basic unicode functions. 2014-06-27 00:04:17 +03:00
Paul Sokolovsky 59c675a64c py: Include mpconfig.h before all other includes.
It defines types used by all other headers.

Fixes #691.
2014-06-21 22:43:22 +03:00
Paul Sokolovsky b0bb458810 unicode: String API is const byte*.
We still have that char vs byte dichotomy, but majority of string operations
now use byte.
2014-06-14 06:22:11 +03:00
Damien George c59af52e84 py: Rename some unichar functions for consistency. 2014-05-11 17:53:11 +01:00
Paul Sokolovsky 6913521911 objstr: Implement .lower() and .upper(). 2014-05-10 19:49:07 +03:00
Damien George 04b9147e15 Add license header to (almost) all files.
Blanket wide to all .c and .h files.  Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.

Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-05-03 23:27:38 +01:00
Damien George 175cecfa87 py: Make form-feed character a space (following C isspace).
Eg, in CPython stdlib, email/header.py has a form-feed character.
2014-04-10 11:39:36 +01:00
Paul Sokolovsky 520e2f58a5 Replace global "static" -> "STATIC", to allow "analysis builds". Part 2. 2014-02-12 18:31:30 +02:00
Paul Sokolovsky 0b7184dcb8 Implement octal and hex escapes in strings. 2014-01-22 22:48:25 +02:00
Damien George 8cc96a35e5 Put unicode functions in unicode.c, and tidy their names. 2013-12-30 18:23:50 +00:00