c932fb71cc90461b88ecdffe47c071d001d78fb4 |
|
27-Jan-2016 |
Shawn Landden <shawn@churchofgit.com> |
utf8.[ch] et al: use char32_t and char16_t instead of int, int32_t, int16_t
rework C11 utf8.[ch] to use char32_t instead of uint32_t when referring
to unicode chars, to make things more expressive.
[
@zonque:
* rebased to current master
* use AC_CHECK_DECLS to detect availability of char{16,32}_t
* make utf8_encoded_to_unichar() return int
] |
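To illustrate why char32_t reads more clearly than a bare integer type for code points, here is a minimal sketch of a UTF-8 encoder. The helper name sketch_utf8_encode() is hypothetical and is not the function from utf8.[ch]; it only assumes a C11 toolchain that ships <uchar.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper (not the systemd implementation): encode one Unicode
 * code point as UTF-8. Returns the number of bytes written (1..4), or 0 if
 * the value is a surrogate or lies beyond U+10FFFF. */
static size_t sketch_utf8_encode(char out[4], char32_t c) {
    if (c < 0x80) {
        out[0] = (char) c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (char) (0xC0 | (c >> 6));
        out[1] = (char) (0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        if (c >= 0xD800 && c <= 0xDFFF)
            return 0; /* surrogates are not valid scalar values */
        out[0] = (char) (0xE0 | (c >> 12));
        out[1] = (char) (0x80 | ((c >> 6) & 0x3F));
        out[2] = (char) (0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF) {
        out[0] = (char) (0xF0 | (c >> 18));
        out[1] = (char) (0x80 | ((c >> 12) & 0x3F));
        out[2] = (char) (0x80 | ((c >> 6) & 0x3F));
        out[3] = (char) (0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}
```

The signature alone tells the reader that the second argument is a code point, which is the expressiveness gain the commit message refers to.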
3565e09594a9cd2786b5682ad13812491e6781c1 |
|
18-Jan-2016 |
Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> |
basic/escape: merge utf8 and non-utf8 paths in cunescape_one
Not every byte sequence is valid utf8. We allow escaping of non-utf8
sequences in strings by using octal and hexadecimal escape sequences
(\123 and \xAB) for bytes at or above 128. Users of cunescape_one
could infer whether such use occurred when they received an answer
between 128 and 256 in *ret (a non-ascii one byte character). But this
is subtle and misleading: the comments were wrong, because ascii is a
subset of unicode, so c != 0 did not mean non-unicode, but rather
ascii-subset-of-unicode-or-raw-byte. This was all rather confusing, so
make the "single byte" condition explicit.
I'm not convinced that allowing non-utf8 sequences to be produced is
useful in all cases where we allow it (e.g. in config files), but that
behaviour is unchanged, just made more explicit.
This also fixes an (invalid) gcc warning about an uninitialized variable
(*ret_unicode) in callers of cunescape_one. |
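A minimal sketch of the "explicit single byte" idea, assuming a simplified parser that handles only hexadecimal, three-digit octal and \n escapes. The name sketch_unescape_one() and its exact semantics are hypothetical; this is not the real cunescape_one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int sketch_hexval(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical, simplified unescaper. p points just past the backslash.
 * On success returns the number of characters consumed, stores the value in
 * *ret and tells the caller through *eight_bit whether the value is a raw
 * byte (true) or a character (false). Returns -1 on a malformed escape. */
static int sketch_unescape_one(const char *p, size_t length,
                               uint32_t *ret, bool *eight_bit) {
    if (length >= 3 && p[0] == 'x') {
        int a = sketch_hexval(p[1]), b = sketch_hexval(p[2]);
        if (a < 0 || b < 0)
            return -1;
        *ret = (uint32_t) ((a << 4) | b);
        *eight_bit = true;
        return 3;
    }
    if (length >= 3 && p[0] >= '0' && p[0] <= '7') {
        uint32_t v = 0;
        for (size_t i = 0; i < 3; i++) {
            if (p[i] < '0' || p[i] > '7')
                return -1;
            v = (v << 3) | (uint32_t) (p[i] - '0');
        }
        if (v > 255)
            return -1; /* does not fit in a single byte */
        *ret = v;
        *eight_bit = true;
        return 3;
    }
    if (length >= 1 && p[0] == 'n') {
        *ret = '\n';
        *eight_bit = false;
        return 1;
    }
    return -1;
}
```

Because both output parameters are written on every successful path, callers no longer have to guess from the numeric range of *ret whether a raw byte was produced, and the compiler has no reason to suspect an uninitialized read.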
c89f52ac6938374972253d8752ed65f3af0b3ef4 |
|
11-Nov-2015 |
Lennart Poettering <lennart@poettering.net> |
core: fix dependency parsing
3d793d29059a7ddf5282efa6b32b953c183d7a4d broke parsing of unit file
names that include backslashes, as extract_first_word() strips those.
Fix this, by introducing a new EXTRACT_RETAIN_ESCAPE flag which disables
looking at any escape sequences, thus being compatible with the classic
FOREACH_WORD() behaviour. |
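The effect of such a flag can be sketched with a toy word splitter. Everything here is hypothetical (the function sketch_extract_word() and the flag SKETCH_RETAIN_ESCAPE are illustrative stand-ins, not the extract_first_word() API): without the flag a backslash is consumed as an escape character, with it the backslash is copied through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag, modelled loosely on the one named in the commit. */
enum { SKETCH_RETAIN_ESCAPE = 1 << 0 };

/* Copies the first whitespace-separated word from *p into out (capacity n)
 * and advances *p past it. Returns 1 if a word was found, 0 at end of
 * input, -1 if out is too small. */
static int sketch_extract_word(const char **p, char *out, size_t n,
                               unsigned flags) {
    const char *s = *p;
    size_t k = 0;

    while (*s == ' ' || *s == '\t')
        s++; /* skip leading separators */
    if (*s == '\0') {
        *p = s;
        return 0;
    }

    for (; *s != '\0' && *s != ' ' && *s != '\t'; s++) {
        char c = *s;
        if (c == '\\' && !(flags & SKETCH_RETAIN_ESCAPE) && s[1] != '\0')
            c = *++s; /* drop the backslash, keep the next char literally */
        if (k + 1 >= n)
            return -1;
        out[k++] = c;
    }
    out[k] = '\0';
    *p = s;
    return 1;
}
```

With the input `foo\x2dbar.service`, the default mode yields `foox2dbar.service`, which is the kind of mangled unit name the commit describes; with the flag set the name survives intact.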
0247447e96f1385cf0c48e3e6b696214fbe36802 |
|
06-Nov-2015 |
Filipe Brandenburger <filbranden@google.com> |
extract-word: Skip coalesced separators in place
Just skip them in place, instead of setting separator=true. We only do
that in a single place (while finding a separator outside of quote or
backslash states) so we don't really need a separate state for it.
Tested that no regressions were introduced in test-extract-word. Ran a
full `make check` and also installed the binaries on a test system and
did not see any issues related to parsing unit files or starting units
after a reboot. |
27fc921b658adc5baa988c4c213888b016a60b18 |
|
06-Nov-2015 |
Filipe Brandenburger <filbranden@google.com> |
extract-word: Do not re-evaluate the state on each parsed character
Use inner loops to keep processing the same state, except when there is
a state change, then break back to the outer loop so that the correct
branch can be selected again.
Tested that no regressions were introduced in test-extract-word. |
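The loop structure described above can be sketched as follows, assuming a much smaller parser with only two states (plain value and double-quoted). The names are hypothetical and this is not the extract-word code itself; it only shows the pattern of an outer loop that selects the state and inner loops that stay in it until a transition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_state { SKETCH_VALUE, SKETCH_DQUOTE };

/* Hypothetical parser: copies one word from s into out (capacity n),
 * removing double quotes. Returns the number of input characters consumed,
 * or -1 on an unbalanced quote or if out is too small. */
static int sketch_parse_word(const char *s, char *out, size_t n) {
    enum sketch_state state = SKETCH_VALUE;
    size_t i = 0, k = 0;

    if (n == 0)
        return -1;

    for (;;) { /* outer loop: runs once per state change */
        if (state == SKETCH_VALUE) {
            for (;; i++) { /* inner loop: stay in VALUE */
                char c = s[i];
                if (c == '\0' || c == ' ')
                    goto finish;
                if (c == '"') {
                    state = SKETCH_DQUOTE;
                    i++;
                    break; /* back to the outer loop */
                }
                if (k + 1 >= n)
                    return -1;
                out[k++] = c;
            }
        } else {
            for (;; i++) { /* inner loop: stay in DQUOTE */
                char c = s[i];
                if (c == '\0')
                    return -1; /* unbalanced quote */
                if (c == '"') {
                    state = SKETCH_VALUE;
                    i++;
                    break;
                }
                if (k + 1 >= n)
                    return -1;
                out[k++] = c;
            }
        }
    }

finish:
    out[k] = '\0';
    return (int) i;
}
```

The state is tested once per transition rather than once per character, which is the point of the restructuring.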
b85e1c2534ca3b396c2aaa7de384995b42d12e1b |
|
06-Nov-2015 |
Filipe Brandenburger <filbranden@google.com> |
extract-word: move start block outside the for loop
This block runs once before all the other handling, so move it outside
the main loop and put it in its own loop until it's finished doing its
job.
Tested by confirming `make check` (and particularly test-extract-word)
still passes and by booting a system with binaries including this
commit. |
dea7b6b043f0cd9e34ee719b9b612c3a4776387e |
|
24-Oct-2015 |
Lennart Poettering <lennart@poettering.net> |
util-lib: rework extract_first_word_and_warn() a bit
- Really warn in all error cases, not just some. We need to make sure
that all errors are logged to not confuse the user.
- Explicitly check for EINVAL error code before claiming anything about
invalid escapes, could be ENOMEM after all. |
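The second point can be sketched as a small helper that only mentions escapes when the error really is EINVAL. The function name and the message wording are hypothetical; the real extract_first_word_and_warn() logs through systemd's own logging macros.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn a negative errno-style return value into a
 * warning message. Only -EINVAL is attributed to bad escapes; any other
 * error (for example -ENOMEM) gets a generic message. */
static void sketch_format_extract_error(int r, char *buf, size_t n) {
    if (r == -EINVAL)
        snprintf(buf, n, "Invalid escape sequence in input");
    else
        snprintf(buf, n, "Failed to extract word: %s", strerror(-r));
}
```

A caller would invoke this for every negative return value, so that no error path stays silent.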