summaryrefslogtreecommitdiff
path: root/docs/wchar_and_locale.txt
blob: 976c9aaad5ced7ca492b8f19276822ecc6ba9038 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
		User-configurable

UCLIBC_HAS_CTYPE_TABLES
	Make toupper etc work thru translation tables
	and isalhum etc thru lookup tables. Help says:
	"While the non-table versions are often smaller when building
	statically linked apps, they work only in stub locale mode."

	"stub locale mode" is when !UCLIBC_HAS_LOCALE I presume,
	when we are permanently in POSIX/C locale.

UCLIBC_HAS_CTYPE_SIGNED
	Handle sign-extended chars. I.e. if you want
	toupper((char)0xa0) => toupper(0xffffffa0) => still works correctly,
	as if toupper(0xa0) was called.

UCLIBC_HAS_CTYPE_UNSAFE/CHECKED/ENFORCED
	Do not check ctype function argument's range/check it and return
	error/check it and abort(). Help says:
	NOTE: This only affects the 'ctype' _functions_.  It does not affect
	the macro implementations. [so what happens to macros?]
	[examples?]

UCLIBC_HAS_WCHAR
	Wide character support. I assume all those wchar_t types and functions

UCLIBC_HAS_LOCALE/XLOCALE
	Support locale / extended locale

UCLIBC_PREGENERATED_LOCALE_DATA
	Not recommended


		uclibc internal machinery

__LOCALE_C_ONLY
	#defined if !UCLIBC_HAS_LOCALE

__NO_CTYPE
	#defined only by some .c files. Prevents ctype macros to be #defined
	(those w/o underscores. __ctype() macros will still be defined).
	Looks like user is expected to never mess with defining it.

__UCLIBC_DO_XLOCALE
	#defined only by some .c files. Looks like user is expected to never
	mess with defining it.

__XL_NPP(N) - "add _l suffix if locale support is on"
	#defined to N ## _l if __UCLIBC_HAS_XLOCALE__ && __UCLIBC_DO_XLOCALE,
	else #defined to just N.

__CTYPE_HAS_8_BIT_LOCALES
__CTYPE_HAS_UTF_8_LOCALES
	Depends on contents of extra/locale/LOCALES data file. Looks like
	both will be set if UCLIBC_HAS_LOCALE and extra/locale/LOCALES
	is not edited.

__WCHAR_ENABLED
	locale_mmap.h defines it unconditionally, extra/locale/gen_ldc.c
	defines it too with a warning, and _then_ includes locale_mmap.h.
	Makefile seems to prevent the warning in gen_ldc.c:
	ifeq ($(UCLIBC_HAS_WCHAR),y)
	BUILD_CFLAGS-gen_wc8bit += -DDO_WIDE_CHAR=1
	BUILD_CFLAGS-gen_ldc += -D__WCHAR_ENABLED=1
	endif
	A mess. Why they can't just use __UCLIBC_HAS_WCHAR__?

__WCHAR_REPLACEMENT_CHAR
	Never defined (dead code???)



		Actual ctype macros are a bloody mess!

__C_isspace(c), __C_tolower(c) et al
	Defined in bits/uClibc_ctype.h. Non-locale-aware, unsafe
	wrt multiple-evaluation, macros. Return unsigned int.

__isspace(c), __tolower(c) et al
	Defined in bits/uClibc_ctype.h. Non-locale-aware,
	but safe wrt multiple-evaluation, macros. Return int.

__isdigit_char, __isdigit_int
	Visible only to uclibc code. ((unsigned char/int)((c) - '0') <= 9).

_tolower(c), _toupper(c)
	Even more unsafe versions (they just do | 0x20 or ^ 0x20). Sheesh.
	They are mandated by POSIX so we must have them defined,
	but I removed all uses in uclibc code. Half of them were buggy.

isspace(c), tolower(c) et al
	Declared as int isXXXX(int c) in bits/uClibc_ctype.h. Then,
	if not C++ compile, defined as macros to __usXXXX(c)

bits/uClibc_ctype.h is included by ctype.h only if !__UCLIBC_HAS_CTYPE_TABLES__.

Otherwise, ctype.h declares int isXXXX(int c) functions,
then defines macros for isXXXX(c), __isXXX(c), toXXXX(c).
Surprisingly, there are no __toXXXX(c), but if __USE_EXTERN_INLINES,
there are inlines (yes, not macros!) for toXXXX(c) functions,
so those may have both inlines and macros!).
It also defines "unsafe" _toXXXX macros.

All in all, __isXXXX(c) and __toXXXXX(c) seem to be useless,
they are full equivalents to non-underscored versions.
Remove?

Macro-ization of isXXX(c) for __UCLIBC_HAS_XLOCALE__ case is problematic:
it is done by indexing: __UCLIBC_CTYPE_B[c], and in __UCLIBC_HAS_XLOCALE__
case __UCLIBC_CTYPE_B is doing a __ctype_b_loc() call! We do not save
function call! Thus, why not have dedicated optimized functions
for each isXXXX() instead? Looking deeper, __ctype_b_loc() may have
another lever of function calls inside! What a mess...