|
Port optimized memcpy/memmove from the kernel.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
|
|
After exploring different combinations of prefetch distance and degree
in this update of the memcpy function, a new loop has been added
for moving many cache lines with an aggressive prefetching scheme.
Prefetching has been removed when moving only a few cache-line-aligned blocks.
As a result, this memcpy keeps the performance we already had for small
sizes and gives better numbers for big copies.
On the SH4-300 CPU series, benchmarks show a gain of ~20% for sizes
from 4KiB to 256KiB.
On the SH4-200, there is a gain of ~40% for sizes bigger than
32KiB.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
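
The idea described in this message, a dedicated loop for large copies that prefetches ahead while small copies skip prefetching, can be sketched in portable C. This is only an illustration and not the SH4 assembly from the patch: `sketch_memcpy`, the 32-byte line size, the prefetch distance, and the size threshold are all assumptions, and `__builtin_prefetch` is a GCC/Clang builtin.

```c
#include <stddef.h>
#include <string.h>

/* All three values are assumptions chosen for illustration only. */
#define LINE_SIZE      32u  /* assumed cache-line size in bytes */
#define PREFETCH_LINES 4u   /* hypothetical prefetch distance, in lines */
#define BIG_COPY_LINES 64u  /* hypothetical threshold for the prefetching loop */

void *sketch_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t lines = n / LINE_SIZE;

    if (lines >= BIG_COPY_LINES) {
        /* Large copy: move one line per iteration, prefetching a few lines ahead. */
        for (size_t i = 0; i < lines; i++) {
            /* Only prefetch addresses that are still inside the source buffer. */
            if (lines - i > PREFETCH_LINES)
                __builtin_prefetch(s + PREFETCH_LINES * LINE_SIZE, 0, 0);
            memcpy(d, s, LINE_SIZE); /* stands in for an unrolled line move */
            d += LINE_SIZE;
            s += LINE_SIZE;
        }
        n -= lines * LINE_SIZE;
    }

    /* Small copies and the tail of large ones: plain byte loop, no prefetch. */
    while (n--)
        *d++ = *s++;
    return dest;
}
```

Which distance and threshold pay off depends on the CPU, which is why the message reports separate figures for the SH4-300 and SH4-200.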
|
|
With this patch the movca.l instruction is used within memset.
The current memset implementation uses only the FPU, and it gives
a real gain for all sizes.
With the movca.l instruction added, the numbers are always better than the generic code.
There is a big gain for sizes greater than 64 KiB, but the numbers are worse for 4-32KiB
sizes compared with the implementation without movca.l.
                  Memory Bandwidth (Mbytes)
---------------------------------------------------------
Size            Generic    SH4        SH4
                           (FPU)      (FPU+movca.l)
---------------------------------------------------------
512 bytes       1143       1998       1596
1 KiB           1273       2567       1915
2 KiB           1350       2993       2128
4-32KiB         1391       3262       2252
64KiB-16MiB      170        186       *830*
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
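
To read the bandwidth figures in this message as relative gains, a small helper is enough. The sketch below is illustrative only; `gain_percent` is a hypothetical name and the inputs are simply the numbers quoted in the table.

```c
/* Relative change of `value` against `baseline`, in percent. */
double gain_percent(double baseline, double value)
{
    return (value - baseline) / baseline * 100.0;
}
```

For the 64KiB-16MiB row, `gain_percent(186, 830)` comes out at roughly +346% for FPU+movca.l over the FPU-only version, while for the 4-32KiB row `gain_percent(3262, 2252)` is roughly -31%, which matches the trade-off described in the message.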
|
|
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
|
|
This patch disables SH-4 optimizations that rely on the FPU when
building for variants that don't have an FPU, such as SH-4AL.
Signed-off-by: Andrew Stubbs <ams@codesourcery.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
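
A guard of this kind can be sketched with the preprocessor. The example assumes the compiler predefines `__SH_FPU_ANY__` on SH targets that have an FPU (as GCC is understood to do); `SKETCH_USE_FPU_PATH` and `sketch_uses_fpu_path` are hypothetical names, not identifiers from the patch.

```c
/* Assumption: __SH_FPU_ANY__ is predefined only when the target has an FPU. */
#if defined(__SH_FPU_ANY__)
# define SKETCH_USE_FPU_PATH 1   /* FPU-based fast path may be compiled in */
#else
# define SKETCH_USE_FPU_PATH 0   /* no FPU (e.g. SH-4AL): integer-only code */
#endif

/* Reports which path this translation unit was built with. */
int sketch_uses_fpu_path(void)
{
    return SKETCH_USE_FPU_PATH;
}
```

On any target without that macro, including non-SH hosts, the function returns 0 and only the integer path would be built.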
|
|
Result was:
strverscmp.o:
000000ec T __GI_strverscmp
i.e. no plain "strverscmp"!
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
|
|
Conflicts:
Makefile.in
extra/Configs/Config.in
libc/sysdeps/linux/common/bits/kernel-features.h
libc/sysdeps/linux/common/poll.c
libc/sysdeps/linux/common/sysdep.h
libc/sysdeps/linux/sh/sysdep.h
Signed-off-by: Austin Foxley <austinf@cetoncorp.com>
|
|
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
|
|
Use the ENTRY macro now available through the sysdep.h header
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
|
Conflicts:
libc/signal/sigpause.c
libc/string/x86_64/memset.S
Signed-off-by: Austin Foxley <austinf@cetoncorp.com>
|
|
also enable __chk_fail and only try to call it when SSP is on
Signed-off-by: Austin Foxley <austinf@cetoncorp.com>
|
|
Based on Peter Mazinger's comments on a recent commit, I decided
to get rid of all occurrences of PIC, changing them to __PIC__
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
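
The practical difference is that `__PIC__` is predefined by GCC when compiling with -fpic or -fPIC, whereas a plain `PIC` macro exists only if the build system passes it explicitly. A minimal sketch of the check, with `SKETCH_IS_PIC` and `sketch_built_as_pic` as hypothetical names:

```c
/* __PIC__ is set by the compiler itself for position-independent builds. */
#ifdef __PIC__
# define SKETCH_IS_PIC 1
#else
# define SKETCH_IS_PIC 0
#endif

/* Returns 1 when this file was compiled as position-independent code. */
int sketch_built_as_pic(void)
{
    return SKETCH_IS_PIC;
}
```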
|
|
Based on Peter Mazinger's comments on a recent commit, I decided
to get rid of all occurrences of PIC, changing them to __PIC__
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
|
This patch fixes the big-endian code and adds a new optimization
for little-endian mode only.
This optimization is based on prefetching and 64-bit data transfers via the FPU.
Tests show that:
--------------------------------------------------
Memory bandwidth       |          Gain
                       | sh4-300    | sh4-200
--------------------------------------------------
512 bytes to 16KiB     | ~20%       | ~25%
from 32KiB to 16MiB    | ~190%      | ~5%
--------------------------------------------------
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
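
The 64-bit transfer idea can be illustrated in plain C. The patch moves data through FPU registers in assembly; in this sketch a `uint64_t` stands in for that register, and `sketch_copy64` is a hypothetical name. Prefetching is left out here to keep the example focused on the transfer width.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

void *sketch_copy64(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Main loop: one 64-bit load and one 64-bit store per iteration. */
    while (n >= sizeof(uint64_t)) {
        uint64_t chunk;
        memcpy(&chunk, s, sizeof chunk);  /* alignment-safe 64-bit load */
        memcpy(d, &chunk, sizeof chunk);  /* alignment-safe 64-bit store */
        d += sizeof chunk;
        s += sizeof chunk;
        n -= sizeof chunk;
    }

    /* Remaining 0-7 bytes are copied one at a time. */
    while (n--)
        *d++ = *s++;
    return dest;
}
```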
|
|
This optimization is based on prefetching and 64-bit data transfers via the FPU
(little-endian mode only).
Tests show that:
--------------------------------------------------
Memory bandwidth       |          Gain
                       | sh4-300    | sh4-200
--------------------------------------------------
512 bytes to 16KiB     | ~20%       | ~25%
from 32KiB to 16MiB    | ~190%      | ~5%
--------------------------------------------------
Signed-off-by: Austin Foxley <austinf@cetoncorp.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
|
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
|
|
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
|
|
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
|
|
sed -i -e '/Experimentally off - /d' $(grep -rl "Experimentally off - " *)
sed -i -e '/^\/\*[[:space:]]*libc_hidden_proto(/d' $(grep -rl "libc_hidden_proto" *)
should be a nop
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
|
|
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
|
|
Handle O=
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
|
|
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
|
|
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
|
|
Signed-off-by: Hideo Saito <saito@densan.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
See Linux Kernel commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e08b954c9a140f2062649faec72514eb505f18c3
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
|
The comments on register usage in ARM memcpy had dest and src the
wrong way round; this patch (originally from Mark Shinwell) corrects
this and adds a note on the return value.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
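
As a reminder of the contract those register comments describe, memcpy takes the destination first and the source second, and returns the destination pointer. A small C check, with `sketch_memcpy_returns_dest` as a hypothetical helper name:

```c
#include <string.h>

/* Returns 1 if memcpy copied the data and handed back the dest pointer. */
int sketch_memcpy_returns_dest(void)
{
    char src[] = "uClibc";
    char dst[sizeof src];

    void *ret = memcpy(dst, src, sizeof src);  /* argument order: dest, src, n */
    return ret == (void *)dst && strcmp(dst, "uClibc") == 0;
}
```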
|
|
When an IT block was changed from having two instructions to having
one, the IT instruction at the start of the block was not updated,
causing memcpy to fail to assemble for Thumb-2; this patch makes the
obvious fix.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
|
|
is always equivalent to __UCLIBC_CURLOCALE->x.
remove typedef __uclibc_locale_t; it is used only in a few places,
and it is less confusing to use struct __uclibc_locale_struct
everywhere.
xlocale.h: hide __global_locale back under _LIBC,
bug 53 is wrong in claiming it should be exported.
Also hide under _LIBC:
extern __locale_t __curlocale_var;
extern __locale_t __curlocale(void);
extern __locale_t __curlocale_set(__locale_t newloc);
# define __UCLIBC_CURLOCALE
# define __XL_NPP(N)
# define __LOCALE_PARAM
# define __LOCALE_ARG
# define __LOCALE_PTR
|
|
- SUSv4_LEGACY part #1 (non-networking)
|
|
libc/string/i386/strlen.c: small optimization, same code size)
    text    data     bss     dec     hex filename
- 240449    1759   11960  254168   3e0d8 lib/libuClibc-0.9.30-svn.so
+ 240339    1759   11960  254058   3e06a lib/libuClibc-0.9.30-svn.so
|
|
string/i386/strchrnul.c: new function, adapted from strchr.c
    text    data     bss     dec     hex filename
- 240604    1759   11960  254323   3e173 lib/libuClibc-0.9.30-svn.so
+ 240449    1759   11960  254168   3e0d8 lib/libuClibc-0.9.30-svn.so
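
For readers unfamiliar with it, strchrnul is a GNU extension that behaves like strchr except that it returns a pointer to the terminating NUL, rather than NULL, when the character is not found. A portable C sketch of those semantics (not the i386 assembly from this commit; `sketch_strchrnul` is a hypothetical name):

```c
/* Like strchr, but returns a pointer to the terminating NUL on a miss. */
char *sketch_strchrnul(const char *s, int c)
{
    while (*s != '\0' && *s != (char)c)
        s++;
    return (char *)s;
}
```

Because the result is never NULL, callers can dereference it or compute a length from it without an extra check.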
|
|
string/i386/*: formatting and comment tidying
|
|
added check for src == dest. run tested.
    text    data     bss     dec     hex filename
-     39       0       0      39      27 libc/string/i386/memmove.os
+     37       0       0      37      25 libc/string/i386/memmove.os
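
The src == dest check can be shown in a portable C sketch of memmove. This is an illustration under stated assumptions, not the i386 assembly from the commit; `sketch_memmove` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

void *sketch_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Nothing to do when both pointers are equal or the length is zero. */
    if (d == s || n == 0)
        return dest;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination is below the source: copy forwards. */
        while (n--)
            *d++ = *s++;
    } else {
        /* Destination is above the source: copy backwards to handle overlap. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dest;
}
```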
|
|
memchr: add small embedded test
strnlen: make small embedded test easier to use
strncmp: reformat assembly to make it readable, no code changes
(verified with objdump)
    text    data     bss     dec     hex filename
-     46       0       0      46      2e libc/string/i386/strncat.os
+     39       0       0      39      27 libc/string/i386/strncat.os
|
|
    text    data     bss     dec     hex filename
-     25       0       0      25      19 libc/string/i386/strnlen.os
+     24       0       0      24      18 libc/string/i386/strnlen.os
|
|
implement inline versions of some of them.
Enable only those which result in roughly the same
code size as using out-of-line versions.
None of this affects users; installed headers won't have
any trace of it.
|
|
strrchr: smaller i386 version
    text    data     bss     dec     hex filename
-     33       0       0      33      21 libc/string/i386/memchr.o
+     28       0       0      28      1c libc/string/i386/memchr.o
-     31       0       0      31      1f libc/string/i386/strrchr.o
+     26       0       0      26      1a libc/string/i386/strrchr.o
|
|
"Bounds Checking Projects... This project has been abandoned"
for four years at least.
|
|
for ancient compilers. None of the other string/*.c files have them.
|
|
it has been dead (not supported by gcc) for years.
(more of it remains in multiple copies of sigaction.c)
|
|
    text    data     bss     dec     hex filename
-     39       0       0      39      27 libc/string/i386/memcpy.os
+     35       0       0      35      23 libc/string/i386/memcpy.os
|
|
TARGET_SUBARCH implementation too.
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
|
Signed-off-by: Filippo Arcidiacono <filippo.arcidiacono@st.com>
|
|
Appears to build fine (several .configs tried)
|
|
|