| author | Salvatore Cro <salvatore.cro@st.com> | 2010-09-09 16:08:54 +0200 | 
|---|---|---|
| committer | Carmelo Amoroso <carmelo.amoroso@st.com> | 2010-09-15 12:42:09 +0200 | 
| commit | 599c74a4d7e9bbe68b946d65aef2725821ea3fe9 (patch) | |
| tree | 1eed7e5868ace26b9a08910fb36deaf95b513f99 /libpthread/linuxthreads.old/sysdeps/sh/tls.h | |
| parent | 4b88e6e858b55def2ef0392278ddf81835f2ac45 (diff) | |
sh: move data without fetching cache block within the memset
With this patch, memset uses the movca.l instruction.
The current memset implementation uses only the FPU, and there is
a real gain for all sizes.
With the movca.l instruction added, the numbers are always better than the generic code.
There is a big gain for sizes greater than 64 KiB, but the numbers are worse for 4-32 KiB
sizes compared with the implementation without movca.l.
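
As a rough illustration of where movca.l fits, the following is a minimal C sketch of a memset that splits the buffer into an unaligned head, a body of whole 32-byte cache lines, and a tail. It is not the code from this patch, which is SH assembly. The names `memset_sketch` and `fill_line` are hypothetical, and portable C cannot emit movca.l, so the comment only marks the place where the assembly version would issue it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 32u /* SH-4 operand cache line size in bytes */

/* Hypothetical helper: fill one whole, line-aligned cache block.
 * In SH-4 assembly the first store to the block could be
 * "movca.l r0,@rn", which allocates the cache block without first
 * fetching its old contents from memory. Here we use plain stores. */
static void fill_line(unsigned char *line, uint32_t pattern)
{
    for (size_t off = 0; off < CACHE_LINE; off += sizeof pattern)
        memcpy(line + off, &pattern, sizeof pattern); /* one 32-bit store */
}

/* Hypothetical sketch of the overall structure; assumption, not the patch. */
void *memset_sketch(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    unsigned char byte = (unsigned char)c;
    uint32_t pattern = (uint32_t)byte * UINT32_C(0x01010101);

    /* Head: byte stores until p reaches a cache-line boundary. */
    while (n > 0 && ((uintptr_t)p % CACHE_LINE) != 0) {
        *p++ = byte;
        n--;
    }

    /* Body: whole cache lines, every byte of which gets overwritten,
     * so reading the old line contents from memory would be wasted work. */
    while (n >= CACHE_LINE) {
        fill_line(p, pattern);
        p += CACHE_LINE;
        n -= CACHE_LINE;
    }

    /* Tail: remaining bytes that do not fill a whole line. */
    while (n > 0) {
        *p++ = byte;
        n--;
    }
    return dst;
}
```

Only the body loop is a candidate for movca.l, because the instruction is safe only when the whole cache block is about to be overwritten.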
                Time Memory Bandwidth (Mbytes)
-------------------------------------------------------
Size            Generic     SH4         SH4
                            (FPU)       (FPU+movca.l)
-------------------------------------------------------
512             1143        1998        1596
1 KiB           1273        2567        1915
2 KiB           1350        2993        2128
4-32KiB         1391        3262        2252
64KiB-16MiB     170         186         *830*
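
To make the comparison in the table easier to read, the figures can be turned into ratios. The following C sketch copies the numbers from the table above. The struct and function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Bandwidth figures copied from the table above (units as given there). */
struct bw_row {
    const char *size;
    double generic;   /* generic memset */
    double fpu;       /* SH4, FPU only */
    double fpu_movca; /* SH4, FPU + movca.l */
};

static const struct bw_row rows[] = {
    { "512",         1143, 1998, 1596 },
    { "1 KiB",       1273, 2567, 1915 },
    { "2 KiB",       1350, 2993, 2128 },
    { "4-32KiB",     1391, 3262, 2252 },
    { "64KiB-16MiB",  170,  186,  830 },
};

#define N_ROWS (sizeof rows / sizeof rows[0])

/* Ratio of the FPU+movca.l figure to the FPU-only figure for row i. */
double movca_vs_fpu(size_t i)
{
    return rows[i].fpu_movca / rows[i].fpu;
}

/* Ratio of the FPU+movca.l figure to the generic figure for row i. */
double movca_vs_generic(size_t i)
{
    return rows[i].fpu_movca / rows[i].generic;
}
```

With these figures, `movca_vs_fpu` is about 4.5 for the 64KiB-16MiB row and about 0.69 for the 4-32KiB row, while `movca_vs_generic` is above 1 for every row.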
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
Diffstat (limited to 'libpthread/linuxthreads.old/sysdeps/sh/tls.h')
0 files changed, 0 insertions, 0 deletions
