Hi all,
I'm trying to use the TCM (Tightly Coupled Memory, a kind of SRAM that can be configured as part of RAM to improve performance) provided by the S3C6410 (ARM1176JZF-S based), which has 16K of ITCM and 16K of DTCM.
After a lot of searching and reading, I succeeded in writing a simple driver that sets up the I/D-TCMs. I load it with insmod at startup to enable the TCM at a physical address that is not assigned in the system memory map in the UM.
I wrote a very simple example to test it. The results show that L1 cache + TCM is even slower than using the L1 cache alone. Here's the code of the example:
Code:
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>
#include <string.h>

#define MAP_SIZE 8192
#define MAP_MASK (MAP_SIZE-1)

off_t dphybase = 0x0c002000;
off_t dphybase2 = 0x0c004000;

int main(){
    struct timeval startTime;
    struct timeval endTime;
    char ram_buff[MAP_SIZE], ram_buff2[MAP_SIZE], ch;
    char* tcm_buff;
    char* tcm_buff2;
    int i, mdesc, mag=10*MAP_SIZE, k;
    long span; /* long, to match the %ld in printf below */

    /* open() returns -1 on failure, not 0 */
    if((mdesc = open("/dev/mem",O_RDWR)) < 0) {
        perror("Error opening file /dev/mem");
        return 1;
    }
    tcm_buff = (char *)mmap(0, MAP_SIZE,PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, mdesc, dphybase&~MAP_MASK);
    tcm_buff2 = (char *)mmap(0, MAP_SIZE,PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, mdesc, dphybase2&~MAP_MASK);
    /* check both mappings, not only the first one */
    if(tcm_buff == MAP_FAILED || tcm_buff2 == MAP_FAILED) {
        perror("Error mapping");
        close(mdesc);
        return 1;
    }
    for(i=0;i<MAP_SIZE;i++){
        ram_buff[i] = i%128;
        ram_buff2[i] = i%128;
        tcm_buff[i] = i%128;
        tcm_buff2[i] = i%128;
    }
    /*
     * time ram_buff access
     */
    memset(&startTime, 0, sizeof(struct timeval));
    memset(&endTime, 0, sizeof(struct timeval));
    gettimeofday(&startTime, (struct timezone *) NULL);
    //read every one byte
    for(k=0;k<mag;k++)
        for(i=0;i<MAP_SIZE;i++) {
            ch = ram_buff[i];
            //ch += ram_buff2[i];
            //ram_buff[i] = ch;
        }
    //printf("%d ", ram_buff[i]);
    //printf("\n");
    gettimeofday(&endTime, (struct timezone *) NULL);
    span = (endTime.tv_sec - startTime.tv_sec) * 1000000LL;
    span += (endTime.tv_usec - startTime.tv_usec);
    printf("Total time of ram_buffer access: %ld\n", span);
    /*
     * time tcm_buff access
     */
    memset(&startTime, 0, sizeof(struct timeval));
    memset(&endTime, 0, sizeof(struct timeval));
    gettimeofday(&startTime, (struct timezone *) NULL);
    //read every one byte
    for(k=0;k<mag;k++)
        for(i=0;i<MAP_SIZE;i++){
            ch = tcm_buff[i];
            //ch += tcm_buff2[i];
            //tcm_buff[i] = ch;
            //printf("%d ", tcm_buff[i]);
        }
    //printf("\n");
    gettimeofday(&endTime, (struct timezone *) NULL);
    span = (endTime.tv_sec - startTime.tv_sec) * 1000000LL;
    span += (endTime.tv_usec - startTime.tv_usec);
    printf("Total time of tcm_buffer access: %ld\n", span);
    /* release both mappings and the /dev/mem descriptor */
    munmap(tcm_buff, MAP_SIZE);
    munmap(tcm_buff2, MAP_SIZE);
    close(mdesc);
    return 0;
}
output:
Total time of ram_buffer access: 16888375
Total time of tcm_buffer access: 21927893
As shown in the code, I chose /dev/mem + mmap to access the TCM.
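For reference, the mapping step can be written as a small helper. This is only a sketch: `split_phys_addr` and `map_phys` are hypothetical names, and it assumes the offset passed to mmap has to be aligned to the system page size, so any remainder of the physical address is added back to the returned pointer instead of being masked away.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Split a physical address into a page-aligned base (usable as the mmap
 * offset) and the remaining offset inside that page.
 * page_size must be a power of two. */
void split_phys_addr(off_t phys, long page_size, off_t *base, size_t *offset)
{
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
    *base = phys & ~((off_t)page_size - 1);
    *offset = (size_t)(phys - *base);
}

/* Map len bytes starting at physical address phys through an already
 * opened /dev/mem descriptor. Returns a pointer to the first requested
 * byte, or NULL on failure. To unmap, the caller passes the page-aligned
 * start (returned pointer minus the in-page offset) to munmap(). */
void *map_phys(int fd, off_t phys, size_t len)
{
    off_t base;
    size_t offset;
    void *p;

    split_phys_addr(phys, sysconf(_SC_PAGESIZE), &base, &offset);
    p = mmap(NULL, len + offset, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    if (p == MAP_FAILED)
        return NULL;
    return (char *)p + offset;
}
```

With the two addresses used in my test (0x0c002000 and 0x0c004000) the in-page offset is zero, so this behaves the same as the `dphybase&~MAP_MASK` expression in the example.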
Q1. Is something wrong with my example, or is it an appropriate way to compare the efficiency of TCM vs. cache and to conclude that TCM is slower?
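One thing that could be tightened in the test itself: the loop only assigns to `ch`, so with optimization enabled the compiler is free to drop the loads. A minimal sketch of a stricter timing loop follows; `elapsed_us` and `timed_read` are hypothetical names, and it assumes that reading through a `volatile` pointer and accumulating a sum is enough to keep every load in the generated code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Elapsed microseconds between two gettimeofday() samples. */
long long elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000000LL
         + (end->tv_usec - start->tv_usec);
}

/* Read every byte of buf, reps times, through a volatile pointer so the
 * loads cannot be optimized away. Returns the sum of all bytes read and
 * stores the elapsed time in *usec. */
unsigned long timed_read(const volatile unsigned char *buf, size_t len,
                         int reps, long long *usec)
{
    struct timeval t0, t1;
    unsigned long sum = 0;
    size_t i;
    int k;

    assert(buf != NULL && usec != NULL);
    gettimeofday(&t0, NULL);
    for (k = 0; k < reps; k++)
        for (i = 0; i < len; i++)
            sum += buf[i];
    gettimeofday(&t1, NULL);
    *usec = elapsed_us(&t0, &t1);
    return sum;
}
```

The same function can be called once on `ram_buff` and once on `tcm_buff`, so both measurements go through identical code, and the returned sum can be printed to confirm that both buffers hold the same data.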
I had doubts about the physical address assigned to the TCM (the User Manual does not say where it is assigned!), so I tried 0x0C002000 (mentioned in the UM) and 0x80004000 (found in the smdk6410 test suites), but the results are the same. Maybe both are wrong locations, since the ARM Info Center says the TCM base address (a physical address, of course) can be anywhere. However, "anywhere" may be somewhere already given to other devices. So, if anyone has tried using the TCM and configured it, please help me.
Q2. If my example and config are right, what could cause the TCM to lose to the cache?
Q3. What is secure/non-secure access to the TCM? I can configure the TCM as secure or non-secure according to here, but how do I know whether an access to the TCM is secure or not?
Thanks,
Zova