mmap / remap_pfn_range on powerpc embedded

karagul · 09-17-2010, 03:49 PM

Hello,
I've spent a while googling this topic without much luck, so I eventually decided to post a question here. Thank you in advance for any answers.

I am working on a 2.6.31.6 kernel for powerpc (32 bit) and I'm trying to access, from userspace, the memory of a device that is mapped into the CPU's physical address space.

I know this is not good, as I/O memory is special and should not be exposed to userspace, but I have no choice, and in any case I wonder whether this is possible at all. Here's the problem I've encountered:

I implemented the mmap call in the file operation structure of my char device driver and used remap_pfn_range to make physical memory accessible to userspace.
In userspace I just use mmap() on the file descriptor and access the memory.
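
For reference, the userspace side described here can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the helper names map_device() and unmap_device() are hypothetical, and the sketch assumes a MAP_SHARED mapping accessed through a volatile pointer so the compiler does not drop or merge accesses.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of the file at `path`, starting at
 * the page-aligned byte `offset`. Returns NULL on failure.
 */
volatile uint8_t *map_device(const char *path, off_t offset, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED so that stores go to the underlying object, not a private copy. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);

    close(fd); /* the mapping remains valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : (volatile uint8_t *)p;
}

/* Hypothetical helper: release a mapping obtained from map_device(). */
void unmap_device(volatile uint8_t *base, size_t len)
{
    munmap((void *)base, len);
}
```

A caller would do something like base = map_device("/dev/mydev", 0, 4096) (the device path is made up) and then read or write base[n] directly.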

Reading seems to be just fine, but a write to the device triggers a bus transaction only after an explicit msync() call following each write (in addition to any memory barrier, like sync/eieio).

1) Am I missing something?
2) I thought this behaviour should only happen for files on disk. Does this mean I am actually writing to RAM and flushing pages to the device using msync() (i.e. I am transferring 4KB at each access)?

Note1: if I access my device through read/write on the char device everything works perfectly, so I don't think it's a cache problem or anything HW related.

Note2: if I map /dev/mem I get the same error, so I don't think it's my driver's fault.

I might be confused about this, so I would be very grateful if anyone could elaborate on this topic a bit more.

Thank you!

crust · 09-20-2010, 04:50 PM

How are you invoking remap_pfn_range? For one, you may want to use io_remap_pfn_range, although on PPC I think they are the same. Second, you can set vma->vm_flags |= VM_IO | VM_RESERVED; before you call remap_pfn_range. Here is the call from my working driver:

vma->vm_flags |= VM_IO | VM_RESERVED;

vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

rc = io_remap_pfn_range(vma, vma->vm_start, PLB_MASTER_MEM>>PAGE_SHIFT,
len, vma->vm_page_prot);
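
The third and fourth arguments of the call above are a page frame number (the physical base shifted right by PAGE_SHIFT) and a length in bytes. The arithmetic and the bounds check a driver would typically do first can be sketched in plain C as below. This is a userspace model only, not kernel code: window_to_pfn() and SKETCH_PAGE_SHIFT are hypothetical names, and 4 KB pages are assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u /* assumption: 4 KB pages */

/*
 * Hypothetical helper: compute the first page frame number to map.
 *   phys_base   - page-aligned physical base of the device window
 *   region_size - size of that window in bytes
 *   pgoff       - page offset requested by mmap() (vma->vm_pgoff in a driver)
 *   map_len     - requested length (vma->vm_end - vma->vm_start in a driver)
 * Returns 0 and stores the PFN in *pfn, or -1 if the request does not fit.
 */
int window_to_pfn(uint64_t phys_base, uint64_t region_size,
                  uint64_t pgoff, uint64_t map_len, uint64_t *pfn)
{
    uint64_t page_mask = ((uint64_t)1 << SKETCH_PAGE_SHIFT) - 1;
    uint64_t offset = pgoff << SKETCH_PAGE_SHIFT;

    /* The base must sit on a page boundary and the mapping must be non-empty. */
    if ((phys_base & page_mask) != 0 || map_len == 0)
        return -1;

    /* Reject requests that would run past the end of the device window. */
    if (offset > region_size || map_len > region_size - offset)
        return -1;

    *pfn = (phys_base >> SKETCH_PAGE_SHIFT) + pgoff;
    return 0;
}
```

In a real handler the resulting PFN and map_len would be what gets passed to io_remap_pfn_range; checking the length first keeps userspace from mapping physical addresses beyond the device.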

karagul · 09-20-2010, 04:59 PM

Hi crust, thanks for your response,
unfortunately my driver already looks like yours, but I still have the same problem.

Upon further investigation it seems that my problem is actually related to the L1 cache: using the scope I can see the uP issue 32 additional read accesses to my device upon a write of a single byte to any location (from user space). I say it's the L1 cache because the uP has 32 KB of L1 with lines of exactly 32 bytes.
My problem is that pgprot_noncached, as you suggested, should avoid caching problems, but in fact it does not seem to.
I wonder whether I have found a bug?
Note that this doesn't happen if I access the device from kernel space using ioremap() and iowrite8().
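
The numbers in this post can be checked with a bit of arithmetic. The sketch below uses the cache geometry stated above (32 KB of L1, 32-byte lines). The helper names are hypothetical, and the byte-wide device bus is an assumption, since the post does not state the bus width. Under that assumption, filling one line takes exactly 32 reads, which would match what the scope shows.

```c
#include <stdint.h>

#define CACHE_LINE_SIZE 32u           /* bytes per L1 line, from the post */
#define CACHE_SIZE      (32u * 1024u) /* 32 KB of L1, from the post */

/* Start address of the cache line that contains addr. */
uint32_t line_base(uint32_t addr)
{
    return addr & ~(CACHE_LINE_SIZE - 1u);
}

/* Number of bus reads needed to fill one line over a bus of the given width. */
unsigned line_fill_reads(unsigned bus_width_bytes)
{
    return CACHE_LINE_SIZE / bus_width_bytes;
}

/* Total number of lines in the L1 cache. */
unsigned cache_lines(void)
{
    return CACHE_SIZE / CACHE_LINE_SIZE;
}
```

So a single-byte store to a cacheable page would first pull in the whole line starting at line_base(addr), and the store itself would reach the device only when that line is later written back.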

Thank you for your help!

crust · 09-20-2010, 10:40 PM

Quote:

Originally Posted by karagul

using the scope I can see the uP issue 32 additional read accesses to my device upon a write of a single byte to any location (from user space). I say it's the L1 cache because the uP has 32 KB of L1 with lines of exactly 32 bytes.

That sounds like the cache line size on my PPCs (44x cores). So it appears you are doing a write and it is flushing the dirty data from the cache to the target. Is it possible that you have the page tables set up to cache the physical area? That would be somewhere in your boot code (u-boot maybe) or possibly in your device tree, but I am not sure about that.

karagul · 09-21-2010, 06:21 AM

Hi crust,
it actually seems you're right: the pages returned by ioremap have the I and G bits set in the MMU table.
The pages returned by io_remap_pfn_range, on the other hand, don't.

I actually don't understand this. My code is the following:

////////////////////
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

// actually remap the physical device memory to virtual memory
ret = io_remap_pfn_range(vma, vma->vm_start, pfn, vsz, vma->vm_page_prot);
////////////////////

I don't see anything wrong with this. I thought pgprot_noncached(vma->vm_page_prot) should do the job!
Also, I've noticed that the /dev/mem driver does exactly the same thing, and if I map /dev/mem instead of my module I still see the cache effect. Is this a bug or expected behavior?

crust · 09-23-2010, 04:34 PM

Unfortunately, I am not 100% sure why this happens. I think it is expected behavior because the space that you are treating as I/O (i.e. the memory for your device) is actually in kernel-managed memory. I *think*, but am in no way certain, that if the memory is kernel-managed, then it is cacheable, at least on powerpc. Hopefully somebody can chime in here. In the 4xx series of CPUs, I believe it is up to the software to maintain coherency. In that case, you just have to sync the cache and memory before you do whatever your code will do.

In the code I showed it was not a problem because my device was in true I/O (PCI) space, so it was not cacheable.