Take a look at chapters 3 (Char Drivers) and 15 (Memory Mapping and DMA) of Linux Device Drivers, Third Edition.
From the application's point of view, mmap() yields what looks like a normal dynamically allocated memory region, except that it is shared with your driver. Changes made by one side are immediately visible to the other.
If you have a device with physical memory of its own, using mmap() to map that memory directly to userspace is extremely efficient: you can get zero-copy operation that way. Each memory access is directly seen by the hardware device.
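The sharing behaviour can be tried out entirely in userspace. The sketch below is only an analogy, not driver code: it assumes Linux with MAP_ANONYMOUS, and a forked child process stands in for the driver. The function name shared_mapping_demo() is made up for this illustration. With a real driver you would instead open() the device node and pass that file descriptor to mmap().

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: maps one shared page, lets a child process
 * (standing in for the driver) write into it, and returns what the
 * parent then reads from the same mapping. Returns -1 on error.
 */
int shared_mapping_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;
    assert(len >= sizeof(int));

    /* MAP_SHARED: writes go to the shared pages, not a private copy. */
    int *shared = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    shared[0] = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: plays the role of "the other side" of the mapping. */
        shared[0] = 42;
        _exit(0);
    }

    /* Parent: wait for the child, then read the same memory. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(shared, len);
        return -1;
    }

    int seen = shared[0];   /* no read() or copy was needed */
    munmap(shared, len);
    return seen;
}
```

Calling shared_mapping_demo() from a main() returns 42: the parent sees the child's write without any explicit copy. Note that the parent only knows the value is ready because it waited for the child to exit; the mapping itself provides no such notification.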
copy_to_user() and copy_from_user() copy data between memory allocated and used by your driver and memory belonging to the client application.
The main difference between these two approaches is that mmap() bypasses your kernel driver. The kernel driver will not know when or how userspace has modified the memory.
When the data is a continuous stream of bytes, mmap() is rarely used, because it is difficult for the kernel driver to know when, or whether, it has further data to process. (It is not impossible, however: one could use an interface similar to raw memory-mapped sockets, and use read()/write() only to indicate the amount of memory prepared in the buffer, completely ignoring the data pointer. This is very nonstandard, though, and will confuse most application programmers.)
On the whole, the best answer to your question depends highly on the manner in which your driver needs to handle the data, and whether you have a related device with its own physical memory or not. If you have byte streams, mmap() is typically not the best solution -- the extra copy (copy_to_user()/copy_from_user()) is really not that expensive.
Could you give us some more details on the nature of the data you intend to process? Is it a byte stream, or perhaps datagrams of known size? Do you want to minimize latency, or maximize throughput?
If the kernel driver is just a data processor, doing some transformation on the data, you'll get better performance by writing it as a library instead.