This is a broad question, because different device classes behave differently and have different characteristics and capabilities.
There are two ways for a process to move data across the user/kernel boundary: data transfer via system calls, or direct memory mapping.
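To make the two paths concrete, here is a minimal user-space sketch in Python. It is only an illustration under stated assumptions: a temporary regular file stands in for a device node, and the helper names (write_via_syscall, read_via_mmap, demo) are made up for this post. A real device would need a driver that actually supports mmap.

```python
import mmap
import os
import tempfile


def write_via_syscall(path, payload):
    """Copy payload into the kernel with an explicit write() system call."""
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)


def read_via_mmap(path, length):
    """Map the file into this process and read it like ordinary memory."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with mmap.mmap(fd, length, access=mmap.ACCESS_READ) as view:
            return bytes(view[:length])
    finally:
        os.close(fd)


def demo():
    # Assumption: a 16-byte temporary file plays the role of the device.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 16)
        path = tmp.name
    try:
        written = write_via_syscall(path, b"hello, device")
        return written, read_via_mmap(path, 16)
    finally:
        os.unlink(path)
```

Calling demo() writes through the system call path and then reads the same bytes back through the mapping, without a read() call for the data itself.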
When using memory mapping, there are obvious synchronization concerns, and you have to code carefully to ensure that your user process does not interfere with the interaction between the device driver and the device. Again, how this is done depends very much on the device. This type of coding is typically used, for example, to read a device's EEPROM values or to blast screen data into a frame buffer.
Devices and device drivers aren't usually sitting around waiting for user space data; rather, they react to events such as interrupts, and the device's interrupt handler does whatever it needs to do. They also respond (indirectly) to system calls, which are dispatched to the appropriate standard device entry point (e.g. ioctl, read, write, flush, seek, etc.).
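As a rough mental model of that event-driven behavior, here is a toy Python sketch. It is not kernel code: ToyDriver, dispatch, and the "GET_IRQ_COUNT" request are hypothetical names invented for illustration. The interrupt handler runs when the "device" has data, and system calls only reach the driver through a table of entry points.

```python
class ToyDriver:
    """Hypothetical driver model: reacts to interrupts and entry-point calls."""

    def __init__(self):
        self.rx = bytearray()  # data delivered by the "device"
        self.irqs = 0          # number of interrupts handled so far

    def handle_interrupt(self, data):
        # Runs when the device signals an event, not when user space asks.
        self.irqs += 1
        self.rx += data

    def read(self, count):
        # Entry point for read(): hand over whatever has already arrived.
        out = bytes(self.rx[:count])
        del self.rx[:count]
        return out

    def ioctl(self, request):
        # Entry point for ioctl(): the request name is made up for this sketch.
        if request == "GET_IRQ_COUNT":
            return self.irqs
        raise ValueError("unsupported ioctl request: %r" % (request,))


def dispatch(driver, call, *args):
    """Route a system call name to the matching driver entry point."""
    entry_points = {"read": driver.read, "ioctl": driver.ioctl}
    if call not in entry_points:
        raise OSError("driver does not implement " + call)
    return entry_points[call](*args)
```

For example, after two calls to handle_interrupt(), dispatch(drv, "read", 4) returns up to four of the bytes that have arrived so far, and dispatch(drv, "ioctl", "GET_IRQ_COUNT") reports how many interrupts were handled.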
Devices are controlled via various hardware registers. For example: set a bit and the device does something; set another bit and it does something else; read a register and the device returns the value and then clears that register (clear on read). Some devices, such as Ethernet devices, have rings of buffers: network data is loaded into the next available ring buffer, the ring's next/last pointer is updated, and then the device responds. There are all sorts of combinations. This may be what you mean by "talk" to the device without kernel intervention.
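The clear-on-read register and the buffer ring can be modeled in a few lines of Python. This is a simplified software sketch, not how any particular device works: the class names, the head/tail/count bookkeeping, and the drop-when-full policy are all assumptions made for the example.

```python
class ClearOnReadRegister:
    """Toy status register: reading returns the value and clears it."""

    def __init__(self):
        self._value = 0

    def set_bits(self, mask):
        # Device side: latch one or more status bits.
        self._value |= mask

    def read(self):
        # Driver side: fetch the latched bits and reset the register.
        value, self._value = self._value, 0
        return value


class RxRing:
    """Toy receive ring: the device fills slots, the driver drains them."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0   # next slot the device will fill
        self.tail = 0   # next slot the driver will consume
        self.count = 0  # number of filled slots

    def device_store(self, frame):
        if self.count == len(self.slots):
            return False  # ring full: this sketch simply drops the frame
        self.slots[self.head] = frame
        self.head = (self.head + 1) % len(self.slots)
        self.count += 1
        return True

    def driver_fetch(self):
        if self.count == 0:
            return None  # nothing waiting
        frame = self.slots[self.tail]
        self.slots[self.tail] = None
        self.tail = (self.tail + 1) % len(self.slots)
        self.count -= 1
        return frame
```

Reading the register twice in a row shows the clear-on-read effect: the second read returns 0. With the ring, frames come out in the order the device stored them, and the indices wrap around once the end of the slot list is reached.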
If you really have an interest in this stuff, consider getting the Linux Device Drivers book (O'Reilly) and studying some device drivers. Doing so will also *require* that you read the device's specification and programming interface.
Last edited by Mr. C.; 07-23-2008 at 02:43 PM.