Yes. Your best bet is a trivial character device driver (one that exports open(), close(), and mmap()) for this. You only need request_mem_region() to reserve the address range, then io_remap_pfn_range() to remap it into user space. See
Linux Device Drivers, 3rd Edition, especially chapter 3, for details.
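As a rough sketch of what that driver's mmap path looks like (the aperture address, size, and all names here are placeholders, not something from your hardware; error handling is abbreviated):

```c
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/ioport.h>

/* Hypothetical values -- substitute the real PCI aperture. */
#define APERTURE_BASE 0xf0000000UL
#define APERTURE_SIZE 0x00100000UL

static int ap_mmap(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long len = vma->vm_end - vma->vm_start;

    if (len > APERTURE_SIZE)
        return -EINVAL;

    /* Uncached, so stores reach the bus immediately. */
    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

    return io_remap_pfn_range(vma, vma->vm_start,
                              APERTURE_BASE >> PAGE_SHIFT,
                              len, vma->vm_page_prot);
}

static const struct file_operations ap_fops = {
    .owner = THIS_MODULE,
    .mmap  = ap_mmap,
};

static int __init ap_init(void)
{
    int major;

    if (!request_mem_region(APERTURE_BASE, APERTURE_SIZE, "aperture"))
        return -EBUSY;
    major = register_chrdev(0, "aperture", &ap_fops); /* dynamic major */
    return major < 0 ? major : 0;
}
module_init(ap_init);
MODULE_LICENSE("GPL");
```

(A real module would also need the matching cleanup in module_exit, and likely open()/release() stubs.)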
I'm still a bit intrigued. The PCI bus is a good choice for something like this.
Perhaps you could set up the IOMMU on each CPU so that each one exposes a separate address range on the PCI bus, mapped to uncached memory on that CPU. In that case, the standard multiprocessor locking primitives (which assert the PCI bus lock -- the LOCK opcode prefix on x86 architectures) will Just Work for you. Instead of a single shared aperture, you'd have three apertures, each one owned by a specific CPU but accessible to all three. Even if one of the CPUs drops out, the other two can continue working (assuming your hardware ensures that no single CPU can hog the bus indefinitely). Your userspace will use your driver to map all three apertures.
The logic for keeping all three in sync (if you intend to have majority rule, like NASA does for satellite stuff -- any one CPU is overruled by the other two, but consensus is expected) is built on top of those locking primitives. The operations you need depend on how you intend to use the CPU triplet: in parallel, sharing the workload, or with each dedicated to its own purpose.