Creating file-sharing permissions, ACLs, etc. across multiple servers/accounts and NAS/SANs
I'm wondering how I could use ACLs, permissions, and other Linux capabilities to allow file sharing across many different "user accounts". Each account is essentially its own web hosting account, complete with an FTP account, SSH access, web hosting space, file-hosting applications (torrent and other apps), and all of this comes with limits on data storage, transfer, and possibly CPU/RAM usage. Most of this isn't directly relevant to what I'm going to ask, but it may help describe the situation at large.
Currently, accounts can be set up in a number of different ways. Some people have entire dedicated servers, some have shared servers, and then there are the "small" accounts, where people have hosting accounts on one large server (these accounts don't have complete root access, while the dedicated and shared servers do). The small accounts can install from a pre-approved list of software, while the dedicated/shared servers can install just about anything as long as it isn't malicious.
All of these servers have their own HDDs and SSDs, but some also make use of external storage (USB, Thunderbolt, NAS, SAN, etc.).
Let's say there are 100 dedicated boxes/accounts, 400 shared, and 5,000 "small" accounts (which vary widely in size/ability/power). One thing that happens a lot is that the same file shows up in 50-80% of the accounts, and these files range from 500MB to 30GB+ (even up to 150GB). If even a single 10GB file were on 50% of the accounts, that would be 2,750 copies taking up 27.5TB of storage, all for THE SAME FILE! That is a lot of wasted space if something could be done about it.
From what I have determined, there is currently no process for sharing between accounts. The data doesn't change, and an MD5 or SHA256 hash is used to ensure each file is 100% accurate, so the files are identical across all accounts. Now imagine this for 50,000-500,000+ files of varying size (say 500MB at the smallest), with replication anywhere from 3-4% up to 80-90% of accounts per file, so some files don't reach many accounts, while others almost saturate all of them.
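Since the files never change, verifying that two copies are identical before deduplicating is just a digest comparison. A minimal sketch (file names and paths are made up for illustration):

```shell
#!/bin/sh
set -e
root=$(mktemp -d)

# Stand-ins for the same file stored in two different accounts:
echo "payload" > "$root/account1-copy.bin"
echo "payload" > "$root/account2-copy.bin"

# Identical content yields identical SHA256 digests, so the copies
# are safe candidates for replacement with one shared copy:
h1=$(sha256sum "$root/account1-copy.bin" | cut -d' ' -f1)
h2=$(sha256sum "$root/account2-copy.bin" | cut -d' ' -f1)
if [ "$h1" = "$h2" ]; then
    echo "identical"
fi
```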
I want to figure out what I need to do to create a system (possibly on a very high-speed SAN) where, using that 10GB file as an example, the file is stored once but accessed by every account that needs it.
Each account gets a "call file" that tells the system to find this file on the network and make it available in their account. The call file is the same for every account user, and so is the data file (the program verifies authenticity via MD5/etc.). Now, is it possible to give each account that "calls" this file access to it in its centralized location, so the account doesn't have to store it locally? When it's needed, it just gets pulled from the SAN instead of the local HDD/SSD.
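The simplest form of that "call file" on Linux is a symlink: the data lives once on the SAN mount, and each account holds only a pointer to it. A minimal sketch, assuming illustrative paths (san/shared for the SAN mount, home/alice for an account; neither name comes from your setup):

```shell
#!/bin/sh
set -e
root=$(mktemp -d)

# One central copy on the SAN, one account directory:
mkdir -p "$root/san/shared" "$root/home/alice/files"
echo "payload" > "$root/san/shared/big-dataset.bin"   # the single shared copy

# The per-account "call file" is just a symlink to the SAN copy:
ln -s "$root/san/shared/big-dataset.bin" "$root/home/alice/files/big-dataset.bin"

# Reading through the symlink hits the central copy; nothing is
# duplicated on the account's local HDD/SSD:
cat "$root/home/alice/files/big-dataset.bin"   # prints "payload"
```

Note that hardlinks would also deduplicate, but they only work within a single filesystem, so symlinks (or a bind/NFS mount of the shared area) are the usual choice when the SAN is a separate mount.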
Also, if the system allocates say 500GB or 1TB to an account, I would need this 10GB file on the SAN to count against the data allotment in some way. If this works, there could be a separate "shared data" tally where this 10GB would be accounted for, leaving the account's own allotment as-is for its own unique data needs.
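Filesystem quotas won't count a symlink's target against the account, so the "shared data" tally would need its own accounting pass. One sketch: walk the account's tree, find symlinks that point into the shared area, and sum the target sizes (paths are illustrative; `stat -L` is GNU coreutils):

```shell
#!/bin/sh
set -e
root=$(mktemp -d)
mkdir -p "$root/san/shared" "$root/home/alice/files"

# Two shared files (tiny stand-ins for the multi-GB originals)
# plus one file the account owns outright:
head -c 1000 /dev/zero > "$root/san/shared/a.bin"
head -c 500  /dev/zero > "$root/san/shared/b.bin"
head -c 200  /dev/zero > "$root/home/alice/files/own.bin"
ln -s "$root/san/shared/a.bin" "$root/home/alice/files/a.bin"
ln -s "$root/san/shared/b.bin" "$root/home/alice/files/b.bin"

# Sum the target sizes of every symlink pointing into san/shared;
# regular files like own.bin fall under the normal quota instead:
shared_bytes=0
for link in $(find "$root/home/alice" -type l); do
    case $(readlink "$link") in
        "$root/san/shared/"*)
            size=$(stat -Lc %s "$link")   # -L follows the link to the target
            shared_bytes=$((shared_bytes + size)) ;;
    esac
done
echo "$shared_bytes"   # 1500
```

A cron job could run this per account and charge the result against the "shared data" section of the allotment while leaving the regular quota untouched.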
I'd also like users to be able to browse the shared files on the SAN as well as the "call files". That shouldn't be much of an issue, as I have some ideas on how to implement it. My biggest question is: what in Linux/BSD would let me set up the sharing/permissions/ACLs so the different accounts can access these files?
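The standard tool for this on Linux is POSIX ACLs via setfacl/getfacl, which let you grant per-user or per-group read access to individual files on the shared mount without changing ownership. A hedged sketch, assuming each hosting account maps to a Unix user (acct1042 is made up) and the SAN is mounted at /san with ACL support enabled:

```shell
# Lock the shared area down by default (no access for "other"):
chmod 0750 /san/shared

# Grant a single account read access to one shared file:
setfacl -m u:acct1042:r-- /san/shared/big-dataset.bin

# Or, for files shared by many accounts, manage access through a group:
groupadd share-bigdataset
usermod -aG share-bigdataset acct1042
setfacl -m g:share-bigdataset:r-- /san/shared/big-dataset.bin

# Inspect the resulting ACL:
getfacl /san/shared/big-dataset.bin
```

Because ACL entries live with the file, revoking an account's access is a single `setfacl -x` call, and the account's symlink simply stops resolving to readable data.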