A little backstory on what I am trying to do here.
I run a large web site that hosts over 250 GB worth of files for a game. The files are indexed by a PHP script that shows basic information such as the creation date, size, and a short description of each file. It looks similar to Apache's default index page, but with formatting and so on.
Well, recently I've begun to have a problem with people ripping the entire site in one go: hundreds of GB in a single day, downloaded by people who just want all the files for whatever reason. Doing this hammers the site. While it hasn't negatively affected the server yet, other than consuming a huge amount of bandwidth (which isn't free), I want to get a handle on it now before it does get out of hand.
So I have a script that monitors the number of "hits" through the PHP script. If some number of hits occurs within some time window, it forces the client to wait a number of seconds before they can download again. The wait grows each time they hit the limit within a 24-hour period: 3 seconds the first time, then 6 seconds, and so on up to 120 seconds. This is working great except for one flaw.
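The escalating wait described above can be sketched as a small helper. This is a minimal sketch, assuming the delay doubles per violation and that the per-IP violation count comes from whatever store the site already uses; the function name and thresholds are placeholders:

```php
<?php
// Sketch of the escalating wait: each violation within the 24-hour
// window doubles the delay, starting at 3 s and capped at 120 s.
// $violations is assumed to come from the site's per-IP hit tracker.
function waitSeconds(int $violations): int
{
    if ($violations < 1) {
        return 0; // no violations yet: no wait
    }
    $delay = 3 * (2 ** ($violations - 1)); // 3, 6, 12, 24, ...
    return min($delay, 120);               // cap at 120 seconds
}
```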
The files are easily accessible without going through the index script. If you know the file name and the directory, you can download the file directly, which is easy to figure out since all the information is in the URL. An example download URL looks like this:
http://www.mysite.com/index.php?dir=...&file=file.zip
You can easily see that if you drop index.php?dir= and &file= from the URL, you'll get this:
www.mysite.com/this/dir/here/file.zip
Thus bypassing PHP and the script entirely.
Now here is the catch.
1. I do not want to limit bandwidth (speed) for clients. I want them to get their files as fast as their connection allows, because most people only want one or two files at a time. These are legit users, and I do not want to punish them.
2. I do not want a captcha or the like required per download. Again, this would get in the way of legit users. Also, many users grab files for their server(s) from the command line with tools like wget, and it's kind of hard to enter a captcha from a Linux shell.
3. I don't want obfuscated URLs, for reasons similar to #2. I do not mind if people hotlink the downloads on their site, or even use my site as a primary or alternative mirror. I want to share the files.
So I know of two ways to accomplish this: limit the number of "hits" per user, or cap the amount of data transferred per user. I think a data cap, something like 10 GB per 24 hours, would be the better approach, but I do not know how to implement it.
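One way to implement the per-user data cap is a rolling per-IP byte counter that resets after 24 hours. A minimal sketch, assuming usage is held in a plain array keyed by IP (a real site would persist this in a database or APCu); the names and the 10 GB / 24 h figures mirror the numbers above:

```php
<?php
// Sketch of a per-IP transfer quota: refuse a download once the
// client has pulled more than QUOTA_BYTES within WINDOW_SECS.
const QUOTA_BYTES = 10 * 1024 * 1024 * 1024; // 10 GB per window
const WINDOW_SECS = 24 * 3600;               // 24-hour window

// Returns true if this download may proceed, and records its size.
// $usage maps IP => ['start' => window start time, 'bytes' => total].
function allowDownload(array &$usage, string $ip, int $fileBytes, int $now): bool
{
    // Start a fresh window if none exists or the old one has expired.
    if (!isset($usage[$ip]) || $now - $usage[$ip]['start'] >= WINDOW_SECS) {
        $usage[$ip] = ['start' => $now, 'bytes' => 0];
    }
    if ($usage[$ip]['bytes'] + $fileBytes > QUOTA_BYTES) {
        return false; // over quota: refuse (e.g. send a 429 response)
    }
    $usage[$ip]['bytes'] += $fileBytes;
    return true;
}
```

Because the check happens per file, legit users grabbing one or two files never notice it, while a ripper hits the cap partway through the day.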
I have been able to limit the number of hits within a certain time frame, but again, it can be bypassed. What I think I can do is use mod_rewrite in .htaccess to help the script.
Basically, I want mod_rewrite to internally rewrite this URL:
www.mysite.com/this/dir/here/file.zip
to this URL:
http://www.mysite.com/index.php?dir=...&file=file.zip
This forces every download to go through the PHP script.
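A sketch of that rewrite in .htaccess, assuming the file sits at the web root next to index.php; the rule pattern and parameter names are placeholders matching the URL scheme above:

```apache
RewriteEngine On
# Never rewrite the script itself, or the rewrite would loop.
RewriteRule ^index\.php$ - [L]
# Only rewrite requests that point at a real file on disk.
RewriteCond %{REQUEST_FILENAME} -f
# this/dir/here/file.zip -> index.php?dir=this/dir/here&file=file.zip
RewriteRule ^(.+)/([^/]+)$ index.php?dir=$1&file=$2 [L,QSA]
```

Note that index.php then has to stream the file back itself (for example with readfile() after the hit check passes) rather than redirect the client to the direct URL, otherwise the client ends up right back at the unprotected path.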
I like this approach because the legit, honest users will have no idea anything is going on in the background, while the leechers will be stopped cold.
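For the serving side, a minimal sketch of how the script could resolve and stream a requested file; the base directory and function name are placeholders, and the realpath() check is there because anything built from the query string must be guarded against "../" directory traversal:

```php
<?php
// Resolve dir/file from the rewritten query string into a safe path
// under $baseDir, or null if the request escapes it or is not a file.
// $baseDir is assumed to already be a resolved absolute path.
function resolveDownloadPath(string $baseDir, string $dir, string $file): ?string
{
    $real = realpath($baseDir . '/' . $dir . '/' . $file);
    // realpath() collapses "../" tricks; reject anything outside
    // $baseDir, and anything that is not a plain file.
    if ($real === false
        || strncmp($real, $baseDir . '/', strlen($baseDir) + 1) !== 0
        || !is_file($real)) {
        return null;
    }
    return $real;
}

// In index.php, once the hit/quota check passes, something like:
//   header('Content-Type: application/octet-stream');
//   header('Content-Length: ' . filesize($path));
//   readfile($path); // streams at full speed; no throttling applied
```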