LinuxQuestions.org
05-27-2014, 01:46 PM   #1
Skillz
Need some .htaccess assistance...


A little backstory on what I am trying to do here.

I run a large website that hosts over 250GB of files for a game. The files are indexed by a PHP script that shows basic information about each one: the date it was created, its size, and a short description of what it is. It looks similar to Apache's default index page, but with extra formatting.

Recently I've begun to have a problem with people ripping the whole site in one go: hundreds of GB in a single day, going to people who just want all the files for whatever reason. This hammers the site. It hasn't hurt the server yet, other than consuming a ton of bandwidth, which isn't free, but I want to get a handle on it now before it gets out of hand.

So I got a script that monitors the number of "hits" through the PHP script. If xx hits occur within xx time, it forces the user to wait xx seconds before they can download again. The wait gets longer the more often they hit the limit within a 24-hour period: the first time it's 3 seconds, then 6 seconds, and so on up to 120 seconds. This is working great except for one flaw.
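A rough sketch of that throttling logic, written in Python rather than PHP so it can be read on its own. The post does not give the hit limit, the window length, or how the wait grows between 3 and 120 seconds, so MAX_HITS, WINDOW_SECONDS and the doubling schedule are assumptions, as are the class and method names.

```python
import time
from collections import defaultdict, deque

MAX_HITS = 10            # assumption: hits allowed per window
WINDOW_SECONDS = 60      # assumption: length of the hit-counting window
BASE_WAIT = 3            # first penalty, from the post
MAX_WAIT = 120           # penalty cap, from the post
STRIKE_TTL = 24 * 3600   # violations are counted over a 24-hour period


class DownloadThrottle:
    def __init__(self):
        self.hits = defaultdict(deque)      # client -> timestamps of recent hits
        self.strikes = defaultdict(deque)   # client -> timestamps of violations
        self.blocked_until = {}             # client -> time the current wait ends

    def check(self, client, now=None):
        """Return 0 if the download may proceed, else the seconds left to wait."""
        now = time.time() if now is None else now

        # The client is still serving an earlier penalty.
        until = self.blocked_until.get(client, 0)
        if now < until:
            return until - now

        # Count this hit inside the sliding window.
        hits = self.hits[client]
        while hits and hits[0] <= now - WINDOW_SECONDS:
            hits.popleft()
        hits.append(now)
        if len(hits) <= MAX_HITS:
            return 0

        # Limit exceeded: record a violation and double the wait each time
        # (assumed schedule: 3, 6, 12, ... capped at 120 seconds).
        strikes = self.strikes[client]
        while strikes and strikes[0] <= now - STRIKE_TTL:
            strikes.popleft()
        strikes.append(now)
        wait = min(BASE_WAIT * 2 ** (len(strikes) - 1), MAX_WAIT)
        self.blocked_until[client] = now + wait
        hits.clear()
        return wait
```

The client key would typically be the remote IP address, which is also an assumption; the state here lives in memory, whereas a real script would need to persist it between requests.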

The files are easily accessible without the index script. If you know the file name and the directory, you can download the file directly, which is easy to work out since all the information is in the URL. An example of the download URL for each file is as follows:

http://www.mysite.com/index.php?dir=...?file=file.zip

You can easily see that if you drop index.php?dir= and &file= from the URL you'll get this:

www.mysite.com/this/dir/here/file.zip

Thus bypassing PHP and the script.

Now here is the catch.

1. I do not want to limit bandwidth (speed) to the clients. I want them to get their file as fast as their connection will allow them. This is because most people only want 1 - 2 files at a time. These are legit users. I do not want to punish them.

2. I do not want some kind of captcha or the like required per download. Again, this would get in the way of legit users. Also, many of the users grab files for their server(s) from the command line with tools such as wget, and it is kind of hard to enter a captcha from a Linux shell prompt.

3. I don't want obfuscated URLs for similar reasons to #2. I do not mind if people hot link the downloads on their site and/or even use my site as a primary or alternative mirror on their site. I want to share the files.

I know of two ways to accomplish this: limit the number of "hits" per user, or limit the amount of data transferred per user. I think a transfer limit, something like 10GB per 24 hours, would be the best approach, but I do not know how to do that.
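One possible shape for the transfer limit, sketched in Python rather than PHP. It assumes every download passes through a script that knows the file size before serving it, and that a client can be identified by something like its IP address; the class and method names are made up for illustration.

```python
import time
from collections import defaultdict, deque

QUOTA_BYTES = 10 * 1024**3   # 10GB, the figure from the post
WINDOW_SECONDS = 24 * 3600   # rolling 24-hour window


class TransferQuota:
    def __init__(self, quota=QUOTA_BYTES, window=WINDOW_SECONDS):
        self.quota = quota
        self.window = window
        self.log = defaultdict(deque)   # client -> (timestamp, bytes) entries
        self.used = defaultdict(int)    # client -> bytes counted in the window

    def _expire(self, client, now):
        # Forget downloads that have aged out of the window.
        log = self.log[client]
        while log and log[0][0] <= now - self.window:
            _, size = log.popleft()
            self.used[client] -= size

    def allow(self, client, size, now=None):
        """Record the download and return True if it fits in the quota."""
        now = time.time() if now is None else now
        self._expire(client, now)
        if self.used[client] + size > self.quota:
            return False
        self.log[client].append((now, size))
        self.used[client] += size
        return True
```

This counts the full file size up front, so an aborted download still uses up quota. Like the hit counter, it only helps if the files cannot be fetched around the script.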

I have been able to limit the number of hits within a certain time frame, but again, it can be bypassed. What I think I can do is use .htaccess and mod_rewrite to help the script.

Basically I want mod_rewrite to redirect this URL:

www.mysite.com/this/dir/here/file.zip

to this URL:

http://www.mysite.com/index.php?dir=...?file=file.zip

Forcing them to download the file through PHP.

I like this approach because the legit, honest user will have no idea something is going on in the background while the leechers will be stopped cold.
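To make the intended mapping concrete, here it is as a small Python function rather than actual mod_rewrite syntax. The target format is taken from the access log in the follow-up post (/index.php?dir=Admin/&file=CB11.zip); the function name and the decision to skip index.php and directory requests are assumptions.

```python
import posixpath
from urllib.parse import quote


def to_index_url(path):
    """Map a direct file path to the index.php URL, or None if it should be left alone."""
    # Only absolute paths that name a file (not a directory) are rewritten.
    if not path.startswith("/") or path.endswith("/"):
        return None
    directory, filename = posixpath.split(path.lstrip("/"))
    # Never rewrite the index script itself.
    if not filename or filename == "index.php":
        return None
    dir_param = directory + "/" if directory else ""
    # quote() leaves "/" alone by default, so the directory stays readable.
    return "/index.php?dir={}&file={}".format(quote(dir_param), quote(filename))
```

For example, to_index_url("/this/dir/here/file.zip") gives "/index.php?dir=this/dir/here/&file=file.zip". One caveat: if the index script answers with a redirect back to the direct URL, as the access log in the follow-up post suggests, a rewrite like this would loop, so the script would have to send the file contents itself.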
 
05-27-2014, 02:03 PM   #2
Skillz (Original Poster)
Here is a snippet of the access log. I'm not sure that what I want to do is possible, now that I see how the indexing script redirects the user.

[27/May/2014:14:53:06 -0400] "GET /index.php?dir=Admin/&file=CB11.zip HTTP/1.1" 302 727 "http://mysite/index.php?dir=Admin/"
[27/May/2014:14:53:06 -0400] "GET /Admin/CB11.zip HTTP/1.1" 200 10654 "http://mysite.com/index.php?dir=Admin/"
[27/May/2014:14:53:20 -0400] "GET /Admin/CB11.zip HTTP/1.1" 200 10654 "-"
[27/May/2014:14:54:18 -0400] "GET /index.php?dir=Admin/&file=box2.zip HTTP/1.1" 302 727 "http://mysite.com/index.php?dir=Admin/"
[27/May/2014:14:54:18 -0400] "GET /Admin/box2.zip HTTP/1.1" 200 5293 "http://mysite.com/index.php?dir=Admin/"

In the first line, index.php is asked for the file CB11.zip and answers with a 302 redirect.
The second line looks like the client accessed CB11.zip directly, because the index script just forwards them to the file.
The third line is where I really did access the file directly, purposely bypassing the index script (note the empty referrer).
The next two lines are where I accessed another file through the index.php script.

I admit, I'm not sure what I am doing. :/
 
05-31-2014, 07:21 PM   #3
Diantre
Quote (originally posted by Skillz):
The files are easily accessible without the index script. If you know the file name and the directory, you can download the file directly, which is easy to work out since all the information is in the URL. An example of the download URL for each file is as follows:

http://www.mysite.com/index.php?dir=...?file=file.zip

You can easily see that if you drop index.php?dir= and &file= from the URL you'll get this:

www.mysite.com/this/dir/here/file.zip

Thus bypassing PHP and the script.

Before going with the .htaccess approach, I have a suggestion. Why don't you put the files in another directory, outside the document root of the web server? Obviously, this other directory must still be readable by the web server.

This way, you don't need to pass the directory as a parameter to the PHP script; just make it a constant inside the script. The users then only have to pass the "file" parameter. They don't need to know where the files are on your server; only the script knows where they reside.

This would mean that someone trying to access the files directly won't be able to; users will have to go through your script to get the files.
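A minimal sketch of the lookup such a script would need, in Python rather than PHP. FILES_ROOT is a made-up path and resolve_download is a made-up name; the point is only that the requested name has to be checked so it cannot climb out of the files directory.

```python
import os

# Hypothetical location outside the web server's document root.
FILES_ROOT = "/srv/game-files"


def resolve_download(name, root=FILES_ROOT):
    """Return the absolute path for a requested file, or None if it is not allowed."""
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, name))
    # Reject anything that resolves outside the root, e.g. "../../etc/passwd"
    # or an absolute path.
    if os.path.commonpath([root, candidate]) != root:
        return None
    # Only existing regular files can be served.
    if not os.path.isfile(candidate):
        return None
    return candidate
```

The script would then run its hit or quota check and, if that passes, stream the file at the returned path itself instead of redirecting to it.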

Just a thought, perhaps it'll work.
 
  


All times are GMT -5.
