LinuxQuestions.org
Old 06-11-2008, 01:07 AM   #16
TheOne...More
Member
 
Registered: Jun 2008
Posts: 30

Original Poster
Rep: Reputation: 0

yes gzipping is helpful in preserving more space but still it doesn't solve the main question, i need some sort of script to organize an html database of files scrambled on multiple disks.
 
Old 06-11-2008, 10:59 AM   #17
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 78
Quote:
Originally Posted by theriddle View Post
The only way I can imagine this could occur is that GZip support was added in IE7 (which is the UA you have).
I can confirm (through monitoring my own site) that both IE6 and IE5 also support compressed HTTP. My guess is that the fact that the URI ends in “.gz” threw off IE6. It probably has a rule for downloading by extension (application/gzip) rather than listening to the Content-Type reported by the server (text/html). In most situations using web compression this problem does not exist, since the URIs themselves look “normal” to the browser.
 
Old 06-11-2008, 10:59 AM   #18
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 78
Quote:
Originally Posted by TheOne...More View Post
i need some sort of script to organize an html database of files scrambled on multiple disks.
Okay, here’s how I understand it. You split up the “content” across multiple media, but leave the “website part” on each medium. What I mean is that the html, css, and perhaps a few jpg, gif, or png files take up an insignificant amount of space, so you copy them to all of the disks. The other content (videos, pdfs, etc.) does take up significant space, so it is allocated to different media.

So theriddle’s initial suggestion (to have a dummy page for the nonexistent entries which tells the user which disk to load) will not work, since it is specific to html. In particular, you could get by with a “dummy pdf file” and a “dummy mpeg” and so on, but this is not a very elegant solution.

Another solution (again) involves setting up a webserver. But compression is not the purpose of the server. What you can do is write your own 404-page script. The script would have access to a list of files and their locations (this list would also be insignificant in size and could be copied to each disk).

For example, suppose the list of files on each disk is kept in a file named after the disk number, in a directory called “disks” located in the cgi-bin for the server. So it might be something like this:
Code:
$ cat disks/1
/foo/bar.mpg
/baz/qux.pdf
$ cat disks/2
/bar/foo.pdf
/qux/bar.pdf
/foo/baz.mpg
…
Then a preliminary 404 error script might look like:
Code:
#!/bin/bash

echo "Content-Type: text/html

<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URI $REQUEST_URI was not found on this server.</p>
<p>You may have better luck if you insert Disk Number "

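cd disks
# -F: treat the URI as a fixed string rather than a regex
# -x: match the whole line only; -l: print just the manifest (disk) name
if ! grep -lFx -e "$REQUEST_URI" * ; then
	echo "UNKNOWN"
fi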

echo "and reload</p>
<hr>
<address>$SERVER_SOFTWARE</address>
</body></html>"
Of course there is currently no sanitization of the input, but it should work in 99% of cases. You can improve it as you see fit.
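As a rough sketch of the missing input handling: the script above prints $REQUEST_URI straight into the page, so a crafted URI could inject markup. One way to avoid that, assuming only plain sh and sed are available, is to escape the HTML-special characters first. The names html_escape and SAFE_URI are made up for this example and are not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): escape the characters
# that are special in HTML. The ampersand must be replaced first.
html_escape() {
	printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' \
		-e 's/</\&lt;/g' \
		-e 's/>/\&gt;/g' \
		-e 's/"/\&quot;/g'
}

# Print this instead of the raw $REQUEST_URI inside the error page.
SAFE_URI=$(html_escape "${REQUEST_URI:-}")
```

The error page would then echo $SAFE_URI where it currently echoes $REQUEST_URI.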
 
Old 06-11-2008, 11:18 AM   #19
theriddle
Member
 
Registered: Jun 2007
Distribution: Gentoo
Posts: 172

Rep: Reputation: 30
Another option, compared to osor's suggestion, is to set up an HTML redirect, for example
Code:
<html>
    <head>
        <title>Loading...</title>
        <meta http-equiv="refresh" content="0;url=!IT!"/>
    </head>
    <body>
        <h1>Loading...</h1>
    </body>
</html>

Last edited by theriddle; 06-11-2008 at 11:24 AM. Reason: Wrong button
 
Old 06-11-2008, 11:25 AM   #20
TheOne...More
Member
 
Registered: Jun 2008
Posts: 30

Original Poster
Rep: Reputation: 0
osor, if i clicked on a link to a file and the server determined the disk containing the linked file, when i insert the correct disk will the file be loaded automatically or do i have to traverse through the disk tree?
 
Old 06-11-2008, 11:37 AM   #21
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 78
Quote:
Originally Posted by theriddle View Post
Another option, compared to osor's suggestion, is to set up an HTML redirect, for example
Unless I am missing something, an HTML redirect would apply to HTML files, whereas the bulk of the OP’s files are non-html. You could always do a server-side redirect, but if you are going through the trouble of setting up a server, a 404 ErrorDocument seems more appropriate.
 
Old 06-11-2008, 11:38 AM   #22
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 78
Quote:
Originally Posted by TheOne...More View Post
osor, if i clicked on a link to a file and the server determined the disk containing the linked file, when i insert the correct disk will the file be loaded automatically or do i have to traverse through the disk tree?
Once you have loaded the correct disk, just refresh (e.g., F5 in many browsers) and the correct file will be accessed. You could get even fancier and have some javascript in the error page that refreshes for you after you tell it you have loaded the proper disk.
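A minimal sketch of that "fancier" variant, assuming the error page is still produced by a shell CGI script: the script prints a button whose click handler calls the browser's window.location.reload(). The function name reload_button and the button text are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print an HTML button that reloads the current URI,
# so the user does not have to press F5 after swapping disks.
reload_button() {
	printf '%s\n' \
		'<p><button type="button" onclick="window.location.reload()">' \
		'I have inserted the disk, try again</button></p>'
}
```

Calling reload_button just before the closing body tag of the error page would be enough. Since the reload requests the same URI again, the file is served directly if it is now present.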
 
Old 06-11-2008, 11:38 AM   #23
TheOne...More
Member
 
Registered: Jun 2008
Posts: 30

Original Poster
Rep: Reputation: 0
theriddle, i think i get what you're saying: for every link to a file that isn't on the current disk, i link instead to an html page saying that the file isn't on this disk and which disk to find it on. if that's it, then it's great, and it would work for the website i want to archive if i archived it alone. but i also want a more sophisticated solution. i want to put all my data on disks, distributed according to file size so as to waste less space, then make a large html index on a separate cd covering all the web pages/websites i downloaded. this separation would let me centralize control in one master index of all my content, which will be easier to manage and will also let me add or remove content without re-editing the whole library. then when i follow a link in the master index, i get a page or a notification just like in your idea, and when i insert the disk the file is loaded automatically. that last part is the new bit, i guess!
 
Old 06-11-2008, 11:41 AM   #24
TheOne...More
Member
 
Registered: Jun 2008
Posts: 30

Original Poster
Rep: Reputation: 0
ok osor, if what you say is possible, could you point me to resources where such a solution or usage is explained in more detail? also tell me which server(s) is the best choice for both linux/windows and maybe freebsd.
 
Old 06-11-2008, 01:05 PM   #25
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 78
Quote:
Originally Posted by TheOne...More View Post
ok osor, if what you say is possible, could you point me to resources where such a solution or usage is explained in more detail? also tell me which server(s) is the best choice for both linux/windows and maybe freebsd.
Well apache runs on pretty much anything, and for redirecting error pages, you use the ErrorDocument directive in the conf file. Apache is probably a little heavy for your needs, so you might try lighttpd, which also runs on many platforms (although there you probably have to find 3rd-party precompiled binaries for windows). For this, you would use something like server.error-handler-404.
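For illustration, the two directives named above might be used roughly as follows. The script path /cgi-bin/404.sh and the ".sh" mapping are assumptions made up for this sketch; adjust them to wherever the error script actually lives. For Apache, the script has to sit in a directory that is already set up to run CGI:

```
# Apache conf file: hand every 404 to a CGI script (hypothetical path).
ErrorDocument 404 /cgi-bin/404.sh
```

For lighttpd, CGI support has to be enabled as well:

```
# lighttpd.conf: same idea; the path and the interpreter mapping are assumptions.
server.modules += ( "mod_cgi" )
cgi.assign = ( ".sh" => "/bin/sh" )
server.error-handler-404 = "/cgi-bin/404.sh"
```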

Also, you will need some sort of interpreter if you want to use CGI for your error document. The example I gave above was not very portable, since it used bash and some features specific to GNU grep. For POSIX systems, you would want to use plain sh and no platform-specific extensions. I am unsure of how to proceed in windows, since I don’t think it has by default /bin/sh (but I could be wrong). You might try bundling microperl with the webserver, which shouldn’t take up too much space. This is probably a more natural language for your CGI anyway, but you don’t want to use any CPAN packages, just plain Perl.
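A sketch of what the lookup could look like in plain sh, assuming the same "disks" directory layout as in the earlier example (one manifest file per disk, one path per line). The function name find_disk is invented here. It relies only on grep's -F, -x and -q options, which are specified by POSIX.

```shell
#!/bin/sh
# find_disk DIR URI: print the name of every manifest file in DIR that
# contains URI as an exact, whole line. Returns non-zero if none does.
find_disk() {
	dir=$1
	uri=$2
	found=1
	for manifest in "$dir"/*; do
		[ -f "$manifest" ] || continue
		# -F: fixed string, -x: whole line, -q: no output, only exit status
		if grep -Fxq -e "$uri" "$manifest"; then
			basename "$manifest"
			found=0
		fi
	done
	return "$found"
}
```

In the error script this would be used as: find_disk disks "$REQUEST_URI" || echo "UNKNOWN"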

Anyway, the actual implementation of the scripts is up to you, but I gave an example of a “first draft”. If you have a central disk which houses the manifests for each of the other disks, you will have to have a two-step script. For example, when a file is not found, the first script will instruct you to load the central disk and press a submit button. This would load a different CGI script residing on the central disk, which would inform you what disk you want to find the file on. Upon inserting the correct disk, you would then press another button, which would take you to where you intended to go in the first place.
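The manifests themselves do not have to be written by hand. As a sketch, assuming each disk's content is available under some directory before burning, something like the following would produce a list in the same format as the "disks" files shown earlier. The function name make_manifest and the paths in the usage line are hypothetical.

```shell
#!/bin/sh
# make_manifest ROOT: list every regular file under ROOT as a
# server-relative path (e.g. /foo/bar.mpg), one per line, sorted.
make_manifest() {
	( cd "$1" && find . -type f | sed 's|^\.||' | sort )
}
```

Usage would be along the lines of: make_manifest /path/to/disk1-content > disks/1

One caveat: browsers percent-encode some characters (spaces, for instance) in the URI they request, so file names containing such characters would not match $REQUEST_URI literally without extra handling.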
 
Old 06-11-2008, 01:36 PM   #26
TheOne...More
Member
 
Registered: Jun 2008
Posts: 30

Original Poster
Rep: Reputation: 0
wow this is a little unusual to me as my experience in web technologies is not enough -i just know a little html without which i couldn't edit the website. so i need some introductory documentation, where could i find that? apache documentation or will i need third party documentation/books. online documents are always welcome.
 
Old 06-11-2008, 04:17 PM   #27
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 78
If you are new to this, I would recommend lighttpd instead of apache (much simpler to configure, and fewer places to mess up). There is documentation on its website. CGI is just a way to use scripts in place of html files, so when a certain request is made to the webserver, dynamic rather than static content may be delivered. If you are familiar with shell scripting, a properly-configured CGI script will be launched with a few environment variables set (such as $REQUEST_URI). It is up to you to use the variables to produce the output you desire (in this case, you grep the manifest files for the requested uri, and print out which one has it). There are CGI tutorials all over the web, but the only variable you will really need to deal with is $REQUEST_URI.
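To make the CGI idea concrete, here is about the smallest script that does something useful with $REQUEST_URI. It is only a sketch: it prints the response headers, a blank line, and then a plain-text body. The function name cgi_page is invented, and text/plain is used so that no HTML escaping is needed.

```shell
#!/bin/sh
# Minimal CGI sketch: headers first, then an empty line, then the body.
# The webserver sets REQUEST_URI in the environment before running this.
cgi_page() {
	printf 'Status: 404 Not Found\n'
	printf 'Content-Type: text/plain\n'
	printf '\n'
	printf 'You asked for: %s\n' "${REQUEST_URI:-unknown}"
}
```

A real script would end with a call to cgi_page. Running it by hand with REQUEST_URI set in the shell is an easy way to check the output before involving the webserver.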

Perl is a common language used for CGI, and microperl is a (severely) stripped build of the interpreter (which you can have for various platforms). Since you will need very little of Perl’s functionality, while having two or three interpreters bundled on each disk, I think it is a good choice to use. If you need help with a script, just post.

Finally, I think you’ll find that a good search engine is the best source of documentation.
 
Old 06-12-2008, 12:05 AM   #28
TheOne...More
Member
 
Registered: Jun 2008
Posts: 30

Original Poster
Rep: Reputation: 0
thank you all.
 
  


All times are GMT -5. The time now is 04:41 PM.
