Shell scripting: problem reading from database and exporting to HTML

sunksullen · 05-17-2007, 02:16 AM

I have created a database of the local school districts (just for fun; I'm trying to learn CSS and shell scripting), and I am trying to read each line from the database file and export each entry to its own HTML page. Here is a sample of the database file:

$ cat -n sttpublicschool*

1 :Bethel District:http://www.bethel.k12.or.us/:Bethel School District inspire each student to excellence.:4640 Barger Dr; Eugene, OR 97402,

541) 689-3280: education public school students learning city teacher districts
2 :Blachly District:www.blachly.k12.or.us:Blachly District:20264 Blachly Grange Rd; Blachly, OR 97412,

541) 925-3262: education public school students learning city teacher districts
3 :Creswell School District:http://www.creswell.k12.or.us/do/index.html:Creswell School District:182 S 2nd St; Creswell, OR 97426,

541) 895-2108: education public school students learning city teacher districts
4 :Crow-Applegate School District NMB.66:impactartsgroup.com:Fifteen miles southwest of Eugene, Oregon in the rural valleys of Crow and Lorane, Crow-Applegate-Lorane School District offers a uniquely successful public education experience. $:85955 Territorial Rd; Eugene, OR 97402,

541) 935-2100: education public school students learning city teacher districts

I used the command "cat -n databasefile" to add numbers to each line. I then grep'ed each line and extracted the fields using ":" as a delimiter. I set those fields as the desired variables and echo'ed them into their own HTML file.

MY PROBLEM: I need to be able to automatically read each line from the database file and export each listing to its own HTML web page. I've been reading a lot, and am very new to programming (but really, really loving it). I'm totally stumped on an efficient way to read from a database text file line by line and export each line's separate fields into its own separate HTML file. Could someone please point me in the right direction?

Thanks so much for your time and help! The previous questions I asked were answered perfectly. I really appreciate all of your support.

Much Thanks,

Cammo

P.S. oh and obviously I need it to read from the database file automatically.

sunksullen · 05-17-2007, 03:53 AM

Sorry, didn't mean to post twice.

jlinkels · 05-17-2007, 08:23 AM

Ok, this problem could be solved, but whatever you do, it will always be very, very limited.

If you do it thru shell scripting, you should use awk to get each field on its own line. You can precede and surround it by HTML tags as you wish. A link to awk is present in my signature.
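As a minimal sketch of that idea (the file name `districts.txt`, the `<P>` tags, and the two simplified sample records are only illustrative assumptions), awk can split each record on ":" and print every field on its own line, wrapped in a tag:

```shell
#!/bin/sh
# Sample data for illustration only; the file name is an assumption.
workdir=$(mktemp -d)
cat > "$workdir/districts.txt" <<'EOF'
1:Blachly District:www.blachly.k12.or.us:20264 Blachly Grange Rd; Blachly, OR 97412
2:Crow-Applegate School District:impactartsgroup.com:85955 Territorial Rd; Eugene, OR 97402
EOF

# -F: sets the field separator to a colon. The loop walks over all NF fields
# of the current record and prints each one on its own line inside <P> tags.
fields=$(awk -F: '{ for (i = 1; i <= NF; i++) print "<P>" $i "</P>" }' \
    "$workdir/districts.txt")
printf '%s\n' "$fields"
```

The same loop body can be adapted to emit table rows or list items instead of paragraphs.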

However, as soon as the words "database" and "HTML" occur in the same line, you should switch to LAMP: Linux, Apache, MySQL and PHP.

MySQL is the database, and it is very easy to use. Read chapter 2 and 3 in the user documentation available at mysql.com. MySQL interfaces nicely with shell scripting, so you can populate your database and test it from the shell.

Then, to access the database and create web pages, use PHP. In the MySQL functions chapter of the PHP manual there are some examples which can be copied verbatim.

I know it sounds complicated, but a few months ago I faced the same problem and decided to learn PHP. In one weekend I had my first MySQL/PHP app running, and I never regretted the investment. At my age (48) I do not learn as fast as before!

jlinkels

theNbomr · 05-17-2007, 09:25 AM

jlinkels is correct that using a web server, backend database server, and PHP (or CGI) is the classic method of providing web access to databases, but for your stated goal of learning CSS and Shell scripting, it is significant overkill. When reading a file line-by-line, tools like sed & awk are the classic tools, again as jlinkels points out. When one approaches shell scripting, these tools should be considered part of the landscape, and your exercise would serve well as a learning experience. Sprinkle in lots of other filter-style tools such as grep, sort, cat, tr for more useful experience. Maybe even look at some simple perl scripting, which I find often works well for one-liners, and is often overlooked.
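To give a feel for how such filters chain together, here is a small sketch (the file name and the simplified three-field records are assumptions made for the example): `cut` pulls out the district name, `sort` orders the names, and `tr` upper-cases them.

```shell
#!/bin/sh
# Sample data for illustration only; the file name is an assumption.
workdir=$(mktemp -d)
cat > "$workdir/districts.txt" <<'EOF'
1:Creswell School District:182 S 2nd St; Creswell, OR 97426
2:Bethel District:4640 Barger Dr; Eugene, OR 97402
3:Blachly District:20264 Blachly Grange Rd; Blachly, OR 97412
EOF

# cut -d: -f2 keeps only the second colon-separated field (the name),
# sort orders the lines, and tr converts lower case to upper case.
names=$(cut -d: -f2 "$workdir/districts.txt" | sort | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$names"
```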

I will offer one suggestion. Since you seem to have the freedom to define your own database file format, I recommend that you use something a bit more robust, and also more conventional. Comma-separated values (CSV) format would be adequate on the simple end of the spectrum, vs. a full-on XML format at the more complex end. There is much to learn simply in the exercise of creating such file formats, and the knowledge gained will almost certainly be useful.
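To illustrate why the delimiter matters, here is a rough sketch using tab-separated fields. Note that this swaps in tabs rather than true CSV: the sample addresses contain commas, so plain comma splitting would need quoting rules that a one-line awk command does not handle. The file name `districts.tsv` is an assumption.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Write one record with three tab-separated fields (name, URL, address).
printf '%s\t%s\t%s\n' \
    'Bethel District' \
    'http://www.bethel.k12.or.us/' \
    '4640 Barger Dr; Eugene, OR 97402' \
    > "$workdir/districts.tsv"

# With a tab as the field separator, the colon inside the URL and the comma
# inside the address are ordinary characters, so the record has 3 fields.
nf=$(awk -F'\t' '{ print NF }' "$workdir/districts.tsv")
url=$(awk -F'\t' '{ print $2 }' "$workdir/districts.tsv")
echo "fields: $nf"
echo "url: $url"
```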

--- rod.

sunksullen · 05-17-2007, 02:21 PM

I appreciate both of your replies. I've been playing with this all night, and am figuring out a lot more about reading from text files. I really want to master shell scripting before jumping into mysql and php, and I think I'm just about ready to start writing the script that will export all the HTML files.

Question: Is having around 2000 pages in HTML going to slow down the server significantly vs. using PHP? Why would I want to use PHP and MySQL over CSV and XML? I wrote a program that writes entries into the database using the format I posted. I don't really care how it looks, just as long as it works. My issue now is that I'm trying to make everything automated. Currently I have to type "1...enter....2.....enter" (where 1 is line one of the data text file, 2 is line two, etc.). Each line (entry) in the text file has its own number, and I'm grepping for that number. I want to be able to type in how many lines I want it to count to, and then have it extract each line and export it to its own HTML page. Currently, I'm stuck having "read" ask me which line number it needs to get to write to the HTML file. I want it to automatically count up to the number I specify. Does that make any sense? Thanks.
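For reference, the counting described above can be automated with a plain shell loop. This is only a rough sketch under some assumptions: the file name, the output names (`1.html`, `2.html`, ...), the simplified three-field records, and the hard-coded `count` are all made up for the example.

```shell
#!/bin/sh
# Sample data for illustration only; the file name is an assumption.
workdir=$(mktemp -d)
cat > "$workdir/districts.txt" <<'EOF'
1:Bethel District:http://www.bethel.k12.or.us/
2:Blachly District:www.blachly.k12.or.us
3:Creswell School District:http://www.creswell.k12.or.us/do/index.html
EOF

count=2   # how many entries to export; could instead come from "$1" or read
n=1
while [ "$n" -le "$count" ]; do
    # sed -n "Np" prints only line N of the file. IFS=: makes read split on
    # colons; the last variable (url) receives the rest of the line, so a
    # colon inside the URL does not break the split.
    sed -n "${n}p" "$workdir/districts.txt" | {
        IFS=: read -r num name url
        printf '<HTML><BODY><H1>%s</H1><P>%s</P></BODY></HTML>\n' \
            "$name" "$url" > "$workdir/$num.html"
    }
    n=$((n + 1))
done
ls "$workdir"
```

Because the loop counter replaces the interactive prompt, nothing has to be typed per entry.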

-Cameron

sunksullen · 05-17-2007, 02:24 PM

How would I get awk to automatically extract each line to its own file? Or extract each field from each line to its own line? Thanks!


theNbomr · 05-17-2007, 06:12 PM

Static HTML pages will be marginally faster for a web server to serve, once they have been created, that is. The advantage of using a script such as PHP is that you just create the script once, and the pages are generated on the fly as needed. Changes to the script will be immediately reflected in pages loaded.

The whole method of sed and awk acting on each line of input is what makes them very suitable for these kinds of tasks.
Save this as whatever.awk:

Code:

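# whatever.awk: write one HTML page per record of a colon-delimited file.
BEGIN { FS = ":" }    # fields are separated by colons

{
    # Name the page after the district name in field 2.
    # awk joins strings by placing them side by side; it has no "." operator.
    html = $2 ".html"
    print html        # report each file name on stdout

    # The file is opened once by the first ">" and stays open until close(),
    # so later prints to the same name append rather than truncate.
    print "<HTML><HEAD><TITLE>School District Data</TITLE></HEAD><BODY>" > html
    print "<TABLE>" > html
    # One table row per field: the field number, then its contents.
    # "i <= NF" includes the last field (the keywords).
    for (i = 1; i <= NF; i++) {
        printf("<TR><TD>%3d</TD><TD>%s</TD></TR>\n", i, $i) > html
    }
    print "</TABLE>" > html
    print "</BODY></HTML>" > html

    close(html)       # release the file before moving to the next record
}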

Try running it like:

Code:

awk -f whatever.awk schoolDistrict.txt

Now, notice that records 1 and 3 come out different from the rest of them. This is because your database file format is not sufficiently robust. The embedded ':' in the "http://" URLs is seen as a field separator. This is the reason that I recommended something that has already been designed for this purpose. My reasons had nothing to do with appearance, but everything to do with function. Also, using industry standard formats provides for easy use of existing tools and data exchange.
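A quick way to spot this kind of problem is to have awk report how many fields it sees in each record. The sketch below uses two simplified sample records and an assumed file name; the record whose URL starts with "http:" ends up with one field more than the other.

```shell
#!/bin/sh
# Sample data for illustration only; the file name is an assumption.
workdir=$(mktemp -d)
cat > "$workdir/districts.txt" <<'EOF'
1:Blachly District:www.blachly.k12.or.us:Blachly District
2:Creswell School District:http://www.creswell.k12.or.us/do/index.html:Creswell School District
EOF

# NR is the current record number and NF its field count. The colon in
# "http:" splits the URL in two, so the second record has an extra field.
counts=$(awk -F: '{ print NR ": " NF " fields" }' "$workdir/districts.txt")
printf '%s\n' "$counts"
```

If every record is supposed to have the same layout, any line whose count differs from the rest points at a stray delimiter in the data.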

--- rod.

PS: Not posting your input data in [ C O D E ] tags has caused loss of data where closing parens follow a colon. Another example of transporting data in the [in]correct format.

sunksullen · 05-18-2007, 02:16 PM

Thanks so much for your help and advice!
