LinuxQuestions.org
Old 03-26-2012, 04:59 PM   #1
rm_-rf_windows
Member
 
Registered: Jun 2007
Location: Europe
Distribution: Ubuntu
Posts: 292

Rep: Reputation: 27
Bash Script / Regular Expression Problem


Hi all,

I'm making progress in Bash regular expressions and scripting and am enjoying it thanks in part to the participants who have given really good advice in the forum.

I've got yet another question, probably not technically a regular expression question, but a similar one.

I've got a list of words which has the following form:
Code:
CATEGORY
Subcategory
1 specific type
2 specific type n° 2
Subcategory2
3 specific type n° 3
etc.
CATEGORY2
Subcategory3
89 specific type n° 89
90 specific type n° 90
Subcategory4
91 specific type n° 91
CATEGORY3
etc.
The idea is to convert this into a database script (sql).

The lines with numbers are going to be the main tuples. Their foreign keys will point to another id in the "subcategory" table. The "subcategory" table will have a foreign key pointing to the id of the "category" table. A very simple structure.

So here is my question. CATEGORIES are all uppercase, Subcategories begin with an uppercase letter followed by lowercase, and the main tuple entries have a number followed by a tab and a word. Any line can be multiword (including spaces, hyphens, etc.). How can I store the CATEGORY and Subcategory values in variables in order to link the respective values to each other? The input is a list: each "title" and "subtitle" applies to the lines directly below it, until another subtitle/title is encountered.

In other words, the end result should be:
Code:
CREATE TABLE category...
CREATE TABLE subcategory...
CREATE TABLE main_table...

INSERT INTO category VALUES (0, "CATEGORY");
INSERT INTO category VALUES (1, "CATEGORY2");
INSERT INTO category VALUES (2, "CATEGORY3");
etc.
INSERT INTO subcategory VALUES (0, "subcategory", 0);
INSERT INTO subcategory VALUES (1, "subcategory2", 0);
INSERT INTO subcategory VALUES (2, "subcategory3", 1);
INSERT INTO subcategory VALUES (3, "subcategory4", 1);
INSERT INTO subcategory VALUES (4, "subcategory5", 2);
etc.
INSERT INTO main_table VALUES (0, "specific type", 0);
INSERT INTO main_table VALUES (1, "specific type n° 2", 0);
etc.
INSERT INTO main_table VALUES (90, "specific type n° 91", 3);
etc.
What I don't know how to do is to link CATEGORY to Subcategory and the main tuples in my main script in order to generate an .sql file. The rest is okay.

(I don't know if I've been clear this time round, if not, let me know and I'll respond again).

Thanks in advance,

rm
 
Old 03-26-2012, 07:10 PM   #2
bigrigdriver
LQ Addict
 
Registered: Jul 2002
Location: East Centra Illinois, USA
Distribution: Debian stable
Posts: 5,908

Rep: Reputation: 356
It's been a few years since I did anything with sql, but I'll give it a try.

When you create the category table, you populate it with the names of each category. For the sake of consistency, I suggest you name CATEGORY as CATEGORY1 as a means of preventing unexpected errors when running queries, since there are other categories with a number in the name. Also, rename subcategory as subcategory1 in the subcategory table.

Now, each CATEGORY in the category table should have a unique identifier (category_id) as the primary key, as well as the category name.

In the subcategory table, each subcategory should have a unique identifier (subcategory_id) as the primary key, the subcategory name, AND a foreign key which is the same as the primary key from the category table.

In the main_table, which I gather has all the items you want to use to populate the subcategory table, each item in the main_table has a unique id (item_id) as the primary key, the item name, AND a foreign key which is the same as the primary key from the subcategory table.

If I have remembered it correctly, that's how the tables are linked together.

Since my study of sql, mysql has added something called triggers, which I must say I haven't studied so I don't know how they work. I'll leave it to you to research that.
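
The linkage described above could be written down roughly as follows. This is only a sketch: the id column names (category_id, subcategory_id, item_id) come from the description, while the "name" columns, the column types and the print_schema function name are assumptions made for illustration. Since the goal is a script that writes an .sql file, the statements are emitted from a small shell function.

```shell
#!/bin/sh
# Sketch only: id column names follow the post; the "name" columns,
# the types and the function name print_schema are assumptions.
print_schema() {
    cat <<'EOF'
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        VARCHAR(255) NOT NULL
);
CREATE TABLE subcategory (
    subcategory_id INTEGER PRIMARY KEY,
    name           VARCHAR(255) NOT NULL,
    category_id    INTEGER NOT NULL,
    FOREIGN KEY (category_id) REFERENCES category (category_id)
);
CREATE TABLE main_table (
    item_id        INTEGER PRIMARY KEY,
    name           VARCHAR(255) NOT NULL,
    subcategory_id INTEGER NOT NULL,
    FOREIGN KEY (subcategory_id) REFERENCES subcategory (subcategory_id)
);
EOF
}

# Write the schema to standard output; redirect to a file as needed
print_schema
```

Each foreign key points one level up, so the category table has to be created (and filled) before subcategory, and subcategory before main_table.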
 
Old 03-27-2012, 06:04 AM   #3
rm_-rf_windows
Member
 
Registered: Jun 2007
Location: Europe
Distribution: Ubuntu
Posts: 292

Original Poster
Rep: Reputation: 27
Thank you Bigrigdriver, but that wasn't my question!

I want to know how to generate a script that will give me:

Code:
0, "subcategory", 0
1, "subcategory2", 0
2, "subcategory3", 1
3, "subcategory4", 1
4, "subcategory5", 2
etc
0, "specific type", 0
1, "specific type n° 2", 0
In other words, how do I store "heading" and "subheading" into variables in order to place them next to their respective words ("specific type n° 2" is a word)? I guess my example wasn't clear.

Here's another example:
Code:
SCIENCE
Biology
0 cell
1 organ
2 organism
3 virus
4 bacteria
5 ... etc.
I'm going to have a table for SCIENCE, LAW, POLITICS, JOURNALISM, etc.
Another table for Biology, Chemistry, etc. which are under science,
And yet another with organ, organism, virus, bacteria, etc.

This isn't a database question, it's a script question.

How do I get:

Code:
0 cell "Biology id"
1 organ "Biology id"
lined up in a text file...

And how do I get
Code:
"Biology id" "SCIENCE id"
"Chemistry" "SCIENCE id"
lined up...

etc.

Or, to simplify my example further (this would be ok), how do I get:

Code:
0 cell, Biology, SCIENCE
1 organ, Biology, SCIENCE
2 organism, Biology, SCIENCE
3 virus, Biology, SCIENCE
4 bacteria, Biology, SCIENCE
...
78 contract, Employment law, LAW
79 severance pay, Employment law, LAW, ... 
etc.
I want to be able to store headings and subheadings in a variable and then place them beside their corresponding words (word = "0 cell"). When a new heading or subheading appears while going down in the text, I want the subheading variable to change so that the words below them have the correct categories associated with them (and likewise for headings).

Reminder: HEADINGS ARE IN UPPERCASE; Subheadings Begin With An Uppercase, The Rest Is Lower Case;
specific words are in the form "0 cell", in other words, a number followed by a space or tab and then the actual word.
All words, headings and subheadings can be either one word or several words.

What I want is rather simple, sorry if my explanations aren't very clear.

Many thanks Bigrigdriver for your response nevertheless, I don't think my question was very clear.

rm
 
Old 03-27-2012, 06:41 AM   #4
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
OK to do it with awk?
Code:
c@CW8:~/d/bin/try/LQ$ cat LQ-936561.input
SCIENCE
Biology
0 cell
1 organ
2 organism
3 virus
4 bacteria
c@CW8:~/d/bin/try/LQ$ ./LQ-936561.awk LQ-936561.input
Biology SCIENCE
0 cell Biology
1 organ Biology
2 organism Biology
3 virus Biology
4 bacteria Biology
c@CW8:~/d/bin/try/LQ$ cat LQ-936561.awk
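#!/usr/bin/awk -f

# Blank lines carry no information; skip them
/^[ \t]*$/ { next }

# Ordinary line: a number, then a space or tab, then the word(s)
/^[0-9]+[ \t]/ {
    #print "DEBUG: " $0 " matched ordinary line"
    print $0 " " subheading
    next
}

# Heading: contains no lowercase letter, so multiword headings match too
$0 !~ /[a-z]/ {
    #print "DEBUG: " $0 " matched heading"
    heading = $0
    next
}

# Anything else is a subheading
{
    #print "DEBUG: " $0 " matched subheading"
    subheading = $0
    print subheading " " heading
}
Note: the pattern $0 !~ /[a-z]/ may be controversial. It is intended to be portable between versions of awk, but bracket ranges such as [a-z] can be locale-dependent, so running the script with LC_ALL=C is the safer choice.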
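
To get from this kind of output to the INSERT statements asked for in the first post, the same classification of lines can feed counters instead of print statements. What follows is a hedged sketch, not a finished tool. It assumes ids are handed out in order of appearance starting at 0 (the item's own leading number is dropped), that the column order is (id, name, parent id), and that standard single-quoted SQL strings are acceptable instead of the double quotes in the original example. The function name list_to_sql and the helper q are made up for illustration.

```shell
#!/bin/sh
# Sketch: convert the heading/subheading/item list into INSERT statements.
# Assumptions: ids start at 0 in order of appearance; strings are emitted
# in single quotes; list_to_sql and q() are illustrative names.
list_to_sql() {
    awk -v sq="'" '
    # Quote a value for SQL, doubling any embedded single quote
    function q(s) { gsub(sq, sq sq, s); return sq s sq }

    BEGIN { n_cat = 0; n_sub = 0; n_item = 0 }

    /^[ \t]*$/ { next }                        # skip blank lines

    /^[0-9]+[ \t]/ {                           # item line
        name = $0
        sub(/^[0-9]+[ \t]+/, "", name)         # drop the leading number
        items[n_item] = sprintf("INSERT INTO main_table VALUES (%d, %s, %d);", n_item, q(name), n_sub - 1)
        n_item++
        next
    }

    $0 !~ /[a-z]/ {                            # heading: no lowercase letters
        cats[n_cat] = sprintf("INSERT INTO category VALUES (%d, %s);", n_cat, q($0))
        n_cat++
        next
    }

    {                                          # anything else: subheading
        subcats[n_sub] = sprintf("INSERT INTO subcategory VALUES (%d, %s, %d);", n_sub, q($0), n_cat - 1)
        n_sub++
    }

    END {
        # Parents first, so foreign keys always point at existing rows
        for (i = 0; i < n_cat; i++)  print cats[i]
        for (i = 0; i < n_sub; i++)  print subcats[i]
        for (i = 0; i < n_item; i++) print items[i]
    }'
}

# Demo on a few lines of the sample data
list_to_sql <<'EOF'
SCIENCE
Biology
0 cell
1 organ
EOF
```

The foreign key of each row is simply "the id of the most recent parent seen", which is the counter minus one at the moment the line is read. Collecting the statements in arrays and printing them in END keeps the three groups of INSERTs together, as in the desired output.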
 
Old 03-28-2012, 01:05 PM   #5
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,850

Rep: Reputation: 7309
It depends on what you know. You can follow catkin's idea (above), but you could also use a Perl script or Java; it really depends on you. You can probably solve it in bash too, but that does not look very convenient.
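
For comparison, here is roughly what a shell-only version might look like. It is a minimal sketch under the rules stated earlier in the thread: item lines start with a digit, headings contain no lowercase letters, and every other non-empty line is a subheading. The function name annotate_list is invented for this example, and the output format follows the simplified "word, Subheading, HEADING" layout from post #3.

```shell
#!/bin/sh
# Sketch: shell-only version. Assumes item lines start with a digit,
# headings contain no lowercase letters, and all other non-empty lines
# are subheadings. annotate_list is an illustrative name.
annotate_list() {
    heading=
    subheading=
    # The "|| [ -n ... ]" part also handles a last line without a newline
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '')            continue ;;          # skip empty lines
            [0-9]*)        printf '%s, %s, %s\n' "$line" "$subheading" "$heading" ;;
            *[[:lower:]]*) subheading=$line ;;  # has a lowercase letter
            *)             heading=$line ;;     # everything else is a heading
        esac
    done
}

# Demo on a few lines of the sample data
annotate_list <<'EOF'
SCIENCE
Biology
0 cell
1 organ
EOF
```

The two variables are just "the last heading seen" and "the last subheading seen". They are overwritten whenever a new one appears, so every item line picks up whatever is current at that point. The order of the case branches matters: the digit test comes first so that an item containing capital letters is never mistaken for a heading.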
 
  


All times are GMT -5. The time now is 10:37 PM.
