LinuxQuestions.org
Old 07-23-2010, 11:50 AM   #31
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,150

Rep: Reputation: 887

Quote:
Originally Posted by GrapefruiTgirl View Post
EgaDs schneidz, I wonder if there is a more horrid color you could have chosen for those comments?
I'll change to purple. If it had the blue background (instead of grey), the contrast would've been better.
 
Old 07-23-2010, 11:52 AM   #32
hsp40oz
LQ Newbie
 
Registered: Jul 2010
Posts: 9

Original Poster
Rep: Reputation: 0
@schneidz - still returning 5700+ results.

I'm going to put this on hold for now and see if I can get some better data to work with from my boss.

Thank you all so much for the advice. Truly appreciated.
 
Old 07-23-2010, 11:56 AM   #33
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
No worries - we all enjoy a challenge like this, and likely some of the same ones of us will be around when you get some better data, and we can resume the battle against the books.

Cheers, & good luck.
 
Old 07-23-2010, 12:08 PM   #34
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,150

Rep: Reputation: 887
this might be overkill, but it gets us close:
Code:
rev HQNlist | cut -d " " -f 2- | rev > HQNlist.titles
grep -n -x -f Backlist HQNlist.titles
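To see why the rev/cut/rev pipeline strips the trailing ISBN, here is a small sketch on a made-up sample line. It assumes the "Title ISBN" format discussed in the thread; the title and number are invented for illustration and are not taken from the real HQNlist.

```shell
# Hypothetical sample line in the assumed "Title ISBN" format.
sample='The Road Taken 123455677'

# rev reverses the line so the ISBN becomes the first field,
# cut drops that first field, and the second rev restores the order.
title=$(printf '%s\n' "$sample" | rev | cut -d ' ' -f 2- | rev)

printf '%s\n' "$title"
```

Because only the last space-separated field is removed, titles that contain spaces themselves survive intact.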
 
Old 07-23-2010, 12:11 PM   #35
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
In case you're still around, here's something that may help show what the heck is returning all those results. This will put each search query from Backlist into the results file, FOLLOWED by any results returned by searching the HQN file. Then have a look in the results file and see what is going on; maybe post us a chunk of it when it's done.

Code:
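# Read Backlist line by line; IFS= and -r keep whitespace and backslashes intact
while IFS= read -r line; do
   echo "#### Searching for: ${line}" >> results.omfg
   echo "#### Results found:" >> results.omfg
   # -e keeps a title that starts with "-" from being taken as an option
   grep -e "${line}" HQNlist >> results.omfg
   echo >> results.omfg
done < Backlist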
 
Old 07-23-2010, 12:13 PM   #36
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
Plus -- make sure you are clearing out (deleting) the results file before EACH test-run of any of these scripts!! Since we're using the ">>" redirection of the results, we are appending to a file, but nowhere have we been re-making that file empty!
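One possible way to do that clearing, sketched here with a temporary file standing in for the real results file (the stale line and the sample query are invented for illustration):

```shell
# A temporary file stands in for results.omfg in this sketch.
results=$(mktemp)
echo "stale output from an earlier run" >> "$results"

# Truncate the file before the new run, so ">>" starts from empty.
: > "$results"

echo "#### Searching for: The Road" >> "$results"

# Count the lines now in the file; the stale line is gone.
line_count=$(grep -c '' "$results")
echo "lines in results file: $line_count"
rm -f "$results"
```

Putting the `: > file` line at the top of a test script means every run starts with an empty results file, without having to remember to delete it by hand.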
 
Old 07-23-2010, 12:23 PM   #37
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,150

Rep: Reputation: 887
here is the final score:
Code:
rev HQNlist | cut -d " " -f 2- | rev > HQNlist.titles
for line in `grep -n -x -f Backlist HQNlist.titles | cut -d : -f 1`
do
 sed -n "$line"p HQNlist
done
the 'grep -x' means to only print lines that match a pattern in their entirety (so The Road won't match The Road Taken).
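A small sketch of the difference `-x` makes, using invented sample titles in temporary files (the file names mirror the ones in the thread, but the contents are assumptions made up for this example):

```shell
workdir=$(mktemp -d)
printf '%s\n' 'The Road' > "$workdir/Backlist"
printf '%s\n' 'The Road' 'The Road Taken' 'On The Road' > "$workdir/HQNlist.titles"

# Without -x, a pattern may match anywhere inside a line.
loose=$(grep -c -f "$workdir/Backlist" "$workdir/HQNlist.titles")

# With -x, the pattern has to match the whole line.
exact=$(grep -c -x -f "$workdir/Backlist" "$workdir/HQNlist.titles")

echo "without -x: $loose matching lines, with -x: $exact"
rm -r "$workdir"
```

Note that `-f` still reads each Backlist line as a regular expression; adding `-F` would treat the titles as fixed strings, which may be safer if titles contain characters such as `.` or `*`.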

Last edited by schneidz; 07-23-2010 at 12:49 PM.
 
Old 07-23-2010, 01:05 PM   #38
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,550

Rep: Reputation: 2898
So I have been following our progress here and my understanding is as follows:

We have 2 files: one with what we are looking for (Backlist) and the other (HQNlist) containing:

a. maybe the names we are looking for, eg The Road 123455677
b. duplicates of what we might be looking for, eg The Road 123455345
c. lastly extensions of what we might be looking for, ie The Road and The Road House

Assuming I am on the correct track and that the format of the second file (HQNlist) is "Name of the movie / book" ISBN (I'm guessing about the number, but this works as long as the name is followed by a space and then a digit), I came up with this:
Code:
awk 'FILENAME == ARGV[1]{a[i++]=$0}FILENAME == ARGV[2]{for(x in a)if($0 ~ "^"a[x]" [0-9]" && !b[a[x]]++)print}' Backlist HQNlist
If there are duplicates also in the Backlist file this can easily be accommodated with a small change.
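For readers who find the one-liner dense, here is the same idea spread over several lines and run against invented sample data. The variable names `titles`, `n` and `seen` are this sketch's own, and the file contents are assumptions for illustration only.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'The Road' > "$workdir/Backlist"
printf '%s\n' 'The Road 123455677' 'The Road 123455345' \
    'The Road House 123455999' > "$workdir/HQNlist"

matches=$(awk '
    # First file (Backlist): remember every title.
    FILENAME == ARGV[1] { titles[n++] = $0; next }
    # Second file (HQNlist): print only the first line per title that
    # looks like "title, one space, digit".
    {
        for (i = 0; i < n; i++)
            if ($0 ~ "^" titles[i] " [0-9]" && !seen[titles[i]]++)
                print
    }
' "$workdir/Backlist" "$workdir/HQNlist")

printf '%s\n' "$matches"
rm -r "$workdir"
```

With this sample, only the first "The Road" line is printed: the second is a duplicate title and "The Road House" is an extension, which are cases b and c from the list above.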
 
Old 08-10-2010, 08:39 PM   #39
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
I found this thread from http://www.linuxquestions.org/questi...m-file-825366/ and decided to modify the script I made there to also work with the problem here. My idea is similar to grail's, but I didn't read page 3 right away, so I thought it wasn't solved yet. Anyway, here is a script that uses bash 4.0 and grep. I also really thought that awk would solve it with simpler syntax, especially since it needs associative arrays and those are only available in bash from 4.0 on, but my hands are still happier writing bash.

Here's the code, just in case it can still help and it's not too late.

This code must be placed inside a script.

Code:
#!/bin/bash

# extglob enables the +(...) and *(...) patterns used below
shopt -s extglob

[[ ${BASH_VERSINFO[0]} -ge 4 ]] || {
	echo "You need bash version 4.0 or newer to run this script." >&2
	exit 1
}

# void g (string <titles and isbn file path>)
#
# set to false if grep is not able to handle many arguments
#
if true; then
	function g {
		grep "${PATTERNS[@]}" "$1"
	}
else
	function g {
		local A
		for A in "${PATTERNS[@]}"; do
			# PATTERNS alternates '-e' and a pattern; skip the '-e' entries
			[[ $A == -e ]] && continue
			grep -e "$A" "$1"
		done
	}
fi

# void f (string <titles file path>, string <titles and isbn file path>)
#
function f {
	if [[ $# -ne 2 ]]; then
		echo "Invalid number of arguments." >&2
		return 1
	elif [[ ! -f $1 ]]; then
		echo "File $1 not found." >&2
		return 1
	elif [[ ! -f $2 ]]; then
		echo "File $2 not found." >&2
		return 1
	fi

	local REPLY
	local -a PATTERNS=()

	while read -r; do
		# remove leading and trailing spaces
		REPLY=${REPLY##+([[:blank:]])}
		REPLY=${REPLY%%+([[:blank:]])}

		# ignore empty and commented lines
		[[ -z $REPLY || $REPLY == '#'* ]] && continue

		# add with the pattern for ISBN included: the title at the start of
		# the line, then blanks, then digits up to the end of the line
		PATTERNS+=(-e "^${REPLY}[[:blank:]]\+[[:digit:]]\+\$")
	done < "$1"

	if [[ ${#PATTERNS[@]} -eq 0 ]]; then
		echo "error: no pattern was found in $1."  # >&2
		return 1
	fi

	local -A FLAG=()
	local ISBN  # might be too big to be an integer so better not use -i

	while read -r; do
		if [[ $REPLY != *+([[:blank:]])+([[:digit:]]) ]]; then
			echo "line is not in proper format: $REPLY" >&2
			continue
		fi

		# strip everything up to the last blank, leaving only the ISBN
		ISBN=${REPLY##*[[:blank:]]}

		# print a line only the first time its ISBN is seen
		if [[ -z ${FLAG[$ISBN]} ]]; then
			echo "$REPLY"
			FLAG[$ISBN]=.
		fi
	done < <(g "$2")

	# (return)
}
Code:
f Backlist HQNlist >Output 2>Errors
Maybe I should start contributing awk code as well, since it has proven to be simpler sometimes.
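As a side note on the parameter expansions such a script relies on, here is a minimal sketch of splitting a "Title ISBN" line into its two parts. The sample line is invented, and the variable names `line`, `isbn` and `title` are this sketch's own.

```shell
# Hypothetical sample line in the assumed "Title ISBN" format.
line='The Road Taken 123455677'

# "##" removes the longest matching prefix: everything up to the last blank.
isbn=${line##*[[:blank:]]}

# "%" removes the shortest matching suffix: the last blank and what follows.
title=${line%[[:blank:]]*}

printf 'title=%s isbn=%s\n' "$title" "$isbn"
```

The direction matters: `##` and `#` trim from the front of the value, while `%%` and `%` trim from the end, so extracting the last field needs one of the `#` forms.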
 
  

