ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Get a virtual cloud desktop with the Linux distro that you want in less than five minutes with Shells! With over 10 pre-installed distros to choose from, the worry-free installation life is here! Whether you are a digital nomad or just looking for flexibility, Shells can put your Linux machine on the device that you want to use.
Exclusive for LQ members, get up to 45% off per month. Click here for more info.
I have a list of numbers in a notepad file and I want to take each number, line by line, and see if they exist in an excel file. I'm trying to find out which numbers don't exist and save those numbers in a new file.
For example:
list.txt would look like this:
123456789
987654321
111111111
I don't need to know the location or anything...just whether the number exists.
Are you trying to write something 'in Excel', to do this, or do you want to parse an Excel spreadsheet, to search for your text? If the latter, I suggest starting with Perl, and use the Spreadsheet::ParseExcel module, or any of dozens of other similar modules. If you want help with programming in Excel, I will wish you luck and bow out of the conversation.
I've been tooling around with a VBS script and it seems to be ok so far. Here is what I have so far:
Code:
Const xlValues = -4163
Const FOR_READING = 1
strFilePath = "C:\Projects\test.txt"
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objTS = objFS.OpenTextFile(strFilePath, FOR_READING)
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True
Set objWorkbook = objExcel.Workbooks.Open("C:\Projects\k")
Set objWorksheet = objWorkbook.Worksheets("sheet1")
Set objRange = objWorksheet.UsedRange
Set objTarget = objRange.Find(objTS.Readline)
If Not objTarget Is Nothing Then
Wscript.Echo objTarget.AddressLocal(False,False)
strFirstAddress = objTarget.AddressLocal(False,False)
End If
Do Until (objTarget Is Nothing)
Set objTarget = objRange.FindNext(objTarget)
strHolder = objTarget.AddressLocal(False,False)
If strHolder = strFirstAddress Then
Exit Do
End If
Wscript.Echo objTarget.AddressLocal(False,False)
Loop
I am a little confused on how to extract the values from a pdf file based on parameters. I would like to extract a value that comes after a certain word and then skips to the next page of the file and does the same thing. Any ideas on how to do that and combine it with this file?
As far as the code...I have many things to work on, such as writing all the values that don't exist into a file but its a good start I think. Thanks for any input guys.
I have a list of numbers in a notepad file and I want to take each number, line by line, and see if they exist in an excel file. I'm trying to find out which numbers don't exist and save those numbers in a new file.
For example:
list.txt would look like this:
123456789
987654321
111111111
I don't need to know the location or anything...just whether the number exists.
Thanks for any help
This is the code,
I am assuming your text file is list.txt and your excel file is excel.csv
Code:
for i in `cat list.txt`;
do
grep $i excel.csv
if [ $? -eq 0 ]
then
echo $i >> newfile
fi
done
If in case you are getting duplicate entries, run this
This is all fine, as long as you know that the values being searched for are textual data. If the data are some sort of binary formatted data, then grep will not find it, or will very likely return false positive hits. A spreadsheet is not simply a flat text format file, and to treat it as such would be asking for errors. Some kind of programming language that 'knows' Excel files, either built-in, or by way of specially crafted add-on packages should be used. On the surface, the original poster's intention to use Visual Basic seems appropriate, as one would expect there to exist some kind of Excel-specific parsing code in VB.
--- rod.
A short script to check if a number exists in excel spreadsheet
Here is my small script. Your notepad file is list.txt, your excel file is data.csv (I assume). This script will go thru all entries in list.txt, and print only those entries that are NOT in data.csv .
Code:
var str data ; cat "data.csv" > $data
var str list ; cat "list.txt" > $list
while ($list <> "")
do
# Get the next number from $list.
var str number ; lex "1" $list > $number
# Print $number if not in $data
if ( { sen ("^"+$number+"^") $data } <= 0 )
echo $number
endif
done
Save this script as C:/Scripts/myfind.txt and run it from biterscripting as follows.
Code:
script myfind.txt
If you need to add the newly found numbers at the end of data.csv, use the following command.
Code:
script myfind.txt >> data.csv
I wrote this in biterscripting. You can use other languages by following the same logic. To try this in biterscripting, install biterscripting. The installation directions are at biterscripting.com/install.html . Biterscripting is free and installs quickly.
Hmm. In my mind, 'Excel Spreadsheet' != 'CSV file'. Sure, it can be exported as a CSV file, but it loses functionality, and needs to be done manually. If it were as simple as a CSV file, it would probably be a bash one-liner.
So what is the raison d'être for this biterscript that I've never heard of before?
I have hit a big snag! I'm convinced that VB Script is the easiest way to eventually do what i'm trying to do but...
1. I still can't figure out how to collect the values from a pdf file.
2. Is there a way I can take my .txt file and look for the values in multiple files, perhaps search in a folder?
1. "Excel" is NOT "CSV"
An Excel file has a (closed!) binary format - you need special libraries or programs to read it.
A CSV file is a simple text format: you can easily do it in any language.
A PDF file is another (different!) example of a closed, binary format. You'll need something else to work with it.
2. You didn't tell us what platform you were on, or what languages you were comfortable using.
It appears you're on Windows...
... so you don't have ready access to tools (like Perl, bash, or awk) that'll make your life easier. You can install any/all of these on Windows, but Windows is rather hostile to the *nix philosophy of "small tools that do one thing well, and play together well". Of Perl, bash and awk, you'll probably have the best luck with Perl. Or some other "scripting" language, like Python.
3. If you insist on a scripting language like VBS (or "WSH", or ".bat" files, or whatever particular variant you have) ... you probably owe it to yourself to *at least* get the most capable variant.
However, I am so close to finishing this using VB Script in Windows. I know of a program (that I can purchase) to be able to extract the data I need from the pdf files so the only problem left is to search through multiple files. I am still trying to figure out the "for each file in folder" coding. Here is what I have:
Code:
Const xlValues = -4163
FOR_READING = 1
strFilePath = "C:\Projects\test.txt"
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objTS = objFS.OpenTextFile(strFilePath, FOR_READING)
Set outfile = objFS.CreateTextFile("c:\Projects\testout.txt")
Set folder = objFS.GetFolder("c:\Invoices\")
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True
for each file in folder.Files
Set objWorkbook = objExcel.Workbooks.Open(file.path, ForReading)
Set objWorksheet = objWorkbook.Worksheets("mooc")
Set objRange = objWorksheet.UsedRange
Do Until objTS.AtEndOfStream
Invoice = objTS.Readline
Set objTarget = objRange.Find(Invoice)
If objTarget Is Nothing Then
outfile.writeline(Invoice)
End If
Loop
objWorkbook.close
Next
I need to change around the loop so that if the value isn't found in the first file, it continues to the next file in the folder until its checked all the files...and then if it still isn't found, writes it to "outfile" and then starts over from the first file for the next value in "text.txt" I hope that makes sense haha.
' Create Filesystem object
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFS.GetFolder("c:\Invoices\")
'Now, use a for each...next to loop through the Files collection
Dim objFile
Dim objExcel
For Each objFile in objFolder
' Try to open file as excel spreadsheet
if <<someFunction (objFile, objExcel)>> then
' Try to find the value you're looking for
if <<someOtherFunction (objExcel,results)>> then
' Update results
<<yetAnotherFunction (results, outfile)>>
end if
end if
Next
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.