LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   How to search for a value in Excel (https://www.linuxquestions.org/questions/programming-9/how-to-search-for-a-value-in-excel-709711/)

offerd15 03-06-2009 04:15 PM

How to search for a value in Excel
 
I have a list of numbers in a notepad file and I want to take each number, line by line, and see if they exist in an excel file. I'm trying to find out which numbers don't exist and save those numbers in a new file.

For example:

list.txt would look like this:

123456789
987654321
111111111

I don't need to know the location or anything...just whether the number exists.

Thanks for any help :)

theNbomr 03-06-2009 06:56 PM

Are you trying to write something 'in Excel', to do this, or do you want to parse an Excel spreadsheet, to search for your text? If the latter, I suggest starting with Perl, and use the Spreadsheet::ParseExcel module, or any of dozens of other similar modules. If you want help with programming in Excel, I will wish you luck and bow out of the conversation.

--- rod.

Hko 03-07-2009 03:39 AM

Writing an OpenOffice.org macro could be another option.
But please tell us what programming language you intend to use.

offerd15 03-07-2009 03:53 PM

I've been tooling around with a VBS script and it seems to be ok so far. Here is what I have so far:
Code:

Const xlValues = -4163
Const FOR_READING = 1
strFilePath = "C:\Projects\test.txt"

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objTS = objFS.OpenTextFile(strFilePath, FOR_READING)

Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True

Set objWorkbook = objExcel.Workbooks.Open("C:\Projects\k")
Set objWorksheet = objWorkbook.Worksheets("sheet1")

Set objRange = objWorksheet.UsedRange

Set objTarget = objRange.Find(objTS.Readline)

If Not objTarget Is Nothing Then
    Wscript.Echo objTarget.AddressLocal(False,False)
    strFirstAddress = objTarget.AddressLocal(False,False)
End If

Do Until (objTarget Is Nothing)
    Set objTarget = objRange.FindNext(objTarget)

    strHolder = objTarget.AddressLocal(False,False)
    If strHolder = strFirstAddress Then
        Exit Do
    End If

    Wscript.Echo objTarget.AddressLocal(False,False)
Loop

I am a little confused on how to extract the values from a pdf file based on parameters. I would like to extract a value that comes after a certain word and then skips to the next page of the file and does the same thing. Any ideas on how to do that and combine it with this file?

As far as the code...I have many things to work on, such as writing all the values that don't exist into a file but its a good start I think. Thanks for any input guys.

Digital Surgeon 03-07-2009 03:56 PM

Take a look at writing a small bash script using "grep". I remember vaguely studying this. Though there probably a macro available somewhere.

offerd15 03-07-2009 04:47 PM

I'm not real familiar with bash scripts...could you give me a link to some references? Thanks

vikas027 03-09-2009 02:52 AM

Quote:

Originally Posted by offerd15 (Post 3467311)
I have a list of numbers in a notepad file and I want to take each number, line by line, and see if they exist in an excel file. I'm trying to find out which numbers don't exist and save those numbers in a new file.

For example:

list.txt would look like this:

123456789
987654321
111111111

I don't need to know the location or anything...just whether the number exists.

Thanks for any help :)

This is the code,
I am assuming your text file is list.txt and your excel file is excel.csv

Code:

for i in `cat list.txt`;
do
grep $i excel.csv
if [ $? -eq 0 ]
then
echo $i >> newfile
fi
done

If in case you are getting duplicate entries, run this

Code:

sort -u newfile > final_result

theNbomr 03-09-2009 01:35 PM

This is all fine, as long as you know that the values being searched for are textual data. If the data are some sort of binary formatted data, then grep will not find it, or will very likely return false positive hits. A spreadsheet is not simply a flat text format file, and to treat it as such would be asking for errors. Some kind of programming language that 'knows' Excel files, either built-in, or by way of specially crafted add-on packages should be used. On the surface, the original poster's intention to use Visual Basic seems appropriate, as one would expect there to exist some kind of Excel-specific parsing code in VB.
--- rod.

RandiR 03-28-2009 04:59 PM

A short script to check if a number exists in excel spreadsheet
 
Here is my small script. Your notepad file is list.txt, your excel file is data.csv (I assume). This script will go thru all entries in list.txt, and print only those entries that are NOT in data.csv .

Code:

var str data ; cat "data.csv" > $data
var str list ; cat "list.txt" > $list
while ($list <> "")
do
    # Get the next number from $list.
    var str number ; lex "1" $list > $number
    # Print $number if not in $data
    if ( { sen ("^"+$number+"^") $data } <= 0 )
        echo $number
    endif
done

Save this script as C:/Scripts/myfind.txt and run it from biterscripting as follows.

Code:

script myfind.txt
If you need to add the newly found numbers at the end of data.csv, use the following command.

Code:

script myfind.txt >> data.csv

I wrote this in biterscripting. You can use other languages by following the same logic. To try this in biterscripting, install biterscripting. The installation directions are at biterscripting.com/install.html . Biterscripting is free and installs quickly.


Randi

theNbomr 03-28-2009 06:41 PM

Hmm. In my mind, 'Excel Spreadsheet' != 'CSV file'. Sure, it can be exported as a CSV file, but it loses functionality, and needs to be done manually. If it were as simple as a CSV file, it would probably be a bash one-liner.

So what is the raison d'être for this biterscript that I've never heard of before?

--- rod.

AnanthaP 03-29-2009 09:37 AM

Hey,

Looks like the OP is using a windows environment full and full. He has a nice solution in VB script. (May or may not work).

Next he wants to parse a PDF file. As far as I know, VB Script doesn't have an interface for it.

I suggest that he get the APIs for PDF. Or maybe some kind person can point him to a perl module to read PDF. (Like theNbomr's first post).

End

offerd15 04-01-2009 07:25 PM

I have hit a big snag! I'm convinced that VB Script is the easiest way to eventually do what i'm trying to do but...
1. I still can't figure out how to collect the values from a pdf file.

2. Is there a way I can take my .txt file and look for the values in multiple files, perhaps search in a folder?

paulsm4 04-01-2009 08:31 PM

Offerd -

To summarize previous posts:

1. "Excel" is NOT "CSV"
An Excel file has a (closed!) binary format - you need special libraries or programs to read it.

A CSV file is a simple text format: you can easily do it in any language.

A PDF file is another (different!) example of a closed, binary format. You'll need something else to work with it.

2. You didn't tell us what platform you were on, or what languages you were comfortable using.

It appears you're on Windows...
... so you don't have ready access to tools (like Perl, bash, or awk) that'll make your life easier. You can install any/all of these on Windows, but Windows is rather hostile to the *nix philosophy of "small tools that do one thing well, and play together well". Of Perl, bash and awk, you'll probably have the best luck with Perl. Or some other "scripting" language, like Python.

3. If you insist on a scripting language like VBS (or "WSH", or ".bat" files, or whatever particular variant you have) ... you probably owe it to yourself to *at least* get the most capable variant.

And that's Windows Powershell:
http://www.microsoft.com/windowsserv...l/default.mspx

'Hope that helps .. PSM

offerd15 04-01-2009 10:01 PM

Thanks Paulsm4

However, I am so close to finishing this using VB Script in Windows. I know of a program (that I can purchase) to be able to extract the data I need from the pdf files so the only problem left is to search through multiple files. I am still trying to figure out the "for each file in folder" coding. Here is what I have:
Code:

Const xlValues = -4163
FOR_READING = 1
strFilePath = "C:\Projects\test.txt"

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objTS = objFS.OpenTextFile(strFilePath, FOR_READING)
Set outfile = objFS.CreateTextFile("c:\Projects\testout.txt")
Set folder = objFS.GetFolder("c:\Invoices\")

Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True

for each file in folder.Files

Set objWorkbook = objExcel.Workbooks.Open(file.path, ForReading)
Set objWorksheet = objWorkbook.Worksheets("mooc")

Set objRange = objWorksheet.UsedRange

Do Until objTS.AtEndOfStream
Invoice = objTS.Readline
Set objTarget = objRange.Find(Invoice)
If objTarget Is Nothing Then
        outfile.writeline(Invoice)
End If

Loop
objWorkbook.close


Next


I need to change around the loop so that if the value isn't found in the first file, it continues to the next file in the folder until its checked all the files...and then if it still isn't found, writes it to "outfile" and then starts over from the first file for the next value in "text.txt" I hope that makes sense haha.

paulsm4 04-01-2009 10:38 PM

Something like this perhaps?
Code:

  ' Create Filesystem object
  Set objFS = CreateObject("Scripting.FileSystemObject")
  Set objFolder = objFS.GetFolder("c:\Invoices\")

  'Now, use a for each...next to loop through the Files collection
  Dim objFile
  Dim objExcel
  For Each objFile in objFolder
    ' Try to open file as excel spreadsheet
    if <<someFunction (objFile, objExcel)>> then
      ' Try to find the value you're looking for
      if <<someOtherFunction (objExcel,results)>> then
        ' Update results
        <<yetAnotherFunction (results, outfile)>>
      end if
    end if
  Next

'Hope that helps .. PSM


All times are GMT -5. The time now is 02:38 PM.