LinuxQuestions.org
Old 09-26-2005, 04:14 AM   #1
synapse
Member
 
Registered: Jan 2004
Location: On Planet Earth.
Distribution: Slackware 12
Posts: 244

Rep: Reputation: 30
Need to extract info out of a binary file


Hi all

I have been trying to find a way to extract the summary data (metadata), specifically the Description field, from a binary file (a SolidWorks drawing or part file).

I have tried using grep and piping the output to the strings command in Linux, which gets me part of the way (not quite there, but workable).

After my investigations I found the following.

1. The Description field does not always occur at the same place in the file.
2. There are always 2 occurrences of the Description field in the file, at different places.
3. Unfortunately I also need to include the - character in my search, which really complicates things.

OK, what am I trying to do?

I have many folders of drawings and parts, and I filled in the Description field whenever I saved a file. Now I want to build a script that searches all the files and outputs each file name with that file's Description field:

Eg

File Name : Description Field.

Am I going about this the wrong way ?

Appreciate the help

Thanx

S
 
Old 09-26-2005, 01:10 PM   #2
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Ideally, you have the specs for the file format and you're prepared to write a parse utility in 'C'.

If all you're looking for is a "description" field, and that description is ASCII text, you might be able to get by with something like this:

strings MYFILE|less

.. or ..

strings MYFILE|grep '\-'|less

... or some variation thereof...

'Hope that helps .. PSM
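
A minimal sketch of how the strings/grep idea above could be run over a whole folder, printing the file name in front of each hit. The function name list_dash_strings and the *.SLDPRT glob are assumptions made for illustration, not something from the thread so far, and `grep -e '-'` is swapped in as another way to search for a literal hyphen.

```shell
#!/bin/sh
# Hypothetical helper: for every part file in a directory, print each
# printable string that contains a hyphen, prefixed with the file name.
list_dash_strings() {
    dir=$1
    for f in "$dir"/*.SLDPRT; do
        [ -f "$f" ] || continue        # skip when the glob matched nothing
        # -e marks the next argument as the pattern, so a bare '-' is safe
        strings "$f" | grep -e '-' | while IFS= read -r line; do
            printf '%s : %s\n' "${f##*/}" "$line"
        done
    done
}
```

Calling it as `list_dash_strings /some/folder` would give one "File Name : string" line per hit. It still prints every hyphenated string rather than only the description, so it is a starting point for narrowing the search down.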
 
Old 09-26-2005, 04:37 PM   #3
cyent
Member
 
Registered: Aug 2001
Location: ChristChurch New Zealand
Distribution: Ubuntu
Posts: 398

Rep: Reputation: 87
Say...
man 5 magic

This manual page documents the format of the magic file as used by the file(1) command, version 4.12. The
file command identifies the type of a file using, among other tests, a test for whether the file begins with a
certain magic number. The file /usr/share/misc/file/magic specifies what magic numbers are to be tested for,
what message to print if a particular magic number is found, and additional information to extract from the
file.

Each line of the file specifies a test to be performed. A test compares the data starting at a particular
offset in the file with a 1-byte, 2-byte, or 4-byte numeric value or a string. If the test succeeds, a message
is printed. The line consists of the following fields:

offset A number specifying the offset, in bytes, into the file of the data which is to be tested.

type The type of the data to be tested. The possible values are:

byte A one-byte value.

short A two-byte value (on most systems) in this machine's native byte order.

long A four-byte value (on most systems) in this machine's native byte order.

string A string of bytes. The string type specification can be optionally followed by /[Bbc]*.
The ``B'' flag compacts whitespace in the target, which must contain at least one whitespace
character. If the magic has n consecutive blanks, the target needs at least n consecutive
blanks to match. The ``b'' flag treats every blank in the target as an optional blank.
Finally, the ``c'' flag specifies case-insensitive matching: lowercase characters in the
magic match both lower and upper case characters in the target, whereas upper case characters
in the magic only match uppercase characters in the target.

..........
 
Old 09-26-2005, 04:51 PM   #4
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Hi, Synapse -

To parse the description field from your Solidworks drawing and parts files, I'd suggest:

1. Try strings MYFILE | grep '\-' | less
... or ...
2. Consider writing your own C-language binary parser

IMHO .. PSM

PS:
You might wish to do a Google search for other alternatives. For example:

http://www.allworldsoft.com/software...solidworks.htm

http://www.softpedia.com/get/Science-CAD/swCP3.shtml

etc.

Last edited by paulsm4; 09-26-2005 at 04:56 PM.
 
Old 09-27-2005, 01:32 AM   #5
synapse
Member
 
Registered: Jan 2004
Location: On Planet Earth.
Distribution: Slackware 12
Posts: 244

Original Poster
Rep: Reputation: 30
Hi,

Thanx for the prompt replies. I will dabble more with this today and get back later. I didn't think of the magic db.

Appreciate the help
cheers
 
Old 09-27-2005, 02:44 AM   #6
synapse
Member
 
Registered: Jan 2004
Location: On Planet Earth.
Distribution: Slackware 12
Posts: 244

Original Poster
Rep: Reputation: 30
Hi

OK, it seems I have a small problem with the output. Finding the word 'Description' itself works, but I actually need the text that follows it. So I'm looking for something like

Description 'description of file.'

The following is the command that I'm working with

cat FILE_NAME | strings | grep -i -m 1 description

This yields the correct output, but I need the output to continue for at least 32 characters.

Thanx again
Cheers
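
One way to make the match "continue" past the keyword is to let grep print only the matched part and widen the pattern by up to 32 bytes. This is a minimal sketch under assumptions: the function name desc_context is made up for illustration, and it only picks up text that follows the keyword on the same line of the file.

```shell
#!/bin/sh
# Hypothetical helper: print the first case-insensitive "description" match
# together with up to 32 bytes that follow it on the same line.
desc_context() {
    # -a treats the binary file as text, -o prints only the matched part.
    # LC_ALL=C keeps the matching byte-oriented, so non-ASCII bytes should
    # not cut the match short.
    LC_ALL=C grep -a -o -i 'description.\{0,32\}' "$1" | head -n 1
}
```

Usage would be `desc_context FILE_NAME`. The `head -n 1` plays the role of `-m 1` here, because with `-o` a single line can produce several matches.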
 
Old 09-27-2005, 04:55 AM   #7
synapse
Member
 
Registered: Jan 2004
Location: On Planet Earth.
Distribution: Slackware 12
Posts: 244

Original Poster
Rep: Reputation: 30
Hi

OK, I played around a bit and this is what I came up with:

grep -a -n -m 1 -H -i -r description /home/tmp/*.SLDPRT | strings > file.txt

This gives me output in the form of the following

/home/tmp/S-DB-SERV 001 301.SLDPRT:8:TADNAMEy0óúuÃ+,ù®+,ù®D+,ù®tINSULATOR SUPPORTDescriptioner_cÿÿsu_CStringArrayÿþÿ T

/home/tmp/S-DB-SERV 001 302.SLDPRT:9:TADNAME+,ù®+,ù®D+,ù®tHEATERVENTING PLATE
Descriptionsu_CStringArrayÿþÿ T

Note how the lines all end in a T

Anyway, I need to remove all the garbage between TADNAME and the description text, plus everything from the word Description onward,

so that I end up with the following

"S-DB-SERV 001 302.SLDPRT" "HEATERVENTING PLATE"

Thanx again
cheers
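
A possible way to get from that raw output to the wanted "file name" "description" pairs, sketched under assumptions drawn only from the two sample lines above: the description is taken to be a run of upper-case letters, digits, spaces, '.', '_' or '-' that ends directly before the word Description, possibly separated from it by control bytes. The function name describe_parts is made up for illustration. Descriptions containing lower-case letters would need a wider character class.

```shell
#!/bin/sh
# Hypothetical helper: print "file name" "description" for each part file.
describe_parts() {
    dir=$1
    for f in "$dir"/*.SLDPRT; do
        [ -f "$f" ] || continue        # skip when the glob matched nothing
        # 1. drop control bytes (NUL, newline, ...) so the text is one line
        # 2. keep the upper-case run that ends right before "Description"
        # 3. take the first hit and strip the trailing keyword
        desc=$(LC_ALL=C tr -d '\000-\037' < "$f" |
               LC_ALL=C grep -a -o '[A-Z0-9][A-Z0-9 ._-]*Description' |
               head -n 1 |
               sed 's/Description$//')
        printf '"%s" "%s"\n' "${f##*/}" "$desc"
    done
}
```

Run as `describe_parts /home/tmp > file.txt`. Since the field reportedly occurs twice per file, only the first occurrence is used. A file with no match prints an empty description, which makes it easy to spot files that need a closer look.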
 
  

