LinuxQuestions.org
Old 10-25-2012, 06:46 PM   #1
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Rep: Reputation: 86
wanting to process v-card data


I have v-card files (VCF) that have contact pictures encoded in them. The spec says that these are "base64" encoded if they are not a URL to the photo file. The original photo files are who knows where.

Can someone tell me how to take these photo-blocks and decode them into images?

Does anyone know of a Linux-based contact manager or address book that can do all or part of this processing? In summary:
  1. Open a VCF file
  2. decode any photo block into its image
  3. save the image into a file
  4. write the VCF fields to a comma-separated record
    using "normalized" field names and logging "broken" v-cards (too much missing detail, etc)
  5. name the image as a URL in the record
  6. repeat for all contacts in the file
  7. store the CSV data into a MySQL table
  8. repeat for all VCF files
  9. spindle and mutilate the MySQL table to remove duplicates
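
Steps 1, 4 and 6 above can be sketched in a few lines of Python, as a rough illustration only. The function names (unfold, split_cards, first_value, cards_to_csv) and the two CSV columns are made up for this example, and treating a card without an FN line as "broken" is just one possible rule. It assumes vCard 3.0 style folding, where a continuation line starts with a space or a tab.

```python
import csv
import io


def unfold(text):
    """Join folded lines: a line starting with space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def split_cards(text):
    """Yield one list of content lines per BEGIN:VCARD ... END:VCARD block."""
    card = None
    for line in unfold(text):
        marker = line.strip().upper()
        if marker == "BEGIN:VCARD":
            card = []
        elif marker == "END:VCARD" and card is not None:
            yield card
            card = None
        elif card is not None and line.strip():
            card.append(line)


def first_value(card, name):
    """Return the value of the first property called `name`, or '' if absent."""
    for line in card:
        head, sep, value = line.partition(":")
        if sep and head.split(";")[0].upper() == name:
            return value
    return ""


def cards_to_csv(text, out):
    """Write one CSV row per usable card; return the number of 'broken' cards."""
    writer = csv.writer(out)
    writer.writerow(["full_name", "email"])  # hypothetical column names
    broken = 0
    for card in split_cards(text):
        name = first_value(card, "FN")
        if not name:  # assumption: no FN means the card is "broken"
            broken += 1
            continue
        writer.writerow([name, first_value(card, "EMAIL")])
    return broken


if __name__ == "__main__":
    demo = "BEGIN:VCARD\nVERSION:3.0\nFN:Example Person\nEMAIL:person@example.org\nEND:VCARD\n"
    buffer = io.StringIO()
    cards_to_csv(demo, buffer)
    print(buffer.getvalue())
```

Splitting on the first colon is a shortcut; a dedicated vCard parser will cope with escaped characters and other edge cases that this sketch ignores.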

Thanks in advance,
~~~ 0;-Dan
 
Old 10-25-2012, 07:15 PM   #2
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,203

Rep: Reputation: 643
do the usual suspects handle your vcf format (thunderbird, evolution, ...) ?

else you may need to do something custom with dd or write a c-program to grab the bytes you want.

the rest of the stuff seems script-able; what does "normalized" (#4) mean ?
 
1 members found this post helpful.
Old 10-25-2012, 08:48 PM   #3
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.6, Centos 5.10
Posts: 16,324

Rep: Reputation: 2041
I'd definitely use Perl for that using eg http://search.cpan.org/~llap/Text-vC...Addressbook.pm, although I'd also ask over at perlmonks.org (it's where the Perl gurus hang out), in case there are even better/easier modules/techniques available.

Perl has the module http://search.cpan.org/~capttofu/DBD...b/DBD/mysql.pm for MySQL, for which you'll also need the DBI module.
If you have installed MySQL, you may already have those available. You should be able to get them from your repo, not CPAN direct.

Last edited by chrism01; 10-25-2012 at 08:52 PM.
 
1 members found this post helpful.
Old 10-26-2012, 08:57 AM   #4
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,203

Rep: Reputation: 643
i haven't used any of these programs but they may be of interest to you:
Code:
[schneidz@hyper ide-34]$ yum search vcard
Loaded plugins: refresh-packagekit
BlueBubble                                                                                                       | 3.6 kB     00:00     
fedora-chromium                                                                                                  | 3.4 kB     00:00     
rpmfusion-free-updates                                                                                           | 3.3 kB     00:00     
rpmfusion-nonfree-updates                                                                                        | 3.3 kB     00:00     
updates/metalink                                                                                                 |  17 kB     00:00     
========================================================== N/S Matched: vcard ==========================================================
perl-Text-vCard.noarch : Package to edit and create a single vCard (RFC 2426)
trytond-party-vcarddav.noarch : party-vcarddav module for Tryton
python-vobject.noarch : A python library for manipulating vCard and vCalendar files

  Name and summary matches only, use "search all" for everything.
also, according to wikipedia the pic is preceded by the tag PHOTO so maybe you can manually scrape from that point till you hit the next tag and redirect the result to a file.

good luck.
 
1 members found this post helpful.
Old 10-26-2012, 11:25 AM   #5
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Original Poster
Rep: Reputation: 86
Quote:
Originally Posted by schneidz View Post
do the usual suspects handle your vcf format (thunderbird, evolution, ...) ?

else you may need to do something custom with dd or write a c-program to grab the bytes you want.

the rest of the stuff seems script-able; what does "normalized" (#4) mean ?
When I say, "normalized," I mean that I would take the various different names for the vcard data items and select one set of names. It really is surprising how many different ways the vcards store first and last names, multiple phone numbers, multiple email addresses ... and then there are various extended field variations.
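
As a rough illustration of that kind of normalization, the Python sketch below maps a property name plus its TYPE to a single chosen column name. The NORMALIZED table and the column names in it are invented for the example; a real table would be much longer. It accepts both the vCard 3.0 form (TEL;TYPE=CELL) and the older bare form (TEL;CELL).

```python
# Hypothetical mapping from (property, type) to one chosen column name.
NORMALIZED = {
    ("TEL", "CELL"): "phone_mobile",
    ("TEL", "HOME"): "phone_home",
    ("EMAIL", "HOME"): "email_home",
    ("EMAIL", "WORK"): "email_work",
}


def normalize(line):
    """Map one unfolded content line to (column, value), or None if unmapped."""
    head, sep, value = line.partition(":")
    if not sep:
        return None
    parts = head.split(";")
    name = parts[0].upper()
    types = []
    for param in parts[1:]:
        key, eq, val = param.partition("=")
        if key.upper() == "TYPE":
            # TYPE may carry several comma-separated values, e.g. HOME,VOICE
            types.extend(t.strip().upper() for t in val.split(","))
        elif not eq:
            # vCard 2.1 style bare type, e.g. TEL;CELL
            types.append(key.upper())
    for t in types:
        if (name, t) in NORMALIZED:
            return NORMALIZED[(name, t)], value
    return None


if __name__ == "__main__":
    print(normalize("TEL;TYPE=HOME,VOICE:555-0101"))
```

Lines that come back as None are the ones to log and inspect, since they point at names the table does not know yet.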

Thanks,
~~~ 0;-Dan
 
Old 10-26-2012, 11:28 AM   #6
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Original Poster
Rep: Reputation: 86
Quote:
Originally Posted by schneidz View Post
...
Name and summary matches only, use "search all" for everything.
also, according to wikipedia the pic is preceded by the tag PHOTO so maybe you can manually scrape from that point till you hit the next tag and redirect the result to a file.
...
Yes, and there are X-something fields for photos and company logos and such.
My troubles start once I have the base64 block stripped to a file.
What do I do next?

Is it simply 'uudecode' of the block?
Which photo format do I get for the decode results?
If it restores the original format, how do I know what that was?

(laugh) That's why this is called a project.

Thanks,
~~~ 0;-Dan
 
Old 10-26-2012, 11:30 AM   #7
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Original Poster
Rep: Reputation: 86
Somewhere out there in package land there must be an address book or contact list that will read most of the popular vCard file formats...

Grrr Arrgghh,
~~~ 8d;-< Dan
 
Old 10-26-2012, 11:36 AM   #8
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Original Poster
Rep: Reputation: 86
Wikipedia offers the following explanation of Base64 http://en.wikipedia.org/wiki/Base64

Quote:
A quote from Thomas Hobbes' Leviathan:
Code:
Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure.
represented as a byte sequence of 8-bit-padded ASCII characters is encoded in MIME's Base64 scheme as follows:
Code:
TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlz
IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg
dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu
dWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRo
ZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=
In the above quote, the encoded value of Man is TWFu. Encoded in ASCII, the characters M, a, and n are stored as the bytes 77, 97, and 110, which are the 8-bit binary values 01001101, 01100001, and 01101110. These three values are joined together into a 24-bit string, producing 010011010110000101101110. Groups of 6 bits (6 bits have a maximum of 2^6 = 64 different binary values) are converted into individual numbers from left to right (in this case, there are four numbers in a 24-bit string), which are then converted into their corresponding Base64 character values.
Code:
Text content    M         a         n
ASCII           77        97        110
Bit pattern     01001101  01100001  01101110
6-bit groups    010011  010110  000101  101110
Index           19      22      5       46
Base64-encoded  T       W       F       u
As this example illustrates, Base64 encoding converts 3 octets into 4 encoded characters.

The Base64 index table:
Code:
Value	Char	Value	Char	Value	Char	Value	Char
0	A	16	Q	32	g	48	w
1	B	17	R	33	h	49	x
2	C	18	S	34	i	50	y
3	D	19	T	35	j	51	z
4	E	20	U	36	k	52	0
5	F	21	V	37	l	53	1
6	G	22	W	38	m	54	2
7	H	23	X	39	n	55	3
8	I	24	Y	40	o	56	4
9	J	25	Z	41	p	57	5
10	K	26	a	42	q	58	6
11	L	27	b	43	r	59	7
12	M	28	c	44	s	60	8
13	N	29	d	45	t	61	9
14	O	30	e	46	u	62	+
15	P	31	f	47	v	63	/
When the number of bytes to encode is not divisible by 3 (that is, if there are only one or two bytes of input for the last block), then the following action is performed: Add extra bytes with value zero so there are three bytes, and perform the conversion to base64. If there was only one significant input byte, only the first two base64 digits are picked, and if there were two significant input bytes, the first three base64 digits are picked. '=' characters might be added to make the last block contain four base64 characters.

As a result: When the last group contains one octet, the four least significant bits of the final 6-bit block are set to zero; and when the last group contains two octets, the two least significant bits of the final 6-bit block are set to zero.
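
The grouping and padding rules quoted above can be checked with a short hand-rolled encoder. This is only a teaching sketch (the function name encode is made up here); for real work the standard base64 module does the same job, and the last lines compare the two.

```python
import base64

# The 64-character index table from the quote above, in order 0..63.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode(data: bytes) -> str:
    """Base64-encode `data` by hand, three input bytes at a time."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        pad = 3 - len(chunk)  # 0, 1 or 2 missing bytes in the last group
        # Fill up with zero bytes and read the group as one 24-bit number.
        n = int.from_bytes(chunk + b"\x00" * pad, "big")
        # Cut the 24 bits into four 6-bit indexes, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Keep only the significant characters, then pad with '='.
        out.append("".join(chars[:4 - pad]) + "=" * pad)
    return "".join(out)


if __name__ == "__main__":
    for sample in (b"Man", b"Ma", b"M"):
        # Each hand-made result should match the library's answer.
        assert encode(sample) == base64.b64encode(sample).decode("ascii")
        print(sample, encode(sample))
```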
 
Old 11-17-2012, 01:07 AM   #9
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Original Poster
Rep: Reputation: 86
It appears that regardless of which image format (jpg, png, etc) one supplies as the contact photo, its bits get encoded and stored and written into the corresponding vCard file.

When you decode the vCard file, how do you know which image format to use when storing the results?

~~~ 0;-Dan
 
Old 11-17-2012, 01:48 AM   #10
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Slackware 10.1/10.2/12, Ubuntu 12.04, Crunchbang Statler
Posts: 3,786

Rep: Reputation: 282
Not sure if this is of help

man base64; that should get you going with the decoding part
To determine what kind of file it is, you can use the file command

With some bash scripting, you should be good to go.

I did not research it, but I'm quite sure that you can find a library for the programming language of your choice to do the base64 decoding. The same might apply for functionalities provided by 'file'.
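
In Python, for instance, the counterpart of the base64 command is the standard base64 module. A minimal sketch, assuming the photo block has already been cut out of the vCard as text; the function name decode_photo_block is made up for the example:

```python
import base64
import binascii


def decode_photo_block(block: str) -> bytes:
    """Decode a base64 block copied from a vCard, ignoring line breaks and indentation."""
    compact = "".join(block.split())  # drop newlines and folding spaces
    compact += "=" * (-len(compact) % 4)  # tolerate missing '=' padding
    try:
        return base64.b64decode(compact, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid base64: {exc}") from exc


if __name__ == "__main__":
    data = decode_photo_block("TWFu\n IGlz")
    print(data)
```

The decoded bytes can then be written to disk with open(path, "wb") and handed to file.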
 
2 members found this post helpful.
Old 11-19-2012, 03:26 AM   #11
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Original Poster
Rep: Reputation: 86
Quote:
Originally Posted by Wim Sturkenboom View Post
Not sure if this is of help

man base64; that should get you going with the decoding part
To determine what kind of file it is, you can use the file command
I regret that I failed to adequately state the question.

A vCard has a photo blob (binary large object) encoded as Base64.
Process the vCard file and extract the blob characters, then decode them from Base64. Now you have a blob that is binary. Write the binary to disk separated from the vCard file.

At this point we have someName.vcf and namePhoto.dat.
(I call it a DAT file, because I do not know which photo image format was supplied during the original encoding process.)
Is there some vCard item that tells me that the photo is JPG vs. PNG vs. GIF vs. ??? Alternately, am I faced with reading the DAT file bits looking for a file format signature and then guessing at the source format supplied to the original encoding? Are you telling me that 'file' will read and interpret those DAT file bits?

Hoping I'm clear this time.
~~~ 0;-/ Dan

Last edited by SaintDanBert; 11-19-2012 at 03:27 AM.
 
Old 11-19-2012, 04:47 AM   #12
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Slackware 10.1/10.2/12, Ubuntu 12.04, Crunchbang Statler
Posts: 3,786

Rep: Reputation: 282
Quote:
Are you telling me that 'file' will read and interpret those DAT file bits?
Yes; you can have an image with a txt extension and file will tell you that it's a jpg

Code:
wim@aa0:~/images$ cp IMGP8955.JPG IMGP8955.txt
wim@aa0:~/images$ file IMGP8955.txt 
IMGP8955.txt: JPEG image data, EXIF standard 2.21
wim@aa0:~/images$ file IMGP8955.JPG
IMGP8955.JPG: JPEG image data, EXIF standard 2.21
wim@aa0:~/images$
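
The same idea, looking at the first few bytes instead of the file name, is easy to reproduce in a script. The sketch below knows only a handful of common image signatures, far fewer than file does; the names SIGNATURES and guess_extension are made up for the example.

```python
# Leading "magic" bytes of a few common image formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
]


def guess_extension(data: bytes, default: str = "dat") -> str:
    """Return a file extension based on the leading bytes, or `default` if unknown."""
    for magic, ext in SIGNATURES:
        if data.startswith(magic):
            return ext
    return default


if __name__ == "__main__":
    print(guess_extension(b"GIF89a" + b"\x00" * 10))
```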
 
2 members found this post helpful.
Old 11-19-2012, 08:14 AM   #13
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,203

Rep: Reputation: 643
Quote:
Originally Posted by SaintDanBert View Post
I regret that I failed to adequately state the question.

A vCard has a photo blob (binary large object) encoded as Base64.
Process the vCard file and extract the blob characters, then decode them from Base64. Now you have a blob that is binary. Write the binary to disk separated from the vCard file.

At this point we have someName.vcf and namePhoto.dat.
(I call it a DAT file, because I do not know which photo image format was supplied during the original encoding process.)
Is there some vCard item that tells me that the photo is JPG vs. PNG vs. GIF vs. ??? Alternately, am I faced with reading the DAT file bits looking for a file format signature and then guessing at the source format supplied to the original encoding? Are you telling me that 'file' will read and interpret those DAT file bits?

Hoping I'm clear this time.
~~~ 0;-/ Dan
where are you stuck ? do you know the byte offset where the picture begins and ends ? it seems like a simple c program can scrape off the necessary bytes (fgetc()). not sure how base64 fits into this but if you need to convert something there seems to be a standard program for it. then the file command will tell you the picture format.

regards,
schneidz
 
1 members found this post helpful.
Old 11-19-2012, 11:44 AM   #14
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: Austin, TX
Distribution: Mint-15 with Cinnamon & KDE
Posts: 1,368
Blog Entries: 3

Original Poster
Rep: Reputation: 86
Quote:
Originally Posted by schneidz View Post
where are you stuck ? do you know the byte offset where the picture begins and ends ? it seems like a simple c program can scrape off the necessary bytes (fgetc()). not sure how base64 fits into this but if you need to convert something there seems to be a standard program for it. then the file command will tell you the picture format.

regards,
schneidz
A vCard, or VCF data file, has a PHOTO attribute:
Code:
BEGIN:VCARD
VERSION:3.0
LABEL;TYPE=HOME:10511 Weller Drive\nAustin\, TX\n78750-2566\nUSA
TEL;TYPE=CELL;X-EVOLUTION-UI-SLOT=2:512-413-5611
TEL;TYPE=HOME,VOICE;X-EVOLUTION-UI-SLOT=1:512-331-8217
X-MOZILLA-HTML:FALSE
X-EVOLUTION-VIDEO-URL:
FBURL:
X-EVOLUTION-BLOG-URL:
NOTE:
X-EVOLUTION-SPOUSE:
X-EVOLUTION-ASSISTANT:
CALURI:
TITLE:General Manager
X-EVOLUTION-MANAGER:
ROLE:Technical Writer\, Educator\, Coach
ORG:The GRILLON Group;;
REV:2010-11-30T20:38:31Z
PHOTO;ENCODING=b;TYPE="X-EVOLUTION-UNKNOWN":iVBORw0KGgoAAAANSUhEUgAAAIoAAAC
 WCAIAAACO8YfTAAAAA3NCSVQICAjb4U/gAAAgAElEQVR42uy9d5hcx3Unek5V3Xv7dk4zPXkwy
 DkDBAkmMQdRVCQpybJl2U/Bb71eB63t7+2zbD/Zlr226WdbWmXJkkiJokhKVGAmwQAGAETOAww
 mx57p3DdV1dk/ejAcUaSlJUGIsllf48M3+Xb96qTf+VUVaq3hZwYiKqWCIPA8Lx6Pw1vjlzTw1
 eAhIkRsfEhEb83Umwiet8abZLBXxQ3xLaN5y3reGq/Jet4ab8Hz1ngLnrfgeWu8Bc9b8Lw13oL
...
 LiOj7fiVKDQCdToeZPc+r1WpRFJVl2el0lFJhGCZJ0u12G41GkiQVOVIcx9baKKytHDk6n8+A/
 Ga743keo4hbi+vHjkkpR/1ngbnV6QAgCJnO58bYWnthOB7XgqhZb5j6pF6vW2ezLO2ErtPpWGs
 X26Lm14NuOJ3OgHQY+Mu9pXma+mHUarWIZKu9EASBp9RSb9nzPERsNpsV3eUD7HU9zGxm/9f4V
 znGoii2trbiOO71el8Uevf/15PML/BHJ6oKoyqzaDQa8LAuuX3q638CQGJndxFrAZ8AAAAASUV
 ORK5CYII=
UID:pas-id-4F4EBB80000001A3
N:St.Andre;Dan;;;
FN:Dan St.Andre
NICKNAME:saint
X-EVOLUTION-FILE-AS:Saint-Andre\, Dan
ADR;TYPE=HOME:;;10511 Weller Drive;Austin;TX;78750-2566;USA
URL:
EMAIL;TYPE=HOME:saint@grillongroup.org
EMAIL;TYPE=WORK:dan.st.andre@grillongroup.org
END:VCARD
When you supply an image file (JPG, PNG, GIF, etc.) to your address book program, it encodes the image
and stores it. When you save that contact as a vCard, the encoded image data gets written into the PHOTO attribute.

I can decode the PHOTO attribute back to its original binary. However, I have yet to discover
how to programmatically learn with photo image format to declare when I name the saved binary
after the decode.
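
For reference, vCard 3.0 lets the PHOTO line carry the format in its TYPE parameter (for example TYPE=JPEG), but in the card above Evolution wrote TYPE="X-EVOLUTION-UNKNOWN", so the parameter cannot be relied on. The data itself still tells: the block starts with iVBORw0KGgo, which is the base64 form of the 8-byte PNG signature. The Python sketch below pulls out the parameters and the decoded bytes of the first base64 PHOTO line. The function name find_photo is made up, and it assumes ENCODING=b (vCard 3.0) or ENCODING=BASE64 (vCard 2.1); URL-valued photos are skipped.

```python
import base64


def find_photo(vcard_text):
    """Return (params, image_bytes) for the first base64 PHOTO property, or None."""
    # Unfold first: continuation lines start with a space or a tab.
    lines = []
    for raw in vcard_text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    for line in lines:
        head, sep, value = line.partition(":")
        parts = head.split(";")
        if not sep or parts[0].upper() != "PHOTO":
            continue
        # Collect parameters such as ENCODING=b and TYPE="...".
        params = {}
        for param in parts[1:]:
            key, _, val = param.partition("=")
            params[key.upper()] = val.strip('"').upper()
        if params.get("ENCODING") in ("B", "BASE64"):
            return params, base64.b64decode(value)
    return None


if __name__ == "__main__":
    card = 'BEGIN:VCARD\nPHOTO;ENCODING=b;TYPE="X-EVOLUTION-UNKNOWN":iVBORw0K\n Ggo=\nEND:VCARD\n'
    params, data = find_photo(card)
    print(params["TYPE"], data[:8])
```

When TYPE is missing or unhelpful, as here, the first bytes of the decoded data are the thing to inspect.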

~~~ 0;-Dan
 
Old 11-19-2012, 01:11 PM   #15
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,203

Rep: Reputation: 643
Quote:
Originally Posted by SaintDanBert View Post
...
I can decode the PHOTO attribute back to its original binary. However, I have yet to discover
how to programmatically learn with (sic: did you mean what) photo image format to declare when I name the saved binary
after the decode.

~~~ 0;-Dan
^huh ? use either file or identify (if imagemagick is installed) to identify the file type.
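
If the decoding is done from a script, file can be called from there as well. A small Python sketch, assuming the file command is on the PATH and understands the --brief and --mime-type options (the function name mime_type_from_file is made up for the example); it returns None when file is missing or the path does not exist:

```python
import os
import shutil
import subprocess


def mime_type_from_file(path):
    """Ask the `file` command for a MIME type such as 'image/png'; None on failure."""
    exe = shutil.which("file")
    if exe is None or not os.path.isfile(path):
        return None
    result = subprocess.run(
        [exe, "--brief", "--mime-type", "--", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(mime_type_from_file("namePhoto.dat"))
```

The returned MIME type can then be used to pick the extension when renaming the decoded .dat file.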

also, is that your real address ?
 
1 members found this post helpful.
  

