I have many files in a folder from which I need to extract some contents. These are basically text files which have individual lines with (e.g.)
name: john
address: whatever
phone: 123456
Some caveats
1. Sometimes a line might be missing.
name: johnn
phone: 123456
2. The lines do not fall on the same line numbers across files.
I did try some things with awk based on Google searches, but I couldn't extract the data of each file into a single line (this is the ultimate goal):
john,whatever,123456
I have no experience beyond putting some bash scripts together for backup jobs, so I am open to installing anything that could help pull this off.
It does extract the data, but every item still ends up on its own line, and I need to combine them into one line per original file (a CSV file, actually).
That is the part where I got stuck :S
TIA
Last edited by Frakk; 08-28-2010 at 06:38 PM.
Reason: typos
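The goal described above (one CSV line per file, empty slots for missing fields, unrelated lines ignored) can be sketched with a small shell function around awk. This is only a minimal sketch, not a solution taken from the thread: the function name `fields_to_csv` is hypothetical, the field names `name`, `address` and `phone` come from the dummy example, and it assumes each field starts at the beginning of a line and is followed by a colon and a space.

```shell
# fields_to_csv: print one "name,address,phone" line per input file.
# Hypothetical helper; field names follow the dummy example in the question.
fields_to_csv() {
    for f in "$@"; do
        awk '
            {
                sub(/\r$/, "")              # tolerate DOS line endings
                pos = index($0, ": ")       # split at the first ": " only
                if (pos == 0) next          # not a "key: value" line, skip it
                key = substr($0, 1, pos - 1)
                val = substr($0, pos + 2)
                if (key == "name")    name  = val
                if (key == "address") addr  = val
                if (key == "phone")   phone = val
            }
            # A field that never appeared stays empty, so the columns line up.
            END { print name "," addr "," phone }
        ' "$f"
    done
}
```

It could be called as `fields_to_csv /path/to/files/* > result.csv`. Running awk once per file keeps the variables from leaking between files; values that themselves contain commas would need quoting, which this sketch does not attempt.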
Thanks for the suggestion. However, that code is giving me individual lines too and, more importantly, my source files have garbage before the contents I really need, and that is getting dumped too.
Quote:
Originally Posted by xeleema
egrep -i "name:|phone:|address:" /path/to/files/* |\
awk 'NR == 1 { line = $0; next } { line = line "," $0 } END { print line }'
Not sure if I am doing something wrong, but for this data
Code:
nombre_apellido: John
direccion: TheStreet 123
ciudad: TheCity
I am getting these results (I matched the "fields" to my data, as my original example had dummy fields, sorry):
Code:
,ciudad: TheCitytreet 123
It seems like everything is getting piled up. That is from processing a single file; it gets worse when it goes through all of them.
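Output that overwrites itself like this is a typical symptom of carriage returns in the input, and a DOS-formatted file is what this thread later identifies as the cause. A quick check could look like the sketch below; `has_cr` is a hypothetical helper name, and a POSIX shell with `grep` is assumed.

```shell
# has_cr: succeed (exit status 0) if the file contains a carriage return.
# Hypothetical helper for illustration only.
has_cr() {
    cr=$(printf '\r')       # command substitution strips newlines, not CRs
    grep -q "$cr" "$1"
}
```

For example, `has_cr somefile.txt && echo "DOS line endings"`. Looking at the file with `od -c` would show the same thing as `\r \n` pairs at the end of each line.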
... and that is exactly why you should always give a representative example of your data. This was not obvious from your initial post. In fact, your description implied that there is nothing in between.
... these are basically text files which have individual lines with (e.g.)
name: john
address: whatever
phone: 123456
Some caveats
1. Sometimes a line might be missing.
name: johnn
phone: 123456
Quote:
Originally Posted by Frakk
Not sure if I am doing something wrong, but for this data
Code:
nombre_apellido: John
direccion: TheStreet 123
ciudad: TheCity
Quote:
Originally Posted by Frakk
what I need to extract is
userid: 123456
userstatus: 1
usergroup: somegroup
Please make up your mind first and
Quote:
If this still does not match then provide some representative sample data. I am not going to *guess* what your file might look like.
So far you have provided three different scenarios. I provided two solutions, both of which I tested, and they worked on your sample data. Your last post suggests that your data is arranged as in your initial post. That is not representative data. We are going in circles right now.
The original data is in Spanish. I translated it to English so that a foreign language would not get in the way when asking for help in an English-speaking forum.
Quote:
1. Sometimes a line might be missing.
name: johnn
phone: 123456
I meant that sometimes a line might not be present. In that example address is missing and phone comes right after name, in case someone thinks of using line numbers as a reference to identify the data.
Quote:
what I need to extract is
userid: 123456
userstatus: 1
usergroup: somegroup
I didn't say "what I need to extract is"; I said "the data before what I need to extract is".
I intended to illustrate what can be found in the lines prior to the ones I need. Maybe I chose the wrong words...
Quote:
So far you have provided three different scenarios. I provided two solutions, both of which I tested, and they worked on your sample data. Your last post suggests that your data is arranged as in your initial post. That is not representative data. We are going in circles right now.
Maybe that's unnecessarily harsh? Whether the item is called "name" or "nombre_apellido" doesn't really change anything.
If I made a mistake about the contents at the beginning of the documents (and I already apologized), it was because I don't know about this, which is why I need help in the first place.
Thanks again for the help, as I said, it is working now.
I did not mean to be harsh. I just wanted to point out how I perceived the development of the initial problem.
Quote:
The original data is in Spanish ...
Yes, but you also stated in that post that there are lines that are to be excluded from the output. And that does qualify as an altered scenario. The translation alone, of course, does not.
Quote:
I said "the data before what I need to extract is"
Now that I do understand. But your exact words were:
Quote:
the data above...
I must admit I couldn't make heads or tails of it. I thought that by 'above' you were referring to the data you presented in a post 'above'.
So when you said that the command did not work, I assumed that was due to the arrangement of your data. At that point I had already double-checked the command. Since I did not see a Windows logo on the left side of your posts, the possibility of a DOS-formatted file (good work on catching that, by the way) did not cross my mind.
Anyway, glad I could help.
P.S.: A slightly shorter way to convert DOS to UNIX files
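The command that followed this P.S. is not preserved here. As an assumption, and not necessarily what the poster had in mind, one common short way is to delete the carriage returns with `tr`; the helper name `dos2unix_copy` below is hypothetical.

```shell
# dos2unix_copy: write a copy of file $1 to $2 with carriage returns removed.
# Hypothetical helper; note that tr -d '\r' deletes every CR in the file,
# not only the ones directly before a newline.
dos2unix_copy() {
    tr -d '\r' < "$1" > "$2"
}
```

Usage would be `dos2unix_copy infile.dos infile.unix`. Writing to a separate output file avoids truncating the input while it is still being read.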
something
at the
start
name: JohnA
address: TheStreet 123A
phone: TheCityA
...
in the middle
of something
...
name: JohnB
address: TheStreet 123B
phone: TheCityB
name: JohnC
phone: TheCityC
... and to the
end
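Sample input like the one above, with several records in one file, unrelated text in between and an occasionally missing field, can be handled by starting a new output line at each `name:` line. The following is a minimal sketch under that assumption (every record begins with `name:`), which the thread does not state explicitly; `records_to_csv` and `emit` are hypothetical names.

```shell
# records_to_csv: print "name,address,phone" for each record in the input.
# Assumes every record starts with a "name: " line (an assumption).
records_to_csv() {
    awk '
        # Print the record collected so far, then reset all fields.
        function emit() {
            if (seen) print name "," addr "," phone
            name = addr = phone = ""
            seen = 0
        }
        { sub(/\r$/, "") }                                  # tolerate DOS line endings
        /^name: /    { emit(); name = substr($0, 7); seen = 1; next }
        /^address: / { addr  = substr($0, 10); next }
        /^phone: /   { phone = substr($0, 8);  next }
        END { emit() }                                      # flush the last record
    ' "$@"
}
```

Lines that match none of the three patterns are simply ignored, and a record without an `address:` line produces an empty middle column.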