sed or awk to remove $8 in a | delimited file if non-numerical value exists
Hello,
I have a question for you all, I have a text file that is delimited with | between fields. The file has roughly half a million lines in it. I am looking for the fastest way to remove the lines in this file or generate a new file if the pattern in field $8 is non-numerical. I could write a for loop with an if statement to do this with cut, but it would take forever to run, so I'm looking for a good awk or sed one liner that could do this instead. There are many ways to do this, either managing the file directly with a sed statement, or by awk'ing it and outputting everything to another file - but I'm asking the community on what they think the most efficient way to do this, please. Your help is much appreciated. |
So you want to keep the entire line if $8 is numerical or you want to remove those?
Do you need it to handle large numbers or just one or two digits? This will be far easier if you paste an example of how the data is now and how you wish it to end up: e.g., "I have this:" Code:
a | b | c | d | e | f | g | h | i | j Code:
a | b | c | d | e | f | g | h | i | j |
Dayid,
Actually we just figured it out - basically it was to exclude the line entirely if it contained anything other than strictly numbers (including alpha-numeric values), here is what we came up with which did the trick: Code:
# cat oldfile.txt | awk -F \| ' $8 ~ /^[0-9]*$/ ' > newfile.txt Wanted to share this with everyone in the event someone else needed this to work. I appreciate your help Dayid, and in the future I'll pose my questions with examples such as how you mentioned. Thanks again :) |
Quote:
If you have to do it again skip 'cat' as a whole: Code:
awk -F '|' ' $8 ~ /^[0-9]*$/ ' oldfile.txt > newfile.txt |
It may not be important, but remember that your current solution also allows for the eighth field to be empty, ie ||
|
Thanks again everyone for your insight and recommendations, the Linux community never ceases to amaze me with the combined knowledge, have a good one folks.
|
All times are GMT -5. The time now is 02:32 AM. |