The space after the time is a tab that I've inserted as a delimiter. I need to remove all of the spaces up to but not past the tab in each line. (I want it to stop at the tab because there can be entries for the directories that follow it that might have spaces in their names that need to be preserved.)
I've looked at sed and awk but haven't been able to figure out how to do this.
Actually, my input looks exactly like what I posted. There are leading spaces, then there is a space between each of the size/date/time listings, followed by a tab, then the directories.
I want to delete the leading spaces, spaces between size/date/time, stop there, retaining the tab after time and any spaces in directory/filenames.
I'll see what the things you posted do to a test file.
edit:
Just ran both commands on a test file and each:
Code:
sed 's/^ *//' input.txt | cat -A
sed -r 's/ *(^|$|\t) */\1/g' input2.txt | cat -A
seems to do the opposite of what I was looking for. They both retain the spaces preceding the tab and delete the tab.
Thanks astrogeek! That did it. I like that it's awk too. Also seems to preserve spaces in directories/filenames. And, yes, there's only a single tab per line.
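For anyone reading later: the awk command that worked is not quoted in this post. A minimal sketch of the same idea, assuming a single tab per line as mentioned above, is to make the tab the only field separator and delete the spaces from the first field. The function name strip_before_tab is made up for illustration, and this is not necessarily astrogeek's exact command.

```shell
# Hypothetical helper: remove every space before the first tab, leave the rest alone.
# Reads the named files, or stdin if none are given.
strip_before_tab() {
    awk 'BEGIN { FS = OFS = "\t" }        # split and rejoin on tabs only
         { gsub(/ /, "", $1); print }     # spaces are deleted in field 1 only
    ' "$@"
}
```

With a made-up sample line, printf '  4096 2020-01-02 10:11\t/home/my dir\n' | strip_before_tab | cat -A shows 40962020-01-0210:11^I/home/my dir$, so the tab and the space inside the directory name both survive. Note that a line with no tab at all would lose all of its spaces, since the whole line is then field 1.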
So you do actually want to merge the id, date and time into a single field?!?
That seems like an odd thing to do - makes me think there's perhaps a different underlying issue - but anyway I would do a slight variation of Astrogeek's solution:
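Code:
while read -r num date time dir
do
    # -r keeps backslashes literal; %s (not %d) passes a non-integer first column through untouched
    printf "%s%s%s\t%s\n" "$num" "$date" "$time" "$dir"
done < /tmp/input.txt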
It will cope with spaces in the dirname so long as they're not leading or trailing spaces (which read will strip).
bash is not the fastest, however, so if you have millions of rows you might want to use one of the other solutions. But you did ask how to do it in bash.
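If leading spaces in the directory name do matter, one variation is to split on the tab alone rather than on all whitespace. A rough sketch, assuming bash (it relies on $'\t' and ${var// /}) and one tab per line; strip_before_tab_bash is a made-up name:

```shell
# Hypothetical helper: reads stdin, writes stdout. Needs bash, not plain sh.
strip_before_tab_bash() {
    local meta rest
    # IFS is the tab only, so read never treats a space as a separator
    # and never strips spaces from either variable.
    while IFS=$'\t' read -r meta rest
    do
        # ${meta// /} deletes every space in the part before the tab.
        printf '%s\t%s\n' "${meta// /}" "$rest"
    done
}
```

Call it as strip_before_tab_bash < /tmp/input.txt. Two caveats: a line without a tab comes out with a tab appended, and a final line with no trailing newline is skipped by the loop.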
Yes, that's what I wanted to say too. Split the line on whitespace and reconstruct it, either in bash or in awk/perl/python/sed/whatever.