LinuxQuestions.org


mythcat 04-25-2010 12:28 PM

bash and sed or awk commands to set tabs ?
 
I have several text files.
Their "tab" widths differ (3, 4, 5, or 6 space characters).
I want to use "sed" or "awk" to set them all to the same tab width (for example, five space characters).
What is the easiest way to do this?

Thank you.

choogendyk 04-25-2010 12:44 PM

Awk ought to be pretty easy, but it depends on what you are talking about. I presume the files don't actually have the tab character in them (then the spacing would be determined by the terminal settings or the program displaying them), but rather spaces to the effect of the tab positions. If, then, the items that are separated by spaces have no spaces embedded, and there are a known number of items on the line, it's an awk one liner. If the number of items on the line varies, if there are embedded spaces, if you allow quoting, if the format of lines varies (say heading lines that have no tabs), then it gets more complicated.
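
As a rough sketch of the simple case described above, assuming exactly three fields per line with no embedded spaces (the function name align_columns and the 8-character column width are illustrative choices, not something from the thread):

```shell
# align_columns: read lines of three whitespace-separated fields on stdin
# and print them again in fixed-width, left-aligned columns.
align_columns() {
    # %-8s pads each of the first two fields to 8 characters;
    # the last field is printed as-is.
    awk '{ printf "%-8s%-8s%s\n", $1, $2, $3 }'
}

printf 'a   b      c\n' | align_columns
```

Lines with a different number of fields, or with quoted fields containing spaces, would need more than this.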

Describe your file formats a little more completely if you want something more specific.

pixellany 04-25-2010 01:52 PM

I don't think you really mean tabs; it sounds like the spacing is set with the space character.

In sed, it is trivial to match a specified number of characters, and then replace them with the desired number.
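
A minimal sketch of that idea, assuming the "tabs" are runs of two or more spaces between items and that the target width is five spaces (normalize_gaps is a hypothetical name; -E is the extended-regex flag accepted by GNU and BSD sed):

```shell
# normalize_gaps: replace every run of 2 or more spaces on stdin
# with exactly 5 spaces.
normalize_gaps() {
    sed -E 's/ {2,}/     /g'
}

printf 'a   b      c\n' | normalize_gaps
```

Single spaces are left alone, so ordinary sentences are not affected, but this also means it cannot tell a deliberate double space from a column gap.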

mythcat 04-26-2010 03:21 PM

Because it is not just one file, maybe awk is a good solution.
But I have not used it so far ...
Yes, they are space characters; the tab width is set by the editors.

Tinkster 04-26-2010 03:52 PM

If you're talking about code or markup and varied depth of indentation I'd
use a tool that matches your language to do the cleanup; e.g.: perltidy,
htmltidy, ...

Otherwise you'll be hard-pressed to know whether the 6 spaces are a too
deeply indented 4 or an 8 not indented deeply enough...

If the above isn't a problem:
In a "brute-force" approach you could just use sed ...
Code:

sed -i.bak -r 's/^ {4,}/    /' *
(read: replace any occurrence of 4 or more spaces at the beginning of a line
with 4 spaces)



Cheers,
Tink

catkin 04-26-2010 03:53 PM

Would pr with the -e option help? It can remove tabs, replacing them with spaces to suit the tab positions specified.
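
For illustration, a small sketch of tab expansion with pr, assuming real tab characters in the input. The function names are made up for this example; -t is added because pr otherwise paginates its output with headers. expand does the same conversion and is shown for comparison.

```shell
# tabs_to_spaces_pr WIDTH: expand tabs on stdin to tab stops every WIDTH columns.
# -t suppresses pr's page headers and trailers.
tabs_to_spaces_pr() {
    pr -t -e"$1"
}

# tabs_to_spaces_expand WIDTH: the same conversion using expand.
tabs_to_spaces_expand() {
    expand -t "$1"
}

printf 'a\tb\n' | tabs_to_spaces_pr 4
printf 'a\tb\n' | tabs_to_spaces_expand 4
```

Both commands pad to the next tab stop rather than inserting a fixed number of spaces, so a tab after one character at width 4 becomes three spaces.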

choogendyk 04-26-2010 05:57 PM

mythcat, how about being a bit more talkative? ;)

How about an explicit description and sample of your files and the content and context? We're all just talking in different directions about what possibly might work if . . . without knowing exactly what your files look like. Tinkster was thinking code, where all the tabs are at the beginning for indentation of the code. I was thinking maybe a text version of a data file or spreadsheet that might have many "columns" separated by enough spaces to simulate tabs. What you actually have could be something that doesn't fit either of those. A picture (the file looks like this: several lines of sample) might be worth a lot of words.

mythcat 04-27-2010 03:04 AM

Quote:

Originally Posted by choogendyk (Post 3948620)
mythcat, how about being a bit more talkative? ;)

How about an explicit description and sample of your files and the content and context? We're all just talking in different directions about what possibly might work if . . . without knowing exactly what your files look like. Tinkster was thinking code, where all the tabs are at the beginning for indentation of the code. I was thinking maybe a text version of a data file or spreadsheet that might have many "columns" separated by enough spaces to simulate tabs. What you actually have could be something that doesn't fit either of those. A picture (the file looks like this: several lines of sample) might be worth a lot of words.

I did not want to just hand over the problem for others to solve. I'm not that lazy, and that would not be nice. However, it turns out not to be so simple.
I have a folder with subfolders. They contain Python files (with the extension .PY).
I want to create a bash script that takes the name of the folder and sets all "tabs" to the same number of spaces. If it can also do other formatting, that is fine too.
That's about it ...

catkin 04-27-2010 03:50 AM

Are you saying that you want to find all *.PY files in a directory and its subdirectories and modify each one, converting the tabs to spaces?

If so then
Code:

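# Replace <directory> with the top-level directory to search.
while IFS= read -r file
do
    # -t suppresses pr's page headers; -e5 expands tabs to 5-column tab stops
    pr -t -e5 "$file" > "$file.fixed"
done < <(find <directory> -type f -name '*.PY')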

Beware this will not work on .py files (use -iname for that) and, after checking the *.fixed files to make sure they are OK, you will have to move the *.fixed files to the original names.

EDIT: corrected missing closing parenthesis (red above)
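
A variant of the same loop, sketched with find -exec so that it also runs in a plain POSIX shell and matches .py as well as .PY. The function name expand_py_tabs is hypothetical, expand stands in for pr here, and -iname is assumed to be available (it is in GNU and BSD find).

```shell
# expand_py_tabs DIR WIDTH: for every *.py / *.PY file under DIR, write a copy
# named FILE.fixed with tabs expanded to tab stops every WIDTH columns.
expand_py_tabs() {
    dir=$1
    width=$2
    find "$dir" -type f -iname '*.py' -exec sh -c '
        width=$1; shift
        for f in "$@"; do
            expand -t "$width" "$f" > "$f.fixed"
        done
    ' sh "$width" {} +
}
```

Usage would look like expand_py_tabs ~/projects 5. The originals are left untouched, so the .fixed copies can be checked before anything is renamed.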

grail 04-27-2010 08:22 AM

I am not sure who has the best solution yet, but I thought I might help improve the question, assuming I am on the right track.

Python code is particularly pedantic about the number of spaces/tabs that are used for indentation. Should the indentation
be different by so much as a space it will not run and generate something along the lines of:
Code:

IndentationError: unindent does not match any outer indentation level
So if I am understanding correctly, he wishes to update all of his Python scripts so the tabbing is the same throughout so as to not run into this error.
The problem that I foresee, if I am correct, is that no one-liner will get it all completely correct, for example:
Code:

for line in lines:
        time.sleep(.005)
        print line,

The first line in the for loop above is indented with a single tab (equivalent to 8 spaces) and the second line with 8 actual spaces. Python 3 rejects this mix with a TabError; Python 2 accepts it unless run with -tt.
Also, if the second line had, say, 4 spaces and the first 8, which do you change, if any?
The problem is that maybe the second has only 4 because it relates to a previous "if" or it is at the end of the "for" loop.

JM2C
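
Before converting anything, it may help to know which files actually contain tabs in their indentation. A minimal sketch, where has_tab_indent is a made-up helper name:

```shell
# has_tab_indent FILE: succeed (exit status 0) if any line of FILE has a tab
# character somewhere in its leading whitespace.
has_tab_indent() {
    tab=$(printf '\t')
    # The first tab in the indentation can only be preceded by spaces.
    grep -q "^ *$tab" "$1"
}
```

For example, has_tab_indent script.py && echo "script.py uses tabs" would flag a file for a closer look. Tabs that appear later in a line, such as inside a string, are ignored.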

catkin 04-27-2010 08:39 AM

Quote:

Originally Posted by grail (Post 3949292)
Python code is particularly pedantic about the number of spaces/tabs that are used for indentation. Should the indentation be different by so much as a space it will not run ...

Oh! Then change the postfix in the suggested code above from ".fixed" to ".broken"! :doh:

mythcat 04-27-2010 01:55 PM

Quote:

Originally Posted by catkin (Post 3949025)
Are you saying that you want to find all *.PY files in a directory and its subdirectories and modify each one, converting the tabs to spaces?

If so then
Code:

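# Replace <directory> with the top-level directory to search.
while IFS= read -r file
do
    # -t suppresses pr's page headers; -e5 expands tabs to 5-column tab stops
    pr -t -e5 "$file" > "$file.fixed"
done < <(find <directory> -type f -name '*.PY')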

Beware this will not work on .py files (use -iname for that) and, after checking the *.fixed files to make sure they are OK, you will have to move the *.fixed files to the original names.

EDIT: corrected missing closing parenthesis (red above)

Yes, I want to find all *.PY files in a directory and its subdirectories and modify each one, converting the tabs to a specific number of spaces.
I will try it with *.py files.
Thank you.

mythcat 04-27-2010 02:26 PM

Quote:

Originally Posted by catkin (Post 3949317)
Oh! Then change the postfix in the suggested code above from ".fixed" to ".broken"! :doh:

I don't understand you. This is not a joke...
At first your solution seemed to be good. But when I try your code and take a look,

mythcat 04-27-2010 02:35 PM

Quote:

Originally Posted by grail (Post 3949292)
I am not sure who has the best solution yet, but I thought I might help improve the question, assuming I am on the right track.

Python code is particularly pedantic about the number of spaces/tabs that are used for indentation. Should the indentation
be different by so much as a space it will not run and generate something along the lines of:
Code:

IndentationError: unindent does not match any outer indentation level
So if I am understanding correctly, he wishes to update all of his Python scripts so the tabbing is the same throughout so as to not run into this error.
The problem that I foresee, if I am correct, is that no one-liner will get it all completely correct, for example:
Code:

for line in lines:
        time.sleep(.005)
        print line,

The first line in the for loop above is indented with a single tab (equivalent to 8 spaces) and the second line with 8 actual spaces. Python 3 rejects this mix with a TabError; Python 2 accepts it unless run with -tt.
Also, if the second line had, say, 4 spaces and the first 8, which do you change, if any?
The problem is that maybe the second has only 4 because it relates to a previous "if" or it is at the end of the "for" loop.

JM2C

Yes, that's right, but I have some files with 3-space tabs and other files with 6-space tabs. All the scripts work fine, but I need to format them all with the same tab width, as a fixed number of spaces.
There are many folders (each folder holds one project), and the users work with different methods and editors with different tab settings. Working with all of them is crazy, because one script has 5-space tabs and another has 3, while I use just one editor with a specific number of spaces per tab.
Do you understand?
Such a script would be very useful, and not just for my issue.
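
One possible sketch of such a rescaling step, under some loud assumptions: the file is indented with spaces only, its current indent width is known in advance, and every indentation level is a whole multiple of that width. The name reindent and its two arguments are invented for this example.

```shell
# reindent FROM TO: read a space-indented file on stdin and rescale its
# leading indentation from FROM spaces per level to TO spaces per level.
reindent() {
    awk -v from="$1" -v to="$2" '
    {
        match($0, /^ */)            # RLENGTH = number of leading spaces
        n = RLENGTH
        level = int(n / from)       # whole indentation levels
        extra = n % from            # leftover spaces are kept unchanged
        pad = ""
        for (i = 0; i < level * to + extra; i++) pad = pad " "
        print pad substr($0, n + 1)
    }'
}

printf 'for x in y:\n   if x:\n      print x\n' | reindent 3 5
```

Continuation lines that were aligned by hand, and multi-line strings, may end up shifted, so the result should be checked before the originals are replaced. For Python code, a language-aware formatter is likely the safer route, as suggested earlier in the thread.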

catkin 04-27-2010 03:06 PM

Quote:

Originally Posted by mythcat (Post 3949645)
I don't understand you. This is not a joke...
At first your solution seemed to be good. But when I try your code and take a look,

I was responding to grail's "Python code is particularly pedantic about the number of spaces/tabs that are used for indentation. Should the indentation be different by so much as a space it will not run". I don't know python, so don't know what the requirement is. The code given will replace the tabs with spaces; if it stops the python working it's no good.


All times are GMT -5.