Old 01-03-2021, 03:13 AM   #1
linustalman
LQ Guru
 
Registered: Mar 2010
Location: Ireland
Distribution: Debian 12 Bookworm
Posts: 5,715

Rep: Reputation: 479
Is it possible to easily detect a space at the end of lines in a long list?


Hi.

I have a large file of words (17000+ lines) and their translations. I sort them with Pluma and check the 'remove duplicates' option. However, I'm sure a fair few lines have a space at the end, so they would not be treated as duplicates. Is there some easy way to find them?

Example of what I mean (notice how the second line has a space at the end so these would be seen as 2 different lines in Pluma):
dog - madra
dog - madra


BTW - 'madra' is the Irish word for 'dog'. :-)

Thanks.

Last edited by linustalman; 01-03-2021 at 11:40 AM. Reason: explained something about language being translated
 
Old 01-03-2021, 03:42 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
Of course - simple regex with anchors. Whether Pluma offers it I don't know, but if you can (pre-)edit those files, it is simple enough with sed (for example) to remove any trailing spaces on any/all lines.
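
As a rough illustration of what the anchor buys you, here is a minimal sketch using grep on a made-up two-line sample (the temporary file and its contents are assumptions, not the poster's data). The `$` anchor restricts the match to a space sitting right at the end of the line.

```shell
# Hypothetical two-line sample; the second line ends in a space.
sample=$(mktemp)
printf 'dog - madra\ndog - madra \n' > "$sample"

# Unanchored: a space anywhere on the line matches, so both lines count.
any_space=$(grep -c ' ' "$sample")

# Anchored with $: only a space immediately before the end of the line matches.
trailing_space=$(grep -c ' $' "$sample")

echo "lines containing a space: $any_space"
echo "lines ending in a space:  $trailing_space"
rm -f "$sample"
```

Without the anchor every line of a word list like this would match, which is why the `$` matters.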
 
Old 01-03-2021, 03:43 AM   #3
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 3,670

Rep: Reputation: Disabled
To see them, cat -A|less. To trim them, sed 's/ *$//'.
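
A small sketch of what the `cat -A` output looks like, assuming GNU cat and a made-up two-line sample: every line end is marked with `$`, so a trailing space shows up as a gap before the marker.

```shell
# Hypothetical sample: identical text, but the second line has a trailing space.
marked=$(printf 'dog - madra\ndog - madra \n' | cat -A)
printf '%s\n' "$marked"
# prints:
# dog - madra$
# dog - madra $
```

Note that `cat -A` also rewrites non-ASCII bytes in `M-` notation, so accented letters in a real word list will look mangled in this view; the file itself is not changed.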

BTW, I'm using this Vim plugin. It highlights trailing whitespace in red.

Last edited by shruggy; 01-03-2021 at 03:58 AM.
 
Old 01-03-2021, 04:11 AM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,841

Rep: Reputation: 7308
Quote:
Originally Posted by shruggy
To see them, cat -A|less. To trim them, sed 's/ *$//'.

BTW, I'm using this Vim plugin. It highlights trailing whitespace in red.
You can highlight those spaces without any plugin, and you can also remove them (in vi).
But that is not really important and is off-topic.
 
Old 01-03-2021, 11:41 AM   #5
linustalman
LQ Guru
 
Registered: Mar 2010
Location: Ireland
Distribution: Debian 12 Bookworm
Posts: 5,715

Original Poster
Rep: Reputation: 479

I should have added that not all lines end in a full stop (most don't).

Examples:
dogs - madraí
Does anyone here speak Irish? - An labhraíonn éinne anseo Gaeilge?
Do it this way. - Déan an bealach seo é.
 
Old 01-03-2021, 11:45 AM   #6
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,841

Rep: Reputation: 7308
A full stop is not a whitespace character, so it is irrelevant here.
Did you try what was suggested?
 
Old 01-03-2021, 02:32 PM   #7
linustalman
LQ Guru
 
Registered: Mar 2010
Location: Ireland
Distribution: Debian 12 Bookworm
Posts: 5,715

Original Poster
Rep: Reputation: 479

Quote:
Originally Posted by pan64
A full stop is not a whitespace character, so it is irrelevant here.
Did you try what was suggested?
Hi pan64.

Code:
cat -A|less irish.txt
seems to just list the lines in the terminal. As did
Code:
sed 's/ *$//' irish.txt
 
Old 01-04-2021, 12:43 AM   #8
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 3,670

Rep: Reputation: Disabled
Quote:
Originally Posted by linustalman
Code:
cat -A|less irish.txt
This should have been
Code:
cat -A irish.txt|less
Quote:
Originally Posted by linustalman
Code:
sed 's/ *$//' irish.txt
And this should have been
Code:
sed -i.bak 's/ *$//' irish.txt
You always look up manpages of the commands you were suggested to run, don't you?
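
To make the difference concrete, here is a minimal sketch on a scratch copy (the directory and sample contents are made up for illustration). Without `-i`, sed only prints the trimmed text to the terminal and leaves the file alone, which matches what was observed above; with `-i.bak` it rewrites the file and keeps the original next to it.

```shell
# Hypothetical sample file in a scratch directory.
workdir=$(mktemp -d)
file="$workdir/irish.txt"
printf 'dog - madra\ndog - madra \n' > "$file"

# Without -i, sed writes the trimmed text to stdout; the file is untouched.
sed 's/ *$//' "$file" > /dev/null
before=$(grep -c ' $' "$file" || true)

# With -i.bak, sed rewrites the file and keeps the original as irish.txt.bak.
sed -i.bak 's/ *$//' "$file"
after=$(grep -c ' $' "$file" || true)
backup=$(grep -c ' $' "$file.bak" || true)

echo "trailing-space lines before: $before, after: $after, in backup: $backup"
rm -rf "$workdir"
```

The `|| true` is only there because `grep -c` exits non-zero when it counts zero matches.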

Last edited by shruggy; 01-04-2021 at 12:47 AM.
 
Old 01-04-2021, 05:15 AM   #9
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Rep: Reputation: 6053
Quote:
Originally Posted by shruggy
You always look up manpages of the commands you were suggested to run, don't you?
No, I think they prefer that you do it for them!
 
Old 01-04-2021, 11:01 AM   #10
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,841

Rep: Reputation: 7308
Interestingly,
Code:
cat -A | less irish.txt
will work (at least it will show the file), but it is still incorrect.
Code:
sed 's/ *$//' irish.txt
will remove spaces at the end of the lines. There may also be tabs or other whitespace, so you may need to modify that regexp (if you wish to remove them too).
But to return to your original title: if you only wish to detect those lines, grep would be sufficient:
Code:
grep '  *$' irish.txt  # <<-- there are two spaces inside !!
I still don't know what the goal is here...
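
One possible way to cover tabs and other whitespace as well is the POSIX character class `[[:space:]]`. This is a sketch on a made-up three-line sample, not something tested against the real file; note that the class also matches a carriage return, so it would flag lines from a file with Windows line endings.

```shell
# Hypothetical sample: a clean line, one ending in a space, one ending in a tab.
sample=$(mktemp)
printf 'dog - madra\ndog - madra \ndog - madra\t\n' > "$sample"

# A literal space before $ misses the line that ends in a tab.
space_only=$(grep -c ' $' "$sample" || true)

# [[:space:]] matches spaces, tabs and other whitespace characters.
any_blank=$(grep -c '[[:space:]]$' "$sample" || true)

# -n prints the line number of each offending line.
grep -n '[[:space:]]$' "$sample"

# The same class in sed trims every kind of trailing whitespace.
trimmed=$(sed 's/[[:space:]]*$//' "$sample")
left=$(printf '%s\n' "$trimmed" | grep -c '[[:space:]]$' || true)

echo "space only: $space_only, any whitespace: $any_blank, left after trim: $left"
rm -f "$sample"
```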
 
Old 01-06-2021, 11:41 AM   #11
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,791

Rep: Reputation: 1201
Trim the file with sed:
Code:
sed -i.bak 's/ *$//' irish.txt
leaves a .bak backup i.e. irish.txt.bak

Trim the file with vim:
Load the file
Code:
vim irish.txt
Visualize special characters
Code:
:set list
Trim
Code:
:%s/ *$//
Save and quit
Code:
:wq
See the similarity with the sed command!
The % means "all lines"; this is default in sed.
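
For the duplicate problem that started the thread, trimming and de-duplicating can also be chained on the command line. This is only a sketch: `sort -u` stands in for Pluma's sort with 'remove duplicates' (an assumption, not what the original poster uses), the sample lines are made up, and `LC_ALL=C` is set so lines are compared byte by byte.

```shell
# Hypothetical sample: two "dog" lines that differ only by a trailing space.
list=$(mktemp)
printf 'dog - madra\ndog - madra \ndogs - madraí\n' > "$list"

# Untrimmed, the trailing space makes the two "dog" lines count as different.
raw_unique=$(LC_ALL=C sort -u "$list" | wc -l)

# Trimming first lets sort -u collapse them into one line.
clean_unique=$(sed 's/ *$//' "$list" | LC_ALL=C sort -u | wc -l)

echo "unique lines before trimming: $raw_unique, after trimming: $clean_unique"
rm -f "$list"
```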

Last edited by MadeInGermany; 01-06-2021 at 11:44 AM.
 
Old 01-12-2021, 02:41 AM   #12
linustalman
LQ Guru
 
Registered: Mar 2010
Location: Ireland
Distribution: Debian 12 Bookworm
Posts: 5,715

Original Poster
Rep: Reputation: 479
Quote:
Originally Posted by MadeInGermany
Trim the file with sed:
Code:
sed -i.bak 's/ *$//' irish.txt
leaves a .bak backup i.e. irish.txt.bak

Trim the file with vim:
Load the file
Code:
vim irish.txt
Visualize special characters
Code:
:set list
Trim
Code:
:%s/ *$//
Save and quit
Code:
:wq
See the similarity with the sed command!
The % means "all lines"; this is default in sed.
Thank you, MIG. Your sed command did exactly what I wanted. 👍🏻☺️
 
Old 01-12-2021, 07:14 AM   #13
linustalman
LQ Guru
 
Registered: Mar 2010
Location: Ireland
Distribution: Debian 12 Bookworm
Posts: 5,715

Original Poster
Rep: Reputation: 479

In case anyone was unsure what I meant, here's an example. [see attached image]
Attached image: meld.png
 
Old 01-13-2021, 12:09 PM   #14
linustalman
LQ Guru
 
Registered: Mar 2010
Location: Ireland
Distribution: Debian 12 Bookworm
Posts: 5,715

Original Poster
Rep: Reputation: 479
Quote:
Originally Posted by ondoho
No, I think they prefer that you do it for them!
That's your contribution? 😏
 
Old 01-13-2021, 01:28 PM   #15
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Rep: Reputation: 6053
Quote:
Originally Posted by linustalman
That's your contribution? 😏
Yep.
Ain't I right though.
As your quoted post proves, again.
 
  


All times are GMT -5.