I need to create a shell script to search for unknown special characters in a file.
sample:
3 0O
2 SP5
1 U ASAKOU
1 MARTSSAUT ÿ¾åbÿ
2 XID CHDSC
1 STAR
2 ID75
What I need is to print out the line "1 MARTSSAUT ÿ¾åbÿ". I only need to retrieve the visible special characters like "ÿ¾åbÿ". Note that the special characters vary from time to time, so I need a flexible script that excludes [A-Z], [a-z], [0-9], white space, and any characters that can be found on a keyboard, like "\ | * ? ^ # @ ! ~" and so on.
I hope someone can help me with this. I tried using grep, awk and tr, but I cannot seem to get the desired result.
You can use od to get you started. It'll pick all of those out of there, and then you can ignore the regular stuff. (Note that the first field of the default output is the character offset, so you can ignore that, too.)
Ex: with your file
Code:
-bash-3.2$ od -c yourFile
0000000 3 0 O \n 2 S P 5 \n 1 U A
0000020 S A K O U \n 1 M A R T S S A U
0000040 T ÿ ¾ å b ÿ \n 2 X I D C H
0000060 D S C \n 1 S T A R \n 2 I D 7
0000100 5 \n
0000102
The thing is that I need a script that will output only the special characters. The text files that I search are usually composed of hundreds of lines. Using your script would be like searching each line manually for special characters. That is why I need only the lines where special characters are present. In the example, I want the...
1 MARTSSAUT ÿ¾åbÿ
to be the only output of the script.
More power to you and God bless!!
This is a very tedious job, and excruciating if I have to go checking each line. Please help me.. O GOD Help me!!!
I have been struggling with this, but have not solved it.
I would do this.. of course, you'll need to add in the special chars like !@#$%^&&**() etc, I'm not sure if there's an easy thing like A-z or 0-9 with those chars.. but this seems to work:
Code:
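# With GNU sed, the empty match puts a space between every character,
# so the loop below tests one character at a time.
for each in `sed 's/\(\)/ /g' samplefile`
do
# Print the character unless it is an ASCII letter or a digit.
echo "$each" | egrep -v "[A-Za-z]|[0-9]"
done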
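If you want just the special characters themselves rather than one character per loop pass, here is a rough sketch using tr. The function name extract_special_chars is made up for illustration. It works on raw bytes, so it assumes that "special" means any byte outside printable ASCII, tab and carriage return, and it assumes your grep has -a (GNU and BSD grep do).

```shell
#!/bin/sh
# extract_special_chars is a hypothetical helper name, not from the thread.
# tr deletes tab (\11), carriage return (\15) and every printable ASCII
# byte from space (\40) to tilde (\176). Newlines are kept, so each input
# line still maps to one output line. LC_ALL=C forces byte-wise handling.
# grep -v '^$' then drops the lines that ended up empty, i.e. the lines
# that had no special bytes at all.
extract_special_chars() {
    LC_ALL=C tr -d '\11\15\40-\176' < "$1" | LC_ALL=C grep -a -v '^$'
}

# Usage: extract_special_chars samplefile
```

One thing to watch: ordinary letters that sit between special characters are removed too, so for the sample line the "b" in the middle of "ÿ¾åbÿ" would not be printed.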
#!/usr/bin/perl
open(FILE, "<G");
while (<FILE>) {
if ( $_ =~ /[^A-Za-z0-9\s\t]/ ) {
print $_
}
}
close(FILE);
And, even though this is ugly, I think this ignores pretty much everything that's "normal" (all 94 regular characters and space and tab -- just add \n, etc for whatever extra characters you want to not earmark)
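#!/usr/bin/perl
# Print every line of the file "G" that contains a character other than
# letters, digits, whitespace and the usual keyboard punctuation.
open(FILE, "<G") or die "Cannot open G: $!";
while (<FILE>) {
# \$ and \@ are escaped so Perl cannot interpolate them as variables.
if ( $_ =~ /[^A-Za-z0-9\s\t\`\-=\[\]\\;\',\.\/~!\@#\$%^&\*\(\)_+\{\}\|:\"<>\?]/ ) {
print $_;
}
}
close(FILE);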
If you need it for regexp outside of perl, sed/awk should be able to make all the same matches, with an extra backslash or two.
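For what it's worth, here is a minimal sketch of the same idea with grep, so you get only the lines that contain something odd. The function name find_special_lines is made up for illustration. It assumes "special" means any byte outside printable ASCII and whitespace, and it assumes your grep supports -a (GNU and BSD grep do).

```shell
#!/bin/sh
# find_special_lines is a hypothetical helper name, not from the thread.
# The bracket expression matches any byte that is NOT in the range space
# through tilde (printable ASCII) and NOT whitespace such as tab.
# LC_ALL=C makes grep compare raw bytes, so the result does not depend
# on the terminal's encoding.
# -a stops grep from treating the input as a binary file.
find_special_lines() {
    LC_ALL=C grep -a '[^ -~[:space:]]' "$1"
}

# Usage: find_special_lines yourFile
```

Adding -n to the grep call should also print the line number in front of each match, which may help with files that are hundreds of lines long.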