Search for unknown special characters in a text file
Hi--
I need to create a shell script to search for unknown special characters in a file. Sample:
3 0O 2 SP5 1 U ASAKOU 1 MARTSSAUT ÿ¾åbÿ 2 XID CHDSC 1 STAR 2 ID75
What I need is to print out the line "1 MARTSSAUT ÿ¾åbÿ". I only need to retrieve the visible special characters like "ÿ¾åbÿ". Note that the special characters vary from time to time, so I need a flexible script that excludes [A-Z], [a-z], [0-9], whitespace, and any characters that can be found on a keyboard, like "\ | * ? ^ # @ ! ~" and so on. I hope someone can help me with this. I tried using grep, awk and tr, but I cannot seem to get the desired result. Thanks in advance.
Hey There,
You can use od to get you started. It will pick all of those characters out, and then you can ignore the regular stuff (note that the first field of the default output is the offset, so you can ignore that, too). Ex: with your file
Code:
-bash-3.2$ od -c yourFile
Let me know if you need further help,
Mike
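As a rough sketch of how the od suggestion could be narrowed down, the snippet below lists the distinct byte values in a file that fall outside printable ASCII and common whitespace. The function name `special_bytes`, the file name `sample.txt`, and the line layout of the sample data are assumptions made for illustration, not part of the original posts.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct "special" byte values (decimal)
# found in the file given as $1.
special_bytes() {
    # -An drops the offset column, -v prints repeated lines in full,
    # -tu1 prints every byte as an unsigned decimal number.
    od -An -v -tu1 "$1" |
    awk '{
        for (i = 1; i <= NF; i++) {
            v = $i + 0
            # Keep anything above "~" (126) and control bytes other than
            # tab (9), newline (10) and carriage return (13).
            if (v > 126 || (v < 32 && v != 9 && v != 10 && v != 13))
                print v
        }
    }' |
    sort -nu
}

# Sample data: the octal escapes are the Latin-1 bytes of "ÿ¾åbÿ".
printf '1 U ASAKOU\n1 MARTSSAUT \377\276\345b\377\n' > sample.txt
special_bytes sample.txt
```

For this sample the function reports 190, 229 and 255, which are the byte values of ¾, å and ÿ in Latin-1. It only says which special bytes exist, not which lines they are on.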
Hi eggixyz,
Thanks for the help. Really appreciate it! The thing is that I need a script that outputs only the special characters. The text files that I search are usually hundreds of lines long, so using your command would be like searching each line manually for special characters. That is why I need only the lines where special characters are present. In the example, I want 1 MARTSSAUT ÿ¾åbÿ to be the only output of the script. More power to you and God bless! This is a very tedious job, and excruciating if I have to go checking each line. Please help me. O God, help me!
I have been struggling with this, but have not solved it.
Quote:
Code:
for each in `sed 's/\(\)/ /g' samplefile`
Quote:
Save it as clean.sed
Code:
s^I[^][ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\^_`abcdefghijklmnopqrstuvwxyz{|}~]^I^Ig
Cheers,
Tink
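The clean.sed command above deletes every character that is not in the listed set (the ^I appears to stand for a literal tab used as the s delimiter). A rough equivalent with tr is sketched below, assuming a single-byte view of the file via the C locale. The function name `strip_special` and the sample file are hypothetical, and unlike clean.sed this version keeps tabs.

```shell
#!/bin/sh
# Hypothetical helper: write the file given as $1 to stdout with every byte
# removed except tab, newline, carriage return and printable ASCII.
strip_special() {
    # -c complements the set and -d deletes, so everything NOT listed goes.
    # \11 = tab, \12 = newline, \15 = carriage return, \40-\176 = space to "~".
    LC_ALL=C tr -cd '\11\12\15\40-\176' < "$1"
}

# Sample data: the octal escapes are the Latin-1 bytes of "ÿ¾åbÿ".
printf '1 STAR\n1 MARTSSAUT \377\276\345b\377\n' > sample.txt
strip_special sample.txt
```

Note that this cleans the file rather than reporting where the special characters were, so it answers a slightly different question from the one in the first post.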
Hey There,
This will do it for your script. Code:
#!/usr/bin/perl
# Print every line of the file "G" that contains a character outside the allowed set.
open(FILE, "<G") or die "Cannot open G: $!";
while (<FILE>) {
    # \@ and \$ are escaped so Perl does not interpolate them inside the regex.
    if ( $_ =~ /[^A-Za-z0-9\s`\-=\[\]\\;',.\/~!\@#\$%^&*()_+{}|:"<>?]/ ) {
        print $_;
    }
}
close(FILE);
If you need the regexp outside of Perl, sed/awk should be able to make all the same matches, with an extra backslash or two.
Hope that helps :)
Mike
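For a shell-only variant of the same idea, here is a minimal sketch using grep with POSIX character classes instead of spelling out every keyboard character. It assumes the file can be treated byte by byte in the C locale, where [:print:] covers only 0x20 to 0x7E. The function name `find_special_lines`, the file name `sample.txt`, and the line layout of the sample data are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of the file given as $1 that contain
# at least one byte outside printable ASCII and whitespace.
find_special_lines() {
    # LC_ALL=C makes grep match byte by byte.
    # -a is a GNU/BSD grep option (not POSIX) that keeps grep from treating
    # the input as a binary file.
    LC_ALL=C grep -a '[^[:print:][:space:]]' "$1"
}

# Sample data: the octal escapes are the Latin-1 bytes of "ÿ¾åbÿ".
printf '3 0O\n2 SP5\n1 MARTSSAUT \377\276\345b\377\n2 XID CHDSC\n' > sample.txt
find_special_lines sample.txt
```

With this sample only the MARTSSAUT line is printed. Adding -n to the grep call would also show the line number of each hit, which may help in files with hundreds of lines.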