Old 01-23-2008, 08:25 AM   #1
trscookie
Member
 
Registered: Apr 2004
Location: oxford
Distribution: gentoo
Posts: 463

Rep: Reputation: 30
remove the top of a file:


Hello all:

I have a 400 MB log file that was created on Windows and I want to remove the first X lines.

I have tried things like

Code:
tail -2000 filename.log > OP.file
However, I get control characters in OP.file that I don't seem to get in filename.log. I originally thought this was a Windows-to-Linux issue, so I ran dos2unix on it, but that did not help.


I have also tried:

Code:
head -2000 filename.log > OP.file
And that works fine!? Am I missing something?

Thanks, trscookie.
 
Old 01-23-2008, 09:10 AM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
What happens if you try...?
Code:
sed '1,2000d' filename.log > OP.file
 
Old 01-23-2008, 09:18 AM   #3
trscookie
Original Poster
Oh, I will give that a try. I have also noticed:

Code:
$ od -x tmpfile2 | head
0000000 3500 3100 3400 3800 2000 7300 6c00 4900
0000020 4600 4400 4500 7600 7400 2e00 6500 7800
^ Double-byte characters. Any idea how to convert them to single-byte?

So I can cat the file, but I can't open it in vi or nano.
 
Old 01-23-2008, 10:03 AM   #4
colucix
Quote:
Originally Posted by trscookie
^ Double-byte characters. Any idea how to convert them to single-byte?
I don't really understand what you mean. The od command dumps the content of a file according to the representation you specify. What is the output of
Code:
od -c filename | head
to dump as one-byte ASCII characters?
 
Old 01-23-2008, 10:09 AM   #5
trscookie
Original Poster
Code:
$ od -c drwtsn32.log | head
0000000 377 376  \r  \0  \n  \0   M  \0   i  \0   c  \0   r  \0   o  \0
0000020   s  \0   o  \0   f  \0   t  \0      \0   (  \0   R  \0   )  \0
0000040      \0   D  \0   r  \0   W  \0   t  \0   s  \0   n  \0   3  \0
0000060   2  \0  \r  \0  \n  \0   C  \0   o  \0   p  \0   y  \0   r  \0
0000100   i  \0   g  \0   h  \0   t  \0      \0   (  \0   C  \0   )  \0
0000120      \0   1  \0   9  \0   8  \0   5  \0   -  \0   2  \0   0  \0
0000140   0  \0   2  \0      \0   M  \0   i  \0   c  \0   r  \0   o  \0
0000160   s  \0   o  \0   f  \0   t  \0      \0   C  \0   o  \0   r  \0
0000200   p  \0   .  \0      \0   A  \0   l  \0   l  \0      \0   r  \0
0000220   i  \0   g  \0   h  \0   t  \0   s  \0      \0   r  \0   e  \0

If you look, you can see the word M i c r o s o f t starting at offset 0000140. It is a log file generated on a Windows PC. Thanks, trscookie.
 
Old 01-23-2008, 11:11 AM   #6
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: Mint, Armbian, NetBSD, Puppy, Raspbian
Posts: 3,515

Rep: Reputation: 239
It looks like they've padded each char with a NUL byte, the tykes.
 
Old 01-23-2008, 11:15 AM   #7
trscookie
Original Poster
What would be the best way to remove the padding from a 400 MB file?
 
Old 01-23-2008, 11:21 AM   #8
bigearsbilly
Try this first:
Code:
#include <stdio.h>

/* Copy stdin to stdout, dropping every NUL (0x00) byte. */
int main(void)
{
    int ch;

    while ((ch = getchar()) != EOF) {
        if (ch != 0)
            putchar(ch);
    }

    return 0;
}
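
Dropping NUL bytes works here because the log appears to be plain ASCII text stored two bytes per character: the od -c dump (377 376, then each character followed by \0) matches UTF-16 little-endian. The two leading bytes and any non-ASCII character would come through mangled, though. A rough sketch of a decoder that copes with those cases follows. The names put_utf8 and utf16le_to_utf8 and the buffer-based interface are assumptions made for this example, and unpaired surrogates are simply replaced with U+FFFD.

```c
#include <stddef.h>

/* Write code point cp to out as UTF-8; returns the number of bytes written. */
static size_t put_utf8(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Decode the UTF-16LE bytes in[0..in_len) into UTF-8 in out.
 * A leading FF FE byte order mark is skipped, and a trailing odd byte
 * is ignored. out needs room for 3 * (in_len / 2) bytes in the worst
 * case. Returns the number of bytes written.
 */
size_t utf16le_to_utf8(const unsigned char *in, size_t in_len,
                       unsigned char *out)
{
    size_t i = 0, n = 0;

    if (in_len >= 2 && in[0] == 0xFF && in[1] == 0xFE)
        i = 2;

    while (i + 1 < in_len) {
        /* Low byte comes first in little-endian order. */
        unsigned long cp = in[i] | ((unsigned long)in[i + 1] << 8);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {      /* high surrogate */
            unsigned long lo = 0;
            if (i + 1 < in_len)
                lo = in[i] | ((unsigned long)in[i + 1] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                cp = 0xFFFD;                     /* unpaired high surrogate */
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;                         /* stray low surrogate */
        }
        n += put_utf8(cp, out + n);
    }
    return n;
}
```

This works on a buffer already in memory. For a 400 MB file it would have to be called on chunks read with fread, keeping each chunk an even number of bytes and not splitting a surrogate pair across chunks. The carriage returns are still there afterwards, so dos2unix should then do its job.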
 
Old 01-23-2008, 11:41 AM   #9
colucix
OK, thanks. I see now. Looking at the output of "od -c" I'm puzzled about the first two bytes. I don't know what they mean, but maybe their absence causes the problem you reported in your first post.

It's hard for me to work out a solution without handling the files myself. Anyway, regarding your question about converting to single-byte characters, you can try simply replacing the NUL character with nothing, using sed:
Code:
sed 's/\x00//g' filename > new_file
then something similar to strip the carriage returns "\r"... but as I said, it's hard for me to predict the result.
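
Since the original goal was to drop the first X lines, it may help to see why a byte-oriented cut can land in the wrong place on this file. If the data is UTF-16 little-endian, as the od -c dump suggests, each newline is the two bytes 0A 00, and a tool that splits right after the 0A leaves everything one byte out of step. A minimal sketch of finding the cut point on whole two-byte units follows; skip_lines_utf16le is a hypothetical helper, and it assumes UTF-16LE input with an optional FF FE mark at the start.

```c
#include <stddef.h>

/*
 * Return the byte offset just past the first n line terminators in a
 * UTF-16LE buffer, so that buf + offset starts on a character boundary.
 * Only whole 2-byte units are inspected. A leading FF FE mark is always
 * stepped over and is not counted as part of any line. If the buffer
 * holds fewer than n lines, the offset of the end of the last whole
 * unit is returned.
 */
size_t skip_lines_utf16le(const unsigned char *buf, size_t len, size_t n)
{
    size_t i = 0;

    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        i = 2;                          /* step over the byte order mark */

    while (n > 0 && i + 1 < len) {
        if (buf[i] == 0x0A && buf[i + 1] == 0x00)
            n--;                        /* a complete "\n" unit */
        i += 2;
    }
    return i;
}
```

Writing FF FE followed by the bytes from buf + offset onward should give a file that is still valid UTF-16LE; the NUL-stripping step can then be applied to that if single-byte text is wanted.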
 
Old 01-23-2008, 11:43 AM   #10
colucix
Uh oh... sorry. I went away for a while before posting and didn't see the reply from bigearsbilly. Try the C solution first (thanks, billy).
 
Old 01-23-2008, 12:05 PM   #11
bigearsbilly
Quote:
I'm puzzled about the first two bytes.
Well, maybe that and the NUL bytes are simply a pathetic Microsoft attempt at obfuscation; we all know how they hate users looking at anything.

I'll be interested to see what it is (I don't have any DOS OSes myself).
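
On the first two bytes: 377 376 in the od -c dump is octal for 0xFF 0xFE, which is the byte order mark of UTF-16 little-endian text. That would make the \0 after each character the high byte of a two-byte code unit rather than padding. A minimal sketch of checking for the mark in C follows; the names bom_kind and detect_bom are made up for illustration, and UTF-32 marks are not handled.

```c
#include <stddef.h>

/* Hypothetical names for this sketch; not from any library. */
enum bom_kind { BOM_NONE, BOM_UTF16LE, BOM_UTF16BE, BOM_UTF8 };

/* Classify the byte order mark (if any) at the start of buf. */
enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16LE;     /* the 377 376 seen in the od -c dump */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16BE;
    return BOM_NONE;
}
```

Assuming od ran on a little-endian machine, the od -x output of the tail result (3500 3100 ...) corresponds to the bytes 00 35 00 31, so that file has no mark and starts one byte out of step. That would be consistent with head working and tail producing control characters.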
 
Old 01-24-2008, 03:18 AM   #12
trscookie
Original Poster
Thanks everybody for the help, that little C program works a charm!

Thank you, trscookie
 
  

