LinuxQuestions.org
Old 05-17-2013, 04:05 AM   #1
qrange
Member
 
Registered: Jul 2006
Location: Belgrade, Yugoslavia
Distribution: Debian stable, amd64
Posts: 819

Rep: Reputation: 32
strip first byte from very large binary file


I need to remove the first byte of a large file. What is the fastest way to do it?
I tried:

dd bs=1 skip=1 if=largefilein.raw of=largefileout.raw

but it's too slow.
 
Old 05-17-2013, 07:10 AM   #2
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 9,345

Rep: Reputation: 2747
I would try to implement a simple dd in Perl (for example) and skip the first byte.
 
Old 05-17-2013, 03:50 PM   #3
rtmistler
Moderator
 
Registered: Mar 2011
Location: Sutton, MA. USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu
Posts: 5,265
Blog Entries: 12

Rep: Reputation: 1862
If you can write it in code, it would look something like this. Note that I didn't compile this; I just copied a part where I open with append and truncate the last 8 bytes off the end. In your case I set the seek differently. (Of course, put in appropriate validation tests; I put in only the necessities here.)

Code:
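#include <stdio.h>

int main(void)
{
    FILE *fIn, *fOut;
    char inBuf[1024];
    size_t n;

    /* "b" opens the files in binary mode */
    fIn = fopen("fin", "rb");
    fOut = fopen("fout", "wb");
    if (fIn == NULL || fOut == NULL) {
        perror("fopen");
        return 1;
    }

    /* start reading at offset 1, i.e. skip the first byte */
    fseek(fIn, 1, SEEK_SET);
    /* fread/fwrite copy raw bytes; fgets/fputs would stop at NUL bytes */
    while ((n = fread(inBuf, 1, sizeof(inBuf), fIn)) > 0) {
        fwrite(inBuf, 1, n, fOut);
    }

    fclose(fIn);
    fclose(fOut);
    return 0;
}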
Test the results of each open for NULL and check errno when one fails.
Make sure you use the "b" in your opens, so that you maintain the integrity of the binary file.
Don't go in excess of 4096 on the inBuf size; that's just my opinion. I know that pipe sizes allow up to 65535, but many resources, such as serial buffers, limit to 4096, so my tendency is to stay at or below that value. It really shouldn't matter whether you use 1024, 4096, or 8192: a smaller buffer adds more calls in the loop, but it shouldn't take too long to process the file.

Sorry, I'm guessing there's a slick script way to do this, but this is more the way I do stuff like this.

Plus, if it's a binary file, likely that I'm writing a program to parse it anyways. And instead of copying it to another file, minus the first byte, I open it, ignore that first byte, and then start my parse.
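The same "ignore the first byte and parse the rest" idea can be sketched from the shell when the parser reads standard input, so no second copy of the file is written. This is only a sketch: the sample file is invented, and `parse_input` is a hypothetical stand-in (it just counts bytes) for whatever program actually parses the data.

```shell
# Hypothetical sample data standing in for the large binary file.
workdir=$(mktemp -d)
printf 'XABCDEFGH' > "$workdir/in.raw"     # 'X' is the unwanted first byte

# Hypothetical stand-in for a real parser that reads standard input.
parse_input() { wc -c | tr -d ' '; }

# Stream everything from byte 2 onward straight into the parser.
parsed=$(tail -c +2 "$workdir/in.raw" | parse_input)
echo "parser received $parsed bytes"
```

Whether this helps depends on the parser accepting piped input; a tool that insists on a seekable file still needs the trimmed copy.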

By the way, you have a very large binary file and you're ignoring only the first character? O.K. ... exam question? Seriously, if you have a large file and there truly is one leading character to ignore, perhaps it is a command which was echoed and therefore you want it gone. But since the file is binary, you need some other program to parse it. I'm guessing that the parser was not written by you, so you merely need to remove that first character so that an existing executable can process your binary.

When in doubt, especially with binary files, write a program to deal with it. You can then write logs of the intermediate results, or convert the file entirely to a hex-ASCII equivalent so you can view it and see what's up with it.
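For a quick look at the bytes without writing a program, a hex dump of the first few bytes before and after stripping is often enough. A minimal sketch, assuming a made-up four-byte sample file; `od` with `-An -tx1 -N4` prints up to four bytes as hex with no address column.

```shell
# Hypothetical sample: an unwanted 0x01 byte followed by "ABC".
workdir=$(mktemp -d)
printf '\001ABC' > "$workdir/in.raw"

# Hex view of the first 4 bytes of the original file.
before=$(od -An -tx1 -N4 "$workdir/in.raw")

# Hex view of the same data with the first byte skipped.
after=$(tail -c +2 "$workdir/in.raw" | od -An -tx1 -N4)

echo "before:$before"
echo "after: $after"
```

On the real file only the `printf` line would go away; `-N` keeps the dump short no matter how large the input is.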
 
2 members found this post helpful.
Old 05-17-2013, 08:57 PM   #4
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 348

Rep: Reputation: 151
You could use dd in two steps:
Code:
dd bs=1 skip=1 count=4095 if=largefilein.raw of=largefile_part1.raw
dd bs=4k skip=1 if=largefilein.raw of=largefile_part2.raw
cat largefile_part1.raw largefile_part2.raw > largefileout.raw
There are probably faster ways to do it.
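The same idea (handle the odd byte separately, then copy the rest in big blocks) can also be sketched in a single pass without the intermediate part files. This is a sketch on an invented sample file, not something timed on a large one. It relies on dd and cat sharing one input file descriptor, so cat continues from wherever dd stopped reading.

```shell
# Hypothetical sample data standing in for largefilein.raw.
workdir=$(mktemp -d)
printf 'XABCDEFGH' > "$workdir/in.raw"

# dd reads exactly one byte from the shared stdin and discards it;
# cat then copies everything after that byte using its own larger reads.
{ dd bs=1 count=1 of=/dev/null 2>/dev/null; cat; } \
    < "$workdir/in.raw" > "$workdir/out.raw"

result=$(cat "$workdir/out.raw")
echo "$result"
```

The redirection has to be on the whole `{ ...; }` group; redirecting each command separately would reopen the file and cat would start from byte 0 again.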

Last edited by Beryllos; 05-17-2013 at 08:59 PM.
 
2 members found this post helpful.
Old 05-17-2013, 09:09 PM   #5
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 348

Rep: Reputation: 151
Fastest? If not, at least the shortest.
Code:
tail -c +2 largefilein.raw > largefileout.raw
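Before running this on the real file, it may be worth trying it on a small throwaway file to confirm that `-c +2` means "start output at byte 2", so exactly one byte is dropped. The file names and contents below are invented for the check.

```shell
# Hypothetical 9-byte sample; 'X' is the byte to drop.
workdir=$(mktemp -d)
printf 'XABCDEFGH' > "$workdir/in.raw"

tail -c +2 "$workdir/in.raw" > "$workdir/out.raw"

in_size=$(wc -c < "$workdir/in.raw" | tr -d ' ')
out_size=$(wc -c < "$workdir/out.raw" | tr -d ' ')
content=$(cat "$workdir/out.raw")

echo "input: $in_size bytes, output: $out_size bytes, content: $content"
```

The output should be one byte shorter than the input and begin with what used to be the second byte.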
 
2 members found this post helpful.
Old 05-17-2013, 11:18 PM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 15,620

Rep: Reputation: 2088
Sweet. I was thinking sed, but that (tail) looks good.
As for speed, a quick strace shows it using 8k reads - no reason to believe it should be too slow for the job.
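To put rough numbers on that, the read counts for 1-byte and 8k reads can be compared with simple arithmetic. The 1 GiB file size is an assumption for illustration only, and the counts ignore the final read that returns end-of-file.

```shell
# Assumed file size for illustration: 1 GiB.
size=$((1024 * 1024 * 1024))
remaining=$((size - 1))                    # bytes left after dropping the first

# dd bs=1 skip=1 issues one read per remaining byte.
reads_bs1=$remaining

# 8k reads, as seen for tail: one read per 8192 bytes, rounded up.
reads_8k=$(( (remaining + 8191) / 8192 ))

echo "bs=1 reads: $reads_bs1"
echo "8k reads:   $reads_8k"

# To see the actual calls on a real file (needs strace installed):
#   strace -e trace=read tail -c +2 largefilein.raw > /dev/null
```

Under that assumption the byte-at-a-time copy makes about 8192 times as many read calls (and as many writes), which fits the original dd command being too slow.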
 
1 members found this post helpful.
  

