LinuxQuestions.org
12-07-2017, 05:14 AM   #1
ricfoz
LQ Newbie
 
Registered: Dec 2017
Posts: 1

Rep: Reputation: Disabled
Trouble compressing a 37GB .vcf file because my GNU utilities don't support large files; the file is a human chromosome in bioinformatics


Hello everyone,

I am new to Linux, and I have been working with some large files in a bioinformatics project.

I have been able to compress and decompress some large files, around 3GB or so, but now I have stumbled upon this important 37GB file that I need to compress with bgzip, and the command returns an error because it doesn't support large files.

I have been browsing, and the problem seems to be that my GNU utilities don't support large files, so I have to compile the program with something like "-D_FILE_OFFSET_BITS=64". I have also read something about "lseek", which works with a file offset of a type named "off_t" (in contrast to a "large" offset, which appears to be supported by default).

My problem is that I am a newbie in Linux, and I don't really know how to compile that into my system. It seems like it's not very complicated for experienced users, but it is rather complicated for me.

I would appreciate any help with the syntax for compiling Large File Support (LFS) into my GNU utilities. Or do I have to use some flags on my bgzip command in order to make LFS work on my LARGE_FILE.vcf?

Help please!

Last edited by ricfoz; 12-07-2017 at 05:16 AM.
 
12-07-2017, 09:01 AM   #2
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 8,706
Blog Entries: 4

Rep: Reputation: 3030
The root cause of the problem is that "file offset" values within the file are 32 bits long: enough to describe a file of approximately 4GB, but not larger.
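To put numbers on that limit, here is a small sketch of the arithmetic. The helper name `fits_in_offset` is made up for illustration, and the 37GB file is taken as 37 GiB. Note that POSIX defines `off_t` as a signed type, so a program built with a 32-bit `off_t` actually tops out just under 2 GiB; the 4 GiB figure applies to an unsigned 32-bit value.

```python
GIB = 1024 ** 3

UNSIGNED_32_MAX = 2 ** 32 - 1  # largest unsigned 32-bit value, just under 4 GiB
SIGNED_32_MAX = 2 ** 31 - 1    # largest signed 32-bit value, just under 2 GiB
SIGNED_64_MAX = 2 ** 63 - 1    # largest signed 64-bit value


def fits_in_offset(size_bytes: int, max_offset: int) -> bool:
    """Return True if the end-of-file position of a file this size is representable."""
    return size_bytes <= max_offset


file_size = 37 * GIB  # assumption: "37GB" from the first post read as 37 GiB

print("3 GiB, unsigned 32-bit:", fits_in_offset(3 * GIB, UNSIGNED_32_MAX))
print("3 GiB, signed 32-bit:  ", fits_in_offset(3 * GIB, SIGNED_32_MAX))
print("37 GiB, unsigned 32-bit:", fits_in_offset(file_size, UNSIGNED_32_MAX))
print("37 GiB, signed 64-bit:  ", fits_in_offset(file_size, SIGNED_64_MAX))
```

A 64-bit offset, which is what building with -D_FILE_OFFSET_BITS=64 asks for, leaves plenty of room for a file this size.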

I would think that a 37GB single disk-file would be problematic to deal with for many reasons. (Isn't it amazing how a single cell compresses all that data and more?) Could you split the file into more-manageable chunks?
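If splitting is an option, something along these lines would do it. This is only a sketch: `split_file`, the `.partNNNN` naming, and the buffer size are my own choices, not part of any existing tool. It cuts on byte boundaries, so a piece can end in the middle of a VCF line and only the first piece carries the header; the pieces have to be concatenated in order to get a valid file back.

```python
from pathlib import Path


def split_file(path, chunk_bytes, buffer_bytes=1024 * 1024):
    """Split `path` into numbered pieces of at most `chunk_bytes` bytes each.

    Pieces are written next to the source as <name>.part0000, <name>.part0001, ...
    Data is copied in small buffers, so memory use stays low for huge files.
    Returns the list of piece paths in order.
    """
    src = Path(path)
    pieces = []
    index = 0
    with src.open("rb") as infile:
        while True:
            # Read the first buffer before creating a piece, to avoid empty pieces.
            first = infile.read(min(buffer_bytes, chunk_bytes))
            if not first:
                break
            piece = src.with_name(f"{src.name}.part{index:04d}")
            with piece.open("wb") as out:
                out.write(first)
                written = len(first)
                while written < chunk_bytes:
                    data = infile.read(min(buffer_bytes, chunk_bytes - written))
                    if not data:
                        break
                    out.write(data)
                    written += len(data)
            pieces.append(piece)
            index += 1
    return pieces
```

For example, `split_file("LARGE_FILE.vcf", 2 * 1024**3)` would produce pieces of at most 2 GiB each, small enough for a tool limited to 32-bit signed offsets.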
 
12-07-2017, 02:46 PM   #3
fatmac
Senior Member
 
Registered: Sep 2011
Location: Upper Hale, Surrey/Hants Border, UK
Distribution: AntiX
Posts: 1,989

Rep: Reputation: Disabled
These may give you a clue.

https://www.digitalocean.com/communi...-linux-servers

https://unix.stackexchange.com/quest...-gb-using-gzip

https://superuser.com/questions/5911...rge-100g-files
 
12-07-2017, 03:30 PM   #4
jefro
Moderator
 
Registered: Mar 2008
Posts: 17,195

Rep: Reputation: 2562
bgzip?

There are many compression programs. Maybe another would be better for large files? Many tools have features that favor particular uses. Some can compress huge files, assuming you have the resources available. Some have settings/options to assist if you don't have the resources. See https://www.unixmen.com/top-15-file-...ilities-linux/ for some; there are other examples on the web too.
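As one illustration of a low-resource approach, here is a sketch using Python's standard `gzip` module, which streams the file through a small buffer instead of loading it into memory. The function name `gzip_stream` and the buffer size are my own. Be aware that this writes ordinary gzip output, not the blocked BGZF format that bgzip produces, so it is not a drop-in replacement if a later step (tabix indexing, for instance) needs bgzip output.

```python
import gzip
import shutil


def gzip_stream(src_path, dst_path, buffer_bytes=1024 * 1024):
    """Compress src_path to dst_path as plain gzip, copying buffer_bytes at a time."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # copyfileobj reads and writes in fixed-size pieces, so memory use
        # does not grow with the size of the input file.
        shutil.copyfileobj(src, dst, buffer_bytes)
```

Usage would be something like `gzip_stream("LARGE_FILE.vcf", "LARGE_FILE.vcf.gz")`.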

Last edited by jefro; 12-07-2017 at 03:31 PM.
 
  

