Linux - Software This forum is for Software issues.
Old 12-07-2017, 04:14 AM   #1
LQ Newbie
Registered: Dec 2017
Posts: 1

Rep: Reputation: Disabled
Trouble compressing a 37GB .vcf file because my GNU utilities don't support large files (the file is a human chromosome, for a bioinformatics project)

Hello everyone,

I am new to Linux, and I have been working with some large files in a bioinformatics project.

I have been able to compress and decompress some large files, around 3GB or so, but now I have run into an important 37GB file that I need to compress with bgzip, and the command returns an error because it doesn't support large files.

From what I have read, the problem is that my GNU utilities don't support large files, and I have to compile the program with something like "-D_FILE_OFFSET_BITS=64". I have also read something about a tool named "lseek", which is able to compile an offset, with a name like "off_t" (in contrast to the "large" offset, which appears to be supported by default).

My problem is that I am a newbie in Linux and I don't really know how to compile that into my system. It seems like it's not very complicated for experienced users, but it is rather complicated for me.

I would appreciate any help with the syntax for compiling Large File Support (LFS) into my GNU utilities. Or do I need to pass some flags to my bgzip command to make LFS work on my LARGE_FILE.vcf?

Help please !

Last edited by ricfoz; 12-07-2017 at 04:16 AM.
Old 12-07-2017, 08:01 AM   #2
LQ Guru
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 9,078
Blog Entries: 4

Rep: Reputation: 3169
The root cause of the problem is that "file offset" values within the file are 32 bits long: enough to describe a file of approximately 4GB, but not larger.
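
As a rough illustration of that limit (a sketch only, not taken from bgzip or any GNU source), the arithmetic can be checked in a few lines of Python. The helper name `max_offset` is made up for this example. Note that `off_t` is a signed type, so a 32-bit `off_t` tops out near 2GB rather than 4GB; either way a 37GB file is far out of range, while a 64-bit offset covers it easily.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def max_offset(offset_bits, signed=False):
    """Largest byte offset representable in an integer of the given width.

    Hypothetical helper for illustration; a signed type loses one bit
    to the sign, which halves the reachable range.
    """
    usable_bits = offset_bits - 1 if signed else offset_bits
    return 2 ** usable_bits - 1


if __name__ == "__main__":
    # Size from the first post, taken as 37 GiB for the sake of the example.
    file_size = 37 * GIB
    for bits, signed in [(32, True), (32, False), (64, True)]:
        limit = max_offset(bits, signed)
        kind = "signed" if signed else "unsigned"
        verdict = "fits" if file_size <= limit else "does NOT fit"
        print(f"{bits}-bit {kind}: max offset {limit} bytes, 37 GiB file {verdict}")
```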

I would think that a 37GB single disk-file would be problematic to deal with for many reasons. (Isn't it amazing how a single cell compresses all that data and more?) Could you split the file into more-manageable chunks?
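
If splitting is an option, here is a minimal sketch of one way to do it in Python, assuming the file is an uncompressed, plain-text VCF. The function name `split_vcf`, the chunk naming scheme and the `records_per_chunk` parameter are all invented for this example. It reads the file line by line, so memory use stays small, and it repeats the header lines (those starting with `#`) at the top of every chunk so that each piece is still a readable VCF.

```python
from typing import List


def split_vcf(path: str, out_prefix: str, records_per_chunk: int) -> List[str]:
    """Split a text VCF into chunks of at most records_per_chunk data lines.

    Hypothetical helper: header lines (starting with '#') are collected
    and written at the top of every chunk. Returns the chunk paths.
    """
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")

    header: List[str] = []
    chunk_paths: List[str] = []
    out = None
    count = 0
    try:
        with open(path, "r", encoding="utf-8") as src:
            for line in src:  # streams; never loads the whole file
                if line.startswith("#"):
                    header.append(line)
                    continue
                # Start a new chunk on the first record or when the current one is full.
                if out is None or count == records_per_chunk:
                    if out is not None:
                        out.close()
                    chunk_path = f"{out_prefix}.{len(chunk_paths):04d}.vcf"
                    out = open(chunk_path, "w", encoding="utf-8")
                    out.writelines(header)
                    chunk_paths.append(chunk_path)
                    count = 0
                out.write(line)
                count += 1
    finally:
        if out is not None:
            out.close()
    return chunk_paths
```

Something like `split_vcf("LARGE_FILE.vcf", "LARGE_FILE.part", 5_000_000)` would then give pieces that can be compressed one at a time; the record count is an arbitrary example value and would need tuning.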
Old 12-07-2017, 01:46 PM   #3
Senior Member
Registered: Sep 2011
Location: Upper Hale, Surrey/Hants Border, UK
Distribution: AntiX
Posts: 2,429

Rep: Reputation: Disabled
These may give you a clue.
Old 12-07-2017, 02:30 PM   #4
Registered: Mar 2008
Posts: 18,376

Rep: Reputation: 2739

There are many compression programs. Maybe another would be better for large files? Many tools have features that favor particular uses. Some can compress huge files, assuming you have the resources available; others have settings or options to help if you don't. There are other examples on the web too.
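
One such alternative, as a hedged sketch: Python's standard `gzip` module can compress a file of any size by streaming it in fixed-size blocks, so memory use does not grow with the file. The function name `gzip_stream` and the 1 MiB block size are assumptions made for this example. Be aware that this writes ordinary gzip, not the BGZF format that bgzip produces, so the result could not be indexed with tabix.

```python
import gzip
import shutil


def gzip_stream(src_path, dst_path, block_size=1024 * 1024):
    """Compress src_path to dst_path, copying block_size bytes at a time.

    Hypothetical helper: produces plain gzip output, NOT BGZF.
    """
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # copyfileobj reads and writes in block_size pieces until EOF.
        shutil.copyfileobj(src, dst, block_size)
```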

Last edited by jefro; 12-07-2017 at 02:31 PM.

