LinuxQuestions.org
Old 06-06-2008, 01:28 AM   #1
VelocideX
LQ Newbie
 
Registered: Oct 2007
Posts: 23

Rep: Reputation: 15
GNU getline appears to choke with large file support (can't read >2GB)


Hi all,

I have compiled my program to enable large file support. That is, I pass -D_LARGEFILE_SOURCE and -D_FILE_OFFSET_BITS=64 to gcc.
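One way to confirm that the large file macros actually took effect in a given translation unit is to check the width of `off_t`. The sketch below is only an illustration: the helper names `off_t_bits` and `report_off_t` are made up for this example, and it assumes a C11 compiler for `_Static_assert`.

```c
#define _FILE_OFFSET_BITS 64   /* must appear before any system header */

#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/* Fails the build if off_t is still narrower than 64 bits. */
_Static_assert(sizeof(off_t) * CHAR_BIT >= 64,
               "off_t is narrower than 64 bits; LFS macros are not in effect");

/* Hypothetical helper: width of off_t in bits for this translation unit. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}

/* Hypothetical helper: print the width so it can be logged at startup. */
void report_off_t(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "off_t is %d bits\n", off_t_bits());
}
```

Calling `report_off_t(stderr)` early in `main` shows whether every object file was really built with the flags. If the macro is defined after the first `#include`, it has no effect on that file.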

I can open files larger than 2GB (whereas I could not when these options were not enabled).

I have been reading data from text files using the GNU getline function. getline reads data fine before the 2GB mark, but immediately past it, it cannot read any more data.

Does anyone know why this is, and how I can fix it? Is GNU getline compatible with LFS, or do I have to write my own equivalent routine?

Cheers
 
Old 06-06-2008, 04:01 AM   #2
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
You might want to check what resource limits might be imposed. You aren't trying to read a binary file without returns, are you?

If your ulimit restricts memory to below 2GB, then getline's realloc call will probably fail.
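The limits mentioned above can also be inspected from inside the program with `getrlimit()`. This is a minimal sketch, not a diagnosis of the original problem. The helper names `soft_limit_below` and `print_soft_limit` are invented for the example. `RLIMIT_AS` is the address space limit that `ulimit -v` sets.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: returns 1 if the soft limit for `resource` is finite
   and below `bytes`, 0 if it is not, and -1 if getrlimit() itself fails. */
int soft_limit_below(int resource, rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return rl.rlim_cur < bytes ? 1 : 0;
}

/* Hypothetical helper: print one soft limit in bytes, or "unlimited". */
void print_soft_limit(const char *name, int resource)
{
    struct rlimit rl;

    assert(name != NULL);
    if (getrlimit(resource, &rl) != 0) {
        perror("getrlimit");
        return;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("%s: unlimited\n", name);
    else
        printf("%s: %llu bytes\n", name, (unsigned long long)rl.rlim_cur);
}
```

For instance, `print_soft_limit("RLIMIT_AS", RLIMIT_AS)` shows the address space limit, and `soft_limit_below(RLIMIT_AS, (rlim_t)1 << 31)` reports whether it is under 2GB.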
 
Old 06-06-2008, 05:34 AM   #3
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
I compiled this simple test program from the getline manpage. The "hardwired" file is a text file copy of the info bash manual cat'ed over & over until test.txt was 2.7GB. I'll have to let it run for a while to see if it reaches the end. I defined _FILE_OFFSET_BITS = 64 as per the feature_test_macros manpage.

Code:
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int
main(void)
{
    FILE * fp;
    char * line = NULL;   /* getline() allocates and grows this buffer */
    size_t len = 0;
    ssize_t read;

    fp = fopen("test.txt", "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);
    while ((read = getline(&line, &len, fp)) != -1) {
        /* read is a ssize_t, so the matching conversion is %zd */
        printf("Retrieved line of length %zd :\n", read);
        printf("%s", line);
    }
    free(line);           /* free(NULL) is a no-op, so no check is needed */
    fclose(fp);
    return EXIT_SUCCESS;
}

Last edited by jschiwal; 06-06-2008 at 05:36 AM.
 
Old 06-06-2008, 06:31 AM   #4
VelocideX
LQ Newbie
 
Registered: Oct 2007
Posts: 23

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by jschiwal
You might want to check what resource limits might be imposed. You aren't trying to read a binary file without returns, are you?

If your ulimit restricts memory to below 2GB, then getline's realloc call will probably fail.
It's not a binary file. It's a text file that I have made as the output of a Fortran simulation. Each line is about 40,000 characters. There are PLENTY of line returns.

I've checked ulimit and there are no restrictions I can see that would affect it. realloc shouldn't matter because the string that is being filled is only about 40kB. Each string is discarded. There are a few other dynamic vectors allocated, but the memory usage for those is only about 300MB (I have 3.5GB memory).

It's suspicious that getline fails as soon as the file position hits 2^31.

jschiwal - thanks for taking the time to compile a test routine. My routine checks the output of getline, and it's never < 0 (it prints an error message then dumps if it is), which is strange.
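When getline() returns -1 it does not say whether the stream hit end of file or an error, so it can help to check `ferror()` and record the offset with `ftello()`, which returns a 64-bit `off_t` under `_FILE_OFFSET_BITS=64`. The sketch below is an assumption about how such a check might look, not the code from the original program. `count_lines` is a hypothetical helper. Keeping the position in an `int` or in the `long` returned by `ftell()` is one possible, unconfirmed source of trouble at 2^31 on a 32-bit build.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: counts the lines in `path` with getline().
   Stores the final stream offset in *end_offset.
   Returns the line count, or -1 on an open or read error. */
long long count_lines(const char *path, off_t *end_offset)
{
    assert(path != NULL && end_offset != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
        return -1;
    }

    char *line = NULL;
    size_t cap = 0;
    long long lines = 0;

    while (getline(&line, &cap, fp) != -1)
        lines++;

    /* getline() returns -1 for both EOF and errors; ferror() tells them apart. */
    int saved_errno = errno;
    int failed = ferror(fp);
    *end_offset = ftello(fp);   /* off_t, so it can hold offsets past 2^31 */

    if (failed)
        fprintf(stderr, "read error after %lld lines near offset %lld: %s\n",
                lines, (long long)*end_offset, strerror(saved_errno));

    free(line);
    fclose(fp);
    return failed ? -1 : lines;
}
```

If the function stops early with no error reported and `*end_offset` is smaller than the file size, the problem is more likely in how the caller tracks the position than in getline() itself.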
 
Old 06-06-2008, 06:36 AM   #5
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
I checked and the test program is still running.
Please note this from the feature_test_macros manpage:
Quote:
_LARGEFILE64_SOURCE
Expose definitions for the alternative API specified by the LFS
(Large File Summit) as a "transitional extension" to the Single
UNIX Specification. (See http://opengroup.org/platform/lfs.html.)
The alternative API consists of a set of new objects (i.e.,
functions and types) whose names are suffixed with "64" (e.g.,
off64_t versus off_t, lseek64() versus lseek(), etc.). New
programs should not employ this interface; instead
_FILE_OFFSET_BITS=64 should be employed.
I compiled my example with "#define _FILE_OFFSET_BITS 64" to test the getline() function and not the _LARGEFILE64_SOURCE feature test macro.

Last edited by jschiwal; 06-06-2008 at 06:38 AM.
 
Old 06-06-2008, 10:24 AM   #6
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
Update:
Hours after starting the little demo program it finished reading the 2.7GB text file. EXIT_SUCCESS!
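A faster check than reading a multi-gigabyte file end to end is to write one line at a chosen offset, seek back to it with `fseeko()`, and read it with getline(). The sketch below assumes the filesystem supports sparse files, so the gap before the line takes little real disk space. `roundtrip_line_at` is a hypothetical helper, and `text` is assumed to be a single line ending in a newline.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: writes `text` at byte `offset` of a new file, then
   seeks back to that offset and reads the line with getline().
   Returns the number of bytes getline() read, or -1 on any failure or if
   the line read back does not match `text`. */
ssize_t roundtrip_line_at(const char *path, off_t offset, const char *text)
{
    assert(path != NULL && text != NULL && offset >= 0);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (fseeko(fp, offset, SEEK_SET) != 0 || fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char *line = NULL;
    size_t cap = 0;
    ssize_t n = -1;

    if (fseeko(fp, offset, SEEK_SET) == 0)
        n = getline(&line, &cap, fp);
    if (n != -1 && strcmp(line, text) != 0)
        n = -1;

    free(line);
    fclose(fp);
    return n;
}
```

Calling it with an offset such as `((off_t)1 << 31) + 1` exercises a read that starts past the 2GB mark. A return value equal to `strlen(text)` suggests getline() itself copes with that position on the system being tested.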
 
  

