LinuxQuestions.org


Hiran Joseph 11-22-2006 01:56 AM

Maximum file size
 
Does anyone know the maximum file size that can be
parsed by awk, or by Linux in general?

bigearsbilly 11-22-2006 04:58 AM

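Linux file sizes can be pretty large, about 1 TB I believe, though the exact limit depends on the filesystem.
For awk, it depends on memory and on the awk implementation itself; I don't know the exact limit.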

With Perl I have parsed very large files.

jim mcnamara 11-22-2006 05:38 AM

Unless you mean "line size" -- which is usually thought of as the number of bytes per record in a text-formatted file -- if the filesystem can store the file, then utilities like awk and sed can read it.
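
If line size is the suspect, one quick check is to measure the longest record in the file. This is only a minimal sketch: the function name longest_record is made up for illustration, it reads the data on standard input, and whether length() counts bytes or characters depends on the awk implementation and locale.

```shell
#!/bin/sh
# longest_record: hypothetical helper, prints the length of the longest
# line (record) read from standard input.
longest_record() {
    awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }'
}

# Example: the longest of these three lines has 4 characters.
printf 'a\nabcd\nab\n' | longest_record
```

For a real file, redirect it into the function, for example: longest_record < yourfile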

What error are you getting?

matthewg42 11-22-2006 05:39 AM

I once ran into a 2 GiB limit with awk on an HP-UX machine. I think it depends on which C library it was compiled against. You can probably tell by looking for the 64-bit file I/O functions in the binary:
Code:

strings "$(which awk)" | grep fopen
If you see fopen64, I think you should be OK up to stupidly large file sizes. If you see only the regular fopen, you'll probably hit the limit at 2 GiB.
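
The same check can be wrapped in a small function so it is easy to repeat on other binaries. This is a rough sketch, not a definitive test: the name classify_io_symbols and the list of symbol names are my own choices for illustration. Note also that on many 64-bit systems off_t is already 64 bits wide, so a binary that shows only plain fopen may still handle large files; a "not found" result is not conclusive there.

```shell
#!/bin/sh
# classify_io_symbols: hypothetical helper. Reads names (one per line,
# e.g. the output of strings) on standard input and reports whether any
# explicit 64-bit file I/O name such as fopen64 or open64 appears.
classify_io_symbols() {
    if grep -Eq 'fopen64|open64|lseek64|fseeko64'; then
        echo "explicit 64-bit file I/O symbols found"
    else
        echo "no explicit 64-bit file I/O symbols found"
    fi
}

# Demonstration with canned input instead of a real binary.
printf 'fopen\nfopen64\n' | classify_io_symbols

# Real use (needs the strings utility from binutils):
#   strings "$(command -v awk)" | classify_io_symbols
```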

randyding 11-22-2006 08:34 PM

sed also counts line numbers, so there could be an issue if more than 2^32 lines are streamed in or read from a file. I don't know whether the following works to print line number 2^32. Does anyone have a file this big lying around?
Code:

sed -n 4294967296p
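
A file that big is not strictly needed: the lines can be generated on the fly and piped into sed. A minimal sketch, assuming a seq utility is installed (GNU coreutils ships one); the function name nth_line is made up for illustration. If sed's line counter copes with N, the output should be N itself.

```shell
#!/bin/sh
# nth_line: hypothetical helper. Generates the integers 1..N with seq and
# asks sed to print only line N of that stream.
nth_line() {
    seq "$1" | sed -n "${1}p"
}

# Quick sanity check with a small N.
nth_line 5

# The real experiment (slow: streams 2^32 lines through the pipe):
#   nth_line 4294967296
```

I have not run the 2^32 case, so treat it as an experiment rather than a known result.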

tuxdev 11-22-2006 10:28 PM

The most general answer is as many bytes as you can index with an off_t.

primo 11-23-2006 12:16 AM

Quote:

Originally Posted by tuxdev
The most general answer is as many bytes as you can index with an off_t.

Yes, it's the most reliable way. This and off64_t. As a matter of fact, the *BSDs lack fopen64() because they use a 64-bit-wide off_t.
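
To put numbers on this: off_t is a signed type, so a width of N bits gives a largest offset of 2^(N-1) - 1. A small sketch of that arithmetic follows; the function name max_offset_for_bits is made up for illustration, and it assumes the shell does its arithmetic in 64-bit integers.

```shell
#!/bin/sh
# max_offset_for_bits: hypothetical helper. Prints the largest value a
# signed integer of the given width can hold, i.e. 2^(bits-1) - 1.
# Computed as (2^(bits-2) - 1) * 2 + 1 so that the intermediate values
# never overflow the shell's own arithmetic when bits is 64.
max_offset_for_bits() {
    bits=$1
    echo $(( ((1 << (bits - 2)) - 1) * 2 + 1 ))
}

max_offset_for_bits 32   # 2147483647, i.e. 2 GiB - 1
max_offset_for_bits 64   # 9223372036854775807
```

The 32-bit result is where the familiar 2 GiB limit comes from. On POSIX systems, getconf FILESIZEBITS followed by a directory path reports the bit width the filesystem at that path uses for file sizes, which can be fed to the function above.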

kshkid 11-24-2006 07:54 AM

And fopen64 is a non-standard function that exists to support 64-bit file offsets.


All times are GMT -5.