LinuxQuestions.org
Old 04-16-2017, 11:57 AM   #1
jr_bob_dobbs
Member
 
Registered: Mar 2009
Distribution: Bedrock, Devuan, Slackware, Linux From Scratch, Void
Posts: 651
Blog Entries: 135

Rep: Reputation: 188
going nuts: lseek fails to make sparse holes


So consider the following short program:
Code:
/* gcc -o sparse_demo_test sparse_demo_test.c */
#define neq !=
#define eq ==
#define DOOPS 20
#define OKDOKEY 0

#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> /* lseek(), close() */


int main(void)
{
  int f2;
  off_t status;

  f2 = open(
    "sparse_create_test.dat",
    (O_CREAT | O_RDWR | O_TRUNC),
    (S_IRUSR | S_IWUSR)
  );
  /* took out O_WRONLY, but O_RDWR does not work any better */
  if (-1 eq f2) {
    fprintf(stderr, "file create error\n");
    return(DOOPS);
  }

  status = lseek(f2, 1073741824, SEEK_CUR); /* this is not working.  WHY? */
  if ((off_t) -1 eq status) {
    fprintf(stderr, "lseek error\n");
  }

  if (close(f2)) {
    fprintf(stderr, "error closing output file\n");
    exit(DOOPS);
  }
  exit(OKDOKEY);
}


/* actual end of this file */
The idea is that this program, when run, writes a sparse file that appears to be 1 gig of zeroes, yet takes up almost no room on the file system.

What actually happens is that a zero-length file is written. D'oh!

I'm running on an ext4 file system, which I know handles sparse files (I have used dd and fallocate to verify this). All my googling indicates that, in a C program, lseek is the *only* way to write a sparse "hole" to a file.

So what am I overlooking? I have no idea. Can anyone assist? Thank you.
 
Old 04-16-2017, 04:06 PM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
You never wrote anything at that offset. If you are not going to write something to the file, you need to call ftruncate(2) to set a size in the inode.
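
A minimal sketch of the ftruncate(2) approach described above. The helper name make_sparse_file and the file name in the usage note are made up for illustration; only open(), ftruncate() and close() are real calls. Whether the hole actually stays unallocated depends on the file system.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) `path` and set its logical
 * size to `size` bytes without writing any data.
 * Returns 0 on success, -1 on failure. */
int make_sparse_file(const char *path, off_t size)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    /* ftruncate() records the new size in the inode. On a file system
     * that supports sparse files, no data blocks are allocated. */
    if (ftruncate(fd, size) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling make_sparse_file("sparse_create_test.dat", 1073741824) should give the 1 gig file the original program was aiming for, assuming off_t is 64 bits wide on the build target.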
 
Old 04-18-2017, 01:32 PM   #3
jr_bob_dobbs
Member
 
Registered: Mar 2009
Distribution: Bedrock, Devuan, Slackware, Linux From Scratch, Void
Posts: 651

Original Poster
Blog Entries: 135

Rep: Reputation: 188
Quote:
Originally Posted by rknichols View Post
You never wrote anything at that offset. If you are not going to write something to the file, you need to call ftruncate(2) to set a size in the inode.
Oh, I see that now!

If I write even just *one* byte after the lseek, the file expands to the proper size. I mean, "ls -l" reports the "full" size and du the small space actually occupied by such a sparse file.
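
For reference, a sketch of the "write one byte" variant, plus a way to read the two numbers that "ls -l" and du report. Both helper names are hypothetical. The sketch seeks to size - 1 so that the final file is exactly `size` bytes, and it assumes st_blocks counts 512-byte units, which is the case on Linux.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make a file of `size` bytes (size >= 1) by seeking
 * to the last byte and writing a single zero byte there. */
int make_sparse_with_byte(const char *path, off_t size)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    /* The seek alone changes nothing on disk; the write is what extends
     * the file. "" is a one-byte array holding '\0'. */
    if (lseek(fd, size - 1, SEEK_SET) == (off_t)-1 ||
        write(fd, "", 1) != 1) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Hypothetical helper: report the logical size (what "ls -l" shows) and
 * the allocated bytes (roughly what du shows). */
int file_sizes(const char *path, off_t *logical, off_t *allocated)
{
    assert(path != NULL);

    struct stat st;
    if (stat(path, &st) == -1)
        return -1;
    *logical = st.st_size;
    *allocated = (off_t)st.st_blocks * 512; /* assumes 512-byte units */
    return 0;
}
```

On a file system with sparse-file support, `allocated` should come out far smaller than `logical` for a large hole.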

ftruncate is a more elegant solution, since it avoids writing that kludgy extra byte.

Thank you very much.

edit: confirmed. rewrote my program to use the ftruncate after the lseek. Works!

Last edited by jr_bob_dobbs; 04-18-2017 at 01:36 PM.
 
Old 04-18-2017, 06:37 PM   #4
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by jr_bob_dobbs View Post
edit: confirmed. rewrote my program to use the ftruncate after the lseek. Works!
You don't even need the lseek() there. The ftruncate() call takes the size as a parameter and does not change the current file offset.
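
A small sketch that checks this claim: it grows a fresh file with ftruncate() and then asks lseek() for the current offset without moving it. The helper name offset_after_ftruncate is an assumption for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: grow a new file to `size` bytes with ftruncate()
 * and return the file offset afterwards, or (off_t)-1 on error. */
off_t offset_after_ftruncate(const char *path, off_t size)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return (off_t)-1;

    if (ftruncate(fd, size) == -1) {
        close(fd);
        return (off_t)-1;
    }

    /* lseek(fd, 0, SEEK_CUR) reports the offset without changing it. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

The offset stays where it was after open(), at the start of the file, even though the file is now `size` bytes long.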
 
Old 04-19-2017, 01:00 PM   #5
jr_bob_dobbs
Member
 
Registered: Mar 2009
Distribution: Bedrock, Devuan, Slackware, Linux From Scratch, Void
Posts: 651

Original Poster
Blog Entries: 135

Rep: Reputation: 188
Quote:
Originally Posted by rknichols View Post
You don't even need the lseek() there. The ftruncate() call takes the size as a parameter and does not change the current file offset.
Oh, that is very good to know! Thank you.

For my eventual future project(s), however, I'll be mixing holes and regular data, so I will be needing lseek some of the time.
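
A rough sketch of what mixing holes and regular data could look like: data, a hole skipped with lseek(), more data, then a trailing hole set with ftruncate() because nothing is written after it. The helper name write_with_holes and the "head"/"tail" payloads are placeholders, not anything from the eventual project.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write 4 bytes, skip `hole` bytes, write 4 more
 * bytes, then leave a trailing hole of `hole` bytes.
 * Returns 0 on success, -1 on failure. */
int write_with_holes(const char *path, off_t hole)
{
    assert(path != NULL);

    const char head[4] = { 'h', 'e', 'a', 'd' };
    const char tail[4] = { 't', 'a', 'i', 'l' };

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    if (write(fd, head, sizeof head) != (ssize_t)sizeof head)
        goto fail;

    /* Interior hole: seek past it, and the next write makes it real. */
    if (lseek(fd, hole, SEEK_CUR) == (off_t)-1)
        goto fail;
    if (write(fd, tail, sizeof tail) != (ssize_t)sizeof tail)
        goto fail;

    /* Trailing hole: no later write will extend the file, so set the
     * final size explicitly. */
    off_t end = lseek(fd, 0, SEEK_CUR);
    if (end == (off_t)-1 || ftruncate(fd, end + hole) == -1)
        goto fail;

    return close(fd);

fail:
    close(fd);
    return -1;
}
```

The resulting logical size is 8 + 2 * hole bytes; only the two small data chunks need real blocks on a file system that supports sparse files.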
 
Old 04-20-2017, 08:19 AM   #6
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3941
Although "sparse files" are tempting, you should be mindful of just how the underlying filesystem implements them. They might not do so efficiently. Not at all.

It's very hard to beat an SQLite database file for many such situations. (Just be sure to use transactions, so that SQLite will do "lazy writes.")
 
Old 04-26-2017, 05:17 PM   #7
jr_bob_dobbs
Member
 
Registered: Mar 2009
Distribution: Bedrock, Devuan, Slackware, Linux From Scratch, Void
Posts: 651

Original Poster
Blog Entries: 135

Rep: Reputation: 188
Yeah, fragmentation when the hole gets (partially) filled later on. Still, an archiver ought to know about sparse files so that some zero k sparse file doesn't balloon out to 100 gig on restore.
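
As an illustration of how a sparse-aware tool might find the data regions, here is a sketch using the SEEK_DATA and SEEK_HOLE extensions to lseek(). These are Linux-specific (they need _GNU_SOURCE with glibc), and the helper name count_data_bytes is hypothetical. On a file system without hole reporting, the whole file is simply treated as one data region.

```c
#define _GNU_SOURCE /* for SEEK_DATA and SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the number of bytes that lie in data
 * regions of `path` (holes excluded), or (off_t)-1 on error. */
off_t count_data_bytes(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return (off_t)-1;

    off_t end = lseek(fd, 0, SEEK_END);
    if (end == (off_t)-1)
        goto fail;

    off_t total = 0;
    off_t pos = 0;
    while (pos < end) {
        /* Next offset >= pos that contains data. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data == (off_t)-1) {
            if (errno == ENXIO)
                break; /* only a hole remains up to end of file */
            goto fail;
        }
        /* End of that data region; end of file counts as a hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole == (off_t)-1)
            goto fail;
        total += hole - data;
        pos = hole;
    }

    close(fd);
    return total;

fail:
    close(fd);
    return (off_t)-1;
}
```

An archiver built along these lines would copy only the [data, hole) ranges and recreate the gaps with lseek() or ftruncate() on restore. Note that the reported ranges are usually rounded to the file system's block size.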
 
Old 04-26-2017, 08:49 PM   #8
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3941
Quote:
Originally Posted by jr_bob_dobbs View Post
Yeah, fragmentation when the hole gets (partially) filled later on. Still, an archiver ought to know about sparse files so that some zero k sparse file doesn't balloon out to 100 gig on restore.
Don't make any such assumption about the behavior of an archiver.

Seriously, I would encourage you to reconsider the technical wisdom of "deliberately-sparse files." That smacks of a use-case that would be better served by some kind of database or indexed-file structure. I would form my argument as follows:
  1. It is, in fact, the key space that is "sparse."
  2. Key distribution should have no bearing on the physical layout. On physical disks, tying the two together will greatly increase seek time (the slowest operation an HDA (Head/Disk Assembly) can do ...) and destroy the usefulness of caching.
  3. There is "a data structure": the data-structures of the underlying file system. But, these data structures are designed only to store files. Although provisions are made for sparse files, this is not their design focus.
  4. We are no longer in the days of mainframe MVS®-yore, where when we "allocated a dataset" we got a physically contiguous block of "DASD cylinders" that we knew would be adjacent. We really don't know – and, can't control – where the "sparse" records actually are.
  5. Thus, the design could be severely impacted by the differences between the conceptual view of the arrangement, and the possibly-entirely-different reality.
  6. By contrast, very well-known indexed-file structures ... even VSAM (aka "NoSQL") ... are engineered for this use-case. The SQLite project put this idea "on steroids," provided that you remember to use transactions to get lazy writes. (The programmers on that project are Wizards.)
#undef SOAPBOX

Last edited by sundialsvcs; 04-27-2017 at 10:15 AM.
 
Old 04-26-2017, 09:37 PM   #9
Laserbeak
Member
 
Registered: Jan 2017
Location: Manhattan, NYC NY
Distribution: Mac OS X, iOS, Solaris
Posts: 508

Rep: Reputation: 143
What's with this?

Code:
#define neq !=
#define eq ==
You confused me, I didn't know if I was looking at C or a shell script for a minute there...
 
  

