LinuxQuestions.org
Old 08-13-2020, 01:31 PM   #1
trist007
Senior Member
 
Registered: May 2008
Distribution: Slackware
Posts: 1,052

Rep: Reputation: 70
A question about extracting a zip archive in C...


I am using the libzip library.

I just want to extract the contents into a directory. I have no problem opening the archive, getting the number of files in it, looping over the entries to stat each one by index, opening a file with the name that corresponds to that index, running a read/write loop, then closing the file and the zip archive. However, some of the filenames I get from zip_stat_index are prefixed with a directory. It would have been nice if each directory were listed on its own as a separate index, but it is not. I don't see any option in the open system call to create one or more parent directories when the file being opened/created sits under directories that don't exist yet.

Here is my code so far

Code:
#include <zip.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>


#define BUFSIZE 4096

int main()
{

  int i = 0;
  int x = 0;
  zip_int64_t n = 0;
  int file = 0;
  struct zip_file *zipfile;   /* handle returned by zip_fopen_index(), not an integer */
  char buf[BUFSIZE];
  struct zip *z;
  struct zip_stat st;

  int err = 0;
  z = zip_open("/usr/share/httpd/mapcrafter/worlds/Freehold/freehold.zip", 0, &err);
  if (z == NULL) {
    fprintf(stderr, "zip_open failed with error code %d\n", err);
    return 1;
  }
  zip_stat_init(&st);

  /* zip_get_num_files() is deprecated in favor of zip_get_num_entries() */
  x = (int)zip_get_num_entries(z, 0);

  printf("There are %i number of files in this archive.\n", x);
  printf("Here is a list of the files in the archive.\n");

  for (i = 0; i < x; i++) {
    zip_stat_index(z, i, 0, &st);
    /* open() fails here when st.name has a parent directory that does not exist yet */
    file = open(st.name, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    zipfile = zip_fopen_index(z, i, 0);

    /* archive entries are read with zip_fread(), not read(2) */
    while (zipfile != NULL && (n = zip_fread(zipfile, buf, sizeof buf)) > 0)
      write(file, buf, (size_t)n);

    close(file);
    if (zipfile != NULL)
      zip_fclose(zipfile);


    printf("Copied %s...\n", st.name);

  }

  zip_close(z);
  return 0;

}
Here is the output

Code:
There are 222 number of files in this archive.
Here is a list of the files in the archive.
Copied stats/a82d1496-1f6f-4df6-9f49-33b48c1e69b2.json...
Copied stats/eb231265-254a-41f2-97d8-b479b24ddaf7.json...
Copied icon.png...
Copied level.dat...
Copied level.dat_old...
Copied session.lock...
Copied advancements/a82d1496-1f6f-4df6-9f49-33b48c1e69b2.json...
Copied advancements/eb231265-254a-41f2-97d8-b479b24ddaf7.json...
Copied data/advancements/...
Copied data/EndCity.dat...
Copied data/Fortress.dat...
Copied data/functions/...
Copied data/Mineshaft.dat...
Copied data/Monument.dat...
Copied data/Stronghold.dat...
Copied data/Temple.dat...
Copied data/Village.dat...
Copied data/villages.dat...
Copied data/villages_end.dat...
Copied data/villages_nether.dat...
Copied DIM1/region/...
Copied DIM1/region/r.0.0.mca...
Copied DIM1/region/r.0.1.mca...
Copied DIM1/region/r.0.-1.mca...
Copied DIM1/region/r.0.2.mca...
Copied DIM1/region/r.0.-2.mca...
Copied DIM1/region/r.1.0.mca...
Copied DIM1/region/r.-1.0.mca...
Copied DIM1/region/r.1.1.mca...
Copied DIM1/region/r.1.-1.mca...
Copied DIM1/region/r.-1.1.mca...
Copied DIM1/region/r.-1.-1.mca...
Copied DIM1/region/r.1.2.mca...
Copied DIM1/region/r.1.-2.mca...
Copied DIM1/region/r.-1.-2.mca...
Copied DIM1/region/r.2.0.mca...
Copied DIM1/region/r.-2.0.mca...
Copied DIM1/region/r.2.1.mca...
Copied DIM1/region/r.-2.-1.mca...
Copied DIM1/region/r.2.2.mca...
Copied DIM1/region/r.-2.2.mca...
Copied DIM1/region/r.-2.-2.mca...
Copied DIM1/region/r.3.0.mca...
Copied DIM1/region/r.3.1.mca...
Copied DIM1/region/r.3.-1.mca...
Copied DIM1/region/r.3.2.mca...
Copied DIM1/region/r.3.-2.mca...
Copied DIM1/region/r.-3.2.mca...
Copied DIM1/region/r.3.-3.mca...
Copied DIM1/region/r.4.0.mca...
Copied DIM1/region/r.4.1.mca...
Copied DIM1/region/r.4.-1.mca...
Copied DIM1/region/r.4.2.mca...
Copied DIM1/region/r.4.-2.mca...
Copied DIM1/region/r.4.3.mca...
Copied DIM1/region/r.4.-3.mca...
Copied DIM1/region/r.5.0.mca...
Copied DIM1/region/r.5.1.mca...
Copied DIM1/region/r.5.-1.mca...
Copied DIM1/region/r.5.2.mca...
Copied DIM1/region/r.5.-2.mca...
Copied DIM1/region/r.5.3.mca...
Copied DIM1/region/r.5.-3.mca...
Copied DIM-1/region/...
Copied DIM-1/region/r.0.0.mca...
Copied DIM-1/region/r.0.-1.mca...
Copied DIM-1/region/r.0.-2.mca...
Copied DIM-1/region/r.1.0.mca...
Copied DIM-1/region/r.-1.0.mca...
Copied DIM-1/region/r.1.-1.mca...
Copied DIM-1/region/r.-1.-1.mca...
Copied DIM-1/region/r.1.-2.mca...
Copied DIM-1/region/r.-1.-2.mca...
Copied DIM-1/region/r.-2.-1.mca...
Copied playerdata/a82d1496-1f6f-4df6-9f49-33b48c1e69b2.dat...
Copied playerdata/eb231265-254a-41f2-97d8-b479b24ddaf7.dat...
Copied region/r.0.0.mca...
Copied region/r.0.1.mca...
Copied region/r.0.-1.mca...
Copied region/r.0.2.mca...
Copied region/r.0.-2.mca...
Copied region/r.0.3.mca...
Copied region/r.0.-3.mca...
Copied region/r.0.4.mca...
Copied region/r.0.-4.mca...
Copied region/r.0.5.mca...
Copied region/r.0.-5.mca...
Copied region/r.0.6.mca...
Copied region/r.0.-6.mca...
Copied region/r.0.7.mca...
Copied region/r.0.-7.mca...
Copied region/r.1.0.mca...
Copied region/r.-1.0.mca...
Copied region/r.1.1.mca...
Copied region/r.1.-1.mca...
Copied region/r.-1.1.mca...
Copied region/r.-1.-1.mca...
Copied region/r.1.2.mca...
Copied region/r.1.-2.mca...
Copied region/r.-1.2.mca...
Copied region/r.-1.-2.mca...
Copied region/r.1.3.mca...
Copied region/r.1.-3.mca...
Copied region/r.-1.3.mca...
Copied region/r.-1.-3.mca...
Copied region/r.1.4.mca...
Copied region/r.1.-4.mca...
Copied region/r.-1.4.mca...
Copied region/r.-1.-4.mca...
Copied region/r.1.5.mca...
Copied region/r.1.-5.mca...
Copied region/r.-1.5.mca...
Copied region/r.-1.-5.mca...
Copied region/r.1.6.mca...
Copied region/r.1.-6.mca...
Copied region/r.-1.6.mca...
Copied region/r.-1.-6.mca...
Copied region/r.1.7.mca...
Copied region/r.1.-7.mca...
Copied region/r.1.8.mca...
Copied region/r.-10.1.mca...
Copied region/r.-10.2.mca...
Copied region/r.2.0.mca...
Copied region/r.-2.0.mca...
Copied region/r.2.1.mca...
Copied region/r.2.-1.mca...
Copied region/r.-2.1.mca...
Copied region/r.-2.-1.mca...
Copied region/r.2.2.mca...
Copied region/r.2.-2.mca...
Copied region/r.-2.2.mca...
Copied region/r.-2.-2.mca...
Copied region/r.2.3.mca...
Copied region/r.2.-3.mca...
Copied region/r.-2.3.mca...
Copied region/r.-2.-3.mca...
Copied region/r.2.4.mca...
Copied region/r.2.-4.mca...
Copied region/r.-2.-4.mca...
Copied region/r.2.5.mca...
Copied region/r.2.-5.mca...
Copied region/r.-2.-5.mca...
Copied region/r.2.6.mca...
Copied region/r.2.-6.mca...
Copied region/r.-2.-6.mca...
Copied region/r.2.7.mca...
Copied region/r.2.8.mca...
Copied region/r.3.0.mca...
Copied region/r.-3.0.mca...
Copied region/r.3.1.mca...
Copied region/r.3.-1.mca...
Copied region/r.-3.1.mca...
Copied region/r.-3.-1.mca...
Copied region/r.3.2.mca...
Copied region/r.3.-2.mca...
Copied region/r.-3.2.mca...
Copied region/r.-3.-2.mca...
Copied region/r.3.3.mca...
Copied region/r.3.-3.mca...
Copied region/r.-3.3.mca...
Copied region/r.-3.-3.mca...
Copied region/r.3.4.mca...
Copied region/r.3.-4.mca...
Copied region/r.-3.-4.mca...
Copied region/r.3.5.mca...
Copied region/r.3.-5.mca...
Copied region/r.-3.-5.mca...
Copied region/r.3.6.mca...
Copied region/r.3.-6.mca...
Copied region/r.3.7.mca...
Copied region/r.4.0.mca...
Copied region/r.-4.0.mca...
Copied region/r.4.1.mca...
Copied region/r.4.-1.mca...
Copied region/r.-4.1.mca...
Copied region/r.-4.-1.mca...
Copied region/r.4.2.mca...
Copied region/r.4.-2.mca...
Copied region/r.-4.2.mca...
Copied region/r.4.-3.mca...
Copied region/r.4.4.mca...
Copied region/r.4.-4.mca...
Copied region/r.4.5.mca...
Copied region/r.4.-5.mca...
Copied region/r.4.6.mca...
Copied region/r.4.-6.mca...
Copied region/r.4.7.mca...
Copied region/r.4.-7.mca...
Copied region/r.-5.0.mca...
Copied region/r.5.1.mca...
Copied region/r.-5.1.mca...
Copied region/r.-5.-1.mca...
Copied region/r.5.2.mca...
Copied region/r.-5.2.mca...
Copied region/r.5.3.mca...
Copied region/r.5.-4.mca...
Copied region/r.5.-5.mca...
Copied region/r.5.-6.mca...
Copied region/r.5.-7.mca...
Copied region/r.-6.0.mca...
Copied region/r.6.1.mca...
Copied region/r.-6.1.mca...
Copied region/r.-6.-1.mca...
Copied region/r.6.2.mca...
Copied region/r.-6.2.mca...
Copied region/r.6.3.mca...
Copied region/r.6.-4.mca...
Copied region/r.6.-5.mca...
Copied region/r.6.-6.mca...
Copied region/r.-7.0.mca...
Copied region/r.7.1.mca...
Copied region/r.-7.1.mca...
Copied region/r.-7.-1.mca...
Copied region/r.7.2.mca...
Copied region/r.-7.2.mca...
Copied region/r.7.-5.mca...
Copied region/r.7.-6.mca...
Copied region/r.-8.0.mca...
Copied region/r.-8.1.mca...
Copied region/r.-8.2.mca...
Copied region/r.-9.1.mca...
Copied region/r.-9.2.mca...
I mean do I really have to stat each index and scan for "/" to see if that index contains a directory? There must be an easier way. Here are the docs for the libzip library. Thank you!!

https://libzip.org/documentation/

-Tristan

Last edited by trist007; 08-13-2020 at 01:32 PM.
 
Old 08-13-2020, 02:12 PM   #2
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,883
Blog Entries: 13

Rep: Reputation: 4931
Quote:
Originally Posted by trist007 View Post
the issue is that some filenames that I get from the zip_stat_index are prepended with a directory. It would have been nice if the directory was listed on its own as a separate index but it is not.
Code:
There are 222 number of files in this archive.
Here is a list of the files in the archive.
Copied DIM1/region/...
Copied DIM1/region/r.0.0.mca...
Copied DIM1/region/r.0.1.mca...
Copied DIM1/region/r.0.-1.mca...
Copied DIM1/region/r.0.2.mca...
I haven't checked the whole output exhaustively, but I was thinking that what you said, "It would have been nice if the directory was listed on its own as a separate index", is exactly what things like zip and tar do.

Isn't DIM1/region/ a directory? If so, it is listed there.

Added: OK, a closer look shows that this is not universal. For instance, the listing doesn't show DIM1/ on its own, nor several other directories.

Does a command-line listing of this archive show the same 222 entries? Beyond that, I don't understand it any better than you do.
 
Old 08-13-2020, 02:17 PM   #3
trist007
Senior Member
 
Registered: May 2008
Distribution: Slackware
Posts: 1,052

Original Poster
Rep: Reputation: 70
Yea you're right, I unzipped it and I got 230 entries instead of 222. Ok thanks.

-Tristan
 
Old 08-13-2020, 02:34 PM   #4
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,883
Blog Entries: 13

Rep: Reputation: 4931
Quote:
Originally Posted by trist007 View Post
Yea you're right, I unzipped it and I got 230 entries instead of 222. Ok thanks.

-Tristan
Wow! What's up with that library then?!? I mean ... is it NOT the same code being used by the command line? Or nearly/very close? Maybe check that out. I can see that there would be several versions, or potentially a different library...
 
Old 08-13-2020, 10:50 PM   #5
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,881
Blog Entries: 1

Rep: Reputation: 1871
You should create a `pathopen` that splits filenames and creates the missing directories. While doing this, you should check for '..' as a directory name, e.g. `src/../../../../etc/passwd`.

Note: opening an existing file can be problematic (e.g. symlink attacks or text-file-busy problems); instead, you could create the file under a temporary name in the same directory and then `rename` it to the final name.
 
Old 08-14-2020, 03:56 AM   #6
mina86
Member
 
Registered: Aug 2008
Distribution: Debian
Posts: 517

Rep: Reputation: 229
Quote:
Originally Posted by trist007 View Post
I mean do I really have to stat each index and scan for "/" to see if that index contains a directory?
Yes. Usually there will be a separate entry for the directory in a ZIP archive (and there was in your case; it was the entry ending in a slash), but in general that’s not guaranteed. If you want your code to be robust, you should split the name and check for all the directories on the path.
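For the entries that do describe a directory, a check along these lines could be used. The name `is_dir_entry` is made up for this sketch; it relies only on the trailing-slash convention mentioned above.

```c
#include <string.h>

/* Return 1 if a ZIP entry name denotes a directory (it ends in '/'), else 0. */
static int is_dir_entry(const char *name)
{
  size_t len = strlen(name);

  return len > 0 && name[len - 1] == '/';
}
```

In the extraction loop this would be called on `st.name` right after `zip_stat_index`. A directory entry such as `DIM1/region/` would then be created with `mkdir` and the loop would `continue`, instead of trying to open it as a regular file.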

Quote:
Originally Posted by NevemTeve View Post
Note: opening an existing file can be problematic (e.g. symlink attack or text-file-busy problems)
It is worth noting that a symlink attack may be performed with the leaf file name as well as with a directory on the path. E.g. one could craft a ZIP archive with ‘foo’ being a symbolic link to ‘/etc/passwd’ and then another entry which overwrites the contents of ‘foo’, but one could also craft a ZIP archive with ‘foo’ being a symbolic link to ‘/etc’ and then have another entry ‘foo/passwd’ which overwrites the contents of ‘passwd’.

To combat that, a good extracting tool should verify that whatever it creates is under the directory it was told to extract the archive into.

Yes, security is hard.
 
  

