LinuxQuestions.org


DaneM 11-09-2009 06:37 PM

Looking to write a backup utility, need tips on what language
 
Hello. I am a novice programmer, having dabbled in C/C++ and BASH scripting. I know very little C/C++, but have gotten pretty far with BASH scripting.

I am looking to make a backup/cloning solution similar in functionality to Acronis True Image Home. I know that there are similar things available already, including (but not limited to) G4U/G4L, Clonezilla, Partimage, ntfsclone, etc., but I have found that none of these do what I want them to, which is this:

I want to be able to efficiently create an archive of a filesystem that is efficiently compressed, fast to decompress, and (here's the crucial part) allows the user to pull out single files from anywhere in the archive without having to read the entire archive's compressed data.

Here's what I plan to do (the gist of it):

1) Copy the filesystem information, such as NTFS indexes, tables, etc., hopefully by using a pre-existing library, such as libntfs3g or some such (still need to learn how to integrate 3rd party libraries with my program...).

2) Copy the actual data (files, folders, etc.), one directory at a time, starting at the ends of the directory tree (like C:\rootleveldir\anotherdir\endofdirectorytree\), and compressing each directory into its own archive. This will recurse backward from the ends of the tree, until it reaches the root directory. At this point, many compressed archives will exist on the backup media.

3) Archive all of the compressed data and the filesystem info, arranged in its original structure, into one non-compressed archive (for faster retrieval).

4) When the archive needs to be opened, it should only have to open the uncompressed archive file, and then delve into directories as needed, uncompressing them on the fly. I intend to use a compression method (not sure which one yet) that will be accessible through standard Linux/Windows GUI archiving utilities, such as WinRAR and File-Roller.
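Steps 2 through 4 above can be sketched in shell roughly as follows. This is only an illustration under assumptions, not a finished tool: it assumes GNU tar and find, skips step 1 (filesystem metadata) entirely, picks gzip arbitrarily, and the names backup_tree, restore_one, parts/, files.tar.gz and backup.tar are all made up for the example. It also breaks on file or directory names that contain newlines.

```shell
#!/bin/sh
# Sketch only: one gzip'd tar per directory (step 2), wrapped in one
# uncompressed tar (step 3). All names here are hypothetical.
set -eu

# backup_tree SRC_DIR OUT_DIR
backup_tree() {
    mkdir -p "$2/parts"
    src=$(cd "$1" && pwd)    # absolute paths, since we cd around below
    out=$(cd "$2" && pwd)
    # find -depth lists the deepest directories first, as in step 2.
    ( cd "$src" && find . -depth -type d ) | while IFS= read -r dir; do
        mkdir -p "$out/parts/$dir"
        # Archive only the regular files directly inside this directory;
        # subdirectories get their own archives.
        ( cd "$src/$dir" &&
          find . -maxdepth 1 -type f |
          tar -czf "$out/parts/$dir/files.tar.gz" -T - )
    done
    # Step 3: wrap the per-directory archives in one uncompressed tar.
    tar -cf "$out/backup.tar" -C "$out/parts" .
}

# restore_one OUTER_TAR MEMBER FILE  (step 4): prints FILE to stdout.
# Only the one small inner archive is decompressed.
restore_one() {
    tar -xOf "$1" "$2" | tar -xzOf - "$3"
}
```

For example, `backup_tree /mnt/windows /mnt/backup` followed by `restore_one /mnt/backup/backup.tar ./Users/files.tar.gz ./notes.txt > notes.txt` (paths hypothetical). tar still has to walk the outer archive's headers to find the member, but it does not decompress anything along the way.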

***********
My main question, therefore, is which language will most likely allow me to accomplish this with minimal time and headache? I'm open to learning pretty much any language that will get this done (as long as I don't have to deal with C-style strings...).
***********

Thanks for your input.

--Dane

smeezekitty 11-10-2009 01:07 PM

for backups - bash is a fairly good choice.

DaneM 11-10-2009 01:42 PM

Thanks for the reply, smeezekitty.

I tried doing it with BASH, but found that the command-line tools available (tar, etc.) were extremely slow, and ran into complications trying to get the syntax of all these commands to jibe with each other (piping and such).

Any other thoughts?

tuxdev 11-10-2009 01:59 PM

What is this actually *for*? A lot of people use backups when they really should use version control + a decent RAID setup. One of the more valid reasons to actually do a backup is as a final failsafe recovery that's stored offsite. For that, a straight compressed tarball works quite fine.

DaneM 11-10-2009 04:50 PM

I'm a computer technician (repairs and such), and before I wipe a hard drive or do any major work to a system, I back it up using Acronis or Clonezilla, so that if the customer needs data restored, I will be able to do so. I have experimented with making 100GB+ tarballs, and they take forever to create, forever to load the contents of (see them in the GUI), and even longer to decompress. For this reason, I intend to make a compressed file that will not take so long to make, read, or decompress. Making it quickly, however, is secondary to the other two.

So, after having done the work on the system, I restore the data that the customer SAID he/she needed restored. The customer takes his/her computer, and later comes back because some data from another part of the filesystem is missing. I don't want to wait 15min+ to read the contents of a tarball using any of the 5 or more decompression tools (GUI) I have tried; I just want to get that one missing file and be done with it.

Acronis does this superbly, but is a Windows only app, unless you want to pay a LOT of money for the Linux server version. (No "desktop" version exists for Linux.) I want to make an open-source program that will run off of a live CD or pen drive (like booting up an Ubuntu disk and running/installing the program from pen drive into RAM), will work on an installed system, and will perform the above tasks (in Linux) without costing me a lot of money.

Any suggestions?

bigearsbilly 11-10-2009 05:43 PM

what you could do is...
backup to a filesystem.

you can mkisofs your directory tree to an ISO CD image.

or use squashfs

then these can be mounted and browsed seamlessly like you would a CD or usb stick.
look at puppy linux for clever use of squashfs.
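A minimal sketch of the two image approaches, assuming squashfs-tools and mkisofs (packaged as genisoimage on some distributions) are installed. The function names and any paths passed to them are hypothetical, and mounting needs root.

```shell
#!/bin/sh
# Sketch only: wrappers around the real tools; names are hypothetical.

# make_squashfs SRC_DIR IMAGE: build a compressed, read-only image.
make_squashfs() { mksquashfs "$1" "$2"; }

# make_iso SRC_DIR IMAGE: -r adds Rock Ridge, -J adds Joliet names
# (Joliet is what lets Windows show long file names).
make_iso() { mkisofs -r -J -o "$2" "$1"; }

# mount_image IMAGE MOUNTPOINT: loop-mount the image read-only.
mount_image() { mount -o loop,ro "$1" "$2"; }
```

Something like `make_squashfs /mnt/windows customer.sqfs` and later `mount_image customer.sqfs /mnt/backup` would then let single files be copied out with an ordinary file manager.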

ISO could of course be seen on windows though.

you could even use puppy linux (a live CD) for this because it
automatically mounts squashfs for you and can read ntfs.

I even used it to fix an NT partition that windows couldn't fix!

catkin 11-11-2009 08:12 AM

Have you investigated using dar instead of tar? From that page: "Even using compression dar has not to read the whole backup to extract one file. This way if you just want to restore one file from a huge backup, the process will be much faster than using tar. Dar first reads the catalogue (i.e. the contents of the backup), then it goes directly to the location of the saved file(s) you want to restore and proceed to restoration".
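As a rough sketch of what that looks like on the command line, assuming dar is installed: -c, -l, -x, -R, -g and -z are dar options, while the function names and any paths passed to them are hypothetical.

```shell
#!/bin/sh
# Sketch only: thin wrappers around dar; names are hypothetical.

# dar_create SRC_ROOT BASENAME: back up SRC_ROOT with compression (-z).
# dar writes the first slice as BASENAME.1.dar.
dar_create() { dar -c "$2" -R "$1" -z; }

# dar_list BASENAME: show the catalogue (contents) of the archive.
dar_list() { dar -l "$1"; }

# dar_restore_one BASENAME DEST_ROOT REL_PATH: restore a single path.
# REL_PATH is relative to the root that was backed up.
dar_restore_one() { dar -x "$1" -R "$2" -g "$3"; }
```

For instance, `dar_create /mnt/windows /mnt/backup/customer` and later `dar_restore_one /mnt/backup/customer /tmp/restore "Users/someone/notes.txt"`.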

archtoad6 11-15-2009 10:39 AM

Quote:

Originally Posted by DaneM (Post 3752144)
I'm a computer technician ...

Acronis does this superbly, but is a Windows only app, unless you want to pay a LOT of money for the Linux server version. (No "desktop" version exists for Linux.) ...

As a fellow professional tech., I sympathize. A less sympathetic fellow pro. recently said that if I am serious about my work, I should just buy Ghost or Acronis. This was in the context of trying to use Clonezilla to do a bare-metal restore to different h/w than the back-up was made from. -- As in the case of h/w failure, rather than the more common catastrophic malware infection.

Have you tried bigearsbilly or catkin's suggestions?

DaneM 11-15-2009 03:56 PM

Thanks for the suggestions, bigearsbilly and catkin!

I have been running tests with squashfs to see how well it performs the above procedures. I will try dar next; I've never used it before, and you've gotten my hopes up. ;-)

I'll post my findings. Once I get this to work, I would like to make some kind of a front-end for it so the technician using it would not have to type in a bunch of commands to make it happen.

'Till next post...

--Dane

ta0kira 11-15-2009 04:11 PM

Are you backing up file-system properties (i.e. system type, geometry, journals, etc.)? Just curious how you're doing it, if so. Since you're a computer tech, chances are you're only dealing with a few very specific file systems, I guess.
Kevin Barry

DaneM 11-27-2009 12:04 AM

Making Progress
 
Hello, and thanks again to all those who replied.

I have been investigating dar and the kdar gui, and found that they will do very nearly everything that I want to do. (squashfs is very cool, but turned out to be way too slow for my tastes.) I have created a Slax bootable CD and pen drive (.iso and .tar, respectively), and am at this moment attempting to upload them here:

http://www.4shared.com/dir/24438317/...p_Utility.html

If anybody knows of a better hosting option for 200+MB files, totaling about 1.2GB, please let me know.

Feel free to download and test them if/when they finally finish uploading. You will need to use the 'cat' tool as described in the readme.txt file to put the split files back together again. The packages I made and the original, unmodified Slax images, tarballs, etc. are there for reference as well.
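The readme itself is not reproduced here, so the following is only a generic sketch of splitting a large file and rejoining it with cat. The ".part-" suffix and the function names are assumptions, not the names used in the actual upload.

```shell
#!/bin/sh
# Sketch only: piece naming is hypothetical.

# split_image FILE SIZE: cut FILE into pieces of SIZE bytes each,
# named FILE.part-aa, FILE.part-ab, ...
split_image() { split -b "$2" "$1" "$1.part-"; }

# join_image FILE OUTPUT: the shell expands the glob in sorted order,
# so the pieces are concatenated in the order split created them.
join_image() { cat "$1".part-* > "$2"; }
```

After joining, `cmp original rejoined` (or comparing checksums) confirms that nothing was lost in transit.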

ta0kira, you bring up a valid point that I'd like to address with regard to filesystem information. I don't know how to back up the filesystem info, but would definitely like to know if anybody can give assistance in that regard. It would be nice to be able to restore the filesystem, partition table, etc. EXACTLY as it was, in case a customer decides half-way through that he/she doesn't want any work done, after all. (This has happened to me before, and I was very grateful that Acronis could do this.) I already know how to get the partition information and MBR like so (if I remember correctly):

dd if=/dev/sda of=mbr.iso bs=512 count=1

This grabs the first 512-byte sector, which holds both the boot code and the partition table. I think it's possible to just get one or the other (perhaps useful for keeping the boot code while resizing, or some such), but I don't know that procedure.
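One possible way to separate the two, assuming a classic MBR disk (not GPT): the first 446 bytes of the sector are boot code, the next 64 bytes are the four primary partition entries, and the last 2 bytes are the 0x55AA signature. The function and file names below are hypothetical.

```shell
#!/bin/sh
# Sketch only, for classic MBR disks. Names are hypothetical.

# save_bootcode DEVICE FILE: bytes 0-445 (boot code only).
save_bootcode() { dd if="$1" of="$2" bs=446 count=1; }

# save_ptable DEVICE FILE: bytes 446-509 (four primary entries).
save_ptable() { dd if="$1" of="$2" bs=1 skip=446 count=64; }

# restore_bootcode FILE DEVICE: write the boot code back and leave the
# partition table untouched. conv=notrunc matters when DEVICE is an
# image file rather than a real disk.
restore_bootcode() { dd if="$1" of="$2" bs=446 count=1 conv=notrunc; }
```

Logical partitions inside an extended partition are described elsewhere on the disk, so they are not captured by these 64 bytes; `sfdisk -d /dev/sda > table.txt` dumps the whole layout as text instead.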

As ta0kira mentioned, I really have only been dealing with NTFS, but I would like to be able to save/restore the filesystem information on ext2/3/4 as well.

Comments welcome!

--Dane

P.S. If anybody knows of a Windows GUI tool for manipulating, or at least reading a dar archive, that would be invaluable for restoring data without having to boot onto a live CD. I know that there's a command line dar tool, but that doesn't quite cut it for my purposes.

P.P.S. I have found a way to read .dar files from Windows via a GUI (Total Commander with a plugin). It's included in the link given.


All times are GMT -5.