LinuxQuestions.org
01-12-2016, 01:28 AM   #1
rblampain
Senior Member
 
Registered: Aug 2004
Location: Western Australia
Distribution: Debian 11
Posts: 1,288

Rep: Reputation: 52
duplicate file identification


As I understand it, if I copy "file_1" to "file_2", the OS does not create a second file but gives two references (names) to the same file, so that "/maindir/dir1/this_file" and "~other_dir/that_file" point to one and the same file. Only when one of the files becomes different from the other does the OS create a second file.

Is there a Linux command to find when that is the case?

Thank you for your help.
 
01-12-2016, 01:35 AM   #2
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,221

Rep: Reputation: 5319
Quote:
Originally Posted by rblampain
As I understand it, if I copy "file_1" to "file_2", the OS does not create a second file but gives two references (names) to the same file, so that "/maindir/dir1/this_file" and "~other_dir/that_file" point to one and the same file. Only when one of the files becomes different from the other does the OS create a second file.
Well, that's obviously not the case. But...

Remember that *nix has a technical term for "pointing to the same file", and that term is "hard link".

How to tell if two files are hardlinked? I googled it, and this was a prominent hit:

http://unix.stackexchange.com/a/24139
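
Independently of the linked answer, here is a minimal sketch of one way to make that check, assuming GNU coreutils `stat`. Two names are hard links to the same file when both their device number and their inode number match. The `same_file` helper and the file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if both paths exist and refer to the
# same device and inode (symlinks are followed because of -L).
same_file() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(stat -L -c '%d:%i' "$1")" = "$(stat -L -c '%d:%i' "$2")" ]
}

# Demo in a scratch directory with made-up names.
demo=$(mktemp -d)
echo "hello" > "$demo/file_1"
cp "$demo/file_1" "$demo/copy"       # independent file, same content
ln "$demo/file_1" "$demo/hardlink"   # second name for the same inode

same_file "$demo/file_1" "$demo/hardlink" && echo "hardlink: same file"
same_file "$demo/file_1" "$demo/copy" || echo "copy: different file"
# Remove the scratch directory with: rm -rf "$demo"
```

In bash, `[ file_1 -ef file_2 ]` performs the same device-and-inode comparison without calling `stat`.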

Last edited by dugan; 01-12-2016 at 01:47 AM.
 
01-12-2016, 02:55 AM   #3
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: Mint, Armbian, NetBSD, Puppy, Raspbian
Posts: 3,515

Rep: Reputation: 239
If you do
Code:
ln this that
as opposed to
Code:
ln -s this that
you create another link (name) in the directory for that same file. This is pretty rarely used.
You can find files that have multiple hard links with:

Code:
find . -type f ! -links 1
and find every name that refers to the same file as "this" with:

Code:
find . -samefile this
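
To see what those two commands report, here is a small self-contained sketch, assuming GNU `find`. The scratch directory and the extra names `other` and `shortcut` are made up for the demo.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1

echo "data" > this
ln this that          # hard link: second name, same inode
cp this other         # ordinary copy: its own inode, link count 1
ln -s this shortcut   # symbolic link: a separate small file holding a path

# Regular files whose link count is not 1
multi=$(find . -type f ! -links 1 | sort)
echo "$multi"

# Every name that refers to the same inode as "this"
same=$(find . -samefile this | sort)
echo "$same"
```

Both commands should list only `./that` and `./this`. The copy has a link count of 1, and the symbolic link is a different file type with its own inode, so neither shows up.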
 
01-12-2016, 03:18 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,125

Rep: Reputation: 4120
Let's not forget copy-on-write and de-dup filesystems. Technically that is multiple files comprised of the same data. Probably not what the OP is asking, but possibly still sufficiently relevant to muddy the waters.
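
As a rough illustration of the copy-on-write case, assuming GNU coreutils `cp` and `stat`: `cp --reflink=auto` asks for a copy that shares data blocks where the filesystem supports it (Btrfs, for example) and falls back to an ordinary copy elsewhere. Either way the result is two separate files with two inodes, so hard-link checks will not find them. File names are made up.

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'original\n' > "$demo/file_1"

# Share data blocks if the filesystem can, otherwise copy normally.
cp --reflink=auto "$demo/file_1" "$demo/file_2"

ino1=$(stat -c '%i' "$demo/file_1")
ino2=$(stat -c '%i' "$demo/file_2")
echo "inodes: $ino1 $ino2"           # two different inode numbers

# Changing one file leaves the other untouched.
printf 'changed\n' > "$demo/file_2"
first=$(cat "$demo/file_1")
echo "file_1 still contains: $first"
```

Whether any blocks are actually shared depends on the filesystem; tools such as `filefrag -v` may mark extents as shared where that is the case.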
 
01-12-2016, 04:57 AM   #5
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,830

Rep: Reputation: 7308
Also keep in mind FAT32 and similar filesystems, where files cannot be hard-linked to each other at all.
 
01-12-2016, 07:42 AM   #6
Ramurd
Member
 
Registered: Mar 2009
Location: Rotterdam, the Netherlands
Distribution: Slackwarelinux
Posts: 703

Rep: Reputation: 111
Let me break down the OP and let's see where it leads...

Quote:
As I understand it, if I copy "file_1" to "file_2", the OS does not create a second file but gives two references (names) to the same file, so that "/maindir/dir1/this_file" and "~other_dir/that_file" point to one and the same file. Only when one of the files becomes different from the other does the OS create a second file.
No; when you copy a file, a new file is created with the same content. If you mean linking, as the replies above indicate: if file1 changes, file2 changes along with it, because they are the same physical file. As for hard linking (as opposed to soft linking): it is not possible across different filesystems at all.
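
A quick sketch of that difference, with made-up file names in a scratch directory: after changing file1 in place, the copy keeps the old content while the hard link shows the new content.

```shell
#!/bin/sh
demo=$(mktemp -d)
echo "version 1" > "$demo/file1"
cp "$demo/file1" "$demo/copy"   # new file with the same content
ln "$demo/file1" "$demo/link"   # second name for the same file

echo "version 2" >> "$demo/file1"   # append to file1 in place

copy_lines=$(wc -l < "$demo/copy")
link_lines=$(wc -l < "$demo/link")
echo "copy: $copy_lines line(s), link: $link_lines line(s)"
```

This relies on the change being made in place. A tool that saves by writing a new file and renaming it over the old name would leave the other link pointing at the old content.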

Some filesystems may exist where your description is implemented, but I'm not aware of one. If one does, both files would still have to reside on the same filesystem.
The question is whether such a thing would actually be desirable. Given a multi-gigabyte file, if you made such a copy and then changed the first file only slightly, that change would take a great amount of time, because the new file would then have to be written to disk. Also, something would have to keep track of all the files on that filesystem, see what changes are made, and decide how to act on them.

For example, what should happen if you change file1 after your 'copy' and then revert the change?

As for the difference between a hard and a soft link:
Each file name can be considered (for ease of understanding) a hard link to a physical file. The name lives in a special file called a directory, which stores each name together with the file's location on disk (its inode). Creating a new hard link adds a new filename with the same location to the directory. The file is only physically removed from disk when no hard links point to that location any more (actually, the inode is marked as 'free' so that new files can be written there).

A soft link is a new physical (special) file (= new inode) containing a path to an existing file. If that existing file is removed, the link remains but is broken. Since a soft link contains the path to the file, it can be made across different filesystems. Since a hard link points to an inode, it has to be on the same filesystem (on another filesystem the same inode number will either belong to a different file or still be free).
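
The behaviour described above can be checked with a short sketch, assuming GNU coreutils `stat` (`%h` prints the hard link count). The names `original`, `hard` and `soft` are made up.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1

echo "payload" > original
ln original hard      # second directory entry for the same inode
ln -s original soft   # new inode that only stores the path "original"

links_before=$(stat -c '%h' original)   # link count with both names present

rm original                             # removes one name, not the data

links_after=$(stat -c '%h' hard)        # link count after removing one name
hard_content=$(cat hard)                # data is still reachable via "hard"

# -e follows the symlink, so it fails once the target name is gone.
if [ -e soft ]; then soft_state="ok"; else soft_state="broken"; fi

echo "links: $links_before -> $links_after, hard: $hard_content, soft: $soft_state"
```

The link count drops from 2 to 1, the data stays readable through the remaining hard link, and the soft link still exists as a file but no longer resolves.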

Quote:
Is there a Linux command to find when that is the case?

Thank you for your help.
So, this would not be done with a Linux command; it would have to be implemented in the filesystem, which stores the files and can keep track of the changes.
 
  



All times are GMT -5. The time now is 12:37 PM.
