How to best test timestamp equality using (Ba)sh if?
For a bash script (not necessarily bash shell, but from what I can tell, bash compatible) I need to compare dates.
Using if, you can test whether a file is older or newer than another file like so:
[ FILE1 -nt FILE2 ] True if FILE1 has been changed more recently than FILE2, or if FILE1 exists and FILE2 does not.
[ FILE1 -ot FILE2 ] True if FILE1 is older than FILE2, or if FILE2 exists and FILE1 does not.
(source http://www.tldp.org/LDP/Bash-Beginne...ect_07_01.html )
My question is now, how do I test if FILE1 has been changed at EXACTLY the same time as FILE2. I.e. it's probably the same file.
Aside from grepping and parsing commandline output which I wanted to avoid, how would you best do this, with the highest possible accuracy?
Anything to worry about if FILE1 is not in the same file system as FILE2? (for example a mounted NAS?)
There are much better ways of testing for file equality, like "diff -q".
If you really are interested in the timestamps though, you could try testing both -nt and -ot, if they both fail or are both true (not sure if they're > or >=) then it must be the same, assuming you check the usual caveats (both files exist, etc.)
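A rough sketch of that -nt/-ot idea, for what it's worth. It assumes both operators are strict comparisons (which is how bash appears to behave), so "neither is true" means the modification times are equal at whatever resolution the shell compares. The name same_mtime is just a hypothetical helper, not a builtin.

```shell
#!/bin/bash
# same_mtime FILE1 FILE2  (hypothetical helper name)
# Returns 0 if neither file is newer than the other,
#         1 if the modification times differ,
#         2 if either file is missing.
same_mtime() {
    # -nt and -ot are also true when only one of the files exists,
    # so rule that case out before comparing.
    [ -e "$1" ] && [ -e "$2" ] || return 2

    # Assumption: -nt and -ot are strict (">" and "<", not ">=" and "<=").
    if [ "$1" -nt "$2" ] || [ "$1" -ot "$2" ]; then
        return 1
    fi
    return 0
}
```

Usage would be something like: if same_mtime "$src" "$dst"; then echo "dates are equal"; fi. This avoids parsing any command output, but it gives no way to allow for a margin of error.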
As for the different filesystem, absolutely yes that would cause problems, one of the many reasons why testing for file equality using timestamps is not a good idea. Different filesystems have different time resolution. ext3, for example, only has integer second resolution, while ext4 has nanosecond resolution. Even the exact same file would fail a timestamp comparison when crossing between these two filesystems.
Last edited by suicidaleggroll; 07-01-2014 at 01:21 PM.
Quote:
There are much better ways of testing for file equality, like "diff -q".
I know, it's the date that I must check. The file contents may or may not be identical.
Quote:
Originally Posted by suicidaleggroll
If you really are interested in the timestamps though, you could try testing both -nt and -ot, if they both fail or are both true (not sure if they're > or >=) then it must be the same, assuming you check the usual caveats (both files exist, etc.)
Are you sure that would be correct? I considered it but I wanted a check to make sure.
Quote:
Originally Posted by suicidaleggroll
As for the different filesystem, absolutely yes that would cause problems, one of the many reasons why testing for file equality using timestamps is not a good idea. Different filesystems have different time resolution. ext3, for example, only has integer second resolution, while ext4 has nanosecond resolution. Even the exact same file would fail a timestamp comparison when crossing between these two filesystems.
That's why I'm looking for a way to take this into account, so I can check if the difference is caused by the filesystem.
There's a Bourne Shell variant called ash, so why not make it B(a(sh)).
Because there are many more and I was trying to stick to just sh but found that simple things like [[double brackets]] in bash made things much easier and were well supported.
Quote:
I know, it's the date that I must check. The file contents may or may not be identical.
If the file contents aren't identical, then they're not the same file, in which case the time stamp will never match on any recent filesystem (it may on ext3 and older due to the integer second resolution). I do hesitate to say "never" here, but I honestly can't imagine any scenario in which two non-identical files could have identical modification times to the nanosecond.
Quote:
Originally Posted by Rygir
Are you sure that would be correct? I considered it but I wanted a check to make sure.
No, but you could test it and see.
Quote:
Originally Posted by Rygir
That's why I'm looking for a way to take this into account, so I can check if the difference is caused by the filesystem.
Take it into account how? What's your end goal here? The only way to "take it into account" would be to lose the accurate timestamping of any recent filesystem and drop to the least common denominator (or just fall back to integer second regardless), in which case you can follow Habitual's suggestion.
Maybe it would help if you clarified what you're really trying to do here. I guess I just don't see the value in comparing whether or not two timestamps are identical on non-identical files that can cross filesystems with different timestamping resolution. What would you be able to glean from that result?
Quote:
If the file contents aren't identical, then they're not the same file, in which case the time stamp will never match on any recent filesystem (it may on ext3 and older due to the integer second resolution). I do hesitate to say "never" here, but I honestly can't imagine any scenario in which two non-identical files could have identical modification times to the nanosecond.
I do : The goal is to verify a synchronization process. If the copy was incomplete or corrupt, the file contents would differ but the date would not. In fact it may happen later on, that the data has become corrupted. The goal is long term storage.
Quote:
No, but you could test it and see.
I was going to
Quote:
Take it into account how?
Well I was planning on testing if they were exact matches, if not, see what the difference is. If the difference happened to fall within a certain margin of error, it could be ignored and logged as a file system difference.
Quote:
What's your end goal here?
Verification that dates have been preserved as well as possible in backups and/or archives.
Quote:
The only way to "take it into account" would be to lose the accurate timestamping of any recent filesystem and drop to the least common denominator (or just fall back to integer second regardless)
That would be perfect.
Quote:
, in which case you can follow Habitual's suggestion.
I'm still looking into that, haven't yet had the time to fully understand it. It's looking promising though.
Quote:
Maybe it would help if you clarified what you're really trying to do here. I guess I just don't see the value in comparing whether or not two timestamps are identical on non-identical files that can cross filesystems with different timestamping resolution. What would you be able to glean from that result?
At the risk of letting the thread go off topic: I have a Linux phone that can't be recharged, so I have to get the data off within a single battery charge or keep track of how much I've done.
To aid in this process, I've got it to connect to the wifi, mount a NAS drive and start copying files. Which is great, but I want to check after each copy whether the file was handled correctly by verifying that reading the data back yields the same result (instead of blindly trusting the interpretation of some result variable). This should make sure the copy was a success and the file can now be deleted.
The goal is to move the whole file system as intact as possible to a NAS.
The way I'm approaching the script is as a general robust file copier/mover that can be reused for similar issues.
Quote:
I do : The goal is to verify a synchronization process. If the copy was incomplete or corrupt, the file contents would differ but the date would not. In fact it may happen later on, that the data has become corrupted. The goal is long term storage.
Ok, so with that in mind, what information would a matching or non-matching timestamp provide? It sounds to me, from the sentence quoted above, that the timestamp does not provide the information you need to verify a backup.
Quote:
Originally Posted by Rygir
To aid in this process, I've got it to connect to the wifi, mount a NAS drive and start copying files. Which is great, but I want to check after each copy if the file was treated correctly by verifying that reading the data yields the same result (instead of blindly trusting some result variable interpretation). This should make sure the copy was a success and the file can now be deleted.
So if it's the file content you care about, why are you comparing the timestamp instead of the content?
Say you're copying a file onto the backup system. A lot of different things can go wrong here, but there are only four end results:
1 - date does not match, content does not match
2 - date does not match, content matches
3 - date matches, content does not match
4 - date matches, content matches
What would you want the script to do in these four instances? For a backup, it seems to me that #2 and #4 you would do nothing, #1 and #3 you would re-copy the file, in which case the timestamp is meaningless and all you care about is if the files match. #2 could cause some irregularities in the backup, but the majority of the time this is going to be caused by filesystem mismatches and there isn't anything you can do about it. The only time #2 could happen without it being a filesystem timestamp resolution issue, I would think, is if you run your copy command without the necessary flags to preserve timestamps, but once this is scripted that should be a non-issue.
Last edited by suicidaleggroll; 07-02-2014 at 01:41 PM.
Quote:
Ok, so with that in mind, what information would a matching or non-matching timestamp provide? It sounds to me, from the sentence quoted above, that the timestamp does not provide the information you need to verify a backup.
It does, it verifies that the timestamp has been backed up correctly.
Quote:
So if it's the file content you care about, why are you comparing the timestamp instead of the content?
Because I do care about the timestamp, it's a piece of data I wish to preserve. I also care about the file content obviously, but that has been taken care of with cmp command elsewhere.
Quote:
Say you're copying a file onto the backup system. A lot of different things can go wrong here, but there are only four end results:
1 - date does not match, content does not match
2 - date does not match, content matches
3 - date matches, content does not match
4 - date matches, content matches
What would you want the script to do in these four instances?
If the date matches, check the content, if the content matches as well delete original.
If the date doesn't match or the content doesn't match, log an error for manual review because I'm not sure how often that would happen, I hope almost never.
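As a minimal sketch of that decision flow (not the actual script): it assumes GNU-style stat -c %Y for the modification time in whole seconds and cmp -s for the content check. The function name verify_and_remove and the log file argument are made up for illustration.

```shell
#!/bin/bash
# verify_and_remove SRC DST LOGFILE  (hypothetical helper name)
# Deletes SRC only if DST has the same mtime (whole seconds) and the
# same content; otherwise logs the problem for manual review.
verify_and_remove() {
    local src=$1 dst=$2 log=$3 t_src t_dst

    # Assumption: stat supports -c %Y (mtime as seconds since the epoch).
    t_src=$(stat -c %Y -- "$src") || { echo "cannot stat: $src" >>"$log"; return 1; }
    t_dst=$(stat -c %Y -- "$dst") || { echo "cannot stat: $dst" >>"$log"; return 1; }

    if [ "$t_src" -ne "$t_dst" ]; then
        echo "date mismatch: $src ($t_src) vs $dst ($t_dst)" >>"$log"
        return 1
    fi

    # cmp -s is silent and only sets the exit status.
    if ! cmp -s -- "$src" "$dst"; then
        echo "content mismatch: $src vs $dst" >>"$log"
        return 1
    fi

    # Both date and content match: the original can go.
    rm -- "$src"
}
```

Nothing is deleted unless both checks pass, so any failure (including a stat error on the NAS side) leaves the original in place and only adds a line to the log.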
Quote:
For a backup, it seems to me that #2 and #4 you would do nothing, #1 and #3 you would re-copy the file, in which case the timestamp is meaningless and all you care about is if the files match. #2 could cause some irregularities in the backup, but the majority of the time this is going to be caused by filesystem mismatches and there isn't anything you can do about it.
Actually, it can be caused by a permissions issue, by a filesystem issue that I can resolve (by formatting the target drive differently), by my copying the file wrongly and forgetting to synchronize the dates, or by the NAS protocol not accepting dates.
Quote:
The only time #2 could happen without it being a filesystem timestamp resolution issue, I would think, is if you run your copy command without the necessary flags to preserve timestamps, but once this is scripted that should be a non-issue.
I hope so, but better to be safe than sorry. I wouldn't have to check if the content was identical if everything always went perfectly.
All in all, what I want should be just one line in a shell script; "if datesource=datetarget set dates are equal flag".
Then I would follow Habitual's recommendation, but note that you'll probably get false errors on some files when crossing between "old" and "new" filesystems due to roundoff/truncation. You may need to take the absolute value of the difference between the two and check if it's less than some threshold instead.
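The threshold idea could look roughly like this. It is only a sketch: it assumes stat accepts -c %Y (GNU coreutils does; other stat implementations may differ), and mtimes_match and its optional tolerance argument are hypothetical names. Working in whole seconds already drops any sub-second difference between filesystems; the tolerance covers coarser ones (FAT, for instance, stores modification times at 2-second granularity).

```shell
#!/bin/bash
# mtimes_match FILE1 FILE2 [TOLERANCE_SECONDS]  (hypothetical helper name)
# Returns 0 if the two modification times differ by at most TOLERANCE
# seconds (default 0), 1 if they differ by more, 2 if stat fails.
mtimes_match() {
    local t1 t2 diff tol=${3:-0}

    # Assumption: stat -c %Y prints mtime as whole seconds since the epoch.
    t1=$(stat -c %Y -- "$1") || return 2
    t2=$(stat -c %Y -- "$2") || return 2

    # Absolute value of the difference.
    diff=$(( t1 - t2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi

    [ "$diff" -le "$tol" ]
}
```

For example, mtimes_match "$src" "$dst" would be the strict whole-second check, and mtimes_match "$src" "$dst" 2 would accept anything within two seconds, which could then be logged as a filesystem difference rather than an error.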
Last edited by suicidaleggroll; 07-02-2014 at 04:13 PM.