Upgrading from Ubuntu 16.04 LTS to 18.04 LTS erased entire user's directory!
I am having an extremely serious problem.
I upgraded my father's computer from 16.04 LTS to 18.04 LTS using the normal graphical prompt. The upgrade seemed to work as normal until I tried to log into his user account and discovered that this was not possible; brief investigation revealed that this was because the entire user directory had been erased! That is, there is no /home/robert ("robert" being the username) at all. I only discovered this after upgrading HP printer/scanner drivers.
Checking the automatic backups, it seems as if many of the backups of things other than pictures and system files are some years out of date; I am not sure why as I can no longer log into my father's account, but I suspect that he may have been storing things in a file structure outside the directories that I had set to be backed up.
This is very, very, very, very serious as he has spent most of his time for the last three or four years working on writing a book, which is stored on that computer. I have managed to find a set of backups of this dating from late 2017 but no later.
The /home folder was mounted from a RAID1 array (/dev/md0); the other two users (my mother and me) are correct and all the files are intact. I am in the process of running photorec - it purports to have recovered hundreds of thousands of files, but, an hour or so after I started running it, I have found that it is dumping all the recovery files in /export/users, which is the same filesystem as the files that it is trying to recover! It did this automatically without asking me for this option.
I have tried testdisk and the deleted /home/robert directory appears in the listings, but almost none of its contents appear.
I have never seen anything quite this catastrophic before without a major hardware failure or a crash during the upgrade process. Does anyone have any idea how on earth to recover this drastic situation?
instead of photorec, you can also use something called extundelete - it's more likely to recover full filenames & directory structures.
Quote:
Originally Posted by jamespetts
photorec - it purports to have recovered hundreds of thousands of files, but, an hour or so after I started running it, I have found that it is dumping all the recovery files in /export/users, which is the same filesystem as the files that it is trying to recover! It did this automatically without asking me for this option.
that was the biggest error of all!
and it's no excuse that "it did this automatically" - in fact photorec does very little automatically, and is the kind of software that you only use after reading the documentation and knowing 100% what it will do.
sorry to be captain hindsight here, but you should've known to check the directory/partitioning structure before dist-upgrading, and of course you should've backed up personal data... well now you know.
actually, the more i think about this, the less likely i find it that a dist-upgrade would actively delete files under /home. maybe some partition just isn't mounted anymore, and all the data is still there hiding in plain sight?
The information that I found regarding Photorec did not make it at all clear that it "is the kind of software that you only use after reading the documentation and knowing 100% what it will do": the impression given was distinctly that it was simply a straightforward file recovery utility. Indeed, it was not at all clear that the operation that I was performing was actually going to change anything rather than just tell me whether it could find any files that might be able to be recovered. I only used it after searching through forum posts and finding a suggestion to use photorec in the case of the accidental deletion of a home directory.
My first thought was that the folder in question was just unmounted and I spent some time investigating this - but the structure of /etc/fstab makes it clear that the whole of /home was the mount (which remains mounted and working, save for the erasure), and that /robert was a subfolder. Testdisk shows that there is a deleted subfolder of this name, but it contains no meaningful files that Testdisk can find.
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda5 during installation
UUID=5ffe3483-ee16-4c10-b0e0-36078771a47c / ext4 errors=remount-ro 0 1
# swap was on /dev/sda1 during installation
UUID=d43886b2-d2f4-4865-a074-09b1793e2649 none swap sw 0 0
# RAID 1 array
/dev/md0 /home ext4 defaults 0 0
# Exports for NFS
/home /export/users none bind 0 0
# FTP version of backup drive
#curlftpfs#mounter:Redcoat@backup/Public /backup fuse auto,user,uid=1000,allow_other 0 0
# NOTE: The below is now deprecated as this old MyBook Live has been decommissioned as of 1 January 2017.
# NFS support on the MyBook Live was questionable in the past, but may be better now. FTP is the alternative (above).
#backup:/nfs/Public /backup nfs _netdev,auto 0 0
Edit 2: May I ask - what precautions should be taken before running extundelete - or is it too late even to try now?
Running it gives a warning that the partition should be unmounted - but I do not believe that I can simply unmount all the /home directories and still run Linux. I quit without doing anything on receiving that warning.
Edit 3: The output of ls -al /home is:
Code:
robert@Study:/home$ ls -al /home
total 21976
drwxr-xr-x 866 root users 36864 Apr 16 19:00 .
drwxr-xr-x 27 root root 4096 Apr 16 12:53 ..
drwx------ 3 valerie valerie 4096 Aug 8 2012 expunged
drwxrwxr-x 73 james james 4096 Apr 16 17:37 james
-rw-r--r-- 1 guest-lyf6uk guest-lyf6uk 76 Aug 1 2011 '.~lock.Current diary 2011 temp.xls#'
drwx------ 3 root root 16384 Mar 6 2011 lost+found
drwxr-xr-x 3 statd systemd-network 4096 Apr 30 2011 mythtv
-rw-r--r-- 1 root root 3545589 Apr 16 19:07 photorec.ses
drwxr-xr-x 2 root root 20480 Apr 16 14:42 recup_dir.1
drwxr-xr-x 2 root root 20480 Apr 16 14:43 recup_dir.10
drwxr-xr-x 2 root root 20480 Apr 16 15:01 recup_dir.100
drwxr-xr-x 2 root root 20480 Apr 16 15:01 recup_dir.101
LOTS MORE LIKE THIS - TOO MANY TO FIT IN MESSAGE...
14:41 testdisk.log
-rw-r--r-- 1 guest-lyf6uk guest-lyf6uk 7699 Jul 5 2011 'test listUntitled 1.ods'
drwx------ 5 james james 4096 Nov 18 2016 .Trash-1000
drwx------ 5 robert robert 4096 May 22 2011 .Trash-1001
drwx------ 5 valerie valerie 4096 Aug 27 2012 .Trash-1002
drwx------ 5 guest-lyf6uk guest-lyf6uk 4096 Jul 25 2011 .Trash-999
-rw-r--r-- 1 guest-lyf6uk guest-lyf6uk 9936 Jul 28 2011 'Untitled 1 test 28072011.odt'
drwxr-xr-x 46 valerie valerie 16384 Apr 16 14:24 valerie
Edit 4:
Quote:
actually, the more i think about this, the less likely i find it that a dist-upgrade would actively delete files under /home.
This is what I believed, too, which is why I did not meticulously check that the automatic backup to the NAS was up to date before upgrading. There is obviously something very, very drastically wrong somewhere in the upgrade process.
Last edited by jamespetts; 04-16-2019 at 01:52 PM.
Reason: More information
re fdisk:
i see 3 drives - sda has 223GiB and contains a standard Linux install partitioning scheme.
sdb and sdc are both exactly the same size - 1.8TiB. I suppose you actually have 2 physical drives like that.
re fstab:
i see 2 entries pertaining to /home. that seems strange to me.
like i said, i know nothing of raid.
but i know about NFS, and /home is not on that computer but somewhere else, pertaining to fstab.
why are you then running photorec on that computer.
maybe your exports are just messed up.
some info is missing here.
Quote:
Originally Posted by jamespetts
Edit 2: May I ask - what precautions should be taken before running extundelete - or is it too late even to try now?
Running it gives a warning that the partition should be unmounted - but I do not believe that I can simply unmount all the /home directories and still run Linux. I quit without doing anything on receiving that warning.
but you just said that it ran for an hour, destroying the very filesystem you are trying to recover from?
yes, you must unmount. boot a live linux medium and work from that.
it is also recommended to work on a bitwise copy of the drive, some photorec actions are intrusive, and it isn't always clear.
however, recovering to the same drive you are recovering from, that was your uninformed snafu.
i don't mean to annoy or pick on you, but it needs to be said.
see https://www.cgsecurity.org/wiki - notice that there's testdisk and photorec, both utilities might be useful for you.
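A minimal sketch of the "work on a bitwise copy" advice, demonstrated on a small scratch file so that it is safe to run anywhere. On the real machine the source would be the unmounted array (for example /dev/md0, read as root from a live medium) and the destination a file on a different disk with enough free space. The helper name image_copy and all paths are illustrative assumptions, and GNU dd is assumed for bs=1M.

```shell
#!/bin/sh
# image_copy SRC DST: make a byte-for-byte copy of SRC and verify it.
# SRC may be a regular file or (as root, while unmounted) a block device.
image_copy() {
    src=$1
    dst=$2
    # Plain sequential copy in 1 MiB blocks; dd's statistics are discarded.
    dd if="$src" of="$dst" bs=1M 2>/dev/null || return 1
    # cmp exits 0 only when both inputs are identical byte for byte.
    cmp -s "$src" "$dst"
}

# Safe demonstration on a 4 MiB scratch file instead of a real disk.
workdir=$(mktemp -d)
dd if=/dev/urandom of="$workdir/source.img" bs=1M count=4 2>/dev/null
if image_copy "$workdir/source.img" "$workdir/copy.img"; then
    echo "copy verified"
fi
```

Recovery tools can then be pointed at the image file instead of the original disks. For a drive that is physically failing, a tool built for read errors such as GNU ddrescue is usually preferred over plain dd.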
Quote:
re fdisk:
i see 3 drives - sda has 223GiB and contains a standard Linux install partitioning scheme.
sdb and sdc are both exactly the same size - 1.8TiB. I suppose you actually have 2 physical drives like that.
Yes - these two drives between them form the RAID array. /dev/sda is the SSD for the system drive, and /dev/sdb and /dev/sdc are the two individual drives comprising the RAID array. /dev/md0 is the array itself.
Quote:
re fstab:
i see 2 entries pertaining to /home. that seems strange to me.
like i said, i know nothing of raid.
but i know about NFS, and /home is not on that computer but somewhere else, pertaining to fstab.
why are you then running photorec on that computer.
maybe your exports are just messed up.
some info is missing here.
/export/users is just a bind mount of /home on /dev/md0 (see the fstab above): it is used for NFS to export the directories to other computers on the network so that they can read/write the same files.
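The two /home entries in the fstab posted earlier can be picked out mechanically. The following is a small sketch only: the relevant fstab lines are pasted into a here-document (on the real machine one would read /etc/fstab instead), and the function name fstab_entries_for is made up for illustration.

```shell
#!/bin/sh
# fstab_entries_for PATH: read fstab text on stdin and print
# "source -> mountpoint (type, options)" for every non-comment entry
# whose source or mount point is exactly PATH.
fstab_entries_for() {
    awk -v p="$1" '
        /^[[:space:]]*#/ || NF < 4 { next }   # skip comments and blank lines
        $1 == p || $2 == p { printf "%s -> %s (%s, %s)\n", $1, $2, $3, $4 }
    '
}

home_entries=$(fstab_entries_for /home <<'EOF'
UUID=5ffe3483-ee16-4c10-b0e0-36078771a47c / ext4 errors=remount-ro 0 1
# RAID 1 array
/dev/md0 /home ext4 defaults 0 0
# Exports for NFS
/home /export/users none bind 0 0
EOF
)
printf '%s\n' "$home_entries"
```

The first line printed is the array mounted on /home; the second has type "none" with option "bind", meaning /export/users is the same filesystem seen under a second path and not a separate or remote one.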
Photorec has already recovered a large number of files on /dev/md0, some of them useful (although it is still running). The files are definitely on that volume and have definitely been deleted.
It is actually very, very disturbing that a simple distribution upgrade should be capable of something so catastrophic without any error message or warning.
This may need to be investigated very seriously by the developers.
Quote:
but you just said that it ran for an hour, destroying the very filesystem you are trying to recover from?
yes, you must unmount. boot a live linux medium and work from that.
it is also recommended to work on a bitwise copy of the drive, some photorec actions are intrusive, and it isn't always clear.
however, recovering to the same drive you are recovering from, that was your uninformed snafu.
i don't mean to annoy or pick on you, but it needs to be said.
see https://www.cgsecurity.org/wiki - notice that there's testdisk and photorec, both utilities might be useful for you.
It is not clear - at all - from the documentation or the UI that the operation that was being performed and the options that I chose were even potentially destructive. Indeed, it was not clear that it was going to write any data rather than just scan the drive to see whether anything might be recoverable. It is only after I noticed all the files accumulating in the /home directory that I realised what it was actually doing. It really does not help to castigate users for making an "uninformed snafu" when the documentation and UI are so opaque that it was not clear to someone who has been using Linux since 2002 that this operation carried a degree of risk.
It is still not clear from your message or the previous one whether the right thing to do, now that photorec has been running on the filesystem in this way for over 19 hours, is to stop it part way through and run something else, or to let it continue.
I did use testdisk, incidentally, using that before attempting photorec. It found the deleted /home/robert directory, but found only 2-3 useless files in it.
Quote:
I am not sure why as I can no longer log into my father's account
The only account that can be logged into without a directory in /home/ is the root account, whose homedir is in /, not /home/. What is the output from:
Code:
cat /proc/mdstat; df
and the content of /etc/mdadm.conf? What is the "normal graphical prompt" used to upgrade? What was the last thing done before that? How did you invoke testdisk?
Quote:
Does anyone have any idea how on earth to recover this drastic situation?
This is where backups need to be restored. I wouldn't interrupt photorec. At this point its results might be your only hope.
On the contrary, photorec should immediately be terminated. See ondoho's advice above - boot a liveCD, get a copy of the array, then recover to another disk. Yes, that's a lot of disk, but if your backup regime is inadequate you have to do what you can.
Note my sigline - it says "verified" for a reason.
As for photorec, it is a fantastic tool, but is merely a tool - if you don't read what it presents before hitting <Enter>, the tool is not to blame.
There may have been some misunderstanding of the passage quoted from my first post in the following terms, "I am not sure why as I can no longer log into my father's account"; the "as" changes the meaning. I am aware of why I cannot log into my father's account, but not being able to log into my father's account means that I was unaware of why the backups had not worked for anything other than photographs and system files since 2016.
Photorec
The documentation regarding this is very unclear. In particular, the command to start the recovery operation is labelled "search". This is not a word that normally connotes a computer operation that will write or change anything, which is why I selected this option at an early stage without full investigation, believing that it would simply search the drive and report whether there was anything that could be recovered, rather than commence the recovery process.
What photorec actually does (which is really not clear from the information that I had found so far) when one presses "search" is:
(1) search the unallocated parts of the hard drive for data that match the pattern of certain files; and
(2) dump those data into a series of directories (recup_dir.1, recup_dir.2 and so on) of ~500 files each (depending on the size) - these can be browsed using Nautilus in the usual way.
This is different to testdisk, which will detect deleted files without writing/altering anything. A photorec recovery session can take a very long time (multiple days), but the time is greatly reduced if one reduces the types of files for which one searches. A search for all filetypes will return vast numbers of temporary files which are of no use and will make it very hard to extract useful documents. It will also greatly increase the recovery time.
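When a search for all file types has already been run, a quick tally by extension can show where the useful documents are likely to be among the recovered files. This is only a rough sketch: count_by_extension is a made-up helper, and the scratch tree below merely imitates the recup_dir.N layout with invented file names so that the example is safe to run.

```shell
#!/bin/sh
# count_by_extension DIR: print "count extension" for every file under
# DIR/recup_dir.*, most common extension first. Files with no extension
# are ignored.
count_by_extension() {
    find "$1" -path '*/recup_dir.*' -type f |
        sed -n 's/.*\.\([^./]*\)$/\1/p' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Scratch tree imitating recovery output (file names are invented).
demo=$(mktemp -d)
mkdir -p "$demo/recup_dir.1" "$demo/recup_dir.2"
touch "$demo/recup_dir.1/f0001.odt" "$demo/recup_dir.1/f0002.odt" \
      "$demo/recup_dir.2/f0003.odt" "$demo/recup_dir.2/f0004.jpg"
tally=$(count_by_extension "$demo")
printf '%s\n' "$tally"
```

On a real recovery, the directory passed in would be wherever the recup_dir.N directories were written.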
As has been pointed out, it is important to realise what photorec does at this stage and make sure to modify the options to point to a filesystem that is not the filesystem from which data must be recovered. It is very odd that photorec does not check for this itself and prevent the user from saving the recovered files to the recovery filesystem (or, at the very least, warn the user about this).
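The check that is wished for here can be approximated by hand before starting a recovery. A minimal sketch, assuming GNU coreutils stat as shipped with Ubuntu; same_filesystem is a hypothetical helper and not part of photorec or testdisk.

```shell
#!/bin/sh
# same_filesystem A B: succeed (exit 0) when paths A and B live on the
# same filesystem, judged by the device number that stat reports.
# Exits 2 if either path cannot be examined.
# Assumes GNU coreutils stat (-c '%d').
same_filesystem() {
    dev_a=$(stat -c '%d' "$1") || return 2
    dev_b=$(stat -c '%d' "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}

# Demonstration: a directory and its own subdirectory share a filesystem.
scratch=$(mktemp -d)
mkdir "$scratch/recovered"
if same_filesystem "$scratch" "$scratch/recovered"; then
    echo "refusing: destination is on the filesystem being recovered"
fi
```

With the fstab shown earlier, comparing /home with /export/users would be expected to report the same filesystem, because the latter is a bind mount of the former; a destination on the system SSD would be expected to differ.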
I should note that photorec has actually worked well: I stopped the recovery part way through (which was not a problem: it can even be recommenced from where it left off, it turns out), and re-started using the system SSD as the destination and selecting only desired types (which was then much quicker), and at least most of the lost files have been recovered successfully (there may be a few missing/corrupted files, but this is hard to tell at this stage; my father and I have identified a large number of recent versions of book chapters and appendices).
I suspect that part of the reason that the initial writing did not, in the event, damage too much was that the drive was not very full: only about 20% or so, I think.
I write all this because many of these details were not at all clear from the documentation and forum responses, and in case anyone else finds her/himself in the same situation.
Quote:
I write all this because many of these details were not at all clear from the documentation and forum responses, and in case anyone else finds her/himself in the same situation.
Well done - others will surely benefit from your experience.
Those of us that use this (semi-)regularly overlook these idiosyncrasies.