LinuxQuestions.org


phi 03-23-2010 05:36 AM

How to change hardlinks to softlinks?
 
Hi dear shell-scripters and other power users.

A misuse of the tool "fslint" created many (really many) hardlinks in my /home folder.

Now I am looking for a script which converts all those hardlinks into softlinks. The difficult part will be to select a "major file", which is normally the file with the shortest path.

Using

Code:

find . -type f -printf "%n %p\n"
I get all files together with their hard link count.

Any ideas? Scripts?

catkin 03-23-2010 05:47 AM

Use find's -printf %i to also print the inode, and write the output to a file. Sort by inode and secondarily by path length ...

This will only work within a single file system (hard links cannot span file systems).
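
Something along these lines might do as a starting point (untested; the secondary sort by path length would still need adding, and the output file name is just an example):

Code:

find . -type f -links +1 -printf '%i %p\n' | sort -n > /tmp/inode_list.log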

phi 03-23-2010 06:55 AM

... ongoing ...
 
Thx

For me the path depth (find's -printf %d directive) is better than the file name length (or path length). The file name length could then be used as a secondary sort key. Does anyone know how to print it?
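
As far as I can see, find's -printf has no directive for the name length itself, but it could probably be computed afterwards with awk's length() (a rough sketch, assuming no spaces in the paths):

Code:

find . -type f -printf "%n %i %d %p\n" | awk '{ print $0, length($4) }'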

But back to the main problem:

The following code shows all my files in the right order:

Code:

find 2>/dev/null <directory> -type f -printf "%n %i %d %p\n" | sort -g
(printing hard link count, inode, path depth and path name.)


Now I need a shell script which handles this output in blocks of the same inode.
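
Something like this is roughly what I mean (untested, and it assumes no spaces in the paths): keep the first path seen for each inode and print every further path together with the path that would be kept:

Code:

find /home/phi -type f -links +1 -printf "%i %p\n" 2>/dev/null | sort -n |
  awk '{ if ($1 != last) { last = $1; keep = $2 } else { print $2, "->", keep } }'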

phi 04-15-2010 05:08 PM

[SOLVED] Turn Hardlinks into softlinks
 
The following script does the job!
Attention: no file or directory names may contain spaces.

The crucial part is commented out so that you don't spoil your file system before you have studied the output (the echo lines).

Code:

folder=/home/phi


tmpFileName1=/tmp/linklist1.log
tmpFileName2=/tmp/linklist2.log


# Remove oldFile and replace it with a symlink pointing to newFile
function unlinkAndSoftLink {
  oldFile=$1
  newFile=$2
  echo "link " ${oldFile} " to " ${newFile}

  # Uncomment only if you are really sure: no blanks in directory or file names!
  #rm "${oldFile}"
  #ln -s "${newFile}" "${oldFile}"
  echo
}

# START


# Find all files: hard link count, inode, depth, path
find 2>/dev/null ${folder} -type f -printf "%n:%i:%d:%p\n" >${tmpFileName1}

# Extract count, inode, depth (2 digits) and file name
# Only keep lines whose hard link count is > 1; sort so equal inodes are adjacent
awk -F: '{if ($1 > 1) printf "%02d\t%d\t%02d\t%s\n", $1, $2, $3, $4}' ${tmpFileName1} | sort -n >${tmpFileName2}

# Within each block of equal inodes the first (shallowest) path is kept;
# every further path is replaced by a symlink to it.
oldInode=""
oldFname=""
while read count inode depth fname
do 
  echo ${count} - ${inode} - ${depth} - ${fname}
  if [[ "${oldInode}" == "${inode}" ]] ; then
    unlinkAndSoftLink ${fname} ${oldFname}
  else
    oldInode=${inode}
    oldFname=${fname}
  fi
done <${tmpFileName2}


catkin 04-16-2010 03:40 AM

Thanks for sharing the solution :)

Mountain 03-26-2012 06:42 PM

I'm looking for the same solution. However, I want to use it on files that contain spaces in the names. I reviewed the proposed solution above and it looks weak to me. Has anyone got an improved solution by any chance?

catkin 03-27-2012 03:54 AM

This works as long as the path names don't contain newlines:
Code:

#!/bin/bash

# Configure script environment
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
export PATH=/usr/sbin:/sbin:/usr/bin:/bin
set -o nounset

# Note: dir must already be set to the directory to process (see the test
# set-up below), e.g. dir=/home/phi

# For each path which has multiple links
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# (except ones containing newline)
last_inode=
while IFS= read -r path_info
do
  echo "DEBUG: path_info: '$path_info'"
  inode=${path_info%%:*}
  path=${path_info#*:}
  if [[ $last_inode != $inode ]]; then
      last_inode=$inode
      path_to_keep=$path
  else
      echo rm "$path"
      echo ln -s "$path_to_keep" "$path"
  fi
done < <( find $dir -type f -links +1 ! -wholename '*
*' -printf '%i:%p\n' | sort --field-separator=: )

# Warn about any excluded files
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
buf=$( find $dir -type f -links +1 -wholename '*
*' )
if [[ $buf != '' ]]; then
    echo 'Some files not processed because their paths contained newline(s):'$'\n'"$buf"
fi

Test files may be set up by
Code:

dir=$( mktemp -d )
touch $dir/two_link.1 "$dir/two_link. .1" $dir/one_link
ln $dir/two_link.1 $dir/two_link.2
ln "$dir/two_link. .1" "$dir/two_link. .2"
touch "$dir/two_link.newline
.1"
ln "$dir/two_link.newline
.1" "$dir/two_link.newline
.2"

I tried with sort's --files0-from=- and --zero-terminated options but could not make them work to allow pathnames including newlines.

Mountain 03-27-2012 10:52 AM

My paths do include newlines, unfortunately. (Personally, I wish operating systems had not adopted this convention of allowing file and directory names with newlines. I don't name my stuff with spaces when I create the names myself.)

I tried using code like this earlier, without success so far.
Code:

IFS=$'\n'
...
unset IFS

I guess I'll keep experimenting a bit and searching for a solution. I appreciate any other ideas.

catkin 03-27-2012 11:04 AM

Quote:

Originally Posted by Mountain (Post 4637789)
My paths do include newlines
...
I guess I'll keep experimenting a bit and searching for a solution.

It may be possible to fix my code to accommodate paths with newlines by changing -printf '%i:%p\n' to -printf '%i:%p\0' and exploring sort's --files0-from=- and --zero-terminated options. According to my understanding of the options they should have worked but I probably messed up some detail.

EDIT: and the while IFS= read -r path_info to while IFS= read -d '' -r path_info
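
Something like this, perhaps (untested sketch, assuming GNU find's \0 escape and a sort that supports --zero-terminated):

Code:

last_inode=
while IFS= read -r -d '' path_info
do
  inode=${path_info%%:*}
  path=${path_info#*:}
  if [[ $last_inode != $inode ]]; then
      last_inode=$inode
      path_to_keep=$path
  else
      echo rm "$path"
      echo ln -s "$path_to_keep" "$path"
  fi
done < <( find "$dir" -type f -links +1 -printf '%i:%p\0' |
          sort --zero-terminated --field-separator=: )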

Mountain 03-27-2012 12:59 PM

Quote:

Originally Posted by catkin (Post 4637803)
It may be possible to fix my code to accommodate paths with newlines by changing -printf '%i:%p\n' to -printf '%i:%p\0' and exploring sort's --files0-from=- and --zero-terminated options. According to my understanding of the options they should have worked but I probably messed up some detail.

EDIT: and the while IFS= read -r path_info to while IFS= read -d '' -r path_info

Thanks again. I will definitely try your code. I'm going to do that right now.

But I have to apologize because I seriously misread your prior post. Somehow when I read and wrote the word "newlines" I really meant "spaces"! Don't ask me how/why I did that. :-/

Generally, my paths do not contain newlines, but they do contain spaces. Now that I re-read your post, everything is clear to me and I'm going to try your solution as-is.

However, even though my paths do not generally contain newlines, there are a few rare exceptions. Something (possibly a Mac OS X client) is writing files with the name "Icon" followed by a newline. The names show up either as "Icon?" in the terminal or "Icon^M" in other places. I delete these files every time I find them. Your code gives me the idea to do this:
Code:

find $dir -type f -wholename '*Icon*
*' -exec rm '{}' \;

I'll just run that before I run the script. Thanks for your help!

P.S. I'm curious why you use -wholename rather than -path.

catkin 03-27-2012 09:12 PM

Quote:

Originally Posted by Mountain (Post 4637920)
P.S. I'm curious why you use -wholename rather than -path.

Mis-remembering the difference between the two :redface:. In GNU find they do the same thing (-wholename is a synonym for -path); -path would be better since it is the more portable name.

Mountain 03-27-2012 11:28 PM

Quote:

Originally Posted by catkin (Post 4637370)
This works as long as the path names don't contain newlines:
...

First of all, thank you! It works. I've been using it extensively.

I've been playing around with a few tweaks. Here's what I have right now:

Code:

#!/bin/bash

# Configure script environment
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#export PATH=/usr/sbin:/sbin:/usr/bin:/bin
set -o nounset
dir='/put/something/here'

# For each path which has multiple links
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# (except ones containing newline)
last_inode=
while IFS= read -r path_info
do
  inode=${path_info%%:*}
  path=${path_info#*:}
  if [[ $last_inode != $inode ]]; then
      last_inode=$inode
      path_to_keep=$path
  else
      echo "DEBUG: ln -s '$path_to_keep' '$path'"
      rm "$path"
      ln -s "$path_to_keep" "$path"
  fi
done < <( find "$dir" -type f -links +1 -iname '*.txt' ! -wholename '*
*' -printf '%i:%p\n' | sort --field-separator=: )

# Warn about any excluded files
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
buf=$( find "$dir" -type f -links +1 -wholename '*
*' )
if [[ $buf != '' ]]; then
    echo 'Some files not processed because their paths contained newline(s):'$'\n'"$buf"
fi

exit 0

Notice I set it to work only on .txt files (for the moment). I had to play with the quotes a bit until I got it to handle paths with spaces.

