LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Slackware (https://www.linuxquestions.org/questions/slackware-14/)
-   -   Bug in makepkg and symlinks with blanks in filename? (https://www.linuxquestions.org/questions/slackware-14/bug-in-makepkg-and-symlinks-with-blanks-in-filename-4175480597/)

ruario 10-16-2013 03:09 AM

By using sed's extended regular expression capability (-r) it is possible to do a slightly neater patch to removepkg to work with my patched makepkg:

Code:

--- removepkg.orig
+++ removepkg
@@ -149,7 +149,7 @@
 }
 
 extract_links() {
- sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p'
+ sed -rn 's,^\( *cd ([^;]*[[:graph:]]) *; *rm -rf ([^)]*[[:graph:]]) *\) *$,\1/\2,p'
 }
 
 preserve_file() {
@@ -228,7 +228,7 @@
  if [ -L "$ROOT/$LINK" ]; then
    if [ ! "$WARN" = "true" ]; then
    echo "  --> Deleting symlink $ROOT/$LINK"
-    rm -f $ROOT/$LINK
+    rm -f "$ROOT/$LINK"
    else
    echo "  --> $ROOT/$LINK (symlink) would be deleted"
    fi

EDIT: I thought about this a little bit more. Here is another version doesn't require extended regex (in case that is something volkerdi wants/needs to avoid):

Code:

--- removepkg.orig
+++ removepkg
@@ -149,7 +149,7 @@
 }
 
 extract_links() {
- sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p'
+ sed -n 's,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p'
 }
 
 preserve_file() {
@@ -228,7 +228,7 @@
  if [ -L "$ROOT/$LINK" ]; then
    if [ ! "$WARN" = "true" ]; then
    echo "  --> Deleting symlink $ROOT/$LINK"
-    rm -f $ROOT/$LINK
+    rm -f "$ROOT/$LINK"
    else
    echo "  --> $ROOT/$LINK (symlink) would be deleted"
    fi

Note: The above two example patches work no matter which method of escaping is used (e.g. some\ file or "some file").

ruario 10-17-2013 02:04 PM

Quote:

Originally Posted by ruario (Post 5046668)
Code:

--- removepkg.orig
+++ removepkg
@@ -149,7 +149,7 @@
 }
 
 extract_links() {
- sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p'
+ sed -n 's,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p'
 }
 
 preserve_file() {
@@ -228,7 +228,7 @@
  if [ -L "$ROOT/$LINK" ]; then
    if [ ! "$WARN" = "true" ]; then
    echo "  --> Deleting symlink $ROOT/$LINK"
-    rm -f $ROOT/$LINK
+    rm -f "$ROOT/$LINK"
    else
    echo "  --> $ROOT/$LINK (symlink) would be deleted"
    fi


I realised I could fairly easily check that the new regex performs the same as the original for packages. My main laptop install is a fairly complete Slackware64 14.0 install (I do not have the localised KDE and Calligra packages, Emacs, Apache, the huge kernel, MySQL, PHP, Pidgin, Samba, Seamonkey, Slackpkg or Seamonkey), but I do have an extra 213 non-official packages (from SBo, AlienBOB, Salix, plus some self packaged stuff). Making a total of 1271 packages.

I tested the current regex against the one from my last example as follows:

Code:

diff <(sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p' /var/log/scripts/*) <(sed -n 's,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p' /var/log/scripts/*)
There was no difference. :D

I also checked to see if there was a large speed difference between them but there isn't:

Code:

$ time sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p' /var/log/scripts/* >/dev/null

real    0m0.245s
user    0m0.234s
sys    0m0.010s
$ time sed -n 's,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p' /var/log/scripts/* >/dev/null

real    0m0.269s
user    0m0.261s
sys    0m0.007s

I'm now fairly convinced that the final suggested patch to removepkg should be safe and has the added advantage that it will work with either potential method of escaping.

Now I'll try and find a way to extensively test my suggested makepkg improvements and report back if I come up with something.

EDIT: P.S. I also checked the extended regex variant. It works as well, though it is a little bit slower.

ruario 10-17-2013 03:42 PM

Ok, I worked out a nice way of testing my makepkg changes against my 1271 installed packages. Firstly, I extracted out the code from the original/official makepkg that does the link display and shell script conversion and tweaked it just enough to make a short test script that looks as follows:

Code:

#!/bin/sh

cd /

TMP=/tmp

make_install_script() {
  COUNT=1
  LINE="$(sed -n "$COUNT p" $1)"
  while [ ! "$LINE" = "" ]; do
  LINKGOESIN="$(echo "$LINE" | cut -f 1 -d " ")"
  LINKGOESIN="$(dirname $LINKGOESIN)"
  LINKNAMEIS="$(echo "$LINE" | cut -f 1 -d ' ')"
  LINKNAMEIS="$(basename "$LINKNAMEIS")"
  LINKPOINTSTO="$(echo "$LINE" | cut -f 3 -d ' ')"
  echo "( cd $LINKGOESIN ; rm -rf $LINKNAMEIS )"
  echo "( cd $LINKGOESIN ; ln -sf $LINKPOINTSTO $LINKNAMEIS )"
  COUNT=$(expr $COUNT + 1)
  LINE="$(sed -n "$COUNT p" $1)"
  done
}

# Get rid of possible pre-existing trouble:
INST=$(mktemp $TMP/makepkg.XXXXXX)
# This requires the ls from coreutils-5.0 (or newer):
find ./bin ./etc ./lib ./lib64 ./opt ./sbin ./usr -type l -exec ls -l --time-style=long-iso {} \; | while read foo ; do echo $foo ; done | cut -f 8- -d ' ' | cut -b3- | tee $INST
if [ ! "$(cat $INST)" = "" ]; then
  echo
  echo "Making symbolic link creation script:"
  make_install_script $INST
fi

I then ran this test script as root:

Code:

# time ./makepkg-test > original-test

real    10m9.865s
user    5m43.135s
sys    3m9.594s

Next I changed the script to use my proposed, updated method. Here is a diff for the changes I made:

Code:

26c26
< find ./bin ./etc ./lib ./lib64 ./opt ./sbin ./usr -type l -exec ls -l --time-style=long-iso {} \; | while read foo ; do echo $foo ; done | cut -f 8- -d ' ' | cut -b3- | tee $INST
---
> find ./bin ./etc ./lib ./lib64 ./opt ./sbin ./usr -type l -printf '%p@@ARROW@@%l\n' | sed 's,^./,,;s/ /@@SPACE@@/g;s/@@ARROW@@/ -> /' | tee $INST | sed 's/@@SPACE@@/\\ /g'
30c30
<  make_install_script $INST
---
>  make_install_script $INST | sed 's/@@SPACE@@/\\ /g'

I then ran the updated script as follows:

Code:

# time ./makepkg-test > updated-test

real    6m14.363s
user    1m58.464s
sys    2m50.869s

As you can see it was notably faster. Additionally diff'ing the original-test and updated-test files showed them to be identical! :D

The only other major consideration is that my updated version would have a problem if a package contained symlinks with "@@ARROW@@" or "@@SPACE@@" in their names, though IMHO this would be less likely than having spaces in the name. If this is a concern however, these keywords could be switched to something else.

EDIT: P.S. If anyone is curious:
Code:

$ wc -l updated-test
49646 updated-test


TommyC7 10-17-2013 04:31 PM

While we're on this topic, we could also make use of:

Code:

  echo "( cd -- $LINKGOESIN ; rm -rf -- $LINKNAMEIS )"
  echo "( cd -- $LINKGOESIN ; ln -sf -- $LINKPOINTSTO $LINKNAMEIS )"

Just in the rare situation a package has a filename that starts with "-" or "--". Can't hurt (or affect) current packages.

volkerdi 10-17-2013 04:35 PM

Quote:

Originally Posted by TommyC7 (Post 5047690)
While we're on this topic, we could also make use of:

Code:

  echo "( cd -- $LINKGOESIN ; rm -rf -- $LINKNAMEIS )"
  echo "( cd -- $LINKGOESIN ; ln -sf -- $LINKPOINTSTO $LINKNAMEIS )"

Just in the rare situation a package has a filename that starts with "-" or "--". Can't hurt (or affect) current packages.

Yeah it can. Any time the format of the script changes, matching with old scripts breaks. Otherwise I'd probably use && instead of ;.

ruario 10-17-2013 05:00 PM

Quote:

Originally Posted by volkerdi (Post 5047692)
Yeah it can. Any time the format of the script changes, matching with old scripts breaks. Otherwise I'd probably use && instead of ;.

Yeah true! Such a change could potentially screw with any scripts that a user has written that assume the old style formatting.

That said, removepkg itself could probably accommodate the old way, with space escaping and TommyC7's proposal using a regex like this:

Code:

s|^( *cd\( --\)\{,1\} \([^;]*[!-~]\) *; *rm -rf\( --\)\{,1\} \([^)]*[!-~]\) *) *$|\2/\4|p
I retested this updated regex against my locally installed packages and it seems to work OK.

TommyC7 10-17-2013 10:55 PM

Ah poo. :(

ruario 10-18-2013 05:40 AM

Quote:

Originally Posted by DarkVision (Post 5044826)
i was trying to speed up packaging of the kawoken-icon-theme which includes hundreds of symlinks. makepkg takes ages here to move the links to the doinst.sh script.

Whilst testing my previously suggested makepkg changes, I also started to appreciate how slow the make_install_script function really is, so I scrapped it and replaced it with another regex via sed:

Code:

--- makepkg.orig
+++ makepkg
@@ -47,22 +47,6 @@
  sleep 5
 fi
 
-make_install_script() {
-  COUNT=1
-  LINE="$(sed -n "$COUNT p" $1)"
-  while [ ! "$LINE" = "" ]; do
-  LINKGOESIN="$(echo "$LINE" | cut -f 1 -d " ")"
-  LINKGOESIN="$(dirname $LINKGOESIN)"
-  LINKNAMEIS="$(echo "$LINE" | cut -f 1 -d ' ')"
-  LINKNAMEIS="$(basename "$LINKNAMEIS")"
-  LINKPOINTSTO="$(echo "$LINE" | cut -f 3 -d ' ')"
-  echo "( cd $LINKGOESIN ; rm -rf $LINKNAMEIS )"
-  echo "( cd $LINKGOESIN ; ln -sf $LINKPOINTSTO $LINKNAMEIS )"
-  COUNT=$(expr $COUNT + 1)
-  LINE="$(sed -n "$COUNT p" $1)"
-  done
-}
-
 usage() {
  cat << EOF
 
@@ -194,12 +178,11 @@
 echo "Searching for symbolic links:"
 # Get rid of possible pre-existing trouble:
 INST=$(mktemp $TMP/makepkg.XXXXXX)
-# This requires the ls from coreutils-5.0 (or newer):
-find . -type l -exec ls -l --time-style=long-iso {} \; | while read foo ; do echo $foo ; done | cut -f 8- -d ' ' | cut -b3- | tee $INST
+find * -type l -printf '%p/*>/%l\n' | sed 's, ,/*/,g;s,/\*>/, -> ,;s,/\*/,\\ ,g' | tee $INST
 if [ ! "$(cat $INST)" = "" ]; then
  echo
  echo "Making symbolic link creation script:"
-  make_install_script $INST | tee doinst.sh
+  sed 's,\(.*\)/\(.*\) -> \(.*\),( cd \1 \; rm -rf \2 )\n( cd \1 \; ln -sf \3 \2 ),' $INST | tee doinst.sh
 fi
 echo
 if [ ! "$(cat $INST)" = "" ]; then

EDIT: I have updated the patch above since I first published it, so that /*/ is used instead of @@SPACE@@ and /*>/ is used instead of @@ARROW@@. They are shorter and IMHO even less likely to generate false positives. It would require the package to contain directories called * or *> which seems exceptionally unlikely.

When converting large numbers of links, this is indeed many times faster. EDIT: I recreated the ./makepkg-test experiment from above and this time I was able to complete the conversion of 16548 symlinks (from 1271 packages) in 1.4 seconds, while the old make_install_script function takes around 10 minutes. I also diff'ed the output created by both methods and it is identical.

EDIT: removed the regex 's,^./,,' because it is possible to achieve the same with 'find *', instead of 'find .'

DarkVision 10-19-2013 08:33 AM

Thanks ruario... i applied the following patches based on your suggestions and the kawoken-package seem to build/install/remove fine. I will now patch a clean install of Slackware 14 and test building some packages to see how well it works.

makepkg also seem to be much faster now too, thanks :D

For the record, here are the patches i applied to makepkg and removepkg on Slackware 14.0.

Code:

--- makepkg.orig        2009-06-02 06:12:30.000000000 +0200
+++ makepkg        2013-10-19 12:38:03.000000000 +0200
@@ -47,22 +47,6 @@
  sleep 5
 fi
 
-make_install_script() {
-  COUNT=1
-  LINE="$(sed -n "$COUNT p" $1)"
-  while [ ! "$LINE" = "" ]; do
-  LINKGOESIN="$(echo "$LINE" | cut -f 1 -d " ")"
-  LINKGOESIN="$(dirname $LINKGOESIN)"
-  LINKNAMEIS="$(echo "$LINE" | cut -f 1 -d ' ')"
-  LINKNAMEIS="$(basename "$LINKNAMEIS")"
-  LINKPOINTSTO="$(echo "$LINE" | cut -f 3 -d ' ')"
-  echo "( cd $LINKGOESIN ; rm -rf $LINKNAMEIS )"
-  echo "( cd $LINKGOESIN ; ln -sf $LINKPOINTSTO $LINKNAMEIS )"
-  COUNT=$(expr $COUNT + 1)
-  LINE="$(sed -n "$COUNT p" $1)"
-  done
-}
-
 usage() {
  cat << EOF
 
@@ -194,12 +178,11 @@
 echo "Searching for symbolic links:"
 # Get rid of possible pre-existing trouble:
 INST=$(mktemp $TMP/makepkg.XXXXXX)
-# This requires the ls from coreutils-5.0 (or newer):
-find . -type l -exec ls -l --time-style=long-iso {} \; | while read foo ; do echo $foo ; done | cut -f 8- -d ' ' | cut -b3- | tee $INST
+find * -type l -printf '%p/*>/%l\n' | sed 's, ,/*/,g;s,/\*>/, -> ,;s,/\*/,\\ ,g' | tee $INST
 if [ ! "$(cat $INST)" = "" ]; then
  echo
  echo "Making symbolic link creation script:"
-  make_install_script $INST | tee doinst.sh
+  sed 's,\(.*\)/\(.*\) -> \(.*\),( cd \1 \; rm -rf \2 )\n( cd \1 \; ln -sf \3 \2 ),' $INST | tee doinst.sh
 fi
 echo
 if [ ! "$(cat $INST)" = "" ]; then

Code:

--- removepkg.orig        2009-04-28 22:44:26.000000000 +0200
+++ removepkg        2013-10-19 12:42:26.000000000 +0200
@@ -149,7 +149,7 @@
 }
 
 extract_links() {
- sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p'
+ sed -n 's|^( *cd\( --\)\{,1\} \([^;]*[!-~]\) *; *rm -rf\( --\)\{,1\} \([^)]*[!-~]\) *) *$|\2/\4|p'
 }
 
 preserve_file() {
@@ -228,7 +228,7 @@
  if [ -L "$ROOT/$LINK" ]; then
    if [ ! "$WARN" = "true" ]; then
    echo "  --> Deleting symlink $ROOT/$LINK"
-    rm -f $ROOT/$LINK
+    rm -f "$ROOT/$LINK"
    else
    echo "  --> $ROOT/$LINK (symlink) would be deleted"
    fi
@@ -262,7 +262,7 @@
    if [ -f "$ROOT/$FILE" ]; then
      if [ ! "$WARN" = "true" ]; then
        echo "  --> Deleting $ROOT/$FILE (fmt man page)"
-      rm -f $ROOT/$FILE
+      rm -f "$ROOT/$FILE"
      else
        echo "  --> $ROOT/$FILE (fmt man page) would be deleted"
      fi


ruario 10-20-2013 03:29 AM

Cool, thanks for testing as well! Makre sure to report back even if you don't have problems. Just to confirm that the changes really are OK.

Although one suggested change. I get the feeling that the birthday boy does not want to alter the formatting of the shell script code based on his comment above (also, whilst TommyC7's suggestion is certainly nice, packages containing symlinks/directories starting with - have not been reported yet). Therefore you don't really need the more complex regex in removepkg, s,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p should be enough.

ruario 10-20-2013 05:09 AM

Of course if Patrick is open to accepting -- and && as improvements within the shell script generation code, you can make a regex for removepkg that should deal with all variants and remain pretty false positive resistant. Here is a fake doinst.sh with various different formatting possibilities (as allowed by the current regex, plus support for escaped spaces and the proposed -- and && improvements):

Code:

( cd usr/bin ; rm -rf link\ 1 )
( cd usr/bin ; ln -sf application link\ 1 )
( cd usr/bin && rm -rf link_2 )
( cd usr/bin && ln -sf application link_2 )
( cd -- usr/bin ; rm -rf -- "link 3" )
( cd -- usr/bin ; ln -sf -- application "link 3" )
( cd -- usr/bin && rm -rf -- link_4 )
( cd -- usr/bin && ln -sf -- application link_4 )
(cd usr/bin;rm -rf "link 5")
(cd usr/bin;ln -sf application "link 5")
(cd usr/bin&& rm -rf link_6 )
(cd usr/bin&& ln -sf application link_6 )
(  cd -- usr/bin  ; rm -rf -- link\ 7  )
(  cd -- usr/bin  ; ln -sf -- application link\ 7  )
( cd -- usr/bin &&  rm -rf -- link_8 )
( cd -- usr/bin &&  ln -sf -- application link_8 )

This can be processed with the following sed -n regex: s#^( *cd\( --\)\{,1\} \([^;&]*[!-~]\) *\(&&\|;\) *rm -rf\( --\)\{,1\} \([^)]*[!-~]\) *) *$#\2/\5#p

It would produce the following output:

Code:

usr/bin/link\ 1
usr/bin/link_2
usr/bin/"link 3"
usr/bin/link_4
usr/bin/"link 5"
usr/bin/link_6
usr/bin/link\ 7
usr/bin/link_8

EDIT: I compared this latest iteration against the original regex with symlinks on my system (from the 1271 installed packages), using this method again:

Code:

diff <(sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p' /var/log/scripts/*) <(sed -n 's#^( *cd\( --\)\{,1\} \([^;&]*[!-~]\) *\(&&\|;\) *rm -rf\( --\)\{,1\} \([^)]*[!-~]\) *) *$#\2/\5#p' /var/log/scripts/*)
There is still no difference in output.

That said, I am really starting to appreciate why Patrick is reluctant to allow too many changes to the output formatting of the symlink shell script code from makepkg. The more variation that is allowed the greater the complexity of the regex in removepkg to handle it. This increases the likelihood of false positives and a false positive could potentially mean a file that shouldn't be removed is removed. Additionally, there is an argument to be made that whilst support for -- and && and the benefits they bring would be nice to have, you aren't gonna need it given that thus far the current formatting has worked well for years. In the case of handling spaces in symlink names however a bug has now been reported, so it makes sense to try and handle that. Hence s,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p is probably still the best regex to test with, as I suspect it has the best chance of making its way into removepkg of the suggestions I have made thus far.

gnashley 10-20-2013 12:35 PM

Don't forget 'ln -fs' as opposed to 'ln -sf'...

I am a bit surprised that Patrick expressed any desire to change from ';' to using '&&'. I figured that the use of ';' was kept so that it would be obvious when link-creation lines were failing, by leaving the faulty links in the root of '/'. Either way, seeing the commotion makes me glad I use package-creation software which pretty much guarantees the consistency of the code in doinst.sh scripts. It's worth mentioning that (src2pkg) also verifies that links are valid and converts any absolute paths to relative paths.

ruario 10-20-2013 01:01 PM

Quote:

Originally Posted by gnashley (Post 5049161)
Don't forget 'ln -fs' as opposed to 'ln -sf'...

makepkg uses 'ln -sf':

Code:

$ wget -qO- http://mirrors.slackware.com/slackware/slackware-current/source/a/pkgtools/scripts/makepkg | grep -n 'ln '
60:  echo "( cd $LINKGOESIN ; ln -sf $LINKPOINTSTO $LINKNAMEIS )"

I do the same in my suggestions. Are you trying to suggest it should be 'ln -fs' and if so why?

DarkVision 10-20-2013 11:03 PM

Quote:

Originally Posted by ruario (Post 5048921)
Cool, thanks for testing as well! Makre sure to report back even if you don't have problems. Just to confirm that the changes really are OK.

I re-compiled about 900 packages and the doinst.sh scripts seem to be the same for every package except for those with spaces in links.

I also tried to remove a package created with the patched makepkg tool and spaces in links with the old removepkg command and that failed. The files with spaces in filenames will not get removed.

ruario 10-20-2013 11:34 PM

Quote:

Originally Posted by DarkVision (Post 5049396)
I also tried to remove a package created with the patched makepkg tool and spaces in links with the old removepkg command and that failed. The files with spaces in filenames will not get removed.

Yep, but those files are removed with the patched removepkg, right?


All times are GMT -5. The time now is 02:31 PM.