Bug in makepkg and symlinks with blanks in filename?
Hi...
i was trying to speed up packaging of the kawoken-icon-theme which includes hundreds of symlinks. makepkg takes ages here to move the links to the doinst.sh script. I was playing with the find-command to create a doinst.sh contain the same symlinks that would be created later by much slower code used by makepkg. When i checked that my find-command would produce the same doinst.sh code like makepkg i figured out that some links by makepkg were wrong if the symlink or filename includes blanks. And there are a few in this package. Here is a small test script that creates a testing package with only one file and a symlink, both with blanks in the filename. The output of my find command uses " around filenames so blanks will not split filenames any longer. The output of makepkg is different and in my opinion not correct: Code:
#!/bin/sh Quote:
I don't take care about it since my find-code seem to work with those special symlinks (at least a diff against makepkgs doinst.sh for this special case does not show much differences except filenames with blanks) and speeds up packing from about 50min down to 3min but i wonder if this is something that should be fixed or not... |
You know, it might be better to "fix" the problem before you start.
Here's a little shell program, blank_rename.sh, that I use prior to doing anything with "sentence" file names; it replaces the spaces with underscores, file names, directory names, whatever. Kinda handy when you get stuff from winidiots... Hope this helps some. Code:
#!/bin/bash |
I know how to handle filenames with blanks, this is not the problem. Thanks anyway for your script.
I also don't use blanks in filenames myself and i know why. But can you be sure that every source package available does not make use of blanks in filenames? If you can not be sure... you should either take care about it in your program or at least show a warning that the source files include some stupid filenames. As i said... i don't take care about it. I already added a sanity check to my scripts just in case there are other source packages with that kind of filenames. P.S. And i'm not sure if icon names with blanks replaced by underscore will continue to work. |
I can't recall ever seeing any Unix/Linux package that had spaces (or carets or vertical bars or dollar signs (or euros or pounds) anything meaningful to the shell) in file or directory names -- only crap from Windows. Anytime I do get anything at all from Windows users I just walk the tree and change all the file masks (644 for files, 755 for directories) and use that little shell program to fix the spaces. I do that first, saves a heckuva lot of time and trouble. If there happens to be a Makefile with embedded spaces (and quoted file names), I can deal with that with an editor quick and and easy but, on the other hand, I've never tried to build anything from Windows so I dunno what kind of a mess it would be.
I seem to remember a CD-ROM that came with a book on graphics that was all Windows source. And, of course, it was also all Windows-specific graphic libraries. Tossed the book and used the CD-ROM for scaring away birds from the garden (they're really pretty when the sun is shining and there's a little breeze). In Unix/Linux systems, a file name is just a file name and if you have a graphic it won't matter if there's an underscore in it, it'll open just like any other file would; there's almost never anything embedded that would prevent that, at least nothing I know of. Sorry that it didn't help. |
Well... the package i was trying to build is an icon theme especially for linux KDE or XFCE so it is nothing windows specific.
The filename that su**z is Quote:
moka-icon-theme I'm sure this is a really rare condition and maybe there are no other source packages out there using such stupid filenames... but who knows? P.S. Here is another filename (there is no symlink for it so makepkg would not fail here... anyway): Quote:
|
I think I agree with DarkVision.
If you write a program that accepts input and someone throws input at it that it can't handle then the sensible solution must be to adjust your program accordingly. Even if, as in this scenario, the program exits and says 'Please fix the spaces in the filenames'. The better solution would probably be rewrite the program code so it can deal with the spaces. You may write a program and think you've covered all eventualities but once you've released it in to the wild you never know what users will do with it. |
Quote:
I will rebuild a bunch of packages and check for those stupid filenames but i'm sure these are the only ones causing trouble here. |
Such things are rare in 'nix-ish sources, but do occur. I recently had the problem (spaces in filenames)when building and packaging MIT 'scratch'. I definitely agree that it should be fixed in makepkg -if you simply hope to notice the problem when it occurs and fix the problem in the sources, you'll more likely miss the problem.
|
I can not speak for Patrick, but I think that suggestions for fixes in Slackware's native tools will be taken more seriously when they are accompanied by patches that actually fix the issue.
Eric |
I did spend some time looking at this last night. What I'll note is that a solution isn't going to be as easy as wrapping arguments in quotes in all cases. It would be fine to use quotes *only* if spaces are detected in a filename or directory, but otherwise the script lines need to be produced so that they are exactly the same as the ones made by the existing script. Otherwise, comparisons between scripts made with the new script and scripts made with the old script aren't going to work.
|
Quote:
Quote:
Code:
make_install_script() { As i said.. i'm not good at that, and i'm sure someone else has a much better idea, but might that be a way to handle it? |
How about something like this.
Note: I have only very quickly tested this as I am at work and I don't really have the time to do this right now. It worked for a single, test package. I also have no doubt it will need more work but perhaps it is a initial step towards something better and maybe it will give someone ideas to play around with: makepkg.patch: Code:
--- makepkg.orig Code:
--- removepkg.orig |
Quote:
Quote:
For the kawoken-icon-theme you may get some 8000 links in the doinst.sh ... might be difficult to find files / symlinks with spaces in there just by reading the script created by makepkg. Of course you could check before creating the package if there are symlinks or files with spaces. BTW i just checked against /var/log/packages here and there are a few packages that have filenames with spaces (python 2.7 for example, found a few custom packages also). This does not mean that there is a generic problem with spaces, i have not tested what happens if i remove a package that only contains filenames and directories with spaces but no symlinks. I just want to make clear that there are packages around that are using spaces in filenames and it might be also more complex to work around that problem by patching the applications. For the kawoken-icon-theme you may need to patch applications that use those icon-names too, not only the icon-theme itself. And you are correct that the other tools like removepkg also need to be fixed, i hadn't that in mind. If this is not going to be fixed there should be at least a warning in makepkg that there might be filenames/symlinks with spaces. A simple find command could help here. |
The code wrapped in parentheses executes in an asynchronous sub-shell. I'm not sure it that sub-shell inherits the parent environment. I'm confused by the printf statement. Its parens probably need to be escaped with a backslash. I don't see an -exec command in the find statement.
|
@hpfeil: I'm not sure what you are talking about. If you are talking about DarkVision's suggested fix and use of subshells, have a look at your original makepkg, it uses subshells to generate the same variables.
As for my use of find's builtin printf, I'm not sure what you think is wrong but it generates the correct string. It does not need an exec because I do not execute an external command. Nonetheless, I have just tweaked it slightly to make it more readable. EDIT: If you feel I changed that find command too much here is an alternative version, that just tweaks the original very slightly. I warn you however, it ain't pretty! Code:
find . -type l -exec ls -l --time-style=long-iso {} \; | while read foo ; do echo $foo ; done | cut -f 8- -d ' ' | cut -b3- | sed 's/ -> /@@ARROW@@/;s/ /@@SPACE@@/g;s/@@ARROW@@/ -> /' | tee $INST | sed 's/@@SPACE@@/\\ /g' |
By using sed's extended regular expression capability (-r) it is possible to do a slightly neater patch to removepkg to work with my patched makepkg:
Code:
--- removepkg.orig Code:
--- removepkg.orig |
Quote:
I tested the current regex against the one from my last example as follows: Code:
diff <(sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p' /var/log/scripts/*) <(sed -n 's,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p' /var/log/scripts/*) I also checked to see if there was a large speed difference between them but there isn't: Code:
$ time sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p' /var/log/scripts/* >/dev/null Now I'll try and find a way to extensively test my suggested makepkg improvements and report back if I come up with something. EDIT: P.S. I also checked the extended regex variant. It works as well, though it is a little bit slower. |
Ok, I worked out a nice way of testing my makepkg changes against my 1271 installed packages. Firstly, I extracted out the code from the original/official makepkg that does the link display and shell script conversion and tweaked it just enough to make a short test script that looks as follows:
Code:
#!/bin/sh Code:
# time ./makepkg-test > original-test Code:
26c26 Code:
# time ./makepkg-test > updated-test The only other major consideration is that my updated version would have a problem if a package contained symlinks with "@@ARROW@@" or "@@SPACE@@" in their names, though IMHO this would be less likely than having spaces in the name. If this is a concern however, these keywords could be switched to something else. EDIT: P.S. If anyone is curious: Code:
$ wc -l updated-test |
While we're on this topic, we could also make use of:
Code:
echo "( cd -- $LINKGOESIN ; rm -rf -- $LINKNAMEIS )" |
Quote:
|
Quote:
That said, removepkg itself could probably accommodate the old way, with space escaping and TommyC7's proposal using a regex like this: Code:
s|^( *cd\( --\)\{,1\} \([^;]*[!-~]\) *; *rm -rf\( --\)\{,1\} \([^)]*[!-~]\) *) *$|\2/\4|p |
Ah poo. :(
|
Quote:
Code:
--- makepkg.orig When converting large numbers of links, this is indeed many times faster. EDIT: I recreated the ./makepkg-test experiment from above and this time I was able to complete the conversion of 16548 symlinks (from 1271 packages) in 1.4 seconds, while the old make_install_script function takes around 10 minutes. I also diff'ed the output created by both methods and it is identical. EDIT: removed the regex 's,^./,,' because it is possible to achieve the same with 'find *', instead of 'find .' |
Thanks ruario... i applied the following patches based on your suggestions and the kawoken-package seem to build/install/remove fine. I will now patch a clean install of Slackware 14 and test building some packages to see how well it works.
makepkg also seem to be much faster now too, thanks :D For the record, here are the patches i applied to makepkg and removepkg on Slackware 14.0. Code:
--- makepkg.orig 2009-06-02 06:12:30.000000000 +0200 Code:
--- removepkg.orig 2009-04-28 22:44:26.000000000 +0200 |
Cool, thanks for testing as well! Makre sure to report back even if you don't have problems. Just to confirm that the changes really are OK.
Although one suggested change. I get the feeling that the birthday boy does not want to alter the formatting of the shell script code based on his comment above (also, whilst TommyC7's suggestion is certainly nice, packages containing symlinks/directories starting with - have not been reported yet). Therefore you don't really need the more complex regex in removepkg, s,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p should be enough. |
Of course if Patrick is open to accepting -- and && as improvements within the shell script generation code, you can make a regex for removepkg that should deal with all variants and remain pretty false positive resistant. Here is a fake doinst.sh with various different formatting possibilities (as allowed by the current regex, plus support for escaped spaces and the proposed -- and && improvements):
Code:
( cd usr/bin ; rm -rf link\ 1 ) It would produce the following output: Code:
usr/bin/link\ 1 Code:
diff <(sed -n 's,^( *cd \([^ ;][^ ;]*\) *; *rm -rf \([^ )][^ )]*\) *) *$,\1/\2,p' /var/log/scripts/*) <(sed -n 's#^( *cd\( --\)\{,1\} \([^;&]*[!-~]\) *\(&&\|;\) *rm -rf\( --\)\{,1\} \([^)]*[!-~]\) *) *$#\2/\5#p' /var/log/scripts/*) That said, I am really starting to appreciate why Patrick is reluctant to allow too many changes to the output formatting of the symlink shell script code from makepkg. The more variation that is allowed the greater the complexity of the regex in removepkg to handle it. This increases the likelihood of false positives and a false positive could potentially mean a file that shouldn't be removed is removed. Additionally, there is an argument to be made that whilst support for -- and && and the benefits they bring would be nice to have, you aren't gonna need it given that thus far the current formatting has worked well for years. In the case of handling spaces in symlink names however a bug has now been reported, so it makes sense to try and handle that. Hence s,^( *cd \([^;]*[!-~]\) *; *rm -rf \([^)]*[!-~]\) *) *$,\1/\2,p is probably still the best regex to test with, as I suspect it has the best chance of making its way into removepkg of the suggestions I have made thus far. |
Don't forget 'ln -fs' as opposed to 'ln -sf'...
I am a bit surprised that Patrick expressed any desire to change from ';' to using '&&'. I figured that the use of ';' was kept so that it would be obvious when link-creation lines were failing, by leaving the faulty links in the root of '/'. Either way, seeing the commotion makes me glad I use package-creation software which pretty much guarantees the consistency of the code in doinst.sh scripts. It's worth mentioning that (src2pkg) also verifies that links are valid and converts any absolute paths to relative paths. |
Quote:
Code:
$ wget -qO- http://mirrors.slackware.com/slackware/slackware-current/source/a/pkgtools/scripts/makepkg | grep -n 'ln ' |
Quote:
I also tried to remove a package created with the patched makepkg tool and spaces in links with the old removepkg command and that failed. The files with spaces in filenames will not get removed. |
Quote:
|
Quote:
I think that is OK and much better then install broken symlinks with an unpatched makepkg. |
Quote:
If time could be turned back then && would be much better, IMHO. It is up to Patrick however to decide is it is worth doing now. I provided a regex for removepkg to work with both methods but there may be other scripts that could break as they do not expect this. |
An example of how changes to the symlink shell script formatting can break things. I just tested Spkg (a re-implemented of installpkg, upgradepkg and removepkg in C, optimised for speed). Whilst not installed on Slackware itself by default, I know that some people do use it. Additionally it used by some Slackware derivatives. For example, Salix installs it (alongside the official pkgtools) and most Salix users would use it by default as their slapt-get version uses it in place of the original Pkgtools. Spkg naturally expects the current formatting conventions in symlink shell script code in doinst.sh. Adding support for -- and && results in uninstall of all symlinks failing.
If I use a patched makepkg to produce packages that contain some symlinks with escaped spaces in their names, then Spkg fails to remove those symlinks but the rest are still removed correctly. So this reaffirms (to me at least) that the more cautious, minimal change of only using escaping when needed causes the least problems for current tools that might already be out there. EDIT: To play devil's advocate, I suppose a counter argument is that if you change the format and all third party tools expecting the current format break in all cases, those tools are likely to be fixed sooner as their author will probably notice immediately. If you allow one small change (space escaping) those maintaining other tools may not notice for years as spaces in file names are not common in *nix, so the author might not experience the issue. |
I wasn't advocating for either 'ln -fs' or the use, or not, of '&&' instead of ';'.
|
OK... looks like it's not that easy (3rd-party apps need to be fixed) so for now i will use ruarios find/sed code in slackbuilds and remove any symlinks before i call makepkg if the package contain symlinks with spaces.
If i don't do that and a user want to build a slackbuild on a system with an unpatched makepkg he will get a corrupt doinst.sh script (even if that may happen for about one in a million packages only). On the other side the only problem may be that removepkg will not be able to remove symlinks with spaces and that other 3rd-party tools may not be able to handle that package correct. Well... this is something i can not take care about. I'm sure patching makepkg and removepkg may be the better solution, but for now slackbuild maintainers must check for spaces in symlinks and apply a workaround to the script since they can not be sure that the package will be compiled on a system with a patched makepkg. As mentioned above a corrupt script could remove existing files if parts of the symlink filename match other files in the same directory. Right now we were talking about symlinks with spaces in the filename. I checked what will happen with an unpatched makepkg if you create a symlink inside a directory with spaces in the directory name. Here is my test script: Code:
#!/bin/sh Quote:
As i said above... i will check for spaces and use ruarios code to create a doinst.sh manually for the slackbuilds affected by that. So this is temporary solved for me right now. Thanks a lot for the help ruario :) P.S. To give a more "real world" example, here is another script: Code:
#!/bin/sh Code:
Make symlinks in doinst.sh using find... Yes, i know... all examples here are more or less constructed, may only appear once in a million. :D |
Why the use of the recursive flag when deleting symlinks?
Code:
( cd path/to/somewhere ; rm -rf somelink ) Or is the purpose really to remove whatever happened to be in the way, even if it is a directory? The total lack of security is amazing... Test if it is a link, and only then remove it. Otherwise inform the user about which link couldn't be made and needs to be made manually. |
It is intentionally to allow a symlink to a replace a directory. 'rm -rf linkname/' (with the final slash) will not be created by the shell script creation code, so this is irrelevant.
The "security" is the person doing the packaging. He/She need to make sure the package does what it should and no more. All official packages are built by Pat. I think we can trust him. Any extra packages should only come from a reputable source or you should make them (and hence check them yourself). |
Just to be clear, on post install rm -rf somelink is used to clear stuff out of the way prior creating the symlink. However on uninstall of the package -r is not used, it would be deleted as rm -f path/to/somewhere/somelink and only if it is not found in some other package.
|
Quote:
Quote:
|
Quote:
Quote:
|
3rd-party slackbuilds are exactly the same. The person who writes it has a responsibility to check that it works as expected. Given that almost nobody (perhaps just you and me thus far) will have patched their makepkg/removepkg to handle symlinks with spaces in their paths, the slackbuild author should assume that they will not be handled and find some way to work around this. There are several possible work-arounds e.g. hard link instead, copy the file so that a symlink is not needed, rename the file or directory (assuming this will not break how the package works), delete them (if the particular files are not essential), etc. The packager might also want to ask upstream if they can change the source package to be more UNIX-like in its file naming conventions, thus eliminating the issue.
Before they start packaging the SlackBuild creator can check for spaces in symlink paths in the original source package with a command like the following: Code:
find . -type l -printf '%p>%l\n' | grep ' ' If at some later stage Pat starts shipping Slackware with a makepkg and removepkg that can handle symlinks with spaces in their path, the slackbuild creator/maintainer may choose to remove any workaround they have implemented (though this would make their Slackbuild will only compatible with that version of Slackware and above, so they might want to leave it in place if they care about users of older versions of Slackware). Anyway, I wouldn't expect an immediate response from volkerdi on how he intends to handle this right now, since 14.1 is just about to release. I think it is unlikely that he would make changes to pkgtools at this late stage, so any fix (if there is one) will likely roll into current after 14.1 ships. Which means he doesn't need to spend too much time thinking about this right away. |
Quote:
Anyway i agree with this comment: Quote:
|
Quote:
Code:
gazl@ws1:/tmp$ mkdir wibble doinst.sh runs as root, so a malicious package doesn't need to play silly games with symlinks in order to wreak havok, so this approach doesn't make the situation any worse than it already is from a security standpoint. A discussion about the most robust approach to handling a package that contains a symlink replacing an existing non-empty directory is something that is likely to have many facets. I believe that other packaging systems handle this sort of thing by throwing out packaging conflict errors, aborting the package install and requiring the end user to resolve them manually. Slackware's approach is not to burden the end user with this and just to clobber the existing stuff. Which is better is a matter of opinion. P.S. The safest way to replace an existing empty directory with a symlink is "ln -Tsf", no need for a "rmdir" or "rm -rf" first. |
@DarkVision: As a side note, just by having started this thread you have gone some way to alleviating the problem. volkerdi and AlienBob have commented in the thread and hence know about the issue, so will probably be checking for spaces in symlink paths for any new packages they choose to maintain (assuming of course they hadn't been doing this already). Additionally with 1,550 views to this thread thus far, I'd be surprised if one or more of the SBo team haven't seen it as well, so they can keep an eye on new submissions.
Perhaps gnashley might want to consider adding a warning/error to src2pkg (and/or pkg-check) if you try to make a package with spaces in the symlink path? |
Quote:
Code:
if find * -type l -printf '%p>%l\n' | grep -q ' '; then |
Quote:
Really... in about one year no one would remember this thread. This is something really rare but has the possibility to delete existing files from other packages. Just because of that i would welcome something like a warning... i would simply check if LINKGOESIN/LINKNAMEIS does exist in the package while creating the doinst.sh script. If not something went wrong. Just a reminder to the developers. Then they can decide what to do. This would not break anything... My scripts will simply throw out a warning for now so i don't get hit by that anymore, no matter how the developers or makepkg will handle that in the future. |
Quote:
|
Quote:
Quote:
|
Quote:
|
Quote:
Quote:
|
All times are GMT -5. The time now is 10:08 AM. |