(Searching was taking too long, so I wrote a new article for the record, for a good cause.)
Many people want to use wget to download a site while excluding directory trees by name pattern (glob), but following wget's man page doesn't seem to work.
Many, including me, had problems figuring out how to use -X, and found the answer hard to remember (years can pass between uses).
This trick is hard to remember, so write it down.
# does not work for directories
$ wget -X fo*o ...
# works for dirs
$ wget -X */fo*o,*/*/fo*o,*/*/*/fo*o ...
(if the asterisks disappeared in the line above, here it is again with them escaped)
$ wget -X \*/fo\*o/,\*/\*/fo\*o/,\*/\*/\*/fo\*o/ ...
ANSWER:
Site hack: patch wget-?/src/utils.c so that it matches against the basename of the directory instead of the full current path. On the command line, use a plain name pattern (just 'fo*o', nothing else). Note that the flags parameter is changed from FNM_PATHNAME to 0 (with FNM_PATHNAME, every '/' in the path has to be matched by a literal '/' in the pattern). The following patch is against wget-1.12. There may be a simpler way by defining ?FNM_FLAGS, but this works "fine".
<code>
--- utils.c.old 2016-09-13 07:49:11.000000000 -0400
+++ utils.c 2016-09-13 09:32:58.000000000 -0400
@@ -907,6 +907,9 @@
   return *d1 == '\0' && (*d2 == '\0' || *d2 == '/');
 }
 
+/* for basename */
+#include <libgen.h>
+
 /* Iterate through DIRLIST (which must be NULL-terminated), and return the
    first element that matches DIR, through wildcards or front comparison (as
    appropriate). */
@@ -921,18 +924,24 @@
     {
       /* Remove leading '/' */
       char *p = *x + (**x == '/');
+      /* SITE HACK - only if patterned ignore leading dirs cmp as file */
+      char sh_str[1024*16], *pp;
+      strcpy(sh_str,basename(dir));
+      pp=sh_str;
+#if 0
+      printf("? %s == %s ?\n",p,pp);
+#endif
       if (has_wildcards_p (p))
         {
-          if (matcher (p, dir, FNM_PATHNAME) == 0)
+          if (matcher (p, pp, 0) == 0)
             break;
         }
       else
         {
-          if (subdir_p (p, dir))
+          if (subdir_p (p, pp))
             break;
         }
     }
-
   return *x ? true : false;
 }
</code>
This is my example of use, however demented the result might be. Prepending each level of */, */*/, */*/*/ to every pattern would obviously be tedious.
$ wget \
--no-remove-listing -L -r -nc -np -nH -l 10 -p --limit-rate=127k \
-X '*-alpha*,*-arm*,*-arm64*,*-hppa*,*-ia64*,*-m68k*,*-mips*,*-sparc*,*-amd64*,*-armel*,*-armhf*,*-mipsel*,*-powerpc*,*-ppc64el*,*-s390x*,*-s390*,*-kfreebsd*' \
-R '*_alpha*,*_arm*,*_arm64*,*_hppa*,*_ia64*,*_m68k*,*_mips*,*_sparc*,*_amd64*,*_armel*,*_armhf*,*_mipsel*,*_powerpc*,*_ppc64el*,*_s390*,*_kfreebsd*,*-alpha*,*-arm*,*-arm64*,*-hppa*,*-ia64*,*-m68k*,*-mips*,*-sparc*,*-amd64*,*-armel*,*-armhf*,*-mipsel*,*-powerpc*,*-ppc64el*,*-s390*,*-kfreebsd*' \
http://archive.debian.org/debian/
CONCLUSION: enjoy!