Atta boy, Piete! You are definitely on the right track. I've been working on this lately, using a port of the crux ports program called crux4slack, which is used by delilinux. You can easily find and get a copy of it to further your work a bit. I've also downloaded all the actual ports directories for versions 2.1, 2.2, 2.3 and 2.4 of crux, plus some for delilinux.
The ports Pkgfile scripts are going to be very easy to work with in src2pkg, either by translating them to src2pkg format or by working with them directly. They are also quite similar to the format used by a couple of other Slackware package-builder programs. I like the crux repositories because they have quite a few ports available and will be easy to use with src2pkg. I've looked at using gentoo ebuilds or BSD ports, but they would be much harder to integrate.
The crux ports system uses a program called httpup to 'mirror' a whole directory to your own ports dir. The newer versions can also use rsync, and I think there is a way to use cvs as well (from the cruxPPC distro). Some time ago I was fooling with an old program called 'snarf' which could also be used for http or ftp protocols. (I dislike httpup because it's written in C++.)
But you definitely have the right idea. I'd like to see this work by having a list of mirrors, a list of possible categories and a list or specification of possible targets. A default list of mirrors could be set, and then each program could have its own list of category or subdirectory paths.
I can see that you've been reading the posts where I have talked about this subject lately. You are right about my site being different; I have considered re-arranging it. Still, my site is not like most 'ports' directories, because it is not a reflection of versions which have been released for a particular distro version, and for some programs I purposely have more than one version.
Also, the directories for official Slackware SlackBuilds are not the same as the categories used by slackbuilds.org. And Bob and Robby both have more SlackBuilds in their slackware.com home directories, which use still other directory structures.
All this makes it harder to implement any sort of 'search' capability in src2pkg. Of course we could specify these different (sub)dir layouts, but there may always be more variants to deal with.
I've done a little bit of research into how to search websites recursively over http or ftp. I haven't found out much for sure, but I seem to remember seeing something awhile back about 'lynx' being able to do something like --dumplinks and return a list of links contained in a webpage. I'm wondering if this can be used to do a sort of 'ls' of a directory over http/ftp. If we could get something like this working, it would be much easier to integrate a search function into src2pkg without always having to know the path to the port/build directory.
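Just to sketch that 'ls over http' idea: lynx's -dump combined with -listonly prints a numbered list of the links found in a page, and stripping the numbers leaves bare URLs. The exact output format (and whether every server's index page cooperates) is an assumption that would need testing:

```shell
# Parse a lynx link list into bare URLs. Input lines look roughly like:
#   1. http://example.com/a.tar.gz
# (the leading-number format is an assumption; adjust the sed to taste)
strip_link_numbers() {
  sed -n 's/^ *[0-9][0-9]*\. //p'
}

# Crude 'ls' of a directory served over http, assuming lynx is installed.
http_ls() {
  lynx -dump -listonly "$1" 2>/dev/null | strip_link_numbers
}
```

The result could then be filtered for tarball names or trailing slashes (subdirectories) before deciding what to fetch.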
The crux ports program will download a whole dir, and so can snarf. Of course wget and rsync can also do this. Just for handling raw sources/tarballs I need to handle ftp, http, rsync, cvs, svn and probably git. The newer crux ports system uses both rsync and http/ftp protocols.
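For the multi-protocol fetching, a simple dispatcher could route each URL to the right tool based on its scheme. This is just my guess at a structure, not anything src2pkg actually contains; the cvs case is left as a stub since cvs locations don't follow the scheme:// convention cleanly:

```shell
# Route a source URL to the right download tool, based on its scheme.
# The tool choices here are the obvious defaults, not src2pkg internals.
fetch_source() {
  case $1 in
    http://*|ftp://*) wget "$1" ;;
    rsync://*)        rsync -av "$1" . ;;
    svn://*)          svn checkout "$1" ;;
    git://*)          git clone "$1" ;;
    cvs*)             echo "cvs needs CVSROOT handling" >&2; return 1 ;;
    *)                echo "unknown protocol: $1" >&2; return 1 ;;
  esac
}
```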
For a raw search facility in src2pkg it would be nice to be able to just tell it something like 'src2pkg --search mysuperprog' or 'src2pkg --search mysuperprog-1.3' and have it search some well-known source repos -like sourceforge, gnome, kde etc. In addition, searching specifically for SlackBuilds, src2pkg/PkgBuilds or Pkgfiles would of course be excellent.
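As a rough sketch of what '--search' might do first: build candidate URLs for the well-known repos and then probe each one. The mirror list and path layouts below are invented placeholders; the real sites each have their own layouts and would need per-site handling:

```shell
# Print candidate source URLs for a program name. The mirror list and
# path layouts are made-up examples, not the real repos' layouts.
search_candidates() {
  prog=$1
  printf '%s\n' \
    "http://downloads.sourceforge.net/$prog/$prog.tar.gz" \
    "http://ftp.gnome.org/pub/GNOME/sources/$prog/" \
    "ftp://ftp.kde.org/pub/kde/stable/"
}
```

Each candidate could then be probed with something like 'wget --spider' before offering it to the user.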
It seems to me that almost all of the ports systems keep the build script and all extra materials together in one subdirectory for each program. Most of them specify the URL of the original sources as part of the build script -either as a full path, or as just the tarball name plus a list of mirrors to search.
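For reference, a minimal crux-style Pkgfile looks roughly like this (the program name and URL are made up; the fields are the crux ports convention as I remember it, and Pkgfiles are bash scripts, so the source array needs bash):

```shell
# Description: Example program (hypothetical)
# URL: http://example.org/
# Maintainer: somebody

name=mysuperprog
version=1.3
release=1
source=(http://example.org/dist/$name-$version.tar.gz)

build() {
  cd $name-$version
  ./configure --prefix=/usr
  make
  make DESTDIR=$PKG install
}
```

Since the source URL is right there in a variable, translating one of these to src2pkg format (or just sourcing the file and reading $name, $version and $source) looks straightforward.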
src2pkg is already set up to work like this; it's just that for my own convenience (and that of visitors to my site) I've always kept sources and packages in the same directory.
There is also the possibility of generating databases which list what is available from certain sites, but these would have to be kept updated, could become rather large, and still might not cover all the possible searched-for items. For sites with a limited selection this might be okay. Still, it would be much better if we could implement a true search facility.
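The database idea could start very simply: dump a site's link list into a flat index file once, then grep it locally instead of hitting the network on every search. A sketch, assuming the lynx link-listing approach works at all; 'update_index' and 'search_index' are names I made up:

```shell
# Refresh a flat-file index of what a site offers (one URL per line),
# assuming lynx is installed.
update_index() {  # usage: update_index <url> <index-file>
  lynx -dump -listonly "$1" > "$2"
}

# Search the local index instead of hitting the network each time.
search_index() {  # usage: search_index <pattern> <index-file>
  grep -i "$1" "$2"
}
```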
Thanks very much for working on this and submitting a working example of code. I'm going to have a better look at it over the next couple of days. BTW, it fits well into the src2pkg philosophy that most of this code be kept outside of the package-building functions. It could be kept as a completely separate program or integrated into the src2pkg wrapper program.
Keep the ideas coming, and if you find out any info about whether lynx could be of some use, that would help. I've just been looking at these commands:
lynx -traversal -crawl -dump http://distro.ibiblio.org/pub/linux/...inux/download/
lynx -traversal -crawl http://distro.ibiblio.org/pub/linux/...inux/download/
lynx -crawl -dump http://distro.ibiblio.org/pub/linux/...inux/download/
The second one is what gives recursive output, but I think it needs to be redirected to a file or files.
Running these commands will produce a bunch of *.dat files in your $HOME; it's maybe a good idea to create a fresh dir and run the lynx commands from there, so as not to fill your home dir with stuff.
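That scratch-directory precaution could be wrapped up like this; 'crawl_in_scratch' is just a name I picked, and it assumes lynx is available:

```shell
# Run the recursive crawl from a throwaway directory so traverse.dat
# and friends don't clutter $HOME; print the directory for inspection.
crawl_in_scratch() {
  url=$1
  dir=$(mktemp -d) || return 1
  ( cd "$dir" && lynx -traversal -crawl "$url" >/dev/null 2>&1 )
  echo "$dir"
}
```

Afterwards the *.dat files can be picked through at leisure and the whole directory removed in one go.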
I have tried to find a way to get recursive dir listings over http/ftp using wget or some other program, but got no intelligent replies in a thread on the subject. If you have other ideas I'd love to hear them.
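One partial answer, for ftp at least: curl will return a directory listing when given an ftp URL ending in a slash, and its --list-only option asks the server for a name-only listing. The output format is server-dependent, so any parsing on top of this is left out; the example host is hypothetical:

```shell
# Name-only listing of an ftp directory, assuming curl is installed;
# --list-only (-l) makes curl request NLST instead of a full LIST.
ftp_ls() {
  curl -s --list-only "$1"
}

# usage (hypothetical host):  ftp_ls "ftp://ftp.example.org/pub/"
```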
Here's a link to the crawl.announce file which is mentioned in the man-page but not included with the Slackware lynx package.
http://www.neurophys.wisc.edu/comp/l..._announce.html