[SOLVED] Suggestions for programs for cross-referencing downloaded docs, etc.?

Moss · 12-30-2011, 12:26 PM

I have a bunch of downloaded web pages, PDFs, ISOs, etc. and have trouble finding them, so I end up downloading them several times (memory problems prevent me from remembering specific keywords). Right now, I have a system based on the Dewey Decimal system, but that's copyrighted and I'd like to dump it for something else. The intent is to put links to a document in several subject areas. This would be mostly on my computer, probably nothing physical.

Snark1994 · 12-30-2011, 01:23 PM

Well, a reasonably basic system would be to create a database of some sort (either a "proper" one or just a program in e.g. Python which writes/reads a file) storing file name, location and a bunch of keywords associated with the contents. Then, before downloading something again, you can just search the database to see if you have it already. Is this the sort of thing you were thinking of?
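A minimal sketch of that idea, assuming SQLite through Python's built-in sqlite3 module. The table name, column names and function names are made up for illustration, and the keywords are simply stored as one space-separated string:

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) the catalog database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " name TEXT NOT NULL,"
        " location TEXT NOT NULL,"
        " keywords TEXT NOT NULL)"
    )
    return conn


def add_file(conn, name, location, keywords):
    """Record one downloaded file with a list of keywords (stored lower-case)."""
    conn.execute(
        "INSERT INTO files (name, location, keywords) VALUES (?, ?, ?)",
        (name, location, " ".join(k.lower() for k in keywords)),
    )
    conn.commit()


def search(conn, word):
    """Return (name, location) for every file whose keywords contain `word`."""
    rows = conn.execute(
        "SELECT name, location FROM files WHERE keywords LIKE ?",
        (f"%{word.lower()}%",),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = open_catalog()  # pass a file name instead to keep the catalog on disk
    add_file(conn, "quaternions.pdf", "/docs/quaternions.pdf", ["maths", "rotations"])
    print(search(conn, "rotations"))
```

Note that the LIKE match is a substring match, so searching for "rot" would also find this entry.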

Edit: Also, I don't think the concept behind the Dewey Decimal system is copyrighted, more its application. So you would be well within your rights to use the idea of assigning values to different categories of documents, though its value for remembering whether or not you've already downloaded a file is questionable...

Finally, many files such as .iso images have MD5 hashes on the download page. If you stored this information in your database, then even if you didn't pick up the file based on keyword searches, you would be able to pick it up based on the MD5 (and then presumably add more keywords to the entry, so you could find it again next time).
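As a rough sketch of the MD5 check, assuming Python's standard hashlib module: `md5_of_file` computes the hash of a file you already have, and `already_downloaded` looks up the hash printed on a download page. Both function names, and the plain dict standing in for the database, are hypothetical:

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks so large ISOs fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def already_downloaded(known_hashes, published_md5):
    """Look up a hash from a download page.

    `known_hashes` maps MD5 hex digests to file locations (a stand-in for a
    database column). Returns the stored location, or None if it is new.
    """
    return known_hashes.get(published_md5.strip().lower())
```

The idea would be to run `md5_of_file` once when a file is first catalogued, store the result, and call `already_downloaded` with the page's hash before fetching anything again.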

Moss · 12-30-2011, 01:58 PM

Sounds possible, but I was thinking of something with a framework already built. If you are familiar with the Dewey Decimal system used by some libraries (mostly here in the US, IIRC), there's a class of subjects with a sub-class under that, and a sub-sub-class under that.

5__.___ - Science
59_.___ - Zoological sciences
599.___ - Mammals
599.?__ Dogs
599.??_ Collies
etc.

I don't recall what is used in Welsh public libraries.

I would want to cross-reference along those lines: for Perl, say, what Perl would run on, programs written in Perl, what books to read for advice, what sites to go to for information, etc.

The reason for looking for some framework already built is that the data entry _alone_ is going to be a bear, never mind building the framework to fill in.

Moss · 12-30-2011, 02:25 PM

On second thought, what I was considering was going to involve a very long path name plus file name, and I've already exceeded whatever is allowed for the CDs I tried to back up to. Your system would be flat, avoiding that problem. Thanks, I'll try that.

Snark1994 · 12-31-2011, 10:21 AM

I am familiar with the Dewey Decimal system, but as far as I know, there's an "official" designation which some board (Library of Congress?) decides upon and is made standard for a book. So if you have a .pdf explaining how to use quaternions to represent rotations in 3D space, which you're planning to use in a spaceship game (to pick an example from my desktop), you wouldn't want to have to decide whether this is under "Maths > Rotations" or "Programming > Games" or "Projects > Background reading" or whatever categories you have set up. Ideally you want a completely objective way of getting from file to the record.

Also, if you wanted a "ready-made" solution, you don't need to code it yourself. A MySQL database would do reasonably well: you can have the fields I suggested above (plus any others you care to add yourself), and a keyword search would just be:

Code:

SELECT * FROM file_table WHERE keywords LIKE '%quaternions%'
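
If one document needs to sit under several subject areas, one possible variation is to keep the keywords in their own table instead of a single text column. The sketch below uses SQLite from Python's sqlite3 module rather than MySQL, purely so it runs without a server; the `files` and `tags` tables and all function names are assumptions for illustration:

```python
import sqlite3

# One row per file, plus one row per (file, tag) pair, so a single file can
# appear under as many subjects as you like without a long path name.
SCHEMA = """
CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    location TEXT NOT NULL
);
CREATE TABLE tags (
    file_id INTEGER NOT NULL REFERENCES files(id),
    tag TEXT NOT NULL,
    UNIQUE (file_id, tag)
);
"""


def make_db(path=":memory:"):
    """Create a fresh database with the two tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_file(conn, name, location, tags):
    """Insert a file and attach each tag to it (tags stored lower-case)."""
    cur = conn.execute(
        "INSERT INTO files (name, location) VALUES (?, ?)", (name, location)
    )
    conn.executemany(
        "INSERT OR IGNORE INTO tags (file_id, tag) VALUES (?, ?)",
        [(cur.lastrowid, t.lower()) for t in tags],
    )
    conn.commit()


def files_tagged(conn, tag):
    """Return the names of all files carrying exactly this tag, sorted by name."""
    rows = conn.execute(
        "SELECT f.name FROM files f JOIN tags t ON t.file_id = f.id"
        " WHERE t.tag = ? ORDER BY f.name",
        (tag.lower(),),
    )
    return [r[0] for r in rows]
```

Unlike the LIKE query, this matches whole tags only, so "game" would not match a file tagged "games".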

Sorry if I've misunderstood you at all.

As always, feel free to come back here if you want any more help...