LinuxQuestions.org

Home Office Biotech Data Mining - Data Collection (https://www.linuxquestions.org/questions/linux-general-1/home-office-biotech-data-mining-data-collection-249330/)

Adler 11-01-2004 04:50 PM

320mb,

Jeez, lighten up a bit.

I'm not looking to become some type of spammer. If I were, there is enough stuff in the MS Universe to go after the NewsGroups and the participants there.

I think this could turn into a positive and informative exchange about what's out there to be had. There are certainly people out there with this type of knowledge, which is why I approached the Linux "Community".

I understand your concerns, but I think that you can appreciate my point.

TigerOC 11-02-2004 02:15 AM

Sounds like you need your own crawler. I found one, Larbin, which you may be able to tune to your own needs.

Adler 11-02-2004 07:23 AM

TigerOC,

Thanks for the link. It'll take some playing around with, but it looks interesting.

Adler 11-02-2004 05:04 PM

TigerOC,

The Larbin project doesn't look like it is what I'm looking for.

Any other ideas?

TigerOC 11-03-2004 03:51 AM

A quick Google search suggests this is a very technical area in which some of the major players (IBM et al.) are involved. Basically it is a multi-step process involving server clusters running crawlers, then consolidating the information in databases for rational use. It may be well worth your while contacting the author of Larbin and discussing your ideas with him.
My personal opinion is that unless you replicate this in some way, it is not possible. Your only other option is to use conventional detailed search parameters on all the major search engines. In addition, establishing connections with leading personalities in various geographical regions who are well acquainted with biotech developments in their own regions would be imperative.
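
To make the crawl-then-consolidate idea a bit more concrete, here is a rough single-machine sketch in Python. It is not how Larbin or any of the big players do it; it only illustrates the two steps mentioned above (follow links, then store what was found in a database). The names PageParser, fetch and crawl, the pages table, and the max_pages limit are all made up for illustration, and it uses only the Python standard library.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects the page title and all link targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url):
    """Download one page as text (illustrative default; swap in your own)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(seed, db, fetch_page=fetch, max_pages=20):
    """Breadth-first crawl from seed, storing each URL and title in SQLite."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
    )
    queue, seen = deque([seed]), {seed}
    stored = 0
    while queue and stored < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        parser = PageParser()
        parser.feed(html)
        db.execute(
            "INSERT OR REPLACE INTO pages VALUES (?, ?)",
            (url, parser.title.strip()),
        )
        stored += 1
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    db.commit()
    return stored
```

Calling crawl("http://some.site/", sqlite3.connect("pages.db")) would fill the pages table, which can then be queried with ordinary SQL. The sketch leaves out everything that makes real crawlers hard: robots.txt handling, rate limiting, parallel fetching, and any real extraction of content beyond the title.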

Adler 11-03-2004 04:17 AM

TigerOC,

Thanks for the incisive post.

I have run into the technologies wrapped around this on the Linux side. Over in the MS Universe there are so many Crawlers, Spiders, Extractors (e-mail, address, phone, fax), etc. that it is almost like a zoo. I'd bought a couple of apps @ $US100 a pop, but would like to try everything in Linux and not jump to the MS side of my box.

It seems the big players -- like Big Blue -- are clustering server farms and spending big bucks just to re-create the wheel, basically trying to do a better Google. Novell and Sun seem to be pretty quiet.

Linux is so much simpler that -- no reflection on the OS, I love it -- someone usually comes up with a simple solution, another does a GUI and then finally someone says -- Hey, here's how to do it.

I'll try the Larbin guy per your suggestion.

Finally, I lived in Europe for 10 years and also spent another 10 years living in Asia and have my fair share of Biotech / Colleague contacts. We trade information back and forth when one of us has a special need. Call it my Global Network in the Biotech area.

Again, thanks for your post.


All times are GMT -5.