LinuxQuestions.org


tawandasm 03-06-2012 01:12 AM

Tricky WGET
 
Hi,

I am trying to mirror this site onto my machine for offline viewing, but I can't work out the right wget switches. I need the whole "Message section" and the "Bible + Lexicon section" as well:

http://nt.scbbs.com/cgi-bin/om_isapi...wse_Frame_Pg42

http://nt.scbbs.com/cgi-bin/om_isapi...wse_Frame_Pg42

Please help (I'm a bit new to wget ;-/ )

Thanks

[tawanda@eagle ~]$ wget -m -r -k -p -L --convert-links -P ./MsgArch http://nt.scbbs.com/cgi-bin/om_isapi...wse_Frame_Pg42
[1] 2564
[2] 2565
[tawanda@eagle ~]$ --2012-03-06 08:16:49-- http://nt.scbbs.com/cgi-bin/om_isapi...ntID=547754057
Resolving nt.scbbs.com... 66.6.218.75
Connecting to nt.scbbs.com|66.6.218.75|:80... connected.
HTTP request sent, awaiting response... 302 Object Moved
Location: http://nt.scbbs.com/cgi-bin/om_isapi...ntID=291922769 [following]
--2012-03-06 08:16:50-- http://nt.scbbs.com/cgi-bin/om_isapi...ntID=291922769
Reusing existing connection to nt.scbbs.com:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: “./MsgArch/nt.scbbs.com/cgi-bin/om_isapi.dll?clientID=291922769”

[ <=> ] 4,245 --.-K/s in 0.004s

Last-modified header missing -- time-stamps turned off.
2012-03-06 08:16:50 (1.01 MB/s) - “./MsgArch/nt.scbbs.com/cgi-bin/om_isapi.dll?clientID=291922769” saved [4245]

FINISHED --2012-03-06 08:16:50--
Downloaded: 1 files, 4.1K in 0.004s (1.01 MB/s)
Converting ./MsgArch/nt.scbbs.com/cgi-bin/om_isapi.dll?clientID=291922769... nothing to do.
Converted 1 files in 0 seconds.

[1]- Done wget -m -r -k -p -L --convert-links -P ./MsgArch http://nt.scbbs.com/cgi-bin/om_isapi...ntID=547754057
[2]+ Done infobase=message2010.nfo
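
A note on the transcript above: the job lines `[1] 2564` and `[2] 2565`, together with the final `[2]+ Done infobase=message2010.nfo`, suggest that the shell split the unquoted URL at each `&` and ran the pieces as separate background jobs, so wget appears to have received only the part of the URL up to `clientID=...`. Quoting the URL avoids that. The sketch below uses a made-up URL of the same shape (the real one is truncated in this thread), and `show_args` is a hypothetical helper that only counts arguments; whether the site can then be mirrored is a separate question.

```shell
#!/bin/sh
# Hypothetical stand-in URL: the real address is truncated in the thread.
# What matters is that the query string contains '&'.
url='http://example.com/cgi-bin/page.dll?clientID=1&infobase=demo.nfo'

# Hypothetical helper: prints how many arguments it received.
show_args() { printf '%s\n' "$#"; }

# Quoted, the whole URL (including everything after '&') is one argument.
quoted_count=$(show_args "$url")
echo "arguments seen when quoted: $quoted_count"

# The mirror command with the URL quoted. It is commented out so that
# this sketch performs no network access; uncomment to run it.
# wget -m -k -p -P ./MsgArch "$url"
```

Typing the address directly works the same way as long as it is wrapped in single or double quotes on the command line.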

T3RM1NVT0R 03-07-2012 02:41 PM

@ Reply
 
Hi tawandasm,

Welcome to LQ!!!

I have checked this site, and it does not appear to allow an automated process to follow the links on its pages. So basically you cannot download the pages that are linked from the site you mentioned.

For your information, you can use the following command to create a mirror of a website on your system:

Code:

wget -mk http://www.sitename.com
But this command will not work in your case, or in any case where the site requires authentication.

tawandasm 03-09-2012 04:04 AM

Hmm! Thanks for the post and the advice. Somehow I will find a way to crack this; it just needs a little patience and something should turn up. I haven't given up yet. Thanks.

T3RM1NVT0R 03-09-2012 07:30 AM

@ Reply
 
You're welcome.

I would like to mention that we at LQ do not promote the use of SMS language, as it creates confusion. Spell out your words correctly and you will get a much better response to your posts and queries.

tawandasm 03-14-2012 01:55 AM

Thanks, and my bad. Sorry about the language and grammar. Here is a switch I found and am about to try out:

Mask User Agent and Display wget like Browser Using wget --user-agent

Some websites refuse to serve their pages when they detect that the user
agent is not a browser. You can mask the user agent with the --user-agent
option so that wget presents itself as a browser, as shown below.

$ wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" URL-TO-DOWNLOAD
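
Combined with the mirror options tried earlier in the thread, the switch might be used roughly as in the sketch below. The URL is a hypothetical placeholder, since the real one is truncated here, and it is untested whether this site accepts the request even with a browser-like user agent. The script only prints the argument list; the line that would actually run wget is left commented out.

```shell
#!/bin/sh
# Hypothetical placeholder URL; substitute the real, complete address.
url='http://example.com/cgi-bin/page.dll?clientID=1&infobase=demo.nfo'

# User-agent string quoted from the post above, kept on a single line.
ua='Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3'

# Build the command as positional parameters so every argument stays intact.
# -m mirror, -k convert links for local viewing, -p fetch page requisites,
# -P download directory.
set -- wget -m -k -p --user-agent="$ua" -P ./MsgArch "$url"

# Dry run: print one argument per line for inspection.
printf '%s\n' "$@"

# Uncomment to run the command for real (needs network access).
# "$@"
```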

