01-15-2006, 06:42 AM   #1
Tinku (Member)
wget problem with mirroring


I am trying to mirror a website, so I am using the following command line.

Code:
 wget -r -l0 http://en.wikipedia.org/
I even tried the --mirror option.
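That attempt was roughly like this (typing it from memory):

Code:
 wget --mirror http://en.wikipedia.org/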

This is what I get.


Code:
tinku@localhost:~/docs/books$ wget -r -l0 http://en.wikipedia.org/
--18:19:24--  http://en.wikipedia.org/
           => `en.wikipedia.org/index.html'
Resolving proxy.esi.edu... 168.1.1.192
Connecting to proxy.esi.edu [168.1.1.192]:3128... connected.
Proxy request sent, awaiting response... 301 Moved Permanently
Location: http://en.wikipedia.org/wiki/Main_Page [following]
--18:19:24--  http://en.wikipedia.org/wiki/Main_Page
           => `en.wikipedia.org/wiki/Main_Page'
Connecting to proxy.esi.edu[168.1.1.192]:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: unspecified [text/html]

    [     <=>                             ] 42,008        29.06K/s             

18:19:26 (29.05 KB/s) - `en.wikipedia.org/wiki/Main_Page' saved [42008]

Loading robots.txt; please ignore errors.
--18:19:26--  http://en.wikipedia.org/robots.txt
           => `en.wikipedia.org/robots.txt'
Connecting to proxy.esi.edu[168.1.1.192]:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 3,793 [text/plain]

100%[====================================>] 3,793         --.--K/s             

18:19:26 (1.15 MB/s) - `en.wikipedia.org/robots.txt' saved [3793/3793]


FINISHED --18:19:26--
Downloaded: 45,801 bytes in 2 files
And that's it. Only 2 files were downloaded.
Any help will be appreciated.
TIA,
tinku
 
01-15-2006, 07:37 AM   #2
kwacka (Member)
The -l option gives the number of levels you want wget to go down into when it retrieves recursively (-r).

Making it -l 0 means that it only gets that level, -l 1 the next level, -l 2 the next, and so on.

Take care when using this, unless you want Wikipedia and all the sites linked from it. 8-)
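For example, something along these lines should keep wget within two levels of the start page:

Code:
 wget -r -l 2 http://en.wikipedia.org/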
 
01-15-2006, 07:41 AM   #3
Tinku (Original Poster)
No change. I tried changing 0 to other levels, but only the same 2 files are downloaded.
 
01-15-2006, 08:18 AM   #4
trickykid (LQ Guru)
I think it's down to the fact that, by default, wget looks for .html files, and you're dealing with a wiki, where most of the internal links don't end with any kind of extension.

You'll probably have to play around with the -E or --html-extension option, as I don't really see any other way around it. The only other problem I see is that it would give pages that aren't .html files a .html extension.
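
Untested, but I'd try something along the lines of:

Code:
 wget -r -l0 -E -k http://en.wikipedia.org/
-E appends the .html extension to downloaded pages that don't already have one, and -k converts the links so the mirror can be browsed locally.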

man wget for more options and details.
 
  

