Getting filesize before downloading the page
Hi,
I am working on a search engine project, and I run the crawler every day. When I crawl pages from the web, I don't need to re-crawl a page that was crawled before.
So, can anyone give me an idea of the best way to solve this problem?
I think one idea may be to compare the size of the page before crawling with the size of the copy that was already crawled. If the size has not changed, I will skip the page; otherwise I will crawl it again.
But to implement this strategy I need to know the page size before downloading the page, so that I can compare the two sizes.
So, can anyone tell me how to get the page size before crawling it?
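To make the idea concrete, here is a rough sketch of what I have in mind. It assumes that an HTTP HEAD request returns a Content-Length header for the page, which I believe not every server does (dynamic pages often omit it). The function names remote_page_size and should_recrawl are just my own placeholders, and stored_size stands for whatever size the crawler saved last time.

```python
import urllib.request


def remote_page_size(url, timeout=10):
    """Return the Content-Length reported for url, or None if it is missing."""
    # A HEAD request asks for the headers only, so the body is not downloaded.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    # Assumption: treat a missing or malformed header as "size unknown".
    return int(length) if length and length.isdigit() else None


def should_recrawl(stored_size, remote_size):
    """Decide whether to fetch the page again by comparing the two sizes."""
    # If either size is unknown we cannot compare, so crawl to be safe.
    if stored_size is None or remote_size is None:
        return True
    return stored_size != remote_size
```

I realise a page can change without its size changing, so this would only be a heuristic and not a reliable check.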
Waiting for your suggestions,
M.Bala Nagi Reddy