Preventing Google from indexing a link

MicahCarrick · 06-20-2006, 02:35 PM

I'm helping with a website that has an "Add to Cart" button. The button is a link to a PHP script that takes GET variables, updates the cart, and then uses header("Location: whatever.html"); to return the user to the original page. We need to prevent Google from listing this link in search results. I've read that rel="nofollow" only keeps the link from carrying any weight, but doesn't actually exclude it from the index. Is there any way to keep robots, specifically Google, from indexing those links?

macemoneta · 06-20-2006, 02:41 PM

You can use a robots.txt file or a meta tag. Google, like all well-behaved indexing services, honors them. The meta tag you are looking for is specifically:

Code:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

MicahCarrick · 06-20-2006, 02:46 PM

Right, but there are two problems. The specific file I don't want listed is update.php in the cart folder. The robots.txt blocks /cart/ from being crawled; however, since there are links to /cart/update.php on the main site, those are being indexed. Say /index.html has a link to /cart/update.php?add_item=123. Google then adds /cart/update.php?add_item=123 to its search results.

I can't use a META tag in update.php because it produces no output, so there is nowhere to put a META tag. It updates a session variable and returns to /index.html using the PHP function header("Location: index.html");.

- Micah
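
The situation described above can be sketched with Python's standard urllib.robotparser module. This is only an illustration: the thread never shows the actual robots.txt, so the two-line file below is an assumption, and example.com is a placeholder host. The check answers "may a robot crawl this URL?" and nothing more, which fits the observation that a URL blocked from crawling can still show up in results when other pages link to it.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents; the real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
"""


def can_crawl(url, user_agent="Googlebot"):
    """Return True if the assumed robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host used only for illustration.
    for url in (
        "http://example.com/index.html",
        "http://example.com/cart/update.php?add_item=123",
    ):
        print(url, "crawlable:", can_crawl(url))
```

The Disallow rule stops a compliant robot from fetching /cart/update.php, but it says nothing about whether the bare URL may be listed once it has been discovered through a link on /index.html.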

xhi · 06-20-2006, 06:34 PM

http://www.robotstxt.org/wc/exclusion-admin.html

The info at the bottom of the page may help.

MicahCarrick · 06-20-2006, 06:49 PM

Thank you. However, I already have /cart/ blocked in robots.txt. The links still get indexed, though, because other pages, such as the home page, link to them. So the page doesn't get crawled via /cart/, but it still gets listed in search engine results.

xhi · 06-21-2006, 09:54 AM

Ah. So even if you specify a particular filename (update.php), it still crawls it because there is an external link? Or because of the ?xxx params on the URL?

MicahCarrick · 06-21-2006, 10:56 AM

Well, I suppose both of those reasons. I'm not entirely sure how it works. But I know that even though it's not being crawled, it's still being listed in the Google search results.