
help wanted

Can someone please direct me to instructions on how to stop search engines from crawling my site? I've officially had it with some of the requests that lead people here. There are some sick, sick people in this world and I do not want to be part of their quest to find the depraved things they are looking for. They won't find them here, of course, but the thought that people like that are on my site looking for this crap really freaks me out.

UPDATE: Thanks to everyone for their help. I'm going to try a couple of different things today and I'll let you know how it works out. You all rock, as usual.

TrackBack

Listed below are links to weblogs that reference help wanted:

» Robots also attack old people for their medicine... from Amish Tech Support
Michele goes wiggy over spider-robots: Can someone please direct me to instructions on how to stop search engines from crawling my site? I've officially had it with some of the requests that lead people here. There are some sick, sick...

» BLOCKING ROBOTS from DiscountBlogger
Michele is wondering how to block Search Engine Robots from visiting and indexing her site. The answer, I believe, is here. I've copied and pasted the text of the instructions into the extended entry....

» Geek Stuff from Mike (and family)
This morning Michele asked if there was a way to stop search engines from crawling her blog: "I've officially had it with some of the requests that lead people here. There are some sick, sick people in this world and..."

Comments

Read this:

http://www.tamingthebeast.net/articles/robotswhoread.htm

http://www.annessa.net/blog/archives/001039.php

LGF has a link to its search requests log...

lgf search requests

Since I don't have a web page, that's the only way I can see what sort of searches people are making.

Amazing.

Sorry if the world is treating you poorly. What specifically do they seem to be tracking? I think there are some common PGP code keys that might help you...?

Create a file named robots.txt. Put the following lines in it:
User-agent: *
Disallow: /
Upload it to your public_html directory. If you have subdomains, also upload it to those directories. That will tell the search engines to stay away. (But it will not remove old caches they already have; you'll still get hits from those.)
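If you want to sanity-check those two lines before uploading them, Python's standard urllib.robotparser module can read them the way a well-behaved crawler would. This is only a rough sketch: example.com, the crawler names, and the is_allowed helper are placeholders made up for illustration, and it says nothing about what a crawler that ignores robots.txt will do.

```python
from urllib.robotparser import RobotFileParser

# The two rules suggested in the comment above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots_txt may fetch url.

    is_allowed is a hypothetical helper for this example, not part of
    any search engine's tooling.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com stands in for your own site.
    for agent in ("Googlebot", "SomeOtherBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "http://example.com/archives/")
        print(agent, "allowed" if allowed else "blocked")
```

With the rules above, every user agent should come back as blocked for every path. An empty "Disallow:" line would mean the opposite: nothing is off limits.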

robots.txt is the way to go - but be warned that some search engines don't respect it (I don't remember the full list at the moment, but there are a couple). You can go and ask Google to remove your site from the index. Not sure how quickly that happens, but it's worth a shot.

In increasing order of strictness:
1) <meta name="robots" content="noindex,nofollow">

2) the robots.txt file listed above

3) Complete blockage via your .htaccess file. I did this with a few of the more spam-type bots that don't respect robots.txt or anything else. Google will respect both the meta tags and the robots.txt file.
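For option 1, it can be worth confirming that the meta tag actually made it into the pages your blog software generates. Here is a minimal sketch using Python's standard html.parser module; the RobotsMetaFinder class and robots_directives function are names invented for this example, and the sample HTML is made up. It only checks that the tag is present, not whether any given search engine honors it.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML document."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    print(sorted(robots_directives(page)))
```

A page that is set up as described in option 1 should report both noindex and nofollow; an empty result means the tag is missing from that page's template.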

Sigh. As the "My mother was a splody dope" entry scrolls off the bottom of the log, the chances that my last rants will be answered nearly evaporate.

I think I'm going to go into withdrawal.

How do you know they're sick? If so, figure out their IP and block it. There are some sites. Google "IP Blocker"...

I'm not going to go through my logs and block the IP of every person who comes here looking for things about sexual acts that are illegal in most countries (probably not Thailand).

... and (on the scrolled thread) I'd posted this cool picture of Vash the Stampede (no it's not MY artwork)


"The world is made of Love and Peace! Love and Peace!""The world is made of Love and Peace! Love and Peace!"

oops didn't mean to double the caption

Hey, I didn't expect a pic of Vash. Yes, love and peace with a big gun, ha! Anyway, as to the use of robots.txt and meta tags, they do work with Google, which is the biggie. I do remember Excite not recognizing meta tags but haven't checked the other search engines. Why bother with Google bars in my browsers? If you are still set up to ping sites like weblogs.com, blogrolling, blo.gs, they will still work, and N.Z. Bear and Technorati will still spider you if you signed up for them too. What you will find is Google spidering your index with no description, any and all trackbacks from other blogs listing you, any comments you make elsewhere, webrings, stuff like that. I did both but did finally edit it to allow Google some spidering. Best is to do it all.

Michele, the robots.txt thing works very well. I've used it to stop people from grabbing images via Google Images, etc.