Search engine robots that visit your web site by John A Fotheringham.

Published on Sep. 5 2007

List of robots and where they come from, thanks to research by John Fotheringham!

Source: http://www.jafsoft.com/searchengines/webbots.html

Home page/search engine Robot identifier IP address(es)

abacho.com AbachoBOT srv-ze-robot1.tricus.com

abcdatos.com abcdatos_botlink 217.126.39.167 abcdatos.com/botlink/

aesop.com AESOP_com_SpiderMan 209.189.115.49

ah-ha.com ah-ha.com crawler (crawler(at)ah-ha.com) c7pub-216-250-141-186.center7.com

alexa.com ia_archiver green.alexa.com sarah.alexa.com

altavista.com Scooter test-scooter.pa.alta-vista.net Mercator brillo.pa.alta-vista.net Scooter2_Mercator_3-1.0 av-dev4.pa.alta-vista.net roach.smo.av.com-1.0 scooter.aveurope.co.uk Tv#nn#_Merc_resh_26_1_D-1.0 bigip1-snat.sv.av.com mercator.pa-x.dec.com scooter.pa.alta-vista.net election2000crawl-complaints-to-admin.webresearch.pa-x.dec.com scooter.sv.av.com avfwclient.sv.av.com tv#nn#.sv.av.com

altavista.co.uk AltaVista-Intranet host-119.altavista.se jan.gelin(at)av.com

alltheweb.com FAST-WebCrawler 209.67.247.154 crawler(at)fast.no fast.no/faq/faqfastwebsearch/faqfastwebcrawler.html Wget ext-gw.trd.fast.no

acoon.de Acoon Robot 194.231.42.178

antisearch.net antibot 62.210.155.50

atomz.com Atomz router-sc.atomz.com index.atomz.com

axmo.com AxmoRobot 194.248.208.82

buscaplus.com Buscaplus Robi buscaplus.com/robi/

canseek.ca CanSeek/ 216.168.111.111 support(at)canseek.ca

christcrawler.com/search.cfm ChristCRAWLER 207.191.111.231 christcrawler.com/

clush.com Clushbot 209.249.80.242 clush.com/bot.html

crawler.de Crawler crawlit.crawler.de admin(at)crawler.de

daadle.com DaAdLe.com ROBOT/ 216.12.213.32

daum.net RaBot 210.183.28.46 Agent-admin/ phortse(at)hanmail.net 211.50.57.6 contact/jylee(at)kies.co.kr RaBot 202.30.94.34 Agent-admin/ webmaster(at)kisco.go.kr

en.deepindex.com DeepIndex deepindex.net1.nerim.net

ditto.com DittoSpyder 65.169.94.188

domanova.co.uk Jack

earthcom.info EARTHCOM.info 194.108.39.74

entireweb.com Speedy Spider 62.13.25.209

excite.com ArchitextSpider Musical instruments are used in the names, such as viola.excite.com cello.excite.com piano.excite.com kazoo.excite.com ride.excite.com sabian.excite.com sax.excite.com bugle.excite.com snare.excite.com ziljian.excite.com bongos.excite.com maturana.excite.com mandolin.excite.com piccolo.excite.com kettle.excite.com ichiban.excite.com (and the rest of the band). More recently first names are being used, like philip.excite.com peter.excite.com perdita.excite.com macduff.excite.com agouti.excite.com

(excite) ArchitectSpider crimpshrine.atext.com ichiban.atext.com

eurip.com EuripBot 81.169.172.30

euroseek.net Arachnoidea 212.209.54.134 arachnoidea(at)euroseek.net

ezresults.com EZResult 216.28.23.59

fastsearch.net Fast PartnerSite Crawler psprdcrw001.sac2.fastsearch.net FAST Data Search Crawler 65.198.110.185 FAST Data Search Document Retriever 69.38.159.128

fireball.de KIT-Fireball ?

france.misesajour.com/ france.misesajour.com 66.98.210.71

fybersearch.com FyberSearch 69.49.241.9

galaxy.com GalaxyBot 63.121.41.175 galaxy.com/galaxybot.html

geckobot.com geckobot .rdc1.az.coxatwork.com

gendoor.com GenCrawler ?

(Genealogical Search Engine)

geona.com GeonaBot 69.59.142.17

getrax.com getRAX 81.169.156.246

google.com Googlebot c#nn#.googlebot.com googlebot(at)googlebot.com googlebot.com/

goo.ne.jp moget/2.0 202.229.31.13 moget(at)goo.ne.jp

girafa.com Aranha Aranha.girafa.com

(inktomi) Slurp.so/1.0 q2004.inktomisearch.com slurp(at)inktomi.com j5006.inktomisearch.com

(inktomi) Slurp/2.0j 202.212.5.34 slurp(at)inktomi.com goo313.goo.ne.jp inktomisearch.com

(inktomi) Slurp/2.0-KiteHourly y400.inktomi.com slurp(at)inktomi.com; inktomi.com/slurp.html

(inktomi) Slurp/2.0-OwlWeekly 209.185.143.198 spider(at)aeneid.com inktomi.com/slurp.html

(inktomi) Slurp/3.0-AU j6000.inktomi.com slurp(at)inktomi.com

hoppa.com/ Toutatis 2.5-2 tisnix.xs4all.nl

(need V5 browsers to view)

hubat.com Hubater 209.114.176.250

almaden.ibm.com almaden.ibm.com/cs/crawler wfp2.almaden.ibm.com

(research centre)

iltrovatore.it IlTrovatore-Setaccio 213.26.21.8

incywincy.com IncyWincy 64.81.243.66

infoseek.com UltraSeek cde2c923.infoseek.com InfoSeek Sidewinder cde2c91f.infoseek.com cca26215.infoseek.com

intags.de Mole2/1.0 217.160.75.10 webmaster(at)intags.de

mp3bot.de/ MP3Bot #..#

ip3000.com C-PBWF-ip3000.com-crawler ip3000.com ip3000.com-crawler

istarthere.com istarthere.com 66.220.24.80 spider(at)istarthere.com

knowledge.com Knowledge.com/ 213.170.2.69

kuloko.com kuloko-bot/0.2 66.90.81.41

lexis-nexis.com LNSpiderguy firewall5.lexis-nexis.com

linknz.co.nz Linknzbot 202.191.32.67

look.com lookbot magma.com

looksmart.com MantraAgent fjupiter.looksmart.com

loopimprovements.com NetResearchServer leg-64-133-109-250-STK.sprinthome.com

(see also incywincy.com) loopimprovements.com/robot.html

lycos.com Lycos_Spider_(T-Rex) bos-spider#n#.bos.lycos.com 216.35.194.188

joocer.com JoocerBot 80.46.38.169

mirago.co.uk HenryTheMiragoRobot 194.202.39.46

mojeek.com MojeekBot

mozdex.com mozDex/ (within comcast.net)

search.msn.com/ MSNBOT/0.1 131.107.163.47 search.msn.com/msnbot.htm

navadoo.com Navadoo Crawler

northernlight.com Gulliver marvin.northernlight.com taz.northernlight.com

objectssearch.com ObjectsSearch/0.01 68.88.244.177

szukaj.onet.pl/ OnetSzukaj/

picosearch.com PicoSearch/ pipe.picosearch.com

portaljuice.com PJspider timber.nextopia.com

powerinter.net DIIbot node-d8e93393.powerinter.net

but it won't let us in

navi.ocn.ne.jp/ nttdirectory_robot lilis00.navi.ocn.ne.jp super-robot(at)super.navi.ocn.ne.jp lilis04.navi.ocn.ne.jp griffon griffon(at)super.navi.ocn.ne.jp

maxbot.com Spider/maxbot.com search.wport.com admin(at)maxbot.com

various (fakes agent on each access) pool0058.cvx2-bradley.dialup.earthlink.net

gazz/1.0 deleuze.infobee.ne.jp gazz(at)nttrd.com derrida.infobee.ne.jp

search-8.xift.com

nationaldirectory.com NationalDirectory-SuperSpider spider.nationaldirectory.com 209.116.58.143

naver.com dloader(NaverRobot)/ 211.218.151.209 dumrobo(NaverRobot)/

noxtrum.com noxtrumbot/ 194.224.199.52

openfind.com "Openfind piranha Shark"

(Chinese language) robot-response(at)openfind.com.tw Openbot/ abovenet4.openfind.com

picsearch.org psbot 217.75.104.26 picsearch.org/bot.html

pinpoint.com CrawlerBoy Pinpoint.com nitrogen.pinpoint.com

petersnews.com user#n#.ip3000.com news#n#.petersnews.com

qweery.nl QweeryBot 84.82.133.41 qweerybot.qweery.com

vestris.com/alkaline AlkalineBOT host130.uv-ray.com

rambler.ru StackRambler/ 81.222.64.10

seznam.cz SeznamBot 212.80.76.87

search-10.com Search-10 82.41.144.99

searchhippo.com Fluffy the spider 208.148.122.27 info(at)searchhippo.com

scrubtheweb.com Scrubby/ 208.145.190.254

singingfish.com asterias grouper.singingfish.com

speedfind.de speedfind ramBot xtreme BWEB.highway.telekom.at

s.u-tokyo.ac.jp Kototoi/0.1 crawler-red3.is.s.u-tokyo.ac.jp

searchbyusa.com SearchByUsa

searchspider.com Searchspider/ 24.90.243.203

sightquest.com SightQuestBot/ 64.49.245.212 sightquest.com/bot.htm

spidermonkey.ca Spider_Monkey/ 66.163.18.197

surfnomore.com Surfnomore Spider v1.1 165.90.194.245

supersnooper.com Robot(at)SuperSnooper.Com 207.8.212.162

teoma.com teoma_agent1 63.236.92.148 teoma_admin(at)hawkholdings.com

mapper.teradex.com Teradex_Mapper 65.110.6.26 mapper(at)teradex.com

travel-finder.com ESISmartSpider 202.46.33.15

traficdublu.ro Spider TraficDublu 81.196.*.*, 193.16.218.66

tutorgig.com Tutorial Crawler 216.40.225.75 tutorgig.com/crawler

updated.com updated/0.1beta 38.119.96.107 crawler(at)updated.com

uksearcher.co.uk UK Searcher Spider -

vivante.com Vivante Link Checker 216.93.167.106

(coming soon)

walhello.com appie (uses an address at planet.nl, a Dutch ISP)

websmostlinked.com Nazilla -

webwombat.com.au WebWombat.com.au 202.139.99.131

webseek.de marvin/infoseek arthur4.sda.t-online.de marvin-team(at)webseek.de

webtop.com MuscatFerret ferret#nn#.webtop.com

whizbanglabs.com WhizBang! Lab 216.250.143.108

wisenut.com ZyBorg - (info(at)WISEnut.com)

wire.co.uk WIRE WebRefiner: brighton.wire.co.uk webrefiner(at)wire.co.uk

worldsearchcenter.com WSCbot

yandex.com Yandex ya.yandex.ru

yellowpet.com Yellopet-Spider 212-82-36-23.ip.zeitraum.com

pet-based search engine

yelo.no Findexa Crawler

yourbettersearch.com YBSbot search engine indexer 12.25.90.3

#client sites# libwww-perl linpro.no/lwp/

verno.ueda.info.waseda.ac.jp/ Iron33 207.18.183.251

Browsers

Most browsers identify themselves with a string that begins 'Mozilla...'. I've chosen not to document those (as yet). Here are a few of the rarer browser identifiers that I've seen.

Browser identifier Information

AmigaVoyager v3.vapor.com/ Voyager browser for the Amiga

xChaos_Arachne browser.arachne.cz/ (DOS-compatible browser. Linux version under development)

IBrowse hisoft.co.uk (search for IBrowse) Amiga-based browser

ICab icab.de/index.html (Macintosh-only)

JustView www3.justsystem.co.jp/download/justview/3.01win1a.html (I think this is a browser. Site is in Japanese)

KMeleon kmeleon.sourceforge.net/ (Light browser based on the Mozilla code base)

Konqueror konqueror.org/konq-browser.html (Linux KDE browser)

Lynx lynx.browser.org/ (Cross-platform text based browser)

OmniWeb omnigroup.com/products/omniweb/ (Macintosh-only)

Opera opera.com (Cross-platform, small, efficient and standards-led browser)

Plucker plkr.org/index.pl/faq#1.1 (Palm handhelds. Written in Python)

pwWebSpeak prodworks.com/issound/catalog/catalog_pwwebspeak.html Audio Browser

QWeb sunsite.auc.dk/qweb/ (Linux browser) (see also browserwatch.internet.com/news/story/qweb8.html)

retawq retawq.sourceforge.net/ Text-based browser for text terminals. Runs under Linux

SlimBrowser flashpeak.com/sbrowser/sbrowser.htm Freeware tabbed browser

Sleipnir sleipnir.pos.to/software/sleipnir/index.html (Japanese) Japanese browser, apparently with an English version available.

VMS_Mosaic vaxa.wvnet.edu/vmswww/vms_mosaic.html (OpenVMS-only version of Mosaic, a pre-Netscape browser)

WannaBe mindstory.com/wb2/ (Macintosh text-only browser)

w3m w3m.sourceforge.net/ (text-based browser)
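As a rough illustration of how the identifiers above might be used when reading a log, the sketch below sorts user agent strings into "rare browser", "Mozilla-style" and "other". The token list is copied from part of the table above; the function name and the sample strings in the usage note are hypothetical. Bear in mind that plenty of robots also send a string beginning 'Mozilla', so this is only a first-pass filter.

```python
# A few identifiers taken from the browser table above.
RARE_BROWSER_TOKENS = (
    "AmigaVoyager", "IBrowse", "ICab", "Konqueror",
    "Lynx", "OmniWeb", "Opera", "w3m",
)

def classify_agent(agent):
    """Return a rough category for a user agent string."""
    lowered = agent.lower()
    # Check the rare tokens first: some of these browsers also
    # send strings that begin with 'Mozilla'.
    for token in RARE_BROWSER_TOKENS:
        if token.lower() in lowered:
            return "rare browser: " + token
    if agent.startswith("Mozilla"):
        return "mozilla-style"
    return "other"
```

For example, `classify_agent("Lynx/2.8 libwww-FM/2.14")` falls into the rare-browser group, while a string such as `"ia_archiver"` is classed as "other" and would need checking against the robot tables.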

Link checkers, link monitors and bookmark managers

Link checkers and bookmark managers are run by people wanting to keep their pages and bookmarks up to date. Being visited by a link checker is good news, as it means that someone has linked to you and cares that you're still alive. Link monitors regularly check your pages for changes, usually because someone has selected your page as 'one to watch'.

(pause for warm glow)

If you have access to the server log, check the referrer page to try and get the URL from which you are linked. Sometimes these URLs are inside password-protected parts of sites, so you won't be able to view the page.

If you build up a list of sites that link to you, these are the guys you should tell when you move (moral: never move).

It's also quite common for the link checker to give no indication of which URL it's coming from. Some link checkers always come from the same IP address; more usually they come from the client's site. It depends on whether the site owner has purchased a copy of the link checking software or signed up to some centralized link checking service. If they blank the referrer URL field but you have the client's IP address, you can always try visiting that address and surfing their site.

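Checking the referrer field for a link checker's visits can be sketched roughly as follows. This assumes the server writes the widely used 'combined' access log format; the pattern, the function name and the sample lines are illustrative assumptions, not taken from any particular server, so adjust them to whatever your logs actually contain.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def referrers_for_agent(lines, agent_substring):
    """Count referrers on requests whose user agent contains agent_substring."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if not match:
            continue  # not in the assumed format
        if agent_substring.lower() not in match.group("agent").lower():
            continue
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            # Referrer blanked: fall back to the client's address.
            referrer = "(no referrer) " + match.group("host")
        counts[referrer] += 1
    return counts

# Hypothetical log lines, made up for this example.
SAMPLE_LOG = [
    '192.0.2.10 - - [05/Sep/2007:10:00:00 +0000] "HEAD /index.html HTTP/1.0" 200 0 "http://example.org/links.html" "Xenu\'s Link Sleuth"',
    '192.0.2.11 - - [05/Sep/2007:10:05:00 +0000] "HEAD /index.html HTTP/1.0" 200 0 "-" "Xenu\'s Link Sleuth"',
    '192.0.2.12 - - [05/Sep/2007:10:06:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Googlebot"',
]
```

Calling `referrers_for_agent(SAMPLE_LOG, "xenu")` gives one count for the linking page and one entry recording the client address for the visit that blanked the referrer, which is the address you might then try visiting.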
Some of these tools appear to imply they're extracting email addresses (e.g. emailSiphon). As such they're probably unwelcome visitors since these addresses are probably being collected for spammers.

A page listing various link checkers (and other tools) can be found at softwareqatest.com/qatweb1.html#LINK

Robot identifier IP address(es) Link Checker home page

ActiveBookmark #client site# libmaster.com/software.php

ALink #client site# info-pack.com/alink/ Reciprocal Link Checker, Manager and Page Generator.

AMeta #client site# info-pack.com/ameta/ Meta Tag Generator

ASPSearch URL Checker #client site# search.santry.com/downloads/ a site search engine/index maintenance tool

BlogBot #client site# sourceforge.net/projects/blogbot/

BMChecker #client site# fureai.or.jp/~yoichi37/soft/bmchecker.html (Japanese Bookmark Checker)

Bookmark Buddy #client site# bookmarkbuddy.net/about.shtml

Check&Get #client site# checkget.com

CheckWeb #client site# checkweb.com

CNET_Snoop download.com (only if you have software listed at that site)

CSE HTML Validator #client site# htmlvalidator.com HTML page validator that includes a link checker amongst its functions.

DRKSpider #client site# drk.com.ar/spider/ (An Open Source project)

DISCo Watchman #client site# t-guild.com/gamesite/Software/Disco_w/Disco_w.htm

DoctorHTML draco.imagiware.com www2.imagiware.com/RxHTML/

Email Extractor #client site# #email collector# We don't list links to email collectors on this site

EmailSiphon #client site# #email collector# We don't list links to email collectors on this site

EmailWolf #client site# pixeltech.com.au/~msw/ewolf/index.html

FavOrg #client site# pcmag.com/article2/0,1759,1558477,00.asp A utility written by PC Magazine to fetch icon files (favicon.ico) for your IE favorites

Favorites Sweeper #client site# manitoolssoftware.cjb.net Another 'favorites' tidy-up utility

FreshLinks.exe #client site# resqpc.com/features.html

Funnel Web Profiler #client site# quest.com/funnel_web/profiler/ Profiles your site, including links to/from it

Html Link Validator #client site# lithopssoft.com/hlv/index.html

HTMLParser #client site# htmlparser.sourceforge.net/ an open source HTML parser that is probably exercising its link-checking features.

The Informant cosmo.dartmouth.edu informant.dartmouth.edu/

The Intraformant

InternetLinkAgent #client site# www1.odn.ne.jp/freeware/rank/ineternet/internetlinkagent.html (in Japanese)

InternetPeriscope #client site# lokboxsoftware.com/internetperiscope.asp

javElink salix.ingetech.com dailydiffs.com

jdwhatsnew.cgi #client site# jdrowell.com/projects/jdwhatsnew/view

JRTS Check Favorites Utility #client site# jrtwine.com/Products/CheckFavs/

Lambda LinkCheck 195.139.70.25 stud.ifi.uio.no/~lmariusg/download/python/LinkCheck.html

LinkLint-checkonly -- goldwarp.com/bowlin/linklint/

LinkAlarm linkalarm.com linkalarm.com

Linkbot #client site# tetranetsoftware.com/products/linkbot.htm

Linkman (Mozilla...) 66.89.128.242 outertech.com/product.php?product=5

LinkProver #client site# tafweb.com/linkprover.html

Links -- gossamer-threads.com/scripts/links/ (Link management cgi script)

LinkScan Server #client site# elsop.com

LinkSweeper #client site# lss.com.au/lss/windows/ls/linksweeper.htm

Link Valet Online 195.82.114.5 htmlhelp.com/tools/valet/

LinkVerify Spider frances.yourwebhost.com enduser.co.uk/linkverify/

LinkWalker lw.seventwentyfour.com seventwentyfour.com 209.167.50.23

Morning Paper #client site# boutell.com/morning/

MoveAnnouncer -- moveannouncer.com (notifies webmasters when your pages have moved)

mylinkcheck -- mylinkcheck.de (German)

NetLookout -- frugalsoft.com

NetMechanic gamma.netmechanic2.com netmechanic.com

elsop.com

NetMind-Minder marvin.netmind.com (retired) netmind.com gary.netmind.com meg.netmind.com inyanga.netmind.com leo.netmind.com gemini.netmind.com

NetMonitor -- modemwizard.com/netmonitor.html

Netprospector JavaCrawler #client site# actaddons.com/products/netprospector.asp

online link validator 216.93.171.138 dead-links.com (online link checker; submit your URL)

Rational SiteCheck #client site# rational.com/products/teamtest/prodinfo/sitecheck.jtmpl

Robozilla h-206-#n#-#n#-#n#.netscape.com dmoz.org/ (checks links in the dmoz directory)

RPT-HTTPClient #client site# purplefrog.com/~thoth/jchecklinks/ Java utility that uses the Java HTTPClient class library

SiteBar #client site# sitebar.org

SpurlBot spurl.net Online bookmark agent

SurfMaster #client site# maskbit.com/surfmaster.htm

SyncIT #client site# bookmarksync.com

Watchfire WebXM #client site# watchfire.com/products/webxm.asp

WatzNew Agent #client site# watznew.com

WebSite-Watcher #client site# aignes.com

WebTrends Link Analyzer #client site# webtrends.com

Weblink Scanner #client site# iterix.com/products/WeblinkScanner/weblinkScanner.asp

Xenu's Link Sleuth #client site# snafu.de/~tilman/xenulink.html

Z-Add Link Checker #client site?# w3.z-add.co.uk/linkcheck/

Validators

Validators check your web pages for HTML correctness and standards compliance. Since other people are unlikely to send a validator to your site, you don't usually see much of this. Consequently the 'list' below is restricted to the on-line validators I've used myself.

However, if you choose to validate your own site then the validation attempts will appear in your logs. The following list is thus limited to the on-line validator I use (and recommend) and a URL submission service that I use.

Robot Identifier IP address Validator home page

W3C_Validator abyss.w3.org validator.w3.org/

WDG_Validator/ 64.29.16.182 htmlhelp.com/tools/validator/

Tooter selfpromotion.com selfpromotion.com. This is used as part of a link submission agent (trebor(at)animeigo.com)

FTP clients and download managers

If you offer files for download then you'll start to be visited by various FTP clients. Clients like Go!Zilla and GetRight are smart in that they can resume downloads that have been interrupted. This relies on your web server supporting the necessary protocol, but that's fairly standard these days.

If your download files are over 1 MB in size (or if your server is slow) you'll often see the same IP address make multiple partial downloads of your file (look at the file size). In the case of clients like Go!Zilla and GetRight, if these add up to the right number of bytes then chances are the download succeeded.

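The "do the partial downloads add up?" check can be sketched as below. It assumes you have already pulled (IP address, bytes sent) pairs for a single file out of your log; the function names and the sample figures are hypothetical.

```python
from collections import defaultdict

def bytes_per_client(transfers):
    """Sum the bytes sent to each IP address.

    transfers: iterable of (ip_address, bytes_sent) pairs, all for one file.
    """
    totals = defaultdict(int)
    for ip_address, bytes_sent in transfers:
        totals[ip_address] += bytes_sent
    return dict(totals)

def probably_complete(transfers, file_size):
    """Return the IPs whose partial transfers add up to exactly file_size."""
    totals = bytes_per_client(transfers)
    return sorted(ip for ip, total in totals.items() if total == file_size)

# Hypothetical log extract for one 1,500,000-byte file.
SAMPLE_TRANSFERS = [
    ("192.0.2.20", 600000),
    ("192.0.2.20", 900000),
    ("192.0.2.21", 400000),
]
```

Here `probably_complete(SAMPLE_TRANSFERS, 1500000)` picks out 192.0.2.20 only. The exact-match test follows the rule of thumb above; a client that re-fetches an overlapping range could total more than the file size and still have succeeded, so treat the result as a hint rather than proof.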
Client Identifier FTP Client home page

Alligator nearsoftware.com/alligator/maininfo/

BatchFTP dynamicnet.net/products/batchftp.htm

ChinaClaw download.pchome.net/internet/download/860.html (Chinese) (Chinese download utility)

DA lidan.com downloadaccelerator.com

DLExpert yanew.com (English and Chinese versions available)

Download Demon netzip.com

Download Master one.com.ua/dm/ (Russian)

Download Ninja h-fd.org/~mkro/mt/archives/000585.html (Japanese)

Download Wonder forty.com

Ez Auto Downloader anatari.com/ezad/index.html Downloads all files of a given type from a site, so it's more like a site grabber

FreshDownload freshdevices.com/freshdown.html

Go!Zilla gozilla.com

GetRight getright.com

MyGetRight

GetSmart getsmart.hypermart.net/

HiDownload hidownload.com

JetCar (or FlashGet) amazesoft.com

Kapere kapere.com/menu.php?lang=english

Kontiki Client kontiki.com/products/index.html

LeechFTP stud.fh-heilbronn.de/~jdebis/leechftp/

LeechGet leechget.de

LightningDownload lightningdownload.com

Mass Downloader geocities.com/SiliconValley/Vista/2865/md.htm

MetaProducts Download Express metaproducts.com/DE.html

NetZip Downloader netzip.com

SmartDownload

NetAnts netants.com

NetButler webcelerator.com/netbutler/

NetPumper netpumper.com

Net Vampire netvampire.com

Nitro Downloader klsofttools.com/nitro.html

Octopus moskalyuk.com/octopus/

PuxaRapido puxarapido.com.br

RealDownload service.real.com/help/faq/rdown4/rdownfaqa01.html

SpeedDownload yazsoft.com (for Macintosh)

WebDownloader for X 1.30 krasu.ru/soft/chuchelo/features.php3 (Linux web downloader with X GUI)

WebLeacher webleacher.dk (down last time I tried it) more details at davecentral.com/projects/thewebleacher/

WebPictures Downloader fullstrong.com Locates and downloads pictures

X-Uploader Can't find the home page, but it's described (in Russian) on compulenta.ru/2002/1/17/24333/

Research projects

These agents come from research projects. Of course that's how Google started...

citenikbot/ citenik.co.uk/bot.html. One-man project due for release in 2004.

CLIPS-index clips-index.imag.fr/ (French) French research robot from a linguistics project (?)

Computer_and_Automation_Research_Institute_Crawler Robot from the research centre at the Hungarian Academy of Sciences at sztaki.hu. Crawls from IP 195.111.1.93

cosmos robot(at)xyleme.com Spider from xyleme.com, which is a project to locate and index XML content on the web. The company is a spin-off from a project at INRIA in France, a frequent source of web robots. The word 'xyleme' apparently relates to the vascular system in plants but, cleverly, must be one of the very few words to contain the letters 'X', 'M' and 'L' (although not in that order).

D2KWebCrawler archive.ncsa.uiuc.edu/TechFocus/Projects/NCSA/D2K-_Data_To_Knowledge.html 'Data to Knowledge' data miner. Crawls from 141.142.15.21

DiaGem/ Experimental spider from Mitsubishi R&D division skyrocket.gr.jp/diagem.html Crawls from IP 203.178.88.244

Digimarc WebReader Digimarc searches images on the web looking for digital watermarks. More details at digimarc.com

EchO!/2.0 Spiders from 194.254.160.3, which would seem to be part of voila.com, a French-based search engine.

FinaleRobot robot-master(at)expressus.com The expressus.com site describes an Interactive Natural Language encyclopedia that will become a search engine at final-e.com. Good name, but at present it just maps back onto the ExpressUs site (not such a good name). Crawls from IP address 64.114.34.115

Ideare SignSite ideare.com. Spiders from spider3.tiscalinet.it. Ideare are a research company producing search engine technology and are part owned by Tiscali in Italy, who seem to use their various tools for different search engines (mp3, images etc).

GentleSpider Some sort of spider that usually visits using an IP address from within research.att.com or crawler.tivra.com

Gulper Web Bot ecsl.cs.sunysb.edu/~maxim/cgi-bin/Link/GulperBot (Open research project to produce opinion-based search engine)

larbin sebastien.ailleret(at)inria.fr ghi(at)lcs.mit.edu And from the people that brought you xyro (see below) comes another, newer bot. This one seems to crawl from the IP address cremant.inria.fr. Update: more recently it's also been seen coming from barracutta.lcs.mit.edu

cosmos And then there was 'cosmos', crawling from pomelos.inria.fr. Seems these people are a webbot factory. Cosmos doesn't offer an email address.

IRLbot irl.cs.tamu.edu/crawler. Crawls from 128.194.135.80 crawls randomly to determine the topology of the web.

KnowItAll cs.washington.edu/research/knowitall/ a project that extracts massive amounts of information from the Web in 'an autonomous, scalable manner'. Don't they know that everyone hates a know-it-all?

MJ12bot majestic12.co.uk/projects/dsearch/ A distributed search engine project

MultiText Research project to index the last week's news items canola1.uwaterloo.ca/

NEC Research Agent heavenly.nj.nec.com/ Research 'Inquirus' (meta?) search engine

OntoSpider ontospider.i-n.info Dutch robot for a research project. Crawls from 195.11.244.52

sherlock_spider sherlock.com.cn. A course project from burrocs.indiana.edu:15003/b659/ Crawls from 129.79.245.98

S.T.A.L.K.E.R. seo-tools.net/en/bot.aspx. 'My first robot' Crawls from 195.71.117.89

Steeler tkl.iis.u-tokyo.ac.jp/~crawler/crawler.html.en Japanese research robot.

ru-robot 0.1_hseo(at)cs.rutgers.edu Unable to find details on this, but I'm guessing it's a research spider from rutgers.edu. Crawls using the IP teal.rutgers.edu

USyd-NLP-Spider it.usyd.edu.au/~vinci/webcorpus.html Research into Natural Language Processing at the University of Sydney, Australia

WebGather pccms.pku.edu.cn:8000/ Chinese search project

xyro xcrawler(at)inria.fr Seems to be a spider associated with a French research institute. Usually crawls using the IP address vamos.inria.fr

Zao/0.2 kototoi.org/zao/ Another Japanese research robot Crawls from 133.11.36.41.

Zao-Crawler Same as above, but crawled from 133.11.36.40

Software packages

These agents are the default identifiers for various software packages. Software developers use these packages to add Internet functionality to their own applications. As such it's impossible to say, without looking at the pattern of access, what these agents are being used for, as the same agent name may be used by different developers to achieve different results.

While many of these packages allow you to change the user agent, some do not, and many developers are too lazy to change the agent string.

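To show how little effort changing the agent string takes, here is a minimal sketch using Python's standard urllib.request module. The agent name "ExampleLinkChecker/0.1" and the example.com URL are made up for illustration; nothing is actually fetched.

```python
import urllib.request

def build_request(url, agent=None):
    """Build a request, optionally overriding the User-Agent header."""
    headers = {"User-Agent": agent} if agent else {}
    return urllib.request.Request(url, headers=headers)

# The identifier the library sends when the developer sets nothing.
default_agent = dict(urllib.request.build_opener().addheaders)["User-agent"]

# A request carrying a descriptive, hypothetical agent string instead.
custom_request = build_request("http://example.com/", "ExampleLinkChecker/0.1")
```

`default_agent` begins with "Python-urllib/", which is the kind of default identifier that turns up in logs when the developer has not bothered to set one; `custom_request` would announce itself as "ExampleLinkChecker/0.1" instead.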
GT::WWW Apparently some form of web-accessing Perl module. Possibly included in the Links SQL product produced by gossamer-threads.com/scripts/index.htm.

HTTPClient Default agent name used by the Java HTTPClient class. innovation.ch/java/HTTPClient/ (See also RPT-HTTPClient below)

HTTP::Lite Default identifier for a set of light-weight Perl modules for retrieving web documents. See toybox.ca/http-lite/

IP*Works! Set of TCP/IP components used in cross-platform development of internet tools nsoftware.com/products/ipworks.aspx

libwww-perl The Perl programming language comes with a number of routines for constructing web-aware scripts. This and related strings are the default user agent identifiers, although it's perfectly easy to change this to be whatever you want.

libghttp The GNOME http library. A Linux software library that offers connectivity to the web. Found in many places on the web. There is a description at fifi.org/doc/libghttp-dev/html/ghttp.html

Macromedia Flash Player Flash movies can contain scripts that can fetch content from the web (such as other Flash movies or images)

MFC_Tear_Sample Agent name used in the sample code supplied with Visual C++ for accessing the web. This may therefore be someone running a program they've written based on that code.

PEAR HTTP_Request class PEAR is a framework and distribution system for reusable PHP components pear.php.net/

Python-urllib Presumably the default identifier for the urllib module in the Python programming language lib.uchicago.edu/keith/courses/python/class/7/

RPT-HTTPClient The Java HTTPClient class library

TeamSoft WinInet Component winsoft.sk/wininet.htm (menus require Java) Internet software component suite

wget gnu.org/software/wget/wget.html Free Unix/Linux package for retrieving web pages

WinScripter iNet Tools winscripter.com/wsh/tools/wsInetTools.asp COM/DLL object that supports the SMTP and HTTP protocols

W3CRobot/ A fast web-spidering robot included with the libwww package (?). See w3.org/Robot/

W3C-WebCon/ w3.org/ComLine a command-line toolkit that allows you to perform HTTP operations

wxWidgets wxwidgets.org cross-platform open source C++ GUI builder which includes 'HTML viewing' and much, much more.

Zeus #nnnn# Webster Pro homepagesw.com/webster_overview.htm

Offline browsers and other agents

Agent Identifier Agent home page

DigOut4U arisem.com/Enu/

DISCoFinder ars.ru/eng/products/discof.asp

eCatch ecatch.com

EirGrabber www2p.biglobe.ne.jp/~eir/index.htm (Japanese software from the 'Eir Project')

ExtractorPro (Bulk email marketing tool. URL deliberately omitted)

FairAd Client hager.co.at/fordelka/fairad.htm (German) A German pay-to-surf client

JoBo matuschek.net/software/jobo/index.html a site downloader

iSiloWeb isilo.com (for palm pilot)

Kenjin Spider autonomy.com

MSIECrawler (Microsoft IE4.0)

MSProxy

NexTools WebAgent vector.co.jp/soft/win95/net/se053030.html

Offline Explorer metaproducts.com/OE.html

NetAttache Offline browser and search engine agent

PageDown Details (in Japanese) at www01.u-page.so-net.ne.jp/fa2/y_yutaka/share/pagedown.htm

ParaSite ianett.com/parasite/

Searchworks Spider nedesign.com/Phipps/products.html

SiteMapper trellian.com/mapper/index.html

SiteSnagger pcmag.com/article2/0,1759,1559896,00.asp

SuperBot sparkleware.com/superbot/index.html

Teleport Pro tenmax.com/teleport/pro/home.htm

URL2File chami.com/free/url2file_wincon.html

Web2Map web2map.com/us/index.htm Web site copier. English/German versions available

WebAuto yanasoft.co.jp/webauto.html I think this is an offline browser. Site is in Japanese

WebCopier maximumsoft.com

Webdup webdup.com (Chinese software. Not 100% sure what it does)

WebFetch webfetch.com

WebReaper webreaper.net/

Webrobot multimania.com/dilletb/WebRobot/

Website eXtractor asona.org

WebSnatcher theronwelch.com/websnatcher/

WebStripper solentsoftware.com/webstripper/

WebTwin WebTwin.com Convert websites into help files.

WebVCR netresultscorp.com/fs_webvcr_info.html

WebZIP spidersoft.com

WWWOFFLE gedanken.demon.co.uk/wwwoffle/

Xaldon WebSpider xaldon.de/produkte_webspider.html (German) Offline browser

Other miscellaneous agents

These agents are ones that we've seen but been unable to get information for, or which are slightly unusual in origin. If you have any additional information on any of these, feel free to send it to info(at)jafsoft.com

User Agent Information

Ad Muncher admuncher.com Browser plug-in that monitors the pages as you view them and removes all adverts, popup windows etc.

ADSAComponent cnds.ucd.ie/adsa/

ADSARobot distributed search engine project Contact postmaster(at)cnds.ucd.ie browses from acropolis.ucd.ie (which doesn't make sense for a distributed search engine)

Albert Indexer albert.com Multi-lingual search technology

AnswerChase answerchase.com a personal search robot.

ASPSeek aspseek.org/about.html. An open source search engine project

ATA-Translation-Service Looks to be an online translation tool, much like Babelfish. Possibly related to atanet.org/

AVSearch Seems to be the AltaVista personal search agent. The crawling site is sometimes referred to in the agent name

Avant Browser avantbrowser.com Browser add-on for Internet Explorer

Beamer pagebeamer.org/fr/index.php (French). A browser accelerator that requires sites to create a 'pagebeamer.txt' file that is fetched by this agent to do predictive downloads.

beholder or e-sense vigiltech.com/esensedisclaim.html

BravoBrian bstop.bravobrian.it/ (may require IE). A content filtering service that offers protection from pornography and other unwanted content for children. Comes from IP 213.215.133.19

bumblebee(at)relevare.com Software used to build 'Vortals' (vertical portals). Details (requires Flash) can be found at relevare.com/site/

Checkbot Seems to come from oxxfordinfo.com who offer B2B services

contype Possibly Adobe Acrobat or Reader or Adobe Acrobat Reader used with MSIE (I have been unable to confirm this)

Convera Internet Spider A 'RetrievalWare' product which claims to be a multimedia web crawler. convera.com/Products/rw_ancillis.asp

ConveraCrawler Probably related to the above

ccubee Crawler technology from empyreum.com/technologies/platforms/ccubee/

Custo Tool to map the structure of a web site netwu.com/custo/

CyberNavi_WebGet UA points to cybertech-inc.co.jp, but there's not much there. It crawls from 222.151.213.124, which is bsearchtech.com/ (Japanese). Babelfish suggests this is a Japanese company offering search products

DaviesBot wholeweb.net/web/

deepweb Also calls itself an 'Intelligent Deep-Web Robotic Agent'. A search engine indexer that will index dynamic content. deepweb.com. Indexes from IP 66.96.221.180

EbiNess sourceforge.net/projects/ebiness An Open Source project to display Internet information in a 3D format.

EmailWolf pixeltech.com.au/~msw/ewolf/ Email program, no longer available; that's the only reason I'm prepared to list it on this page.

Excalibur Internet Spider excalib.com/products/ispi/index.shtml

Expired Domain Sleuth Hunts down popular yet expired domain names with a view to letting you purchase an already popular domain name. expireddomainsleuth.com

Everest-Vulcan Inc./ everest.vulcan.com/crawlerhelp Next-generation services technology (under development)

GigaBaz brainbot.com/

GigaBazVStheWeb

crawler(at)brainbot.com

Giskard oralco.com (Trivia note: Giskard is probably named after the Isaac Asimov robot)

grub-client Grub is a distributed open source web crawler. Users download the client, which then indexes the web as part of a distributed effort. grub.org/html/documents.php

heritrix "Open-source extensible web crawler project" crawler.archive.org/

htdig htdig.org search engine software for companies and universities

webwarper.net A browser accelerator. The idea is that you browse 'through' their site, taking advantage of their faster Internet connection, caching and, most importantly, compression (of the file sent to your browser), in return for their adverts being added to the viewed pages. Such accesses give the webwarper URL as the User Agent, concealing the true agent of the original user. More details at webwarper.net/ww.pl/0/wwgz/about.htm?*

infoGIST infogist.com

InterGO teachersoft.com browserwatch.internet.com/news/story/intergo1.html This was a child-safe browser, but it seems no associated page remains.

InternetArchive Presumably internetarchive.com, but that's in 'stealth mode'.

Internet Ninja ifour.co.jp (Japanese Macintosh browser?)

InternetSeer A web monitoring service. More details at internetseer.com/

ipiumBot laurion.com/ipium-analysis.html (French) A tool that searches for copies of your documents on the web. Crawls from petula.laurion.net

InternetAmi IOR internetami.se/ior.html robot gathering data for an English/Swedish translation service.

InsumaScout/ insuma.de/insuma/de/SEscout.html Searches data situated in open data sources.

Katriona Something to do with the European Regional Internet Registry (RIPE) Browses using IP address 213.219.19.148

larbin pauillac.inria.fr/~ailleret/prog/larbin/index-eng.html

LEIA Unable to find (Too many 'Star Wars' references get in the way)

LexiBot lexibot.com

LimeBot cruiselime.com/LimeBot.php Robot searching for information on cruises. Browses using IP address 24.42.113.89

logikabot logika.net

Mata Hari thewebtools.com (Internet search agent)

metabot Geographical-based text search tool. Crawls from 66.28.23.147 metacarta.com/products.htm

Mister Pix II Picture finder mister-pix.com/en/home.htm

MOSES 2.0 Spider ideas2internet.com/products/moses2/ NOTE Site crashes my version of netscape 4.7

MonkeyCrawl monkeymethods.org. 'Futuristic play'.

NetCruiser netcruiser-software.com/products.html It's not clear to me which of these products this might be, but I'm assuming it's one of them.

NPBot nameprotect.com crawls from 12.148.209.196 (crawler1.crawler918.com) A trademark protection service

NetZippy innerprise.net/usp-spider.asp

NutchCVS lucene.apache.org/nutch/bot.html. Open source web-search project

NZBot navigationzone.com Offers 'information management' tools

Opencola opencola.com "A search application combining data from multiple sources"

ORA_checksite oreilly.com/openbook/webclient/ch06.html Identifier used in a sample Perl program in the online book 'Web Client Programming with Perl'. The program is used to check links. Obviously people have tried it, and it works.

Onekit.com PAD File Get. PAD file poller. PAD files describe software applications to download sites.

Oxxbot1 oxxfordinfo.com (Data mining bot on IP 216.0.86.75)

Pansophica homepage.mac.com/zigkit/Pansophica/index.html A Web search agent with neural net intelligence which organizes and personalizes Web sites and searches.

Phoaks phoaks.com/index.html. An index of web resources listed in UseNet. See also public.iastate.edu/~CYBERSTACKS/Aristotle.htm

phpMySearch-Crawler phpMySearch.web4.hm a search engine for individual sites.

PICgrabber A free picture and movie locator movies-free.net

PictureOfInternet malfunction.org/poi

erik(at)malfunction.org Seems to be a project to create a collage of images gathered from the Internet.

PicSpider bildkiste.de.vu (German). Site offers a 'picture crate' according to Babelfish, which seems to be some form of repository. Not sure why it's spidering, but it crawls from 217-20-118-26, which is part of internetserviceteam.com

PintaSpider Unable to find, but the spider came from cnet.fr

Pita (Chub.Stanford.EDU) --

PitSpyder Thread#n#0 Unable to find

psbot picsearch.org/bot.html A bot indexing pictures. Crawls from ps.direct2internet.com

PolyBot cis.poly.edu/polybot/ Crawls from weasel.poly.edu, grampus.poly.edu, bumblebee.poly.edu

PureSight puresight.com/Products/PureSightHomeDescription.htm (child-safe content filtering)

Rumours-Agent Comes from IP 202.214.69.131, which a lookup identifies as 'Cross Lingual Info Research' in Japan.

RepoMonkey Bait & Tackle A bit of detective work here. Recent entries in the log file link this to the site hungryhippo.com, although the robot always appears to come from an IP address at backflip.com (a bookmarking service). Visiting hungryhippo.com reveals a 'coming soon' site. Looking at the HTML source leads to another page at mezzaluna.net/hungryhippo.com/ (appears identical). The META tags for this page all appear to be references to day trading futures training and the like, although we did spot the word 'fibonacci' (our favourite). So... possibly a future search engine related to stock trading? Or maybe the Monkey and Hippo are just feeding me a red herring? There's more. The picture on the Kenjin site at kenjin.com/kenjin/info.html is currently the same as that at HungryHippo. Kenjin is an Autonomy company.

Robot2.0(PingSoft) There are several 'PingSoft's around, but I suspect that this belongs to one of the products listed at pingsoft.net/ (e.g. SmartHunter) since I was visited from a Chinese IP address.

SilentSurf silentsurf.com. A surf anonymizer service

SlySearch, slysearch(at)slysearch.com slysearch.com. A site that hunts down infringements of intellectual property rights.

SpaceBison proxomitron.org/ A web filter that is 'ShonenWare', i.e. you should purchase a Shonen Knife CD if you use it. Shonen Knife are a great Japanese band, much loved by the late Kurt Cobain. Sometimes this sets the referrer page to the band's home page at mmjp.or.jp/knife/ (or maybe the users just happen to go there themselves).

CrawlWave spiderwave.aueb.gr (Greek, and requires login). Crawls from 195.251.252.44, which is part of the Athens University of Economics and Business (aueb.gr)

SpotOn spoton.com (IE add-on that organizes your browsing)

SQ Webscanner macinsearch.com/users/webscanner/ (on holiday last time I looked)

Squid squid-cache.org An open-source web proxy cache for Unix systems

SquidClamAV_Redirector freshmeat.net/projects/scavr/?branch_id=54042&release_id=188491 An open-source anti-virus program that I saw accessing icons on my site (!)

Sqworm Not 100% sure about this one. When it visited me it came from the WebSense site 63.212.171.* (and a Google search shows others seem to see the same). At the WebSense site you can find WebCatcher, a product used to monitor employees' web-surfing habits (as near as I can tell). But, as I say, I'm not 100% sure... websense.com/products/about/webcatcher/index.cfm

Steganos Internet Anonym steganos.com/?layout=default&content=products_siapro&language=en A surf anonymizer utility

SurfControl surfcontrol.com/products/web/default.aspx content tracking product

Tagword Tool that surveys the links in the Open Directory at dmoz.org, checking their status etc. See tagword.com/dmoz_survey.php

TaWWWantula Unable to find

Tcl http client package The default identifier for any software built using the Tcl HTTP package tcl.activestate.com/software/tcltk/ tcl.activestate.com/man/tcl8.0/TclCmd/http.htm

TeraCrawl Unable to find

TurnitinBot turnitin.com Plagiarism prevention system. Crawls from 64.140.48.25

UCmore ucmore.com A browser plug-in (initially IE only) that searches for related pages and categories. In my experience this seems to entail accessing a favicon.ico file on a daily basis (presumably to refresh the 'favorites' list)

UdmSearch search.mnogo.ru/ Search engine technology, as used at sites such as maplesearch.com. Now called mnoGoSearch.

unchaos_crawler unchaos.com. A search engine that offers a 'hybrid' of human and machine intelligence, but no search box that I could see. Crawls from 192.115.134.201

unlostBot, unlostBot(at)unlost.com unlost.com is 'under construction'. The robot came from IP address 212.37.219.147, which is in France.

URLBlaze File/web search utility urlblaze.net

utopy, crawler(at)utopy.com Coming soon at utopy.com (requires Flash). This venture-capital funded site is 'running in stealth mode' before launching the 'new new thing' (is that a typo?). One of the Flash pages defines Utopia (geddit?), and some of the browsing is done by IP addresses at ...myutopy.com.

UtilMind HTTPGet A component intended for downloading pages from the web using the standard Microsoft Windows Internet library (winInet.dll). Listed on utilmind.com/delphi2.html

UrlScope Unable to find

Vagabondo Appears to be a log analyzer for Russian BBS systems (I may have got that wrong). I found a reference to it being copyright John Gladkih 1998, but I've not found any URL that gives a description (not even a Russian one).

VCI WebViewer Web browser object that may be incorporated into software. homepagesw.com/webster_dl.htm

vspider verity.com/products/intspider/ A commercial spidering product.

WAVETools A set of Delphi components offered to build Internet applications from transerve.com

Webbandit softwaresolutions.net/webbandit/index.htm Collates search engine results

Webclipping.com Webclipping.com News-gathering agent

webcollage Forms a collage from randomly selected web images. jwz.org/webcollage/ Pet project of one of the authors of Netscape. Seems to come from differing IP nodes.

WebCompass (quarterdeck search engine software)

WebGenie webgenie.com/products.html. Presumably one of the CGI-based products available on this site. Possibly the 'Site Sleuth'

Web Hound Unable to find. Or rather, I found several different 'web hounds', so can't tell which this was.

Web Magnet webmagnet.com this appears to be a tool used by this web consultancy.

WebMiner Either tribolic.com/webminer/ or webminer.com/webminer/index.cfm?section=overview A tool to track down and target visitors to your website

WebPix Tool to fetch all pictures from a web site netwu.com/webpix/

Webpush webhauler.com/webpush.htm

WebSymmetrix "Originates in Korea and is possibly related to their" National Computerization Agency. Uses IP address 210.183.28.39

webrank webrank.com/features.asp Search engine popularity meter.

webwasher webwasher.com/en/products/wwash/functions.htm (browser filter)

WhosTalking softwaresolutions.net/whostalking/ Software that tracks trademark usage. Last time I saw it, it was creating 404 errors by adding &dg.. to each URL. Hopefully they'll fix this.

MacroX.de macrox.de (German). Appears to be an interpreter designed to help automate regular tasks on a Windows PC.

XupiterToolbar A toolbar that sets up xupiter.com as the default search engine. There appears to be a lot of negative press regarding this toolbar

yacy yacy.net/home.html. An open source and distributed search engine project. The above URL seems to redirect to an IP-based one

YottaShopping_Bot www-yottashopping-com/. User agent claims this is a Shopping Search Engine, but the URL requires a login so I was unable to verify (so I deliberately made its URL non-clickable). Crawled from 64.62.175.133

Sites that regularly visit

Some IP addresses or sites may regularly visit you, although the user agent may be obscure, blank or even change.

Here are a few that I've been able to work out.

Site address(es) Description

proxy.netsetter.org This is a site that offers a speed-up to your surfing in return for being able to monitor people's surfing habits. The speed-ups are achieved through a variety of techniques, and the monitoring info is sold on, although your privacy is protected. Visit netsetter.org for more details.

pwoshoes.transport.com Not known

...lightrealm.com This site daily reads any XML files submitted to a shareware site in PAD format. PAD is a means for describing shareware devised by the Association of Shareware Professionals (asp-shareware.org). This site is performing daily checks, looking to automatically update its lists with any changes.
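
One way to spot regular visitors like these is to group log entries by client address rather than by user agent. The sketch below assumes a server log in the Apache 'combined' format and uses made-up sample lines; the function name, the regular expression and the sample addresses are illustrative and do not come from the original page.

```python
import re
from collections import defaultdict

# Assumed "combined" layout:
# host ident user [time] "request" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)

def agents_by_host(lines):
    """Map each client host to the set of user agent strings it presented."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed format are skipped
            seen[match.group("host")].add(match.group("agent"))
    return dict(seen)

# Hypothetical sample entries (addresses taken from documentation ranges)
sample = [
    '192.0.2.10 - - [05/Sep/2007:10:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "-"',
    '192.0.2.10 - - [06/Sep/2007:10:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [06/Sep/2007:11:00:00 +0000] "GET /a.html HTTP/1.0" 200 99 "-" "Mozilla/4.0"',
]

for host, agents in sorted(agents_by_host(sample).items()):
    print(host, sorted(agents))
```

A host that keeps returning with a blank or changing agent shows up here as a single key with several values, which is usually a good starting point for a reverse lookup.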

Other useful sites

Here are links to other sites you might find useful when looking into web robots

botspot.com A Bot monitor site, with regular updates and links to the bots' home pages.

htmlhelp.com/links/validators.htm A list of HTML validators

iplists.com A site that lists IP addresses of search engine bots and others. More comprehensive (and probably more up to date) than the IP addresses shown on this page (which tends to record the first IP address seen).

tool.motoricerca.info/robots-checker.phtml An online syntax checker for robots.txt files. Enter the URL of your robots.txt file to get it checked and to see a summary of what effect it will have.

mozilla.org/build/revised-user-agent-strings.html Mozilla web browser project. This page describes the conventions used for formatting the User Agent in the form 'Mozilla...'

robotstxt.org/wc/robots.html A site dedicated to the robots.txt file. This page gives some background to how robots work, although their list of robots is quite small.

searchtools.com/robots/ A page collecting together a number of resources to do with all aspects of web robots.

spiderhunter.com A site primarily about 'cloaking' sites, the art of making a site look different to different visitors. Contains articles on how to detect spiders.

webcab.de/wapua.htm A site listing WAP user agent strings. These will mostly be mobile phones

webmasterworld.com/forum11/index.htm This site contains a number of forums for topics of interest to webmasters everywhere. This particular forum actively discusses robots and search engines that visit your site.
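
As a rough local counterpart to the online robots.txt checker listed above, Python's standard urllib.robotparser module can report what a given robots.txt file permits. This is only a sketch: the rules and the 'ExampleBot' and 'OtherBot' names are invented for illustration, and the rules are parsed from a string so nothing is fetched from the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real file would live at the site root
rules = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())  # parse in-memory lines, no network access

checks = [
    ("ExampleBot", "/private/page.html"),
    ("ExampleBot", "/index.html"),
    ("OtherBot", "/cgi-bin/search"),
    ("OtherBot", "/private/page.html"),
]

for agent, path in checks:
    verdict = "allowed" if parser.can_fetch(agent, path) else "disallowed"
    print(agent, path, verdict)
```

Note that robots.txt is purely advisory: it tells well-behaved robots what to avoid, but nothing stops a robot from ignoring it.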

"...And finally some fakers "

Increasingly, security and privacy concerns mean that users and companies are wary about giving away information to sites they visit through the user agent and other fields that appear in server logs.

Some browsers will allow you to select the user agent you present when visiting a site. The Opera browser does this, for example, to allow its users to pretend to be either IE or Netscape when visiting web sites coded in a way that forgets there are other browsers in use.

Also, as firewalls become more common, we will see more and more user agent fields being blocked by the firewall, which will prevent this information being transmitted to the outside world.

Just to prove that you can never rely on the user agent, here is a selection of user agent strings I've seen in my log files that tell us nothing about the software being used (although some of them speak volumes about the person driving the software). I'm omitting any IP addresses I may have, to protect the identities of those concerned.

'user agent' seen Comments

Bruciebot I'm assured this was created by a regular in alt.webmasters

Blocked by Norton, Geblokkeerd door Norton, Blockeriet von Norton The agent has been blocked by Norton Utilities. The referrer is also withheld. The second version is Dutch. No doubt other languages occur.

Don't Like AOL Oh dear. This could start a trend!

Don't be so nosey Hey! You came to my site first, remember?

Don't you wish you knew. Obviously.

Go Away A bit rich from someone who came to my site!

Field blocked by AtGuard Surfer is behind the AtGuard firewall (now part of Norton Internet Security 2000) which prevents the true User Agent being transmitted. home.pages.at/atguard/

Field blocked by Outpost agnitum.com Again, the field is withheld by the software

Isch habe gar kein Browser German for 'I have no browser'. Or so I thought, until I received the following from Clemens Marschner: Actually it is German with Italian accent! The word refers to an advertisement of the Nescafe coffee, where a smart Italian convinces a beautiful lady to stay and drink coffee with him after she knocks at his door to complain that his car is in the way of hers. And after she stayed and listened to him while he prepares the coffee with lots of gestures and Italian speak, she again asks him to move his car, and he goes 'Isch 'abe gar keine Auto, Signorina' (I don't even have a car, signorina). Since that commercial was shown for years, presumably all German web masters know it...

My Web browser is not of your business True, but no fun.

multiBlocker browser multiblocker.com/home.html Although this seems to mainly offer protection against visitors to your site, they obviously also provide a user agent blocker for people browsing.

Wabbit's don't use browsers Probably the proxy service at rabbit-proxy.sourceforge.net/

Wot no browser? (Win67; X; SK) Win67 ?!? Ah... a dream come true!

Who gives a ? It's as least as good as Lynx Ah yes, but how do we know that?

Who wants to know? I do.
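
Strings like those in the table above can at least be separated from ordinary 'Product/version' identifiers when analysing logs. The following is one possible heuristic, not a reliable detector: the patterns and category names are assumptions chosen to match a few of the examples listed here, and a determined faker can simply present a plausible product token instead.

```python
import re

# Heuristic patterns; assumptions based on the examples in the table above
BLOCKED = re.compile(r"block(ed|eriet)|geblokkeerd", re.IGNORECASE)
PRODUCT = re.compile(r"^[A-Za-z][\w.\-]*/\d")  # e.g. "Mozilla/4.0"

def classify_agent(agent):
    """Return a rough category for a user agent string."""
    agent = agent.strip()
    if agent in ("", "-"):
        return "blank"
    if BLOCKED.search(agent):
        return "blocked by privacy software"
    if PRODUCT.match(agent):
        return "product token"
    return "unrecognised"

examples = [
    "Field blocked by AtGuard",
    "Geblokkeerd door Norton",
    "Mozilla/4.0 (compatible; MSIE 6.0)",
    "Go Away",
    "-",
]

for ua in examples:
    print(repr(ua), "->", classify_agent(ua))
```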

Awards for this page

I've been told this page is referenced in the book Spidering Hacks

All awards gratefully received

This page is © 2000-2005 John A Fotheringham. It may not be reproduced without permission, although you are welcome to save a copy for personal use to your hard disk.

The original page is located at http://free.naplesplus.us/articles/view.php/34017/search-engine-robots-that-visit-your-web-site-by-john-a-fotheringham.
