# Robots.txt file written by Xaragua Enterprise Corporation
# For domain: http://www.xaragua.com
# For domain: http://www.xaragua.ca
# All robots may spider the domain, except for the paths disallowed below
User-agent: *
User-agent: askjeeves
User-agent: teoma
User-agent: Altavista (Scooter)
User-agent: EuroSeek (Freecrawl)
User-agent: Altavista (Top 1000)
User-agent: Excite (ArchitextSpider)
User-agent: Planet Search (Fido)
User-agent: infoseek
User-agent: slurp
User-agent: fastcrawler
User-agent: lycos
User-agent: msnbot
User-agent: scooter
User-agent: googlebot
#
# User Agents -- These look like new robots, but have no contact info...
User-agent: BizBot04 #kirk.overleaf.com
User-agent: HappyBot #(gserver.kw.net)
User-agent: CaliforniaBrownSpider
User-agent: EI*Net/0.1 #libwww/0.1
User-agent: Ibot/1.0 #libwww-perl/0.40
User-agent: Merritt/1.0
User-agent: StatFetcher/1.0
User-agent: TeacherSoft/1.0 #libwww/2.17
User-agent: WWW Collector
User-agent: processor/0.0ALPHA #libwww-perl/0.20
User-agent: wobot/1.0 #from 206.214.202.45
User-agent: Libertech-Rover #www.libertech.com?
User-agent: WhoWhere Robot
User-agent: ITI Spider
User-agent: w3index
User-agent: MyCNNSpider
User-agent: SummyCrawler
User-agent: OGspider
User-agent: linklooker
User-agent: CyberSpyder #(amant@www.cyberspyder.com)
User-agent: SlowBot
User-agent: heraSpider
User-agent: Surfbot
User-agent: Bizbot003
User-agent: WebWalker
User-agent: SandBot
User-agent: EnigmaBot
User-agent: spyder3.microsys.com
User-agent: www.freeloader.com
#
# Hosts Agents -- These have no known user-agent, but have requested
# /robots.txt repeatedly or exhibited crawling patterns.
#   205.252.60.71
#   194.20.32.131
#   198.5.209.201
#   acke.dc.luth.se
#   dallas.mt.cs.cmu.edu
#   darkwing.cadvision.com
#   waldec.com
#   www2000.ogsm.vanderbilt.edu
#   unet.ca
#   murph.cais.net (rapid fire... sigh)
#   spyder3.microsys.com
#   www.freeloader.com.
#
# Magellan -- These services must use robots, but haven't replied to
# requests for an entry...
#   User-agent field: Wobot/1.00
#   From: mckinley.mckinley.com (206.214.202.2)
#         and galileo.mckinley.com. (206.214.202.45)
#   Honors "robots.txt": yes
#   Contact: cedeno@mckinley.mckinley.com
#            (or possibly: spider@mckinley.mckinley.com)
#   Purpose: Resource discovery for Magellan (http://www.mckinley.com/)
#
Disallow: /cgi-bin/ #this bans robots from our cgi-bin
Disallow: /scgi-bin/ #this bans robots from our scgi-bin
Disallow: /images/ #this bans robots from our images
Disallow: /HeadOffice/ #this bans robots from our HeadOffice
Disallow: /Customer_Portal/ #this bans robots from our Customer_Portal
Disallow: /ClientCentre/ #this bans robots from our ClientCentre
Disallow: /CS_Client_access_1/ #this bans robots from our CS_Client_access_1
Disallow: /FireWall/ #this bans robots from our FireWall
Disallow: /CustomerService/ #this bans robots from the Customer Service Centre
# A Disallow value must be a path on this host, not a full URL, so the
# phpcoin.com rule below has no effect here and is kept commented out.
# Disallow: http://www.phpcoin.com/ #this bans robots from phpcoin.com site