Web Robots Index

  • robot-id: myweb
    robot-name: Internet Shinchakubin
    robot-cover-url: http://naragw.sharp.co.jp/myweb/home/
    robot-details-url:
    robot-owner-name: SHARP Corp.
    robot-owner-url: http://naragw.sharp.co.jp/myweb/home/
    robot-owner-email: shinchakubin-request@isl.nara.sharp.co.jp
    robot-status: active
    robot-purpose: find new links and changed pages
    robot-type: standalone
    robot-platform: Windows98
    robot-availability: binary as bundled software
    robot-exclusion: yes
    robot-exclusion-useragent: sharp-info-agent
    robot-noindex: no
    robot-host: *
    robot-from: no
    robot-useragent: User-Agent: Mozilla/4.0 (compatible; sharp-info-agent v1.0; )
    robot-language: Java
    robot-description: makes a list of new links and changed pages based
    on the pages the user clicked most frequently in the past 31 days.
    The client may run this software once or a few times a day, either
    manually or at a specified time.
    robot-history: shipped for SHARP's PC users since Feb 2000
    robot-environment: commercial
    modified-date: Fri, 30 Jun 2000 19:02:52 JST
    modified-by: Katsuo Doi <doi@isl.nara.sharp.co.jp>
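
Each entry in this index is a block of "field: value" lines, with long values wrapped onto the following lines. The sketch below shows one way such an entry might be read into a dictionary in Python. The function name parse_record is hypothetical, and the rule that only lines starting with "robot-" or "modified-" open a new field is an assumption made for illustration, not something the index defines.

```python
def parse_record(text):
    """Parse one index entry into a dict of field name -> value.

    Assumption: a line opens a new field only if the text before its first
    colon starts with "robot-" or "modified-"; any other non-empty line is
    treated as a continuation of the previous field's value.
    """
    record = {}
    current = None
    for raw in text.splitlines():
        # Drop indentation and the leading bullet, if any.
        line = raw.strip().lstrip("•").strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep and key.startswith(("robot-", "modified-")):
            current = key
            record[current] = value.strip()
        elif current is not None:
            # Wrapped line: append to the field currently being read.
            record[current] = (record[current] + " " + line).strip()
    return record


sample = """\
  • robot-id: myweb
    robot-name: Internet Shinchakubin
    robot-cover-url: http://naragw.sharp.co.jp/myweb/home/
    robot-details-url:
    robot-description: makes a list of new links and changed pages based
    on user's frequently clicked pages in the past 31 days.
"""

rec = parse_record(sample)
print(rec["robot-id"], "|", rec["robot-cover-url"])
```

Splitting on the first colon only keeps URL values intact, and empty fields such as robot-details-url come back as empty strings rather than being dropped.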

  • robot-id: netcarta
    robot-name: NetCarta WebMap Engine
    robot-cover-url: http://www.netcarta.com/
    robot-details-url:
    robot-owner-name: NetCarta WebMap Engine
    robot-owner-url: http://www.netcarta.com/
    robot-owner-email: info@netcarta.com
    robot-status:
    robot-purpose: indexing, maintenance, mirroring, statistics
    robot-type: standalone
    robot-platform:
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex:
    robot-host:
    robot-from: yes
    robot-useragent: NetCarta CyberPilot Pro
    robot-language: C++.
    robot-description: The NetCarta WebMap Engine is a general purpose, commercial
    spider. Packaged with a full GUI in the CyberPilot Pro
    product, it acts as a personal spider to work with a browser
    to facilitate context-based navigation. The WebMapper
    product uses the robot to manage a site (site copy, site
    diff, and extensive link management facilities). All
    versions can create publishable NetCarta WebMaps, which
    capture the crawled information. If the robot sees a
    published map, it will return the published map rather than
    continuing its crawl. Since this is a personal spider, it
    will be launched from multiple domains. This robot tends to
    focus on a particular site. No instance of the robot should
    have more than one outstanding request out to any given site
    at a time. The User-agent field contains a coded ID
    identifying the instance of the spider; specific users can
    be blocked via robots.txt using this ID.
    robot-history:
    robot-environment:
    modified-date: Sun Feb 18 02:02:49 1996.
    modified-by:
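
The NetCarta description above says a site can block the spider through robots.txt. The sketch below uses Python's standard urllib.robotparser to show how such a rule might be evaluated. The robots.txt content, the "NetCarta" token in the User-agent line, the second agent name, and the example.com URLs are all assumptions for illustration; the coded per-instance ID mentioned in the entry is not given in the source and is not reproduced here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that wants to keep this spider out
# entirely while only restricting /private/ for everyone else.
robots_txt = """\
User-agent: NetCarta
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The agent string is the robot-useragent value listed in the entry.
netcarta_ok = parser.can_fetch("NetCarta CyberPilot Pro", "http://example.com/index.html")
other_ok = parser.can_fetch("SomeOtherBot/1.0", "http://example.com/index.html")
other_private_ok = parser.can_fetch("SomeOtherBot/1.0", "http://example.com/private/a.html")

print(netcarta_ok, other_ok, other_private_ok)
```

Note that urllib.robotparser matches the User-agent token as a case-insensitive substring of the requesting agent's name, which is one interpretation of the exclusion convention; other parsers may match differently.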

  • robot-id: netmechanic
    robot-name: NetMechanic
    robot-cover-url: http://www.netmechanic.com
    robot-details-url: http://www.netmechanic.com/faq.html
    robot-owner-name: Tom Dahm
    robot-owner-url: http://iquest.com/~tdahm
    robot-owner-email: tdahm@iquest.com
    robot-status: development
    robot-purpose: Link and HTML validation
    robot-type: standalone with web gateway
    robot-platform: UNIX
    robot-availability: via web page
    robot-exclusion: Yes
    robot-exclusion-useragent: WebMechanic
    robot-noindex: no
    robot-host: 206.26.168.18
    robot-from: no
    robot-useragent: NetMechanic
    robot-language: C
    robot-description: NetMechanic is a link validation and
    HTML validation robot run using a web page interface.
    robot-history:
    robot-environment:
    modified-date: Sat, 17 Aug 1996 12:00:00 GMT
    modified-by:

  • robot-id: netscoop
    robot-name: NetScoop
    robot-cover-url: http://www-a2k.is.tokushima-u.ac.jp/search/index.html
    robot-owner-name: Kenji Kita
    robot-owner-url: http://www-a2k.is.tokushima-u.ac.jp/member/kita/index.html
    robot-owner-email: kita@is.tokushima-u.ac.jp
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: UNIX
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: NetScoop
    robot-host: alpha.is.tokushima-u.ac.jp, beta.is.tokushima-u.ac.jp
    robot-useragent: NetScoop/1.0 libwww/5.0a
    robot-language: C
    robot-description: The NetScoop robot is used to build the database
    for the NetScoop search engine.
    robot-history: The robot has been used in the research project
    at the Faculty of Engineering, Tokushima University, Japan,
    since Dec. 1996.
    robot-environment: research
    modified-date: Fri, 10 Jan 1997.
    modified-by: Kenji Kita

  • robot-id: newscan-online
    robot-name: newscan-online
    robot-cover-url: http://www.newscan-online.de/
    robot-details-url: http://www.newscan-online.de/info.html
    robot-owner-name: Axel Mueller
    robot-owner-url:
    robot-owner-email: mueller@newscan-online.de
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: Linux
    robot-availability: binary
    robot-exclusion: yes
    robot-exclusion-useragent: newscan-online
    robot-noindex: no
    robot-host: *newscan-online.de
    robot-from: yes
    robot-useragent: newscan-online/1.1
    robot-language: perl
    robot-description: The newscan-online robot is used to build a database for
    the newscan-online news search service operated by smart information
    services. The robot runs daily and visits predefined sites in a random order.
    robot-history: This robot finds its roots in a prereleased software for
    news filtering for Lotus Notes in 1995.
    robot-environment: service
    modified-date: Fri, 9 Apr 1999 11:45:00 GMT
    modified-by: Axel Mueller

  • robot-id: nhse
    robot-name: NHSE Web Forager
    robot-cover-url: http://nhse.mcs.anl.gov/
    robot-details-url:
    robot-owner-name: Robert Olson
    robot-owner-url: http://www.mcs.anl.gov/people/olson/
    robot-owner-email: olson@mcs.anl.gov
    robot-status:
    robot-purpose: indexing
    robot-type: standalone
    robot-platform:
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex: no
    robot-host: *.mcs.anl.gov
    robot-from: yes
    robot-useragent: NHSEWalker/3.0
    robot-language: perl 5
    robot-description: to generate a Resource Discovery database
    robot-history:
    robot-environment:
    modified-date: Fri May 5 15:47:55 1995
    modified-by:

  • robot-id: nomad
    robot-name: Nomad
    robot-cover-url: http://www.cs.colostate.edu/~sonnen/projects/nomad.html
    robot-details-url:
    robot-owner-name: Richard Sonnen
    robot-owner-url: http://www.cs.colostate.edu/~sonnen/
    robot-owner-email: sonnen@cs.colostat.edu
    robot-status:
    robot-purpose: indexing
    robot-type: standalone
    robot-platform:
    robot-availability:
    robot-exclusion: no
    robot-exclusion-useragent:
    robot-noindex:
    robot-host: *.cs.colostate.edu
    robot-from: no
    robot-useragent: Nomad-V2.x
    robot-language: Perl 4
    robot-description:
    robot-history: Developed in 1995 at Colorado State University.
    robot-environment:
    modified-date: Sat Jan 27 21:02:20 1996.
    modified-by:

  • robot-id: northstar
    robot-name: The NorthStar Robot
    robot-cover-url: http://comics.scs.unr.edu:7000/top.html
    robot-details-url:
    robot-owner-name: Fred Barrie
    robot-owner-url:
    robot-owner-email: barrie@unr.edu
    robot-status:
    robot-purpose: indexing
    robot-type:
    robot-platform:
    robot-availability:
    robot-exclusion:
    robot-exclusion-useragent:
    robot-noindex:
    robot-host: frognot.utdallas.edu, utdallas.edu, cnidir.org
    robot-from: yes
    robot-useragent: NorthStar
    robot-language:
    robot-description: Recent runs (26 April 94) will concentrate on textual
    analysis of the Web versus GopherSpace (from the Veronica
    data) as well as indexing.
    robot-history:
    robot-environment:
    modified-date:
    modified-by:

  • robot-id: occam
    robot-name: Occam
    robot-cover-url: http://www.cs.washington.edu/research/projects/ai/www/occam/
    robot-details-url:
    robot-owner-name: Marc Friedman
    robot-owner-url: http://www.cs.washington.edu/homes/friedman/
    robot-owner-email: friedman@cs.washington.edu
    robot-status: development
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: Occam
    robot-noindex: no
    robot-host: gentian.cs.washington.edu, sekiu.cs.washington.edu, saxifrage.cs.washington.edu
    robot-from: yes
    robot-useragent: Occam/1.0
    robot-language: CommonLisp, perl4
    robot-description: The robot takes high-level queries, breaks them down into
    multiple web requests, and answers them by combining disparate
    data gathered in one minute from numerous web sites, or from
    the robot's cache. Currently the only user is me.
    robot-history: The robot is a descendant of Rodney,
    an earlier project at the University of Washington.
    robot-environment: research
    modified-date: Thu, 21 Nov 1996 20:30 GMT
    modified-by: friedman@cs.washington.edu (Marc Friedman)

  • robot-id: octopus
    robot-name: HKU WWW Octopus
    robot-cover-url: http://phoenix.cs.hku.hk:1234/~jax/w3rui.shtml
    robot-details-url:
    robot-owner-name: Law Kwok Tung, Lee Tak Yeung, Lo Chun Wing
    robot-owner-url: http://phoenix.cs.hku.hk:1234/~jax
    robot-owner-email: jax@cs.hku.hk
    robot-status:
    robot-purpose: indexing
    robot-type: standalone
    robot-platform:
    robot-availability:
    robot-exclusion: no.
    robot-exclusion-useragent:
    robot-noindex:
    robot-host: phoenix.cs.hku.hk
    robot-from: yes
    robot-useragent: HKU WWW Robot,
    robot-language: Perl 5, C, Java.
    robot-description: HKU Octopus is an ongoing project for resource discovery in
    the Hong Kong and China WWW domain. It is a research
    project conducted by three undergraduates at the University
    of Hong Kong.
    robot-history:
    robot-environment:
    modified-date: Thu Mar 7 14:21:55 1996.
    modified-by:

  • robot-id: openfind
    robot-name: Openfind data gatherer
    robot-cover-url: http://www.openfind.com.tw/
    robot-details-url: http://www.openfind.com.tw/robot.html
    robot-owner-name:
    robot-owner-url:
    robot-owner-email: robot-response@openfind.com.tw
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform:
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex:
    robot-host: 66.7.131.132
    robot-from:
    robot-useragent: Openfind data gatherer, Openbot/3.0+(robot-response@openfind.com.tw;+http://www.openfind.com.tw/robot.html)
    robot-language:
    robot-description:
    robot-history:
    robot-environment:
    modified-date: Thu, 26 Apr 2001 02:55:21 GMT
    modified-by: stanislav shalunov <shalunov@internet2.edu>

  • robot-id: orb_search
    robot-name: Orb Search
    robot-cover-url: http://orbsearch.home.ml.org
    robot-details-url: http://orbsearch.home.ml.org
    robot-owner-name: Matt Weber
    robot-owner-url: http://www.weberworld.com
    robot-owner-email: webernet@geocities.com
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix
    robot-availability: data
    robot-exclusion: yes
    robot-exclusion-useragent: Orbsearch/1.0
    robot-noindex: yes
    robot-host: cow.dyn.ml.org, *.dyn.ml.org
    robot-from: yes
    robot-useragent: Orbsearch/1.0
    robot-language: Perl5
    robot-description: Orbsearch builds the database for Orb Search Engine.
    It runs when requested.
    robot-history: This robot was started as a hobby.
    robot-environment: hobby
    modified-date: Sun, 31 Aug 1997 02:28:52 GMT
    modified-by: Matt Weber

  • robot-id: packrat
    robot-name: Pack Rat
    robot-cover-url: http://web.cps.msu.edu/~dexterte/isl/packrat.html
    robot-details-url:
    robot-owner-name: Terry Dexter
    robot-owner-url: http://web.cps.msu.edu/~dexterte
    robot-owner-email: dexterte@cps.msu.edu
    robot-status: development
    robot-purpose: both maintenance and mirroring
    robot-type: standalone
    robot-platform: unix
    robot-availability: at the moment, none...source when developed.
    robot-exclusion: yes
    robot-exclusion-useragent: packrat or *
    robot-noindex: no, not yet
    robot-host: cps.msu.edu
    robot-from:
    robot-useragent: PackRat/1.0
    robot-language: perl with libwww-5.0
    robot-description: Used for local maintenance and for gathering
    web pages so that local statistical info can be used in artificial
    intelligence programs. Funded by NEMOnline.
    robot-history: In the making...
    robot-environment: research
    modified-date: Tue, 20 Aug 1996 15:45:11
    modified-by: Terry Dexter

  • robot-id: pageboy
    robot-name: PageBoy
    robot-cover-url: http://www.webdocs.org/
    robot-details-url: http://www.webdocs.org/
    robot-owner-name: Chihiro Kuroda
    robot-owner-url: http://www.webdocs.org/
    robot-owner-email: pageboy@webdocs.org
    robot-status: development
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: pageboy
    robot-noindex: yes
    robot-nofollow: yes
    robot-host: *.webdocs.org
    robot-from: yes
    robot-useragent: PageBoy/1.0
    robot-language: c
    robot-description: The robot visits at regular intervals.
    robot-history: none
    robot-environment: service
    modified-date: Fri, 21 Oct 1999 17:28:52 GMT
    modified-by: webdocs

  • robot-id: parasite
    robot-name: ParaSite
    robot-cover-url: http://www.ianett.com/parasite/
    robot-details-url: http://www.ianett.com/parasite/
    robot-owner-name: iaNett.com
    robot-owner-url: http://www.ianett.com/
    robot-owner-email: parasite@ianett.com
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: windowsNT
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: ParaSite
    robot-noindex: yes
    robot-nofollow: yes
    robot-host: *.ianett.com
    robot-from: yes
    robot-useragent: ParaSite/0.21 (http://www.ianett.com/parasite/)
    robot-language: c++
    robot-description: Builds index for ianett.com search database. Runs
    continuously.
    robot-history: Second generation of ianett.com spidering technology,
    originally called Sven.
    robot-environment: service
    modified-date: July 28, 2000
    modified-by: Marty Anstey

  • robot-id: patric
    robot-name: Patric
    robot-cover-url: http://www.nwnet.net/technical/ITR/index.html
    robot-details-url: http://www.nwnet.net/technical/ITR/index.html
    robot-owner-name: toney@nwnet.net
    robot-owner-url: http://www.nwnet.net/company/staff/toney
    robot-owner-email: webmaster@nwnet.net
    robot-status: development
    robot-purpose: statistics
    robot-type: standalone
    robot-platform: unix
    robot-availability: data
    robot-exclusion: yes
    robot-exclusion-useragent: patric
    robot-noindex: yes
    robot-host: *.nwnet.net
    robot-from: no
    robot-useragent: Patric/0.01a
    robot-language: perl
    robot-description: (contained at http://www.nwnet.net/technical/ITR/index.html )
    robot-history: (contained at http://www.nwnet.net/technical/ITR/index.html )
    robot-environment: service
    modified-date: Thurs, 15 Aug 1996
    modified-by: toney@nwnet.net

  • robot-id: pegasus
    robot-name: pegasus
    robot-cover-url: http://opensource.or.id/projects.html
    robot-details-url: http://pegasus.opensource.or.id
    robot-owner-name: A.Y.Kiky Shannon
    robot-owner-url: http://go.to/ayks
    robot-owner-email: shannon@opensource.or.id
    robot-status: inactive - open source
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix
    robot-availability: source, binary
    robot-exclusion: yes
    robot-exclusion-useragent: pegasus
    robot-noindex: yes
    robot-host: *
    robot-from: yes
    robot-useragent: web robot PEGASUS
    robot-language: perl5
    robot-description: pegasus gathers information from HTML pages (7 important
    tags). The indexing process can be started based on starting URL(s) or a range
    of IP addresses.
    robot-history: This robot was created as the implementation of a final project at the
    Informatics Engineering Department, Institute of Technology Bandung, Indonesia.
    robot-environment: research
    modified-date: Fri, 20 Oct 2000 14:58:40 GMT
    modified-by: A.Y.Kiky Shannon

  • robot-id: perignator
    robot-name: The Peregrinator
    robot-cover-url: http://www.maths.usyd.edu.au:8000/jimr/pe/Peregrinator.html
    robot-details-url:
    robot-owner-name: Jim Richardson
    robot-owner-url: http://www.maths.usyd.edu.au:8000/jimr.html
    robot-owner-email: jimr@maths.su.oz.au
    robot-status:
    robot-purpose:
    robot-type:
    robot-platform:
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex: no
    robot-host:
    robot-from: yes
    robot-useragent: Peregrinator-Mathematics/0.7
    robot-language: perl 4
    robot-description: This robot is being used to generate an index of documents
    on Web sites connected with mathematics and statistics. It
    ignores off-site links, so does not stray from a list of
    servers specified initially.
    robot-history: commenced operation in August 1994
    robot-environment:
    modified-date:
    modified-by:

  • robot-id: perlcrawler
    robot-name: PerlCrawler 1.0
    robot-cover-url: http://perlsearch.hypermart.net/
    robot-details-url: http://www.xav.com/scripts/xavatoria/index.html
    robot-owner-name: Matt McKenzie
    robot-owner-url: http://perlsearch.hypermart.net/
    robot-owner-email: webmaster@perlsearch.hypermart.net
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix
    robot-availability: source
    robot-exclusion: yes
    robot-exclusion-useragent: perlcrawler
    robot-noindex: yes
    robot-host: server5.hypermart.net
    robot-from: yes
    robot-useragent: PerlCrawler/1.0 Xavatoria/2.0
    robot-language: perl5
    robot-description: The PerlCrawler robot is designed to index and build
    a database of pages relating to the Perl programming language.
    robot-history: Originated in modified form on 25 June 1998
    robot-environment: hobby
    modified-date: Fri, 18 Dec 1998 23:37:40 GMT
    modified-by: Matt McKenzie

  • robot-id: phantom
    robot-name: Phantom
    robot-cover-url: http://www.maxum.com/phantom/
    robot-details-url:
    robot-owner-name: Larry Burke
    robot-owner-url: http://www.aktiv.com/
    robot-owner-email: lburke@aktiv.com
    robot-status:
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: Macintosh
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex:
    robot-host:
    robot-from: yes
    robot-useragent: Duppies
    robot-language:
    robot-description: Designed to allow webmasters to provide a searchable index
    of their own site as well as to other sites, perhaps with
    similar content.
    robot-history:
    robot-environment:
    modified-date: Fri Jan 19 05:08:15 1996.
    modified-by:

  • robot-id: phpdig
    robot-name: PhpDig
    robot-cover-url: http://phpdig.toiletoine.net/
    robot-details-url: http://phpdig.toiletoine.net/
    robot-owner-name: Antoine Bajolet
    robot-owner-url: http://phpdig.toiletoine.net/
    robot-owner-email: phpdig@toiletoine.net
    robot-status: *
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: all supported by Apache/php/mysql
    robot-availability: source
    robot-exclusion: yes
    robot-exclusion-useragent: phpdig
    robot-noindex: yes
    robot-host: yes
    robot-from: no
    robot-useragent: phpdig/x.x.x
    robot-language: php 4.x
    robot-description: Small robot and search engine written in php.
    robot-history: first written 2001-03-30
    robot-environment: hobby
    modified-date: Sun, 21 Nov 2001 20:01:19 GMT
    modified-by: Antoine Bajolet

  • robot-id: piltdownman
    robot-name: PiltdownMan
    robot-cover-url: http://profitnet.bizland.com/
    robot-details-url: http://profitnet.bizland.com/piltdownman.html
    robot-owner-name: Daniel Vilà
    robot-owner-url: http://profitnet.bizland.com/aboutus.html
    robot-owner-email: profitnet@myezmail.com
    robot-status: active
    robot-purpose: statistics
    robot-type: standalone
    robot-platform: windows95, windows98, windowsNT
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: piltdownman
    robot-noindex: no
    robot-nofollow: no
    robot-host: 62.36.128.*, 194.133.59.*, 212.106.215.*
    robot-from: no
    robot-useragent: PiltdownMan/1.0 profitnet@myezmail.com
    robot-language: c++
    robot-description: The PiltdownMan robot is used to get a list of links
    from the search engines in our database. These links are followed, and
    the pages they refer to are downloaded to gather some statistics from them.
    The robot runs roughly once a month and visits the first 10 pages
    listed in every search engine for a group of keywords.
    robot-history: To maintain a database of search engines,
    we needed an automated tool. That's why
    we began the creation of this robot.
    robot-environment: service
    modified-date: Mon, 13 Dec 1999 21:50:32 GMT
    modified-by: Daniel Vilà
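
Many entries give robot-host as a comma-separated list of wildcard patterns, as in the PiltdownMan entry above. A rough Python sketch of checking a visiting host against such a field is shown here. The helper name host_matches and the sample addresses are hypothetical, and treating "*" as a shell-style wildcard is an assumption, since the index does not define the pattern syntax. Free-text values such as "*.uncfsu.edu or flyer.ncsc.org" are not handled.

```python
from fnmatch import fnmatch


def host_matches(robot_host_field, host):
    """Return True if host matches any comma-separated pattern in the field.

    Assumption: "*" in a robot-host value is a shell-style wildcard.
    """
    patterns = [p.strip().lower() for p in robot_host_field.split(",") if p.strip()]
    return any(fnmatch(host.lower(), pattern) for pattern in patterns)


piltdownman_hosts = "62.36.128.*, 194.133.59.*, 212.106.215.*"

print(host_matches(piltdownman_hosts, "194.133.59.7"))
print(host_matches(piltdownman_hosts, "10.0.0.1"))
print(host_matches("*.mcs.anl.gov", "nhse.mcs.anl.gov"))
```

A match on robot-host only suggests a request could come from the listed robot; addresses and hostnames in an index this old may no longer be in use.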

  • robot-id: pimptrain
    robot-name: Pimptrain.com's robot
    robot-cover-url: http://www.pimptrain.com/search.cgi
    robot-details-url: http://www.pimptrain.com/search.cgi
    robot-owner-name: Bryan Ankielewicz
    robot-owner-url: http://www.pimptrain.com
    robot-owner-email: webmaster@pimptrain.com
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix
    robot-availability: source;data
    robot-exclusion: yes
    robot-exclusion-useragent: Pimptrain
    robot-noindex: yes
    robot-host: pimptrain.com
    robot-from: *
    robot-useragent: Mozilla/4.0 (compatible: Pimptrain's robot)
    robot-language: perl5
    robot-description: Crawls remote sites as part of a search engine program
    robot-history: Implemented in 2001
    robot-environment: commercial
    modified-date: May 11, 2001
    modified-by: Bryan Ankielewicz

  • robot-id: pioneer
    robot-name: Pioneer
    robot-cover-url: http://sequent.uncfsu.edu/~micah/pioneer.html
    robot-details-url:
    robot-owner-name: Micah A. Williams
    robot-owner-url: http://sequent.uncfsu.edu/~micah/
    robot-owner-email: micah@sequent.uncfsu.edu
    robot-status:
    robot-purpose: indexing, statistics
    robot-type: standalone
    robot-platform:
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex:
    robot-host: *.uncfsu.edu or flyer.ncsc.org
    robot-from: yes
    robot-useragent: Pioneer
    robot-language: C.
    robot-description: Pioneer is part of an undergraduate research
    project.
    robot-history:
    robot-environment:
    modified-date: Mon Feb 5 02:49:32 1996.
    modified-by:

  • robot-id: pitkow
    robot-name: html_analyzer
    robot-cover-url:
    robot-details-url:
    robot-owner-name: James E. Pitkow
    robot-owner-url:
    robot-owner-email: pitkow@aries.colorado.edu
    robot-status:
    robot-purpose: maintenance
    robot-type:
    robot-platform:
    robot-availability:
    robot-exclusion:
    robot-exclusion-useragent:
    robot-noindex: no
    robot-host:
    robot-from:
    robot-useragent:
    robot-language:
    robot-description: to check validity of Web servers. I'm not sure if it has
    ever been run remotely.
    robot-history:
    robot-environment:
    modified-date:
    modified-by:

  • robot-id: pjspider
    robot-name: Portal Juice Spider
    robot-cover-url: http://www.portaljuice.com
    robot-details-url: http://www.portaljuice.com/pjspider.html
    robot-owner-name: Nextopia Software Corporation
    robot-owner-url: http://www.portaljuice.com
    robot-owner-email: pjspider@portaljuice.com
    robot-status: active
    robot-purpose: indexing, statistics
    robot-type: standalone
    robot-platform: unix
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: pjspider
    robot-noindex: yes
    robot-host: *.portaljuice.com, *.nextopia.com
    robot-from: yes
    robot-useragent: PortalJuice.com/4.0
    robot-language: C/C++
    robot-description: Indexing web documents for Portal Juice vertical portal
    search engine
    robot-history: Indexing the web since 1998 for the purposes of offering our
    commercial Portal Juice search engine services.
    robot-environment: service
    modified-date: Wed Jun 23 17:00:00 EST 1999
    modified-by: pjspider@portaljuice.com

  • robot-id: pka
    robot-name: PGP Key Agent
    robot-cover-url: http://www.starnet.it/pgp
    robot-details-url:
    robot-owner-name: Massimiliano Pucciarelli
    robot-owner-url: http://www.starnet.it/puma
    robot-owner-email: puma@comm2000.it
    robot-status: Active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: UNIX, Windows NT
    robot-availability: none
    robot-exclusion: no
    robot-exclusion-useragent:
    robot-noindex: no
    robot-host: salerno.starnet.it
    robot-from: yes
    robot-useragent: PGP-KA/1.2
    robot-language: Perl 5
    robot-description: This program searches for the PGP public key of the
    specified user.
    robot-history: Originated as a research project at Salerno
    University in 1995.
    robot-environment: Research
    modified-date: June 27 1996.
    modified-by: Massimiliano Pucciarelli

  • robot-id: plumtreewebaccessor
    robot-name: PlumtreeWebAccessor
    robot-cover-url:
    robot-details-url: http://www.plumtree.com/
    robot-owner-name: Joseph A. Stanko
    robot-owner-url:
    robot-owner-email: josephs@plumtree.com
    robot-status: development
    robot-purpose: indexing for the Plumtree Server
    robot-type: standalone
    robot-platform: windowsNT
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: PlumtreeWebAccessor
    robot-noindex: yes
    robot-host:
    robot-from: yes
    robot-useragent: PlumtreeWebAccessor/0.9
    robot-language: c++
    robot-description: The Plumtree Web Accessor is a component that
    customers can add to the Plumtree Server to index documents on the
    World Wide Web.
    robot-history:
    robot-environment: commercial
    modified-date: Thu, 17 Dec 1998
    modified-by: Joseph A. Stanko <josephs@plumtree.com>

  • robot-id: poppi
    robot-name: Poppi
    robot-cover-url: http://members.tripod.com/poppisearch
    robot-details-url: http://members.tripod.com/poppisearch
    robot-owner-name: Antonio Provenzano
    robot-owner-url: Antonio Provenzano
    robot-owner-email:
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix/linux
    robot-availability: none
    robot-exclusion:
    robot-exclusion-useragent:
    robot-noindex: yes
    robot-host:
    robot-from:
    robot-useragent: Poppi/1.0
    robot-language: C
    robot-description: Poppi is a crawler that indexes the web. It runs weekly,
    gathering and indexing hypertext, multimedia and executable file
    formats.
    robot-history: Created by Antonio Provenzano in April 2000, it has
    been acquired by Tomi Officine Multimediali srl and is about to be
    released as a commercial service.
    robot-environment: service
    modified-date: Mon, 22 May 2000 15:47:30 GMT
    modified-by: Antonio Provenzano

  • robot-id: portalb
    robot-name: PortalB Spider
    robot-cover-url: http://www.portalb.com/
    robot-details-url:
    robot-owner-name: PortalB Spider Bug List
    robot-owner-url:
    robot-owner-email: spider@portalb.com
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: PortalBSpider
    robot-noindex: yes
    robot-nofollow: yes
    robot-host: spider1.portalb.com, spider2.portalb.com, etc.
    robot-from: no
    robot-useragent: PortalBSpider/1.0 (spider@portalb.com)
    robot-language: C++
    robot-description: The PortalB Spider indexes selected sites for
    high-quality business information.
    robot-history:
    robot-environment: service

  • robot-id: psbot
    robot-name: psbot
    robot-cover-url: http://www.picsearch.com/
    robot-details-url: http://www.picsearch.com/bot.html
    robot-owner-name: picsearch AB
    robot-owner-url: http://www.picsearch.com/
    robot-owner-email: psbot@picsearch.com
    robot-status: active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: Linux
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: psbot
    robot-noindex: yes
    robot-nofollow: yes
    robot-host: *.picsearch.com
    robot-from: yes
    robot-useragent: psbot/0.X (+http://www.picsearch.com/bot.html)
    robot-language: c, c++
    robot-description: Spider for www.picsearch.com
    robot-history: Developed and tested in 2000/2001
    robot-environment: commercial
    modified-date: Tue, 21 Aug 2001 10:55:38 CEST 2001
    modified-by: psbot@picsearch.com

  • robot-id: Puu
    robot-name: GetterroboPlus Puu
    robot-details-url: http://marunaka.homing.net/straight/getter/
    robot-cover-url: http://marunaka.homing.net/straight/
    robot-owner-name: marunaka
    robot-owner-url: http://marunaka.homing.net
    robot-owner-email: marunaka@homing.net
    robot-status: active: robot actively in use
    robot-purpose:
    - gathering: gathers data from an original standard tag for Puu that contains
    information about the sites registered in my search engine.
    - maintenance: link validation
    robot-type: standalone
    robot-platform: unix
    robot-availability: none
    robot-exclusion: yes (Puu patrols only URLs registered in my search engine)
    robot-exclusion-useragent: Getterrobo-Plus
    robot-noindex: no
    robot-host: straight FLASH!! Getterrobo-Plus, *.homing.net
    robot-from: yes
    robot-useragent: straight FLASH!! GetterroboPlus 1.5
    robot-language: perl5
    robot-description:
    The Puu robot is used to gather data from sites registered in the search engine
    "straight FLASH!!" in order to build an announcement page on the update status of
    registered sites in "straight FLASH!!".
    The robot runs every day.
    robot-history:
    This robot patrols the sites registered in the search engine "straight FLASH!!"
    robot-environment: hobby
    modified-date: Fri, 26 Jun 1998

  • robot-id: python
    robot-name: The Python Robot
    robot-cover-url: http://www.python.org/
    robot-details-url:
    robot-owner-name: Guido van Rossum
    robot-owner-url: http://www.python.org/~guido/
    robot-owner-email: guido@python.org
    robot-status: retired
    robot-purpose:
    robot-type:
    robot-platform:
    robot-availability: none
    robot-exclusion:
    robot-exclusion-useragent:
    robot-noindex: no
    robot-host:
    robot-from:
    robot-useragent:
    robot-language:
    robot-description:
    robot-history:
    robot-environment:
    modified-date:
    modified-by:

  • robot-id: raven
    robot-name: Raven Search
    robot-cover-url: http://ravensearch.tripod.com
    robot-details-url: http://ravensearch.tripod.com
    robot-owner-name: Raven Group
    robot-owner-url: http://ravensearch.tripod.com
    robot-owner-email: ravensearch@hotmail.com
    robot-status: Development: robot under development
    robot-purpose: Indexing: gather content for commercial query engine.
    robot-type: Standalone: a separate program
    robot-platform: Unix, Windows98, WindowsNT, Windows2000
    robot-availability: None
    robot-exclusion: Yes
    robot-exclusion-useragent: Raven
    robot-noindex: Yes
    robot-nofollow: Yes
    robot-host: 192.168.1.*
    robot-from: Yes
    robot-useragent: Raven-v2
    robot-language: Perl-5
    robot-description: Raven was written for the express purpose of indexing the web.
    It can process hundreds of URLs in parallel. It runs on a sporadic basis
    as testing continues. It is really several programs running concurrently.
    It takes four computers to run Raven Search, and it scales in sets of four.
    robot-history: This robot is new. First active on March 25, 2000.
    robot-environment: Commercial: is a commercial product. Possibly GNU later ;-)
    modified-date: Fri, 25 Mar 2000 17:28:52 GMT
    modified-by: Raven Group
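
The robot-exclusion-useragent field in the records above is the token a site would name in its robots.txt file to address that robot. As a minimal sketch, assuming a hypothetical site www.example.com and a hypothetical /private/ path (neither comes from this index), Python's standard urllib.robotparser can show how the Raven token would be matched against the robot's Raven-v2 user agent:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: Raven
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The exclusion token "Raven" is matched against the robot's user agent.
blocked = parser.can_fetch("Raven-v2", "http://www.example.com/private/page.html")
allowed = parser.can_fetch("Raven-v2", "http://www.example.com/index.html")
print(blocked, allowed)
```

Whether a given robot honours such a rule depends on its robot-exclusion field; the parser only reports what the file asks for.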

  • robot-id: rbse
    robot-name: RBSE Spider
    robot-cover-url: http://rbse.jsc.nasa.gov/eichmann/urlsearch.html
    robot-details-url:
    robot-owner-name: David Eichmann
    robot-owner-url: http://rbse.jsc.nasa.gov/eichmann/home.html
    robot-owner-email: eichmann@rbse.jsc.nasa.gov
    robot-status: active
    robot-purpose: indexing, statistics
    robot-type:
    robot-platform:
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex:
    robot-host: rbse.jsc.nasa.gov (192.88.42.10)
    robot-from:
    robot-useragent:
    robot-language: C, oracle, wais
    robot-description: Developed and operated as part of the NASA-funded Repository
    Based Software Engineering Program at the Research Institute
    for Computing and Information Systems, University of Houston
    - Clear Lake.
    robot-history:
    robot-environment:
    modified-date: Thu May 18 04:47:02 1995
    modified-by:

  • robot-id: resumerobot
    robot-name: Resume Robot
    robot-cover-url: http://www.onramp.net/proquest/resume/robot/robot.html
    robot-details-url:
    robot-owner-name: James Stakelum
    robot-owner-url: http://www.onramp.net/proquest/resume/java/resume.html
    robot-owner-email: proquest@onramp.net
    robot-status:
    robot-purpose: indexing.
    robot-type: standalone
    robot-platform:
    robot-availability:
    robot-exclusion: yes
    robot-exclusion-useragent:
    robot-noindex:
    robot-host:
    robot-from: yes
    robot-useragent: Resume Robot
    robot-language: C++.
    robot-description:
    robot-history:
    robot-environment:
    modified-date: Tue Mar 12 15:52:25 1996.
    modified-by:

  • robot-id: rhcs
    robot-name: RoadHouse Crawling System
    robot-cover-url: http://stage.perceval.be (under development)
    robot-details-url:
    robot-owner-name: Gregoire Welraeds, Emmanuel Bergmans
    robot-owner-url: http://www.perceval.be
    robot-owner-email: helpdesk@perceval.be
    robot-status: development
    robot-purpose1: indexing
    robot-purpose2: maintenance
    robot-purpose3: statistics
    robot-type: standalone
    robot-platform1: unix (FreeBSD & Linux)
    robot-availability: none
    robot-exclusion: no (under development)
    robot-exclusion-useragent: RHCS
    robot-noindex: no (under development)
    robot-host: stage.perceval.be
    robot-from: no
    robot-useragent: RHCS/1.0a
    robot-language: c
    robot-description: robot used to build the database for the RoadHouse search service project operated by Perceval
    robot-history: The need for this robot has its roots in the current RoadHouse directory, which has not been maintained since 1997
    robot-environment: service
    modified-date: Fri, 26 Feb 1999 12:00:00 GMT
    modified-by: Gregoire Welraeds

  • robot-id: roadrunner
    robot-name: Road Runner: The ImageScape Robot
    robot-owner-name: LIM Group
    robot-owner-email: lim@cs.leidenuniv.nl
    robot-status: development/active
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: UNIX
    robot-exclusion: yes
    robot-exclusion-useragent: roadrunner
    robot-useragent: Road Runner: ImageScape Robot (lim@cs.leidenuniv.nl)
    robot-language: C, perl5
    robot-description: Create Image/Text index for WWW
    robot-history: ImageScape Project
    robot-environment: commercial service
    modified-date: Dec. 1st, 1996

  • robot-id: robbie
    robot-name: Robbie the Robot
    robot-cover-url:
    robot-details-url:
    robot-owner-name: Robert H. Pollack
    robot-owner-url:
    robot-owner-email: robert.h.pollack@lmco.com
    robot-status: development
    robot-purpose: indexing
    robot-type: standalone
    robot-platform: unix, windows95, windowsNT
    robot-availability: none
    robot-exclusion: yes
    robot-exclusion-useragent: Robbie
    robot-noindex: no
    robot-host: *.lmco.com
    robot-from: yes
    robot-useragent: Robbie/0.1
    robot-language: java
    robot-description: Used to define document collections for the DISCO system.
    Robbie is still under development and runs several
    times a day, but usually only for ten minutes or so.
    Sites are visited in the order in which references
    are found, but no host is visited more than once in
    any two-minute period.
    robot-history: The DISCO system is a resource-discovery component in
    the OLLA system, which is a prototype system, developed
    under DARPA funding, to support computer-based education
    and training.
    robot-environment: research
    modified-date: Wed, 5 Feb 1997 19:00:00 GMT
    modified-by:
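
The visiting policy in Robbie's description (sites taken in the order references are found, with no host visited more than once in any two-minute period) can be sketched as follows. This is an illustrative reconstruction, not Robbie's actual code, which is written in Java and is not available; the class name, function name and example URLs are all hypothetical.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Allow at most one visit per host within `min_interval` seconds."""

    def __init__(self, min_interval=120.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock      # injectable so the policy can be tested
        self.last_visit = {}    # host -> time of the most recent visit

    def seconds_until_allowed(self, url):
        """Return how long to wait before the URL's host may be visited."""
        last = self.last_visit.get(urlsplit(url).hostname)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (self.clock() - last))

    def record_visit(self, url):
        self.last_visit[urlsplit(url).hostname] = self.clock()


def next_fetchable(queue, throttle):
    """Pop and return the first queued URL whose host may be visited now.

    `queue` is a list kept in discovery order; URLs on hosts that are
    still cooling down are skipped and stay queued. Returns None when
    every queued host was visited too recently.
    """
    for i, url in enumerate(queue):
        if throttle.seconds_until_allowed(url) == 0.0:
            del queue[i]
            throttle.record_visit(url)
            return url
    return None
```

A crawler loop would call next_fetchable repeatedly and sleep briefly whenever it returns None, so that a busy host delays only its own URLs and not the rest of the queue.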

    © 2002 Hostsun™ All rights reserved
