webbase is an Internet web crawler written in C and later ported to C++. It uses a MySQL database to store information about crawled URLs, and it is available as a command-line program or as a library (shared or static). It serves two main purposes: crawling the Web to fetch documents, and supplying those documents for building a full-text database. The crawler visits documents and stores interesting information about them locally. It revisits each document on a regular basis to make sure that it is still there, and updates the local copy if the document has changed. A full-text database can then use the local copies to build a searchable index; the full-text indexing functions themselves are not included in webbase.
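The revisit behaviour described above (check that a document is still there, update it if it changed) can be illustrated with a minimal sketch. This is not webbase's code or API: the names UrlRecord, fingerprint, is_due and revisit are hypothetical, the record is kept in memory rather than in MySQL, and change detection by content hash is an assumption made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class UrlRecord:
    """Hypothetical per-URL bookkeeping (webbase keeps this kind of data in MySQL)."""
    url: str
    last_crawled: float   # time of the last visit, in seconds
    content_hash: str     # fingerprint of the local copy
    present: bool = True  # False once the document has disappeared


def fingerprint(body: bytes) -> str:
    """Return a stable fingerprint of a document body."""
    return hashlib.sha256(body).hexdigest()


def is_due(record: UrlRecord, now: float, interval: float) -> bool:
    """A URL is due for a revisit once `interval` seconds have passed."""
    return now - record.last_crawled >= interval


def revisit(record: UrlRecord, body: Optional[bytes], now: float) -> str:
    """Update the record after a fetch; `body` is None if the fetch failed.

    Returns "gone", "changed" or "unchanged".
    """
    record.last_crawled = now
    if body is None:
        record.present = False
        return "gone"
    record.present = True
    new_hash = fingerprint(body)
    if new_hash != record.content_hash:
        record.content_hash = new_hash  # the local copy would be refreshed here
        return "changed"
    return "unchanged"
```

In this sketch the caller is expected to fetch the document itself and pass the body (or None) to revisit, which keeps the scheduling logic independent of any particular HTTP library.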
Similar scripts
Larbin
Larbin is a web crawler (also called (web) robot, spider, scooter, etc). It is intended to fetch a large number ...
Harvest
Harvest is a system to collect information and make it searchable using a web interface. Harvest can collect information on ...