If you want to build a search engine for your own website or your company’s internal website, you can try Crawlzilla.
Crawlzilla is free software that helps you build a search engine easily. With it, you don’t have to rely on a commercial company’s search engine, and you don’t have to worry about how your company’s internal website gets indexed.
Crawlzilla uses the Nutch project as its core, integrates additional related packages, and provides a user interface for installation and management, making it easier for users to get started.
In addition to crawling basic HTML, Crawlzilla can also analyze files found on web pages, in formats such as DOC, PDF, PPT, and RSS. As a result, your search engine is not just a web search engine but a full index of your site’s data.
Crawlzilla also includes word segmentation to make your searches more accurate. The main goal of this free software is to give users a search platform that is convenient and easy to install.
Features: easy to install, with word segmentation
License: Apache License 2.0
Operating System: Linux
Download: Free Download Crawlzilla