[ Source: htdig  ]

Package: htdig (1:3.1.6-11)

WWW search system for an intranet or small internet

The ht://Dig system is a complete World Wide Web indexing and searching system for a small domain or intranet. It is not meant to replace powerful internet-wide search systems like Lycos, Infoseek, Webcrawler and AltaVista. Instead, it is meant to cover the search needs of a single company, campus, or even a particular subsection of a web site.

Unlike some WAIS-based or web-server-based search engines, ht://Dig can span several web servers at a site. The type of these web servers doesn't matter as long as they understand the HTTP/1.0 protocol.

Features:

   * Intranet searching
   * It is free
   * Robot exclusion is supported
   * Boolean expression searching
   * Configurable search results
   * Fuzzy searching
   * Searching of HTML and text files
   * Keywords can be added to HTML documents
   * Email notification of expired documents
   * A protected server can be indexed
   * Searches on subsections of the database
   * Full source code included
   * The depth of the search can be limited
   * Full support for the ISO-Latin-1 character set

Disk space requirements:

The search engine will require lots of disk space to store its databases. Unfortunately, there is no exact formula to compute the space requirements. It depends on the number of documents you are going to index but also on the various options you use. To give you an idea of the space requirements, here is what I have deduced from our own database size at San Diego State University.

If you keep the wordlist database around (for update digging instead of initial digging), I found that multiplying the number of documents covered by 12,000 gives a figure, in bytes, that comes pretty close to the space required.

We have about 13,000 documents:

   * 150MB index size with a 'wordlist' database
   *  93MB index size without a 'wordlist' database
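
The rule of thumb above can be turned into a quick estimate. The following is only a rough sketch: the function names are made up for illustration, the 12,000 figure is taken as bytes per document, and one MB is assumed to be 1024 * 1024 bytes, which the description does not actually specify. The per-document figure for the case without a wordlist database is not quoted anywhere; it is merely derived here from the 93MB / 13,000 documents data point.

```python
# Rough disk-space estimate for an ht://Dig index, based on the rule of
# thumb quoted in the package description. All names here are hypothetical.

BYTES_PER_DOC_WITH_WORDLIST = 12_000  # rule of thumb from the description
BYTES_PER_MB = 1024 * 1024            # assumption: binary megabytes


def estimate_index_bytes(num_documents, bytes_per_document=BYTES_PER_DOC_WITH_WORDLIST):
    """Estimate the index size in bytes for a given number of documents."""
    return num_documents * bytes_per_document


def to_megabytes(num_bytes):
    """Convert a byte count to megabytes (using BYTES_PER_MB)."""
    return num_bytes / BYTES_PER_MB


def per_document_bytes(total_mb, num_documents):
    """Derive an average per-document cost from an observed index size."""
    return total_mb * BYTES_PER_MB / num_documents


if __name__ == "__main__":
    docs = 13_000

    with_wordlist = estimate_index_bytes(docs)
    print(f"{docs} documents, with wordlist: "
          f"about {to_megabytes(with_wordlist):.0f} MB")

    # Derived from the 93MB data point, not a figure stated in the description.
    without_per_doc = per_document_bytes(93, docs)
    print(f"without wordlist: roughly {without_per_doc:.0f} bytes per document")
```

Because the actual size also depends on the options used, treat the result as an order-of-magnitude guide rather than a precise requirement.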

The package is available in two varieties: the 'stable', well-tested version (this one) and a less tested version (as 'htdig3.2').

Other Packages Related to htdig

  Legend: dep = depends, rec = recommends, sug = suggests
  • dep: debconf (>= 0.5)
    Debian configuration management system
    or debconf-2.0
    virtual package provided by cdebconf, debconf
  • dep: gawk
    GNU awk, a pattern scanning and processing language
  • dep: libc6 (>= 2.3.2.ds1-4) [not alpha, ia64]
    GNU C Library: Shared libraries and Timezone data
    also a virtual package provided by libc6-udeb
  • dep: libc6.1 (>= 2.3.2.ds1-4) [alpha, ia64]
    GNU C Library: Shared libraries and Timezone data
    also a virtual package provided by libc6.1-udeb
  • dep: libdb2 (>= 2:2.7.7.0-7)
    The Berkeley database routines (run-time files)
  • dep: libgcc1 (>= 1:3.3.4-1) [hppa, m68k]
    GCC support library
    dep: libgcc1 (>= 1:3.4.1-3) [not hppa, ia64, m68k]
    dep: libgcc1 (>= 1:3.4.3-6) [ia64]
  • dep: libstdc++5 (>= 1:3.3.4-1)
    The GNU Standard C++ Library v3
  • dep: lockfile-progs
    Programs for locking and unlocking files and mailboxes
  • dep: perl
    Larry Wall's Practical Extraction and Report Language
  • dep: sed (>= 4.0)
    The GNU sed stream editor
  • dep: zlib1g (>= 1:1.2.1)
    compression library - runtime
    also a virtual package provided by zlib1g-udeb
  • sug: catdoc
    MS-Word to TeX or plain text converter
  • sug: pstotext
    Extract text from PostScript and PDF files
    or gs
    Transitional package
    also a virtual package provided by gs-afpl, gs-esp, gs-gpl
    or xpdf
    Portable Document Format (PDF) suite
    or xpdf-i
    Package not available

Download htdig

Download for all available architectures
Architecture              Package Size   Installed Size
alpha                       1,111.7 kB          3820 kB
amd64 (unofficial port)       986.7 kB          3108 kB
arm                           991.5 kB          3276 kB
hppa                        1,110.5 kB          3900 kB
i386                          944.5 kB          2880 kB
ia64                        1,285.8 kB          5740 kB
m68k                          939.5 kB          2896 kB
mips                        1,039.5 kB          4500 kB
mipsel                      1,030.6 kB          4496 kB
powerpc                       968.9 kB          3296 kB
s390                          950.1 kB          3324 kB
sparc                         964.9 kB          3448 kB