 Introduction

namazu2

 This article installs and configures Namazu, one of the easier site-search engines to set up.

 The examples are from FreeBSD 9.3.

 Installation and configuration

Installation

 Install it with pkg.

#pkg install namazu2
Updating FreeBSD repository catalogue...
Fetching meta.txz: 100%   968 B   1.0k/s    00:01
Fetching packagesite.txz: 100%    5 MB 229.3k/s    00:23
Processing entries: 100%
FreeBSD repository update completed. 23732 packages processed
New version of pkg detected; it needs to be installed first.
The following 1 packages will be affected (of 0 checked):

Installed packages to be UPGRADED:
        pkg: 1.4.1 -> 1.4.3

The operation will free 1 kB.
1 MB to be downloaded.

Proceed with this action? [y/N]: y
Fetching pkg-1.4.3.txz: 100%    1 MB 252.5k/s    00:08
Checking integrity... done (0 conflicting)
[1/1] Upgrading pkg from 1.4.1 to 1.4.3...
[1/1] Extracting pkg-1.4.3: 100%
Message for pkg-1.4.3:
 If you are upgrading from the old package format, first run:

  # pkg2ng
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
The following 2 packages will be affected (of 0 checked):

New packages to be INSTALLED:
        namazu2: 2.0.21_1
        p5-File-MMagic: 1.30_1

The process will require 1 MB more space.
375 kB to be downloaded.

Proceed with this action? [y/N]: y
Fetching namazu2-2.0.21_1.txz: 100%  354 kB 181.7k/s    00:02
Fetching p5-File-MMagic-1.30_1.txz: 100%   20 kB  20.6k/s    00:01
Checking integrity... done (0 conflicting)
[1/2] Installing p5-File-MMagic-1.30_1...
[1/2] Extracting p5-File-MMagic-1.30_1: 100%
[2/2] Installing namazu2-2.0.21_1...
[2/2] Extracting namazu2-2.0.21_1: 100%

That completes the installation of Namazu itself.

 For Japanese word segmentation, I decided to go back to MeCab this time, after a long while.

#pkg install ja-mecab
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
The following 1 packages will be affected (of 0 checked):

New packages to be INSTALLED:
        ja-mecab: 0.996_2

The process will require 4 MB more space.
453 kB to be downloaded.

Proceed with this action? [y/N]: y
Fetching ja-mecab-0.996_2.txz: 100%  453 kB 232.2k/s    00:02
Checking integrity... done (0 conflicting)
[1/1] Installing ja-mecab-0.996_2...
[1/1] Extracting ja-mecab-0.996_2: 100%
Message for ja-mecab-0.996_2:
 ========================================================
                     **** NOTE ****
  ipadic was split into japanese/mecab-ipadic port.
========================================================
#pkg install ja-mecab ja-mecab-ipadic
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
The following 1 packages will be affected (of 0 checked):

New packages to be INSTALLED:
        ja-mecab-ipadic: 2.7.0.20070801

The process will require 39 MB more space.
7 MB to be downloaded.

Proceed with this action? [y/N]: y
Fetching ja-mecab-ipadic-2.7.0.20070801.txz: 100%    7 MB 234.0k/s    00:35
Checking integrity... done (0 conflicting)
[1/1] Installing ja-mecab-ipadic-2.7.0.20070801...
[1/1] Extracting ja-mecab-ipadic-2.7.0.20070801: 100%

Installed. The selection of dictionaries seems to have shrunk. The IPA dictionary is enough for now, so I continue as is.

Next, a few additional packages.

#pkg install ja-p5-nkf
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
The following 1 packages will be affected (of 0 checked):

New packages to be INSTALLED:
        ja-p5-nkf: 2.1.2_1,1

The process will require 269 kB more space.
95 kB to be downloaded.

Proceed with this action? [y/N]: y
Fetching ja-p5-nkf-2.1.2_1,1.txz: 100%   95 kB  97.9k/s    00:01
Checking integrity... done (0 conflicting)
[1/1] Installing ja-p5-nkf-2.1.2_1,1...
[1/1] Extracting ja-p5-nkf-2.1.2_1,1: 100%

ja-p5-Text-MeCab does not seem to be available as a package, so I build it from ports.

# cd /usr/ports/japanese/p5-Text-MeCab
# make install clean
===>  ja-p5-Text-MeCab-0.20009_1 is marked as broken: Does not build with Perl 5.18 or above.
*** [install] Error code 1

Stop in /usr/ports/japanese/p5-Text-MeCab.

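I see, so that is why there is no package.
According to
http://search.cpan.org/dist/Text-MeCab/
the current version is 0.20016.

The test reports at
http://www.cpantesters.org/distro/T/Text-MeCab.html#Text-MeCab-0.20016
show no sign that it has worked since then, either.
To see whether the newer version would build, I edited the port's Makefile and ran make makesum, but the patch step failed.
After some more trial and error, I stopped spending time on it.
Unfortunately, I am giving up on MeCab once again and rolling back.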

#pkg delete  p5-Test-Requires-0.08_1 p5-Test-Simple-1.001.003_1  p5-ExtUtils-ParseXS-3.24_1  p5-Devel-PPPort-3.25  ja-mecab-ipadic-2.7.0.20070801  ja-mecab-0.996_2
Updating database digests format: 100%
Checking integrity... done (0 conflicting)
Deinstallation has been requested for the following 6 packages (of 0 packages in the universe):

Installed packages to be REMOVED:
        p5-Test-Requires-0.08_1
        p5-Test-Simple-1.001.003_1
        p5-ExtUtils-ParseXS-3.24_1
        p5-Devel-PPPort-3.25
        ja-mecab-ipadic-2.7.0.20070801
        ja-mecab-0.996_2

The operation will free 45 MB.

Proceed with deinstalling packages? [y/N]: y
[1/6] Deinstalling p5-Test-Requires-0.08_1...
[1/6] Deleting files for p5-Test-Requires-0.08_1: 100%
[2/6] Deinstalling p5-Test-Simple-1.001.003_1...
[2/6] Deleting files for p5-Test-Simple-1.001.003_1: 100%
[3/6] Deinstalling p5-ExtUtils-ParseXS-3.24_1...
[3/6] Deleting files for p5-ExtUtils-ParseXS-3.24_1: 100%
[4/6] Deinstalling p5-Devel-PPPort-3.25...
[4/6] Deleting files for p5-Devel-PPPort-3.25: 100%
[5/6] Deinstalling ja-mecab-ipadic-2.7.0.20070801...
[5/6] Deleting files for ja-mecab-ipadic-2.7.0.20070801: 100%
[6/6] Deinstalling ja-mecab-0.996_2...
[6/6] Deleting files for ja-mecab-0.996_2: 100%
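
The "marked as broken" stop seen in the ports build above can be detected before starting a build. The following is a minimal sketch, assuming FreeBSD's BSD make and a ports tree under /usr/ports; the helper names `port_broken_reason` and `check_port` are hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print a port's BROKEN reason, or nothing when the
# port is not marked broken. With BSD make, "make -V VAR" prints the value
# of a makefile variable without building anything.
port_broken_reason() {
    make -C "$1" -V BROKEN 2>/dev/null
}

# Return 0 when the port looks buildable, 1 when it is marked broken,
# 2 when the port directory does not exist.
check_port() {
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 2
    fi
    reason=$(port_broken_reason "$1")
    if [ -n "$reason" ]; then
        echo "broken: $reason"
        return 1
    fi
    echo "ok: $1"
}

# Example (not run here): check_port /usr/ports/japanese/p5-Text-MeCab
```

Checking first avoids pulling in build dependencies that then have to be removed again, as happened above.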

 Next, I go with p5-Text-ChaSen instead.
The port looks usable, but it no longer has a maintainer.
Then again, Namazu itself has fewer users these days.

#pkg install chasen-base p5-Text-ChaSen
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
pkg: No packages available to install matching 'chasen-base' have been found in the repositories
#pkg install ja-chasen-base ja-p5-Text-ChaSen
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
The following 2 packages will be affected (of 0 checked):

New packages to be INSTALLED:
        ja-chasen-base: 2.4.5_1
        ja-p5-Text-ChaSen: 1.03_6

The process will require 817 kB more space.
502 kB to be downloaded.

Proceed with this action? [y/N]: y
Fetching ja-chasen-base-2.4.5_1.txz: 100%  490 kB 250.9k/s    00:02
Fetching ja-p5-Text-ChaSen-1.03_6.txz: 100%   12 kB  12.3k/s    00:01
Checking integrity... done (0 conflicting)
[1/2] Installing ja-chasen-base-2.4.5_1...
[1/2] Extracting ja-chasen-base-2.4.5_1: 100%
[2/2] Installing ja-p5-Text-ChaSen-1.03_6...
[2/2] Extracting ja-p5-Text-ChaSen-1.03_6: 100%

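This one installed without trouble, once the package names were given with the ja- prefix. It has actually been quite a while since I last used ChaSen, too.
Add the dictionary as well.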

#pkg install ja-ipadic
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
The following 1 packages will be affected (of 0 checked):

New packages to be INSTALLED:
        ja-ipadic: 2.7.0_2

The process will require 37 MB more space.
5 MB to be downloaded.

Proceed with this action? [y/N]: y
Fetching ja-ipadic-2.7.0_2.txz: 100%    5 MB 233.3k/s    00:23
Checking integrity... done (0 conflicting)
[1/1] Installing ja-ipadic-2.7.0_2...
#cp -p /usr/local/share/chasen/dic/ipadic/chasenrc.sample /usr/local/etc/chasenrc

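That is about it.

Configuration

 The examples below assume that the foo account under /home has a directory named example.ne.jp (that is, /home/foo/example.ne.jp).
The CGI directory is assumed to be a subdirectory named cgi.
 Every line in the sample files is commented out, so they start out in the all-defaults state.
Copy them first, then edit the copies.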

% cp /usr/local/etc/namazu/namazurc-sample /home/foo/example.ne.jp/cgi/.namazurc
% cp /usr/local/etc/namazu/mknmzrc-sample /usr/local/etc/namazu/mknmzrc.example.ne.jp
% ln -s /usr/local/libexec/namazu.cgi /home/foo/example.ne.jp/cgi/namazu.cgi

Now edit .namazurc.
Each change is written on the line immediately after the commented-out default.

# This is a Namazu configuration file for namazu or namazu.cgi.
#
#  Originally, this file is named 'namazurc-sample'.  so you should
#  copy this to 'namazurc' to make the file effective.
#  see 'doc/ja/manual.html#namazurc' or 'doc/en/manual.html#namazurc'.
#
#  Each item is must be separated by one or more SPACE or TAB characters.
#  You can use a double-quoted string for represanting a string which
#  contains SPACE or TAB characters like "foo bar baz".


##
## Index: Specify the default directory.
##
#Index         /usr/local/var/namazu/index
Index         /var/namazu/example.ne.jp

##
## Template: Set the template directory containing
## NMZ.{head,foot,body,tips,result} files.
##
#Template      /usr/local/var/namazu/index
Template      /var/namazu/example.ne.jp

##
## Replace: Replace TARGET with REPLACEMENT in URIs in search
## results.
##
## TARGET is specified by Ruby's perl-like regular expressions.
## You can caputure sub-strings in TARGET by surrounding them
## with `(' and `)'and use them later as backreferences by
## \1, \2, \3,... \9.
##
## To use meta characters literally such as `*', `+', `?', `|',
## `[', `]', `{', `}', `(', `)', escape them with `\'.
##
## e.g.,
##
##    Replace  /home/foo/public_html/   http://www.example.jp/~foo/
##    Replace  /home/(.*)/public_html/  http://www.example.jp/\1/
##    Replace  /[Cc]\|/foo/             http://www.example.jp/
##
## If you do not want to do the processing on command line use,
## run namazu with -U option.
##
## You can specify more than one Replace rules but the only
## first-matched rule are applied.
##
#Replace       /home/foo/public_html/  http://www.example.jp/~foo/
Replace       /home/foo/example.ne.jp/ http://example.ne.jp/

##
## Logging: Set OFF to turn off keyword logging to NMZ.slog.
## Default is ON.
##
#Logging       off
Logging       on

##
## Lang: Set the locale code such as `ja_JP.eucJP', `ja_JP.SJIS',
## `de', etc.  This directive works only if the environment
## variable LANG is not set because the directive is mainly
## intended for CGI use.  On the shell, You can set
## environemtnt variable LANG instead of using the directive.
##
## If you set `de' to it, namazu.cgi use
## NMZ.(head|foot|body|tips|results).de for displaying results
## and use a proper message catalog for `de'.
##
#Lang          ja
Lang          ja

##
## Scoring: Set the scoring method "tfidf" or "simple".
##
#Scoring       tfidf
Scoring       tfidf

##
## EmphasisTags: Set the pair of html elements which is used in
## keyword emphasizing for search results.
##
#EmphasisTags  "<strong class=\"keyword\">"   "</strong>"
EmphasisTags  "<strong class=\"keyword\">"   "</strong>"

##
## MaxHit: Set the maximum number of documents which can be
## handled in query operation.  If documents matching a
## query exceed the value, they will be ignored.
##
#MaxHit 10000

##
## MaxMatch: Set the maximum number of words which can be
## handled in regex/prefix/inside/suffix query. If documents
## matching a query exceed the value, they will be ignored.
##
#MaxMatch       1000

##
## ContentType: Set "Content-Type" header output. Specify "charset".
##
## When you specify English, French, German and Spanish charset
##
#ContentType    "text/html; charset=ISO-8859-1"
##
## When you specify Polish charset
##
#ContentType    "text/html; charset=ISO-8859-2"
##
## When you specify Japanese charset by UNIX
##
#ContentType    "text/html; charset=EUC-JP"
##
## When you specify Japanese charset by Windows
##
#ContentType    "text/html; charset=Shift_JIS"
ContentType    "text/html; charset=Shift_JIS"
##
## If you want to use non-HTML template files, set it suitably.
##
#ContentType    "text/x-hdml; charset=Shift_JIS"
ContentType    "text/x-hdml; charset=Shift_JIS"
##
## Charset: "charset" of each "Lang" is defined.
## When "charset" is not included in "ContentType", "charset" of default
## of each "Lang" is output.
## Please define it by "Charset" when you use the language of the
## unsupport. (It is necessary to prepare the template and the message
## catalog.)
##
#Charset "ja" "EUC-JP"
##
#Charset "ja_JP.SJIS" "Shift_JIS"
##
#Charset "ja_JP.ISO-2022-JP" "ISO-2022-JP"
##
#Charset "fr" "ISO-8859-1"
##
#Charset "de" "ISO-8859-1"
##
#Charset "es" "ISO-8859-1"
##
#Charset "pl" "ISO-8859-2"

##
## Suicide_Time: namazu.cgi stops the process in 60 seconds by
## default.
## (Only UNIX)
##
#Suicide_Time   60

##
## Regex_Search: Set OFF to turn off regex_search.
## Default is ON.
##
#Regex_Search   off
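
The effect of the Replace directive set above can be pictured as a rewrite from a file path on disk to a public URL. The following is a minimal sketch that uses sed as a stand-in; it is not how namazu implements the directive, the function name `to_uri` is hypothetical, and anchoring the match at the start of the path is an assumption made for this illustration.

```shell
#!/bin/sh
# Illustration only: mimic the effect of
#   Replace /home/foo/example.ne.jp/ http://example.ne.jp/
# on the path of an indexed file. sed stands in for namazu's own logic.
to_uri() {
    printf '%s\n' "$1" |
        sed 's|^/home/foo/example\.ne\.jp/|http://example.ne.jp/|'
}

to_uri /home/foo/example.ne.jp/ABC.html
```

Paths that do not start with the target prefix pass through unchanged, which is a quick way to spot a Replace rule that does not match the directory actually being indexed.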

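To match this configuration, create the directory in which Namazu's index files will be generated.

# cd /var/
# mkdir -p namazu/example.ne.jp

This prepares the index location.

 The mknmz configuration file, /usr/local/etc/namazu/mknmzrc.example.ne.jp, is shown below in the same way: the sample file together with my modifications.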

#
# This is a Namazu configuration file for mknmz.
#
package conf;  # Don't remove this line!

#===================================================================
#
# Administrator's email address
#
# $ADDRESS = 'webmaster@91amd64-default-job-08.isc.freebsd.org';
$ADDRESS = '';

#===================================================================
#
# Regular Expression Patterns
#

#
# This pattern specifies HTML suffixes.
#
# $HTML_SUFFIX = "html?|[ps]html|html\\.[a-z]{2}";
$HTML_SUFFIX = "html?|[ps]html|html\\.[a-z]{2}";

#
# This pattern specifies file names which will be targeted.
# NOTE: It can be specified by --allow=regex option.
#       Do NOT use `$' or `^' anchors.
#       Case-insensitive.
#
$ALLOW_FILE =   ".*\\.(?:$HTML_SUFFIX)|.*\\.txt" . # HTML, plain text
                "|.*\\.gz|.*\\.Z|.*\\.bz2" .       # Compressed files
                "|.*\\.pdf|.*\\.ps" .              # PDF, PostScript
                "|.*\\.tex|.*\\.dvi" .             # TeX, DVI
                "|.*\\.rpm|.*\\.deb" .             # RPM, DEB
                "|.*\\.doc|.*\\.xls|.*\\.pp[st]" . # Word, Excel, PowerPoint
                "|.*\\.docx|.*\\.xlsx|.*\\.pp[st]x" . # MS-OfficeOpenXML Word, Excel, PowerPoint
                "|.*\\.vs[dst]|.*\\.v[dst]x" .     # Visio
                "|.*\\.j[sabf]w|.*\\.jtd" .        # Ichitaro 4, 5, 6, 7, 8
                "|.*\\.sx[widc]" .                 # OpenOffice Writer,Calc,Impress,Draw
                "|.*\\.od[tspg]" .                 # OpenOffice2.0
                "|.*\\.rtf" .                      # Rich Text Format
                "|.*\\.hdml|.*\\.mht" .            # HDML MHTML
                "|.*\\.mp3" .                      # MP3
                "|.*\\.gnumeric" .                 # Gnumeric
                "|.*\\.kwd|.*\\.ksp" .             # KWord, KSpread
                "|.*\\.kpr|.*\\.flw" .             # KPresenter, Kivio
                "|.*\\.eml|\\d+|[-\\w]+\\.[1-9n]"; # Mail/News, man

#
# This pattern specifies file names which will NOT be targeted.
# NOTE: It can be specified by --deny=regex option.
#       Do NOT use `$' or `^' anchors.
#       Case-insensitive.
#
$DENY_FILE = ".*\\.(gif|png|jpg|jpeg)|.*\\.tar\\.gz|core|.*\\.bak|.*~|\\..*|\x23.*|.*\\.txt|google.*\\.html";

#
# This pattern specifies DDN(DOS Device Name) which will NOT be targeted.
# NOTE: Only for Windows.
#       Do NOT use `$' or `^' anchors.
#       Case-insensitive.
#
$DENY_DDN = "con|aux|nul|prn|lpt[1-9]|com[1-9][0-9]?|clock\$|xmsxxxx0";

#
# This pattern specifies PATHNAMEs which will NOT be targeted.
# NOTE: Usually specified by --exclude=regex option.
#
# $EXCLUDE_PATH = undef;
$EXCLUDE_PATH = "/home/foo/example.ne.jp/cgi/|/home/foo/example.ne.jp/images/";

#
# This pattern specifies file names which can be omitted
# in URI.  e.g., 'index.html|index.htm|Default.html'
#
# NOTE: This is similar to Apache's "DirectoryIndex" directive.
#
# $DIRECTORY_INDEX = "";
$DIRECTORY_INDEX = "index.html";

#
# This pattern specifies Mail/News's fields in its header which
# should be searchable.  NOTE: case-insensitive
#
# $REMAIN_HEADER = "From|Date|Message-ID";

#
# This pattern specifies fields which used for field-specified
# searching.  NOTE: case-insensitive
#
#$SEARCH_FIELD = "message-id|subject|from|date|uri|newsgroups|to|summary|size";
$SEARCH_FIELD = "message-id|subject|date|uri|newsgroups|to|summary|size";

#
# This pattern specifies meta tags which used for field-specified
# searching.  NOTE: case-insensitive
#
$META_TAGS = "keywords|description";

#
# This pattern specifies aliases for NMZ.field.* files.
# NOTE: Editing NOT recommended.
#
%FIELD_ALIASES = ('title' => 'subject', 'author' => 'from');

#
# This pattern specifies HTML elements which should be replaced with
# null string when removing them. Normally, the elements are replaced
# with a single space character.
#
$NON_SEPARATION_ELEMENTS = 'A|TT|CODE|SAMP|KBD|VAR|B|STRONG|I|EM|CITE|FONT|U|'.
                        'STRIKE|BIG|SMALL|DFN|ABBR|ACRONYM|Q|SUB|SUP|SPAN|BDO';

#
# This pattern specifies attribute of a HTML tag which should be
# searchable.
#
$HTML_ATTRIBUTES = 'ALT|SUMMARY|TITLE';


#===================================================================
#
# Critical Numbers
#

#
# The max size of files which can be loaded in memory at once.
# If you have much memory, you can increase the value.
# If you have less memory, you can decrease the value.
#
$ON_MEMORY_MAX   = 5000000;

#
# The max file size for indexing. Files larger than this
# will be ignored.
# NOTE: This value is usually larger than TEXT_SIZE_MAX because
#       binary-formated files such as PDF, Word are larger.
#
$FILE_SIZE_MAX   = 2000000;

#
# The max text size for indexing. Files larger than this
# will be ignored.
#
$TEXT_SIZE_MAX   =  600000;

#
# The max length of a word. the word longer than this will be ignored.
#
# $WORD_LENG_MAX   = 128;


#
# Weights for HTML elements which are used for term weightning.
#
# %Weight =
#     (
#      'html' => {
#          'title'  => 16,
#          'h1'     => 8,
#          'h2'     => 7,
#          'h3'     => 6,
#          'h4'     => 5,
#          'h5'     => 4,
#          'h6'     => 3,
#          'a'      => 4,
#          'strong' => 2,
#          'em'     => 2,
#          'kbd'    => 2,
#          'samp'   => 2,
#          'var'    => 2,
#          'code'   => 2,
#          'cite'   => 2,
#          'abbr'   => 2,
#          'acronym'=> 2,
#          'dfn'    => 2,
#      },
#      'metakey' => 32, # for <meta name="keywords" content="foo bar">
#      'headers' => 8,  # for Mail/News' headers
# );

#
# The max length of a HTML-tagged string which can be processed for
# term weighting.
# NOTE: There are not a few people has a bad manner using
#       <h[1-6]> for changing a font size.
#
# $INVALID_LENG = 128;

#
# The max length of a field.
# This MUST be smaller than libnamazu.h's BUFSIZE (usually 1024).
#
# $MAX_FIELD_LENGTH = 200;


#===================================================================
#
# Softwares for handling a Japanese text
#

#
# Network Kanji Filter nkf v1.71 or later
#
$NKF = "module_nkf";

#
# KAKASI 2.x or later
# Text::Kakasi 1.05 or later
#
# $KAKASI = "module_kakasi";

#
# ChaSen 2.02 or later (simple wakatigaki)
# Text::ChaSen 1.03
#
$CHASEN = "module_chasen";

#
# ChaSen 2.02 or later (with noun words extraction)
#
$CHASEN_NOUN = "no";

#
# MeCab
#
# $MECAB = "module_mecab";

#
# Default Japanese processer: KAKASI or ChaSen or MeCab.
#
# $WAKATI  = $KAKASI;
$WAKATI  = $CHASEN;
# $WAKATI  = $MECAB;


#===================================================================
#
# Directories
#
# $LIBDIR = "@PERLLIBDIR@";
# $FILTERDIR = "@FILTERDIR@";
# $TEMPLATEDIR = "@TEMPLATEDIR@";
#

# 1;

That is how I set it up.
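
Patterns such as $DENY_FILE are regular expressions, not shell wildcards: `*` repeats the preceding character, so "anything" has to be written as `.*`. The following is a minimal sketch for trying file names against such a pattern from the shell. It assumes the pattern is matched case-insensitively against the whole file name, which is how the comments in the sample file read; the variable `DENY` and the function `is_denied` are hypothetical names, and grep is only a stand-in for mknmz's own matching.

```shell
#!/bin/sh
# Sketch: test file names against a DENY_FILE-style pattern.
# Assumption: the pattern must match the whole name, ignoring case.
DENY='.*\.(gif|png|jpg|jpeg)|.*\.tar\.gz|core|.*\.bak|.*~|\..*|#.*|.*\.txt|google.*\.html'

is_denied() {
    # Wrap the alternation in a group so that ^ and $ apply to all branches.
    printf '%s\n' "$1" | grep -Eiq "^(${DENY})\$"
}

for f in index.html logo.PNG notes.txt google1234abcd.html .htaccess; do
    if is_denied "$f"; then
        echo "deny  $f"
    else
        echo "index $f"
    fi
done
```

Running candidate names through a check like this before a full mknmz run makes it easier to see why a file was or was not picked up.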

 Now build the search index for the contents of the site.

#cd /var/namazu/example.ne.jp
#mknmz -c --config=/usr/local/etc/namazu/mknmzrc.example.ne.jp /home/foo/example.ne.jp
Looking for indexing files...
100 files are found to be indexed.
1/100 - /home/foo/example.ne.jp/ABC.html [text/html]
...
100/100 - /home/foo/example.ne.jp/XYZ.html [text/html]
Writing index files...
[Base]
Date:                Sat Jan  3 12:52:06 2015
Added Documents:     100
Size (bytes):        3,323,430
Total Documents:     100
Added Keywords:      61,482
Total Keywords:      61,482
Wakati:              module_chasen
Time (sec):          8
File/Sec:            25.12
System:              freebsd
Perl:                5.018004
Namazu:              2.0.21

I wrote a script to update the index.

#!/bin/csh
setenv LANG ja_JP.EUC
setenv LC_ALL ja_JP.EUC
set path = (/sbin /bin /usr/local/sbin /usr/local/bin /usr/sbin /usr/bin)
cd /usr/local/var/namazu/example.ne.jp.main
mknmz -c --config=/usr/local/etc/namazu/mknmzrc.example.ne.jp.main /home/foo/example.ne.jp

Running it produces output like this:

Looking for indexing files...
999 files are found to be indexed.
1/999 - /home/foo/example.ne.jp/123-123.html [text/html]
2/999 - /home/foo/example.ne.jp/index.html [text/html]
.
.
.
Writing index files...
[Base]
Date:                DDD MMM DD HH:MM:SS YYYY
Added Documents:     999
Size (bytes):        999,999,999
Total Documents:     999
Added Keywords:      333,444
Total Keywords:      333,444
Wakati:              module_kakasi
Time (sec):          40
File/Sec:            13.86
System:              freebsd
Perl:                5.008009
Namazu:              2.0.18

You should see output similar to the above.
 Add the script to /etc/crontab so that it runs on a schedule.

20      2       *       *       *       root    /bin/csh /usr/local/etc/namazu/namazu-example.ne.jp.csh > /dev/null 2>&1

This is the entry I used.
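
The cron entry above sends all output to /dev/null, so a failed index update leaves no trace. The following is a minimal sketch of a wrapper that appends the output to a log file instead. The names `run_logged` and `LOGFILE`, and the log path, are hypothetical and chosen only for this illustration.

```shell
#!/bin/sh
# Sketch of a cron-friendly wrapper: run a command and append its output
# and exit status to a log file.
LOGFILE="${LOGFILE:-/var/log/namazu-update.log}"

run_logged() {
    echo "== $(date '+%Y-%m-%d %H:%M:%S') start: $*" >> "$LOGFILE"
    if "$@" >> "$LOGFILE" 2>&1; then
        echo "== done" >> "$LOGFILE"
    else
        status=$?
        echo "== FAILED (exit $status)" >> "$LOGFILE"
        return "$status"
    fi
}

# Intended use (not executed here):
# run_logged /bin/csh /usr/local/etc/namazu/namazu-example.ne.jp.csh
```

With a wrapper like this, the crontab line would call the wrapper script rather than redirecting output away, and the log shows when the last update ran and whether it succeeded.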

 I then accessed it from a web browser. The page displayed:
"Full-text search system by Namazu
Currently, 100 documents are indexed and 37,543 keywords are registered."

> namazu -f /home/foo/example.ne.jp/cgi/.namazurc -C
読み込んだ設定ファイル: /home/foo/example.ne.jp/cgi/.namazurc
--
インデックス (Index):    /usr/local/var/namazu/
ログの記録 (Logging):    on
使用する言語 (Lang):     ja_JP.eucJP
スコア計算 (Scoring):    tfidf
テンプレート (Template): /usr/local/var/namazu/mknmzrc.example.ne.jp.main
ヒット件数の上限 (MaxHit):      10000
マッチする語の上限 (MaxMatch):  1000
強調タグ (EmphasisTags): <strong class="keyword">       </strong>
置換 (Replace): /home/foo/example.ne.jp/      http://example.ne.jp/

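As last time, NMZ.field.from and NMZ.field.from.i were not generated, presumably because from was removed from $SEARCH_FIELD in the mknmz configuration above.

I also created NMZ.warnlog and made it world-writable:

# touch NMZ.warnlog
# chmod 666 NMZ.warnlog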
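
A check for files that are missing from the index directory, such as the NMZ.field.from files mentioned above, can be scripted. The following is a minimal sketch; the function name `missing_index_files` is hypothetical, and the default file list is only an illustration, not a complete list of what mknmz generates.

```shell
#!/bin/sh
# Sketch: print the names of expected files that are absent from an
# index directory. Pass the names that matter for your setup after the
# directory; without them, a short illustrative default list is used.
missing_index_files() {
    dir=$1
    shift
    [ $# -gt 0 ] || set -- NMZ.i NMZ.w NMZ.field.from NMZ.field.from.i NMZ.warnlog
    for name in "$@"; do
        [ -e "$dir/$name" ] || echo "$name"
    done
}

# Example (not run here): missing_index_files /var/namazu/example.ne.jp
```

An empty result means every listed file is present; any printed name is a file to investigate after the next mknmz run.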

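Customizing the design

The stock Namazu design is rather plain, so I change it to my own design.

Footer               NMZ.foot.ja
Header               NMZ.head.ja

These two are the files I edit. I add a style sheet and lines such as the following:

<META http-equiv="Content-Type" content="text/html; charset=euc-jp" />

If the "著者" (author) entry is left in NMZ.result.normal.ja, it is displayed in the results as well.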

<dt>${namazu::counter}. <strong><a href="${uri}">${title}</a></strong> (スコア: ${namazu::score})</dt>
<!-- <dd><strong>著者</strong>: <em>${author}</em></dd> -->
<dd><strong>日付</strong>: <em>${date}</em></dd>
<dd>${summary}</dd>
<dd><a href="${uri}">${uri}</a> (${size} bytes)<br><br></dd>

Removing the line that is commented out above (the second line) solved this.

 The search form can be embedded anywhere on a page, for example:

<form method="GET" action="/cgi/namazu.cgi">
  <p align="center">検 索 
    <input type="TEXT" name="key" size="30">
    <input type="hidden" name="idxname" value="">
    <input type="SUBMIT" value="検索">
  </p>
</form>

Try adding a form like the one above.

Searching should now work. It is a very convenient tool.

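 Closing thoughts

 I set things up with ChaSen for the first time in a long while.
I had been somewhat dissatisfied with kakasi and had been meaning to switch.
After trying a handful of queries, I think the results are better than before.

[Revision history] Created: 2015/01/03. Revised: 2015/01/14
[Reference links]
Namazu full-text search system ... the official Namazu Project page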

Copyright © 1996,1997-2006,2007- by F.Kimura,