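Sphinx is a full-text search engine developed by the Russian developer Andrew Aksyonoff. It aims to give other applications fast full-text search with a small storage footprint and highly relevant results. A project of mine required Sphinx with Chinese word segmentation, so these are my notes on setting up the environment (Coreseek 3.2.14 with the mmseg segmenter).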
Install the build tools and development libraries:

[root@localhost mmseg-3.2.14]# yum -y install make gcc g++ gcc-c++ libtool autoconf automake imake
[root@localhost mmseg-3.2.14]# yum install libxml2-devel expat-devel

Unpack Coreseek and build mmseg:

[root@localhost sphinx]# tar xvf coreseek-3.2.14.tar.gz
[root@localhost sphinx]# cd coreseek-3.2.14
[root@localhost coreseek-3.2.14]# cd mmseg-3.2.14/
[root@localhost mmseg-3.2.14]# aclocal
[root@localhost mmseg-3.2.14]# libtoolize --force
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `config'.
libtoolize: linking file `config/ltmain.sh'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.in and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
[root@localhost mmseg-3.2.14]# automake --add-missing
[root@localhost mmseg-3.2.14]# autoconf
[root@localhost mmseg-3.2.14]# autoheader
[root@localhost mmseg-3.2.14]# make clean
[root@localhost mmseg-3.2.14]# ./configure --prefix=/usr/local/mmseg3
[root@localhost mmseg-3.2.14]# make && make install

Build csft (Coreseek's Sphinx) against the mmseg that was just installed:

[root@localhost coreseek-3.2.14]# cd csft-3.2.14/
[root@localhost csft-3.2.14]# sh buildconf.sh
[root@localhost csft-3.2.14]# ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
[root@localhost csft-3.2.14]# make && make install

Test the segmenter on the sample data in testpack:

[root@localhost testpack]# cat var/test/test.xml    # shows the Chinese text
[root@localhost testpack]# /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml
</x ?/x xml/x /x version/x =/x "/x 1/x ./x 0/x "/x /x encoding/x =/x "/x utf/x -/x 8/x "/x ?/x >/x
</x sphinx/x :/x docset/x >/x
/x </x sphinx/x :/x schema/x >/x
/x </x sphinx/x :/x field/x /x name/x =/x "/x subject/x "/x //x >/x
/x /x </x sphinx/x :/x field/x /x name/x =/x "/x content/x "/x //x >/x
/x </x sphinx/x :/x attr/x /x name/x =/x "/x published/x "/x /x type/x =/x "/x timestamp/x "/x //x >/x
/x </x sphinx/x :/x attr/x /x name/x =/x "/x author/x _/x id/x "/x /x type/x =/x "/x int/x "/x /x bits/x =/x "/x 16/x "/x /x default/x =/x "/x 1/x "/x //x >/x
/x </x //x sphinx/x :/x schema/x >/x
/x </x sphinx/x :/x document/x /x id/x =/x "/x 1/x "/x >/x
/x /x </x subject/x >/x 愚人/x 節/x 最佳/x 蠱惑/x 爆/x 料/x /x 谷/x 歌/x 300/x 億/x 美元/x 收購/x 百/x 度/x </x //x subject/x >/x
/x /x </x published/x >/x 1270131607/x </x //x published/x >/x
/x /x </x content/x >/x 據/x 國外/x 媒體/x 報道/x ,/x 谷/x 歌/x 將/x 巨資/x 收購/x 百/x 度/x ,/x 涉及/x 金額/x 高達/x 300/x 億/x 美元/x 。/x 谷/x 歌/x 借/x 此/x 重返/x 大陸/x 市場/x 。/x
/x /x 該/x 報道/x 稱/x ,/x 目前/x 谷/x 歌/x 與/x 百/x 度/x 已經/x 達成/x 了/x 收購/x 協議/x ,/x 將/x 擇機/x 對外/x 公佈/x 。/x 百/x 度/x 的/x 管理層/x 將/x 100/x %/x 保留/x ,/x 但/x 會/x 將/x 項目/x 縮減/x ,/x 包括/x 有/x 啊/x 商城/x ,/x 以及/x 目前/x 實施/x 不力/x 的/x 鳳/x 巢/x 計劃/x 。/x 正在/x 進行/x 測試/x 階段/x 的/x 視頻/x 網站/x qiyi/x ./x com/x 將/x 輸入/x 更/x 多/x 的/x Youtube/x 資源/x 。/x (/x YouTube/x 在/x 大陸/x 區/x 因/x 內容/x 審查/x 暫/x 不/x 能/x 訪問/x )/x 。/x

Build the test index:

[root@localhost testpack]# /usr/local/coreseek/bin/indexer -c etc/csft.conf --all
Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file 'etc/csft.conf'...
indexing index 'xml'...
collected 3 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 3 docs, 7585 bytes
total 0.008 sec, 945524 bytes/sec, 373.97 docs/sec
total 2 reads, 0.000 sec, 4.2 kb/call avg, 0.0 msec/call avg
total 7 writes, 0.000 sec, 3.1 kb/call avg, 0.0 msec/call avg

Run a test query from the command line:

[root@localhost testpack]# /usr/local/coreseek/bin/search -c etc/csft.conf 結婚的和尚未結婚的
Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file 'etc/csft.conf'...
index 'xml': query '結婚的和尚未結婚的 ': returned 0 matches of 0 total in 0.004 sec

words:
1. '結婚': 0 documents, 0 hits
2. '的': 3 documents, 83 hits
3. '和': 3 documents, 15 hits
4. '尚未': 0 documents, 0 hits

Start the search daemon in the background:

[root@localhost python]# /usr/local/coreseek/bin/searchd -c /opt/sphinx/coreseek-3.2.14/testpack/etc/csft_cjk.conf &
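
The raw mmseg -d output above is hard to read by eye, so here is a minimal Python sketch for turning it back into a token list. It assumes every token is followed by "/x" and whitespace, which is what the transcript shows; parse_mmseg_tokens is a hypothetical helper name of my own, not something that ships with mmseg or Coreseek.

```python
import re

# Assumption: each token in `mmseg -d` output is terminated by "/x" followed by
# whitespace (or end of input), as seen in the transcript above.
_TOKEN = re.compile(r"(.*?)/x(?:\s+|$)", re.S)


def parse_mmseg_tokens(output, keep_blank=False):
    """Split mmseg -d output into tokens (hypothetical helper, not part of mmseg).

    Whitespace-only tokens, which show up as a bare "/x", are dropped
    unless keep_blank is True.
    """
    tokens = _TOKEN.findall(output)
    if keep_blank:
        return tokens
    return [t for t in tokens if t.strip()]


def single_cjk_chars(tokens):
    """Return the tokens that are a single CJK ideograph."""
    return [t for t in tokens if len(t) == 1 and "\u4e00" <= t <= "\u9fff"]


if __name__ == "__main__":
    # One line copied from the transcript above.
    sample = (
        "</x subject/x >/x 愚人/x 節/x 最佳/x 蠱惑/x 爆/x 料/x /x 谷/x 歌/x "
        "300/x 億/x 美元/x 收購/x 百/x 度/x </x //x subject/x >/x"
    )
    tokens = parse_mmseg_tokens(sample)
    print(tokens)
    print(single_cjk_chars(tokens))
```

Listing the single-character tokens is a quick way to spot words the dictionary does not know: in the transcript, 谷歌 and 百度 each come out as two separate characters.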