Memo: Key points for setting up Sphinx search on phpBB3

寂默心流


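  For the steps, see the article "Chinese Full-Text Search" (《中文全文搜索》) by a moderator of the phpBB Simplified Chinese support community (https://www.phpbbchinese.com/viewtopic.php?f=32&t=1203). A few notes for my own reference: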

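  1. For Alibaba Cloud users: the IP in the last section of /etc/sphinxsearch/sphinx.conf must not be the public IP that Alibaba Cloud assigns; use the private network IP instead. On the advice of the moderator who wrote the original article, simply using localhost works.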

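  2. For Alibaba Cloud users: opening the Sphinx port in the security group rules is not enough. If UFW is in use, the port must also be opened with the ufw command.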
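
  For note 2, a minimal sketch of the UFW side. The port 9312 is an assumption (it is Sphinx's usual searchd default); use whatever port your own sphinx.conf listens on. The function name open_sphinx_port is made up for illustration.

```shell
# Sketch: open the Sphinx searchd port in UFW.
# ASSUMPTION: 9312 is the searchd port; check the "listen" line in your sphinx.conf.
open_sphinx_port() {
    port="${1:-9312}"
    # Refuse anything that is not a plain number before touching the firewall.
    case "$port" in
        ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "port out of range: $port" >&2
        return 1
    fi
    # Allow inbound TCP on that port, then print the rule list to confirm.
    sudo ufw allow "${port}/tcp" && sudo ufw status
}

# Usage (changes the firewall, so run it deliberately):
#   open_sphinx_port 9312
```

  The Alibaba Cloud security group rule still has to be added separately in the console; UFW only covers the firewall on the server itself.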

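  3. In /etc/sphinxsearch/sphinx.conf, do as the moderator says and replace only the forum database's username and password. Do not get clever and create a separate database just for Sphinx.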

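  4. Building the index manually produces a warning. It looks harmless, but it seriously hurts search results, to the point that search is basically crippled. So add the last two lines shown here (ngram_len and ngram_chars, which make Sphinx index CJK characters one character at a time) to the index section of /etc/sphinxsearch/sphinx.conf: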
index index_phpbb_5evtvcsxiqi0djrd_main
{
path = /*****/sphinx/index_phpbb_5evtvcsxiqi0djrd_main
source = source_phpbb_5evtvcsxiqi0djrd_main
docinfo = extern
morphology = none
stopwords =
wordforms = # optional, specify path to wordforms file. See ./docs/sphinx_wordforms.txt for example
exceptions = # optional, specify path to exceptions file. See ./docs/sphinx_exceptions.txt for example
min_word_len = 2
charset_table = U+FF10..U+FF19->0..9, 0..9, U+FF41..U+FF5A->a..z, U+FF21..U+FF3A->a..z, A..Z->a..z, a..z, U+0149, U+017F, U+0138, U+00DF, U+00FF, U+00C0..U+00D6->U+00$
ignore_chars = U+0027, U+002C
min_prefix_len = 3 # Minimum number of characters for wildcard searches by prefix (min 1). Default is 3. If specified, set min_infix_len to 0
min_infix_len = 0 # Minimum number of characters for wildcard searches by infix (min 2). If specified, set min_prefix_len to 0
html_strip = 1
index_exact_words = 0 # Set to 1 to enable exact search operator. Requires wordforms or morphology
blend_chars = U+23, U+24, U+25, U+26, U+40
#####################
ngram_len = 1
ngram_chars = U+3000..U+2FA1F
}
  5. When updating the index manually, add one more switch (--rotate, which is needed while searchd is running):

Code:

sudo indexer --all --rotate
  6. Change the following option in /etc/default/sphinxsearch:

Code:

START=yes
Careful: never put a space before yes!
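
  For note 6, a quick way to check the line. This is only a sketch: it assumes the file is read by a shell script, as /etc/default files usually are, in which case a line like "START= yes" would not assign yes to START at all. The function name check_sphinx_start is made up for illustration.

```shell
# Sketch: confirm the defaults file contains the exact line START=yes.
# ASSUMPTION: the path defaults to the one named in the post.
check_sphinx_start() {
    file="${1:-/etc/default/sphinxsearch}"
    # -x matches whole lines only, so "START= yes" or "START =yes" will not pass.
    if grep -qx 'START=yes' "$file"; then
        echo "ok: START=yes found in $file"
    else
        echo "problem: no exact START=yes line in $file" >&2
        return 1
    fi
}

# Usage:
#   check_sphinx_start
#   check_sphinx_start /path/to/another/defaults/file
```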