Understanding Solr Configuration Parameters

lhc0

2008-10-09

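• dataDir parameter

Specifies an alternative directory for the index data in place of the default (./data). If replication is in use, this value should match the replication configuration. If the path is not absolute, it is resolved relative to the current working directory of the servlet container.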
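As an illustration, a dataDir entry in solrconfig.xml might look like the following. The path /var/solr/data is a hypothetical location chosen for this sketch, not a value taken from the configuration discussed here.

```xml
<!-- Hypothetical absolute path: index files are stored here instead of ./data.
     A relative value such as "data" would instead be resolved against the
     servlet container's current working directory. -->
<dataDir>/var/solr/data</dataDir>
```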

• mainIndex parameters

<mainIndex>

    <!-- lucene options specific to the main on-disk lucene index -->

    <useCompoundFile>false</useCompoundFile>

    <mergeFactor>10</mergeFactor>

    <maxBufferedDocs>1000</maxBufferedDocs>

    <maxMergeDocs>2147483647</maxMergeDocs>

    <maxFieldLength>10000</maxFieldLength>

  </mainIndex>

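[mergeFactor] Determines how many segments of equal size accumulate before they are merged. With a value of 10, every time 1000 (maxBufferedDocs) documents have been added to the index (they may be buffered in memory), a new segment is written to disk. When the 10th segment of that size is created, the 10 segments are merged into a single segment of 10,000 (10 × 1000) documents. Likewise, when the 10th segment of 10,000 documents is created, those segments are merged into a still larger one. The merging does not continue indefinitely, because the next parameter limits it.

[maxMergeDocs] The upper limit on the number of documents a single segment may contain.

[maxFieldLength] The maximum number of terms indexed for each field; terms beyond this limit are not indexed.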
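The merge cascade described above can be traced in the comments of the sketch below, which also shows a variant with a larger mergeFactor. The value 25 is an assumption chosen for illustration. A higher mergeFactor generally speeds up indexing at the cost of leaving more segments to search, so a suitable number depends on the workload.

```xml
<mainIndex>
  <!-- With mergeFactor=10 and maxBufferedDocs=1000 (the values discussed above):
         1,000 buffered docs      -> 1 segment written to disk
         10 x 1,000-doc segments  -> merged into 1 segment of 10,000 docs
         10 x 10,000-doc segments -> merged into 1 segment of 100,000 docs

       Illustrative variant (assumed value): with mergeFactor=25 the same
       cascade merges 25 segments at a time, giving 25,000 and then 625,000 docs. -->
  <useCompoundFile>false</useCompoundFile>
  <mergeFactor>25</mergeFactor>
  <maxBufferedDocs>1000</maxBufferedDocs>
  <maxMergeDocs>2147483647</maxMergeDocs>
  <maxFieldLength>10000</maxFieldLength>
</mainIndex>
```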

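• Update Handler parameters

This section mostly holds low-level configuration for how updates are handled internally (not to be confused with the high-level configuration of the Request Handlers that process updates sent by clients).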

<updateHandler class="solr.DirectUpdateHandler2">

    <!-- Limit the number of deletions Solr will buffer during doc updating.

        Setting this lower can help bound memory use during indexing.

-->

    <maxPendingDeletes>100000</maxPendingDeletes>

    <!-- autocommit pending docs if certain criteria are met.  Future versions may expand the available

     criteria -->

    <autoCommit>

      <maxDocs>10000</maxDocs> <!-- maximum uncommitted docs before autocommit triggered -->

      <maxTime>86000</maxTime> <!-- maximum time (in MS) after adding a doc before an autocommit is triggered -->

    </autoCommit>

• "Update" Related Event Listeners

Listeners can be registered for specific update-related events ("postCommit" and "postOptimize"). A listener can trigger arbitrary code; the typical use is taking snapshots.

...

    <!-- The RunExecutableListener executes an external command.

         exe  - the name of the executable to run

         dir  -  dir to use as the current working directory. default="."

         wait - the calling thread waits until the executable returns.

                default="true"

         args - the arguments to pass to the program.  default=nothing

         env  - environment variables to set.  default=nothing

-->

    <!-- A postCommit event is fired after every commit

-->

    <listener event="postCommit" class="solr.RunExecutableListener">

      <str name="exe">snapshooter</str>

      <str name="dir">solr/bin</str>

      <bool name="wait">true</bool>

      <!--

      <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>

      <arr name="env"> <str>MYVAR=val1</str> </arr>

-->

    </listener>

  </updateHandler>
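A variation on the update handler above is sketched here, assuming snapshots are wanted only after an optimize and that commits should happen more often. The maxDocs and maxTime values are assumptions for illustration; the listener reuses the snapshooter settings from the example above.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Assumed values: commit once 5,000 docs are pending or 60 seconds
       after a doc is added, whichever happens first -->
  <autoCommit>
    <maxDocs>5000</maxDocs>
    <maxTime>60000</maxTime> <!-- milliseconds -->
  </autoCommit>

  <!-- A postOptimize event is fired after an optimize, so the snapshot
       is taken less often than with postCommit -->
  <listener event="postOptimize" class="solr.RunExecutableListener">
    <str name="exe">snapshooter</str>
    <str name="dir">solr/bin</str>
    <bool name="wait">true</bool>
  </listener>
</updateHandler>
```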

• The Query Section

Controls everything related to queries.

<query>

    <!-- Maximum number of clauses in a boolean query... can affect range

         or wildcard queries that expand to big boolean queries.

         An exception is thrown if exceeded.

-->

    <maxBooleanClauses>1024</maxBooleanClauses>

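• Caching Section

This is where you need to adjust the configuration as your index grows or changes. More detail on cache configuration is available on the SolrCaching page of the Solr wiki.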

<!-- Cache used by SolrIndexSearcher for filters (DocSets),

         unordered sets of *all* documents that match a query.

         When a new searcher is opened, its caches may be prepopulated

         or "autowarmed" using data from caches in the old searcher.

         autowarmCount is the number of items to prepopulate.  For LRUCache,

         the autowarmed items will be the most recently accessed items.

       Parameters:

         class - the SolrCache implementation (currently only LRUCache)

         size - the maximum number of entries in the cache

         initialSize - the initial capacity (number of entries) of

           the cache.  (see java.util.HashMap)

         autowarmCount - the number of entries to prepopulate from

           an old cache.

-->

    <filterCache

      class="solr.LRUCache"

      size="512"

      initialSize="512"

      autowarmCount="256"/>

   <!-- queryResultCache caches results of searches - ordered lists of

         document ids (DocList) based on a query, a sort, and the range

         of documents requested.  -->

    <queryResultCache

      class="solr.LRUCache"

      size="512"

      initialSize="512"

      autowarmCount="256"/>

  <!-- documentCache caches Lucene Document objects (the stored fields for each document).

       Since Lucene internal document ids are transient, this cache will not be autowarmed.  -->

    <documentCache

      class="solr.LRUCache"

      size="512"

      initialSize="512"

      autowarmCount="0"/>

    <!-- Example of a generic cache.  These caches may be accessed by name

         through SolrIndexSearcher.getCache().cacheLookup(), and cacheInsert().

         The purpose is to enable easy caching of user/application level data.

         The regenerator argument should be specified as an implementation

         of solr.search.CacheRegenerator if autowarming is desired.  -->

    <!--

    <cache name="myUserCache"

      class="solr.LRUCache"

      size="4096"

      initialSize="1024"

      autowarmCount="1024"

      regenerator="org.mycompany.mypackage.MyRegenerator"

/>

-->

    <!-- An optimization that attempts to use a filter to satisfy a search.

         If the requested sort does not include a score, then the filterCache

         will be checked for a filter matching the query.  If found, the filter

         will be used as the source of document ids, and then the sort will be

         applied to that.

-->

    <useFilterForSortedQuery>true</useFilterForSortedQuery>

    <!-- An optimization for use with the queryResultCache.  When a search

         is requested, a superset of the requested number of document ids

         are collected.  For example, if a search for a particular query

         requests matching documents 10 through 19, and queryResultWindowSize is 50,

         then documents 0 through 49 will be collected and cached. Any further

         requests in that range can be satisfied via the cache.

-->

    <queryResultWindowSize>50</queryResultWindowSize>

   <!-- This entry enables an int hash representation for filters (DocSets)

         when the number of items in the set is less than maxSize. For smaller

         sets, this representation is more memory efficient, more efficient to

         iterate over, and faster to take intersections.

-->

    <HashDocSet maxSize="3000" loadFactor="0.75"/>

    <!-- boolToFilterOptimizer converts boolean clauses with zero boost

         into cached filters if the number of docs selected by the clause exceeds the

         threshold (represented as a fraction of the total index)

-->

    <boolTofilterOptimizer enabled="true" cacheSize="32" threshold=".05"/>

    <!-- Lazy field loading will attempt to read only parts of documents on disk that are

         requested.  Enabling should be faster if you aren't retrieving all stored fields.

-->

    <enableLazyFieldLoading>false</enableLazyFieldLoading>
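As a rough sketch of how these settings might be adjusted for a larger index with many distinct filter queries, consider the fragment below. All sizes are assumptions for illustration and are not recommendations from the original configuration. Larger autowarmCount values also lengthen the time needed to open a new searcher.

```xml
<!-- Assumed sizes for a larger index; tune against observed cache hit ratios -->
<filterCache
  class="solr.LRUCache"
  size="4096"
  initialSize="1024"
  autowarmCount="512"/>

<queryResultCache
  class="solr.LRUCache"
  size="2048"
  initialSize="512"
  autowarmCount="256"/>

<!-- Read only the requested stored fields from disk -->
<enableLazyFieldLoading>true</enableLazyFieldLoading>
```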

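• "Query" Related Event Listeners

Listeners for specific query-related events are defined here. A listener can run whatever code is needed, for example firing common queries to warm the caches.

[newSearcher] Fired whenever a new searcher is being prepared while a registered searcher already exists. The listener in the example below is of this kind: it takes a list of queries and sends them to the new searcher in order to warm it.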

<!-- a newSearcher event is fired whenever a new searcher is being

         prepared and there is a current searcher handling requests

         (aka registered).

-->

    <!-- QuerySenderListener takes an array of NamedList and

         executes a local query request for each NamedList in sequence.

-->

    <!--

    <listener event="newSearcher" class="solr.QuerySenderListener">

      <arr name="queries">

        <lst> <str name="q">solr</str>

              <str name="start">0</str>

              <str name="rows">10</str>

        </lst>

        <lst> <str name="q">rocks</str>

              <str name="start">0</str>

              <str name="rows">10</str>

        </lst>

      </arr>

    </listener>

-->
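With the comment markers removed, an active newSearcher listener might look like the sketch below. The sort on a "price" field is a hypothetical addition that assumes such a field exists in the schema; it shows that each list entry can carry ordinary request parameters.

```xml
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- Hypothetical warming query: sorting on an assumed "price" field
         loads the sort data before the new searcher takes traffic -->
    <lst> <str name="q">solr</str>
          <str name="start">0</str>
          <str name="rows">10</str>
          <str name="sort">price asc</str>
    </lst>
  </arr>
</listener>
```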

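[firstSearcher]

Fired whenever a new searcher is being prepared and no registered searcher exists. In the example below, the listener takes a list of queries and sends them to the searcher being started in order to warm it. (Note that auto-warming is only possible when a registered searcher already exists.)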

<!-- a firstSearcher event is fired whenever a new searcher is being

         prepared but there is no current registered searcher to handle

         requests or to gain prewarming data from.

-->

    <!--

    <listener event="firstSearcher" class="solr.QuerySenderListener">

      <arr name="queries">

        <lst> <str name="q">fast_warm</str>

              <str name="start">0</str>

              <str name="rows">10</str>

        </lst>

      </arr>

    </listener>

-->
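Because no old caches are available to autowarm from at startup, a firstSearcher listener is a natural place for explicit warming queries. The sketch below enables the listener and adds a filter query; the "inStock" field is an assumption about the schema, used only for illustration.

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst> <str name="q">fast_warm</str>
          <str name="start">0</str>
          <str name="rows">10</str>
          <!-- Hypothetical filter on an assumed "inStock" field,
               so the filterCache has an entry before the first request -->
          <str name="fq">inStock:true</str>
    </lst>
  </arr>
</listener>
```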
