        Get term frequencies in Lucene

                  Question

                  Is there a fast and easy way of getting term frequencies from a Lucene index, without doing it through the TermVectorFrequencies class, since that takes an awful lot of time for large collections?

                  What I mean is, is there something like TermEnum which has not just the document frequency but term frequency as well?

                  UPDATE: Using TermDocs is way too slow.

                  Recommended answer

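                  Use TermDocs to get the term frequency for a given document. As with the document frequency, you obtain the term docs from an IndexReader, using the term of interest.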

                  You won't find a faster method than TermDocs without losing some generality. TermDocs reads directly from the ".frq" file in an index segment, where each term frequency is listed in document order.
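                  The access pattern can be sketched without Lucene on the classpath. The class below is only a toy stand-in for one term's postings (document IDs in ascending order, each paired with the term's count in that document); the names `TermFrequencySketch`, `PostingsCursor`, and `totalTermFrequency` are hypothetical. In Lucene 3.x the real cursor comes from `IndexReader.termDocs(Term)` and is walked with `TermDocs.next()`, `doc()`, and `freq()`.

```java
public class TermFrequencySketch {

    /** Forward-only cursor over one term's postings, modeled on the shape of TermDocs. */
    public static final class PostingsCursor {
        private final int[] docs;   // document IDs, ascending
        private final int[] freqs;  // freqs[i] = occurrences of the term in docs[i]
        private int pos = -1;       // -1 means "before the first entry"

        public PostingsCursor(int[] docs, int[] freqs) {
            if (docs.length != freqs.length) {
                throw new IllegalArgumentException("docs and freqs must have the same length");
            }
            this.docs = docs;
            this.freqs = freqs;
        }

        /** Advances to the next document; returns false once the list is exhausted. */
        public boolean next() {
            if (pos < docs.length) {
                pos++;
            }
            return pos < docs.length;
        }

        public int doc()  { return docs[pos]; }
        public int freq() { return freqs[pos]; }
    }

    /** Sums the per-document frequencies, giving the term's count across the whole collection. */
    public static long totalTermFrequency(PostingsCursor cursor) {
        long total = 0;
        while (cursor.next()) {
            total += cursor.freq();
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical postings: the term occurs in documents 0, 3 and 7.
        PostingsCursor cursor = new PostingsCursor(new int[] {0, 3, 7}, new int[] {2, 1, 3});
        System.out.println(totalTermFrequency(cursor)); // prints 6
    }
}
```

                  One sequential pass per term is all this needs, which is why the cost is dominated by how much of the postings data has to be read from disk.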

                  If that's "too slow", make sure that you've optimized your index to merge multiple segments into a single segment. Iterate over the documents in order (skips are alright, but you can't jump back and forth in the document list efficiently).
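                  To illustrate the "in order, skips are alright" advice, here is a small self-contained model, not Lucene code. The hypothetical `skipTo` below follows the behaviour documented for `TermDocs.skipTo(int)` (repeat `next()` until the current document is at or past the target), and `freqsFor` assumes the wanted document IDs are already sorted ascending so the cursor never has to move backwards.

```java
public class ForwardSkipSketch {

    /** Forward-only cursor over one term's postings (toy model, not Lucene's class). */
    public static final class PostingsCursor {
        private final int[] docs;   // document IDs, ascending
        private final int[] freqs;  // freqs[i] = occurrences of the term in docs[i]
        private int pos = -1;

        public PostingsCursor(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }

        public boolean next() {
            if (pos < docs.length) {
                pos++;
            }
            return pos < docs.length;
        }

        /** Moves at least one entry forward, stopping at the first doc >= target. */
        public boolean skipTo(int target) {
            do {
                if (!next()) {
                    return false;
                }
            } while (target > doc());
            return true;
        }

        public int doc()  { return docs[pos]; }
        public int freq() { return freqs[pos]; }
    }

    /** Returns the term's frequency in each wanted document (0 if absent); wanted must be ascending. */
    public static int[] freqsFor(PostingsCursor cursor, int[] wanted) {
        int[] out = new int[wanted.length];
        boolean hasCurrent = false;
        boolean exhausted = false;
        for (int i = 0; i < wanted.length && !exhausted; i++) {
            // Only skip when the cursor is still behind the target; an earlier skip may have overshot.
            if (!hasCurrent || cursor.doc() < wanted[i]) {
                hasCurrent = cursor.skipTo(wanted[i]);
                exhausted = !hasCurrent;
            }
            if (hasCurrent && cursor.doc() == wanted[i]) {
                out[i] = cursor.freq();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PostingsCursor cursor = new PostingsCursor(new int[] {0, 3, 7, 9}, new int[] {2, 1, 3, 4});
        int[] result = freqsFor(cursor, new int[] {3, 5, 9});
        System.out.println(java.util.Arrays.toString(result)); // prints [1, 0, 4]
    }
}
```

                  If the wanted IDs arrive unsorted, sorting them first is usually cheaper than reopening the cursor for every backward jump.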

                  Your next step might be additional processing to create an even more specialized file structure that leaves out the SkipData. Personally I would look for a better algorithm to achieve my objective, or provide better hardware—lots of memory, either to hold a RAMDirectory, or to give to the OS for use on its own file-caching system.
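                  The answer does not say what such a specialized structure would look like. One possible shape, offered purely as an assumption, is to spend memory on a dense per-term array indexed by document ID, built once from the sorted postings. The names `DenseFrequencySketch` and `toDenseFrequencies` are hypothetical, and `maxDoc` stands for one more than the largest document ID.

```java
public class DenseFrequencySketch {

    /** Expands sorted (doc, freq) postings into an array indexed directly by document ID. */
    public static int[] toDenseFrequencies(int[] docs, int[] freqs, int maxDoc) {
        int[] dense = new int[maxDoc]; // entries default to 0, meaning the term is absent
        for (int i = 0; i < docs.length; i++) {
            dense[docs[i]] = freqs[i];
        }
        return dense;
    }

    public static void main(String[] args) {
        // Hypothetical postings: the term occurs in documents 0, 3 and 7 of a 10-document index.
        int[] dense = toDenseFrequencies(new int[] {0, 3, 7}, new int[] {2, 1, 3}, 10);
        System.out.println(dense[7]); // prints 3
        System.out.println(dense[4]); // prints 0
    }
}
```

                  This trades `maxDoc` integers per term for constant-time lookups in any order, so it only makes sense for a small set of terms that are queried repeatedly.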

