Why doesn't Lucene support any type of update to an existing document?
Problem description
My use case involves indexing a Lucene document and then, on multiple later occasions, adding terms that point to this existing document, without deleting and re-adding the entire document for each new term (for performance reasons, and because the original terms are not kept).
I do know that a document cannot be truly updated. My question is: why?
Or, more precisely, why are all forms of update (terms, stored fields) unsupported?
Why is it not possible to add another term that points to an existing document? Technically, isn't all that's needed to place the existing doc ID in the term's posting list? Why is that hard? Are there some immutable statistics in the way?
Are there any workarounds that would support my use case of adding a term (an indexed field) to an existing document?
Recommended answer
I do know that a document cannot be truly updated. My question is: why?
Gili, editing a document changes the postings of the related terms, and this is problematic because of the structure of a term's posting list. The posting list is sorted by doc ID and stored sequentially in memory. Thus, to add a document to a term's posting list, you have to give it a higher doc ID; this is done by deleting and re-indexing the entire document.
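To illustrate the point about sorted, sequential posting lists, here is a minimal toy model in plain Java. It is only a sketch, not Lucene's actual code: the class name PostingListSketch and its methods are hypothetical, and it assumes doc IDs are stored as gaps between consecutive IDs. Under that assumption, appending a higher doc ID is cheap, while slipping an older (lower) doc ID into the list would mean re-encoding everything after it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single term's posting list (hypothetical, not Lucene's implementation). */
public class PostingListSketch {
    // Gaps between consecutive doc IDs: docs [3, 7, 12] are stored as [3, 4, 5].
    private final List<Integer> gaps = new ArrayList<>();
    private int lastDocId = -1; // -1 means the list is still empty

    /** Appends a doc ID; succeeds only if it is higher than every ID already stored. */
    public boolean append(int docId) {
        if (docId <= lastDocId) {
            // An existing (lower) doc ID would have to go in the middle,
            // which would force every later gap to be rewritten.
            return false;
        }
        gaps.add(docId - Math.max(lastDocId, 0));
        lastDocId = docId;
        return true;
    }

    /** Decodes the gaps back into absolute doc IDs. */
    public List<Integer> docIds() {
        List<Integer> ids = new ArrayList<>();
        int current = 0;
        for (int gap : gaps) {
            current += gap;
            ids.add(current);
        }
        return ids;
    }

    public static void main(String[] args) {
        PostingListSketch postings = new PostingListSketch();
        postings.append(3);
        postings.append(7);
        postings.append(12);
        System.out.println(postings.docIds());   // [3, 7, 12]
        System.out.println(postings.append(5));  // false: doc 5 is older than doc 12
        System.out.println(postings.append(13)); // true: a new, higher doc ID
    }
}
```

In this toy model, the only way to get doc 5 under the term is to give it a new, higher ID, which mirrors the delete-and-re-index approach described above.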