How do I perform nested aggregation on multiple fields in Solr?

2023-06-28 Java development questions

Question

I am trying to aggregate search results (count and sum), grouping by several fields in a nested fashion.

For example, with the schema shown at the end of this post, I'd like to get the sum of "size" grouped by "category" and then sub-grouped by "subcategory", producing something like this:

<category name="X">
  <subcategory name="X_A">
    <size sum="..." />
  </subcategory>
  <subcategory name="X_B">
    <size sum="..." />
  </subcategory>
</category>
....

I've been looking primarily at Solr's Stats component which, as far as I can see, doesn't allow nested aggregation.

I'd appreciate it if anyone knows of a way to implement this, with or without the Stats component.

Here is a cut-down version of the target schema:

<types>
  <fieldType name="string" class="solr.StrField" />
  <fieldType name="text" class="solr.TextField">
    <analyzer><tokenizer class="solr.StandardTokenizerFactory" /></analyzer>
  </fieldType>
  <fieldType name="date" class="solr.DateField" />
  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
</types>

<fields>
  <field name="id" type="string" indexed="true" stored="true" />
  <field name="category" type="text" indexed="true" stored="true" />
  <field name="subcategory" type="text" indexed="true" stored="true" />
  <field name="pdate" type="date" indexed="true" stored="true" />
  <field name="size" type="int" indexed="true" stored="true" />
</fields>

Recommended answer

The new faceting module in Solr 5.1 can do this; it was added in https://issues.apache.org/jira/browse/SOLR-7214

Here is how you would add sum(size) to every facet bucket and sort the buckets in descending order by that statistic.

json.facet={
  categories:{terms:{
    field:category,
    sort:"total_size desc",  // this will sort the facet buckets by your stat 
    facet:{
      total_size:"sum(size)"  // this calculates the stat per bucket
    }
  }}
}

And this is how you would add a subfacet on subcategory:

json.facet={
  categories:{terms:{
    field:category,
    sort:"total_size desc",
    facet:{
      total_size:"sum(size)",
      subcat:{terms:{ // this will facet on the subcategory field for each bucket
        field:subcategory,
        facet:{
          sz:"sum(size)"  // this calculates the sum per sub-cat bucket
        }
      }}
    }
  }}
}
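
As a rough illustration of how a client might send the nested facet above, here is a minimal Python sketch that builds the same structure as a dictionary and serializes it into the json.facet request parameter. The endpoint URL, the core name "mycore", and the helper names build_nested_facet and build_query_url are assumptions made for this example, not part of the original answer; the sketch only builds the URL and does not contact a server.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: replace host, port and core name with your own.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_nested_facet(outer_field, inner_field, stat_field):
    """Build a terms facet on outer_field with a sum per bucket and a
    nested terms facet on inner_field that carries its own sum."""
    stat = f"sum({stat_field})"
    return {
        "categories": {"terms": {
            "field": outer_field,
            "sort": "total_size desc",  # order outer buckets by the stat
            "facet": {
                "total_size": stat,     # sum per outer bucket
                "subcat": {"terms": {
                    "field": inner_field,
                    "facet": {"sz": stat},  # sum per nested bucket
                }},
            },
        }}
    }


def build_query_url(facet, query="*:*"):
    """Return a select URL that asks for facets only (rows=0)."""
    params = {
        "q": query,
        "rows": 0,
        "wt": "json",
        "json.facet": json.dumps(facet),  # strict JSON is accepted too
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


facet = build_nested_facet("category", "subcategory", "size")
url = build_query_url(facet)
print(url)
```

Building the facet as a dictionary and letting json.dumps serialize it avoids hand-balancing braces, which is easy to get wrong once facets are nested two or three levels deep.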

So the above will give you the sum(size) at both the category and subcategory levels. Documentation for the new facet module is currently at http://yonik.com/json-facet-api/
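
To get from the facet response back to the category/subcategory/sum shape asked for in the question, the nested buckets can be walked on the client. The sketch below assumes the response layout I understand the JSON Facet API to use (a top-level "facets" object whose named facets each hold a "buckets" list with "val" and "count" entries); it is worth checking against the output of your own Solr version. The function name flatten_buckets and the sample values are made up purely for illustration.

```python
def flatten_buckets(response):
    """Yield (category, subcategory, size_sum) rows from a parsed
    JSON Facet response that uses the facet names from the request
    ("categories", "subcat", "sz")."""
    categories = response.get("facets", {}).get("categories", {})
    for cat in categories.get("buckets", []):
        for sub in cat.get("subcat", {}).get("buckets", []):
            yield cat["val"], sub["val"], sub.get("sz", 0)


# Made-up sample shaped like a facet response; the values are illustrative only.
sample = {
    "facets": {
        "count": 5,
        "categories": {
            "buckets": [
                {"val": "X", "count": 3, "total_size": 60,
                 "subcat": {"buckets": [
                     {"val": "X_A", "count": 2, "sz": 40},
                     {"val": "X_B", "count": 1, "sz": 20},
                 ]}},
                {"val": "Y", "count": 2, "total_size": 15,
                 "subcat": {"buckets": [
                     {"val": "Y_A", "count": 2, "sz": 15},
                 ]}},
            ]
        },
    }
}

rows = list(flatten_buckets(sample))
for row in rows:
    print(row)
```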
