根据sql server中的相关值对数据进行分组

data grouping according to relevant values in sql server(根据sql server中的相关值对数据进行分组)
本文介绍了根据sql server中的相关值对数据进行分组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

数据以 15 分钟为间隔:

<前>时间价值2010-01-01 00:15 32010-01-01 00:30 22010-01-01 00:45 42010-01-01 01:00 52010-01-01 01:15 12010-01-01 01:30 32010-01-01 01:45 42010-01-01 02:00 122010-01-01 02:15 132010-01-01 02:30 122010-01-01 02:45 142010-01-01 03:00 152010-01-01 03:15 32010-01-01 03:30 22010-01-01 03:45 32010-01-01 04:00 5...............2010-01-02 00:00

通常会有 96 分.

根据数值,我们可能会注意到,00:15 到 01:45 之间的值彼此接近,02:00 到 03:00 之间的值彼此接近,而从 03:15 到04:00 他们靠得很近.

基于彼此接近"规则,我希望将数据分组"为 3 部分:

  • 00:15 到 01:45
  • 02:00 至 03:00
  • 03:15 到 04:00

请注意数据可能是随机的,根据上面定义的规则可以分为 3 个以上的部分,但最多不应超过 10 个部分.并且分组必须遵循时间顺序,例如,您不能将 00:15/02:30/04:45 归为 1 组,因为这 3 个点不连续.

请给出一些如何在 t-sql 中实现它的想法.

更新:该值可能是:

<前>时间价值2010-01-01 00:15 32010-01-01 00:30 22010-01-01 00:45 42010-01-01 01:00 52010-01-01 01:15 12010-01-01 01:30 32010-01-01 01:45 42010-01-01 02:00 122010-01-01 02:15 132010-01-01 02:30 4 --突然减少2010-01-01 02:45 142010-01-01 03:00 152010-01-01 03:15 32010-01-01 03:30 22010-01-01 03:45 32010-01-01 04:00 5...............2010-01-02 00:00

对于这种情况,我们不应该把 02:30 分开分组,因为我们希望分组大小必须至少为 3 点,我们会将那个点 (02:30) 放到前一组(从 02:00 到 03:00).

解决方案

由于你的问题变化太大,这里是新问题的新答案,我只包括代码部分.

声明@t table(time datetime, value int)声明@variation浮动设置@variation = 2不计较插入@t 值('2010-01-01 00:15',3)插入@t 值('2010-01-01 00:30',2)插入@t 值('2010-01-01 00:45',4)插入@t 值('2010-01-01 01:00',5)插入@t 值('2010-01-01 01:15',1)插入@t 值('2010-01-01 01:30',3)插入@t 值('2010-01-01 01:45',4)插入@t 值('2010-01-01 02:00',52)插入@t 值('2010-01-01 02:15',5)插入@t 值('2010-01-01 02:30',52)插入@t 值('2010-01-01 02:45',54)插入@t 值('2010-01-01 03:00',55)插入@t 值('2010-01-01 03:15',3)插入@t 值('2010-01-01 03:30',2)插入@t 值('2010-01-01 03:45',3)插入@t 值('2010-01-01 04:00',5)声明@result 表(最小时间日期时间,最大时间日期时间)A:删除@result; t 为(select *, rn = row_number() over(order by time), log(value) lv from @t where datediff(day, time, '2010-01-01') = 0),作为(从 t 中选择时间、lv、rn、0 grp,其中 rn = 1联合所有选择 t1.time, a.lv, t1.rn,存在的情况(从 t t2 中选择 1 其中 t1.rn 介于 rn + 1 和 rn + 3 之间,并且lv 之间 t1.lv - @variation 和 t1.lv +@variation) 然后 grp else grp + 1 end从 t t1 加入 a ont1.rn = a.rn +1)插入@result通过 grp 从组中选择 min(time), max(time)如果@@rowcount >10开始设置@variation=@variation + .5转到一个结尾从@result中选择*

结果:

mintime maxtime2010-01-01 00:15:00.000 2010-01-01 01:45:00.0002010-01-01 02:00:00.000 2010-01-01 03:00:00.0002010-01-01 03:15:00.000 2010-01-01 04:00:00.000

the data is in 15 minute interval:

Time               Value
2010-01-01 00:15   3
2010-01-01 00:30   2
2010-01-01 00:45   4
2010-01-01 01:00   5
2010-01-01 01:15   1
2010-01-01 01:30   3
2010-01-01 01:45   4
2010-01-01 02:00   12
2010-01-01 02:15   13
2010-01-01 02:30   12
2010-01-01 02:45   14
2010-01-01 03:00   15
2010-01-01 03:15   3
2010-01-01 03:30   2
2010-01-01 03:45   3
2010-01-01 04:00   5
..........
..........
..........
2010-01-02 00:00

Typically there will be 96 points.

According to the values, we may notice that the values from 00:15 to 01:45 are close to each other, and from 02:00 to 03:00 they are close to each other, and from 03:15 to 04:00 they are close to each other.

Based on the "close to each other" rule, I want the data to be "grouped" into 3 parts:

  • 00:15 to 01:45
  • 02:00 to 03:00
  • 03:15 to 04:00

Please consider that the data could be random, and could be grouped into more than 3 parts according to the rule defined above, but maximum should not be more than 10 parts. And the grouping must honor the time sequence, for example, you cannot just put 00:15/02:30/04:45 into 1 group because these 3 points are NOT consecutive.

Please give some thoughts how to implement it in t-sql.

updated: The value could be:

Time               Value
2010-01-01 00:15   3
2010-01-01 00:30   2
2010-01-01 00:45   4
2010-01-01 01:00   5
2010-01-01 01:15   1
2010-01-01 01:30   3
2010-01-01 01:45   4
2010-01-01 02:00   12
2010-01-01 02:15   13
2010-01-01 02:30   4  --suddenly decreased
2010-01-01 02:45   14
2010-01-01 03:00   15
2010-01-01 03:15   3
2010-01-01 03:30   2
2010-01-01 03:45   3
2010-01-01 04:00   5
..........
..........
..........
2010-01-02 00:00

for these kinds of situation, we should not group 02:30 separately, because we want the group size has to be at least 3 points, and we will put that point (02:30) to the previous group (from 02:00 to 03:00).

解决方案

Since your question changed so much, here is a new answer to the new question, I only included the code part.

declare @t table(time datetime, value int)
declare @variation float
set @variation = 2
set nocount on

insert @t values('2010-01-01 00:15',3)
insert @t values('2010-01-01 00:30',2)
insert @t values('2010-01-01 00:45',4)
insert @t values('2010-01-01 01:00',5)
insert @t values('2010-01-01 01:15',1)
insert @t values('2010-01-01 01:30',3)
insert @t values('2010-01-01 01:45',4)
insert @t values('2010-01-01 02:00',52)
insert @t values('2010-01-01 02:15',5)
insert @t values('2010-01-01 02:30',52)
insert @t values('2010-01-01 02:45',54)
insert @t values('2010-01-01 03:00',55)
insert @t values('2010-01-01 03:15',3)
insert @t values('2010-01-01 03:30',2)
insert @t values('2010-01-01 03:45',3)
insert @t values('2010-01-01 04:00',5)


declare @result table(mintime datetime, maxtime datetime)
a:
delete @result

;with t as
(
select *, rn = row_number() over(order by time), log(value) lv from @t where datediff(day, time, '2010-01-01') = 0
), a as
(
select time, lv, rn, 0 grp from t where rn = 1
union all
select t1.time, a.lv, t1.rn, 
case when exists (select 1 from t t2 where t1.rn between rn + 1 and rn + 3 and 
lv between t1.lv - @variation and t1.lv +@variation) then grp else grp + 1 end
from t t1 join a on 
t1.rn = a.rn +1
)
insert @result
select min(time), max(time) from a group by grp

if @@rowcount > 10 
begin 
    set @variation=@variation + .5 
    goto a 
end

select * from @result

Result:

mintime                     maxtime
2010-01-01 00:15:00.000     2010-01-01 01:45:00.000
2010-01-01 02:00:00.000     2010-01-01 03:00:00.000
2010-01-01 03:15:00.000     2010-01-01 04:00:00.000

这篇关于根据sql server中的相关值对数据进行分组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

本站部分内容来源互联网,如果有图片或者内容侵犯了您的权益,请联系我们,我们会在确认后第一时间进行删除!

相关文档推荐

ibtmp1是非压缩的innodb临时表的独立表空间,通过innodb_temp_data_file_path参数指定文件的路径,文件名和大小,默认配置为ibtmp1:12M:autoextend,也就是说在文件系统磁盘足够的情况下,这个文件大小是可以无限增长的。 为了避免ibtmp1文件无止境的暴涨导致
SQL query to group by day(按天分组的 SQL 查询)
What does SQL clause quot;GROUP BY 1quot; mean?(SQL 子句“GROUP BY 1是什么意思?意思是?)
MySQL groupwise MAX() returns unexpected results(MySQL groupwise MAX() 返回意外结果)
MySQL SELECT most frequent by group(MySQL SELECT 按组最频繁)
Include missing months in Group By query(在 Group By 查询中包含缺失的月份)