
        60 million entries, select entries from a certain month. How to optimize the database?


                  Problem description


                  I have a database with 60 million entries.

                  Each entry contains:

                  • ID
                  • DataSourceID
                  • Some data
                  • DateTime

                  1. I need to select entries from a certain month. Each month contains approximately 2 million entries.

                   select * 
                     from Entries 
                    where time between "2010-04-01 00:00:00" and "2010-05-01 00:00:00"
                  

                  (The query takes approximately 1.5 minutes.)

                  I'd also like to select data from a certain month for a given DataSourceID. (This takes approximately 20 seconds.)
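
                  The question does not show this second query. A minimal sketch of what it presumably looks like, reusing the `Entries` table and `time` column from the query above and assuming the filter column is named `DataSourceID`; the id value 42 is made up for illustration. It uses a half-open range because `between` is inclusive at both ends, so the original predicate also matches rows stamped exactly 2010-05-01 00:00:00:

                  ```sql
                  -- Hypothetical per-datasource variant of the monthly query.
                  -- 42 is an illustrative id, not a value from the question.
                  select *
                    from Entries
                   where DataSourceID = 42
                     and time >= '2010-04-01 00:00:00'
                     and time <  '2010-05-01 00:00:00';
                  ```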

                  There are about 50-100 different DataSourceIDs.

                  Is there a way to make this faster? What are my options? How can I optimize this database/query?

                  There are approx. 60-100 inserts PER second!

                  Recommended answer

                  Take advantage of InnoDB clustered primary key indexes.

                  http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html

                  This would be very performant:

                  create table datasources
                  (
                  year_id smallint unsigned not null,
                  month_id tinyint unsigned not null,
                  datasource_id tinyint unsigned not null,
                  id int unsigned not null, -- needed for uniqueness
                  data int unsigned not null default 0,
                  primary key (year_id, month_id, datasource_id, id)
                  )
                  engine=innodb;
                  
                  select * from datasources where year_id = 2011 and month_id between 1 and 3;
                  
                  select * from datasources where year_id = 2011 and month_id = 4 and datasource_id = 100;
                  
                  -- etc..
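
                  The schema above stores the year and month as separate key columns rather than a single datetime. As a rough sketch of how the asker's data might map onto it, the key columns can be derived with MySQL's built-in `year()` and `month()` functions. The column names `ID`, `DataSourceID`, `SomeData` and `time` on `Entries` are assumptions pieced together from the question, not a confirmed schema, and `SomeData` is assumed to fit the `int unsigned` column:

                  ```sql
                  -- Hypothetical one-off copy from the asker's table into the new schema.
                  -- Column names on Entries are assumed; adjust them to the real table.
                  -- Note: datasource_id is tinyint unsigned, so ids must not exceed 255.
                  insert into datasources (year_id, month_id, datasource_id, id, data)
                  select year(e.time), month(e.time), e.DataSourceID, e.ID, e.SomeData
                  from Entries e;

                  -- The April 2010 query from the question, expressed against the key columns.
                  select * from datasources where year_id = 2010 and month_id = 4;
                  ```

                  Because the rows are clustered on (year_id, month_id, datasource_id, id), a month's worth of data sits together in the primary key, which is what the range queries above rely on. If the exact timestamp is still needed, it could presumably be kept as an additional non-key column.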
                  

                  EDIT 2

                  Forgot I was running the first test script with 3 months of data. Here are the results for a single month: 0.34 and 0.69 seconds.

                  select d.* from datasources d where d.year_id = 2010 and d.month_id = 3 and datasource_id = 100 order by d.id desc limit 10;
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           100 | 3290330 | 38434 |
                  |    2010 |        3 |           100 | 3290329 |  9988 |
                  |    2010 |        3 |           100 | 3290328 | 25680 |
                  |    2010 |        3 |           100 | 3290327 | 17627 |
                  |    2010 |        3 |           100 | 3290326 | 64508 |
                  |    2010 |        3 |           100 | 3290325 | 14257 |
                  |    2010 |        3 |           100 | 3290324 | 45950 |
                  |    2010 |        3 |           100 | 3290323 | 49986 |
                  |    2010 |        3 |           100 | 3290322 |  2459 |
                  |    2010 |        3 |           100 | 3290321 | 52971 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (0.34 sec)
                  
                  select d.* from datasources d where d.year_id = 2010 and d.month_id = 3 order by d.id desc limit 10;
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           116 | 3450346 | 42455 |
                  |    2010 |        3 |           116 | 3450345 | 64039 |
                  |    2010 |        3 |           116 | 3450344 | 27046 |
                  |    2010 |        3 |           116 | 3450343 | 23730 |
                  |    2010 |        3 |           116 | 3450342 | 52380 |
                  |    2010 |        3 |           116 | 3450341 | 35700 |
                  |    2010 |        3 |           116 | 3450340 | 20195 |
                  |    2010 |        3 |           116 | 3450339 | 21758 |
                  |    2010 |        3 |           116 | 3450338 | 51378 |
                  |    2010 |        3 |           116 | 3450337 | 34687 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (0.69 sec)
                  

                  EDIT 1

                  Decided to test the above schema with approx. 60 million rows spread over 3 years. Each query is run cold, i.e. each is run separately, after which MySQL is restarted to clear any buffers, and with no query caching.

                  The full test script can be found here: http://pastie.org/1723506 or below...

                  As you can see, it's a pretty performant schema even on my humble desktop :)

                  select count(*) from datasources;
                  +----------+
                  | count(*) |
                  +----------+
                  | 60306030 |
                  +----------+
                  
                  select count(*) from datasources where year_id = 2010;
                  +----------+
                  | count(*) |
                  +----------+
                  | 16691669 |
                  +----------+
                  
                  select
                   year_id, month_id, count(*) as counter
                  from
                   datasources
                  where 
                   year_id = 2010
                  group by
                   year_id, month_id;
                  +---------+----------+---------+
                  | year_id | month_id | counter |
                  +---------+----------+---------+
                  |    2010 |        1 | 1080108 |
                  |    2010 |        2 | 1210121 |
                  |    2010 |        3 | 1160116 |
                  |    2010 |        4 | 1300130 |
                  |    2010 |        5 | 1860186 |
                  |    2010 |        6 | 1220122 |
                  |    2010 |        7 | 1250125 |
                  |    2010 |        8 | 1460146 |
                  |    2010 |        9 | 1730173 |
                  |    2010 |       10 | 1490149 |
                  |    2010 |       11 | 1570157 |
                  |    2010 |       12 | 1360136 |
                  +---------+----------+---------+
                  12 rows in set (5.92 sec)
                  
                  
                  select 
                   count(*) as counter
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100;
                  
                  +---------+
                  | counter |
                  +---------+
                  |   30003 |
                  +---------+
                  1 row in set (1.04 sec)
                  
                  explain
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100
                  order by
                   d.id desc limit 10;
                  
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  | id | select_type | table | type  | possible_keys | key     | key_len | ref  |rows    | Extra                       |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  |  1 | SIMPLE      | d     | range | PRIMARY       | PRIMARY | 4       | NULL |4451372 | Using where; Using filesort |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  1 row in set (0.00 sec)
                  
                  
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100
                  order by
                   d.id desc limit 10;
                  
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           100 | 3290330 | 38434 |
                  |    2010 |        3 |           100 | 3290329 |  9988 |
                  |    2010 |        3 |           100 | 3290328 | 25680 |
                  |    2010 |        3 |           100 | 3290327 | 17627 |
                  |    2010 |        3 |           100 | 3290326 | 64508 |
                  |    2010 |        3 |           100 | 3290325 | 14257 |
                  |    2010 |        3 |           100 | 3290324 | 45950 |
                  |    2010 |        3 |           100 | 3290323 | 49986 |
                  |    2010 |        3 |           100 | 3290322 |  2459 |
                  |    2010 |        3 |           100 | 3290321 | 52971 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (0.98 sec)
                  
                  
                  select 
                   count(*) as counter
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3;
                  
                  +---------+
                  | counter |
                  +---------+
                  | 3450345 |
                  +---------+
                  1 row in set (1.64 sec)
                  
                  explain
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3
                  order by
                   d.id desc limit 10;
                  
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  | id | select_type | table | type  | possible_keys | key     | key_len | ref  |rows    | Extra                       |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  |  1 | SIMPLE      | d     | range | PRIMARY       | PRIMARY | 3       | NULL |6566916 | Using where; Using filesort |
                  +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
                  1 row in set (0.00 sec)
                  
                  
                  select 
                   d.* 
                  from 
                   datasources d
                  where 
                   d.year_id = 2010 and d.month_id between 1 and 3
                  order by
                   d.id desc limit 10;
                  
                  +---------+----------+---------------+---------+-------+
                  | year_id | month_id | datasource_id | id      | data  |
                  +---------+----------+---------------+---------+-------+
                  |    2010 |        3 |           116 | 3450346 | 42455 |
                  |    2010 |        3 |           116 | 3450345 | 64039 |
                  |    2010 |        3 |           116 | 3450344 | 27046 |
                  |    2010 |        3 |           116 | 3450343 | 23730 |
                  |    2010 |        3 |           116 | 3450342 | 52380 |
                  |    2010 |        3 |           116 | 3450341 | 35700 |
                  |    2010 |        3 |           116 | 3450340 | 20195 |
                  |    2010 |        3 |           116 | 3450339 | 21758 |
                  |    2010 |        3 |           116 | 3450338 | 51378 |
                  |    2010 |        3 |           116 | 3450337 | 34687 |
                  +---------+----------+---------------+---------+-------+
                  10 rows in set (1.98 sec)
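
                  A note on reading the two `explain` plans above: `key_len` is the number of bytes of the primary key that MySQL uses for the lookup. Assuming the usual MySQL storage sizes (2 bytes for `smallint`, 1 byte for `tinyint`, and no extra byte because the columns are `not null`), the reported values of 4 and 3 correspond to the key prefixes below. This is an illustrative calculation, not part of the original test script:

                  ```sql
                  -- year_id (2) + month_id (1) + datasource_id (1): plan filtered by datasource
                  -- year_id (2) + month_id (1): plan filtered by month only
                  select 2 + 1 + 1 as key_len_with_datasource,
                         2 + 1     as key_len_month_only;
                  ```

                  Both plans still show `Using filesort` because the rows are requested in `id` order across a range of months, which is not the order in which the primary key stores them.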
                  

                  Hope this helps :)
