MySQL count performance on very big tables
Problem description
I have a table with more than 100 million rows in InnoDB.
I need to know whether there are more than 5000 rows where the foreign key = 1. I don't need the exact number.
I ran some tests:
SELECT COUNT(*) FROM table WHERE fk = 1
=> 16 seconds

SELECT COUNT(*) FROM table WHERE fk = 1 LIMIT 5000
=> 16 seconds

SELECT primary FROM table WHERE fk = 1
=> 0.6 seconds
The last query means more network traffic and client-side processing time, but the COUNT queries carry an overhead of 15.4 seconds!
Do you have a better idea?

Thanks
[Relevant comments from the OP added below]
I tried SELECT SQL_NO_CACHE COUNT(fk) FROM table WHERE fk = 1 but it took 25 seconds.
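An alternative worth noting (not tried in the original post): since the question is only "are there more than 5000 matching rows?", it can be answered with a single-row query. OFFSET skips the first 5000 matches, so a row comes back only if a 5001st match exists, and the scan stops there. A minimal sketch using SQLite's stdlib driver for illustration; the same SQL pattern works in MySQL:

```python
import sqlite3

# Build a small stand-in table with an indexed fk column and 8000 matches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, fk INTEGER)")
conn.execute("CREATE INDEX idx_fk ON t (fk)")
conn.executemany("INSERT INTO t (fk) VALUES (?)", [(1,)] * 8000)

# Skip the first 5000 matches; a returned row proves a 5001st row exists.
row = conn.execute(
    "SELECT 1 FROM t WHERE fk = 1 LIMIT 1 OFFSET 5000"
).fetchone()
print(row is not None)  # True: more than 5000 rows match
```

This transfers at most one row over the network, instead of 5000 primary keys.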
MySQL was tuned for InnoDB with MySQL Tuner.
CREATE TABLE table (
  pk bigint(20) NOT NULL AUTO_INCREMENT,
  fk tinyint(3) unsigned DEFAULT '0',
  PRIMARY KEY (pk),
  KEY idx_fk (fk) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=100380914 DEFAULT CHARSET=latin1
Database settings:
have_innodb = YES, ignore_builtin_innodb = OFF
innodb_adaptive_hash_index = ON, innodb_additional_mem_pool_size = 20971520
innodb_autoextend_increment = 8, innodb_autoinc_lock_mode = 1
innodb_buffer_pool_size = 25769803776, innodb_checksums = ON
innodb_commit_concurrency = 0, innodb_concurrency_tickets = 500
innodb_data_file_path = ibdata1:10M:autoextend, innodb_data_home_dir = ''
innodb_doublewrite = ON, innodb_fast_shutdown = 1
innodb_file_io_threads = 4, innodb_file_per_table = OFF
innodb_flush_log_at_trx_commit = 1, innodb_flush_method = ''
innodb_force_recovery = 0, innodb_lock_wait_timeout = 50
innodb_locks_unsafe_for_binlog = OFF, innodb_log_buffer_size = 8388608
innodb_log_file_size = 26214400, innodb_log_files_in_group = 2
innodb_log_group_home_dir = ./, innodb_max_dirty_pages_pct = 90
innodb_max_purge_lag = 0, innodb_mirrored_log_groups = 1
innodb_open_files = 300, innodb_rollback_on_timeout = OFF
innodb_stats_on_metadata = ON, innodb_support_xa = ON
innodb_sync_spin_loops = 20, innodb_table_locks = ON
innodb_thread_concurrency = 8, innodb_thread_sleep_delay = 10000
innodb_use_legacy_cardinality_algorithm = ON
Update 2015: I have used the same method up to now, with 600 million rows and 640,000 new rows per day. It is still working fine.
Accepted answer
In the end the fastest approach was to query the first X rows from C# and count the rows returned.
My application processes the data in batches. The time between two batches depends on the number of rows that need to be processed:
SELECT pk FROM table WHERE fk = 1 LIMIT X
I got the result in 0.9 seconds.
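The answer's client-side counting approach can be sketched as follows. The original used C#; this is a hypothetical Python equivalent using SQLite's stdlib driver, with THRESHOLD standing in for the X in the query above. The database stops scanning as soon as the LIMIT is reached, and the client just counts what it received:

```python
import sqlite3

THRESHOLD = 5000  # the X from the answer's query

# Build a small stand-in table with an indexed fk column and 8000 matches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, fk INTEGER)")
conn.execute("CREATE INDEX idx_fk ON t (fk)")
conn.executemany("INSERT INTO t (fk) VALUES (?)", [(1,)] * 8000)

# Fetch at most THRESHOLD primary keys and count them client-side.
rows = conn.execute(
    "SELECT pk FROM t WHERE fk = ? LIMIT ?", (1, THRESHOLD)
).fetchall()
has_enough = len(rows) >= THRESHOLD
print(has_enough)  # True: at least 5000 rows match
```

The trade-off is shipping up to X primary keys over the network in exchange for skipping the expensive full COUNT, which is exactly the balance described in the question.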
Thanks everyone for your ideas!