MySQL 分区之RANGE && HASH
从Mysql5.1之后,分区功能出现了,表分区就像是将一个大表分成了若干个小表,用户在执行查询的时候无需进行全表扫描,只需要对满足要求的表分区中进行查询即可,极大的提高了查询速率,另外,表分区的实现也方便了对数据的管理,比如产品需要删除去年的所有数据,那么只需要将去年数据所在的表分区删除即可。
mysql 表分区有很多种,详情点击:http://dev.mysql.com/doc/refman/5.1/zh/partitioning.html
此处就讨论RANGE 跟HASH 以及RANGE 结合HASH进行的分区操作。
注意:所有的表分区使用的列均需要使用源表中存在的主键或者唯一索引列,否则创建失败,如果源表中本来就不存在任何的主键或者唯一索引列,那么可以在分区的时候根据需要选取任意列。
RANGE:顾名思义,通过确定选取列的值的范围的方式进行分区。
推荐阅读:
如下是创建普通表的语句:
为了实验的方便,此处date 字段使用的时间类型为:DATETIME,而非TIMESTAMP,原因是TIMESTAMP不支持在分区的时候使用YEAR(),MONTH(),TO_DAYS()等操作,只能使用UNIX_TIMSTAMP()函数,所以在设计表的时候需要考虑到这点:
CREATE TABLE t1 ( id INT, date DATETIME DEFAULT CURRENT_TIMESTAMP) ENGINE=Innodb;
插入一些测试数据:
mysql> SELECT * FROM t1;
+------+---------------------+
| id | date |
+------+---------------------+
| 1 | 2013-05-23 12:59:39 |
| 2 | 2013-05-23 12:59:43 |
| 3 | 2013-05-23 12:59:44 |
| 4 | 2013-07-04 19:35:45 |
| 5 | 2014-04-04 19:35:45 |
| 6 | 2014-05-04 19:35:45 |
| 7 | 2015-05-04 19:35:45 |
| 8 | 2015-05-05 19:35:45 |
| 9 | 2017-05-05 19:35:45 |
| 10 | 2018-05-05 19:35:45 |
+------+---------------------+
10 rows in set (0.00 sec)
查看查询语句执行计划:发现进行了全表扫描
mysql> EXPLAIN SELECT * FROM t1;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10 | NULL |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)
mysql> EXPLAIN SELECT * FROM t1 WHERE date >= '2014-03-05 19:00:12'
-> AND date <= '2016-03-05 18:45:12';
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
创建新表t2,并根据年份进行表分区
mysql> CREATE TABLE t2 ( id INT, date DATETIME DEFAULT CURRENT_TIMESTAMP) ENGINE=Innodb
-> PARTITION BY RANGE (YEAR(date)) (
-> PARTITION p2013 VALUES LESS THAN(2014),
-> PARTITION p2014 VALUES LESS THAN(2015),
-> PARTITION p2015 VALUES LESS THAN(2016),
-> PARTITION p2016 VALUES LESS THAN(2017),
-> PARTITION p2017 VALUES LESS THAN(2018),
-> PARTITION p2099 VALUES LESS THAN MAXVALUE
-> ) ;
Query OK, 0 rows affected (2.47 sec)
查看数据分布状态并导入t1表的数据:
mysql> SELECT table_name,partition_name,table_rows FROM information_schema.PARTITIONS WHERE table_schema=database() AND table_name='t2';
+------------+----------------+------------+
| table_name | partition_name | table_rows |
+------------+----------------+------------+
| t2 | p2013 | 0 |
| t2 | p2014 | 0 |
| t2 | p2015 | 0 |
| t2 | p2016 | 0 |
| t2 | p2017 | 0 |
| t2 | p2099 | 0 |
+------------+----------------+------------+
6 rows in set (0.00 sec)
mysql> SELECT * FROM t2;
Empty set (0.00 sec)
mysql> EXPLAIN SELECT * FROM t2;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 6 | NULL |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)
mysql> INSERT INTO t2 SELECT * FROM t1;
Query OK, 10 rows affected (0.36 sec)
Records: 10 Duplicates: 0 Warnings: 0
再次查看数据分部状态:
mysql> SELECT * FROM t2;
+------+---------------------+
| id | date |
+------+---------------------+
| 1 | 2013-05-23 12:59:39 |
| 2 | 2013-05-23 12:59:43 |
| 3 | 2013-05-23 12:59:44 |
| 4 | 2013-07-04 19:35:45 |
| 5 | 2014-04-04 19:35:45 |
| 6 | 2014-05-04 19:35:45 |
| 7 | 2015-05-04 19:35:45 |
| 8 | 2015-05-05 19:35:45 |
| 9 | 2017-05-05 19:35:45 |
| 10 | 2018-05-05 19:35:45 |
+------+---------------------+
10 rows in set (0.00 sec)
mysql> SELECT table_name,partition_name,table_rows FROM information_schema.PARTITIONS WHERE table_schema=database() AND table_name='t2';
+------------+----------------+------------+
| table_name | partition_name | table_rows |
+------------+----------------+------------+
| t2 | p2013 | 4 |
| t2 | p2014 | 2 |
| t2 | p2015 | 2 |
| t2 | p2016 | 0 |
| t2 | p2017 | 1 |
| t2 | p2099 | 1 |
+------------+----------------+------------+
6 rows in set (0.00 sec)
查看全表扫描行数情况:
mysql> EXPLAIN SELECT * FROM t2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 11
Extra: NULL
1 row in set (0.00 sec)
进行where子句过滤后:
1234567 mysql> EXPLAIN PARTITIONS SELECT * FROM t2 WHERE date >= '2014-03-05 19:00:12' AND date <= '2016-03-05 18:45:12';
+----+-------------+-------+-------------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------------------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | t2 | p2014,p2015,p2016 | ALL | NULL | NULL | NULL | NULL | 5 | Using where |
+----+-------------+-------+-------------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
可以发现,进行分区之后没有全表扫描,扫描的分区也仅仅是在时间范围内的。