mysql2008 第11次性能优化过程与分析

mysql 优化过程与分析

一.Sql语句的错误使用

问题：多次查询-循环

案例：查询粉丝最多的前十个用户的文章总数

粉丝关注表 user_fans

id,user_id,fans_id

文章表 article

title

content

user_id 作者

1.查询排名前10的用户

2.查询排名前10用户的文章数

select user_id from (select user_id,count(*) from user_fans group by user_id order by c dese limit,0,5) as fans_count

方案a 循环

for (fans_count)

select count(*) from article where user_id = ?;

方案b in

select count(*) from article where user_id in (fans_count);

方案c

select count(*) from article where user_id in (fans_count) group by user_id;

最优方案：b

a执行总次数1+10=11

b执行总次数1+1

c执行总次数1+1 多了多余的分组。

总次数=io次数输入与输出

1.一分钟执行10000次，每次读写消耗5个 10000*5=5w =>优化=> 2w

2.一分钟执行100次，每次读写消耗3000个 100*3000=300000(30w)=>优化 =>二十万

sql语句的复杂度，可以决定的程序的性能问题

1.对于查询是否有过多的分组，排序，join

2.需要查询的表数量

3.是否使用了大量函数，如 ount sum AVG等

二.Sql语句优化之索引使用指标

mysql 的innodb引擎，是解析器根据二分法形成平衡二叉树的b+tree 树形结构.形成有

二分法=》不断选择一个中位数，根据左小右大，=》形成有序的数据序列（叶子节点）树形结构。

案例1

alter tabel user add index idx_name(name) 普通索引

这个普通索引形成的树形结构，存储的该索引（字符串数据转化为ascii 数字）及对应主键id数据

条件及所查询的结果数据列，不在索引中时，在查出索引后，还须根据索引对应的主键，找到其它非索引数据，这样就产生了回表，性能就是差。

案例2

业务初期：针对某一地区，比如长沙，用户所在地区长沙

中期扩展：增加武汉

统计功能=》可以根据地区来进行统计=》但武汉的用户很少很少，绝对大部分用户都是长沙。即使以地区建立索引，进行查询时，扫描的数据量太大，与不使用索引相关不大。

一般使用索引，在不是覆盖索引的情况下，建议索引的使用所扫描的数据量/指标,不要超过表数据量的30%；

多使用explain分析。

三.优化器执行过程与分析

案例：select * from customers1s where city="长沙" and gender=0;

show WARNINGS\G; //分析上面执行语句，底层的的重写

explain select * from customers1s where city="长沙" and gender=0; //分析语句

#开启trace查看优化器

set optimizer_trace="enabled=on";--开启trace查看优化器的结果

set end_markers_in_json=on;--增加注释

#优化的执行语句

select id,`name`,city,monthsalary,gender from customers1s where city=“长沙” and gender=0 and monthsalary=99;

#执行语句的优化执行过程打印

select * from information_schema.optimizer_trace \G;--查询打印执行计划

结果:各参数的意义

···

*************************** 1. row ***************************

QUERY: select id,`name`,city,monthsalary,gender from customers1s where city="长沙" and gender=0 and monthsalary=99

TRACE: {

"steps": [

{

##join_preparation展示了准备阶段的执行过程

"join_preparation": {

"select#": 1,

"steps": [

{

##重写查询语句，加库名称列出字段及别名

"expanded_query": "/* select#1 */ select `customers1s`.`id` AS `id`,`customers1s`.`name` AS `name`,`customers1s`.`city` AS `city`,`customers1s`.`monthsalary` AS `monthsalary`,`customers1s`.`gender` AS `gender` from `customers1s` where ((`customers1s`.`city` = '长沙') and (`customers1s`.`gender` = 0) and (`customers1s`.`monthsalary` = 99))"

}

] /* steps */

} /* join_preparation */

{

##join_optimization：展示了优化阶段的执行过程，是分析OPTIMIZER TRACE的重点

"join_optimization": {

"select#": 1,

"steps": [

{

/*condition_processing：该段用来做条件处理，主要对WHERE条件进行优化处理。*/

"condition_processing": {

"condition": "WHERE", /*优化对象类型，如where,having */

#优化前的原始语句

"original_condition": "((`customers1s`.`city` = '长沙') and (`customers1s`.`gender` = 0) and (`customers1s`.`monthsalary` = 99))",

#steps：主要包括三步， transformation：转换类型句 resulting_condition：转换之后的结果输出

"steps": [

{

/*等值条件 equality_propagation（等值条件句转换） */

"transformation": "equality_propagation",

"resulting_condition": "((`customers1s`.`city` = '长沙') and (`customers1s`.`monthsalary` = 99) and multiple equal(0, `customers1s`.`gender`))"

{

/*constant_propagation（常量条件句转换）无意义的条件，移除，如1=1 */

"transformation": "constant_propagation",

"resulting_condition": "((`customers1s`.`city` = '长沙') and (`customers1s`.`monthsalary` = 99) and multiple equal(0, `customers1s`.`gender`))"

{

/*trivial_condition_removal（无效条件移除的转换）条件字段数据类型转化如 `gender` = '0' => `gender` = 0 */

"transformation": "trivial_condition_removal",

"resulting_condition": "((`customers1s`.`city` = '长沙') and (`customers1s`.`monthsalary` = 99) and multiple equal(0, `customers1s`.`gender`))"

}

] /* steps */

} /* condition_processing */

{

#substitute_generated_columns用于替换虚拟生成列

"substitute_generated_columns": {

} /* substitute_generated_columns */

{

/*table_dependencies：分析表之间的依赖关系*/

"table_dependencies": [

{

"table": "`customers1s`",#涉及的表名，如果有别名，也会展示出来

"row_may_be_null": false,#行是否可能为NULL，这里是指JOIN操作之后，这张表里的数据是不是可能为 NULL。如果语句中使用了LEFT JOIN，则后一张表的row_may_be_null会显示为true

"map_bit": 0,#表的映射编号，从0开始递增

#depends_on_map_bits：依赖的映射表。主要是当使用STRAIGHT_JOIN强行控制连接顺序或者LEFT JOIN/RIGHT JOIN有顺序差别时，会在depends_on_map_bits中展示前置表的map_bit值。

"depends_on_map_bits": [

] /* depends_on_map_bits */

}

] /* table_dependencies */

{

#ref_optimizer_key_uses：列出所有可用的ref类型的索引。如果使用了组合索引的多个部分,（例如本例，用到了index(from_date, to_date) 的多列索引），则会在ref_optimizer_key_uses下列出多个元素，每个元素中会列出ref使用的索引及对应值.

"ref_optimizer_key_uses": [

{

"table": "`customers1s`",

"field": "gender",

"equals": "0",

"null_rejecting": false

{

"table": "`customers1s`",

"field": "city",

"equals": "'长沙'",

"null_rejecting": false

}

] /* ref_optimizer_key_uses */

{

#rows_estimation：顾名思义，用于估算需要扫描的记录数。

"rows_estimation": [

{

"table": "`customers1s`",#表名

"range_analysis": {

table_scan:如果全表扫描的话，需要扫描多少行（row，5073417），以及需要的代价（cost， 542989）

"table_scan": {

"rows": 5073417,#扫描行数

"cost": 542989 #性能

} /* table_scan */,

#可能使用到的索引.列出表中所有的索引并分析其是否可用。如果不可用的话，会列出不可用的原因是什么；如果可用会列出索引中可用的字段

"potential_range_indexes": [

{

"index": "PRIMARY",

"usable": false,

"cause": "not_applicable"

{

"index": "idx_gender_city_name_monthsalary_yearbonus",

"usable": true,

"key_parts": [

"gender",

"city",

"name",

"monthsalary",

"yearbonus",

"id"

] /* key_parts */

{

"index": "idx_name_photo",

"usable": false,

"cause": "not_applicable"

{

"index": "testfulltext",

"usable": false,

"cause": "not_applicable"

}

] /* potential_range_indexes */,

"best_covering_index_scan": {

"index": "idx_gender_city_name_monthsalary_yearbonus",

"cost": 605355,

"chosen": false,

"cause": "cost"

} /* best_covering_index_scan */,

#setup_range_conditions：如果有可下推的条件，则带条件考虑范围查询

"setup_range_conditions": [

] /* setup_range_conditions */,

#group_index_range：当使用了GROUP BY或DISTINCT时，是否有合适的索引可用。当未使用GROUP BY或DISTINCT时，会显示chosen=false, cause=not_group_by_or_distinct；如使用了GROUP BY或 DISTINCT，但是多表查询时，会显示chosen=false，cause =not_single_table。其他情况下会尝试分析可用的索引（potential_group_range_indexes）并计算对应的扫描行数及其所需代价

"group_index_range": {

"chosen": false,

"cause": "not_group_by_or_distinct"

} /* group_index_range */,

#analyzing_range_alternatives：分析各个索引的使用成本

"analyzing_range_alternatives": {

#range_scan_alternatives：range扫描分析

"range_scan_alternatives": [

{

"index": "idx_gender_city_name_monthsalary_yearbonus",#索引

#range扫描的条件范围

"ranges": [

"0 <= gender <= 0 AND 长沙 <= city <= 长沙"

] /* ranges */,

#index_dives_for_eq_ranges：是否使用了index dive，该值会被参数 eq_range_index_dive_limit变量值影响

"index_dives_for_eq_ranges": true,

"rowid_ordered": false,#rowid_ordered：该range扫描的结果集是否根据PK值进行排序

"using_mrr": false,#是否使用了mrr

"index_only": true,#表示是否使用了覆盖索引

"rows": 471868,#扫描的行数

"cost": 56304,#索引的使用成本

"chosen": true #表示是否使用了该索引

}

] /* range_scan_alternatives */,

#analyzing_roworder_intersect：分析是否使用了索引合并（index merge），如果未使用，会在cause中展示原因；如果使用了索引合并，会在该部分展示索引合并的代价。

"analyzing_roworder_intersect": {

"usable": false,

"cause": "too_few_roworder_scans"

} /* analyzing_roworder_intersect */

} /* analyzing_range_alternatives */,

#chosen_range_access_summary：在前一个步骤中分析了各类索引使用的方法及代价，得出了一定的中间结果之后，在summary阶段汇总前一阶段的中间结果确认最后的方案

"chosen_range_access_summary": {

#range_access_plan：range扫描最终选择的执行计划。

"range_access_plan": {

"type": "range_scan",#展示执行计划的type，如果使用了索引合并，则会显示index_roworder_intersect

"index": "idx_gender_city_name_monthsalary_yearbonus",#索引名

"rows": 471868,#扫描的行数

#range扫描的条件范围

"ranges": [

"0 <= gender <= 0 AND 长沙 <= city <= 长沙"

] /* ranges */

} /* range_access_plan */,

"rows_for_plan": 471868,#该执行计划的扫描行数

"cost_for_plan": 56304,#该执行计划的执行代价

"chosen": true #是否选择该执行计划

} /* chosen_range_access_summary */

} /* range_analysis */

}

] /* rows_estimation */

{

#considered_execution_plans：负责对比各可行计划的开销，并选择相对最优的执行计划。

"considered_execution_plans": [

{

#plan_prefix：当前计划的前置执行计划。

"plan_prefix": [

] /* plan_prefix */,

"table": "`customers1s`",#涉及的表名，如果有别名，也会展示出来

#best_access_path：通过对比considered_access_paths，选择一个最优的访问路径

"best_access_path": {

#当前考虑的访问路径

"considered_access_paths": [

{

"access_type": "ref",#使用索引的方式，可参考explain中的type字段

"index": "idx_gender_city_name_monthsalary_yearbonus",#索引

"rows": 471868,#行数

"cost": 56304,#开销

"chosen": true #是否选用这种执行路径

{

"access_type": "range",

"range_details": {

"used_index": "idx_gender_city_name_monthsalary_yearbonus"

} /* range_details */,

"chosen": false,

"cause": "heuristic_index_cheaper"

}

] /* considered_access_paths */

} /* best_access_path */,

"condition_filtering_pct": 100, #类似于explain的filtered列，是一个估算值

"rows_for_plan": 471868,#执行计划最终的扫描行数，由considered_access_paths.rows X condition_filtering_pct计算获得。

"cost_for_plan": 56304,#执行计划的代价，由considered_access_paths.cost相加获得

"chosen": true #是否选择了该执行计划

}

] /* considered_execution_plans */

{

#attaching_conditions_to_tables 基于considered_execution_plans中选择的执行计划，改造原有where条件，并针对表增加适当的附加条件，以便于单表数据的筛选。

"attaching_conditions_to_tables": {

#original_condition：原始的条件语句

"original_condition": "((`customers1s`.`gender` = 0) and (`customers1s`.`city` = '长沙') and (`customers1s`.`monthsalary` = 99))",

#attached_conditions_computation：使用启发式算法计算已使用的索引，如果已使用的索引的访问类型是ref，则计算用range能否使用组合索引中更多的列，如果可以，则用range的方式替换ref。

"attached_conditions_computation": [

] /* attached_conditions_computation */,

#attached_conditions_summary：附加之后的情况汇总

"attached_conditions_summary": [

{

"table": "`customers1s`",#表名

"attached": "(`customers1s`.`monthsalary` = 99)" #附加的条件或原语句中能直接下推给单表筛选的条件。

}

] /* attached_conditions_summary */

} /* attaching_conditions_to_tables */

{

#refine_plan 改善执行计划

"refine_plan": [

{

"table": "`customers1s`" #表名

}

] /* refine_plan */

}

] /* steps */

} /* join_optimization */

{

"join_execution": {

"select#": 1,

"steps": [

] /* steps */

} /* join_execution */

}

] /* steps */

}

````

补未出现的部分

finalizing_table_conditions 最终的、经过优化后的表条件。

{

"finalizing_table_conditions": [

{

"table": "`salaries`",

"original_table_condition": "((`salaries`.`to_date` = DATE'1987-06-26') and

(`salaries`.`from_date` = DATE'1986-06-26'))",

"final_table_condition ": null

}

] /* finalizing_table_conditions */

}

优化过程做的事情

1.确定查询的表（多对多那就是多表）

2.查询字段信息，字段类型信息，索引信息等

3.重写sql语句或者是where条件。

4.判断索引使用

四、Explaine结果如何进行分析

优化器执行过程分析（在explain相当的条件下）

show index from customers1s;

alter table customers1s add index idx_city_gender(city,gender);

select * from customers1s where city="长沙" and gender=0;

S:9s多

explain select * from customers1s where city="长沙" and gender=0;

id select_type table partitions type possible_keys key key_len ref rows filtered Extra

1 SIMPLE customers1sNull ref idx_gender_city_name_monthsalary_yearbonus idx_city_gender 33 const,const471868100 Null

idx_city_gender

原因分析使用了索引idx_city_gender,idx_city_gender索引树上存储的是主键、city、gender,但查询的是所有字段，又根据索引对应的主键，查找了满足条件的数据，即产生了回表。

回表查询的本质：索引当中并不存在sql语句需要查询的数据

1. 连接层 -》 sql层（优化器：得到最优的执行计划 idx_city_gender）=》查询数据的时候发现索引

并不满足需要查询数据 =》通过查询的数据的主键索引去磁盘重查询数据 =》返回数据

select id,city,gender from customers1s where city="长沙" and gender=0;

s:0.21s

explain select id,city,gender from customers1s where city="长沙" and gender=0;

id select_type table partitions type possible_keys key key_len ref rows filtered Extra

1 SIMPLE customers1sNull ref idx_gender_city_name_monthsalary_yearbonus idx_city_gender 33 const,const444500100 Using index

idx_city_gender

2. 连接层 -》 sql层（优化器：得到最优的执行计划 idx_city_gender）=》查询数据 =》返回数据

select * from customers1s where city="长沙" and yearbonus=0;

s:37.521s

explain select * from customers1s where city="长沙" and yearbonus=0;

id select_type table partitions type possible_keys key key_len ref rows filtered Extra

1 SIMPLE customers1sNull ref idx_city_gender idx_city_gender 33 const94478410 Using where

3. 连接层 -》 sql层（优化器：得到最优的执行计划 idx_city_gender）=》查询数据的时候发现索引

并不满足需要查询数据 =》通过查询的数据的主键索引去磁盘重查询数据 =》还需要通过where来进行数据过滤 =》返回数据

即：索引过滤==》回表==》非索引过滤

总结：分析查询语句过程

1. 先看查询使用的索引

2. 确定索引数据

3. 在预判sql语句扫描的数据与索引数据做对比

4. 在确定这个sql执行是不是最好

发表于 2020-11-06 17:38
阅读 ( 695 )
分类：数据库

mysql2008 第11次性能优化过程与分析

你可能感兴趣的文章

相关问题

0 条评论

作家榜 »

mysql2008 第11次 性能优化过程与分析

你可能感兴趣的文章

相关问题

0 条评论

作家榜 »

mysql2008 第11次性能优化过程与分析