数据库 Index-Merge 导致的死锁
问题
监控系统,告警发现 serviceA-task 定时同步供应商订单任务执行出现失败,后来发现 serviceB-api 的日志报文中也一直有死锁存在,Log 日志具体如下:
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction, dubbo version: 2.13.2.RELEASE, current host: 10.36.131.12 org.springframework.dao.DeadlockLoserDataAccessException:
### Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
问题原因
问题背景
(1)RPC 接口:InboundAppointmentRpcService.createInboundAppointment
(2)Task 任务:shire-task#supplierSaleOrderAggregateTask
上述都会对表 wms_supplier_order_info 执行下面语句更新数据 库记录
@Override
public int updateSupplierOrderInfo(WmsSupplierOrderInfo reocrd) {
if (Objects.isNull(reocrd)) {
return 0;
}
WmsSupplierOrderInfoExample example = new WmsSupplierOrderInfoExample();
example.createCriteria()
.andWarehouseIdEqualTo(reocrd.getWarehouseId())
.andSupplierIdEqualTo(reocrd.getSupplierId())
.andGoodsIdEqualTo(reocrd.getGoodsId())
.andExpectTimeEqualTo(reocrd.getExpectTime())
.andIsDeletedEqualTo(Boolean.FALSE);
return mapper.updateByExampleSelective(reocrd, example);
}
问题原因分析
分析进度 1:
按照上述的更新语句,模拟写了一条查询语句(和上面更新语句的条件是一样的),查看执行计划,出现一条语句走了两条索引,索引类型为 index merge
explain SELECT * FROM wms_supplier_order_info where warehouse_id = 2 and supplier_id = 434135643 and goods_id = 175091807986 and expect_time = '2020-10-12 00:00:00' and is_deleted = 0
查了文档,发现出现 index merge 操作会大大增加数据库死锁的概率-附上官方文档:https://bugs.mysql.com/bug.php?id=77209
分析进度 2
找 DBA 同事拿到了当时的死锁日志如下,并还原当时的 DDL 更新语句:
还原当时场景下对应的 SQL 语句,如下:
第一个事务对应的语句
SET warehouse_id = 6880,
supplier_id = 334985823,
expect_time = '2021-01-26 00:00:00',
cargo_id = 218057,
goods_id = 207967301292,
goods_name = '【无辣不欢】香辣小米椒100g',
quantity = 66,
fetch_time = '2021-01-25 01:31:42.944000',
sku_spec = '100g/份',
supplier_name = '河源市喜悦生态农业科技有限公司'
WHERE ( warehouse_id = 6880
and supplier_id = 334985823
and goods_id = 207967301292 and expect_time = '2021-01-26 00:00:00' and is_deleted = "false"
第二个事务对应的语句
SET id = 1566583,
warehouse_id = 3997,
supplier_id = 735844057,
expect_time = '2021-01-26 00:00:00',
cargo_id = 26557,
goods_id = 184949669110,
goods_name = '苏菲超熟睡柔棉感420mm4片+弹力贴身纤巧230mm10片日夜组合装',
quantity = 11,
fetch_time = '2021-01-25 00:47:37',
created_at = '2021-01-25 00:03:08',
updated_at = '2021-01-25 00:47:37',
is_deleted = 0,
sku_spec = '14片/条',
supplier_name = '恒安金盛居家日用专营店',
`status` = 1,
expect_first_arrival_time = '2021-01-25 10:00:00' , expect_quantity = 400 where ( warehouse_id = 3997
and supplier_id = 735844057
and goods_id = 184949669110 and expect_time = '2021-01-26 00:00:00' and is_deleted = "false")
对上面的死锁日志进行分析
(1)事务 1 TRANSACTION(1):持有 4 把行锁(4 row lock(s)), 其中一把处于等待状态((1) WAITING FOR THIS LOCK TO BE GRANTED),位于页面编号为 61 的地方,锁针对的是主键索引(index PRIMARY of table)的记录锁(Record lock),只锁记录不锁区间
(2)事务 2 TRANSACTION(2)持有 23 把锁,其中持有((2) HOLDS THE LOCK(S))页面编号为 61 的一个主键记录锁(正是事务 1 在等待的),有 15 把是处于所等待的状态,就是等待索引 index idx_expect_time_is_deleted 的锁
(3)进一步分析,事务 1 需要的那把锁正好在事务 2 的手里,而对于事务 2 需要获取的索引 index idx_expect_time_is_deleted 的锁,发现已经被事务 1 占用,只能阻塞等待,出现了互相等待问题,进入死锁
模拟分析当时的两个事务加锁的场景可能如下:
解决方式
1、尽量避免使用多 where 条件更新记录,可以先查询出来,然后根据主键更新。
2、优化索引,创建联合索引 goods_id
,expect_time
,is_deleted
3、避免使用大事务,在事务内的加上的行锁,只能在 commit 后,锁才会释放,控制所有锁的时间
4、关闭 MySQL 优化器的 index merge 功能
附录
问题:
1、为什么使用 Merge Index 技术呢?
答:Merge Index 索引合并技术是 MySQL 的一个优化,对多个索引分别进行条件扫描,然后将它们各自的结果进行合并(intersect/union)
参考:
Merger-Index 导致死锁:https://blog.csdn.net/daidaineteasy/article/details/109266083
Merge-Index 技术剖析:https://www.orczhou.com/index.php/2013/01/mysql-source-code-query-optimization-index-merge/
Merge-Index 技术官方介绍:https://dev.mysql.com/doc/refman/5.6/en/index-merge-optimization.html