封尘网

让学习成为一种习惯!

关于mysql异常退出的问题分析

系统:Centos6.5
Mysql版本:5.7.9 做了主从同步,基于gtid并行复制的半同步机制。

根据宕机情况来看,Mysql发生宕机前是由于大量的并发update更新导致Mysql的内存和CPU爆满,导致系统无法负载,而Mysql也就异常退出了,更重要的是主从都挂掉了。由于日志记录并不太详细,这里是初步的判定。

Mysql异常退出后第一步就是尝试启动服务,现在启动时报如下异常信息:

2017-05-16T05:44:09.163578Z 0 [ERROR] InnoDB: Operating system error number 2 in a file operation.
2017-05-16T05:44:09.163580Z 0 [ERROR] InnoDB: The error means the system cannot find the path specified.
2017-05-16T05:44:09.163582Z 0 [ERROR] InnoDB: If you are installing InnoDB, remember that you must create directories yourself, InnoDB does not create them.
2017-05-16T05:44:09.163585Z 0 [ERROR] InnoDB: Cannot open datafile for read-only: ‘./CR/MEETINGPLACE_CUSTOMER.ibd’ OS error: 71
2017-05-16T05:44:09.163587Z 0 [ERROR] InnoDB: Operating system error number 2 in a file operation.
2017-05-16T05:44:09.163589Z 0 [ERROR] InnoDB: The error means the system cannot find the path specified.
2017-05-16T05:44:09.163591Z 0 [ERROR] InnoDB: If you are installing InnoDB, remember that you must create directories yourself, InnoDB does not create them.
2017-05-16T05:44:09.163595Z 0 [ERROR] InnoDB: Could not find a valid tablespace file for CR/MEETINGPLACE_CUSTOMER. Please r
efer to http://dev.mysql.com/doc/refman/5.7/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2017-05-16T05:44:09.163598Z 0 [Warning] InnoDB: Ignoring tablespace CR/MEETINGPLACE_CUSTOMER because it could not be opened

这信息跟平常的信息看来没什么异样,但是服务却真的启动不了,再次怀疑是之前线上的参数调整和配置文件my.cnf的参数值不一致,或者更新了my.cnf的配置文件所致;

尝试把my.cnf恢复到故障发生前的配置文件,此配置文件是Mysql上线时配置的初始配置;再次尝试启动;同样报下面的信息。

2017-05-16T05:44:09.891004Z 0 [Note] InnoDB: FTS optimize thread exiting.
2017-05-16T05:44:09.891064Z 0 [Note] InnoDB: Starting shutdown…
2017-05-16T05:44:17.682489Z 0 [Note] InnoDB: Shutdown completed; log sequence number 22709655982
2017-05-16T05:44:17.682620Z 0 [Note] InnoDB: Removed temporary tablespace data file: “ibtmp1”
2017-05-16T05:44:17.682626Z 0 [Note] Shutting down plugin ‘PERFORMANCE_SCHEMA’
2017-05-16T05:44:17.682667Z 0 [Note] Shutting down plugin ‘CSV’
2017-05-16T05:44:17.682670Z 0 [Note] Shutting down plugin ‘sha256_password’
2017-05-16T05:44:17.682672Z 0 [Note] Shutting down plugin ‘mysql_native_password’
2017-05-16T05:44:17.682837Z 0 [Note] Shutting down plugin ‘binlog’
2017-05-16T05:44:17.710553Z 0 [Note] /usr/local/mysql-5.7.9/bin/mysqld: Shutdown complete
170516 13:44:17 mysqld_safe mysqld from pid file /data/mysql/mysql.pid ended

这次看来没什么ERROR信息了,但是服务还是启动不来。

接下来尝试把Mysql目录下的ib*开头的几个文件移走,尝试启动服务:移动或者复制保险。

service mysqld start 还是无法成功启动,直接不使用配置文件启动。--no-defaults

[root@mysql mysql]# /usr/local/mysql/bin/mysqld_safe --no-defaults
170516 14:20:57 mysqld_safe Logging to '/data/mysql/mysql.err'.
170516 14:20:57 mysqld_safe Starting mysqld daemon with databases from /data/mysql

服务是启动了,登陆mysql查看一下表;

mysql> select * from test;
ERROR 1146 (42S02): Table 'devod.test' doesn't exist

由于刚才移走了表空间文件,所以就看不到表的信息了,把上面移走的恢复回来。

闭关mysql:

[root@mysql ~]# /usr/local/mysql/bin/mysqladmin -uroot -p密码 shutdown
再次尝试启动,/usr/local/mysql/bin/mysqld_safe --no-defaults
mysql> select * from test;

+------+---------+
| id   | name    |
+------+---------+
|  799 | tom     |
|  100 | whoami  |
|  121 | abcd    |

终于查到数据了,这说明数据是正常的,至于无法启动Mysql服务那只是配置文件中的配置导致。

接下来就是慢慢把配置文件调整到一个正确的状态。

总结:
1、Mysql的版本变化导致每个大版本中的参数都可能不太适用。网上说的最优配置,什么调优配置并不是都适合,要跟据自己的机器配置,性能来不断的调整。

2、启动Mysql服务时除了一些不能动态调整的参数要必须要配置文件中加入。其它的可动态调整的参数最好在启动服务后慢慢调整。当调整后的性能测试通过后再加入到配置文件中。

附:下面这段是在尝试把ibdata1文件移除后启动时报的错误;

2017-05-16T14:09:07.821208+08:00 0 [Note] InnoDB: Completed initialization of buffer pool
2017-05-16T14:09:12.390930+08:00 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2017-05-16T14:09:12.400674+08:00 0 [Note] InnoDB: The first innodb_system data file ‘ibdata1’ did not exist. A new tablespace will be created!
2017-05-16T14:09:12.420788+08:00 0 [ERROR] InnoDB: redo log file ‘./ib_logfile0’ exists. Creating system tablespace with existing redo log files is not recommended. Please delete all redo log files before creating new system tablespace.
2017-05-16T14:09:12.420841+08:00 0 [ERROR] InnoDB: InnoDB Database creation was aborted with error Generic error. You may need to delete the ibdata1 file before trying to start up again.
2017-05-16T14:09:12.762441+08:00 0 [ERROR] Plugin ‘InnoDB’ init function returned error.
2017-05-16T14:09:12.769643+08:00 0 [ERROR] Plugin ‘InnoDB’ registration as a STORAGE ENGINE failed.
2017-05-16T14:09:12.790050+08:00 0 [ERROR] Failed to initialize plugins.
2017-05-16T14:09:12.790096+08:00 0 [ERROR] Aborting
2017-05-16T14:09:12.811310+08:00 0 [Note] Binlog end
2017-05-16T14:09:12.837605+08:00 0 [Note] Shutting down plugin ‘CSV’
2017-05-16T14:09:12.887349+08:00 0 [Note] /usr/local/mysql-5.7.9/bin/mysqld: Shutdown complete
170516 14:09:13 mysqld_safe mysqld from pid file /data/mysql/mysql.pid ended

这段日志信息提示删除ib_logfile的文件,就是说要移除就把ib* 的文件一起移除;再尝试启动mysql服务;

在Mysql启动后又有一段异常信息:

InnoDB: Error: page 19 log sequence number 2363194248
InnoDB: is in the future! Current system log sequence number 78250719.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files.

这个跟上面的有点关系,就是只移除了tablespace的文件:ibdata1而与它关连的ib_log文件未移除;因为Mysql启动是会创建一个ibdata1文件,同时如果ib_log文件存在的话就会把ib_log文件加载到ibdata1文件中,这样因为一个新的文件和旧的日志文件造成数据会不一致的问题;

说明:在执行故障恢复的情况下一定要先把数据copy一份,确保数据的完成。Mysql服务是正常运行了,至于问题的原因还有待细查,为什么会把数据库直接搞开异常结束那还真的是问题。