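Sharing a record of how we tracked down the cause when a production MariaDB 10.0.20 server went down with a signal 11 error.
Signal errors such as signal 6 and signal 11 are internal errors, much like Oracle's ORA-00600, and their cause is often hard to identify and resolve.
This particular signal error appears to be caused by a tmp_table-related bug in MariaDB 10.0.20.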
--1. Related logs
DB 1 (Master at the time)
180409 20:04:00 [ERROR] mysqld got signal 11 ;
mysys/stacktrace.c:247(my_print_stacktrace)[0xbcb2ee]
sql/signal_handler.cc:153(handle_fatal_signal)[0x71fa8c]
/lib64/libpthread.so.0[0x38ff40f790]
sql/item.cc:2291(Item_field)[0x745742]
sql/item_subselect.cc:877(Item_subselect::get_tmp_table_item(THD*))[0x7ba3d9]
sql/item.cc:7450(Item_ref::get_tmp_table_item(THD*))[0x752703]
sql/sql_list.h:206(base_list::push_back(void*))[0x5f0de1]
sql/sql_select.cc:2370(JOIN::exec())[0x5ee19a]
sql/sql_select.cc:373(handle_select(THD*, LEX*, select_result*, unsigned long))[0x5f1dcd]
sql/sql_parse.cc:5275(execute_sqlcom_select)[0x595bf0]
sql/sql_parse.cc:2562(mysql_execute_command(THD*))[0x598847]
sql/sql_parse.cc:6529(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x59fc86]
sql/sql_parse.cc:1310(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x5a1bb7]
sql/sql_parse.cc:999(do_command(THD*))[0x5a22f9]
sql/sql_connect.cc:1378(do_handle_one_connection(THD*))[0x66c7d4]
sql/sql_connect.cc:1295(handle_one_connection)[0x66c912]
perfschema/pfs.cc:1863(pfs_spawn_thread)[0xa84cd9]
/lib64/libpthread.so.0[0x38ff407a51]
/lib64/libc.so.6(clone+0x6d)[0x38ff0e893d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7facf0586020): is an invalid pointer
Connection ID (thread ID): 744926
Status: NOT_KILLED
DB 2 (Slave -> Master)
180409 20:05:18 [ERROR] mysqld got signal 11 ;
Thread pointer: 0x0x7f6ff9645008
Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong...
stack_bottom = 0x7f70c6d92de0 thread_stack 0x48000
mysys/stacktrace.c:247(my_print_stacktrace)[0xbcb2ee]
sql/signal_handler.cc:153(handle_fatal_signal)[0x71fa8c]
/lib64/libpthread.so.0[0x347aa0f790]
sql/item.cc:2291(Item_field)[0x745742]
sql/item_subselect.cc:877(Item_subselect::get_tmp_table_item(THD*))[0x7ba3d9]
sql/item.cc:7450(Item_ref::get_tmp_table_item(THD*))[0x752703]
sql/sql_list.h:206(base_list::push_back(void*))[0x5f0de1]
sql/sql_select.cc:2370(JOIN::exec())[0x5ee19a]
sql/sql_select.cc:373(handle_select(THD*, LEX*, select_result*, unsigned long))[0x5f1dcd]
sql/sql_parse.cc:5275(execute_sqlcom_select)[0x595bf0]
sql/sql_parse.cc:2562(mysql_execute_command(THD*))[0x598847]
sql/sql_parse.cc:6529(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x59fc86]
sql/sql_parse.cc:1310(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x5a1bb7]
sql/sql_parse.cc:999(do_command(THD*))[0x5a22f9]
sql/sql_connect.cc:1378(do_handle_one_connection(THD*))[0x66c7d4]
sql/sql_connect.cc:1295(handle_one_connection)[0x66c912]
perfschema/pfs.cc:1863(pfs_spawn_thread)[0xa84cd9]
/lib64/libpthread.so.0[0x347aa07a51]
/lib64/libc.so.6(clone+0x6d)[0x347a6e893d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f6fe74a2020): is an invalid pointer
Connection ID (thread ID): 67721
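After the Master DB went down, the Slave DB that took over through failover also went down with the same error.
A closer look at the error log shows the following two frames: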
sql/item_subselect.cc:877(Item_subselect::get_tmp_table_item(THD*))[0x7ba3d9]
sql/item.cc:7450(Item_ref::get_tmp_table_item(THD*))[0x752703]
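Based on these frames, we suspected a tmp_table-related error.
A search of the official MariaDB site confirmed that, in MariaDB 10.0.20 and earlier,
this is a bug that can occur during internal processing such as creating and reading tmp_tables.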
* Related bug reports
https://jira.mariadb.org/browse/MDEV-8525
https://jira.mariadb.org/browse/MDEV-9006
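The frame inspection described in this section can also be done with a small script when the error log is long. The following is only a minimal sketch: the regular expression is inferred from the frame format seen in the logs above (`file:line(symbol)[address]`), and the names `FRAME_RE`, `parse_frames`, and `frames_matching` are hypothetical helpers, not part of any MariaDB tooling.

```python
import re

# Assumed frame format, taken from the logs above:
#   sql/item.cc:7450(Item_ref::get_tmp_table_item(THD*))[0x752703]
# Frames without source information, such as
#   /lib64/libpthread.so.0[0x38ff40f790]
# do not match and are skipped.
FRAME_RE = re.compile(r"(\S+\.(?:cc|c|h)):(\d+)\((.+?)\)\[(0x[0-9a-f]+)\]")


def parse_frames(log_text):
    """Return every source-level stack frame found in the log text."""
    return [
        {
            "file": m.group(1),
            "line": int(m.group(2)),
            "symbol": m.group(3),
            "addr": m.group(4),
        }
        for m in FRAME_RE.finditer(log_text)
    ]


def frames_matching(log_text, keyword):
    """Return only the frames whose symbol contains the keyword."""
    return [f for f in parse_frames(log_text) if keyword in f["symbol"]]


# Excerpt of the DB 1 stack trace shown above.
SAMPLE = """180409 20:04:00 [ERROR] mysqld got signal 11 ;
sql/signal_handler.cc:153(handle_fatal_signal)[0x71fa8c]
/lib64/libpthread.so.0[0x38ff40f790]
sql/item.cc:2291(Item_field)[0x745742]
sql/item_subselect.cc:877(Item_subselect::get_tmp_table_item(THD*))[0x7ba3d9]
sql/item.cc:7450(Item_ref::get_tmp_table_item(THD*))[0x752703]
sql/sql_select.cc:2370(JOIN::exec())[0x5ee19a]
"""

if __name__ == "__main__":
    for frame in frames_matching(SAMPLE, "get_tmp_table_item"):
        print(frame["file"], frame["line"], frame["symbol"])
```

Searching for `get_tmp_table_item` picks out the same two frames quoted above, which makes it easier to compare the traces from both servers and to search the bug tracker by function name.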
--2. Solution
Upgrade the engine to version 10.0.22 or later, where this bug is fixed.
* Upgrade procedure: https://sarc.io/index.php/mariadb/802-maria
* Details on the fix version: https://mariadb.com/kb/en/library/mariadb-10022-changelog/
=> Revision #d88aaaa 2015-10-28 08:34:08 +0100
MDEV-8525 mariadb 10.0.20 crashing when data is read by Kodi media center (http://kodi.tv).
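
When several servers need checking, the "10.0.22 or later" condition can be tested against each server's version string. The sketch below assumes the string starts with `major.minor.patch` (for example `10.0.20-MariaDB-log`, a format assumed here rather than taken from the logs above), and it deliberately covers only the 10.0 series, since this post confirms the fix version for that series only. `parse_version` and `needs_upgrade` are hypothetical helper names.

```python
import re

# First 10.0.x release that, according to the changelog cited above,
# contains the MDEV-8525 fix.
FIXED_VERSION = (10, 0, 22)


def parse_version(version_string):
    """Extract (major, minor, patch) from a string such as '10.0.20-MariaDB-log'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if m is None:
        raise ValueError("unrecognized version string: %r" % version_string)
    return tuple(int(part) for part in m.groups())


def needs_upgrade(version_string, fixed=FIXED_VERSION):
    """Return True if a 10.0.x server is older than the fixed release."""
    version = parse_version(version_string)
    if version[:2] != fixed[:2]:
        # Other release series must be checked against their own changelogs.
        raise ValueError("only the 10.0 series is covered by this check")
    return version < fixed


if __name__ == "__main__":
    for v in ("10.0.20-MariaDB-log", "10.0.22-MariaDB"):
        print(v, "needs upgrade:", needs_upgrade(v))
```

The version string itself would typically come from `SELECT VERSION();` on each server.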