galera cluster 2nd does not start
Hi all,
I am running galera cluster on centos7 and my 2nd node fails to start. 1st node is running [root@os-db1 ~]# cat /var/lib/mysql/grastate.dat # GALERA saved state version: 2.1 uuid: 638dd3db-9ecd-11e7-8c45-136f872211b0 seqno: -1 safe_to_bootstrap: 1 [root@os-db1 ~]systemctl status mariadbat ● mariadb.service - MariaDB database server Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled) Drop-In: /etc/systemd/system/mariadb.service.d └─migrated-from-my.cnf-settings.conf Active: active (running) since Thu 2017-10-05 22:33:34 CEST; 5min ago Process: 1884 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Process: 1772 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS) Process: 1770 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Main PID: 1851 (mysqld) Status: "Taking your SQL requests now..." CGroup: /system.slice/mariadb.service └─1851 /usr/sbin/mysqld --wsrep_start_position=638dd3db-9ecd-11e7-8c45-136f872211b0:1366 Oct 05 22:37:00 os-db1.lab.local mysqld[1851]: group UUID = 638dd3db-9ecd-11e7-8c45-136f872211b0 Oct 05 22:37:00 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:00 140605334218496 [Note] WSREP: Flow-control interval: [16, 16] Oct 05 22:37:00 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:00 140605634771712 [Note] WSREP: New cluster view: global ...sion 3 Oct 05 22:37:00 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:00 140605634771712 [Note] WSREP: wsrep_notify_cmd is not d...ation. Oct 05 22:37:00 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:00 140605634771712 [Note] WSREP: REPL Protocols: 7 (3, 2) Oct 05 22:37:00 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:00 140605634771712 [Note] WSREP: Assign initial position f...ion: 3 Oct 05 22:37:00 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:00 140605392942848 [Note] WSREP: Service thread queue flushed. Oct 05 22:37:04 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:04 140605342611200 [Note] WSREP: (75852029, 'tcp://0.0.0.0...1:4567 Oct 05 22:37:04 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:04 140605342611200 [Warning] WSREP: discarding established...:4567) Oct 05 22:37:05 os-db1.lab.local mysqld[1851]: 2017-10-05 22:37:05 140605342611200 [Note] WSREP: cleaning up ed995d35 (tc...:4567) Hint: Some lines were ellipsized, use -l to show in full. however the 2nd one has still the problem [root@os-db2 ~]# cat /var/lib/mysql/grastate.dat # GALERA saved state version: 2.1 uuid: 638dd3db-9ecd-11e7-8c45-136f872211b0 seqno: -1 safe_to_bootstrap: 1 [root@os-db2 ~]# systemctl status mariadb -l ● mariadb.service - MariaDB database server Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled) Drop-In: /etc/systemd/system/mariadb.service.d └─migrated-from-my.cnf-settings.conf Active: failed (Result: exit-code) since Thu 2017-10-05 22:37:10 CEST; 3min 28s ago Process: 1786 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE) Process: 1710 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS) Process: 1708 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Main PID: 1786 (code=exited, status=1/FAILURE) Status: "MariaDB server is down" Oct 05 22:36:57 os-db2.lab.local mysqld[1786]: 2017-10-05 22:36:57 140047168820992 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_rsync --role 'joiner' --address '10.0.10.21' --datadir '/var/lib/mysql/' --parent '1786' '' Oct 05 22:36:57 os-db2.lab.local mysqld[1786]: Read: '(null)' Oct 05 22:36:57 os-db2.lab.local mysqld[1786]: 2017-10-05 22:36:57 140047168820992 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '10.0.10.21' --datadir '/var/lib/mysql/' --parent '1786' '' : 2 (No such file or directory) Oct 05 22:36:57 os-db2.lab.local mysqld[1786]: 2017-10-05 22:36:57 140047479429888 [ERROR] WSREP: Failed to prepare for 'rsync' SST. Unrecoverable. Oct 05 22:36:57 os-db2.lab.local mysqld[1786]: 2017-10-05 22:36:57 140047479429888 [ERROR] Aborting Oct 05 22:37:10 os-db2.lab.local mysqld[1786]: Error in my_thread_global_end(): 1 threads didn't exit Oct 05 22:37:10 os-db2.lab.local systemd[1]: mariadb.service: main process exited, code=exited, status=1/FAILURE Oct 05 22:37:10 os-db2.lab.local systemd[1]: Failed to start MariaDB database server. Oct 05 22:37:10 os-db2.lab.local systemd[1]: Unit mariadb.service entered failed state. Oct 05 22:37:10 os-db2.lab.local systemd[1]: mariadb.service failed. cat /var/log/messages Oct 5 22:41:33 os-db2 systemd: Starting Cleanup of Temporary Directories... Oct 5 22:41:33 os-db2 systemd: Started Cleanup of Temporary Directories. Oct 5 22:41:33 os-db2 systemd: Starting MariaDB database server... Oct 5 22:41:33 os-db2 sh: 2017-10-05 22:41:33 139893839038720 [Note] /usr/sbin/mysqld (mysqld 10.1.26-MariaDB) starting as process 1853 ... Oct 5 22:41:36 os-db2 sh: WSREP: Recovered position 638dd3db-9ecd-11e7-8c45-136f872211b0:1083 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] /usr/sbin/mysqld (mysqld 10.1.26-MariaDB) starting as process 1885 ... Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: Read nil XID from storage engines, skipping position init Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so' Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: wsrep_load(): Galera 25.3.20(r3703) by Codership Oy <info@codership.com> loaded successfully. Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: CRC-32C: using hardware acceleration. Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: Found saved state: 638dd3db-9ecd-11e7-8c45-136f872211b0:-1, safe_to_bootsrap: 1 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 10.0.10.21; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ign Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: GCache history reset: old(638dd3db-9ecd-11e7-8c45-136f872211b0:0) -> new(638dd3db-9ecd-11e7-8c45-136f872211b0:1083) Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: Assign initial position for certification: 1083, protocol version: -1 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: wsrep_sst_grab() Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: Start replication Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: Setting initial position to 638dd3db-9ecd-11e7-8c45-136f872211b0:1083 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: protonet asio version 0 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: Using CRC-32C for message checksums. Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: backend: asio Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: gcomm thread scheduling priority set to other:0 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory) Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: restore pc from disk failed Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: GMCast version 0 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: (94ec7371, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: (94ec7371, 'tcp://0.0.0.0:4567') multicast: , ttl: 1 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: EVS version 0 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: gcomm: connecting to group 'galera', peer '10.0.10.20:,10.0.10.21:' Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: (94ec7371, 'tcp://0.0.0.0:4567') connection established to 94ec7371 tcp://10.0.10.21:4567 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Warning] WSREP: (94ec7371, 'tcp://0.0.0.0:4567') address 'tcp://10.0.10.21:4567' points to own listening address, blacklisting Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: (94ec7371, 'tcp://0.0.0.0:4567') connection established to 75852029 tcp://10.0.10.20:4567 Oct 5 22:41:36 os-db2 mysqld: 2017-10-05 22:41:36 140106319649024 [Note] WSREP: (94ec7371, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: declaring 75852029 at tcp://10.0.10.20:4567 stable Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: Node 75852029 state prim Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: view(view_id(PRIM,75852029,6) memb { Oct 5 22:41:37 os-db2 mysqld: 75852029,0 Oct 5 22:41:37 os-db2 mysqld: 94ec7371,0 Oct 5 22:41:37 os-db2 mysqld: } joined { Oct 5 22:41:37 os-db2 mysqld: } left { Oct 5 22:41:37 os-db2 mysqld: } partitioned { Oct 5 22:41:37 os-db2 mysqld: }) Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: save pc into disk Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: gcomm: connected Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636 Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0) Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: Opened channel 'galera' Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319649024 [Note] WSREP: Waiting for SST to complete. Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2 Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID. Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: STATE EXCHANGE: sent state msg: 9587a7c7-aa0d-11e7-b1c3-377185d9ee32 Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: STATE EXCHANGE: got state msg: 9587a7c7-aa0d-11e7-b1c3-377185d9ee32 from 0 (galera1) Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: STATE EXCHANGE: got state msg: 9587a7c7-aa0d-11e7-b1c3-377185d9ee32 from 1 (galera2) Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: Quorum results: Oct 5 22:41:37 os-db2 mysqld: version = 4, Oct 5 22:41:37 os-db2 mysqld: component = PRIMARY, Oct 5 22:41:37 os-db2 mysqld: conf_id = 5, Oct 5 22:41:37 os-db2 mysqld: members = 1/2 (joined/total), Oct 5 22:41:37 os-db2 mysqld: act_id = 1414, Oct 5 22:41:37 os-db2 mysqld: last_appl. = -1, Oct 5 22:41:37 os-db2 mysqld: protocols = 0/7/3 (gcs/repl/appl), Oct 5 22:41:37 os-db2 mysqld: group UUID = 638dd3db-9ecd-11e7-8c45-136f872211b0 Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: Flow-control interval: [23, 23] Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106019104512 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 1414) Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319330048 [Note] WSREP: State transfer required: Oct 5 22:41:37 os-db2 mysqld: Group state: 638dd3db-9ecd-11e7-8c45-136f872211b0:1414 Oct 5 22:41:37 os-db2 mysqld: Local state: 638dd3db-9ecd-11e7-8c45-136f872211b0:1083 Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319330048 [Note] WSREP: New cluster view: global state: 638dd3db-9ecd-11e7-8c45-136f872211b0:1414, view# 6: Primary, number of nodes: 2, my index: 1, protocol version 3 Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319330048 [Warning] WSREP: Gap in state sequence. Need state transfer. Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106010711808 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.0.10.21' --datadir '/var/lib/mysql/' --parent '1885' '' ' Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106010711808 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_rsync --role 'joiner' --address '10.0.10.21' --datadir '/var/lib/mysql/' --parent '1885' '' Oct 5 22:41:37 os-db2 mysqld: Read: '(null)' Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106010711808 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '10.0.10.21' --datadir '/var/lib/mysql/' --parent '1885' '' : 2 (No such file or directory) Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319330048 [ERROR] WSREP: Failed to prepare for 'rsync' SST. Unrecoverable. Oct 5 22:41:37 os-db2 mysqld: 2017-10-05 22:41:37 140106319330048 [ERROR] Aborting Oct 5 22:41:51 os-db2 mysqld: Error in my_thread_global_end(): 1 threads didn't exit Oct 5 22:41:51 os-db2 systemd: mariadb.service: main process exited, code=exited, status=1/FAILURE Oct 5 22:41:51 os-db2 systemd: Failed to start MariaDB database server. Oct 5 22:41:51 os-db2 systemd: Unit mariadb.service entered failed state. Oct 5 22:41:51 os-db2 systemd: mariadb.service failed. Many thanks! |
All times are GMT -5. The time now is 01:48 PM. |