GBase 8a 对主副本不一致时的处理方案测试 - 墨天轮 (2024)

由于某些原因,主副本出现了不一致,包括表结构,数量等。GBase 8a的新版本,对这种情况做了处理,本文介绍相关参数和测试用例。

总结

经过测试,GBase 8a的新功能,可以在表数据行数不一致,表结构不一致,表SCN号不一致等多种情况下,自动判断并恢复。

不一致原因

从个人项目看,出现过如下几种情况,导致主副本不一致

参数不同

数据库某些参数不同,导致主副本的不一致。由于参数都是本地配置,其一致性需要用户自行保障。 实例请看参考的文章。

服务器断电、死机

对于物理机,一般是RAID卡有问题,包括驱动BUG,无电池等。 对于虚拟机,云主机,实例异常退出,断电等。

GBase 8a在写入数据时,是直接方式(direct io), 不是缓冲写(先写到操作系统的脏数据 Dirty data,再依赖操作系统的刷新),所以只有写入返回成功时,数据库内部才认定写入成功。但如果底层环境出现了上述的问题,则成功写入的数据,并不能保证一定刷到磁盘上了。

手工误操作

最常见的情况是节点【临时】故障期间,不小心清掉了该节点的Event。

参数

gcluster_suffix_consistency_resolve

默认值为0,不解决分片一致性问题。 设置为1,尝试解决。

该参数支持session级和global级。

注意:

  • 该参数为近期(2022-09-02)新增加的,您如果要使用,需要看版本是否支持。目前应覆盖到862Build33、862Build43和9.5系列。
  • 该功能最少要3个主机节点才可以。2节点集群不保证所有功能可以工作。

测试环境

862Build33的新版本,3节点集群。

节点

[gbase@862b43 gcinstall_862.33.35]$ gcadminCLUSTER STATE: ACTIVECLUSTER MODE: NORMAL==========================================================| GBASE COORDINATOR CLUSTER INFORMATION |==========================================================| NodeName | IpAddress |gcware |gcluster |DataState |----------------------------------------------------------| coordinator1 | 10.0.2.106 | OPEN | OPEN | 0 |----------------------------------------------------------| coordinator2 | 10.0.2.107 | OPEN | OPEN | 0 |----------------------------------------------------------| coordinator3 | 10.0.2.108 | OPEN | OPEN | 0 |----------------------------------------------------------======================================================| GBASE DATA CLUSTER INFORMATION |======================================================|NodeName | IpAddress |gnode |syncserver |DataState |------------------------------------------------------| node1 | 10.0.2.106 | OPEN | OPEN | 0 |------------------------------------------------------| node2 | 10.0.2.107 | OPEN | OPEN | 0 |------------------------------------------------------| node3 | 10.0.2.108 | OPEN | OPEN | 0 |------------------------------------------------------

软件

[gbase@862b43 gcluster]$ gccli testdbGBase client 8.6.2.33-R48.274e5f5e. Copyright (c) 2004-2022, GBase. All Rights Reserved.gbase> select version();+-----------------------+| version() |+-----------------------+| 8.6.2.33-R48.274e5f5e |+-----------------------+1 row in set (Elapsed: 00:00:00.01)

数据

2字段的分布表,方便观察节点的数据。

初始数据

手工插入9行数据,确认每个分片都有数据。

gbase> create database testdb;Query OK, 1 row affected (Elapsed: 00:00:00.03)gbase> use testdb;Query OK, 0 rows affected (Elapsed: 00:00:00.01)gbase> create table t1(id int, name varchar(100))distributed by('id');Query OK, 0 rows affected (Elapsed: 00:00:00.26)gbase> insert into t1 values(1,1111),(2,2222),(3,33333),(4,44444),(5,5555),(6,66666),(7,777777),(8,8888),(9,9999);Query OK, 9 rows affected (Elapsed: 00:00:00.26)Records: 9 Duplicates: 0 Warnings: 0gbase> select * from t1;+------+--------+| id | name |+------+--------+| 4 | 44444 || 6 | 66666 || 7 | 777777 || 1 | 1111 || 3 | 33333 || 5 | 5555 || 9 | 9999 || 2 | 2222 || 8 | 8888 |+------+--------+9 rows in set (Elapsed: 00:00:00.01)

检查分片,每个都有数据

[gbase@862b43 gcinstall_862.33.35]$ gncli testbERROR 1049 (42000): Unknown database 'testb'[gbase@862b43 gcinstall_862.33.35]$ gncli testdbGBase client 8.6.2.33-R48.274e5f5e. Copyright (c) 2004-2022, GBase. All Rights Reserved.gbase> select * from t1_n1;+------+--------+| id | name |+------+--------+| 4 | 44444 || 6 | 66666 || 7 | 777777 |+------+--------+3 rows in set (Elapsed: 00:00:00.00)gbase> select * from t1_n2;ERROR 1146 (42S02): Table 'testdb.t1_n2' doesn't existgbase> select * from t1_n3;+------+-------+| id | name |+------+-------+| 1 | 1111 || 3 | 33333 || 5 | 5555 || 9 | 9999 |+------+-------+4 rows in set (Elapsed: 00:00:00.00)gbase> ^CAborted[gbase@862b43 gcinstall_862.33.35]$ gncli testdb -h 10.0.2.107GBase client 8.6.2.33-R48.274e5f5e. Copyright (c) 2004-2022, GBase. All Rights Reserved.gbase> select * from t1_n2;+------+------+| id | name |+------+------+| 2 | 2222 || 8 | 8888 |+------+------+2 rows in set (Elapsed: 00:00:00.01)

加载数据

6行数据

[gbase@862b43 ~]$ cat t1.txt1,11113,33332,22224,44445,55556,6666

测试分片行数不一致场景

构造故障

手工向分片插入数据。注意SCN号要保持一致。

[gbase@862b43 gcluster]$ gncli testdbGBase client 8.6.2.33-R48.274e5f5e. Copyright (c) 2004-2022, GBase. All Rights Reserved.gbase> select scn from information_schema.tables where table_name='t1_n1';+------+| scn |+------+| 22 |+------+1 row in set (Elapsed: 00:00:00.00)gbase> SET SELF SCN =22;Query OK, 0 rows affected (Elapsed: 00:00:00.00)gbase> select * from t1_n1;+------+--------+| id | name |+------+--------+| 4 | 44444 || 6 | 66666 || 7 | 777777 || 4 | 4444 || 6 | 6666 || 4 | 4444 || 6 | 6666 || 4 | 4444 || 6 | 6666 |+------+--------+9 rows in set (Elapsed: 00:00:00.01)gbase> insert into t1_n1 values(4,4444444444444);Query OK, 1 row affected (Elapsed: 00:00:00.05)Records: 1 Duplicates: 0 Warnings: 0 Total: 10gbase> select scn from information_schema.tables where table_name='t1_n1';+------+| scn |+------+| 22 |+------+1 row in set (Elapsed: 00:00:00.00)

加载测试

先连续加载2次,均报错。设置参数后,再加载一次,报错,后台会自动同步,再加载成功。

注意报错内容是:Column 0, no_obj: 10 — 9 行数不同

gbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 24 failed, [10.0.2.107:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) Column 0, no_obj: 10 -- 9SQL: LOAD /*+ TID('168') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TEgbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 25 failed, [10.0.2.106:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) Column 0, no_obj: 10 -- 9SQL: LOAD /*+ TID('170') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TEgbase> set gcluster_suffix_consistency_resolve=1;Query OK, 0 rows affected (Elapsed: 00:00:00.01)gbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 26 failed, [10.0.2.106:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) Column 0, no_obj: 10 -- 9SQL: LOAD /*+ TID('173') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TEgbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';Query OK, 6 rows affected (Elapsed: 00:00:00.50)Task 27 finished, Loaded 6 records, Skipped 0 recordsgbase>

查看日志

确认有不一致处理和同步信息

连接节点 106 gcluster的express.log

注意其中的SUFFIX_INCONSISTENCY_JUDGER部分,设置了storageevent 4

2022-09-02 10:30:47.823 [SQLDISP][ERROR][S:74][Q:176]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.106:5050->98) reason: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) Column 0, no_obj: 10 -- 9 SQL(LOAD /*+ TID('170') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.106' CURRENT_TIMESTAMP 1662085847 SCN_NUMBER 25 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 25, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 12022-09-02 10:30:47.847 [LOAD][ERROR][S:74][Q:176]:Task 25 failed, [10.0.2.106:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) Column 0, no_obj: 10 -- 9SQL: LOAD /*+ TID('170') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.106' CURRENT_TIMESTAMP 1662085847 SCN_NUMBER 25 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 25, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 ... 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 12022-09-02 10:30:47.973 [SQLDISP][ERROR][S:74][Q:176]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.106:5050->92) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(25/cur:22) SQL(revert table "testdb"."t1_n1" scn_number 25)2022-09-02 10:30:47.983 [SQLDISP][ERROR][S:74][Q:176]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.107:5050->96) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(25/cur:22) SQL(revert table "testdb"."t1_n1" scn_number 25)2022-09-02 10:30:47.983 [SQLDISP][ERROR][S:74][Q:176]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.108:5050->86) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(25/cur:22) SQL(revert table "testdb"."t1_n2" scn_number 25)2022-09-02 10:30:47.983 [SQLDISP][ERROR][S:74][Q:176]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.107:5050->97) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(25/cur:22) SQL(revert table "testdb"."t1_n2" scn_number 25)2022-09-02 10:30:47.983 [SQLDISP][ERROR][S:74][Q:176]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.108:5050->87) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(25/cur:22) SQL(revert table "testdb"."t1_n3" scn_number 25)2022-09-02 10:30:47.983 [SQLDISP][ERROR][S:74][Q:176]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.106:5050->93) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(25/cur:22) SQL(revert table "testdb"."t1_n3" scn_number 25)2022-09-02 10:30:47.983 [COMMIT][WARN ][S:74][Q:176]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 10:30:47.983 [COMMIT][WARN ][S:74][Q:176]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 10:30:47.983 [COMMIT][WARN ][S:74][Q:176]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 10:30:47.983 [COMMIT][WARN ][S:74][Q:176]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 10:30:47.983 [COMMIT][WARN ][S:74][Q:176]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 10:30:47.983 [COMMIT][WARN ][S:74][Q:176]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 10:30:47.996 [COMMIT][INFO ][S:74][Q:176]:rollback table(testdb.t1) success for scn:25.2022-09-02 10:30:56.316 [SQLDISP][ERROR][S:74][Q:178]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.106:5050->98) reason: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) Column 0, no_obj: 10 -- 9 SQL(LOAD /*+ TID('173') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.106' CURRENT_TIMESTAMP 1662085856 SCN_NUMBER 26 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 26, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 12022-09-02 10:30:56.349 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:74][Q:178]:row count: 10.0.2.106:10 < 10.0.2.107:92022-09-02 10:30:56.349 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:74][Q:178]:host 10.0.2.106 table:testdb.t1_n1 judeged by compare row count.2022-09-02 10:30:56.363 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:74][Q:178]:Set testdb.t1 suffix n1 StorageEvent successfully, event id is 42022-09-02 10:30:56.364 [LOAD][ERROR][S:74][Q:178]:Task 26 failed, [10.0.2.106:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) Column 0, no_obj: 10 -- 9SQL: LOAD /*+ TID('173') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.106' CURRENT_TIMESTAMP 1662085856 SCN_NUMBER 26 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 26, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 ... 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 12022-09-02 10:30:56.488 [SQLDISP][ERROR][S:74][Q:178]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.106:5050->92) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(26/cur:22) SQL(revert table "testdb"."t1_n1" scn_number 26)2022-09-02 10:30:56.499 [SQLDISP][ERROR][S:74][Q:178]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.107:5050->96) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(26/cur:22) SQL(revert table "testdb"."t1_n1" scn_number 26)2022-09-02 10:30:56.499 [SQLDISP][ERROR][S:74][Q:178]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.108:5050->86) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(26/cur:22) SQL(revert table "testdb"."t1_n2" scn_number 26)2022-09-02 10:30:56.499 [SQLDISP][ERROR][S:74][Q:178]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.107:5050->97) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(26/cur:22) SQL(revert table "testdb"."t1_n2" scn_number 26)2022-09-02 10:30:56.499 [SQLDISP][ERROR][S:74][Q:178]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.108:5050->87) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(26/cur:22) SQL(revert table "testdb"."t1_n3" scn_number 26)2022-09-02 10:30:56.499 [SQLDISP][ERROR][S:74][Q:178]:Query failed, SessionCtx(0x5390000) HOST(10.0.2.106:5050->93) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(26/cur:22) SQL(revert table "testdb"."t1_n3" scn_number 26)2022-09-02 10:30:56.499 [COMMIT][WARN ][S:74][Q:178]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 10:30:56.499 [COMMIT][WARN ][S:74][Q:178]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 10:30:56.499 [COMMIT][WARN ][S:74][Q:178]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 10:30:56.499 [COMMIT][WARN ][S:74][Q:178]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 10:30:56.499 [COMMIT][WARN ][S:74][Q:178]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 10:30:56.499 [COMMIT][WARN ][S:74][Q:178]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 10:30:56.512 [COMMIT][INFO ][S:74][Q:178]:rollback table(testdb.t1) success for scn:26.2022-09-02 10:30:58.208 [LOAD][WARN ][S:74][Q:179]:slice(n1) on node(10.0.2.106) is invalid.2022-09-02 10:30:58.480 [DEF ] [S:74][Q:179][load]Node(10.0.2.106)testdb.t1[n1][scn:27]'s data state fails2022-09-02 10:30:58.480 [DEF ] [S:74][Q:179]Set data(testdb.t1:1) to LOCKED2022-09-02 10:30:58.500 [DEF ] [S:74][Q:179]Set data(testdb.t1:normal) to LOCKED, event id is 42022-09-02 10:30:58.670 [DEF ] [S:74][Q:179][load]Node(10.0.2.106)testdb.t1[n1][scn:27]'s data state fails2022-09-02 10:30:58.670 [DEF ] [S:74][Q:179]Set data(testdb.t1:1) to LOCKED2022-09-02 10:30:58.687 [DEF ] [S:74][Q:179]Set data(testdb.t1:normal) to LOCKED, event id is 4

gc_recovery.log

2022-09-02 10:31:17.789 [INFO ] <RECOVER-INFO-1>: Start Recovering testdb.t1 tid 72022-09-02 10:31:17.789 [INFO ] <DATA-Recover-1>: Start dml recover testdb.t1,tid 72022-09-02 10:31:17.789 [INFO ] <DATA-Recover-1>: Start to dml recover of eventid(4)2022-09-02 10:31:17.801 [INFO ] <GCWare>: GetSourceNodeForDMl for nodeId(1778515978), the node is unValid2022-09-02 10:31:17.802 [INFO ] <DATA-Recover-1>: DoNormSqlNoResult, do(flush rollback `testdb`.`t1_n1`)2022-09-02 10:31:17.805 [INFO ] <DATA-Recover-1>: recoverinfo gn dml: eventid:4 DB(testdb), TABLE(t1), SLICE(1), TID(7), src:0x1795293194(10.0.2.107) dst:0x1778515978(10.0.2.106)2022-09-02 10:31:17.805 [INFO ] <CLockResource>: Lock: testdb start2022-09-02 10:31:17.806 [INFO ] <CLockResource>: Lock: testdb end success2022-09-02 10:31:17.806 [INFO ] <CLockResource>: Lock: testdb.t1.rsync start2022-09-02 10:31:17.807 [INFO ] <CLockResource>: Lock: testdb.t1.rsync end success2022-09-02 10:31:17.807 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA start2022-09-02 10:31:17.809 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA end success2022-09-02 10:31:17.809 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B start2022-09-02 10:31:17.811 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B end success2022-09-02 10:31:17.811 [INFO ] <DATA-Recover-1>: DO_DATA_SYNC, Lock_for_ddl2022-09-02 10:31:17.811 [INFO ] <DATA-Recover-1>: SYNC_SLICE, DISTRIBUTED TABLE: DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 10:31:17.811 [INFO ] <GCWare>: testdb.t1 nodeid: 1778515978, segid: 1, tableid 7 have dmlstorageevent,total num 1,lasteventid: 42022-09-02 10:31:17.812 [INFO ] <DATA-Recover-1>: SYNC_SLICE, Find it is a meta broken table testdb.t1,and the event will be given up,event id 42022-09-02 10:31:17.825 [INFO ] <DATA-Recover-1>: SET_ONLINE, Succeeded in setting local node table 'testdb.t1_n1' online, eventinfo(eventid:4,segId:1)2022-09-02 10:31:17.826 [INFO ] <CLockResource>: UnLock: testdb unlock success2022-09-02 10:31:17.828 [INFO ] <CLockResource>: UnLock: testdb.t1.rsync unlock success2022-09-02 10:31:17.829 [INFO ] <CLockResource>: UnLock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA unlock success2022-09-02 10:31:17.831 [INFO ] <CLockResource>: UnLock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B unlock success2022-09-02 10:31:17.832 [INFO ] <DATA-Recover-1>: End dml recover testdb.t1,tid 72022-09-02 10:31:17.832 [INFO ] <STORAGE-Recover-1>: Start DMLStorge recover testdb.t1,tid 7 eventnum 12022-09-02 10:31:17.832 [INFO ] <STORAGE-Recover-1>: Start to DMLStorge recover of eventid(4)2022-09-02 10:31:17.833 [INFO ] <STORAGE-Recover-1>: recoverinfo gn(dmls): eventid=4, tablename=testdb.t1, slice:n1 src:0x1795293194(10.0.2.106) dst:0x1778515978(10.0.2.106)2022-09-02 10:31:17.833 [INFO ] <STORAGE-Recover-1>: Revovering ..., for dmlstorage, forgcluster=false, dbname=testdb, tablename=t1, slice=n1, tid=7, eventid=42022-09-02 10:31:17.846 [INFO ] <CLockResource>: Lock: testdb start2022-09-02 10:31:17.848 [INFO ] <CLockResource>: Lock: testdb end success2022-09-02 10:31:17.848 [INFO ] <CLockResource>: Lock: testdb.t1.rsync start2022-09-02 10:31:17.850 [INFO ] <CLockResource>: Lock: testdb.t1.rsync end success2022-09-02 10:31:17.850 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA start2022-09-02 10:31:17.852 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA end success2022-09-02 10:31:17.852 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B start2022-09-02 10:31:17.853 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B end success2022-09-02 10:31:17.860 [INFO ] <STORAGE-Recover-1>: BackUpTable, meta broken table(t1_n1:7) has been deleted and backuped success2022-09-02 10:31:17.896 [INFO ] <STORAGE-Recover-1>: CreateFailNodeField, do "CREATE TABLE /*+ distribution(1)*/ `testdb`.`t1_n1`( "id" int(11) DEFAULT NULL, "name" varchar(100) DEFAULT NULL) ENGINE=EXPRESS TID(8) UID(1) DEFAULT CHARSET=utf8 TABLESPACE='sys_tablespace' COLUMN_IDS(0, 1) TID(8)" success2022-09-02 10:31:17.896 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGroup, do CreateFailNodeField success2022-09-02 10:31:17.896 [INFO ] <STORAGE-Recover-1>: DO_DATA_SYNC, Lock_for_ddl2022-09-02 10:31:17.896 [INFO ] <STORAGE-Recover-1>: SYNC_SLICE, DISTRIBUTED TABLE: DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 10:31:17.896 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Syncing table slice{DB: 'testdb', Table: 't1', Slice: 'n1'} from sample gnode: '10.0.2.107'2022-09-02 10:31:17.896 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync command line: 'start 'gc_sync_client' '10.0.2.107 testdb t1_n1 10.0.2.106 5258 3 1''2022-09-02 10:31:18.482 [INFO ] <STORAGE-Recover-1>: check drop sql thread quit.2022-09-02 10:31:18.482 [INFO ] <STORAGE-Recover-1>: start_sync_by_node, sync table{start 'gc_sync_client' '10.0.2.107 testdb t1_n1 10.0.2.106 5258 3 1'} returned with (0)2022-09-02 10:31:18.482 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync client executed successfully2022-09-02 10:31:18.482 [INFO ] <STORAGE-Recover-1>: REFRESH_TABLE, Refreshing local gnode, DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 10:31:18.482 [INFO ] <STORAGE-Recover-1>: REFRESH_TABLE, Refresh query: 'refresh table `testdb`.`t1_n1`;'2022-09-02 10:31:18.483 [INFO ] <STORAGE-Recover-1>: Sync cl2022-09-02 10:31:18.483 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync end, return 02022-09-02 10:31:18.483 [INFO ] <CLockResource>: Lock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F start2022-09-02 10:31:18.485 [INFO ] <CLockResource>: Lock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F end success2022-09-02 10:31:18.486 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT,Checking whether SYNC is succeeded{DB: 'testdb', Table: 't1', slice: 'n1'}2022-09-02 10:31:18.489 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, SCN of table slice of local GNode and sample GNode(number of working nodes: 2)2022-09-02 10:31:18.489 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, Local GNode: 272022-09-02 10:31:18.489 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, Sample GNode[1]: 272022-09-02 10:31:18.489 [INFO ] <CLockResource>: UnLock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F unlock success2022-09-02 10:31:18.490 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGroup, do CallSync success2022-09-02 10:31:18.490 [INFO ] <CLockResource>: UnLock: testdb unlock success2022-09-02 10:31:18.492 [INFO ] <CLockResource>: UnLock: testdb.t1.rsync unlock success2022-09-02 10:31:18.493 [INFO ] <CLockResource>: UnLock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA unlock success2022-09-02 10:31:18.494 [INFO ] <CLockResource>: UnLock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B unlock success2022-09-02 10:31:18.502 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGeoup success!2022-09-02 10:31:18.502 [INFO ] <STORAGE-Recover-1>: delete DMLStorge Redo Log start, eventId=4.............2022-09-02 10:31:18.510 [INFO ] <STORAGE-Recover-1>: delete DMLStorge Redo Log end, eventId=4.............2022-09-02 10:31:18.512 [INFO ] <RECOVER-INFO-1>: Finishing Recovering testdb.t1,tid 7

测试主备分片表结构不一致

故障模拟

手工增加一个列。 注意SCN号要保持一致。

gbase> select scn from information_schema.tables where table_name='t1_n1';+------+| scn |+------+| 27 |+------+1 row in set (Elapsed: 00:00:00.00)gbase> desc t1_n1;+-------+--------------+------+-----+---------+-------+| Field | Type | Null | Key | Default | Extra |+-------+--------------+------+-----+---------+-------+| id | int(11) | YES | | NULL | || name | varchar(100) | YES | | NULL | |+-------+--------------+------+-----+---------+-------+2 rows in set (Elapsed: 00:00:00.00)gbase> SET SELF SCN =27;Query OK, 0 rows affected (Elapsed: 00:00:00.00)gbase> alter table t1_n1 add column age int;Query OK, 11 rows affected (Elapsed: 00:00:00.08)Records: 11 Duplicates: 0 Warnings: 0gbase> desc t1_n1;+-------+--------------+------+-----+---------+-------+| Field | Type | Null | Key | Default | Extra |+-------+--------------+------+-----+---------+-------+| id | int(11) | YES | | NULL | || name | varchar(100) | YES | | NULL | || age | int(11) | YES | | NULL | |+-------+--------------+------+-----+---------+-------+3 rows in set (Elapsed: 00:00:00.00)gbase> select scn from information_schema.tables where table_name='t1_n1';+------+| scn |+------+| 27 |+------+1 row in set (Elapsed: 00:00:00.00)

加载测试

方法一样。在设置参数后,第一次加载失败,数据库会设置标志来自动恢复,第二次加载成功。

注意报错信息:column count: 3 — 2, 表示列数不一样。

gbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 28 failed, [10.0.2.107:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) column count: 3 -- 2SQL: LOAD /*+ TID('181') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINAgbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 29 failed, [10.0.2.107:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) column count: 3 -- 2SQL: LOAD /*+ TID('183') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINAgbase> set gcluster_suffix_consistency_resolve=1;Query OK, 0 rows affected (Elapsed: 00:00:00.00)gbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 30 failed, [10.0.2.106:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) column count: 3 -- 2SQL: LOAD /*+ TID('186') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINAgbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';Query OK, 6 rows affected (Elapsed: 00:00:00.52)Task 31 finished, Loaded 6 records, Skipped 0 records

查看日志

连接节点 106 gcluster的express.log

注意其中的SUFFIX_INCONSISTENCY_JUDGER部分,设置了storageevent 5

2022-09-02 10:38:15.969 [SQLDISP][ERROR][S:78][Q:192]:Query failed, SessionCtx(0x5764000) HOST(10.0.2.106:5050->132) reason: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) column count: 3 -- 2 SQL(LOAD /*+ TID('186') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.106' CURRENT_TIMESTAMP 1662086295 SCN_NUMBER 30 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 30, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 02022-09-02 10:38:15.996 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:78][Q:192]:The same with: [10.0.2.107_n1][10.0.2.108_n2]2022-09-02 10:38:15.996 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:78][Q:192]:host 10.0.2.106 table:testdb.t1_n1 judeged by compare table defination.2022-09-02 10:38:16.010 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:78][Q:192]:Set testdb.t1 suffix n1 StorageEvent successfully, event id is 52022-09-02 10:38:16.011 [LOAD][ERROR][S:78][Q:192]:Task 30 failed, [10.0.2.106:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) column count: 3 -- 2SQL: LOAD /*+ TID('186') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.106' CURRENT_TIMESTAMP 1662086295 SCN_NUMBER 30 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 30, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 ... 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 12022-09-02 10:38:16.135 [SQLDISP][ERROR][S:78][Q:192]:Query failed, SessionCtx(0x5764000) HOST(10.0.2.106:5050->120) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(30/cur:27) SQL(revert table "testdb"."t1_n1" scn_number 30)2022-09-02 10:38:16.144 [SQLDISP][ERROR][S:78][Q:192]:Query failed, SessionCtx(0x5764000) HOST(10.0.2.107:5050->123) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(30/cur:27) SQL(revert table "testdb"."t1_n1" scn_number 30)2022-09-02 10:38:16.144 [SQLDISP][ERROR][S:78][Q:192]:Query failed, SessionCtx(0x5764000) HOST(10.0.2.108:5050->108) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(30/cur:27) SQL(revert table "testdb"."t1_n2" scn_number 30)2022-09-02 10:38:16.144 [SQLDISP][ERROR][S:78][Q:192]:Query failed, SessionCtx(0x5764000) HOST(10.0.2.107:5050->124) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(30/cur:27) SQL(revert table "testdb"."t1_n2" scn_number 30)2022-09-02 10:38:16.144 [SQLDISP][ERROR][S:78][Q:192]:Query failed, SessionCtx(0x5764000) HOST(10.0.2.108:5050->109) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(30/cur:27) SQL(revert table "testdb"."t1_n3" scn_number 30)2022-09-02 10:38:16.145 [SQLDISP][ERROR][S:78][Q:192]:Query failed, SessionCtx(0x5764000) HOST(10.0.2.106:5050->121) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(30/cur:27) SQL(revert table "testdb"."t1_n3" scn_number 30)2022-09-02 10:38:16.145 [COMMIT][WARN ][S:78][Q:192]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 10:38:16.145 [COMMIT][WARN ][S:78][Q:192]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 10:38:16.145 [COMMIT][WARN ][S:78][Q:192]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 10:38:16.145 [COMMIT][WARN ][S:78][Q:192]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 10:38:16.145 [COMMIT][WARN ][S:78][Q:192]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 10:38:16.145 [COMMIT][WARN ][S:78][Q:192]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 10:38:16.158 [COMMIT][INFO ][S:78][Q:192]:rollback table(testdb.t1) success for scn:30.2022-09-02 10:38:17.402 [LOAD][WARN ][S:78][Q:193]:slice(n1) on node(10.0.2.106) is invalid.2022-09-02 10:38:17.678 [DEF ] [S:78][Q:193][load]Node(10.0.2.106)testdb.t1[n1][scn:31]'s data state fails2022-09-02 10:38:17.678 [DEF ] [S:78][Q:193]Set data(testdb.t1:1) to LOCKED2022-09-02 10:38:17.700 [DEF ] [S:78][Q:193]Set data(testdb.t1:normal) to LOCKED, event id is 52022-09-02 10:38:17.881 [DEF ] [S:78][Q:193][load]Node(10.0.2.106)testdb.t1[n1][scn:31]'s data state fails2022-09-02 10:38:17.881 [DEF ] [S:78][Q:193]Set data(testdb.t1:1) to LOCKED2022-09-02 10:38:17.897 [DEF ] [S:78][Q:193]Set data(testdb.t1:normal) to LOCKED, event id is 5

gc_recovery

2022-09-02 10:38:18.632 [INFO ] <RECOVER-INFO-1>: Start Recovering testdb.t1 tid 72022-09-02 10:38:18.633 [INFO ] <DATA-Recover-1>: Start dml recover testdb.t1,tid 72022-09-02 10:38:18.633 [INFO ] <DATA-Recover-1>: Start to dml recover of eventid(5)2022-09-02 10:38:18.642 [INFO ] <GCWare>: GetSourceNodeForDMl for nodeId(1778515978), the node is unValid2022-09-02 10:38:18.642 [INFO ] <DATA-Recover-1>: DoNormSqlNoResult, do(flush rollback `testdb`.`t1_n1`)2022-09-02 10:38:18.645 [INFO ] <DATA-Recover-1>: recoverinfo gn dml: eventid:5 DB(testdb), TABLE(t1), SLICE(1), TID(7), src:0x1795293194(10.0.2.107) dst:0x1778515978(10.0.2.106)2022-09-02 10:38:18.646 [INFO ] <CLockResource>: Lock: testdb start2022-09-02 10:38:18.647 [INFO ] <CLockResource>: Lock: testdb end success2022-09-02 10:38:18.647 [INFO ] <CLockResource>: Lock: testdb.t1.rsync start2022-09-02 10:38:18.648 [INFO ] <CLockResource>: Lock: testdb.t1.rsync end success2022-09-02 10:38:18.648 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA start2022-09-02 10:38:18.650 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA end success2022-09-02 10:38:18.650 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B start2022-09-02 10:38:18.651 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B end success2022-09-02 10:38:18.652 [INFO ] <DATA-Recover-1>: DO_DATA_SYNC, Lock_for_ddl2022-09-02 10:38:18.652 [INFO ] <DATA-Recover-1>: SYNC_SLICE, DISTRIBUTED TABLE: DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 10:38:18.652 [INFO ] <GCWare>: testdb.t1 nodeid: 1778515978, segid: 1, tableid 7 have dmlstorageevent,total num 1,lasteventid: 52022-09-02 10:38:18.652 [INFO ] <DATA-Recover-1>: SYNC_SLICE, Find it is a meta broken table testdb.t1,and the event will be given up,event id 52022-09-02 10:38:18.664 [INFO ] <DATA-Recover-1>: SET_ONLINE, Succeeded in setting local node table 'testdb.t1_n1' online, eventinfo(eventid:5,segId:1)2022-09-02 10:38:18.665 [INFO ] <CLockResource>: UnLock: testdb unlock success2022-09-02 10:38:18.666 [INFO ] <CLockResource>: UnLock: testdb.t1.rsync unlock success2022-09-02 10:38:18.668 [INFO ] <CLockResource>: UnLock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA unlock success2022-09-02 10:38:18.669 [INFO ] <CLockResource>: UnLock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B unlock success2022-09-02 10:38:18.670 [INFO ] <DATA-Recover-1>: End dml recover testdb.t1,tid 72022-09-02 10:38:18.670 [INFO ] <STORAGE-Recover-1>: Start DMLStorge recover testdb.t1,tid 7 eventnum 12022-09-02 10:38:18.670 [INFO ] <STORAGE-Recover-1>: Start to DMLStorge recover of eventid(5)2022-09-02 10:38:18.671 [INFO ] <STORAGE-Recover-1>: recoverinfo gn(dmls): eventid=5, tablename=testdb.t1, slice:n1 src:0x1795293194(10.0.2.106) dst:0x1778515978(10.0.2.106)2022-09-02 10:38:18.671 [INFO ] <STORAGE-Recover-1>: Revovering ..., for dmlstorage, forgcluster=false, dbname=testdb, tablename=t1, slice=n1, tid=7, eventid=52022-09-02 10:38:18.683 [INFO ] <CLockResource>: Lock: testdb start2022-09-02 10:38:18.684 [INFO ] <CLockResource>: Lock: testdb end success2022-09-02 10:38:18.684 [INFO ] <CLockResource>: Lock: testdb.t1.rsync start2022-09-02 10:38:18.686 [INFO ] <CLockResource>: Lock: testdb.t1.rsync end success2022-09-02 10:38:18.686 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA start2022-09-02 10:38:18.687 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA end success2022-09-02 10:38:18.687 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B start2022-09-02 10:38:18.688 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B end success2022-09-02 10:38:18.695 [INFO ] <STORAGE-Recover-1>: BackUpTable, meta broken table(t1_n1:7) has been deleted and backuped success2022-09-02 10:38:18.727 [INFO ] <STORAGE-Recover-1>: CreateFailNodeField, do "CREATE TABLE /*+ distribution(1)*/ `testdb`.`t1_n1`( "id" int(11) DEFAULT NULL, "name" varchar(100) DEFAULT NULL) ENGINE=EXPRESS TID(8) UID(1) DEFAULT CHARSET=utf8 TABLESPACE='sys_tablespace' COLUMN_IDS(0, 1) TID(8)" success2022-09-02 10:38:18.728 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGroup, do CreateFailNodeField success2022-09-02 10:38:18.728 [INFO ] <STORAGE-Recover-1>: DO_DATA_SYNC, Lock_for_ddl2022-09-02 10:38:18.728 [INFO ] <STORAGE-Recover-1>: SYNC_SLICE, DISTRIBUTED TABLE: DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 10:38:18.728 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Syncing table slice{DB: 'testdb', Table: 't1', Slice: 'n1'} from sample gnode: '10.0.2.107'2022-09-02 10:38:18.728 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync command line: 'start 'gc_sync_client' '10.0.2.107 testdb t1_n1 10.0.2.106 5258 3 1''2022-09-02 10:38:19.242 [INFO ] <STORAGE-Recover-1>: check drop sql thread quit.2022-09-02 10:38:19.242 [INFO ] <STORAGE-Recover-1>: start_sync_by_node, sync table{start 'gc_sync_client' '10.0.2.107 testdb t1_n1 10.0.2.106 5258 3 1'} returned with (0)2022-09-02 10:38:19.242 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync client executed successfully2022-09-02 10:38:19.242 [INFO ] <STORAGE-Recover-1>: REFRESH_TABLE, Refreshing local gnode, DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 10:38:19.242 [INFO ] <STORAGE-Recover-1>: REFRESH_TABLE, Refresh query: 'refresh table `testdb`.`t1_n1`;'2022-09-02 10:38:19.243 [INFO ] <STORAGE-Recover-1>: Sync cl2022-09-02 10:38:19.243 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync end, return 02022-09-02 10:38:19.243 [INFO ] <CLockResource>: Lock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F start2022-09-02 10:38:19.245 [INFO ] <CLockResource>: Lock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F end success2022-09-02 10:38:19.246 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT,Checking whether SYNC is succeeded{DB: 'testdb', Table: 't1', slice: 'n1'}2022-09-02 10:38:19.250 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, SCN of table slice of local GNode and sample GNode(number of working nodes: 2)2022-09-02 10:38:19.250 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, Local GNode: 312022-09-02 10:38:19.250 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, Sample GNode[1]: 312022-09-02 10:38:19.250 [INFO ] <CLockResource>: UnLock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F unlock success2022-09-02 10:38:19.251 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGroup, do CallSync success2022-09-02 10:38:19.252 [INFO ] <CLockResource>: UnLock: testdb unlock success2022-09-02 10:38:19.254 [INFO ] <CLockResource>: UnLock: testdb.t1.rsync unlock success2022-09-02 10:38:19.255 [INFO ] <CLockResource>: UnLock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA unlock success2022-09-02 10:38:19.257 [INFO ] <CLockResource>: UnLock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B unlock success2022-09-02 10:38:19.266 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGeoup success!2022-09-02 10:38:19.266 [INFO ] <STORAGE-Recover-1>: delete DMLStorge Redo Log start, eventId=5.............2022-09-02 10:38:19.275 [INFO ] <STORAGE-Recover-1>: delete DMLStorge Redo Log end, eventId=5.............2022-09-02 10:38:19.277 [INFO ] <RECOVER-INFO-1>: Finishing Recovering testdb.t1,tid 7

测试SCN不一致

故障模拟

直接对分片操作,不设置SCN值,会导致表的SCN数值不一样。我们插入一行数据。

[gbase@862b43 gcinstall_862.33.35]$ gncli testdbGBase client 8.6.2.33-R48.274e5f5e. Copyright (c) 2004-2022, GBase. All Rights Reserved.gbase> select * from t1_n1;+------+--------+| id | name |+------+--------+| 4 | 44444 || 6 | 66666 || 7 | 777777 || 4 | 4444 || 6 | 6666 |+------+--------+5 rows in set (Elapsed: 00:00:00.00)gbase> insert into t1_n1 values(4,4444444444444);Query OK, 1 row affected (Elapsed: 00:00:00.06)Records: 1 Duplicates: 0 Warnings: 0 Total: 6

测试加载

故障信息是:table_max_scn: 0 — 14, SCN号不同

[gbase@862b43 gcinstall_862.33.35]$ gccli testdbGBase client 8.6.2.33-R48.274e5f5e. Copyright (c) 2004-2022, GBase. All Rights Reserved.gbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 15 failed, [10.0.2.108:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) table_max_scn: 0 -- 14SQL: LOAD /*+ TID('110') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n2` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMIgbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 16 failed, [10.0.2.106:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) table_max_scn: 0 -- 14SQL: LOAD /*+ TID('112') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMIgbase>gbase> show variables like '%gcluster_suffix_consistency_resolve%';+-------------------------------------+-------+| Variable_name | Value |+-------------------------------------+-------+| gcluster_suffix_consistency_resolve | 0 |+-------------------------------------+-------+1 row in set (Elapsed: 00:00:00.01)gbase> set gcluster_suffix_consistency_resolve=1;Query OK, 0 rows affected (Elapsed: 00:00:00.01)gbase> show variables like '%gcluster_suffix_consistency_resolve%';+-------------------------------------+-------+| Variable_name | Value |+-------------------------------------+-------+| gcluster_suffix_consistency_resolve | 1 |+-------------------------------------+-------+1 row in set (Elapsed: 00:00:00.01)gbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';ERROR 1733 (HY000): (GBA-01EX-700) Gbase general error: Task 17 failed, [10.0.2.107:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) table_max_scn: 0 -- 14SQL: LOAD /*+ TID('118') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMIgbase>gbase> load data infile 'ftp://gbase:gbase1234@10.0.2.106/t1.txt' into table t1 fields terminated by ',';Query OK, 6 rows affected (Elapsed: 00:00:00.52)Task 18 finished, Loaded 6 records, Skipped 0 records

查看日志

连接节点 106 gcluster的express.log

2022-09-02 09:37:26.662 [SQLDISP][ERROR][S:57][Q:131]:Query failed, SessionCtx(0x5732800) HOST(10.0.2.107:5050->44) reason: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) table_max_scn: 0 -- 14 SQL(LOAD /*+ TID('118') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.107' CURRENT_TIMESTAMP 1662082646 SCN_NUMBER 17 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 17, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 02022-09-02 09:37:26.691 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:57][Q:131]:The same with: [10.0.2.107_n1]2022-09-02 09:37:26.691 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:57][Q:131]:host 10.0.2.106 table:testdb.t1_n1 judeged by compare scn.2022-09-02 09:37:26.704 [SUFFIX_INCONSISTENCY_JUDGER][WARN ][S:57][Q:131]:Set testdb.t1 suffix n1 StorageEvent successfully, event id is 22022-09-02 09:37:26.705 [LOAD][ERROR][S:57][Q:131]:Task 17 failed, [10.0.2.107:5050](GBA-02AD-0005)Failed to query in gnode:DETAIL: (GBA-01EX-700) Gbase general error: (gns_host: 10.0.2.106) source table and destination table are not same: src:( localhost:testdb.t1_n1 ) dst:( 10.0.2.107:testdb.t1_n1 ) table_max_scn: 0 -- 14SQL: LOAD /*+ TID('118') */ DATA INFILE 'ftp://gbase:*********@10.0.2.106/t1.txt#offset=0&length=42&firstblock&ffsize=42' INTO TABLE `testdb`.`t1_n1` DATA_FORMAT 3 FILE_FORMAT UNDEFINED FIELDS TERMINATED BY ',' HOST '10.0.2.107' CURRENT_TIMESTAMP 1662082646 SCN_NUMBER 17 GCLUSTER_PORT 5258 INTO SERVER (HOST '10.0.2.106, 10.0.2.107, 10.0.2.108', PORT '5050', USER 'root', DATABASE 'testdb', TABLE 't1_n1, t1_n2, t1_n3', COMMENT 'table_host 0 0 1 # 1 1 2 # 2 2 0, scn 17, group 0 ,distribution 1, hash_map 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 ... 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 02022-09-02 09:37:26.830 [SQLDISP][ERROR][S:57][Q:131]:Query failed, SessionCtx(0x5732800) HOST(10.0.2.106:5050->29) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(17/cur:0) SQL(revert table "testdb"."t1_n1" scn_number 17)2022-09-02 09:37:26.840 [SQLDISP][ERROR][S:57][Q:131]:Query failed, SessionCtx(0x5732800) HOST(10.0.2.107:5050->32) reason: Can't revert table: Table testdb.t1_n1 can not do revert with scn(17/cur:14) SQL(revert table "testdb"."t1_n1" scn_number 17)2022-09-02 09:37:26.840 [SQLDISP][ERROR][S:57][Q:131]:Query failed, SessionCtx(0x5732800) HOST(10.0.2.108:5050->30) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(17/cur:14) SQL(revert table "testdb"."t1_n2" scn_number 17)2022-09-02 09:37:26.840 [SQLDISP][ERROR][S:57][Q:131]:Query failed, SessionCtx(0x5732800) HOST(10.0.2.107:5050->33) reason: Can't revert table: Table testdb.t1_n2 can not do revert with scn(17/cur:14) SQL(revert table "testdb"."t1_n2" scn_number 17)2022-09-02 09:37:26.840 [SQLDISP][ERROR][S:57][Q:131]:Query failed, SessionCtx(0x5732800) HOST(10.0.2.106:5050->30) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(17/cur:14) SQL(revert table "testdb"."t1_n3" scn_number 17)2022-09-02 09:37:26.840 [SQLDISP][ERROR][S:57][Q:131]:Query failed, SessionCtx(0x5732800) HOST(10.0.2.108:5050->31) reason: Can't revert table: Table testdb.t1_n3 can not do revert with scn(17/cur:14) SQL(revert table "testdb"."t1_n3" scn_number 17)2022-09-02 09:37:26.840 [COMMIT][WARN ][S:57][Q:131]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 09:37:26.840 [COMMIT][WARN ][S:57][Q:131]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 09:37:26.840 [COMMIT][WARN ][S:57][Q:131]:revert in host:10.0.2.107 for table:t1 success.2022-09-02 09:37:26.840 [COMMIT][WARN ][S:57][Q:131]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 09:37:26.840 [COMMIT][WARN ][S:57][Q:131]:revert in host:10.0.2.108 for table:t1 success.2022-09-02 09:37:26.840 [COMMIT][WARN ][S:57][Q:131]:revert in host:10.0.2.106 for table:t1 success.2022-09-02 09:37:26.853 [COMMIT][INFO ][S:57][Q:131]:rollback table(testdb.t1) success for scn:17.2022-09-02 09:37:30.167 [LOAD][WARN ][S:57][Q:132]:slice(n1) on node(10.0.2.106) is invalid.2022-09-02 09:37:30.464 [DEF ] [S:57][Q:132][load]Node(10.0.2.106)testdb.t1[n1][scn:18]'s data state fails2022-09-02 09:37:30.464 [DEF ] [S:57][Q:132]Set data(testdb.t1:1) to LOCKED2022-09-02 09:37:30.486 [DEF ] [S:57][Q:132]Set data(testdb.t1:normal) to LOCKED, event id is 22022-09-02 09:37:30.655 [DEF ] [S:57][Q:132][load]Node(10.0.2.106)testdb.t1[n1][scn:18]'s data state fails2022-09-02 09:37:30.655 [DEF ] [S:57][Q:132]Set data(testdb.t1:1) to LOCKED2022-09-02 09:37:30.669 [DEF ] [S:57][Q:132]Set data(testdb.t1:normal) to LOCKED, event id is 2

gc_recovyer.log

2022-09-02 09:37:46.250 [INFO ] <RECOVER-INFO-1>: Start Recovering testdb.t1 tid 72022-09-02 09:37:46.250 [INFO ] <DATA-Recover-1>: Start dml recover testdb.t1,tid 72022-09-02 09:37:46.250 [INFO ] <DATA-Recover-1>: Start to dml recover of eventid(2)2022-09-02 09:37:46.260 [INFO ] <GCWare>: GetSourceNodeForDMl for nodeId(1778515978), the node is unValid2022-09-02 09:37:46.262 [INFO ] <DATA-Recover-1>: DoNormSqlNoResult, do(flush rollback `testdb`.`t1_n1`)2022-09-02 09:37:46.264 [INFO ] <DATA-Recover-1>: recoverinfo gn dml: eventid:2 DB(testdb), TABLE(t1), SLICE(1), TID(7), src:0x1795293194(10.0.2.107) dst:0x1778515978(10.0.2.106)2022-09-02 09:37:46.265 [INFO ] <CLockResource>: Lock: testdb start2022-09-02 09:37:46.269 [INFO ] <CLockResource>: Lock: testdb end success2022-09-02 09:37:46.269 [INFO ] <CLockResource>: Lock: testdb.t1.rsync start2022-09-02 09:37:46.271 [INFO ] <CLockResource>: Lock: testdb.t1.rsync end success2022-09-02 09:37:46.271 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA start2022-09-02 09:37:46.273 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA end success2022-09-02 09:37:46.273 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B start2022-09-02 09:37:46.275 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B end success2022-09-02 09:37:46.276 [INFO ] <DATA-Recover-1>: DO_DATA_SYNC, Lock_for_ddl2022-09-02 09:37:46.276 [INFO ] <DATA-Recover-1>: SYNC_SLICE, DISTRIBUTED TABLE: DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 09:37:46.276 [INFO ] <GCWare>: testdb.t1 nodeid: 1778515978, segid: 1, tableid 7 have dmlstorageevent,total num 1,lasteventid: 22022-09-02 09:37:46.276 [INFO ] <DATA-Recover-1>: SYNC_SLICE, Find it is a meta broken table testdb.t1,and the event will be given up,event id 22022-09-02 09:37:46.290 [INFO ] <DATA-Recover-1>: SET_ONLINE, Succeeded in setting local node table 'testdb.t1_n1' online, eventinfo(eventid:2,segId:1)2022-09-02 09:37:46.291 [INFO ] <CLockResource>: UnLock: testdb unlock success2022-09-02 09:37:46.292 [INFO ] <CLockResource>: UnLock: testdb.t1.rsync unlock success2022-09-02 09:37:46.294 [INFO ] <CLockResource>: UnLock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA unlock success2022-09-02 09:37:46.295 [INFO ] <CLockResource>: UnLock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B unlock success2022-09-02 09:37:46.296 [INFO ] <DATA-Recover-1>: End dml recover testdb.t1,tid 72022-09-02 09:37:46.296 [INFO ] <STORAGE-Recover-1>: Start DMLStorge recover testdb.t1,tid 7 eventnum 12022-09-02 09:37:46.297 [INFO ] <STORAGE-Recover-1>: Start to DMLStorge recover of eventid(2)2022-09-02 09:37:46.297 [INFO ] <STORAGE-Recover-1>: recoverinfo gn(dmls): eventid=2, tablename=testdb.t1, slice:n1 src:0x1795293194(10.0.2.106) dst:0x1778515978(10.0.2.106)2022-09-02 09:37:46.297 [INFO ] <STORAGE-Recover-1>: Revovering ..., for dmlstorage, forgcluster=false, dbname=testdb, tablename=t1, slice=n1, tid=7, eventid=22022-09-02 09:37:46.309 [INFO ] <CLockResource>: Lock: testdb start2022-09-02 09:37:46.310 [INFO ] <CLockResource>: Lock: testdb end success2022-09-02 09:37:46.310 [INFO ] <CLockResource>: Lock: testdb.t1.rsync start2022-09-02 09:37:46.312 [INFO ] <CLockResource>: Lock: testdb.t1.rsync end success2022-09-02 09:37:46.312 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA start2022-09-02 09:37:46.313 [INFO ] <CLockResource>: Lock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA end success2022-09-02 09:37:46.313 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B start2022-09-02 09:37:46.315 [INFO ] <CLockResource>: Lock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B end success2022-09-02 09:37:46.322 [INFO ] <STORAGE-Recover-1>: BackUpTable, meta broken table(t1_n1:7) has been deleted and backuped success2022-09-02 09:37:46.355 [INFO ] <STORAGE-Recover-1>: CreateFailNodeField, do "CREATE TABLE /*+ distribution(1)*/ `testdb`.`t1_n1`( "id" int(11) DEFAULT NULL, "name" varchar(100) DEFAULT NULL) ENGINE=EXPRESS TID(8) UID(1) DEFAULT CHARSET=utf8 TABLESPACE='sys_tablespace' COLUMN_IDS(0, 1) TID(8)" success2022-09-02 09:37:46.355 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGroup, do CreateFailNodeField success2022-09-02 09:37:46.355 [INFO ] <STORAGE-Recover-1>: DO_DATA_SYNC, Lock_for_ddl2022-09-02 09:37:46.355 [INFO ] <STORAGE-Recover-1>: SYNC_SLICE, DISTRIBUTED TABLE: DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 09:37:46.355 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Syncing table slice{DB: 'testdb', Table: 't1', Slice: 'n1'} from sample gnode: '10.0.2.107'2022-09-02 09:37:46.355 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync command line: 'start 'gc_sync_client' '10.0.2.107 testdb t1_n1 10.0.2.106 5258 3 1''2022-09-02 09:37:46.934 [INFO ] <STORAGE-Recover-1>: check drop sql thread quit.2022-09-02 09:37:46.935 [INFO ] <STORAGE-Recover-1>: start_sync_by_node, sync table{start 'gc_sync_client' '10.0.2.107 testdb t1_n1 10.0.2.106 5258 3 1'} returned with (0)2022-09-02 09:37:46.935 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync client executed successfully2022-09-02 09:37:46.935 [INFO ] <STORAGE-Recover-1>: REFRESH_TABLE, Refreshing local gnode, DB: 'testdb', Table: 't1', slice: 'n1'2022-09-02 09:37:46.935 [INFO ] <STORAGE-Recover-1>: REFRESH_TABLE, Refresh query: 'refresh table `testdb`.`t1_n1`;'2022-09-02 09:37:46.935 [INFO ] <STORAGE-Recover-1>: Sync cl2022-09-02 09:37:46.935 [INFO ] <STORAGE-Recover-1>: CALL_SYNC, Sync end, return 02022-09-02 09:37:46.935 [INFO ] <CLockResource>: Lock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F start2022-09-02 09:37:46.937 [INFO ] <CLockResource>: Lock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F end success2022-09-02 09:37:46.937 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT,Checking whether SYNC is succeeded{DB: 'testdb', Table: 't1', slice: 'n1'}2022-09-02 09:37:46.942 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, SCN of table slice of local GNode and sample GNode(number of working nodes: 2)2022-09-02 09:37:46.942 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, Local GNode: 182022-09-02 09:37:46.942 [INFO ] <STORAGE-Recover-1>: CHECK_SYNC_RESULT, Sample GNode[1]: 182022-09-02 09:37:46.942 [INFO ] <CLockResource>: UnLock: testdb.t1.09B5BEEC-1EF7-4FA6-9850-C4217A781E0F unlock success2022-09-02 09:37:46.943 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGroup, do CallSync success2022-09-02 09:37:46.944 [INFO ] <CLockResource>: UnLock: testdb unlock success2022-09-02 09:37:46.945 [INFO ] <CLockResource>: UnLock: testdb.t1.rsync unlock success2022-09-02 09:37:46.947 [INFO ] <CLockResource>: UnLock: testdb.t1824A6746-7C28-4B39-B1E6-D6E088F858AA unlock success2022-09-02 09:37:46.948 [INFO ] <CLockResource>: UnLock: testdb.t1D88707C8-BB5F-41D8-AAF9-55A76780953B unlock success2022-09-02 09:37:46.957 [INFO ] <STORAGE-Recover-1>: CreateTableInfoFromSafeGeoup success!2022-09-02 09:37:46.957 [INFO ] <STORAGE-Recover-1>: delete DMLStorge Redo Log start, eventId=2.............2022-09-02 09:37:46.966 [INFO ] <STORAGE-Recover-1>: delete DMLStorge Redo Log end, eventId=2.............2022-09-02 09:37:46.967 [INFO ] <RECOVER-INFO-1>: Finishing Recovering testdb.t1,tid 7
GBase 8a 对主副本不一致时的处理方案测试 - 墨天轮 (2024)
Top Articles
Latest Posts
Recommended Articles
Article information

Author: Patricia Veum II

Last Updated:

Views: 5901

Rating: 4.3 / 5 (44 voted)

Reviews: 83% of readers found this page helpful

Author information

Name: Patricia Veum II

Birthday: 1994-12-16

Address: 2064 Little Summit, Goldieton, MS 97651-0862

Phone: +6873952696715

Job: Principal Officer

Hobby: Rafting, Cabaret, Candle making, Jigsaw puzzles, Inline skating, Magic, Graffiti

Introduction: My name is Patricia Veum II, I am a vast, combative, smiling, famous, inexpensive, zealous, sparkling person who loves writing and wants to share my knowledge and understanding with you.