Environment:
MySQL 5.7, Elasticsearch 7.4.2, canal.deployer-1.1.5
The goal here is to use canal to sync data modified in MySQL into Elasticsearch.
1. MySQL configuration
1.1 Edit the MySQL configuration file
[root@localhost local]# vim /etc/my.cnf
[root@localhost local]# systemctl restart mysqld
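my.cnf (added lines; see also the official MySQL documentation):
# enable the binary log
log_bin = mysql-bin
# set the server id
server_id = 1
# log row-level changes (which rows changed and what they changed to) instead of the context of each SQL statement
binlog_format = ROW
After editing the configuration file, MySQL must be restarted. If it fails to start, check the log with the following command: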
cat /var/log/mysqld.log
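Before restarting MySQL it can help to confirm that the three settings above are really present in the file. The following is a minimal sketch, not part of the original setup: `BinlogConfigCheck` and its methods are hypothetical names, and it only inspects the text of my.cnf, not the running server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: checks my.cnf text for the binlog settings used in this article. */
final class BinlogConfigCheck {

    /** Parses "key = value" lines, skipping blank lines, comments and [section] headers. */
    static Map<String, String> parse(String myCnf) {
        Map<String, String> settings = new HashMap<>();
        for (String raw : myCnf.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#") || line.startsWith(";") || line.startsWith("[")) {
                continue;
            }
            int eq = line.indexOf('=');
            String key = (eq < 0 ? line : line.substring(0, eq)).trim();
            String value = eq < 0 ? "" : line.substring(eq + 1).trim();
            // MySQL treats '-' and '_' in option names as interchangeable
            settings.put(key.replace('-', '_').toLowerCase(Locale.ROOT), value);
        }
        return settings;
    }

    /** Returns a list of problems; an empty list means all three settings were found. */
    static List<String> problems(String myCnf) {
        Map<String, String> settings = parse(myCnf);
        List<String> problems = new ArrayList<>();
        if (!settings.containsKey("log_bin")) {
            problems.add("log_bin is not set");
        }
        if (!settings.containsKey("server_id")) {
            problems.add("server_id is not set");
        }
        if (!"ROW".equalsIgnoreCase(settings.get("binlog_format"))) {
            problems.add("binlog_format is not ROW");
        }
        return problems;
    }
}
```

Calling `BinlogConfigCheck.problems(...)` with the contents of /etc/my.cnf returns an empty list when the file matches the settings shown above. The authoritative check is still the `show variables` query in the next step.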
1.2 Check that log_bin is enabled:
[root@localhost local]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.30-log MySQL Community Server (GPL)
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show variables like '%log_bin%';
+---------------------------------+--------------------------------+
| Variable_name                   | Value                          |
+---------------------------------+--------------------------------+
| log_bin                         | ON                             |
| log_bin_basename                | /var/lib/mysql/mysql-bin       |
| log_bin_index                   | /var/lib/mysql/mysql-bin.index |
| log_bin_trust_function_creators | OFF                            |
| log_bin_use_v1_row_events       | OFF                            |
| sql_log_bin                     | ON                             |
+---------------------------------+--------------------------------+
6 rows in set (0.03 sec)
1.3 Create the canal account and grant it privileges
mysql> grant select,replication slave,replication client on *.* to 'canal'@'%' identified by 'canal';
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
2. Canal server configuration
[root@localhost canal]# ll
total 8
drwxr-xr-x. 2 root root 76 May 23 20:28 bin
drwxr-xr-x. 5 root root 123 May 23 20:28 conf
drwxr-xr-x. 2 root root 4096 May 23 20:28 lib
drwxrwxrwx. 2 root root 6 Oct 9 2019 logs
[root@localhost canal]# cd conf/
[root@localhost conf]# ll
total 16
-rwxrwxrwx. 1 root root 291 Aug 31 2019 canal_local.properties
-rwxrwxrwx. 1 root root 5259 Sep 30 2019 canal.properties
drwxrwxrwx. 2 root root 33 May 23 20:28 example
-rwxrwxrwx. 1 root root 3262 Sep 16 2019 logback.xml
drwxrwxrwx. 2 root root 39 May 23 20:28 metrics
drwxrwxrwx. 3 root root 149 May 23 20:28 spring
[root@localhost conf]# cd example/
[root@localhost example]# ll
total 4
-rwxrwxrwx. 1 root root 2036 Sep 30 2019 instance.properties
[root@localhost example]# vi instance.properties
[root@localhost example]# cd ..
[root@localhost conf]# cd ..
[root@localhost canal]# cd bin
[root@localhost bin]# ls
restart.sh startup.bat startup.sh stop.sh
[root@localhost bin]# ./startup.sh
... (output omitted)
cd to /usr/local/canal/bin for continue
# The commands below check whether canal started successfully (any one of them is enough)
[root@localhost bin]# ps -ef | grep canal
... (output omitted)
[root@localhost bin]# netstat -an | grep 11111
tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
... (output omitted)
[root@localhost canal]# cd logs/
[root@localhost logs]# ls
canal example
[root@localhost logs]# cd example/
[root@localhost example]# ll
total 192
-rw-r--r--. 1 root root 90509 May 23 20:41 example.log
[root@localhost example]# tail -f example.log
... (output omitted)
2020-05-23 20:41:39.517 [main] INFO c.a.otter.canal.instance.core.AbstractCanalInstance - start successful....
2020-05-23 20:41:39.751 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2020-05-23 20:41:39.751 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position just show master status
[root@localhost canal]# cat /usr/local/canal/logs/canal/canal.log
2020-05-23 20:41:38.118 [main] INFO com.alibaba.otter.canal.deployer.CanalController - ## start the canal server[172.17.0.1(172.17.0.1):11111]
2020-05-23 20:41:39.646 [main] INFO com.alibaba.otter.canal.deployer.CanalStarter - ## the canal server is running now ......
2020-05-23 20:48:44.964 [canal-instance-scan-0] INFO com.alibaba.otter.canal.deployer.CanalController - auto notify stop example successful.
2020-05-23 20:48:45.957 [canal-instance-scan-0] INFO com.alibaba.otter.canal.deployer.CanalController - auto notify start example successful.
2020-05-23 20:48:45.957 [canal-instance-scan-0] INFO com.alibaba.otter.canal.deployer.CanalController - auto notify reload example successful.
instance.properties (excerpt):
#################################################
## mysql serverId , v1.0.26+ will autoGen
canal.instance.mysql.slaveId=2
# username/password
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
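Canal connects to MySQL as if it were a replica, so the `canal.instance.mysql.slaveId` above (2) must not collide with the MySQL `server_id` set earlier (1). A minimal sketch of that check follows; `InstanceConfigCheck` is a hypothetical helper that only reads the properties text. An unset slaveId is treated as "no conflict", since the comment in the excerpt says newer canal versions generate one automatically.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: compares canal's slaveId with the MySQL server_id. */
final class InstanceConfigCheck {

    /** Returns true only when slaveId is set and equal to the MySQL server_id. */
    static boolean slaveIdConflicts(String instanceProperties, long mysqlServerId) {
        Properties props = new Properties();
        try {
            // lines starting with '#' are treated as comments by Properties.load
            props.load(new StringReader(instanceProperties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String slaveId = props.getProperty("canal.instance.mysql.slaveId");
        if (slaveId == null || slaveId.trim().isEmpty()) {
            return false; // unset: nothing to compare against
        }
        return Long.parseLong(slaveId.trim()) == mysqlServerId;
    }
}
```

With the values used in this article, `InstanceConfigCheck.slaveIdConflicts(text, 1)` returns `false`, because the slaveId is 2 and the server_id is 1.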
3. Canal Java client
3.1 pom.xml
<dependency>
    <groupId>com.baomidou</groupId>
    <artifactId>mybatis-plus-boot-starter</artifactId>
    <version>3.3.1</version>
</dependency>
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
</dependency>
<dependency>
    <groupId>com.alibaba.otter</groupId>
    <artifactId>canal.client</artifactId>
    <version>1.1.4</version>
</dependency>
<dependency>
    <groupId>com.alibaba.otter</groupId>
    <artifactId>canal.common</artifactId>
    <version>1.1.4</version>
</dependency>
<dependency>
    <groupId>com.alibaba.otter</groupId>
    <artifactId>canal.protocol</artifactId>
    <version>1.1.4</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.2</version>
</dependency>
3.2 Configuring the canal Java client's connection to the canal server
package com.lucifer.dianping.canal;

import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.client.CanalConnectors;
import com.google.common.collect.Lists;
import org.springframework.beans.factory.DisposableBean;
import org.springframework.context.annotation.Bean;
import org.springframework.stereotype.Component;

import java.net.InetSocketAddress;

/**
 * author: lucifer
 * date: 2020/5/23 21:16
 * description: configuration for connecting the canal client to the canal server
 */
@Component
public class CanalClient implements DisposableBean {

    private CanalConnector canalConnector;

    @Bean
    public CanalConnector getCanalConnector() {
        canalConnector = CanalConnectors.newClusterConnector(Lists.newArrayList(
                new InetSocketAddress("192.168.24.133", 11111)),
                "example", "canal", "canal"
        );
        canalConnector.connect();
        // subscribe() without arguments uses the filter configured on the canal server;
        // pass a filter in the form {database}.{table} to override it
        canalConnector.subscribe();
        // roll back to the position where the previous run was interrupted
        canalConnector.rollback();
        return canalConnector;
    }

    /**
     * Disconnect the canal client when the Spring container is destroyed,
     * so that the canal connection does not leak.
     *
     * @throws Exception
     */
    @Override
    public void destroy() throws Exception {
        if (canalConnector != null) {
            canalConnector.disconnect();
        }
    }
}
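The `{database}.{table}` filter mentioned in the comment above is, per the canal documentation, a comma-separated list of regular expressions matched against the full `schema.table` name. The sketch below only approximates that behaviour with `java.util.regex` to show which tables a given filter string would select. `TableFilterSketch` is a hypothetical name and this is not canal's own implementation.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of how a canal-style table filter selects tables. */
final class TableFilterSketch {

    /** Returns true if any comma-separated regex matches the whole "schema.table" name. */
    static boolean matches(String filter, String schema, String table) {
        String fullName = schema + "." + table;
        for (String regex : filter.split(",")) {
            // Pattern.matches requires the regex to match the entire name
            if (Pattern.matches(regex.trim(), fullName)) {
                return true;
            }
        }
        return false;
    }
}
```

Under that assumption, a call such as `canalConnector.subscribe("dianping\\.user")` would restrict the client to the `user` table of the `dianping` database, while `"dianping\\..*"` would cover every table in that database.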
Connection succeeded, as shown in the screenshot:
application.yml:
server:
  port: 8010
spring:
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://192.168.24.133:3306/dianping?autoReconnect=true&useUnicode=true&createDatabaseIfNotExist=true&characterEncoding=utf8&serverTimezone=UTC
    username: root
    password: 123456
    type: com.alibaba.druid.pool.DruidDataSource
  elasticsearch:
    rest:
      uris: 192.168.24.133:9200
4. Integrating with Elasticsearch
4.1 Create the ES index:
# create the ES index
PUT user
# query
GET /user/_search
{
  "query": {
    "match_all": {}
  }
}
4.2 Insert MySQL data into ES through canal
package com.lucifer.dianping.canal;
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import com.alibaba.fastjson.serializer.SerializerFeature;
import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.protocol.CanalEntry;
import com.alibaba.otter.canal.protocol.Message;
import com.baomidou.mybatisplus.core.conditions.query.QueryWrapper;
import com.google.protobuf.InvalidProtocolBufferException;
import com.lucifer.dianping.mapper.UserMapper;
import com.lucifer.dianping.pojo.User;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.BeansException;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationContextAware;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import javax.annotation.Resource;
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/**
 * author: lucifer
 * date: 2020/5/23 21:44
 * description: polls canal for row changes and indexes them into Elasticsearch
 */
@Slf4j
@Component
public class CanalScheduling implements Runnable, ApplicationContextAware {

    private ApplicationContext applicationContext;

    @Resource
    private UserMapper userMapper;

    @Resource
    private RestHighLevelClient restHighLevelClient;

    @Resource
    private CanalConnector canalConnector;

    // fixedDelay is in milliseconds: run again 100 ms after the previous run finishes
    @Scheduled(fixedDelay = 100)
    @Override
    public void run() {
        long batchId = -1;
        try {
            // maximum number of entries to pull per batch
            int batchSize = 1000;
            Message message = canalConnector.getWithoutAck(batchSize);
            // batch id
            batchId = message.getId();
            List<CanalEntry.Entry> entries = message.getEntries();
            if (batchId != -1 && entries.size() > 0) {
                entries.forEach(entry -> {
                    // my.cnf sets binlog_format = ROW, so only ROWDATA entries are parsed here
                    if (entry.getEntryType() == CanalEntry.EntryType.ROWDATA) {
                        // parse and handle the entry
                        publishCanalEvent(entry);
                    }
                });
            }
            canalConnector.ack(batchId);
        } catch (Exception e) {
            log.error("failed to process canal batch {}, rolling back", batchId, e);
            canalConnector.rollback(batchId);
        }
    }
    private void publishCanalEvent(CanalEntry.Entry entry) {
        // CanalEntry.EntryType entryType = entry.getEntryType();
        // table name
        String tableName = entry.getHeader().getTableName();
        // database name
        String database = entry.getHeader().getSchemaName();
        CanalEntry.RowChange rowChange;
        try {
            rowChange = CanalEntry.RowChange.parseFrom(entry.getStoreValue());
        } catch (InvalidProtocolBufferException e) {
            log.error("failed to parse the row change", e);
            return;
        }
        rowChange.getRowDatasList().forEach(rowData -> {
            // the data before the change is also available here
            List<CanalEntry.Column> beforeColumnsList = rowData.getBeforeColumnsList();
            beforeColumnsList.forEach(column -> {
                log.info("data before the change: name:{},value:{}", column.getName(), column.getValue());
            });
            // the data after the change
            List<CanalEntry.Column> afterColumnsList = rowData.getAfterColumnsList();
            if (afterColumnsList.isEmpty()) {
                // DELETE events carry no "after" image, so there is no id to look up and index
                return;
            }
            /* String primaryKey = "id";
            CanalEntry.Column idColumn = afterColumnsList.stream().filter(column ->
                    column.getIsKey() && primaryKey.equals(column.getName())).findFirst().orElse(null);*/
            Map<String, Object> columnsToMap = parseColumnsToMap(afterColumnsList);
            try {
                // index into ES
                indexES(columnsToMap, database, tableName);
            } catch (IOException e) {
                log.error("failed to index the row into ES", e);
            }
        });
    }
    Map<String, Object> parseColumnsToMap(List<CanalEntry.Column> columns) {
        Map<String, Object> map = new HashMap<>();
        columns.forEach(column -> {
            if (column == null) {
                return;
            }
            log.info("data after the change: name:{},value:{}", column.getName(), column.getValue());
            map.put(column.getName(), column.getValue());
        });
        return map;
    }
    /**
     * Notes:
     * Problem 1: exception: java.lang.IllegalArgumentException: The number of object passed must be even but was [1]
     * This happens with the following code:
     * User user = userMapper.selectById(new Integer((String) dataMap.get("id")));
     * .....
     * indexRequest.source(user);
     * indexRequest.source(Object...) expects alternating field names and values, so a single entity object is rejected.
     * I therefore pass a map instead: indexRequest.source(map).
     * <p>
     * Problem 2: exception: cannot write xcontent for unknown value of type class java.sql.Timestamp
     * QueryWrapper<User> queryWrapper = new QueryWrapper<>();
     * queryWrapper.ge("id", new Integer((String) dataMap.get("id")));
     * List<Map<String, Object>> maps = userMapper.selectMaps(queryWrapper);
     * for (Map<String, Object> map : maps) {
     * IndexRequest indexRequest = new IndexRequest();
     * indexRequest.id(String.valueOf(map.get("id")));
     * indexRequest.source(map);
     * restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
     * }
     * In the User class the time fields are of type Date.
     * After this call: List<Map<String, Object>> maps = userMapper.selectMaps(queryWrapper);
     * the updateAt and createAt values in maps are java.sql.Timestamp instead of the Date type declared in the entity class.
     * ES 7.4.2 cannot serialize Timestamp values, so I changed the approach; the working version is in the code below.
     * <p>
     * Problem 3:
     * exception: Found interface org.elasticsearch.common.bytes.BytesReference, but class was expected
     * Console output:
     * java.lang.IncompatibleClassChangeError: Found interface org.elasticsearch.common.bytes.BytesReference, but class was expected
     * at org.elasticsearch.client.RequestConverters.index(RequestConverters.java:340) ~[elasticsearch-rest-high-level-client-7.4.2.jar:7.6.2]
     * at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1450) ~[elasticsearch-rest-high-level-client-7.4.2.jar:7.6.2]
     * at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1424) ~[elasticsearch-rest-high-level-client-7.4.2.jar:7.6.2]
     * Cause (my own guess): probably a version mismatch.
     * Spring Boot 2.3.0.RELEASE manages ES dependencies at version 7.6.2, while my elasticsearch-rest-high-level-client is 7.4.2, matching the ES version I installed.
     * So I override the ES version that Spring Boot 2.3.0.RELEASE provides by default, adding the following to pom.xml:
     * <properties>
     * <elasticsearch.version>7.4.2</elasticsearch.version>
     * </properties>
     * That resolves the problem.
     */
    private void indexES(Map<String, Object> dataMap, String database, String table) throws IOException {
        log.info("dataMap:{},database:{},table:{}", dataMap, database, table);
        // ignore anything outside the "dianping" database
        if (!StringUtils.equals("dianping", database)) {
            return;
        }
        // ignore anything outside the user table
        if (!StringUtils.equals("user", table)) {
            return;
        }
        // look the row up by id with mybatis-plus, then convert it to a map
        User user = userMapper.selectById(Integer.valueOf((String) dataMap.get("id")));
        if (user == null) {
            // the row no longer exists in MySQL, so there is nothing to index
            return;
        }
        Map<String, Object> map = JSON.parseObject(JSON.toJSONString(user, SerializerFeature.WriteNullStringAsEmpty,
                SerializerFeature.WriteNullNumberAsZero, SerializerFeature.WriteMapNullValue), Map.class);
        IndexRequest indexRequest = new IndexRequest("user");
        indexRequest.id(String.valueOf(map.get("id")));
        indexRequest.source(map);
        restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
    }

    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        this.applicationContext = applicationContext;
    }
}
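The `run()` method above follows a fetch, process, then ack-or-rollback pattern: a batch that fails is rolled back and delivered again on the next poll. The sketch below reproduces that pattern with an in-memory stand-in so the redelivery can be observed without a canal server. `BatchSource`, `Batch`, `InMemorySource` and `AckRollbackSketch` are hypothetical names that only mirror the three `CanalConnector` calls used above, and the stand-in assumes a single outstanding batch at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for the three CanalConnector calls used in run(). */
interface BatchSource {
    Batch getWithoutAck(int batchSize);
    void ack(long batchId);
    void rollback(long batchId);
}

final class Batch {
    final long id;
    final List<String> entries;

    Batch(long id, List<String> entries) {
        this.id = id;
        this.entries = entries;
    }
}

/** In-memory source: hands out entries in order and replays them after a rollback. */
final class InMemorySource implements BatchSource {
    private final List<String> log;
    private int acked = 0;   // entries before this index have been acknowledged
    private int fetched = 0; // entries before this index have been handed out
    private long nextBatchId = 1;

    InMemorySource(List<String> log) {
        this.log = new ArrayList<>(log);
    }

    @Override
    public Batch getWithoutAck(int batchSize) {
        if (fetched >= log.size()) {
            return new Batch(-1, new ArrayList<>()); // -1 signals "nothing to read"
        }
        int end = Math.min(fetched + batchSize, log.size());
        Batch batch = new Batch(nextBatchId++, new ArrayList<>(log.subList(fetched, end)));
        fetched = end;
        return batch;
    }

    @Override
    public void ack(long batchId) {
        acked = fetched;
    }

    @Override
    public void rollback(long batchId) {
        fetched = acked;
    }
}

final class AckRollbackSketch {
    /** Processes one batch; returns true if it was acknowledged. */
    static boolean pollOnce(BatchSource source, int batchSize, Consumer<String> handler) {
        Batch batch = source.getWithoutAck(batchSize);
        if (batch.id == -1 || batch.entries.isEmpty()) {
            return false;
        }
        try {
            batch.entries.forEach(handler);
            source.ack(batch.id);
            return true;
        } catch (RuntimeException e) {
            // the whole batch is redelivered, including entries that already succeeded
            source.rollback(batch.id);
            return false;
        }
    }
}
```

Because a rolled-back batch is replayed in full, entries processed before the failure are handled twice. Indexing each document under its primary key, as `indexES` does, keeps such replays harmless, since the second write simply overwrites the first.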
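Screenshot for problem 2:
Screenshot for problem 3:
This shows that Spring Boot 2.3.0.RELEASE manages the ES dependencies at version 7.6.2:
Data in the MySQL user table:
After a modification:
Console output: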
2020-05-24 00:15:16.168 INFO 21540 --- [ scheduling-1] c.l.dianping.canal.CanalScheduling : dataMap:{password=123, gender=1, telphone=1234567, nick_name=Lucifer, update_at=2020-05-23 21:02:08, id=1, create_at=2020-05-24 00:15:17},database:dianping,table:user
2020-05-24 00:15:16.194 DEBUG 21540 --- [ scheduling-1] c.l.d.mapper.UserMapper.selectById : ==> Preparing: SELECT id,create_at,update_at,telphone,password,nick_name,gender FROM user WHERE id=?
2020-05-24 00:15:16.195 DEBUG 21540 --- [ scheduling-1] c.l.d.mapper.UserMapper.selectById : ==> Parameters: 1(Integer)
2020-05-24 00:15:16.199 DEBUG 21540 --- [ scheduling-1] c.l.d.mapper.UserMapper.selectById : <== Total: 1
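The `dataMap` in the log above carries `create_at` and `update_at` as plain strings, whereas the `selectMaps` route described in problem 2 yields `java.sql.Timestamp` values that the ES client refuses to serialize. One alternative to re-reading the row as an entity would be to convert such values to strings before indexing. This is a minimal sketch under that assumption: `TimestampNormalizer` is a hypothetical helper, and the `yyyy-MM-dd HH:mm:ss` pattern is an assumption that would have to match the date format of the index mapping.

```java
import java.sql.Timestamp;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: replaces java.sql.Timestamp values with formatted strings. */
final class TimestampNormalizer {

    // assumed format; it must agree with the date format of the target index
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Returns a copy of the row in which every Timestamp value is a formatted string. */
    static Map<String, Object> normalize(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        row.forEach((key, value) -> {
            if (value instanceof Timestamp) {
                out.put(key, ((Timestamp) value).toLocalDateTime().format(FORMAT));
            } else {
                out.put(key, value);
            }
        });
        return out;
    }
}
```

The result of `TimestampNormalizer.normalize(map)` contains only values the client already handles, provided the remaining values are strings, numbers or booleans, so it could be passed to `indexRequest.source(...)` in place of the raw `selectMaps` row.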
After modifying it once more:
ES query (viewed here through Kibana):
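The client in this article only writes documents. A row deleted in MySQL would need its document removed from ES instead, for example with the high-level client's `DeleteRequest`. The sketch below shows one way the decision could be made. `SyncActionSketch` and its enums are hypothetical and mirror only a subset of `CanalEntry.EventType`. It relies on the fact that a DELETE event carries the row in its "before" columns, while INSERT and UPDATE events carry it in their "after" columns.

```java
import java.util.Map;

/** Hypothetical illustration: choose an ES action from a canal-style event type. */
final class SyncActionSketch {

    /** Mirrors a subset of CanalEntry.EventType; OTHER stands for DDL and the rest. */
    enum EventType { INSERT, UPDATE, DELETE, OTHER }

    enum Action { INDEX, DELETE, SKIP }

    static Action actionFor(EventType type) {
        switch (type) {
            case INSERT:
            case UPDATE:
                return Action.INDEX;
            case DELETE:
                return Action.DELETE;
            default:
                return Action.SKIP;
        }
    }

    /** Picks the document id from the before-image for deletes, the after-image otherwise. */
    static String documentId(EventType type, Map<String, String> before, Map<String, String> after) {
        Map<String, String> image = (type == EventType.DELETE) ? before : after;
        return image.get("id");
    }
}
```

In `publishCanalEvent`, the event type would come from `rowChange.getEventType()`, and the two maps from the before and after column lists of each row.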