Using DataX for the first time, I found that it does not support MySQL 8.x out of the box.

1. Download and install

wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
tar -zxvf datax.tar.gz -C /usr/local/
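
To verify the installation, you can run the sample job that ships in the tarball (a quick sketch, assuming the default layout under /usr/local/datax and that datax.py finds a usable Python on the machine):

[root@hadoop01 ~]# cd /usr/local/datax
[root@hadoop01 datax]# python ./bin/datax.py ./job/job.json

If the install is healthy, the job finishes with a short statistics summary and no errors.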

2. Documentation
Official docs: https://github.com/alibaba/DataX/blob/master/dataxPluginDev.md

3. Simple examples
(1) stream -> stream

[root@hadoop01 home]# cd /usr/local/datax/

[root@hadoop01 datax]# vi ./job/first.json
Contents:
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 10,
            "column": [
              {
                "type": "long",
                "value": "10"
              },
              {
                "type": "string",
                "value": "hello,你好,世界-DataX"
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
       }
    }
  }
}

Run the job (with channel = 5 and sliceRecordCount = 10, streamreader should emit 10 records per channel, so roughly 50 rows are printed to the console):
[root@hadoop01 datax]# python ./bin/datax.py ./job/first.json

(2) mysql -> hdfs
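
Tip: if you are unsure which parameters a reader/writer pair accepts, datax.py can print a skeleton config to start from (the -r/-w options are described in the DataX user guide; the template is printed to stdout):

[root@hadoop01 datax]# python ./bin/datax.py -r mysqlreader -w hdfswriter

The job below was filled in along those lines.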

[root@hadoop01 datax]# vi ./job/mysql2hdfs.json
Contents:
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader", 
                    "parameter": {
                        "column": [
							"id",
                            "name"
						], 
                        "connection": [
                            {
                                "jdbcUrl": ["jdbc:mysql://hadoop01:3306/test"], 
                                "table": ["stu"]
                            }
                        ], 
                        "username": "root",
                        "password": "root"
                    }
                }, 
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "defaultFS": "hdfs://hadoop01:9000",
                        "fileType": "orc",
                        "path": "/datax/mysql2hdfs/orcfull",
                        "fileName": "m2h01",
                        "column": [
                            {
                                "name": "col1",
                                "type": "INT"
                            },
                            {
                                "name": "col2",
                                "type": "STRING"
                            }
                        ],
                        "writeMode": "append",
                        "fieldDelimiter": "\t",
                        "compress":"NONE"
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": "1"
            }
        }
    }
}

Note:
The output directory must be created before running the job:
[root@hadoop01 datax]# hdfs dfs -mkdir -p /datax/mysql2hdfs/orcfull

Run the job:
[root@hadoop01 datax]# python ./bin/datax.py ./job/mysql2hdfs.json

This fails with:
ERROR RetryUtil - Exception when calling callable, about to attempt retry 1. Planned wait [1000]ms, actual wait [1000]ms. Exception Msg: [DataX could not connect to the database. Possible causes: 1) the configured ip/port/database/jdbc is wrong, so the connection fails; 2) the configured username/password is wrong, so authentication fails. Please confirm the database connection details with your DBA.]

Replacing the MySQL driver bundled with DataX with a suitable 8.x version fixes this.
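The bundled connector lives under the mysqlreader plugin's libs directory; something like the following, although the exact jar names depend on the DataX release and on which 8.x connector you download, so treat it as a sketch:

[root@hadoop01 datax]# rm ./plugin/reader/mysqlreader/libs/mysql-connector-java-5.1.*.jar
[root@hadoop01 datax]# cp /path/to/mysql-connector-java-8.0.xx.jar ./plugin/reader/mysqlreader/libs/
[root@hadoop01 datax]# ls ./plugin/writer/mysqlwriter/libs/    # replace the jar here too if you also write to MySQL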
在這裏插入圖片描述
Since my HDFS cluster runs in HA mode, I then tried adding the following hadoopConfig:
"hadoopConfig": {
    "dfs.nameservices": "hdfs://testDfs",
    "dfs.ha.namenodes.testDfs": "namenode1,namenode2",
    "dfs.namenode.rpc-address.aliDfs.namenode1": "hadoop01:9000",
    "dfs.namenode.rpc-address.aliDfs.namenode2": "hadoop02:9000",
    "dfs.client.failover.proxy.provider.testDfs": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
}
but this fails with java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.
A suggested fix is to use WinRAR to add the three files hdfs-site.xml, core-site.xml, and hive-site.xml into datax/plugin/reader/hdfsreader/hdfsreader-0.0.1-SNAPSHOT.jar, but I have not tried it yet.
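For the record, the same repacking could be done from the command line with jar uf instead of WinRAR (equally untested here; adjust the directory to wherever your cluster's config files live, and the hdfswriter plugin jar can be updated the same way if needed):

[root@hadoop01 datax]# cd /etc/hadoop/conf
[root@hadoop01 conf]# jar uf /usr/local/datax/plugin/reader/hdfsreader/hdfsreader-0.0.1-SNAPSHOT.jar hdfs-site.xml core-site.xml hive-site.xml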
