Using Flink: SQL Gateway

Flink Usage Guide: Index of Related Articles

Background

Flink 1.16.0 ships with SQL Gateway, which allows multiple clients to execute SQL remotely and concurrently. Flink finally has a capability similar to the Spark Thrift Server.

This post covers deploying, configuring, and using Flink SQL Gateway.

Environment used by the author:

  • Flink 1.16.0
  • Hadoop 3.1.1
  • Hive 3.1.2

The official documentation for SQL Gateway is at https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql-gateway/overview/

Deploying the Service

SQL Gateway can use either a Flink standalone cluster or a Yarn cluster as the backend that executes the submitted jobs.

Standalone Cluster

For deploying a standalone cluster, see the official documentation at https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/resource-providers/standalone/overview/

In short, the steps are:

  1. Set up passwordless SSH from the cluster's master node to each worker node.
  2. Extract the Flink 1.16.0 distribution on the master node.
  3. Edit the $FLINK_HOME/conf/masters and $FLINK_HOME/conf/workers files, listing the IPs or hostnames of the JobManager and TaskManager nodes respectively, one per line. This manually specifies how the Flink cluster roles are distributed across the cluster.
  4. Switch to the user that should run the Flink cluster and execute $FLINK_HOME/bin/start-cluster.sh on the master node to start the cluster.

To shut down the standalone cluster, run $FLINK_HOME/bin/stop-cluster.sh.

After the cluster starts successfully, the next step is to start the SQL Gateway. Run:

$FLINK_HOME/bin/sql-gateway.sh start -Dsql-gateway.endpoint.rest.address=xxx.xxx.xxx.xxx

The -Dsql-gateway.endpoint.rest.address option specifies the address the SQL Gateway service binds to. Note that if you set it to localhost, the SQL Gateway can only be accessed from the local machine and cannot serve external clients. The SQL Gateway log files are located in the $FLINK_HOME/log directory.
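
To quickly check that the gateway is reachable, you can query its REST info endpoint (a minimal sanity check; 8083 is the default REST port, adjust the address to your deployment):

curl http://xxx.xxx.xxx.xxx:8083/v1/info

A JSON response containing the product name and Flink version indicates the service is running.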

Run $FLINK_HOME/bin/sql-gateway.sh -h to see more usage options for the sql-gateway.sh command:

Usage: sql-gateway.sh [start|start-foreground|stop|stop-all] [args]
  commands:
    start               - Run a SQL Gateway as a daemon
    start-foreground    - Run a SQL Gateway as a console application
    stop                - Stop the SQL Gateway daemon
    stop-all            - Stop all the SQL Gateway daemons
    -h | --help         - Show this help message

When debugging, it is recommended to run the gateway in the foreground with start-foreground, which makes it easier to watch the logs and restart the service after a failure.
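
For example, running in the foreground (a sketch; replace the address with your own):

$FLINK_HOME/bin/sql-gateway.sh start-foreground -Dsql-gateway.endpoint.rest.address=xxx.xxx.xxx.xxx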

Yarn Cluster

Extract the Flink 1.16.0 distribution on any node of the Yarn cluster, then switch to the Flink user and run:

export HADOOP_CLASSPATH=`hadoop classpath`
$FLINK_HOME/bin/yarn-session.sh -d -s 2 -jm 2048 -tm 2048

This starts a Flink Yarn session cluster. Adjust the arguments to yarn-session.sh as needed for your environment. Finally, check the RUNNING Applications page of the Yarn web UI to confirm that the Flink Yarn cluster started successfully.

The Flink user must have permission to submit Yarn applications. If it does not, switch to a user that does or grant the permission through Ranger.

After the Yarn session is up, start the SQL Gateway. Be sure to start the SQL Gateway as the same user that started the yarn-session; otherwise the SQL Gateway cannot find the Yarn application id. It will still start normally, but submitting jobs by executing SQL will fail.

After the SQL Gateway starts successfully, you should see a log line similar to:

INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                [] - Found Yarn properties file under /tmp/.yarn-properties-flink

The Yarn properties file is named .yarn-properties-{username}. The author uses the flink user, so the file is .yarn-properties-flink. If this log line appears, the SQL Gateway has found the Flink Yarn cluster.
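
Before starting the gateway you can verify that the properties file written by yarn-session.sh exists for the current user (a simple check based on the log line above; the file is written under /tmp by default):

ls -l /tmp/.yarn-properties-$(whoami)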

Later on, after a job has been submitted successfully, the log will contain entries similar to:

INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Found Web Interface xxx.xxx.xxx.xxx:40494 of application 'application_1670204805747_0006'.
INFO  org.apache.flink.client.program.rest.RestClusterClient       [] - Submitting job 'collect' (8bbea014547408c4716a483a701af8ab).
INFO  org.apache.flink.client.program.rest.RestClusterClient       [] - Successfully submitted job 'collect' (8bbea014547408c4716a483a701af8ab) to 'http://ip:40494'.

This shows that the SQL Gateway finds the application id of the Flink Yarn cluster and submits the job to that cluster.

Configuration Options

SQL Gateway configuration options can be specified dynamically in the following form; a concrete example with real keys follows the option list below:

$FLINK_HOME/bin/sql-gateway.sh -Dkey=value

The options listed in the official documentation are:

Key Default Type Description
sql-gateway.session.check-interval 1 min Duration The check interval for idle session timeout, which can be disabled by setting to zero or negative value.
sql-gateway.session.idle-timeout 10 min Duration Timeout interval for closing the session when the session hasn't been accessed during the interval. If setting to zero or negative value, the session will not be closed.
sql-gateway.session.max-num 1000000 Integer The maximum number of the active session for sql gateway service.
sql-gateway.worker.keepalive-time 5 min Duration Keepalive time for an idle worker thread. When the number of workers exceeds min workers, excessive threads are killed after this time interval.
sql-gateway.worker.threads.max 500 Integer The maximum number of worker threads for sql gateway service.
sql-gateway.worker.threads.min 5 Integer The minimum number of worker threads for sql gateway service.
  • sql-gateway.session.check-interval: How often to check whether sessions have timed out. Setting it to zero or a negative value disables the check.
  • sql-gateway.session.idle-timeout: How long a session may sit idle before it is closed automatically. Again, zero or a negative value disables this behavior.
  • sql-gateway.session.max-num: The maximum number of active sessions.
  • sql-gateway.worker.keepalive-time: Keep-alive time for idle worker threads. When the number of worker threads exceeds the minimum, the excess threads are killed after this interval.
  • sql-gateway.worker.threads.max: The maximum number of worker threads.
  • sql-gateway.worker.threads.min: The minimum number of worker threads.
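
For example, several of these options can be passed on the command line when starting the gateway (a sketch; the values are arbitrary and only illustrate the syntax):

$FLINK_HOME/bin/sql-gateway.sh start \
  -Dsql-gateway.endpoint.rest.address=xxx.xxx.xxx.xxx \
  -Dsql-gateway.session.idle-timeout=30min \
  -Dsql-gateway.session.max-num=1000 \
  -Dsql-gateway.worker.threads.max=200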

Usage

Flink SQL Gateway supports both a REST API mode and a hiveserver2 mode. The following sections describe how to use each of them.

REST API

In the deployment above, the SQL Gateway serves the REST API by default, so we can go straight to how to use it. Assume that in our test environment the SQL Gateway runs at sql-gateway-ip:8083.

First, run:

curl --request POST http://sql-gateway-ip:8083/v1/sessions

This creates a session and returns a sessionHandle. A sample response:

{"sessionHandle":"2f35eb7e-97f0-40a4-b22d-f49c3a8fe7ef"}

Next, take executing the SQL statement SELECT 1 as an example. The request format is:

curl --request POST http://sql-gateway-ip:8083/v1/sessions/${sessionHandle}/statements/ --data '{"statement": "SELECT 1"}'

Replacing ${sessionHandle} with the sessionHandle returned above, the actual command is:

curl --request POST http://sql-gateway-ip:8083/v1/sessions/2f35eb7e-97f0-40a4-b22d-f49c3a8fe7ef/statements/ --data '{"statement": "SELECT 1"}'

The response contains an operationHandle, as shown below:

{"operationHandle":"7dcb0266-ed64-423d-a984-310dc6398e5e"}

Finally, we use the sessionHandle and operationHandle to fetch the execution result. The request format is:

curl --request GET http://sql-gateway-ip:8083/v1/sessions/${sessionHandle}/operations/${operationHandle}/result/0

The trailing 0 is the token. Think of the query result as being returned in pages (batches); the token is the page number.

Substituting the real sessionHandle and operationHandle obtained earlier, the actual command is:

curl --request GET http://localhost:8083/v1/sessions/2f35eb7e-97f0-40a4-b22d-f49c3a8fe7ef/operations/7dcb0266-ed64-423d-a984-310dc6398e5e/result/0

The result is:

{"results":{"columns":[{"name":"EXPR$0","logicalType":{"type":"INTEGER","nullable":false},"comment":null}],"data":[{"kind":"INSERT","fields":[1]}]},"resultType":"PAYLOAD","nextResultUri":"/v1/sessions/2f35eb7e-97f0-40a4-b22d-f49c3a8fe7ef/operations/7dcb0266-ed64-423d-a984-310dc6398e5e/result/1"}

From results -> data -> fields we can see that the result of SELECT 1 is 1.

As mentioned, the token works like pagination. The nextResultUri in the JSON above gives the URL for fetching the next batch of results; note that the token has changed from 0 to 1. Let's request this nextResultUri:

curl --request GET http://localhost:8083/v1/sessions/2f35eb7e-97f0-40a4-b22d-f49c3a8fe7ef/operations/7dcb0266-ed64-423d-a984-310dc6398e5e/result/1

It returns:

{"results":{"columns":[{"name":"EXPR$0","logicalType":{"type":"INTEGER","nullable":false},"comment":null}],"data":[]},"resultType":"EOS","nextResultUri":null}

The resultType is now EOS, indicating that all results have been fetched. nextResultUri is null, so there is no next page of results.
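
Putting the steps together, the whole round trip can be scripted. The sketch below assumes the gateway address used in this section and that jq is available for parsing the JSON responses; it simply follows nextResultUri until it becomes null:

GATEWAY=http://sql-gateway-ip:8083

# Open a session
SESSION=$(curl -s --request POST $GATEWAY/v1/sessions | jq -r '.sessionHandle')

# Submit a statement
OPERATION=$(curl -s --request POST $GATEWAY/v1/sessions/$SESSION/statements/ \
  --data '{"statement": "SELECT 1"}' | jq -r '.operationHandle')

# Fetch result pages until nextResultUri is null (resultType EOS)
URI=/v1/sessions/$SESSION/operations/$OPERATION/result/0
while [ "$URI" != "null" ]; do
  RESPONSE=$(curl -s --request GET "$GATEWAY$URI")
  echo "$RESPONSE"
  URI=$(echo "$RESPONSE" | jq -r '.nextResultUri')
  sleep 1
done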

hiveserver2

Besides the REST API described above, the SQL Gateway also supports a hiveserver2 mode.

The official documentation for the SQL Gateway hiveserver2 mode is at https://nightlies.apache.org/flink/flink-docs-release-1.16/zh/docs/dev/table/hive-compatibility/hiveserver2/

The hiveserver2 mode requires some additional dependencies. First, add flink-connector-hive_2.12-1.16.0.jar to Flink's lib directory. The jar can be downloaded from https://repo1.maven.org/maven2/org/apache/flink/flink-connector-hive_2.12/1.16.0/flink-connector-hive_2.12-1.16.0.jar

In addition, the following Hive dependencies are required:

  • hive-common.jar
  • hive-service-rpc.jar
  • hive-exec.jar
  • libthrift.jar
  • libfb303.jar
  • antlr-runtime.jar

The versions of these jars must match the Hive installed in your cluster; it is best to copy them directly from the lib directory of the cluster's Hive installation.
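
For example (a sketch; HIVE_HOME here is an assumed path to the cluster's Hive installation, adjust it to your environment):

cp $HIVE_HOME/lib/hive-common-*.jar $FLINK_HOME/lib/
cp $HIVE_HOME/lib/hive-service-rpc-*.jar $FLINK_HOME/lib/
cp $HIVE_HOME/lib/hive-exec-*.jar $FLINK_HOME/lib/
cp $HIVE_HOME/lib/libthrift-*.jar $FLINK_HOME/lib/
cp $HIVE_HOME/lib/libfb303-*.jar $FLINK_HOME/lib/
cp $HIVE_HOME/lib/antlr-runtime-*.jar $FLINK_HOME/lib/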

The command to start the SQL Gateway in hiveserver2 mode is:

$FLINK_HOME/bin/sql-gateway.sh start -Dsql-gateway.endpoint.rest.address=xxx.xxx.xxx.xxx -Dsql-gateway.endpoint.type=hiveserver2 -Dsql-gateway.endpoint.hiveserver2.catalog.hive-conf-dir=/path/to/hive/conf -Dsql-gateway.endpoint.hiveserver2.thrift.port=10000

The parameters mean:

  • -Dsql-gateway.endpoint.rest.address: The address the SQL Gateway service binds to.
  • -Dsql-gateway.endpoint.type: The endpoint type. The default is rest, i.e. the REST API; it must be set explicitly to hiveserver2 to use that mode.
  • -Dsql-gateway.endpoint.hiveserver2.catalog.hive-conf-dir: The directory containing hive-site.xml, used to connect to the Hive metastore and fetch table metadata.
  • -Dsql-gateway.endpoint.hiveserver2.thrift.port: The port the SQL Gateway listens on in hiveserver2 mode, equivalent to the Hive thrift server port.

Beyond those listed above, the hiveserver2 mode has many more options; see https://nightlies.apache.org/flink/flink-docs-release-1.16/zh/docs/dev/table/hive-compatibility/hiveserver2/#endpoint-options. They are not all listed here.

Starting the SQL Gateway at this point may produce the following error:

org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.planner.delegation.DialectFactory' in the classpath.

Available factory identifiers are:

Note: if you want to use Hive dialect, please first move the jar `flink-table-planner_2.12` located in `FLINK_HOME/opt` to `FLINK_HOME/lib` and then move out the jar `flink-table-planner-loader` from `FLINK_HOME/lib`.
        at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:545) ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
        at org.apache.flink.table.planner.delegation.PlannerBase.getDialectFactory(PlannerBase.scala:161) ~[?:?]
        at org.apache.flink.table.planner.delegation.PlannerBase.getParser(PlannerBase.scala:171) ~[?:?]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.getParser(TableEnvironmentImpl.java:1694) ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.<init>(TableEnvironmentImpl.java:240) ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
        at org.apache.flink.table.api.bridge.internal.AbstractStreamTableEnvironmentImpl.<init>(AbstractStreamTableEnvironmentImpl.java:89) ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
        at org.apache.flink.table.api.bridge.java.internal.StreamTableEnvironmentImpl.<init>(StreamTableEnvironmentImpl.java:84) ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
        at org.apache.flink.table.gateway.service.context.SessionContext.createStreamTableEnvironment(SessionContext.java:309) ~[flink-sql-gateway-1.16.0.jar:1.16.0]
        at org.apache.flink.table.gateway.service.context.SessionContext.createTableEnvironment(SessionContext.java:269) ~[flink-sql-gateway-1.16.0.jar:1.16.0]
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.getTableEnvironment(OperationExecutor.java:218) ~[flink-sql-gateway-1.16.0.jar:1.16.0]
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:89) ~[flink-sql-gateway-1.16.0.jar:1.16.0]
        at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$0(SqlGatewayServiceImpl.java:182) ~[flink-sql-gateway-1.16.0.jar:1.16.0]
        at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:111) ~[flink-sql-gateway-1.16.0.jar:1.16.0]
        at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:239) ~[flink-sql-gateway-1.16.0.jar:1.16.0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_121]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_121]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
2022-12-08 17:42:03,007 INFO  org.apache.flink.table.catalog.hive.HiveCatalog              [] - Created HiveCatalog 'hive'
2022-12-08 17:42:03,008 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Trying to connect to metastore with URI thrift://xxx.xxx.xxx.xxx:9083
2022-12-08 17:42:03,008 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Opened a connection to metastore, current connections: 3
2022-12-08 17:42:03,009 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Connected to metastore.
2022-12-08 17:42:03,010 INFO  org.apache.hadoop.hive.metastore.RetryingMetaStoreClient     [] - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.metastore.HiveMetaStoreClient ugi=yarn (auth:SIMPLE) retries=24 delay=5 lifetime=0
2022-12-08 17:42:03,010 INFO  org.apache.flink.table.catalog.hive.HiveCatalog              [] - Connected to Hive metastore
2022-12-08 17:42:03,026 INFO  org.apache.flink.table.module.ModuleManager                  [] - Loaded module 'hive' from class org.apache.flink.table.module.hive.HiveModule
2022-12-08 17:42:03,030 INFO  org.apache.flink.table.gateway.service.session.SessionManager [] - Session f3f6f339-f5b0-425f-94ad-3e9ad11981c1 is opened, and the number of current sessions is 3.
2022-12-08 17:42:03,043 ERROR org.apache.flink.table.gateway.service.operation.OperationManager [] - Failed to execute the operation 7922e186-8110-4bb8-b93d-db17d88eac48.
org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.planner.delegation.DialectFactory' in the classpath.

If you hit this error, Flink cannot find the Hive dialect. Move flink-table-planner_2.12-1.16.0.jar from Flink's opt directory into the lib directory, then remove flink-table-planner-loader-1.16.0.jar from the lib directory, as sketched below.
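
A sketch of the jar swap (moving the loader jar back into opt keeps it around in case you need to revert):

mv $FLINK_HOME/opt/flink-table-planner_2.12-1.16.0.jar $FLINK_HOME/lib/
mv $FLINK_HOME/lib/flink-table-planner-loader-1.16.0.jar $FLINK_HOME/opt/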

At this point, the contents of Flink's lib directory are:

antlr-runtime-3.5.2.jar
flink-cep-1.16.0.jar
flink-connector-files-1.16.0.jar
flink-connector-hive_2.12-1.16.0.jar
flink-csv-1.16.0.jar
flink-dist-1.16.0.jar
flink-json-1.16.0.jar
flink-scala_2.12-1.16.0.jar
flink-shaded-zookeeper-3.5.9.jar
flink-table-api-java-uber-1.16.0.jar
flink-table-planner_2.12-1.16.0.jar
flink-table-runtime-1.16.0.jar
hive-common-3.1.0.3.0.1.0-187.jar
hive-exec-3.1.0.3.0.1.0-187.jar
hive-service-rpc-3.1.0.3.0.1.0-187.jar
libfb303-0.9.3.jar
libthrift-0.9.3.jar
log4j-1.2-api-2.17.1.jar
log4j-api-2.17.1.jar
log4j-core-2.17.1.jar
log4j-slf4j-impl-2.17.1.jar

The SQL Gateway now works, but querying Hive tables with Flink will still fail due to missing dependencies. The following Hadoop dependencies also need to be added:

  • hadoop-common.jar
  • hadoop-mapreduce-client-common.jar
  • hadoop-mapreduce-client-core.jar
  • hadoop-mapreduce-client-jobclient.jar

The final contents of the lib directory are:

antlr-runtime-3.5.2.jar
flink-cep-1.16.0.jar
flink-connector-files-1.16.0.jar
flink-connector-hive_2.12-1.16.0.jar
flink-csv-1.16.0.jar
flink-dist-1.16.0.jar
flink-json-1.16.0.jar
flink-scala_2.12-1.16.0.jar
flink-shaded-zookeeper-3.5.9.jar
flink-table-api-java-uber-1.16.0.jar
flink-table-planner_2.12-1.16.0.jar
flink-table-runtime-1.16.0.jar
hadoop-common-3.1.1.3.0.1.0-187.jar
hadoop-mapreduce-client-common-3.1.1.3.0.1.0-187.jar
hadoop-mapreduce-client-core-3.1.1.3.0.1.0-187.jar
hadoop-mapreduce-client-jobclient-3.1.1.3.0.1.0-187.jar
hive-common-3.1.0.3.0.1.0-187.jar
hive-exec-3.1.0.3.0.1.0-187.jar
hive-service-rpc-3.1.0.3.0.1.0-187.jar
libfb303-0.9.3.jar
libthrift-0.9.3.jar
log4j-1.2-api-2.17.1.jar
log4j-api-2.17.1.jar
log4j-core-2.17.1.jar
log4j-slf4j-impl-2.17.1.jar

Finally, try starting the gateway again; in the author's test it started successfully.

The next step is connecting to the SQL Gateway over JDBC. Note that the connection URL must include the auth=noSasl property. For example:

jdbc:hive2://sql-gateway-ip:10000/default;auth=noSasl

Otherwise the SQL Gateway reports the following error:

org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?

The following sections show how to connect to the Flink SQL Gateway with DBeaver, Java code, and Beeline.

DBeaver

Click New Connection -> Apache Hive (you can find it via search). In the Main -> General pane, fill in the host, port, and database (the database can be left empty). Then, on the Driver properties tab, add a user property named auth with the value noSasl. Click Finish and the connection is created; you can then click the SQL button on the toolbar to open a SQL editor and write SQL.

Note: In the final step of creating the connection, DBeaver needs to download the Hive JDBC driver from GitHub. The download may time out due to network problems, and clicking Retry in DBeaver does not help. You can download the driver manually instead: in the Connect to a database wizard, click Edit Driver and open the Libraries tab to see the driver's download link. Copy it into a browser and download it. Then go into the C:\Users\xxx\AppData\Roaming\DBeaverData\drivers\remote\ directory and walk down the subdirectories to find where the driver class is stored, for example C:\Users\xxx\AppData\Roaming\DBeaverData\drivers\remote\timveil\hive-jdbc-uber-jar\releases\download\v1.9-2.6.5. Put the driver downloaded by the browser into this directory (if the directory contains a partially downloaded driver file left by DBeaver, delete it first). Click Finish in the Connect to a database wizard to close the wizard, and you are done.

Using Java Code

Add the following dependency to Maven:

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>3.1.2</version>
</dependency>

Then write the Java code (wrapped below in a runnable class with the required imports; the class name is arbitrary):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SqlGatewayJdbcExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (
                // Please replace the JDBC URI with your actual host, port and database.
                Connection connection = DriverManager.getConnection("jdbc:hive2://sql-gateway-ip:10000/default;auth=noSasl");
                Statement statement = connection.createStatement()) {
            statement.execute("select * from some_table");
            ResultSet resultSet = statement.getResultSet();
            while (resultSet.next()) {
                System.out.println(resultSet.getString(1));
            }
        }
    }
}

This is no different from ordinary JDBC usage. Note that the Hive driver class name is org.apache.hive.jdbc.HiveDriver.

Using Beeline

Start beeline and connect to the SQL Gateway with the following commands:

./beeline
!connect jdbc:hive2://sql-gateway-ip:10000/default;auth=noSasl

You will be prompted for a username and password. Since the current version does not support authentication, just press Enter to skip both. Once connected, you can run SQL statements just as with Hive.

The above is the Beeline usage given in the official documentation, but while verifying it the author ran into the following error:

2022-12-09 10:24:28,600 ERROR org.apache.flink.table.endpoint.hive.HiveServer2Endpoint     [] - Failed to GetInfo.
java.lang.UnsupportedOperationException: Unrecognized TGetInfoType value: CLI_ODBC_KEYWORDS.
        at org.apache.flink.table.endpoint.hive.HiveServer2Endpoint.GetInfo(HiveServer2Endpoint.java:371) [flink-connector-hive_2.12-1.16.0.jar:1.16.0]
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo.getResult(TCLIService.java:1537) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo.getResult(TCLIService.java:1522) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
2022-12-09 10:24:28,600 ERROR org.apache.thrift.server.TThreadPoolServer                   [] - Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Required field 'infoValue' is unset! Struct:TGetInfoResp(status:TStatus(statusCode:ERROR_STATUS, infoMessages:[*java.lang.UnsupportedOperationException:Unrecognized TGetInfoType value: CLI_ODBC_KEYWORDS.:9:8, org.apache.flink.table.endpoint.hive.HiveServer2Endpoint:GetInfo:HiveServer2Endpoint.java:371, org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo:getResult:TCLIService.java:1537, org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetInfo:getResult:TCLIService.java:1522, org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39, org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39, org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286, java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142, java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617, java.lang.Thread:run:Thread.java:745], errorMessage:Unrecognized TGetInfoType value: CLI_ODBC_KEYWORDS.), infoValue:null)
        at org.apache.hive.service.rpc.thrift.TGetInfoResp.validate(TGetInfoResp.java:379) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.hive.service.rpc.thrift.TCLIService$GetInfo_result.validate(TCLIService.java:5228) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.hive.service.rpc.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:5285) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.hive.service.rpc.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:5254) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.hive.service.rpc.thrift.TCLIService$GetInfo_result.write(TCLIService.java:5205) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
2022-12-09 10:24:28,600 WARN  org.apache.thrift.transport.TIOStreamTransport               [] - Error closing output stream.
java.net.SocketException: Socket closed
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118) ~[?:1.8.0_121]
        at java.net.SocketOutputStream.write(SocketOutputStream.java:155) ~[?:1.8.0_121]
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.8.0_121]
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.8.0_121]
        at java.io.FilterOutputStream.close(FilterOutputStream.java:158) ~[?:1.8.0_121]
        at org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSocket.close(TSocket.java:235) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:303) [hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]

Investigating this error shows that it is a bug in Flink 1.16.0, tracked as FLINK-29839. The community fixed it in 1.16.1.

This post is the author's original work. Discussion and corrections are welcome. Please credit the source when reposting.
