Connecting to HiveServer2 remotely over JDBC

Reposted from http://www.mamicode.com/info-detail-1563222.html

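I ran into many problems while connecting to Hive over JDBC, two of them major:

  • First: I had not started the hiveserver2 service. I had not properly understood the concept and assumed I could connect to Hive directly, as with the CLI;
  • Second: hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar was missing. I assumed the Hive jars alone would be enough and did not expect to need a Hadoop jar as well.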

Luckily I came across this blog post, so I am sharing it here!
Where my own experience differed from the original, I have marked it.


Introduction

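In my earlier study and use of Hive I always worked through the CLI or hive -e. That approach only lets you run queries, updates, and similar operations in HiveQL, and it is rather clumsy and limited.
Fortunately Hive provides a thin-client implementation: through HiveServer or HiveServer2, a client can operate on the data in Hive without starting the CLI. Both allow remote clients written in a variety of programming languages, such as Java and Python, to submit requests to Hive and retrieve the results.
Both HiveServer and HiveServer2 are built on Thrift, but HiveServer is sometimes called the Thrift server, whereas HiveServer2 is not. Since HiveServer already exists, why is HiveServer2 needed?
The reason is that HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface that HiveServer uses and cannot be fixed by modifying the HiveServer code. HiveServer was therefore rewritten in Hive 0.11.0 as HiveServer2, which solves the problem.
HiveServer2 supports multi-client concurrency and authentication, and provides better support for open API clients such as JDBC and ODBC.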

This article therefore uses HiveServer2 as the example to introduce and write a Java API for operating Hive remotely.


Hive configuration

First, here are the key Hive configuration settings used in this article.

(I did not change "hive.server2.long.polling.timeout" in my own setup and did not hit any error.)

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/usr/hive/warehouse</value>               <!-- HDFS directory where Hive's databases and tables are stored -->
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>                               <!-- port for remote connections to HiveServer2; the default is 10000 -->
  <description>Port number of HiveServer2 Thrift interface.
  Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
</property>

<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>**.**.**.**</value>                          <!-- IP address of the cluster where Hive runs -->
  <description>Bind host on which to run the HiveServer2 Thrift interface.
  Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>
<property>
  <name>hive.server2.long.polling.timeout</name>
  <value>5000</value>                                <!-- the default is 5000L; changed to 5000 here, otherwise the program reports an error -->
  <description>Time in milliseconds that HiveServer2 will wait, before responding to asynchronous calls that use long polling</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>  <!-- Hive's metastore database; I use a local MySQL instance -->
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>                         
  <name>javax.jdo.option.ConnectionDriverName</name>          <!-- driver used to connect to the metastore database -->
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>             <!-- user name for the metastore database -->
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>             <!-- password for the metastore database -->
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>
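
The host and port configured above are what the JDBC URL later in this article is made of. As a minimal sketch, the URL could be assembled from those two settings like this. HiveUrlBuilder and buildUrl are hypothetical names used only for illustration, and 192.0.2.10 is a placeholder address, not a value from my cluster.

```java
import java.util.Properties;

/** Hypothetical helper: assembles a HiveServer2 JDBC URL from hive-site.xml style settings. */
class HiveUrlBuilder {

    /** Builds jdbc:hive2://host:port/database, defaulting to port 10000 and the "default" database. */
    static String buildUrl(Properties conf, String database) {
        String host = conf.getProperty("hive.server2.thrift.bind.host");
        if (host == null || host.trim().isEmpty()) {
            throw new IllegalArgumentException("hive.server2.thrift.bind.host is not set");
        }
        // 10000 is the default HiveServer2 Thrift port listed in the configuration above
        String port = conf.getProperty("hive.server2.thrift.port", "10000");
        String db = (database == null || database.isEmpty()) ? "default" : database;
        return "jdbc:hive2://" + host.trim() + ":" + port.trim() + "/" + db;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("hive.server2.thrift.bind.host", "192.0.2.10"); // placeholder address
        conf.setProperty("hive.server2.thrift.port", "10000");
        System.out.println(buildUrl(conf, "default")); // prints jdbc:hive2://192.0.2.10:10000/default
    }
}
```

Keeping the URL derived from the same two properties makes it harder for the client and the server configuration to drift apart.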

Starting the HiveServer2 service

Once the configuration above is correct, start the HiveServer2 service.

(In my own setup I did not use this command to start the metastore, and ran into no problems.)

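First start the metastore by typing the following on the command line:
hive --service metastore &
(The & makes the process run in the background. The command otherwise occupies the terminal, and pressing Ctrl+C to get the prompt back would shut the service down.)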

The command line then appears to hang. The log file hive.log shows the following:

2016-04-26 04:44:53,956 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(5060)) - Starting hive metastore on port 9083
2016-04-26 04:44:54,174 WARN  [main]: conf.HiveConf (HiveConf.java:initialize(1390)) - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
2016-04-26 04:44:54,326 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(494)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2016-04-26 04:44:54,412 INFO  [main]: metastore.ObjectStore (ObjectStore.java:initialize(245)) - ObjectStore, initialize called
2016-04-26 04:44:57,240 WARN  [main]: conf.HiveConf (HiveConf.java:initialize(1390)) - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
2016-04-26 04:44:57,246 INFO  [main]: metastore.ObjectStore (ObjectStore.java:getPMF(314)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2016-04-26 04:45:03,597 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(228)) - Initialized ObjectStore
2016-04-26 04:45:03,806 WARN  [main]: metastore.ObjectStore (ObjectStore.java:checkSchema(6273)) - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.0
2016-04-26 04:45:04,811 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(552)) - Added admin role in metastore
2016-04-26 04:45:04,828 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(561)) - Added public role in metastore
2016-04-26 04:45:04,984 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(589)) - No user is added in admin role, since config is empty
2016-04-26 04:45:05,361 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5182)) - Starting DB backed MetaStore Server
2016-04-26 04:45:05,369 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5194)) - Started the new metaserver on port [9083]...
2016-04-26 04:45:05,369 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5196)) - Options.minWorkerThreads = 200
2016-04-26 04:45:05,370 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5198)) - Options.maxWorkerThreads = 100000
2016-04-26 04:45:05,370 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5200)) - TCP keepalive = true

This shows that the metastore has started.
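
Reading a long log by eye is error-prone, so the same check can be scripted. The sketch below looks for the "Started the new metaserver on port" line that appears in the log above and reports the first ERROR or FATAL line it finds. MetastoreLogCheck and its methods are hypothetical names, and the marker text is taken from my Hive 0.13.0 log, so other versions may word it differently.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper: scans hive.log lines for the metastore startup marker. */
class MetastoreLogCheck {

    /** Returns true if any line reports that the metastore server has started. */
    static boolean metastoreStarted(List<String> logLines) {
        for (String line : logLines) {
            if (line.contains("Started the new metaserver on port")) {
                return true;
            }
        }
        return false;
    }

    /** Returns the first line logged at ERROR or FATAL level, or null if there is none. */
    static String firstProblem(List<String> logLines) {
        for (String line : logLines) {
            if (line.contains(" ERROR ") || line.contains(" FATAL ")) {
                return line;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to hive.log; its location depends on your log4j configuration
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("metastore started: " + metastoreStarted(lines));
        String problem = firstProblem(lines);
        if (problem != null) {
            System.out.println("first problem: " + problem);
        }
    }
}
```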

Next, start the hiveserver2 service:

On the command line, type: hive --service hiveserver2 &

As before, the command line appears to hang. The log file shows the following:

2016-04-26 04:53:24,212 INFO  [main]: server.HiveServer2 (HiveStringUtils.java:startupShutdownMessage(605)) - STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting HiveServer2
STARTUP_MSG:   host = master/(the IP you configured earlier)
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.13.0
STARTUP_MSG:   classpath = /opt/modules/hadoop-2.2.0/etc/hadoop:/opt/modules/hadoop-2.2.0/share/hadoop/common/lib
//(... rest of the classpath omitted, the log output is too long ...)
STARTUP_MSG:   build = file:///Users/hbutani/svn/branch-0.13 -r Unknown; compiled by 'hbutani' on Tue Apr 15 13:55:42 PDT 2014
************************************************************/
2016-04-26 04:53:24,553 WARN  [main]: conf.HiveConf (HiveConf.java:initialize(1390)) - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
2016-04-26 04:53:25,258 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(494)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2016-04-26 04:53:25,325 INFO  [main]: metastore.ObjectStore (ObjectStore.java:initialize(245)) - ObjectStore, initialize called
2016-04-26 04:53:28,312 WARN  [main]: conf.HiveConf (HiveConf.java:initialize(1390)) - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
2016-04-26 04:53:28,313 INFO  [main]: metastore.ObjectStore (ObjectStore.java:getPMF(314)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2016-04-26 04:53:31,537 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(228)) - Initialized ObjectStore
2016-04-26 04:53:32,064 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(552)) - Added admin role in metastore
2016-04-26 04:53:32,079 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(561)) - Added public role in metastore
2016-04-26 04:53:32,205 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(589)) - No user is added in admin role, since config is empty
2016-04-26 04:53:33,887 INFO  [main]: session.SessionState (SessionState.java:start(358)) - No Tez session required at this point. hive.execution.engine=mr.
2016-04-26 04:53:34,168 WARN  [main]: conf.HiveConf (HiveConf.java:initialize(1390)) - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
2016-04-26 04:53:34,241 INFO  [main]: service.CompositeService (SessionManager.java:init(70)) - HiveServer2: Async execution thread pool size: 100
2016-04-26 04:53:34,241 INFO  [main]: service.CompositeService (SessionManager.java:init(72)) - HiveServer2: Async execution wait queue size: 100
2016-04-26 04:53:34,242 INFO  [main]: service.CompositeService (SessionManager.java:init(74)) - HiveServer2: Async execution thread keepalive time: 10
2016-04-26 04:53:34,244 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:OperationManager is inited.
2016-04-26 04:53:34,247 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:SessionManager is inited.
2016-04-26 04:53:34,247 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:CLIService is inited.
2016-04-26 04:53:34,247 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:ThriftBinaryCLIService is inited.
2016-04-26 04:53:34,247 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:HiveServer2 is inited.
2016-04-26 04:53:34,248 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:OperationManager is started.
2016-04-26 04:53:34,248 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:SessionManager is started.
2016-04-26 04:53:34,248 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:CLIService is started.
2016-04-26 04:53:34,698 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(589)) - No user is added in admin role, since config is empty
2016-04-26 04:53:34,699 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(624)) - 0: get_databases: default
2016-04-26 04:53:34,701 INFO  [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(306)) - ugi=hh  ip=unknown-ip-addr  cmd=get_databases: default  
2016-04-26 04:53:34,725 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(494)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2016-04-26 04:53:34,728 INFO  [main]: metastore.ObjectStore (ObjectStore.java:initialize(245)) - ObjectStore, initialize called
2016-04-26 04:53:34,745 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(228)) - Initialized ObjectStore
2016-04-26 04:53:34,795 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:ThriftBinaryCLIService is started.
2016-04-26 04:53:34,796 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:HiveServer2 is started.
2016-04-26 04:53:34,947 WARN  [Thread-5]: conf.HiveConf (HiveConf.java:initialize(1390)) - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
2016-04-26 04:53:35,584 INFO  [Thread-5]: thrift.ThriftCLIService (ThriftBinaryCLIService.java:run(88)) - ThriftBinaryCLIService listening on /(your IP):10000

You can also check whether hiveserver2 is running with the following command:

[hh@master Desktop]$ netstat -nl |grep 10000
tcp        0      0 (your IP):10000        0.0.0.0:*                   LISTEN

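This shows that the hiveserver2 service has started.
(Note: be sure to check the log, because the command line does not report errors. If startup fails, the exception appears in the log. The path of the log file hive.log is configured in $HIVE_HOME/conf/hive-log4j.properties.)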
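
The netstat check can also be done from the client machine, which additionally confirms that the port is reachable over the network. Here is a minimal sketch using a plain TCP connect. PortCheck and isListening are hypothetical names, and a successful connect only shows that something accepts connections on that port, not that it is HiveServer2.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP port accepts connections. */
class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions for illustration; pass your own host and port
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10000;
        boolean up = isListening(host, port, 2000);
        System.out.println(host + ":" + port + (up ? " is accepting connections" : " is not reachable"));
    }
}
```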


Writing the Java API

Now let's write the Java API:

First, the jar files this program depends on:
(The Hadoop jar deserves emphasis: it is required, and the program fails without it.)

hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar
$HIVE_HOME/lib/hive-exec-0.11.0.jar 
$HIVE_HOME/lib/hive-jdbc-0.11.0.jar 
$HIVE_HOME/lib/hive-metastore-0.11.0.jar 
$HIVE_HOME/lib/hive-service-0.11.0.jar 
$HIVE_HOME/lib/libfb303-0.9.0.jar 
$HIVE_HOME/lib/commons-logging-1.0.4.jar 
$HIVE_HOME/lib/slf4j-api-1.6.1.jar
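
Since a missing jar was one of my two main problems, it can help to verify the classpath before opening a connection. The sketch below tries to load one class from hive-jdbc and one from hadoop-common and reports whichever cannot be found. ClasspathCheck and missingClasses are hypothetical names, and the choice of org.apache.hadoop.conf.Configuration as the probe class for hadoop-common is my own assumption for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: reports which required classes are missing from the classpath. */
class ClasspathCheck {

    /** Returns the subset of the given class names that cannot be loaded. */
    static List<String> missingClasses(List<String> classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> required = Arrays.asList(
                "org.apache.hive.jdbc.HiveDriver",       // from hive-jdbc
                "org.apache.hadoop.conf.Configuration"); // from hadoop-common (probe class is an assumption)
        List<String> missing = missingClasses(required);
        System.out.println(missing.isEmpty() ? "all required classes found" : "missing: " + missing);
    }
}
```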

The Java code follows:
(Code from other blog posts would work for this part as well.)

JDBCToHiveUtils.java

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class JDBCToHiveUtils {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    // Fill in Hive's IP address, the one set in the configuration file earlier
    private static String Url = "jdbc:hive2://**.**.**.**:10000/default";
    private static Connection conn;

    public static Connection getConnnection() {
        try {
            Class.forName(driverName);
            // The user name must belong to a user with permission to operate on HDFS,
            // otherwise the program reports a "permission deny" exception
            conn = DriverManager.getConnection(Url, "hh", "");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return conn;
    }
    public static PreparedStatement prepare(Connection conn, String sql) {
        PreparedStatement ps = null;
        try {
            ps = conn.prepareStatement(sql);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return ps;
    }
}

QueryHiveUtils.java

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class QueryHiveUtils {
    private static Connection conn = JDBCToHiveUtils.getConnnection();

    // Prints every row of the given table, one row per line
    public static void getAll(String tablename) {
        String sql = "select * from " + tablename;
        System.out.println(sql);
        // try-with-resources closes the statement and result set when done
        try (PreparedStatement ps = JDBCToHiveUtils.prepare(conn, sql);
             ResultSet rs = ps.executeQuery()) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                for (int i = 1; i <= columns; i++) {
                    System.out.print(rs.getString(i));
                    System.out.print("\t\t");
                }
                System.out.println();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

QueryHiveTest.java

public class QueryHiveTest {

    public static void main(String[] args) {
        String tablename="test1";
        QueryHiveUtils.getAll(tablename);
    }

}
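
One caveat about getAll: it builds the SQL by concatenating the table name, and JDBC parameter placeholders bind values, not identifiers, so they cannot be used for a table name. If the name ever comes from user input, it is worth validating first. The sketch below is one conservative way to do that. TableNames, isSafe, and selectAll are hypothetical names, and the pattern (letters, digits, underscores, with an optional database prefix) is deliberately stricter than what Hive itself may accept.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: validates a table name before it is concatenated into SQL. */
class TableNames {

    // Conservative assumption: letters, digits and underscores, optionally prefixed by "database."
    private static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?");

    /** Returns true if the name matches the conservative identifier pattern. */
    static boolean isSafe(String name) {
        return name != null && IDENTIFIER.matcher(name).matches();
    }

    /** Builds the same query as getAll, but rejects names that fail validation. */
    static String selectAll(String tableName) {
        if (!isSafe(tableName)) {
            throw new IllegalArgumentException("Unsafe table name: " + tableName);
        }
        return "select * from " + tableName;
    }

    public static void main(String[] args) {
        System.out.println(selectAll("test1"));            // prints select * from test1
        System.out.println(isSafe("test1; drop table x")); // prints false
    }
}
```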