Querying Hive with Spark SQL in IDEA

1. Set up the project with Maven and add the following dependencies:

<properties>
  <scala.version>2.11.8</scala.version>
  <spark.version>2.3.1</spark.version>
</properties>

<dependencies>
  <!-- Scala standard library -->
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
  </dependency>

  <!-- Spark SQL -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>${spark.version}</version>
  </dependency>

  <!-- Hive integration for Spark SQL -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>
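
Note that Maven alone does not compile .scala sources, so building from the command line usually also requires a Scala compiler plugin; inside IDEA the Scala plugin handles compilation, so this only matters for mvn package. A minimal sketch using the widely used scala-maven-plugin (the version number here is an assumption; use whatever is current for your environment):

<build>
  <plugins>
    <!-- Compiles src/main/scala; the version below is an assumption -->
    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <version>3.2.2</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>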

2. Copy the hive-site.xml file from SPARK_HOME/conf into the project's resources directory. The file contents are as follows:

<configuration>

  <property>
    <name>hive.metastore.client.connect.retry.delay</name>
    <value>5</value>
  </property>

  <property>
    <name>hive.metastore.client.socket.timeout</name>
    <value>1800</value>
  </property>

  <!-- The metastore thrift endpoint; this is the key property Spark needs -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://master.htdata.com:9083</value>
  </property>

  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>

  <property>
    <name>hive.server2.thrift.port</name>
    <value>10016</value>
  </property>

  <property>
    <name>hive.server2.transport.mode</name>
    <value>binary</value>
  </property>

</configuration>
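
If you would rather not carry hive-site.xml in the project, hive.metastore.uris can also be set programmatically on the SparkSession builder. A minimal sketch, assuming the same thrift endpoint as in the file above:

import org.apache.spark.sql.SparkSession

// Point Spark at the Hive metastore without a hive-site.xml on the classpath.
val spark = SparkSession
  .builder()
  .appName("hive")
  .master("local[2]")
  .config("hive.metastore.uris", "thrift://master.htdata.com:9083")
  .enableHiveSupport()
  .getOrCreate()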

3. Runtime environment: JDK 1.8, Scala 2.11.12

4. Code

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Run locally with 2 threads; the hive-site.xml on the classpath
// tells Spark where to find the Hive metastore.
val conf = new SparkConf()
conf.setAppName("hive").setMaster("local[2]")

val hive = SparkSession
  .builder()
  .enableHiveSupport() // enables HiveQL and access to Hive tables
  .config(conf)
  .getOrCreate()

// Execute a HiveQL statement and print each row of the result.
hive.sql("show tables")
  .collect()
  .foreach(println)
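
Any HiveQL statement can be run the same way, and the result comes back as a DataFrame. A sketch with a hypothetical table named employees (the table and column names are assumptions for illustration):

// Query a regular Hive table; show() prints a tabular preview of the result.
hive.sql("use default")
val df = hive.sql("select dept, count(*) as cnt from employees group by dept")
df.show()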
    

Following this process end to end, I didn't run into any pitfalls.
