Installing Hive 2.3 on CentOS 7 and Configuring Spark SQL to Access Hive

1. Install MySQL

yum install wget
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
rpm -ivh mysql-community-release-el7-5.noarch.rpm

yum install mysql-server


Start MySQL:

service mysqld start

Enable it at boot:

systemctl enable mysqld.service

Set the root password and secure the installation:

# /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it? [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!
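
The hive-site.xml below connects to MySQL as root/root for simplicity. If you would rather use a dedicated metastore account, a minimal sketch (the user name hive and password hivepassword are placeholders, not part of the original setup):

mysql -uroot -p
mysql> GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'localhost' IDENTIFIED BY 'hivepassword'; -- placeholder credentials
mysql> FLUSH PRIVILEGES;

If you go this route, put the same credentials into javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword later.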

 

2. Install and configure Hive

Download Hive and the MySQL JDBC driver (mysql-connector-java-5.*.*-bin.jar) from their official download pages.
Upload them to the server, then extract the Hive tarball:
tar -zxvf apache-hive-2.3.6-bin.tar.gz -C ../app/

Set the Hive environment variables:

vi ~/.bashrc
# Hive environment (# starts a comment)
export HIVE_HOME=/home/hadoop/app/apache-hive-2.3.6-bin  
export PATH=$HIVE_HOME/bin:$HIVE_HOME/conf:$PATH

Reload the environment:

source ~/.bashrc
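
A quick check that the variables took effect:

echo $HIVE_HOME        # should print the Hive install path
hive --version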
Edit the configuration files next. Create a hive-site.xml file in the Hive conf/ directory:

cd ~/app/apache-hive-2.3.6-bin/conf
vi hive-site.xml

<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>root</value>
    </property>

    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
        <description>Enforce metastore schema version consistency.</description>
    </property>
</configuration>

Finally, copy mysql-connector-java-5.1.15-bin.jar into the lib/ directory of the Hive installation (/home/hadoop/app/apache-hive-2.3.6-bin/lib).
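
Assuming the connector jar was uploaded to the hadoop user's home directory (adjust the source path to wherever you actually put it), that is just:

cp ~/mysql-connector-java-5.1.15-bin.jar $HIVE_HOME/lib/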

 

3. Initialize the metastore schema

[hadoop@sparkServer apache-hive-2.3.6-bin]$ bin/schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/app/apache-hive-2.3.6-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/app/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
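
As an optional check, the initialization should have created the metastore tables inside the hive database in MySQL; you can list them with:

mysql -uroot -p -e 'use hive; show tables;'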


4. Test Hive
(1) Create a database:

create database db_hive_test;

(2) Create a test table:

use db_hive_test;
create table student(id int,name string) row format delimited fields terminated by '\t';

(3) Back in the Linux shell, create a student.txt file and write the data into it, with id and name separated by a Tab (see the printf sketch after the sample rows):
1001    zhangsan
1002    lisi
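
Tabs are easy to get wrong when typing by hand, and rows separated by spaces instead will typically load as NULL. One way to guarantee a real Tab separator is to generate the file with printf:

printf '1001\tzhangsan\n1002\tlisi\n' > /home/hadoop/student.txt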
(4) Load the data into Hive:

load data local inpath '/home/hadoop/student.txt' into table db_hive_test.student;


(5) Check the result:

select * from db_hive_test.student;
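
If the load worked, the query returns the two rows from student.txt:

1001    zhangsan
1002    lisi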

 

5. Connect Spark to the Hive metastore (MySQL)

1) Copy Hive's hive-site.xml into Spark's conf directory.

2) Edit the hive-site.xml now under Spark's conf and add the following (if the file already has a <configuration> block, merge the property into it):

<configuration>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://localhost:9083</value>
    </property>
</configuration>

3) Start the Hive metastore service in a separate terminal window:
[root@head42 conf]$ hive --service metastore
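
This keeps the metastore service in the foreground and ties up that window. A common alternative is to run it in the background instead (the log path here is arbitrary):

nohup hive --service metastore > ~/metastore.log 2>&1 &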

4) Start pyspark:
[root@head42 conf]$ pyspark

5) Test:

>>> from pyspark.sql import HiveContext
>>> sqlContext = HiveContext(sc)
>>> read_hive_score = sqlContext.sql("select * from db_hive_test.student")
>>> read_hive_score.show()
+----+--------+
|  id|    name|
+----+--------+
|1001|zhangsan|
|1002|    lisi|
+----+--------+
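
HiveContext is the old Spark 1.x entry point and is deprecated in Spark 2.x. On a Spark 2.x pyspark shell the same test can be written against a Hive-enabled SparkSession, roughly like this (a sketch; in the shell, spark is usually already defined and getOrCreate() simply returns it):

>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.enableHiveSupport().getOrCreate()  # reuses the shell's session
>>> spark.sql("select * from db_hive_test.student").show()

Either way the query goes through the thrift metastore configured above, so the result is the same.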

And that's it: Spark SQL can now query the Hive table.

