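Prerequisite: Hive's default username and password are empty and have not been changed here; they can be configured in hive-site.xml.
(1) Install the required Python libraries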
pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive
While installing sasl, you may hit the following error:
error: command 'gcc' failed with exit status
Fix: on Ubuntu you may need to install libsasl2-dev first; on CentOS, install python-devel and cyrus-sasl-devel first (command below). Then rerun pip install sasl.
yum install gcc-c++ python-devel.x86_64 cyrus-sasl-devel.x86_64
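Once the installs finish, it can help to confirm that all four libraries are visible to the interpreter you plan to run the script with. The following is a minimal sketch, not part of the original steps; it assumes the import names are sasl, thrift, thrift_sasl and pyhive (pip package names and import names differ for thrift-sasl and PyHive), and find_missing is a hypothetical helper name.

```python
import importlib.util

# Import names (not pip package names) of the libraries installed above.
# Assumption: thrift-sasl is imported as "thrift_sasl" and PyHive as "pyhive".
REQUIRED_MODULES = ["sasl", "thrift", "thrift_sasl", "pyhive"]


def find_missing(modules):
    """Return the top-level module names that cannot be located by this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If sasl shows up as missing, the gcc error above is the likely cause, and installing the development headers before rerunning pip is the first thing to try.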
(2) Start the Hive metastore and hiveserver2
hive --service metastore &
hive --service hiveserver2 &
When hiveserver2 starts normally it listens on port 10000 by default. You can check whether it is up with the following command:
netstat -anp | grep 10000
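The same check can be done from the client machine in Python, which is useful when the script runs on a different host than hiveserver2. This is a rough sketch using only the standard library; is_port_open is a hypothetical helper name, and a successful connection only shows that something is listening on the port, not that it is hiveserver2.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP connect
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures
        return False
```

For example, is_port_open('hadoop000', 10000) should return True once hiveserver2 has finished starting, assuming hadoop000 resolves from the machine where the check runs. hiveserver2 can take a while to open the port after launch, so a False result right after startup is not necessarily an error.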
(3) Write the Python script
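#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pyhive import hive

# Connect to hiveserver2; no username or password is passed
conn = hive.Connection(host='hadoop000', port=10000, database='***')
cursor = conn.cursor()
cursor.execute('select * from user_log limit 10')
# Print each returned row (print() works on both Python 2 and 3)
for each in cursor.fetchall():
    print(each)
# Release the cursor and the connection
cursor.close()
conn.close()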
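The script prints each row as a plain tuple. If you would rather work with column names, the sketch below shows one possible variation. rows_as_dicts and fetch_sample are hypothetical helper names, not part of PyHive; the first relies only on the standard DB-API cursor.description attribute, and the second assumes the same user_log table as the script and takes the database name as an argument.

```python
def rows_as_dicts(cursor):
    """Convert the rows of an already executed DB-API cursor into dicts keyed by column name."""
    # cursor.description is a sequence of 7-item tuples; item 0 is the column name
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def fetch_sample(host, database, port=10000, limit=10):
    """Hypothetical helper: run the sample query and always close the cursor and connection."""
    # Imported here so rows_as_dicts can be used without PyHive installed
    from pyhive import hive

    conn = hive.Connection(host=host, port=port, database=database)
    try:
        cursor = conn.cursor()
        try:
            # int() keeps the limit numeric before it is placed in the query text
            cursor.execute("select * from user_log limit %d" % int(limit))
            return rows_as_dicts(cursor)
        finally:
            cursor.close()
    finally:
        conn.close()
```

Depending on the Hive configuration, the column names returned for select * may carry a table prefix (for example user_log.some_column), so it is worth inspecting the keys of the first dict before depending on them.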