Hadoop-Hive: Common Basic HQL Statements

2018.07.16 22:52

I. Databases

1. List databases

show databases ;

2. Switch to a specific database

use default;

3. View a database's description

desc database extended db_hive_03 ;

II. Tables

1. List tables

show tables ;

2. View a table's description:

desc student ;
desc extended student ;
desc formatted student ;

3. Create a table and load a local file into it

create table student(
id int, 
name string) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data local inpath '/opt/datas/student.txt' into table student ;

4. Create a table that copies another table's structure and data (create table ... as select)

create table if not exists default.dept_cats as select * from dept ;

5. Create a new, empty table using another table's structure

create table if not exists default.dept_like like default.dept ;

6. Truncate a table (delete all of its rows):

truncate table dept_cats ;

7. Drop a table

drop table if exists dept_like_rename ;

8. Rename a table

alter table dept_like rename to dept_like_rename ;

9. Query a table

select * from student ;
select id from student ;

III. Functions:

1. List the available functions

show functions ;

2. View a function's description

desc function upper ;

3. View a function's extended description

desc function extended upper ;

4. Try out a function

select id ,upper(name) uname from db_hive.student ;

IV. Advanced:

1. Create an external table, specifying the location of its data files and the field delimiter:

create EXTERNAL table IF NOT EXISTS default.emp_ext2(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
location '/user/hive/warehouse/emp_ext2';

2. Create a partitioned table:

create EXTERNAL table IF NOT EXISTS default.emp_partition(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
partitioned by (month string,day string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ;

3. Load data into a partition of the table:

load data local inpath '/usr/local/app/hive_test/emp.txt' into table default.emp_partition partition (month='201805',day='31') ;

4. List the table's partitions:

show partitions emp_partition ;

5. Query the data in one partition:

select * from emp_partition where month = '201805' and day = '31' ;

6. Load data into Hive:

1) Load a local file into a Hive table
load data local inpath '/opt/datas/emp.txt' into table default.emp ;

2) Load an HDFS file into Hive (appends to the data already in the table)
load data inpath '/user/beifeng/hive/datas/emp.txt' into table default.emp ;

3) Load data and overwrite the data already in the table
load data inpath '/user/beifeng/hive/datas/emp.txt' overwrite into table default.emp ;

4) Create a table, then load it with insert
create table default.emp_ci like emp ;
insert into table default.emp_ci select * from default.emp ;

5) Specify the data location with location when creating the table
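
Option 5 comes with no statement of its own, so here is a minimal sketch. It assumes the data files already sit in an HDFS directory; the table name emp_loc and its path are hypothetical, chosen only for illustration, and the columns mirror the emp tables used elsewhere in this post.

```sql
-- Hypothetical table name and HDFS path, for illustration only.
create table if not exists default.emp_loc(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
location '/user/hive/warehouse/emp_loc' ;
```

Hive reads whatever files are present in that directory, so no separate load step is needed.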

7. Export from Hive to files:

insert overwrite local directory '/opt/datas/hive_exp_emp'
select * from default.emp ;

insert overwrite local directory '/opt/datas/hive_exp_emp2'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY '\n'
select * from default.emp ;

bin/hive -e "select * from default.emp ;" > /opt/datas/exp_res.txt

8. Export query results to a directory (without the local keyword the target is an HDFS directory, not a local one):

insert overwrite directory '/hive_test/export_emp.txt' select * from emp;

select * from emp ;
select t.empno, t.ename, t.deptno from emp t ;

V. Advanced queries:

1. = >= <= between and

select * from emp limit 5 ;
select t.empno, t.ename, t.deptno from emp t where  t.sal between 800 and 1500 ;
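
The heading also lists =, >= and <=, which the statements above do not use. A short sketch of each against the same emp table follows; the literal values (10, 1500, 800) are assumed sample values, not taken from any real data set.

```sql
-- Equality: employees in one department (department number 10 is an assumed value).
select t.empno, t.ename, t.deptno from emp t where t.deptno = 10 ;

-- Greater than or equal / less than or equal on salary (thresholds are assumed values).
select t.empno, t.ename, t.sal from emp t where t.sal >= 1500 ;
select t.empno, t.ename, t.sal from emp t where t.sal <= 800 ;
```

Note that between 800 and 1500 includes both endpoints, so it is equivalent to t.sal >= 800 and t.sal <= 1500.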

2. is null / is not null /in /not in

select t.empno, t.ename, t.deptno from emp t where comm is null ;
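
Only is null is shown above. The other three operators from the heading could look like this; the department numbers 10 and 20 are assumed sample values.

```sql
-- Employees who do have a commission recorded.
select t.empno, t.ename, t.deptno from emp t where t.comm is not null ;

-- Employees in one of the listed departments (10 and 20 are assumed values).
select t.empno, t.ename, t.deptno from emp t where t.deptno in (10, 20) ;

-- Employees in any other department.
select t.empno, t.ename, t.deptno from emp t where t.deptno not in (10, 20) ;
```

A row whose deptno is null is returned by neither the in query nor the not in query, because a comparison with null never evaluates to true.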

3. max/min/count/sum/avg

select count(*) cnt from emp ;
select max(sal) max_sal from emp ;
select sum(sal) from emp ;
select avg(sal) from emp ;

4. group by / having (grouping)

emp table
* Average salary of each department
select t.deptno, avg(t.sal) avg_sal from emp t group by t.deptno ;

* Highest salary for each job in each department
select t.deptno, t.job, max(t.sal) max_sal from emp t group by t.deptno, t.job ;

5. having

* where filters individual rows, before grouping
* having filters the groups produced by group by

Find the departments whose average salary is greater than 2000

select deptno, avg(sal) from emp group by deptno ;
select deptno, avg(sal) avg_sal from emp group by deptno having avg_sal > 2000;
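
To make the difference between where and having concrete, both can appear in one query. This is a sketch only; the 1000 threshold in the where clause is an assumed value added for illustration.

```sql
-- where runs first, on individual rows: rows with sal <= 1000 are dropped (assumed threshold).
-- group by then forms one group per department from the remaining rows.
-- having runs last, on each group: only departments averaging above 2000 are kept.
select deptno, avg(sal) avg_sal
from emp
where sal > 1000
group by deptno
having avg(sal) > 2000 ;
```

Because the where filter is applied before aggregation, the averages here are computed over the remaining rows only and can differ from those in the query above.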

6. join: joining two tables

## Equi-join: join ... on
select e.empno, e.ename, d.deptno, d.dname from emp e join dept d on e.deptno = d.deptno ;

## Left join: left join
select e.empno, e.ename, d.deptno, d.dname  from emp e left join dept d on e.deptno = d.deptno ;

## Right join: right join
select e.empno, e.ename, e.deptno, d.dname  from emp e right join dept d on e.deptno = d.deptno ;

## Full join: full join
select e.empno, e.ename, e.deptno, d.dname  from emp e full join dept d on e.deptno = d.deptno ;

VI. Client configuration, startup, and shutdown

1. Command to exit the CLI client:

exit ;

# To leave Hive, use exit rather than pressing ctrl+c directly; otherwise the process keeps running and only the window is closed.

2. Set configuration properties when starting Hive

$ bin/hive --hiveconf <property=value>

3. View all current configuration settings, or a single one by name

hive > set ;

hive (db_hive)> set system:user.name ;
system:user.name=beifeng

4. View help

[beifeng@hadoop-senior hive-0.13.1]$ bin/hive -help

5. Execute an SQL statement from the shell

* bin/hive -e <quoted-query-string>
eg:
bin/hive -e "select * from db_hive.student ;"

6. Execute the statements in a file

* bin/hive -f <filename>
eg:
$ touch /opt/datas/hivef.sql
# contents of /opt/datas/hivef.sql:
select * from db_hive.student ;
$ bin/hive -f /opt/datas/hivef.sql

# Redirect the results into a file
$ bin/hive -f /opt/datas/hivef.sql > /opt/datas/hivef-res.txt

7. Browse the HDFS file system from the Hive CLI

hive (default)> dfs -ls / ;

8. Browse the local file system from the Hive CLI

hive (default)> !ls /opt/datas ;
