Hadoop Study Notes (18): Hive Internal, External, Partitioned, and Bucketed Tables

2020-02-21 20:05

Internal tables:
1. Create the table:

create table stu(id int);

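2. Insert data:

Early Hive releases (before 0.14) do not support insert ... values statements, so data is usually loaded from an external file. For example, create a file named stu_data that holds one integer id per line.

The load command is as follows (/home/hadoop/Public/stu_data is the path to the file on the local filesystem):

load data local inpath '/home/hadoop/Public/stu_data' into table stu;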

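3. To create a table with several fields, specify the delimiter that separates them:

create table stu2(id int, name string) row format delimited fields terminated by ' ';

The data file to load must then use that delimiter, like this: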

1 john
2 tom
3 gary
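
How the delimiter maps each line onto the (id int, name string) columns can be sketched in Python. This is a simplified model, not Hive's actual reader: the function name parse_row is hypothetical, and the NULL handling (a malformed integer or a missing field becomes NULL instead of raising an error) is reduced to the two cases shown.

```python
def parse_row(line, delimiter=" "):
    """Split one text line into (id, name), mimicking a delimited (int, string) row."""
    fields = line.rstrip("\n").split(delimiter)
    # A value that is not a valid integer becomes NULL (None here), not an error.
    try:
        row_id = int(fields[0])
    except ValueError:
        row_id = None
    # A missing trailing field also becomes NULL.
    name = fields[1] if len(fields) > 1 else None
    return row_id, name


rows = [parse_row(line) for line in ["1 john", "2 tom", "3 gary"]]
print(rows)  # [(1, 'john'), (2, 'tom'), (3, 'gary')]
```

If the file used a different separator than the one declared in the table definition, every line would end up in the first field and the id column would read as NULL.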

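Data is loaded the same way as above.

Query the data:

select * from stu2;

If you drop local from the load statement:

load data inpath '/stu2_data' into table stu2;

the file is read from HDFS instead of the local filesystem, and it is moved (not copied) into the table's directory.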

4. Drop the table:

drop table stu;

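External tables:

The difference between internal and external tables shows up on drop table: dropping an internal table deletes its data along with the metadata, whereas dropping an external table removes only the metadata and leaves the data files in place.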
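
The drop behaviour can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: the ToyWarehouse class is hypothetical, the "metastore" and "filesystem" are plain dictionaries, and /user/hive/warehouse/stu is only the conventional default location of an internal table.

```python
class ToyWarehouse:
    """Toy model: a metastore (table metadata) plus a filesystem (path -> rows)."""

    def __init__(self):
        self.metastore = {}
        self.files = {}

    def create_table(self, name, location, external=False):
        self.metastore[name] = {"location": location, "external": external}
        self.files.setdefault(location, [])

    def drop_table(self, name):
        meta = self.metastore.pop(name)
        # Only an internal (managed) table takes its data with it.
        if not meta["external"]:
            self.files.pop(meta["location"], None)


wh = ToyWarehouse()
wh.create_table("stu", "/user/hive/warehouse/stu")
wh.create_table("stu_info", "/data_stu", external=True)
wh.drop_table("stu")
wh.drop_table("stu_info")
print(sorted(wh.files))  # ['/data_stu']
```

Both tables are gone from the metastore afterwards, but only the external table's directory is still present.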

1. Create the table structure:

create external table stu_info(id int, name string) row format delimited fields terminated by ' ' location '/data_stu';

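external: marks the table as an external table.
location: specifies where the data lives; note that this is a path in HDFS.

2. Write a file locally and upload it to /data in HDFS: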

1 gary
2 tom

hadoop fs -put /stu2_data /data/stu_data

Loading data works the same as for an internal table:

load data inpath '/data/stu_data' into table stu_info;

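Partitioned tables:
In Hive, each partition of a table corresponds to a subdirectory under the table's directory, and all of a partition's data is stored in that subdirectory.

For example, if the test table has two partition columns, date and city, the HDFS subdirectory for date=20130201, city=bj is: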
/warehouse/test/date=20130201/city=bj
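
The mapping from partition values to a directory can be sketched as a small Python function. The name partition_path is hypothetical, and the sketch ignores the escaping Hive applies to special characters in partition values; it only shows that each partition column contributes one column=value path segment, in the order the columns were declared.

```python
def partition_path(table_dir, partition_spec):
    """Build the partition directory from ordered (column, value) pairs."""
    segments = [f"{column}={value}" for column, value in partition_spec]
    return "/".join([table_dir.rstrip("/")] + segments)


path = partition_path("/warehouse/test", [("date", "20130201"), ("city", "bj")])
print(path)  # /warehouse/test/date=20130201/city=bj
```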

Create the table:

create table partition_tmp(id int, name string) partitioned by (d string) row format delimited fields terminated by ' ';

Then load data into a partition:

load data local inpath '/stu2_data' into table partition_tmp partition(d='0902');

You can query all partitions at once:

select * from partition_tmp;

The output looks like this (the same file had also been loaded into partition d='0901'):

hive> select * from partition_tmp;
OK
1   john    0901
2   tom     0901
3   gary    0901
1   john    0902
2   tom     0902
3   gary    0902

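To restrict the query to a single partition, filter on the partition column; Hive then reads only that partition's directory:

select * from partition_tmp where d='0901';

hive> select * from partition_tmp where d='0901'
    > ;
OK
1   john    0901
2   tom     0901
3   gary    0901
Time taken: 0.308 seconds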

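Bucketed tables:

A bucketed table hashes the value of the clustering column for each row and uses the result to place the row in one of a fixed number of files (buckets).

Create the table: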

create table bucket_tmp(id int) clustered by (id) into 4 buckets;
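
How ids would be spread over the 4 buckets can be sketched in Python. This assumes the classic scheme used by older Hive versions, where an int column hashes to its own value and the bucket is the non-negative hash modulo the bucket count; newer Hive versions may use a different hash function, so treat the function bucket_for as a hypothetical illustration rather than Hive's exact behaviour.

```python
JAVA_INT_MAX = 0x7FFFFFFF  # Integer.MAX_VALUE, used to clear the sign bit


def bucket_for(id_value, num_buckets):
    """Return the 0-based bucket index for an int id (assumed classic scheme)."""
    hash_code = id_value  # assumption: an int column hashes to its own value
    return (hash_code & JAVA_INT_MAX) % num_buckets


buckets = {}
for i in range(1, 9):
    buckets.setdefault(bucket_for(i, 4), []).append(i)

for index in sorted(buckets):
    print(index, buckets[index])
```

With ids 1 through 8, each of the 4 buckets receives two ids, for example ids 4 and 8 land in bucket 0.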

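Load the data. A bucketed table should be populated with insert ... select rather than load data, so that Hive can hash each row into its bucket. On Hive 1.x, enable bucketing enforcement first (the property was removed in Hive 2.0, where bucketing is always enforced). partition_stu below stands for any existing table with an id column, such as partition_tmp above:

set hive.enforce.bucketing=true;
insert into table bucket_tmp select id from partition_stu;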

Sampling query:

select * from bucket_tmp tablesample(bucket 1 out of 4 on id);
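
Which rows a bucket x out of y sample returns can be approximated in Python. This is a sketch under the same assumption as the classic bucketing scheme (an int id hashes to itself, and bucket numbers in the query are 1-based); the function tablesample_bucket is hypothetical and does not model how Hive picks which bucket files to read.

```python
def tablesample_bucket(ids, x, y):
    """Keep the ids that fall into bucket x (1-based) out of y buckets."""
    return [i for i in ids if (i & 0x7FFFFFFF) % y == x - 1]


sample = tablesample_bucket(range(1, 13), 1, 4)
print(sample)  # [4, 8, 12]
```

Because the table was created with 4 buckets on id, a sample of bucket 1 out of 4 on id lines up with a single bucket file, which is what makes this kind of sampling cheap.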
