Sqoop: importing MySQL data into Hive

1. View the MySQL table data

mysql> select * from stu;
+----------+------+----------+
| name     | age  | address  |
+----------+------+----------+
| zhangsan |   20 | henan    |
| lisi     |   20 | hebei    |
| wangwu   |   20 | beijing  |
| liuqi    |   20 | shandong |
| xuwei    |   20 | fujian   |
+----------+------+----------+
5 rows in set (0.00 sec)

mysql>

2. Create a table in Hive with the same column names and types

hive> create table default.mysql_stu(name string,age int,address string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 0.07 seconds
hive> select * from mysql_stu;
OK
Time taken: 0.054 seconds
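
The '\t' in the DDL above is the tab character (octal 011); the CREATE TABLE statement that Sqoop generates in the step 3 log writes the same delimiter as '\011'. As a small illustration, assuming the first row of stu as sample data, this is how one row of such a tab-delimited file can be split:

```shell
# One row as it would appear in a tab-delimited text file; \011 is the octal escape for tab.
row=$(printf 'zhangsan\01120\011henan')

# cut splits on tab by default, so field 2 is the age column.
age=$(printf '%s\n' "$row" | cut -f2)
printf 'age=%s\n' "$age"
```

This prints age=20. If the delimiter declared in Hive did not match the one Sqoop writes, each row would typically land in the first column, with the remaining columns NULL.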

3. Run the sqoop command

[root@master bin]# ./sqoop import --connect jdbc:mysql://192.168.230.21:3306/mysql --username root --password 123456 --table stu --fields-terminated-by '\t' --delete-target-dir --num-mappers 1 --hive-import --hive-database default --hive-table mysql_stu
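Explanation of some of the options:
--table stu \  # the MySQL table to import
--fields-terminated-by '\t' \  # separate fields with a tab character
--delete-target-dir \  # delete the HDFS target directory if it already exists
--num-mappers 1 \  # run a single map task
--hive-import \  # load the imported data into Hive
--hive-database default \  # import into the default database
--hive-table mysql_stu  # the target Hive table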
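
The log in this step warns that putting the password on the command line is insecure and suggests -P, which prompts for it instead. A minimal sketch of the same import written one option per line with -P follows. The helper names sqoop_cmd and import_stu and the SQOOP_BIN variable are assumptions made for this example, not part of Sqoop; by default the script only prints the arguments (a dry run).

```shell
# Dry run by default: print one argument per line instead of calling Sqoop.
# SQOOP_BIN is a hypothetical variable; set SQOOP_BIN=./sqoop to really run the import.
sqoop_cmd() {
  if [ -n "${SQOOP_BIN:-}" ]; then
    "$SQOOP_BIN" "$@"
  else
    printf '%s\n' "$@"
  fi
}

# Same options as the command above, except that -P replaces --password 123456.
import_stu() {
  sqoop_cmd import \
    --connect "jdbc:mysql://192.168.230.21:3306/mysql" \
    --username root \
    -P \
    --table stu \
    --fields-terminated-by '\t' \
    --delete-target-dir \
    --num-mappers 1 \
    --hive-import \
    --hive-database default \
    --hive-table mysql_stu
}

import_stu
```

Keeping the options in a function makes it easy to review them before running the job for real with SQOOP_BIN=./sqoop.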

 

[root@master bin]# ./sqoop import --connect jdbc:mysql://192.168.230.21:3306/mysql --username root --password 123456 --table stu --fields-terminated-by '\t' --delete-target-dir --num-mappers 1 --hive-import --hive-database default --hive-table mysql_stu
Warning: /opt/softWare/sqoop/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/softWare/sqoop/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/softWare/sqoop/sqoop-1.4.7/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
20/06/29 17:49:34 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
20/06/29 17:49:34 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
20/06/29 17:49:34 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
20/06/29 17:49:35 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/softWare/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/softWare/hbase/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/softWare/hive/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
20/06/29 17:49:35 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `stu` AS t LIMIT 1
20/06/29 17:49:35 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `stu` AS t LIMIT 1
20/06/29 17:49:35 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/softWare/hadoop/hadoop-2.7.3
Note: /tmp/sqoop-root/compile/b4f789b81ef3ba08a63dc2fd15a7f73e/stu.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
20/06/29 17:49:37 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/b4f789b81ef3ba08a63dc2fd15a7f73e/stu.jar
20/06/29 17:49:37 INFO tool.ImportTool: Destination directory stu is not present, hence not deleting.
20/06/29 17:49:37 WARN manager.MySQLManager: It looks like you are importing from mysql.
20/06/29 17:49:37 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
20/06/29 17:49:37 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
20/06/29 17:49:37 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
20/06/29 17:49:37 INFO mapreduce.ImportJobBase: Beginning import of stu
20/06/29 17:49:37 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
20/06/29 17:49:37 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
20/06/29 17:49:37 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.230.21:8032
20/06/29 17:49:43 INFO db.DBInputFormat: Using read commited transaction isolation
20/06/29 17:49:43 INFO mapreduce.JobSubmitter: number of splits:1
20/06/29 17:49:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1593394152340_0031
20/06/29 17:49:43 INFO impl.YarnClientImpl: Submitted application application_1593394152340_0031
20/06/29 17:49:43 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1593394152340_0031/
20/06/29 17:49:43 INFO mapreduce.Job: Running job: job_1593394152340_0031
20/06/29 17:49:53 INFO mapreduce.Job: Job job_1593394152340_0031 running in uber mode : false
20/06/29 17:49:53 INFO mapreduce.Job:  map 0% reduce 0%
20/06/29 17:50:00 INFO mapreduce.Job:  map 100% reduce 0%
20/06/29 17:50:00 INFO mapreduce.Job: Job job_1593394152340_0031 completed successfully
20/06/29 17:50:00 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=138013
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=84
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=4564
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=4564
		Total vcore-milliseconds taken by all map tasks=4564
		Total megabyte-milliseconds taken by all map tasks=4673536
	Map-Reduce Framework
		Map input records=5
		Map output records=5
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=89
		CPU time spent (ms)=920
		Physical memory (bytes) snapshot=115523584
		Virtual memory (bytes) snapshot=2082172928
		Total committed heap usage (bytes)=40632320
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=84
20/06/29 17:50:00 INFO mapreduce.ImportJobBase: Transferred 84 bytes in 22.7439 seconds (3.6933 bytes/sec)
20/06/29 17:50:00 INFO mapreduce.ImportJobBase: Retrieved 5 records.
20/06/29 17:50:00 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table stu
20/06/29 17:50:00 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `stu` AS t LIMIT 1
20/06/29 17:50:00 INFO hive.HiveImport: Loading uploaded data into Hive
20/06/29 17:50:00 INFO conf.HiveConf: Found configuration file file:/opt/softWare/sqoop/sqoop-1.4.7/conf/hive-site.xml

Logging initialized using configuration in jar:file:/opt/softWare/hive/apache-hive-2.1.1-bin/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
20/06/29 17:50:02 INFO SessionState: 
Logging initialized using configuration in jar:file:/opt/softWare/hive/apache-hive-2.1.1-bin/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
20/06/29 17:50:02 INFO metastore.HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
20/06/29 17:50:02 INFO metastore.ObjectStore: ObjectStore, initialize called
20/06/29 17:50:02 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
20/06/29 17:50:02 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
20/06/29 17:50:03 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
20/06/29 17:50:05 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
20/06/29 17:50:05 INFO metastore.ObjectStore: Initialized ObjectStore
20/06/29 17:50:05 INFO metastore.HiveMetaStore: Added admin role in metastore
20/06/29 17:50:05 INFO metastore.HiveMetaStore: Added public role in metastore
20/06/29 17:50:05 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
20/06/29 17:50:06 INFO metastore.HiveMetaStore: 0: get_all_functions
20/06/29 17:50:06 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=get_all_functions	
20/06/29 17:50:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:06 INFO session.SessionState: Created local directory: /tmp/root/8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/8bead44f-cae5-47e3-a44a-4b3bdbc75795/_tmp_space.db
20/06/29 17:50:06 INFO conf.HiveConf: Using the default value passed in for log id: 8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:06 INFO session.SessionState: Updating thread name to 8bead44f-cae5-47e3-a44a-4b3bdbc75795 main
20/06/29 17:50:06 INFO conf.HiveConf: Using the default value passed in for log id: 8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:06 INFO ql.Driver: Compiling command(queryId=root_20200629175006_a0bfcb64-07b3-4757-b9fd-ae9b3f09a8be): CREATE TABLE IF NOT EXISTS `default`.`mysql_stu` ( `name` STRING, `age` INT, `address` STRING) COMMENT 'Imported by sqoop on 2020/06/29 17:50:00' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\011' LINES TERMINATED BY '\012' STORED AS TEXTFILE
20/06/29 17:50:07 INFO parse.CalcitePlanner: Starting Semantic Analysis
20/06/29 17:50:07 INFO parse.CalcitePlanner: Creating table default.mysql_stu position=27
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=get_table : db=default tbl=mysql_stu	
20/06/29 17:50:07 INFO ql.Driver: Semantic Analysis Completed
20/06/29 17:50:07 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
20/06/29 17:50:07 INFO ql.Driver: Completed compiling command(queryId=root_20200629175006_a0bfcb64-07b3-4757-b9fd-ae9b3f09a8be); Time taken: 1.151 seconds
20/06/29 17:50:07 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
20/06/29 17:50:07 INFO ql.Driver: Executing command(queryId=root_20200629175006_a0bfcb64-07b3-4757-b9fd-ae9b3f09a8be): CREATE TABLE IF NOT EXISTS `default`.`mysql_stu` ( `name` STRING, `age` INT, `address` STRING) COMMENT 'Imported by sqoop on 2020/06/29 17:50:00' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\011' LINES TERMINATED BY '\012' STORED AS TEXTFILE
20/06/29 17:50:07 INFO ql.Driver: Completed executing command(queryId=root_20200629175006_a0bfcb64-07b3-4757-b9fd-ae9b3f09a8be); Time taken: 0.008 seconds
OK
20/06/29 17:50:07 INFO ql.Driver: OK
Time taken: 1.166 seconds
20/06/29 17:50:07 INFO CliDriver: Time taken: 1.166 seconds
20/06/29 17:50:07 INFO conf.HiveConf: Using the default value passed in for log id: 8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:07 INFO session.SessionState: Resetting thread name to  main
20/06/29 17:50:07 INFO conf.HiveConf: Using the default value passed in for log id: 8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:07 INFO session.SessionState: Updating thread name to 8bead44f-cae5-47e3-a44a-4b3bdbc75795 main
20/06/29 17:50:07 INFO ql.Driver: Compiling command(queryId=root_20200629175007_e3c0af73-0f16-4f55-b7cb-d96899f52ece): 
LOAD DATA INPATH 'hdfs://master:9000/user/root/stu' INTO TABLE `default`.`mysql_stu`
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=get_table : db=default tbl=mysql_stu	
20/06/29 17:50:07 INFO ql.Driver: Semantic Analysis Completed
20/06/29 17:50:07 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
20/06/29 17:50:07 INFO ql.Driver: Completed compiling command(queryId=root_20200629175007_e3c0af73-0f16-4f55-b7cb-d96899f52ece); Time taken: 0.342 seconds
20/06/29 17:50:07 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
20/06/29 17:50:07 INFO ql.Driver: Executing command(queryId=root_20200629175007_e3c0af73-0f16-4f55-b7cb-d96899f52ece): 
LOAD DATA INPATH 'hdfs://master:9000/user/root/stu' INTO TABLE `default`.`mysql_stu`
20/06/29 17:50:07 INFO ql.Driver: Starting task [Stage-0:MOVE] in serial mode
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: Cleaning up thread local RawStore...
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=Cleaning up thread local RawStore...	
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: Done cleaning up thread local RawStore
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=Done cleaning up thread local RawStore	
Loading data to table default.mysql_stu
20/06/29 17:50:07 INFO exec.Task: Loading data to table default.mysql_stu from hdfs://master:9000/user/root/stu
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=get_table : db=default tbl=mysql_stu	
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
20/06/29 17:50:07 INFO metastore.ObjectStore: ObjectStore, initialize called
20/06/29 17:50:07 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
20/06/29 17:50:07 INFO metastore.ObjectStore: Initialized ObjectStore
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=get_table : db=default tbl=mysql_stu	
20/06/29 17:50:07 ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: alter_table: db=default tbl=mysql_stu newtbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=alter_table: db=default tbl=mysql_stu newtbl=mysql_stu	
20/06/29 17:50:07 INFO ql.Driver: Starting task [Stage-1:STATS] in serial mode
20/06/29 17:50:07 INFO exec.StatsTask: Executing stats task
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=get_table : db=default tbl=mysql_stu	
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=get_table : db=default tbl=mysql_stu	
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: alter_table: db=default tbl=mysql_stu newtbl=mysql_stu
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=alter_table: db=default tbl=mysql_stu newtbl=mysql_stu	
20/06/29 17:50:07 INFO hive.log: Updating table stats fast for mysql_stu
20/06/29 17:50:07 INFO hive.log: Updated size of table mysql_stu to 84
20/06/29 17:50:07 INFO exec.StatsTask: Table default.mysql_stu stats: [numFiles=1, numRows=0, totalSize=84, rawDataSize=0]
20/06/29 17:50:07 INFO ql.Driver: Completed executing command(queryId=root_20200629175007_e3c0af73-0f16-4f55-b7cb-d96899f52ece); Time taken: 0.259 seconds
OK
20/06/29 17:50:07 INFO ql.Driver: OK
Time taken: 0.602 seconds
20/06/29 17:50:07 INFO CliDriver: Time taken: 0.602 seconds
20/06/29 17:50:07 INFO conf.HiveConf: Using the default value passed in for log id: 8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:07 INFO session.SessionState: Resetting thread name to  main
20/06/29 17:50:07 INFO conf.HiveConf: Using the default value passed in for log id: 8bead44f-cae5-47e3-a44a-4b3bdbc75795
20/06/29 17:50:07 INFO session.SessionState: Deleted directory: /tmp/hive/root/8bead44f-cae5-47e3-a44a-4b3bdbc75795 on fs with scheme hdfs
20/06/29 17:50:07 INFO session.SessionState: Deleted directory: /tmp/root/8bead44f-cae5-47e3-a44a-4b3bdbc75795 on fs with scheme file
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: Cleaning up thread local RawStore...
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=Cleaning up thread local RawStore...	
20/06/29 17:50:07 INFO metastore.HiveMetaStore: 0: Done cleaning up thread local RawStore
20/06/29 17:50:07 INFO HiveMetaStore.audit: ugi=root	ip=unknown-ip-addr	cmd=Done cleaning up thread local RawStore	
20/06/29 17:50:07 INFO hive.HiveImport: Hive import complete.
20/06/29 17:50:07 INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory.
[root@master bin]# 
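
Most of the output above is routine; the lines that confirm the result are "Retrieved 5 records." and "Hive import complete.". A minimal sketch for checking them automatically, assuming the console output was saved to a file (the file name sqoop_import.log and both function names are hypothetical):

```shell
# Print the record count from Sqoop's "Retrieved N records." line (reads a log on stdin).
retrieved_records() {
  sed -n 's/.*ImportJobBase: Retrieved \([0-9][0-9]*\) records\..*/\1/p'
}

# Succeed only if the log on stdin contains Sqoop's final Hive import message.
hive_import_completed() {
  grep -q 'hive.HiveImport: Hive import complete\.'
}
```

Usage, once the log has been captured (redirecting both stdout and stderr is the safe choice): `retrieved_records < sqoop_import.log` prints the count, and `hive_import_completed < sqoop_import.log && echo done` reports completion.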

4. View the imported data in the Hive table

hive> select * from mysql_stu;
OK
zhangsan	20	henan
lisi	20	hebei
wangwu	20	beijing
liuqi	20	shandong
xuwei	20	fujian
Time taken: 0.068 seconds, Fetched: 5 row(s)
hive>

 
