Notes on an Unsuccessful Zipper Table (拉鍊表) Run

2019-05-10 02:19:37,565 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1556531708937_6923_r_000000_0: Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":"A904A900-AA07-A0C6-907A-CCC60CBA9AB7","_col1":426.02,"_col2":1,"_col3":426.02,"_col4":429.32,"_col5":0.0,"_col6":"FINISHED","_col7":"2019-04-19 12:13:25.0","_col8":"2019-04-19 00:35:19.0","_col9":22,"_col10":"A8092000-A044-46A9-400A-CB74427979C6","_col11":429.32,"_col12":53.05,"_col13":0.0,"_col14":30,"_col15":"轉-鳳靈瓏-20190419-54413","_col16":0.0,"_col17":3.3,"_col18":"2019-04-19","_col19":"FINISHED","_col20":"2019-04-19 00:35:19.0","_col21":"2019-04-19 14:07:35.0","_col22":"EE31263C-D894-45DA-9220-7288CB84CFD8","_col23":"Y","_col24":"DUE_QUIT","_col25":"N","_col26":"2019-05-08","_col27":"9999-12-31","_col28":"2019-05-08"}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:265)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":"A904A900-AA07-A0C6-907A-CCC60CBA9AB7","_col1":426.02,"_col2":1,"_col3":426.02,"_col4":429.32,"_col5":0.0,"_col6":"FINISHED","_col7":"2019-04-19 12:13:25.0","_col8":"2019-04-19 00:35:19.0","_col9":22,"_col10":"A8092000-A044-46A9-400A-CB74427979C6","_col11":429.32,"_col12":53.05,"_col13":0.0,"_col14":30,"_col15":"轉-鳳靈瓏-20190419-54413","_col16":0.0,"_col17":3.3,"_col18":"2019-04-19","_col19":"FINISHED","_col20":"2019-04-19 00:35:19.0","_col21":"2019-04-19 14:07:35.0","_col22":"EE31263C-D894-45DA-9220-7288CB84CFD8","_col23":"Y","_col24":"DUE_QUIT","_col25":"N","_col26":"2019-05-08","_col27":"9999-12-31","_col28":"2019-05-08"}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
	... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant parquet.hadoop.metadata.CompressionCodecName.SNAPPY;
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:527)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createNewPaths(FileSinkOperator.java:812)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:919)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:666)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
	at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)

The error above was thrown while building a zipper table that uses Parquet as the underlying storage, with the zipper update performed via INSERT OVERWRITE. The analysis: when the data was first loaded, everything landed in the initial partition, so that single partition was very large. The INSERT OVERWRITE then had to read all of that data out, join it, and write the entire result back, and the sheer volume caused severe data skew. The fix was to convert the partitioned table into a regular (non-partitioned) table and re-run the zipper job, which succeeded.
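
For reference, here is a minimal HiveQL sketch of the reworked flow. All table and column names (orders_zip, orders_inc, order_id, start_dt, end_dt) are hypothetical stand-ins, not from the original job; the start/end-date values mirror what _col26/_col27 appear to be in the row dump above.

-- All names below are hypothetical stand-ins for illustration only.
-- 1. Rebuild the zipper table as a plain (non-partitioned) Parquet table.
CREATE TABLE orders_zip (
  order_id  STRING,
  amount    DOUBLE,
  status    STRING,
  start_dt  STRING,  -- date this version became effective (cf. _col26)
  end_dt    STRING   -- '9999-12-31' while the version is current (cf. _col27)
)
STORED AS PARQUET
TBLPROPERTIES ('parquet.compression' = 'SNAPPY');

-- 2. One-pass rewrite: close out rows that changed today, keep the rest,
--    and append the new versions. Hive stages INSERT OVERWRITE output
--    before swapping it in, so reading orders_zip in the same statement works.
INSERT OVERWRITE TABLE orders_zip
SELECT * FROM (
  SELECT z.order_id, z.amount, z.status, z.start_dt,
         CASE WHEN z.end_dt = '9999-12-31' AND i.order_id IS NOT NULL
              THEN '2019-05-08'        -- close the superseded version
              ELSE z.end_dt
         END AS end_dt
  FROM orders_zip z
  LEFT JOIN orders_inc i ON z.order_id = i.order_id
  UNION ALL
  SELECT i.order_id, i.amount, i.status,
         '2019-05-08' AS start_dt,     -- today's load date
         '9999-12-31' AS end_dt
  FROM orders_inc i
) merged;

Because the rebuilt table has no partitions, the dynamic-partition output path visible in the stack trace above (FileSinkOperator.getDynOutPaths / createNewPaths) is no longer exercised during the whole-table rewrite.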
