Personal notes (CDH 5.14; these notes supplement the reposted steps further below):
config.mk for CDH 5.14
Change the settings in config.mk to the following:
USE_HDFS = 1
HDFS_LIB_PATH = /home/user/xgboost/xgboost-package/libhdfs/lib
HADOOP_HOME = /opt/cloudera/parcels/CDH
HADOOP_HDFS_HOME = /opt/cloudera/parcels/CDH
Environment variables
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HADOOP_HOME=/opt/cloudera/parcels/CDH
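Before building, it may help to confirm that both variables are visible to the process that runs the build. The following is only a minimal sketch: the helper name missing_hadoop_vars is hypothetical and not part of xgboost or dmlc-core, and it only checks that the variables are set, not that the paths exist.

```python
import os

# The two variables exported above.
REQUIRED_VARS = ("HADOOP_CONF_DIR", "HADOOP_HOME")


def missing_hadoop_vars(env):
    """Return the names in REQUIRED_VARS that are unset or empty in env.

    Hypothetical helper for illustration; env is any mapping, such as os.environ.
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_hadoop_vars(os.environ)
    if missing:
        print("not set: " + ", ".join(missing))
    else:
        print("HADOOP_CONF_DIR and HADOOP_HOME are set")
```

Running it in the same shell where the export commands were issued shows whether the settings took effect.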
Modify yarn.py
Edit xgboost-package/dmlc-core/tracker/dmlc_tracker/yarn.py
and change line 48 to:
out = out.decode('utf-8').split('\n')[0].split()
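The decode call is presumably needed because, under Python 3, output read from a subprocess pipe arrives as bytes, and splitting bytes with a str separator raises a TypeError. The sketch below illustrates this under assumptions: the function name first_line_tokens and the sample version string are made up for illustration and are not taken from yarn.py.

```python
def first_line_tokens(raw):
    """Decode bytes read from a pipe and split the first line on whitespace."""
    return raw.decode('utf-8').split('\n')[0].split()


# Assumed sample output; the real command output on a cluster will differ.
sample = b"Hadoop 2.6.0-cdh5.14.0\nSubversion ..."
tokens = first_line_tokens(sample)

# Without decode(), Python 3 rejects a str separator on a bytes object.
try:
    sample.split('\n')
except TypeError:
    undecoded_fails = True
else:
    undecoded_fails = False

print(tokens)
print(undecoded_fails)
```

Under Python 2 the undecoded split would succeed, which may be why the original line worked there.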
Build dmlc-yarn.jar
The "No FileSystem for scheme: hdfs" error
Modify this file under the xgboost directory: /xgboost-package/dmlc-core/tracker/yarn/src/main/java/org/apache/hadoop/yarn/dmlc/Client.java
/**
* constructor
* @throws I