Accessing a Hadoop cluster from a Windows client and reading HDFS files
Environment used: Eclipse, CDH 5.1.0, HDFS 2.3.0
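1. Create a new Java project. Locate core-site.xml and hdfs-site.xml on the cluster and copy them into the Java project so that they end up in the bin folder. To do this, right-click src and create a new source folder for them; Eclipse copies the contents of a source folder into bin when it builds the project.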
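Before touching HDFS, it can help to confirm that the two files really are visible on the classpath, because that is where Hadoop's Configuration looks for core-site.xml. The following is a minimal sketch that uses only the JDK. The class name ClasspathCheck and the helper missingResources are hypothetical names chosen for illustration; they are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): reports which resource names
// cannot be found on the classpath of the given class loader.
final class ClasspathCheck {

    static List<String> missingResources(ClassLoader loader, String... names) {
        List<String> missing = new ArrayList<String>();
        for (String name : names) {
            // getResource returns null when the resource is not on the classpath
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        List<String> missing =
                missingResources(loader, "core-site.xml", "hdfs-site.xml");
        if (missing.isEmpty()) {
            System.out.println("Both configuration files are on the classpath.");
        } else {
            System.out.println("Not found on the classpath: " + missing);
        }
    }
}
```

If a file is reported as missing, the source folder that holds it has probably not been added to the build path, so it was never copied to bin.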
2. The code is as follows:
package com.mail;

import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class Hdfs {
    public static void main(String[] args) {
        try {
            // new Configuration() reads core-site.xml from the classpath
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/tmp/data/mllib/kmeans_data.txt");

            // Stop early if the file is not there
            if (!fs.exists(path)) {
                System.out.println("File not found: " + path);
                return;
            }
            System.out.println("********************");

            InputStream in = null;
            try {
                in = fs.open(path);
                // Pass false so that copyBytes does not close System.out
                IOUtils.copyBytes(in, System.out, 4096, false);
            } finally {
                IOUtils.closeStream(in);
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
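3. Once the program is written, running it runs into quite a few problems:
1) org/apache/commons/logging/LogFactory: commons-logging-1.0.4.jar is missing.
2) com/google/common/collect/Interners: the Google Guava jar is missing; I downloaded guava-18.0.jar.
3) NoClassDefFoundError: org/apache/commons/configuration/Configuration: the commons-configuration jar is missing.
4) DistributedFileSystem could not be instantiated, org.apache.hadoop.conf.Configuration.addDeprecations: this is caused by the HDFS jar version on the client not matching the cluster version.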
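The first three errors above each name a class that could not be found, so one way to catch them all in a single run is to probe for those classes up front. The sketch below uses only the JDK. The class name DependencyCheck and the helper isLoadable are hypothetical, and the jar names in the map are the ones the errors above suggest, not an exhaustive list of what a Hadoop client needs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): checks whether the classes
// named in the error messages can be found on the classpath.
final class DependencyCheck {

    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // e.g. NoClassDefFoundError when a dependency of the class is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // class name -> jar that is expected to provide it (assumed mapping)
        Map<String, String> required = new LinkedHashMap<String, String>();
        required.put("org.apache.commons.logging.LogFactory", "commons-logging");
        required.put("com.google.common.collect.Interners", "guava");
        required.put("org.apache.commons.configuration.Configuration", "commons-configuration");
        required.put("org.apache.hadoop.hdfs.DistributedFileSystem", "hadoop-hdfs");

        for (Map.Entry<String, String> entry : required.entrySet()) {
            String state = isLoadable(entry.getKey()) ? "found  " : "MISSING";
            System.out.println(state + " " + entry.getKey() + " (" + entry.getValue() + ")");
        }
    }
}
```

This check only reports whether a class is present. It cannot detect the version mismatch from item 4, which still has to be resolved by using client jars that match the cluster version.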
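4. Read files from HDFS, iterate over the files in a folder, create new files in another folder and write content to them. The code is as follows: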
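import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Hdfs {
    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path inputDir = new Path("/tmp/daily_mail/CN/sql/");
            Path outputDir = new Path("/tmp/daily_mail/CN/hql/");

            // Iterate over every entry in the input folder
            for (FileStatus status : fs.listStatus(inputDir)) {
                if (!status.isFile()) {
                    continue; // skip sub-directories
                }
                // Read the file line by line and join the lines with spaces
                // (assumes the files are UTF-8 encoded)
                StringBuilder sql = new StringBuilder();
                BufferedReader br = new BufferedReader(
                        new InputStreamReader(fs.open(status.getPath()), StandardCharsets.UTF_8));
                try {
                    String line;
                    while ((line = br.readLine()) != null) {
                        sql.append(line).append(" ");
                    }
                } finally {
                    br.close();
                }
                System.out.println(sql);

                // create(path, true) overwrites any existing file. deleteOnExit is
                // avoided: it would remove the file again when the FileSystem closes.
                Path target = new Path(outputDir, status.getPath().getName());
                FSDataOutputStream out = fs.create(target, true);
                try {
                    out.write(sql.toString().getBytes(StandardCharsets.UTF_8));
                } finally {
                    out.close();
                }
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}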
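The read loop in step 4 works on any java.io.InputStream, and the stream returned by FileSystem.open is one. The line-joining step can therefore be tried out without a cluster. The following is a minimal sketch under that assumption: the class name LineJoiner and the method joinLines are hypothetical, the sample text is made up, and UTF-8 is assumed as the file encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads a stream line by line and joins the lines,
// appending a single space after each one (as the loop in step 4 does).
final class LineJoiner {

    static String joinLines(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader and the underlying stream
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append(' ');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for an HDFS file (made-up sample content)
        byte[] data = "SELECT *\nFROM t\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(joinLines(new ByteArrayInputStream(data)));
    }
}
```

Using a StringBuilder avoids building a new String on every line, and naming the charset explicitly keeps the result independent of the default encoding on the Windows client.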