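Accessing HDFS from a Windows Client

This post shows how to connect to a Hadoop cluster from a Windows client and read files stored in HDFS.

Environment: Eclipse, CDH 5.1.0, HDFS 2.3.0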

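1. Create a new Java project.

2. Find core-site.xml and hdfs-site.xml on the cluster and copy them into the Java project so that they end up in the bin folder.

To do this, right-click src and create a new source folder to hold the two files.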
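Hadoop's Configuration looks up core-site.xml on the classpath, which is why the files have to end up in the output folder. The following is a minimal sketch, using only the JDK, for confirming that both files are actually visible before touching any Hadoop API. The class name ConfigOnClasspath and the method missing are made up for this illustration and are not part of Hadoop.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a Hadoop class): reports which resources
// cannot be found on the classpath of the running program.
final class ConfigOnClasspath {

    // Returns the subset of names that the class loader cannot locate.
    static List<String> missing(List<String> names) {
        ClassLoader loader = ConfigOnClasspath.class.getClassLoader();
        List<String> notFound = new ArrayList<>();
        for (String name : names) {
            URL url = loader.getResource(name);
            if (url == null) {
                notFound.add(name);
            } else {
                // Show where the file was picked up from.
                System.out.println(name + " -> " + url);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        List<String> notFound = missing(Arrays.asList("core-site.xml", "hdfs-site.xml"));
        System.out.println("Missing from classpath: " + notFound);
    }
}
```

If the list printed at the end is not empty, the files were probably copied to a folder that Eclipse does not treat as a source folder.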


2. Write the following code:

package com.mail;

import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class Hdfs {

    public static void main(String[] args) {
        try {
            // Picks up the cluster settings from the *-site.xml files on the classpath.
            Configuration conf = new Configuration();
            FileSystem file = FileSystem.get(conf);
            Path path = new Path("/tmp/data/mllib/kmeans_data.txt");
            if (file.exists(path)) {
                System.out.println("********************");
            }

            InputStream in = null;
            try {
                in = file.open(path);
                // Pass false so that copyBytes does not close System.out;
                // the input stream is closed in the finally block.
                IOUtils.copyBytes(in, System.out, 4096, false);
            } finally {
                IOUtils.closeStream(in);
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

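3. Running the finished program is likely to fail with several errors:

1) org/apache/commons/logging/LogFactory: commons-logging-1.0.4.jar is missing.

2) com/google/common/collect/Interners: the Google Guava jar is missing. I downloaded guava-18.0.jar.

3) NoClassDefFoundError: org/apache/commons/configuration/Configuration: the commons-configuration jar is missing.

4) DistributedFileSystem could not be instantiated, with org.apache.hadoop.conf.Configuration.addDeprecations in the error: this happens when the version of the HDFS jars in the project does not match the version running on the cluster.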
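The first three errors are missing-class errors, and the fourth is a version mismatch. Both can be checked up front. The sketch below uses only the JDK: it tries to load the three classes named in the errors above and, if Hadoop is on the classpath, prints the client version through reflection on org.apache.hadoop.util.VersionInfo.getVersion(). The class name ClasspathDoctor and its methods are hypothetical names for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper (not part of Hadoop).
final class ClasspathDoctor {

    // True if the named class can be found by this program's class loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathDoctor.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the Hadoop client version, or null if Hadoop is not on the classpath.
    static String clientHadoopVersion() {
        try {
            Class<?> info = Class.forName("org.apache.hadoop.util.VersionInfo");
            return (String) info.getMethod("getVersion").invoke(null);
        } catch (ReflectiveOperationException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the error messages listed above.
        List<String> required = Arrays.asList(
                "org.apache.commons.logging.LogFactory",
                "com.google.common.collect.Interners",
                "org.apache.commons.configuration.Configuration");
        for (String name : required) {
            System.out.println((isLoadable(name) ? "found    " : "MISSING  ") + name);
        }
        System.out.println("Hadoop client version: " + clientHadoopVersion());
    }
}
```

The printed client version can then be compared with the output of `hadoop version` on a cluster node. If the two differ, replacing the project's Hadoop jars with the ones shipped with the cluster's CDH release should resolve error 4.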



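4. The next example reads files from HDFS by iterating over the files in a directory, then creates a file with the same name in another directory and writes the content into it. The code is as follows: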

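// Additional imports required by this version of main:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;

public static void main(String[] args) {
    try {
        Configuration conf = new Configuration();
        FileSystem file = FileSystem.get(conf);
        String path = "/tmp/daily_mail/CN/sql/";
        String outputPath = "/tmp/daily_mail/CN/hql/";
        // Iterate over every entry in the input directory.
        for (FileStatus status : file.listStatus(new Path(path))) {
            if (!status.isFile()) {
                continue; // skip subdirectories, which cannot be opened as files
            }
            // Read the file line by line and join the lines with spaces.
            // Assumption: the input files are UTF-8 encoded.
            StringBuilder sql = new StringBuilder();
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(file.open(status.getPath()), StandardCharsets.UTF_8))) {
                String line;
                while ((line = br.readLine()) != null) {
                    sql.append(line).append(" ");
                }
            }
            System.out.println(sql);
            // create(path, true) overwrites an existing file. The original
            // deleteOnExit + createNewFile + append sequence would delete a
            // pre-existing output file when the FileSystem is closed.
            Path out = new Path(outputPath + status.getPath().getName());
            try (FSDataOutputStream outputFs = file.create(out, true)) {
                outputFs.write(sql.toString().getBytes(StandardCharsets.UTF_8));
            }
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}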
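The read loop above flattens each multi-line file into a single line by appending every line followed by a space. To make that step easier to try without a cluster, here is a minimal sketch of the same logic on a plain java.io.Reader. The class LineJoiner and the sample query are hypothetical and only serve as an illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: the line-joining step from the listing above,
// isolated so that it can be run without HDFS.
final class LineJoiner {

    // Reads all lines and appends each one followed by a single space,
    // so the result keeps a trailing space, as in the loop above.
    static String joinLines(Reader source) {
        StringBuilder joined = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                joined.append(line).append(' ');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String sample = "SELECT id\nFROM mail\nWHERE country = 'CN'";
        System.out.println(joinLines(new StringReader(sample)));
    }
}
```

One caveat with this approach: if an input file contains `--` line comments, joining the lines turns everything after the first comment marker into a comment.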
