Environment: CentOS 6.6; Hadoop 1.0.3; Java runtime: JDK 1.6
Single-node setup procedure:
1. Configure SSH: Hadoop's control scripts connect to the node over SSH, so set up passwordless (key-based) SSH access; that way Hadoop can log in without a manual password prompt:
Details:
Step 1: generate a key pair and authorize it
[hjchaw@localhost ~]$ ssh-keygen -t rsa -P ""
[hjchaw@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Step 2: test SSH; if the connection succeeds without a password prompt, the SSH setup is working
[hjchaw@localhost ~]$ ssh localhost
If SSH still prompts for a password, the usual cause is over-permissive access rights on the .ssh directory; set it to 700 and double-check this while configuring.
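The permission fix is just two chmod calls. The sketch below demonstrates them on a scratch directory so it is safe to run anywhere; apply the same two commands to the real ~/.ssh (sshd's default StrictModes rejects keys when .ssh or authorized_keys is group- or world-accessible):

```shell
# Demonstrated on a throwaway directory; run the same chmod calls on ~/.ssh.
demo=$(mktemp -d)
mkdir "$demo/.ssh"
touch "$demo/.ssh/authorized_keys"
chmod 700 "$demo/.ssh"                   # owner-only directory
chmod 600 "$demo/.ssh/authorized_keys"   # owner-only key file
stat -c '%a' "$demo/.ssh" "$demo/.ssh/authorized_keys"   # prints 700 then 600
```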
2. Hadoop configuration:
Step 1: edit conf/hadoop-env.sh and set JAVA_HOME, e.g. export JAVA_HOME=/usr/local/jdk
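The edit itself is a single line; a sketch (the /usr/local/jdk path follows this walkthrough's install location — adjust it to wherever your JDK actually lives):

```shell
# In conf/hadoop-env.sh, uncomment and set JAVA_HOME to your JDK root.
# /usr/local/jdk is this walkthrough's path; substitute your own.
export JAVA_HOME=/usr/local/jdk
echo "$JAVA_HOME"   # prints /usr/local/jdk
```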
Step 2: configure core-site.xml:
<configuration>
<!-- 1. Base directory for Hadoop's working files -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hjchaw/hadoop-datastore/hadoop-${user.name}</value>
<description>A base for other temporary directories.</description>
<final>true</final>
</property>
<!-- 2. Default file system URI -->
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system. Either the
literal string "local" or a host:port for NDFS.
</description>
<final>true</final>
</property>
</configuration>
Step 3: configure hdfs-site.xml (the properties must sit inside a configuration root element):
<configuration>
<!-- file system properties -->
<property>
<name>dfs.name.dir</name>
<value>${hadoop.tmp.dir}/dfs/name</value>
<description>Determines where on the local filesystem the DFS name node
should store the name table. If this is a comma-delimited list
of directories then the name table is replicated in all of the
directories, for redundancy. </description>
<final>true</final>
</property>
<property>
<name>dfs.data.dir</name>
<value>${hadoop.tmp.dir}/dfs/data</value>
<description>Determines where on the local filesystem an DFS data node
should store its blocks. If this is a comma-delimited
list of directories, then data will be stored in all named
directories, typically on different devices.
Directories that do not exist are ignored.
</description>
<final>true</final>
</property>
<!-- one replica per block, since this is a single-node cluster -->
<property>
<name>dfs.replication</name>
<value>1</value>
<final>true</final>
</property>
</configuration>
Step 4: configure mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
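A missing close tag makes Hadoop silently fall back to defaults for the whole file, so a cheap balance check before starting the daemons can save debugging time. The sketch below validates an inline copy of mapred-site.xml so it runs anywhere; on the real box, point f at each file under $HADOOP_HOME/conf instead:

```shell
# Check that the file has exactly one <configuration> root with a matching
# close tag. Demonstrated on an inline copy for self-containment.
f=$(mktemp)
cat > "$f" <<'EOF'
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
EOF
opens=$(grep -c '<configuration>' "$f")
closes=$(grep -c '</configuration>' "$f")
[ "$opens" -eq 1 ] && [ "$closes" -eq 1 ] && echo "well-formed root"
```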
The steps above complete a single-node, pseudo-distributed Hadoop configuration.
3. Starting Hadoop:
Add hadoop/bin to your PATH so the commands below can be run from any directory.
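A sketch of the PATH change (the /opt/hadoop/hadoop-1.0.3 prefix matches the log output later in this walkthrough; adjust it to wherever you unpacked the tarball, and append the lines to ~/.bashrc to make them permanent):

```shell
# Install prefix taken from this walkthrough's logs; adjust to your layout.
export HADOOP_HOME=/opt/hadoop/hadoop-1.0.3
export PATH="$HADOOP_HOME/bin:$PATH"
echo "$PATH" | grep -q "$HADOOP_HOME/bin" && echo "hadoop on PATH"
```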
Step 1: format the HDFS file system:
[hjchaw@localhost bin]$ hadoop namenode -format
12/05/27 04:25:19 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost.localdomain/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.3
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012
************************************************************/
12/05/27 04:25:19 INFO util.GSet: VM type = 32-bit
12/05/27 04:25:19 INFO util.GSet: 2% max memory = 19.33375 MB
12/05/27 04:25:19 INFO util.GSet: capacity = 2^22 = 4194304 entries
12/05/27 04:25:19 INFO util.GSet: recommended=4194304, actual=4194304
12/05/27 04:25:20 INFO namenode.FSNamesystem: fsOwner=hjchaw
12/05/27 04:25:20 INFO namenode.FSNamesystem: supergroup=supergroup
12/05/27 04:25:20 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/05/27 04:25:20 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/05/27 04:25:20 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/05/27 04:25:20 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/05/27 04:25:21 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/05/27 04:25:21 INFO common.Storage: Storage directory /home/hjchaw/hadoop-datastore/hadoop-hjchaw/dfs/name has been successfully formatted.
12/05/27 04:25:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
Step 2: start Hadoop:
[hjchaw@localhost bin]$ start-all.sh
starting namenode, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-tasktracker-localhost.localdomain.out
If you see startup lines like the above for all five daemons, the configuration is working.
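An easy cross-check that all five daemons actually stayed up is `jps`, which ships with the JDK. The sketch below greps a captured sample of its output so it runs anywhere; on a real box, pipe live `jps` output instead (the PIDs shown are illustrative, not real):

```shell
# The five daemon names are fixed in Hadoop 1.x; PIDs are sample values.
sample='2001 NameNode
2002 DataNode
2003 SecondaryNameNode
2004 JobTracker
2005 TaskTracker'
missing=0
for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
  echo "$sample" | grep -qw "$d" || { echo "missing: $d"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all five daemons running"
```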
4. Try the Hadoop command-line interface on the file system:
For example, create a directory:
[hjchaw@localhost bin]$ hadoop fs -mkdir input
List files:
[hjchaw@localhost bin]$ hadoop fs -ls
Found 1 items
drwxr-xr-x - hjchaw supergroup 0 2012-05-27 04:26 /user/hjchaw/input
This completes the single-node, pseudo-distributed Hadoop setup.