First you need n virtual machines already set up; I used 3.
Install Java on each virtual machine.
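A minimal sketch of the install, assuming the Oracle JDK 7u15 RPM (it unpacks to /usr/java/jdk1.7.0_15, which matches the JAVA_HOME configured below; the exact file name depends on your download):
rpm -ivh jdk-7u15-linux-x64.rpm           # file name depends on the download
/usr/java/jdk1.7.0_15/bin/java -version   # verify the install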
Edit the /etc/hosts file on every node so the machines can resolve each other by hostname:
- 192.168.1.111 h1
- 192.168.1.112 h2
- 192.168.1.113 h3
- # Do not remove the following line, or various programs
- # that require network functionality will fail.
- 127.0.0.1 localhost.localdomain localhost
- ::1 localhost6.localdomain6 localhost6
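To verify that hostname resolution works, a quick check (my addition) from any node:
ping -c 1 h1
ping -c 1 h2
ping -c 1 h3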
Create a dedicated account for running Hadoop.
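For example (the account name hadoop here is an assumption; the original does not name it, and the account should exist on every node):
useradd hadoop    # account name is an assumption; any dedicated user works
passwd hadoop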
Set up SSH keys so the master can log in to every node without a password:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
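The public key also has to reach each slave so the start scripts on the master can SSH in. A sketch using ssh-copy-id (my addition; run from the master for each slave):
ssh-copy-id h2
ssh-copy-id h3
ssh h2    # should log in without a password prompt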
Configure hadoop-env.sh:
- # The java implementation to use. Required.
- export JAVA_HOME=/usr/java/jdk1.7.0_15
Then come the three main configuration files.
core-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>fs.default.name</name>
- <value>hdfs://h1:9000</value>
- </property>
- <!-- Path for temporary files -->
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/home/root/hadoop/tmp</value>
- </property>
- <!-- Trash interval; if 0, trash is disabled -->
- <property>
- <name>fs.trash.interval</name>
- <value>1400</value>
- <description>Number of minutes between trash checkpoints.
- If zero, the trash feature is disabled.
- </description>
- </property>
- </configuration>
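The hadoop.tmp.dir path should exist and be writable by the Hadoop account on every node; as a precaution (my addition):
mkdir -p /home/root/hadoop/tmp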
Note: fs.trash.interval is the number of minutes between trash checkpoints; if it is 0, the trash feature is disabled.
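With trash enabled, hadoop fs -rm moves a file into .Trash instead of deleting it outright, so it can still be recovered until the next checkpoint. For example (paths are illustrative):
hadoop fs -rm /tmp/test.txt    # moved to the user's .Trash
hadoop fs -expunge             # empty the trash immediately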
hdfs-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>dfs.data.dir</name>
- <value>${hadoop.tmp.dir}/dfs/data</value>
- <description>Determines where on the local filesystem a DFS data node
- should store its blocks. If this is a comma-delimited
- list of directories, then data will be stored in all named
- directories, typically on different devices.
- Directories that do not exist are ignored.
- </description>
- </property>
- <!-- Number of replicas: with 3 VMs in total, 1 master and 2 slaves, the replication factor is 2 -->
- <property>
- <name>dfs.replication</name>
- <value>2</value>
- <description>Default block replication.
- The actual number of replications can be specified when the file is created.
- The default is used if replication is not specified at create time.
- </description>
- </property>
- <!-- Percentage of blocks that must meet the minimum replication
-      requirement defined by dfs.replication.min.
-      A value <= 0 means do not start in safe mode;
-      a value > 1 makes safe mode permanent.
- -->
- <property>
- <name>dfs.safemode.threshold.pct</name>
- <value>1</value>
- <description>
- Specifies the percentage of blocks that should satisfy
- the minimal replication requirement defined by dfs.replication.min.
- Values less than or equal to 0 mean not to start in safe mode.
- Values greater than 1 will make safe mode permanent.
- </description>
- </property>
- <!-- HDFS permission settings; development is done from Eclipse on Windows, so set this to false to avoid permission errors -->
- <property>
- <name>dfs.permissions</name>
- <value>false</value>
- <description>
- If "true", enable permission checking in HDFS.
- If "false", permission checking is turned off,
- but all other behavior is unchanged.
- Switching from one parameter value to the other does not change the mode,
- owner or group of files or directories.
- </description>
- </property>
- </configuration>
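If the NameNode later gets stuck in safe mode because of the threshold above, its state can be checked or cleared with the standard dfsadmin commands:
hadoop dfsadmin -safemode get
hadoop dfsadmin -safemode leave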
mapred-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>mapred.job.tracker</name>
- <value>h1:9001</value>
- </property>
- </configuration>
Configure the masters and slaves files:
masters
- h1
slaves
- h2
- h3
Copy the configured hadoop directory to the other nodes:
scp -r ./hadoop h2:~
scp -r ./hadoop h3:~
Then format the NameNode (on the master only):
hadoop namenode -format
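After formatting, start the cluster from the master and confirm the daemons with jps; with this layout one expects NameNode, SecondaryNameNode, and JobTracker on h1, and DataNode and TaskTracker on h2/h3 (standard Hadoop 1.x behavior, not stated in the original):
start-all.sh
jps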
Notes:
If hadoop has not been added to the PATH, the commands must be run from the hadoop/bin directory.
It is best to turn the firewall off; otherwise the nodes will fail to connect to each other.
iptables firewall commands:
Check firewall status: sudo service iptables status
Temporarily stop the firewall: sudo service iptables stop
Disable the firewall at boot: sudo chkconfig iptables off
Re-enable the firewall at boot: sudo chkconfig iptables on
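As a final smoke test (my addition), the NameNode web UI should answer at http://h1:50070 and the JobTracker at http://h1:50030, and the bundled examples jar can run a trial job (the jar name varies by Hadoop 1.x version):
hadoop jar $HADOOP_HOME/hadoop-examples-*.jar pi 2 10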