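Snappy is another open-source library for fast compression and decompression. I won't go into the details here; see the official site for more.
Required software: gcc, g++, the snappy package, the hadoop-snappy source package, and maven.
gcc was already installed on my Ubuntu system. If any of gcc, g++, or maven is missing, install them with the following command: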
- sudo apt-get install gcc g++ maven2
Download the package:
snappy: http://code.google.com/p/snappy/downloads/list
After downloading snappy-1.0.5.tar.gz, run the following commands to build and install it:
- tar -zxvf snappy-1.0.5.tar.gz
- cd snappy-1.0.5
- ./configure
- make
- sudo make install
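Before moving on, it can help to confirm that the install step actually put the library in place. The following is a minimal sketch, assuming `./configure` was run without `--prefix` so that files land under `/usr/local`; the `check_snappy` function and the `SNAPPY_PREFIX` variable are hypothetical names used only for illustration.

```shell
# Assumption: default autotools prefix (/usr/local). Override with SNAPPY_PREFIX if needed.

# Report whether the snappy shared library and header exist under a given prefix.
check_snappy() {
    prefix="$1"
    if ls "$prefix"/lib/libsnappy.so* >/dev/null 2>&1 && [ -f "$prefix/include/snappy.h" ]; then
        echo "snappy found under $prefix"
        return 0
    fi
    echo "snappy not found under $prefix"
    return 1
}

# Check the assumed install location; never abort the calling shell.
check_snappy "${SNAPPY_PREFIX:-/usr/local}" || true
```

If the check reports that snappy is missing, the hadoop-snappy build in the next step will most likely fail to find the library as well.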
Download the hadoop-snappy source with an SVN client:
Download URL: http://hadoop-snappy.googlecode.com/svn/trunk/
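With the command-line SVN client, the checkout could look like the sketch below. The target directory name `hadoop-snappy` is an assumption chosen to match the `cd hadoop-snappy` step that follows, and `fetch_hadoop_snappy` is a hypothetical helper name.

```shell
# URL taken from the text above; the directory name is an assumption.
REPO_URL="http://hadoop-snappy.googlecode.com/svn/trunk/"
TARGET_DIR="hadoop-snappy"

# Check out the given repository URL into the given directory.
fetch_hadoop_snappy() {
    if ! command -v svn >/dev/null 2>&1; then
        echo "svn not found; on Ubuntu: sudo apt-get install subversion" >&2
        return 1
    fi
    svn checkout "$1" "$2"
}

# Usage: fetch_hadoop_snappy "$REPO_URL" "$TARGET_DIR"
```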
Building hadoop-snappy requires the automake and libtool packages. Run the following commands:
- sudo apt-get install automake libtool
- cd hadoop-snappy
- mvn package
Then copy the JAR files from hadoop-snappy-1.0.5-tar/hadoop-snappy-1.0.5/lib/ in the build output into $HADOOP_HOME/lib, and add the following properties to the configuration file core-site.xml:
- <property>
- <name>mapred.compress.map.output</name>
- <value>true</value>
- </property>
- <property>
- <name>mapred.map.output.compression.codec</name>
- <value>org.apache.hadoop.io.compress.SnappyCodec</value>
- </property>
- <property>
- <name>io.compression.codecs</name>
- <value>org.apache.hadoop.io.compress.GzipCodec,
- org.apache.hadoop.io.compress.DefaultCodec,
- org.apache.hadoop.io.compress.BZip2Codec,
- com.hadoop.compression.lzo.LzoCodec,
- com.hadoop.compression.lzo.LzopCodec,
- org.apache.hadoop.io.compress.SnappyCodec
- </value>
- </property>
- <property>
- <name>io.compression.codec.lzo.class</name>
- <value>com.hadoop.compression.lzo.LzoCodec</value>
- </property>
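The copy and configuration steps above can be sketched as two small shell helpers. This is only an illustration: `install_snappy_jars` and `has_snappy_codec` are hypothetical names, and the location `$HADOOP_HOME/conf/core-site.xml` in the usage comment is an assumption about a Hadoop 1.x style layout.

```shell
# Copy every JAR from the hadoop-snappy build output into Hadoop's lib directory.
install_snappy_jars() {
    src="$1"
    dest="$2"
    if [ ! -d "$src" ] || [ ! -d "$dest" ]; then
        echo "source or destination directory is missing" >&2
        return 1
    fi
    cp "$src"/*.jar "$dest"/
}

# Succeed only if the given config file mentions the Snappy codec class.
has_snappy_codec() {
    grep -qF "org.apache.hadoop.io.compress.SnappyCodec" "$1"
}

# Usage (paths are assumptions):
#   install_snappy_jars hadoop-snappy-1.0.5-tar/hadoop-snappy-1.0.5/lib "$HADOOP_HOME/lib"
#   has_snappy_codec "$HADOOP_HOME/conf/core-site.xml" && echo "SnappyCodec is configured"
```

Note that the codec list above also names the LZO codecs, so those classes need to be available on the cluster as well.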
Then restart Hadoop, and you're done.
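How the restart is done depends on the installation. As a rough sketch, assuming a Hadoop 1.x style layout where `stop-all.sh` and `start-all.sh` live under `$HADOOP_HOME/bin`, it might look like this (`restart_hadoop` is a hypothetical helper name):

```shell
# Stop and then start all Hadoop daemons using the scripts under <home>/bin.
# Assumption: a layout that ships stop-all.sh and start-all.sh in bin/.
restart_hadoop() {
    home="$1"
    if [ ! -x "$home/bin/stop-all.sh" ] || [ ! -x "$home/bin/start-all.sh" ]; then
        echo "stop-all.sh/start-all.sh not found under $home/bin" >&2
        return 1
    fi
    "$home/bin/stop-all.sh" && "$home/bin/start-all.sh"
}

# Usage: restart_hadoop "$HADOOP_HOME"
```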