Hadoop 2.7.3 is a minor release based on the stable Hadoop 2.7.2; the main purpose of this build is to enable snappy compression support.
Below is an overview of the main features and improvements in Hadoop 2.7.3:
Common
- Authentication improvements when using an HTTP proxy server; useful when accessing WebHDFS through a proxy.
- A new Hadoop metrics sink that allows writing directly to Graphite.
- Specification work related to the Hadoop Compatible Filesystem (HCFS) effort.
HDFS
- Support for POSIX-style filesystem extended attributes (see the sketch after this list).
- Using the OfflineImageViewer, clients can browse an fsimage via the WebHDFS API.
- The NFS gateway received a number of supportability improvements and bug fixes.
- The SecondaryNameNode, JournalNode, and DataNode web UIs have been modernized with HTML5 and JavaScript.
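For example, a minimal sketch of the xattr commands (the path and attribute name below are made-up placeholders):
# Hypothetical file and attribute; requires a running HDFS
hadoop fs -setfattr -n user.owner-team -v analytics /data/sample.txt
hadoop fs -getfattr -d /data/sample.txt    # dump all xattrs on the path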
YARN
- YARN's REST APIs now support write/modify operations; users can submit and kill applications through them (see the curl sketch after this list).
- The YARN Timeline store, which keeps generic and per-application information for applications, supports Kerberos authentication.
- The Fair Scheduler supports dynamic hierarchical user queues.
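As a sketch of the REST write support, killing an application could look like this (hostname, port, and application id are placeholders):
# Ask the ResourceManager to transition the application to the KILLED state
curl -X PUT -H "Content-Type: application/json" \
     -d '{"state": "KILLED"}' \
     'http://resourcemanager:8088/ws/v1/cluster/apps/application_1481512345678_0001/state'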
Build environment: a virtual machine
CentOS 6.5, 64-bit
JDK: 1.7.0_79
Maven: 3.3.9
Findbugs: 3.0.1
protobuf: 2.5.0
ant: 1.9.7
Memory: 2 GB
The build downloads a large number of Maven artifacts, so it is fairly demanding on the network; build speed depends mostly on how fast the dependencies come down.
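One optional tweak that can help (not part of the original steps; the mirror URL below is a placeholder) is routing Maven central through a closer mirror:
# Point Maven at a nearby mirror; replace the URL with a real mirror near you
mkdir -p ~/.m2
cat > ~/.m2/settings.xml <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <id>example-mirror</id>
      <mirrorOf>central</mirrorOf>
      <url>https://repo.example.com/maven2</url>
    </mirror>
  </mirrors>
</settings>
EOF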
Source package download address:
http://apache.fayea.com/hadoop/common/hadoop-2.7.3/hadoop-2.7.3-src.tar.gz
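Fetch the tarball first, for example:
wget http://apache.fayea.com/hadoop/common/hadoop-2.7.3/hadoop-2.7.3-src.tar.gz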
tar xf hadoop-2.7.3-src.tar.gz
cd hadoop-2.7.3-src
[root@moban hadoop-2.7.3-src]# head -15 BUILDING.txt    # check the minimum build requirements
Build instructions for Hadoop
----------------------------------------------------------------------------------
Requirements:
* Unix System
* JDK 1.7+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
* Zlib devel (if compiling native code)
* openssl devel ( if compiling native hadoop-pipes and to get the best HDFS encryption performance )
* Linux FUSE (Filesystem in Userspace) version 2.6 or above ( if compiling fuse_dfs )
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
Now install the required libraries:
yum -y install svn ncurses-devel gcc*
yum -y install lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel
Build protobuf
Download address: https://github.com/google/protobuf
tar zxvf protobuf-2.5.0.tar.gz
mv protobuf-2.5.0 /usr/local/
cd /usr/local/protobuf-2.5.0
./configure
make && make install
Verify the installation:
[root@moban protobuf-2.5.0]# protoc
Missing input file.
Check the version:
[root@moban protobuf-2.5.0]# protoc --version
libprotoc 2.5.0
If it prints libprotoc 2.5.0 as above, the installation succeeded.
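If protoc instead fails to start with a shared-library error (a common symptom on fresh installs, though not hit here), refreshing the linker cache usually fixes it:
# Make /usr/local/lib visible to the dynamic linker
echo "/usr/local/lib" > /etc/ld.so.conf.d/protobuf.conf
ldconfig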
ant download address:
http://apache.fayea.com/ant/binaries/apache-ant-1.9.7-bin.tar.gz
tar xf apache-ant-1.9.7-bin.tar.gz -C /usr/local
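A quick sanity check, by full path since the PATH entries are only added later:
/usr/local/apache-ant-1.9.7/bin/ant -version    # should report Apache Ant 1.9.7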
findbugs-3.0.1 download address:
http://prdownloads.sourceforge.net/findbugs/findbugs-3.0.1.tar.gz
tar xf findbugs-3.0.1.tar.gz
mv findbugs-3.0.1 /usr/local/
Check the version:
[root@moban local]# findbugs -version
3.0.1
maven download address:
http://mirrors.cnnic.cn/apache/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar xf apache-maven-3.3.9-bin.tar.gz -C /usr/local/
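As with ant, verify by full path:
/usr/local/apache-maven-3.3.9/bin/mvn -version    # should report Apache Maven 3.3.9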
Install snappy-1.1.3
Download address: https://github.com/google/snappy/releases/download/1.1.3/snappy-1.1.3.tar.gz
tar xf snappy-1.1.3.tar.gz -C /usr/local/
cd /usr/local/snappy-1.1.3
./configure
make && make install
After installation, check the result:
[root@moban snappy-1.1.3]# ls -lh /usr/local/lib |grep snappy
-rw-r--r-- 1 root root 462K Dec 20 12:06 libsnappy.a
-rwxr-xr-x 1 root root 955 Dec 20 12:06 libsnappy.la
lrwxrwxrwx 1 root root 18 Dec 20 12:06 libsnappy.so -> libsnappy.so.1.3.0
lrwxrwxrwx 1 root root 18 Dec 20 12:06 libsnappy.so.1 -> libsnappy.so.1.3.0
-rwxr-xr-x 1 root root 223K Dec 20 12:06 libsnappy.so.1.3.0
JDK download address (note: the link points at the JDK 8 download page, while this build uses jdk1.7.0_79):
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Configure the environment variables:
vim ~/.bash_profile
export PATH
export JAVA_HOME=/usr/local/jdk1.7.0_79
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$CLASSPATH:./:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export MAVEN_HOME=/usr/local/apache-maven-3.3.9
export SCALA_HOME=/usr/local/scala-2.11.8 # used later when building Spark
export FINDBUGS_HOME=/usr/local/findbugs-3.0.1
export PROTOBUF_HOME=/usr/local/protobuf-2.5.0
export ANT_HOME=/usr/local/apache-ant-1.9.7
export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$SCALA_HOME/bin:$FINDBUGS_HOME/bin:$ANT_HOME/bin
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
source ~/.bash_profile
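After sourcing the profile, confirm that every tool resolves from the PATH:
java -version        # 1.7.0_79
mvn -version         # Apache Maven 3.3.9
ant -version         # Apache Ant 1.9.7
findbugs -version    # 3.0.1
protoc --version     # libprotoc 2.5.0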
Build command:
mvn clean package -Pdist,native -DskipTests -Dtar -Dbundle.snappy -Dsnappy.lib=/usr/local/lib
Parameter notes:
- -Pdist,native : build the distribution along with the native Hadoop libraries
- -DskipTests : skip the tests
- -Dtar : package the result as a tarball
- -Dbundle.snappy : bundle snappy support (the official binary downloads do not include it)
- -Dsnappy.lib=/usr/local/lib : the path where the snappy libraries were installed on the build machine
编译结果:
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [ 2.469 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [ 1.331 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 1.582 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 3.130 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.227 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [ 1.957 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 4.687 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [ 7.643 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 7.308 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 2.713 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [01:34 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 5.755 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [02:47 min]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.054 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [03:55 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [02:31 min]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [03:53 min]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 3.825 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.043 s]
[INFO] hadoop-yarn ........................................ SUCCESS [ 0.029 s]
[INFO] hadoop-yarn-api .................................... SUCCESS [01:23 min]
[INFO] hadoop-yarn-common ................................. SUCCESS [01:42 min]
[INFO] hadoop-yarn-server ................................. SUCCESS [ 0.057 s]
[INFO] hadoop-yarn-server-common .......................... SUCCESS [ 15.157 s]
[INFO] hadoop-yarn-server-nodemanager ..................... SUCCESS [ 19.710 s]
[INFO] hadoop-yarn-server-web-proxy ....................... SUCCESS [ 3.938 s]
[INFO] hadoop-yarn-server-applicationhistoryservice ....... SUCCESS [ 10.158 s]
[INFO] hadoop-yarn-server-resourcemanager ................. SUCCESS [ 19.264 s]
[INFO] hadoop-yarn-server-tests ........................... SUCCESS [ 5.756 s]
[INFO] hadoop-yarn-client ................................. SUCCESS [ 8.633 s]
[INFO] hadoop-yarn-server-sharedcachemanager .............. SUCCESS [ 3.851 s]
[INFO] hadoop-yarn-applications ........................... SUCCESS [ 0.029 s]
[INFO] hadoop-yarn-applications-distributedshell .......... SUCCESS [ 2.681 s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher ..... SUCCESS [ 1.785 s]
[INFO] hadoop-yarn-site ................................... SUCCESS [ 0.046 s]
[INFO] hadoop-yarn-registry ............................... SUCCESS [ 6.323 s]
[INFO] hadoop-yarn-project ................................ SUCCESS [ 3.170 s]
[INFO] hadoop-mapreduce-client ............................ SUCCESS [ 0.149 s]
[INFO] hadoop-mapreduce-client-core ....................... SUCCESS [ 21.400 s]
[INFO] hadoop-mapreduce-client-common ..................... SUCCESS [ 17.242 s]
[INFO] hadoop-mapreduce-client-shuffle .................... SUCCESS [ 4.433 s]
[INFO] hadoop-mapreduce-client-app ........................ SUCCESS [ 10.327 s]
[INFO] hadoop-mapreduce-client-hs ......................... SUCCESS [ 6.912 s]
[INFO] hadoop-mapreduce-client-jobclient .................. SUCCESS [01:59 min]
[INFO] hadoop-mapreduce-client-hs-plugins ................. SUCCESS [ 1.805 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 7.366 s]
[INFO] hadoop-mapreduce ................................... SUCCESS [ 2.525 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 33.129 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 8.177 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 1.759 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 6.449 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 4.048 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 2.997 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [ 1.976 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 3.139 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 6.732 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 5.706 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [33:55 min]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 53.715 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 6.341 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 0.645 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 6.669 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 7.077 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.021 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 43.940 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:00 h
[INFO] Finished at: 2016-12-20T11:57:33+08:00
[INFO] Final Memory: 124M/766M
The compilation itself is not long, about one hour in total here; when people ask why a build takes ages, it is almost always an unstable network making the dependency downloads slow.
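With -Dtar, the finished distribution lands under hadoop-dist/target (the same tree used for the verification step at the end):
ls -lh hadoop-dist/target/hadoop-2.7.3.tar.gz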
Problems encountered during the build:
Error:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (dist) on project hadoop-kms: An Ant BuildException has occured: exec returned: 2 around Ant part ...<exec dir="/root/soft/hadoop-2.7.3-src/hadoop-common-project/hadoop-kms/target" executable="sh" failonerror="true">... @ 10:118 in /root/soft/hadoop-2.7.3-src/hadoop-common-project/hadoop-kms/target/antrun/build-main.xml
[ERROR] -> [Help 1]
Possible causes:
1. ant is not installed properly, or the environment variables are misconfigured.
2. If (1) checks out, the apache-tomcat-6.0.44.tar.gz download did not complete.
One suggested fix is to add a regular user, e.g. hadoop, and give it ownership of the whole hadoop-2.7.3-src tree:
useradd hadoop
chown -R hadoop.hadoop hadoop-2.7.3-src
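Then rerun the build as that user; a sketch, assuming the source tree has been moved somewhere the new user can read (/root is not):
su - hadoop
cd /path/to/hadoop-2.7.3-src    # adjust to wherever the tree now lives
mvn clean package -Pdist,native -DskipTests -Dtar -Dbundle.snappy -Dsnappy.lib=/usr/local/lib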
A related observation: Maven downloads dependencies automatically during the build, and apache-tomcat-6.0.44.tar.gz is fetched twice:
[mkdir] Created dir: /root/soft/hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs-httpfs/downloads
[mkdir] Created dir: /root/soft/hadoop-2.7.3-src/hadoop-common-project/hadoop-kms/downloads
The two files are identical, yet two copies end up in different directories:
[root@moban hadoop-2.7.3-src]# ls /root/soft/hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs-httpfs/downloads
apache-tomcat-6.0.44.tar.gz
[root@moban hadoop-2.7.3-src]# ls /root/soft/hadoop-2.7.3-src/hadoop-common-project/hadoop-kms/downloads
apache-tomcat-6.0.44.tar.gz
If it were downloaded only once, the build would clearly take less time. The file is not cached in the Maven repository; it is fetched per module, and the downloads directories above are created when those modules are built. This is merely an observation and does not affect the build.
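If you already have a complete copy of the tarball from an earlier attempt, one workaround (an assumption, not an official step: the build appears to skip the fetch when the file is already present) is to seed it into both directories from the source root:
# Pre-seed the tomcat tarball so both modules skip the download
cp apache-tomcat-6.0.44.tar.gz hadoop-hdfs-project/hadoop-hdfs-httpfs/downloads/
cp apache-tomcat-6.0.44.tar.gz hadoop-common-project/hadoop-kms/downloads/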
Finally, how do you confirm that the build actually supports snappy compression?
[root@moban native]# pwd
/root/soft/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/lib/native
[root@moban native]# ll libsnappy.*
-rw-r--r-- 1 root root 472950 Dec 20 15:08 libsnappy.a
-rwxr-xr-x 1 root root 955 Dec 20 15:08 libsnappy.la
lrwxrwxrwx 1 root root 18 Dec 20 15:08 libsnappy.so -> libsnappy.so.1.3.0
lrwxrwxrwx 1 root root 18 Dec 20 15:08 libsnappy.so.1 -> libsnappy.so.1.3.0
-rwxr-xr-x 1 root root 228177 Dec 20 15:08 libsnappy.so.1.3.0
If files like these are present in this directory, the build successfully added snappy compression support. The same check can be done after installing Hadoop, with the command:
hadoop checknative
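Illustrative output on a machine where the rebuilt package is installed (paths and versions will vary):
[root@moban ~]# hadoop checknative
Native library checking:
hadoop:  true /usr/local/hadoop-2.7.3/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /usr/local/hadoop-2.7.3/lib/native/libsnappy.so.1
lz4:     true revision:99
bzip2:   false
openssl: true /usr/lib64/libcrypto.so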