Building and installing Spark 2.1.0 from source for hadoop2.6.0-cdh5.7.0
Published: 2019-06-25


 

1. Preparation:

CentOS 6.5

JDK 1.7

Java SE installer download link:

Maven 3.3.9

Maven 3.3.9 installer download link:

Spark 2.1.0 download:

http://spark.apache.org/downloads.html

 

File name after download:

 

 

*************************************************** Divider: build starts *********************************************************************

 

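Upload the files to the Linux machine.

Install Maven: unpack the archive and configure the environment variables.

The details are omitted here...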
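
The omitted environment setup might look roughly like the following. This is only a sketch: the install path /opt/apache-maven-3.3.9 is an assumption (the post does not say where Maven was unpacked), so adjust it to the actual location.

```shell
# Assumed install location (hypothetical); the original post does not state it.
MAVEN_HOME=/opt/apache-maven-3.3.9
export MAVEN_HOME

# Put Maven's bin directory at the front of PATH so that "mvn" resolves to this install.
PATH="$MAVEN_HOME/bin:$PATH"
export PATH
```

To keep the setting across logins, the same lines can be appended to a profile file such as /etc/profile or ~/.bash_profile.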

mvn -v

 

If the version information is printed, mvn is set up correctly.

************************************************************* Divider ***********************************************************************************

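My Hadoop version is hadoop2.6.0-cdh5.7.0.

Unpack the Spark source archive

to get the source tree.

Ignore the Spark package that I have already built here.

First raise Maven's memory limits, otherwise the build is likely to fail for lack of memory. Setting them temporarily for the current shell is enough: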

export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

[root@master109 opt]# echo $MAVEN_OPTS
-Xmx2g -XX:ReservedCodeCacheSize=512m

 

Go to the top-level directory of the Spark source tree and run:

./dev/make-distribution.sh --name 2.6.0-cdh5.7.0   --tgz   -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.7.0 -Phive -Phive-thriftserver  -Pyarn
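
In the command above, --name sets the suffix of the generated package, --tgz asks for a .tgz archive, -Phadoop-2.6 together with -Dhadoop.version selects the Hadoop version to build against, -Phive and -Phive-thriftserver enable Hive support, and -Pyarn enables YARN support. Since the Hadoop version appears twice, it can help to keep it in one variable. The following is a small sketch that only prints the command; HADOOP_VERSION and build_command are illustrative names, not part of Spark.

```shell
# Illustrative variable: the full Hadoop version string used by this build.
HADOOP_VERSION="2.6.0-cdh5.7.0"

# Hypothetical helper: print the make-distribution.sh command line.
# $1: Hadoop version, used for both --name and -Dhadoop.version.
build_command() {
    printf '%s' "./dev/make-distribution.sh --name $1 --tgz"
    printf '%s' " -Phadoop-2.6 -Dhadoop.version=$1"
    printf '%s\n' " -Phive -Phive-thriftserver -Pyarn"
}

# Print the command so it can be reviewed before it is actually run.
build_command "$HADOOP_VERSION"
```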

  

Result:

[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.810 s (Wall Clock)
[INFO] Finished at: 2017-10-13T15:52:09+08:00
[INFO] Final Memory: 67M/707M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project spark-launcher_2.11: Could not resolve dependencies for project org.apache.spark:spark-launcher_2.11:jar:2.1.0: Failure to find org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.7.0 in https://repo1.maven.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :spark-launcher_2.11

 

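The build failed because some artifacts could not be found. The problem is the repository: by default only the Apache repository (Maven Central) is used, and it does not host the CDH artifacts, so the CDH (Cloudera) repository has to be added.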
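
The error message also says that the failed lookup "was cached in the local repository". If Maven still refuses to re-resolve the artifact once the repository is fixed, one option is to delete the cached failure markers (files ending in .lastUpdated) before rebuilding. This step may well be unnecessary. The sketch below assumes the default local repository at ~/.m2/repository, and clear_cached_failures is a hypothetical helper name.

```shell
# Hypothetical helper: delete Maven's cached "download failed" markers under a directory.
# $1: directory to clean, e.g. the hadoop-client folder in the local repository.
clear_cached_failures() {
    dir=$1
    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0
    # Maven records failed downloads in *.lastUpdated files; removing them
    # lets the next build attempt the download again.
    find "$dir" -type f -name '*.lastUpdated' -exec rm -f {} +
}

# Assumes the default local repository location (~/.m2/repository).
clear_cached_failures "$HOME/.m2/repository/org/apache/hadoop/hadoop-client"
```

Maven's -U option is another way to force the update that the message refers to.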

Edit pom.xml:

[root@master109 spark-2.1.0]# ls
appveyor.yml  bin    common  CONTRIBUTING.md  data  docs      external  launcher  licenses  mllib        NOTICE   project  R          repl  scalastyle-config.xml  streaming  tools
assembly      build  conf    core             dev   examples  graphx    LICENSE   mesos     mllib-local  pom.xml  python   README.md  sbin  sql                    target     yarn
[root@master109 spark-2.1.0]# vim pom.xml

 

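Insert the new repository at the position shown below, right after the existing central repository. The content between the two "#---------------------------------------------" separator lines is what adds the new repository. Remember to delete the separator lines themselves.

    <repository>
      <id>central</id>
      <name>Maven Repository</name>
      <url>https://repo1.maven.org/maven2</url>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </repository>
#---------------------------------------------
    <repository>
      <id>cloudera</id>
      <name>cloudera Repository</name>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
    </repository>
#---------------------------------------------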
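
Before starting the long build again, a quick check of the edited file can catch the two likely slips: a missing Cloudera entry, or separator lines left in the XML. This is a minimal sketch, meant to be run from the Spark source root; both function names are hypothetical.

```shell
# Hypothetical helper: succeeds if the given POM file mentions the Cloudera repository URL.
has_cloudera_repo() {
    grep -q 'repository.cloudera.com/artifactory/cloudera-repos' "$1"
}

# Hypothetical helper: succeeds if a "#-----" separator line is still in the file.
# Such lines are not valid XML and would break the build.
has_leftover_separators() {
    grep -q '^[[:space:]]*#-----' "$1"
}

# Only run the checks when a pom.xml exists in the current directory.
if [ -f pom.xml ]; then
    if has_cloudera_repo pom.xml; then
        echo "cloudera repository found"
    else
        echo "cloudera repository missing"
    fi
    if has_leftover_separators pom.xml; then
        echo "separator lines still present"
    else
        echo "no separator lines left"
    fi
fi
```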

 

Start the build again:

[root@master109 spark-2.1.0]# ./dev/make-distribution.sh --name 2.6.0-cdh5.7.0   --tgz   -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.7.0 -Phive -Phive-thriftserver  -Pyarn

Wait a few minutes (about 13 minutes in this run):

[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  3.997 s]
[INFO] Spark Project Tags ................................. SUCCESS [  3.394 s]
[INFO] Spark Project Sketch ............................... SUCCESS [ 14.061 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 37.680 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 12.750 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 33.158 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 50.148 s]
[INFO] Spark Project Core ................................. SUCCESS [04:16 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 45.832 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 26.712 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 58.080 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [02:22 min]
[INFO] Spark Project SQL .................................. SUCCESS [03:02 min]
[INFO] Spark Project ML Library ........................... SUCCESS [02:16 min]
[INFO] Spark Project Tools ................................ SUCCESS [  2.588 s]
[INFO] Spark Project Hive ................................. SUCCESS [01:19 min]
[INFO] Spark Project REPL ................................. SUCCESS [  6.337 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 13.252 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 57.556 s]
[INFO] Spark Project Hive Thrift Server ................... SUCCESS [ 45.074 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  7.410 s]
[INFO] Spark Project External Flume Sink .................. SUCCESS [ 30.214 s]
[INFO] Spark Project External Flume ....................... SUCCESS [ 19.359 s]
[INFO] Spark Project External Flume Assembly .............. SUCCESS [  6.082 s]
[INFO] Spark Integration for Kafka 0.8 .................... SUCCESS [ 30.266 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 28.668 s]
[INFO] Spark Project External Kafka Assembly .............. SUCCESS [  6.919 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 30.811 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  6.551 s]
[INFO] Kafka 0.10 Source for Structured Streaming ......... SUCCESS [ 17.707 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13:25 min (Wall Clock)
[INFO] Finished at: 2017-10-13T16:35:47+08:00
[INFO] Final Memory: 90M/979M
[INFO] ------------------------------------------------------------------------
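
After a successful run, make-distribution.sh leaves the archive in the source root, named spark-<version>-bin-<name>.tgz, where <name> is the value passed to --name. The sketch below derives that file name for this build and checks whether it exists in the current directory. SPARK_VERSION, DIST_NAME and dist_tarball are illustrative names.

```shell
# Values taken from this build; the variable names are illustrative.
SPARK_VERSION="2.1.0"
DIST_NAME="2.6.0-cdh5.7.0"   # the value passed to --name

# Hypothetical helper: print the archive name for a given version and --name value.
# make-distribution.sh names the archive spark-<version>-bin-<name>.tgz.
dist_tarball() {
    printf 'spark-%s-bin-%s.tgz\n' "$1" "$2"
}

tarball=$(dist_tarball "$SPARK_VERSION" "$DIST_NAME")
if [ -f "$tarball" ]; then
    echo "found $tarball"
else
    echo "$tarball not found in $(pwd)"
fi
```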

 

Done!

Reposted from: https://www.cnblogs.com/xc1234/p/9188696.html
