Building a jar with IDEA + Maven + Scala and running it on Spark on YARN
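Configuring the Maven project
In pom.xml, declare the dependencies needed for Spark development. Choose the artifact that matches your Spark version; the artifacts can be looked up in the Maven Central repository.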
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.1</version>
</dependency>
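The `_2.11` suffix in the artifactId is the Scala binary version the artifact was built for, so it should match the Scala version of the project as well as the Spark version. A small sketch for checking this on a Scala 2.x project; the object name `ScalaVersionCheck` and the helper `binaryVersion` are hypothetical names used only for illustration:

```scala
object ScalaVersionCheck {
  // Derive the binary version ("2.11") from a full version string ("2.11.12").
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the Scala library on the classpath, e.g. "2.11.12".
    val full = scala.util.Properties.versionNumberString
    println(s"Scala $full -> artifact suffix _${binaryVersion(full)}")
  }
}
```

If the printed suffix differs from the one in pom.xml, either the Scala SDK or the artifactId probably needs to change.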
Build methods
Building the jar with IDEA Artifacts
Building the jar with Maven
- To build the jar with Maven, just add the following plugin (maven-shade-plugin) to pom.xml:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.1</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <filters>
                    <filter>
                        <artifact>*:*</artifact>
                        <!-- Drop signature files so the merged jar does not fail signature checks -->
                        <excludes>
                            <exclude>META-INF/*.SF</exclude>
                            <exclude>META-INF/*.DSA</exclude>
                            <exclude>META-INF/*.RSA</exclude>
                        </excludes>
                    </filter>
                </filters>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                        <resource>META-INF/spring.handlers</resource>
                    </transformer>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                        <resource>META-INF/spring.schemas</resource>
                    </transformer>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <!-- Main class written to the jar manifest; replace it with your own entry point -->
                        <mainClass>cn.mucang.sensor.SensorMain</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
Example Scala code
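import org.apache.spark.storage.StorageLevel
import org.apache.spark.{SparkConf, SparkContext}

object InfoOutput {
  def main(args: Array[String]): Unit = {
    // Fall back to local mode only when no master was supplied,
    // so that spark-submit --master yarn is honored on the cluster.
    val sparkConf = new SparkConf()
      .setIfMissing("spark.master", "local[*]")
      .setAppName("NginxLog")
    val sc = new SparkContext(sparkConf)

    val fd = sc.textFile("hdfs:///xxx/logs/access.log")
    // Keep only lines mentioning .baidu.com and split them into fields.
    val logRDD = fd.filter(_.contains(".baidu.com")).map(_.split(" "))
    logRDD.persist(StorageLevel.DISK_ONLY)

    // countByValue returns a local Map, so sort by count before taking the top 10.
    val ipTop = logRDD.map(v => v(2)).countByValue().toSeq.sortBy(-_._2).take(10)
    ipTop.foreach(println)

    sc.stop()
  }
}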
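The counting and top-10 step can be tried without a cluster. A minimal sketch on plain Scala collections, assuming space-separated lines with the client IP in field 2 as in the job above; the object `TopIps`, the helper `topN`, and the sample lines are hypothetical and only illustrate the logic:

```scala
object TopIps {
  // Count occurrences of the field at `fieldIndex` in space-separated lines
  // and return the `n` most frequent values, highest count first.
  def topN(lines: Seq[String], fieldIndex: Int, n: Int): Seq[(String, Long)] = {
    val counts: Map[String, Long] = lines
      .map(_.split(" "))
      .filter(_.length > fieldIndex) // skip lines with too few fields
      .groupBy(_(fieldIndex))
      .map { case (key, rows) => key -> rows.size.toLong }
    // A Map has no useful order, so sort explicitly before taking n.
    // Ties are broken by key to keep the result deterministic.
    counts.toSeq.sortBy { case (key, count) => (-count, key) }.take(n)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample lines; the real log layout may differ.
    val sample = Seq(
      "a b 10.0.0.1 www.baidu.com",
      "a b 10.0.0.2 www.baidu.com",
      "a b 10.0.0.1 tieba.baidu.com"
    )
    topN(sample, fieldIndex = 2, n = 10).foreach(println)
  }
}
```

The length filter is an addition for robustness: a line with fewer than three fields would otherwise throw an ArrayIndexOutOfBoundsException when field 2 is read.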
Uploading the jar
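- Use scp to upload the jar to the server where spark-submit runs. The jar is in the project's out directory when it is built with IDEA Artifacts (a Maven build writes it to target instead).
- Because the project does not depend on third-party packages, the resulting jar is small. Submit the job with spark-submit: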
spark-submit --class InfoOutput --verbose --master yarn --deploy-mode cluster nginxlogs.jar