Win10 + IDEA: Creating a Maven Project and Configuring Scala

Contents

1. In IDEA, create a new Project --> Maven --> Next

2. GroupId is usually the company's common name; ArtifactId is the project name --> Next

3. Click Finish

4. Directory structure

5. Extract apache-maven-3.3.9-bin.zip

6. Open settings.xml under conf and change the local repository path

7. In IDEA open File --> Settings --> search for "maven" --> point it at the extracted directory and the modified settings file --> OK --> choose Enable Auto-Import in the pop-up at the bottom right

8. Searching for and adding Maven dependencies

9. Configuring Maven environment variables

10. Open cmd --> mvn -v

11. Configuring Scala

12. Configuring the pom file

13. Testing whether the environment works


1. In IDEA, create a new Project --> Maven --> Next

2. GroupId is usually the company's common name; ArtifactId is the project name --> Next

3. Click Finish

4. Directory structure

.idea holds IDEA's metadata for the working directory; if you want to copy the project to another computer, delete this directory and let IDEA reload the project.

src is the directory where the code is written.

pom.xml declares the project's dependencies, i.e. the jar packages it uses.
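For reference, a freshly generated project looks roughly like this ("ProjectName" here stands for whatever ArtifactId you entered; IDEA may also create a *.iml module file):

    ProjectName/
    ├─ .idea/            IDEA metadata (delete and reimport when moving the project)
    ├─ src/
    │  ├─ main/java/     production sources (a scala folder is added in step 11)
    │  └─ test/java/     test sources
    └─ pom.xml           Maven build file and dependency list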

5. Extract apache-maven-3.3.9-bin.zip

6. Open settings.xml under conf and change the local repository path

Copy line 53 and set your own local repository path; I used <localRepository>D:\maven\repository</localRepository>. Save the file.
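After the change, the relevant part of conf/settings.xml looks roughly like this (the path is simply whatever directory you want Maven to download jars into):

    <!-- the default file only ships a commented-out example,
         <localRepository>/path/to/local/repo</localRepository>;
         add an uncommented element pointing at your own directory -->
    <localRepository>D:\maven\repository</localRepository>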

7. In IDEA open File --> Settings --> search for "maven" --> point it at the extracted directory and the modified settings file --> OK --> choose Enable Auto-Import in the pop-up at the bottom right

8. Searching for and adding Maven dependencies
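A sketch of one common workflow: search for the artifact on https://mvnrepository.com, copy the <dependency> coordinates it shows, and paste them inside the <dependencies> section of pom.xml; with Enable Auto-Import turned on, IDEA then downloads the jar into the local repository configured in step 6. For example, the spark-core dependency used later in this article corresponds to:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>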

9. Configure Maven's environment variables: create a MAVEN_HOME (or M2_HOME) system variable pointing to the extracted apache-maven-3.3.9 directory, and append %MAVEN_HOME%\bin to Path.

10. Open cmd --> mvn -v. If the environment variables are set correctly, this prints the Maven version (Apache Maven 3.3.9), the Maven home directory, and the Java version.

11. Configuring Scala

Under main, create a new scala folder (do the same under test) --> File --> Project Structure

--> Modules --> select the scala directory under main --> click Sources

--> Modules --> select the scala directory under test --> click Tests

--> Libraries --> click + --> Scala SDK --> OK

(The <build> section added in step 12 points Maven at these same scala directories.)

12. Configuring the pom file

Append the following configuration after the existing content of pom.xml:

    <properties>
        <spark.version>2.2.0</spark.version>
        <scala.version>2.11</scala.version>
        <hadoop.version>2.7.3</hadoop.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.39</version>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
    </build>
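One caveat: the <build> section only tells Maven where the Scala sources live; compiling and running them inside IDEA works through the Scala SDK added in step 11. If you also want mvn itself to compile the Scala code (for example mvn package on the command line), a Scala compiler plugin such as scala-maven-plugin would have to be added as well, which is not covered here.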

13. Testing whether the environment works

Create the following object under src/main/scala and run it:

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

import scala.collection.mutable

object RDDQueueStream {
  def main(args: Array[String]): Unit = {
    // Point Spark at a local Hadoop installation (needed for winutils on Windows)
    System.setProperty("hadoop.home.dir", "D:\\temp\\hadoop-2.4.1\\hadoop-2.4.1")
    // Silence the noisy Spark and Jetty logs so only the results are printed
    Logger.getLogger("org.apache.spark").setLevel(Level.ERROR)
    Logger.getLogger("org.eclipse.jetty.server").setLevel(Level.OFF)

    // Local StreamingContext with 2 threads and a 1-second batch interval
    val conf = new SparkConf().setAppName("MyNetworkWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Build a queue of RDDs that will be fed to the stream one batch at a time
    val rddQueue = new mutable.Queue[RDD[Int]]()
    for (i <- 1 to 3) {
      rddQueue += ssc.sparkContext.makeRDD(i to 10)
      Thread.sleep(2000)
    }

    // Create a DStream from the queue
    val inputDStream = ssc.queueStream(rddQueue)

    // Map each element to a (value, value * 2) pair and print every batch
    val result = inputDStream.map(x => (x, x * 2))
    result.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
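If the setup is correct, the console prints a "Time: ..." block roughly every second containing pairs such as (1,2), (2,4), ..., (10,20), which confirms that the Maven dependencies, the Scala SDK and the local Hadoop path are all wired up correctly.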

 
