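This section covers three kinds of custom Source:
SourceFunction, ParallelSourceFunction, and RichParallelSourceFunction.
SourceFunction is non-parallel (it always runs with parallelism 1), ParallelSourceFunction can run as several parallel instances, and RichParallelSourceFunction additionally provides the open/close lifecycle methods and access to the runtime context.
First define a class that extends each of these three types in turn and implements its methods: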
package com.ruozedata.flink
import org.apache.flink.streaming.api.functions.source.SourceFunction
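// Change the parent type in turn: SourceFunction, ParallelSourceFunction, RichParallelSourceFunction
class CustomerSource extends SourceFunction[Long] {
  var count = 0L
  // @volatile because cancel() is called from a different thread than run()
  @volatile var isRunning = true
  override def run(ctx: SourceFunction.SourceContext[Long]): Unit = {
    // Emit an increasing counter once per second until the source is cancelled
    while (isRunning) {
      ctx.collect(count)
      count += 1
      Thread.sleep(1000)
    }
  }
  override def cancel(): Unit = {
    isRunning = false
  }
}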
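The listing above shows only the SourceFunction form. As a rough sketch of what the other two variants might look like, the classes below reuse the same counter logic. The class names CustomerParallelSource and CustomerRichParallelSource are hypothetical (they are not from the original code), and the sketch assumes a Flink 1.x release in which these source types and the open(Configuration) hook are still available.

```scala
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.source.{ParallelSourceFunction, RichParallelSourceFunction, SourceFunction}

// Hypothetical name: same counter as CustomerSource, but allowed to run with parallelism > 1.
// Each parallel instance keeps its own counter, so every value is emitted once per instance.
class CustomerParallelSource extends ParallelSourceFunction[Long] {
  private var count = 0L
  @volatile private var isRunning = true

  override def run(ctx: SourceFunction.SourceContext[Long]): Unit = {
    while (isRunning) {
      ctx.collect(count)
      count += 1
      Thread.sleep(1000)
    }
  }

  override def cancel(): Unit = {
    isRunning = false
  }
}

// Hypothetical name: the "rich" variant adds open/close hooks and the runtime context.
class CustomerRichParallelSource extends RichParallelSourceFunction[Long] {
  private var count = 0L
  @volatile private var isRunning = true

  // Called once per parallel instance before run(); a typical place to open connections.
  override def open(parameters: Configuration): Unit = {
    println("opening source subtask " + getRuntimeContext.getIndexOfThisSubtask)
  }

  override def run(ctx: SourceFunction.SourceContext[Long]): Unit = {
    while (isRunning) {
      ctx.collect(count)
      count += 1
      Thread.sleep(1000)
    }
  }

  override def cancel(): Unit = {
    isRunning = false
  }

  // Called once per parallel instance on shutdown; release resources acquired in open() here.
  override def close(): Unit = {
    println("closing source")
  }
}
```

Either class can be passed to env.addSource in place of CustomerSource. With a parallelism of 2, both instances emit 0, 1, 2, ... independently, so downstream operators see each value twice.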
Calling and testing it:
package com.ruozedata.flink
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time
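object CustomerSourceStreamingApp {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Test each source variant in turn by changing what CustomerSource extends.
    // Note: a plain SourceFunction is non-parallel, so setParallelism(2) is only
    // accepted once CustomerSource extends one of the two parallel variants.
    val text = env.addSource(new CustomerSource).setParallelism(2) // 2 parallel source instances
    val result = text.map(x => {
      println("CustomerSourceStreamingApp received: " + x)
      x
    }).timeWindowAll(Time.seconds(2)) // aggregate the data in each 2-second window
      .sum(0)
    result.print().setParallelism(1) // print with a single sink instance
    env.execute("CustomerSourceApp")
  }
}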
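The difference between the variants shows up as soon as the parallelism is raised. The sketch below tries to illustrate this for the non-parallel case: to my knowledge, Flink rejects setParallelism(2) on a plain SourceFunction with an IllegalArgumentException while the job graph is being built, before execute is ever called. OneShotSource, NonParallelSourceCheck, and rejectsParallelism are hypothetical names introduced only for this illustration, and the exact exception message may differ between Flink versions.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

// Hypothetical minimal non-parallel source: emits a single value and returns.
class OneShotSource extends SourceFunction[Long] {
  override def run(ctx: SourceFunction.SourceContext[Long]): Unit = ctx.collect(0L)
  override def cancel(): Unit = ()
}

object NonParallelSourceCheck {
  // Returns true if Flink refuses a parallelism of 2 for the non-parallel source.
  def rejectsParallelism(): Boolean = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    try {
      env.addSource(new OneShotSource).setParallelism(2)
      false
    } catch {
      // Assumption: the validation failure surfaces as IllegalArgumentException
      case _: IllegalArgumentException => true
    }
  }

  def main(args: Array[String]): Unit = {
    println("non-parallel source rejected parallelism 2: " + rejectsParallelism())
  }
}
```

If the check behaves as assumed, the fix is either to leave a SourceFunction at parallelism 1 or to switch to ParallelSourceFunction or RichParallelSourceFunction.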