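Flink: Defining and Choosing a State Backend | Stateful Programming: Alerting When Two Consecutive Temperature Readings Differ by a Given Threshold | Implementing It with the Built-in API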

GitHub code

https://github.com/SmallScorpion/flink-tutorial.git

State Backends

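  1. For every incoming record, a stateful operator task reads and updates its state.
  2. Efficient state access is critical for low-latency processing, so each parallel task maintains its state locally to keep access fast.
  3. How state is stored, accessed, and maintained is decided by a pluggable component called the state backend.
  4. A state backend is mainly responsible for two things: managing local state, and writing checkpoint state to remote storage.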

Choosing a State Backend


Pom

        <!-- RocksDBStateBackend -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-statebackend-rocksdb_2.11</artifactId>
            <version>1.10.0</version>
        </dependency>
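With the dependency above on the classpath, a backend can also be selected per job in code. The following is a minimal sketch for Flink 1.10, not the repository's code: the object name, the `chooseBackend` helper, and the `file:///tmp/flink-checkpoints` path are hypothetical and only there for illustration. The comments summarize the general behavior of the three backends shipped with Flink 1.10.

```scala
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
import org.apache.flink.runtime.state.StateBackend
import org.apache.flink.runtime.state.filesystem.FsStateBackend
import org.apache.flink.runtime.state.memory.MemoryStateBackend
import org.apache.flink.streaming.api.scala._

object StateBackendChoiceSketch {

  // Hypothetical helper: maps a short name to one of the Flink 1.10 backends.
  def chooseBackend(name: String, checkpointUri: String): StateBackend = name match {
    // Working state on the TaskManager heap, checkpoints kept in JobManager memory.
    case "memory"  => new MemoryStateBackend()
    // Working state on the TaskManager heap, checkpoints written to a file system.
    case "fs"      => new FsStateBackend(checkpointUri)
    // Working state in a local RocksDB instance, checkpoints written to a file system;
    // the second argument enables incremental checkpoints.
    case "rocksdb" => new RocksDBStateBackend(checkpointUri, true)
    case other     => throw new IllegalArgumentException(s"Unknown state backend: $other")
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Assumed local path; in a cluster this would typically be an HDFS or S3 URI.
    env.setStateBackend(chooseBackend("rocksdb", "file:///tmp/flink-checkpoints"))

    // Trivial pipeline so that the sketch is a complete job.
    env.fromElements(1, 2, 3).print()
    env.execute("state backend choice sketch")
  }
}
```

A backend set in code applies only to that job and takes precedence over the cluster-wide default.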

In cluster mode, the state backend can also be set in the configuration file.
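As a rough illustration, the relevant entries in `flink-conf.yaml` could look like the fragment below. The keys are standard Flink 1.10 options; the HDFS address is a made-up placeholder and has to be replaced with a real checkpoint location.

```yaml
# Cluster-wide default backend: jobmanager, filesystem, or rocksdb
state.backend: rocksdb

# Directory for checkpoint data (placeholder URI, an assumption for this example)
state.checkpoints.dir: hdfs://namenode:8020/flink/checkpoints

# Optional: incremental checkpoints, supported by the RocksDB backend
state.backend.incremental: true
```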

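A Small Stateful Application

For each sensor, keep the previous temperature in state and compare it with the incoming reading. If the two temperatures differ by 10.0 or more, emit an alert. The pattern resembles the reduce operator, which also combines the previous value with the current one.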

import com.atguigu.bean.SensorReading
import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

object StateTempChangeAlertTest {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)

    val inputDStream: DataStream[String] = env.socketTextStream("hadoop102", 7777)

    val dataDstream: DataStream[SensorReading] = inputDStream.map(
      data => {
        val dataArray: Array[String] = data.split(",")
        SensorReading(dataArray(0), dataArray(1).toLong, dataArray(2).toDouble)
      })

    val resultDStrem: DataStream[(String, Double, Double)] = dataDstream
      .keyBy("id")
      .flatMap( TempChangeAlert(10.0) )

    dataDstream.print("data")
    resultDStrem.print("result")

    env.execute("stateBackendsApp test job")
  }
}

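/**
 * Compares each reading with the previous temperature of the same key and
 * emits an alert when the two values differ by at least the threshold.
 * @param tpr temperature difference that triggers an alert
 */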
case class TempChangeAlert(tpr: Double) extends RichFlatMapFunction[SensorReading, (String, Double, Double)]{

  var lastTempState: ValueState[Double] = _
  var firstId: ValueState[Boolean] = _

  override def open(parameters: Configuration): Unit = {
    lastTempState = getRuntimeContext
      .getState( new ValueStateDescriptor[Double]( "last_time", classOf[Double]) )

    firstId = getRuntimeContext
      .getState( new ValueStateDescriptor[Boolean]( "first_id", classOf[Boolean]) )
  }

  override def flatMap(value: SensorReading, out: Collector[(String, Double, Double)]): Unit = {

    // Read the previous temperature and the "already seen a reading" flag
    val lastTemp: Double = lastTempState.value()
    val bool: Boolean = firstId.value()
    if(bool == false){
      firstId.update(true)
    }

    // Update the state with the current temperature
    lastTempState.update(value.temperature)

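    // Alert when the absolute difference between the two readings reaches the threshold
    val diff: Double = (value.temperature - lastTemp).abs
    // For the first record of a key the stored temperature defaults to 0.0, so without
    // the flag an alert would almost always fire; only alert once a previous reading exists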
    if( diff >= tpr && bool == true){
      out.collect( (value.id, lastTemp, value.temperature) )
    }

  }
}


Implementing the Demo Above with the Built-in Stateful API (flatMapWithState)

import com.atguigu.bean.SensorReading
import org.apache.flink.streaming.api.scala._

object FlatMapWithStateTest {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)

    val inputDStream: DataStream[String] = env.socketTextStream("hadoop102", 7777)

    val dataDstream: DataStream[SensorReading] = inputDStream.map(
      data => {
        val dataArray: Array[String] = data.split(",")
        SensorReading(dataArray(0), dataArray(1).toLong, dataArray(2).toDouble)
      })

    val resultDStrem: DataStream[(String, Double, Double)] = dataDstream
      .keyBy("id")
      //.flatMap( TempChangeAlert(10.0) )
      .flatMapWithState[(String, Double, Double), Double]({

        case (inputData: SensorReading, None) => (List.empty, Some(inputData.temperature))
        case (inputData: SensorReading, lastTemp: Some[Double]) => {
          val diff = (inputData.temperature - lastTemp.get).abs
          if( diff >= 10.0 ){
            ( List( (inputData.id, lastTemp.get, inputData.temperature) ), Some(inputData.temperature) )
          } else {
            ( List.empty, Some(inputData.temperature) )
          }
        }

      })

    dataDstream.print("data")
    resultDStrem.print("result")

    env.execute("stateBackendsApp test job")
  }
}
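To make the contract of the function passed to flatMapWithState easier to see, here is a plain-Scala sketch with no Flink dependency. It takes the current input and the previous state as an Option, and returns the outputs together with the new state. Everything in it is illustrative: the local `SensorReading` definition (in particular the field name `timestamp`), the `step` and `run` helpers, and the sample readings are assumptions. Keyed state is simulated with a Map from sensor id to last temperature.

```scala
// Local stand-in for com.atguigu.bean.SensorReading; field names are assumed.
final case class SensorReading(id: String, timestamp: Long, temperature: Double)

object TempChangeAlertSketch {
  type Alert = (String, Double, Double)

  // Same shape as the flatMapWithState function:
  // (input, previous state) => (outputs, new state)
  def step(threshold: Double)(in: SensorReading, last: Option[Double]): (List[Alert], Option[Double]) =
    last match {
      // A previous reading exists and the difference reaches the threshold: alert
      case Some(prev) if (in.temperature - prev).abs >= threshold =>
        (List((in.id, prev, in.temperature)), Some(in.temperature))
      // First reading for this key, or a small change: only remember the temperature
      case _ =>
        (Nil, Some(in.temperature))
    }

  // Simulates keyed state: one "last temperature" entry per sensor id.
  def run(threshold: Double, readings: Seq[SensorReading]): List[Alert] = {
    val (alerts, _) = readings.foldLeft((List.empty[Alert], Map.empty[String, Double])) {
      case ((acc, state), r) =>
        val (out, next) = step(threshold)(r, state.get(r.id))
        (acc ++ out, next.fold(state - r.id)(t => state.updated(r.id, t)))
    }
    alerts
  }

  def main(args: Array[String]): Unit = {
    val readings = Seq(
      SensorReading("sensor_1", 1L, 35.8),
      SensorReading("sensor_1", 2L, 36.0),
      SensorReading("sensor_2", 3L, 10.0),
      SensorReading("sensor_1", 4L, 47.5),
      SensorReading("sensor_2", 5L, 15.0)
    )
    run(10.0, readings).foreach(println) // prints (sensor_1,36.0,47.5)
  }
}
```

Because the empty state is represented as None, the first reading of each key never triggers an alert, which is why this version does not need the extra Boolean flag used in the RichFlatMapFunction variant.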
