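JDBC Data Source in Practice
Spark SQL can read data from a relational database (such as MySQL) over JDBC. The data is returned as a DataFrame, so it can be processed conveniently with the DataFrame operators that Spark provides.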
1. Reading and Writing over JDBC
Format for reading JDBC data
Scala version
val jdbcDF = sqlContext.read.format("jdbc")
  .options(Map(
    "url" -> "jdbc:mysql://localhost:3306/unicom",
    "user" -> "root",
    "password" -> "199037",
    "dbtable" -> "dataframe1"
  ))
  .load()
Or
val dataframe_mysql = sqlContext.read.format("jdbc")
  .option("url", "jdbc:mysql://localhost:3306/unicom")
  // Driver class name for MySQL Connector/J 5.x; Connector/J 8.x uses com.mysql.cj.jdbc.Driver
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "dataframe1")
  .option("user", "root")
  .option("password", "199037")
  .load()
Or
// Here `spark` is a SparkSession (Spark 2.x and later)
val connectProperties = new java.util.Properties()
connectProperties.put("user", "username")
connectProperties.put("password", "yourpassword")
val url = "jdbc:mysql://localhost:3306/unicom"
val df = spark.read.jdbc(url, "outDF", connectProperties)
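The read variants above repeat the same URL and credentials for every table. A minimal sketch of one way to factor them out is shown here; `JdbcConfig` and `forTable` are hypothetical names introduced only for illustration, and the connection values are the ones used in this article.

```scala
// Hypothetical helper, not part of the original article.
object JdbcConfig {
  // Connection settings copied from the examples in this article.
  // In real code, load credentials from configuration instead of hard-coding them.
  val baseOptions: Map[String, String] = Map(
    "url" -> "jdbc:mysql://localhost:3306/unicom",
    "user" -> "root",
    "password" -> "199037"
  )

  // Returns the complete option map for reading a single table.
  def forTable(table: String): Map[String, String] =
    baseOptions + ("dbtable" -> table)
}
```

With a `sqlContext` in scope, the map can be passed straight to the reader, for example `sqlContext.read.format("jdbc").options(JdbcConfig.forTable("dataframe1")).load()`.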
Format for saving to JDBC
Scala version
import java.util.Properties
import org.apache.spark.sql.SaveMode
val connectProperties = new Properties()
connectProperties.put("user", "root")
connectProperties.put("password", "199037")
val mysqlDriverUrl = "jdbc:mysql://localhost:3306/unicom?useUnicode=true&characterEncoding=utf-8"
outDF.write.mode(SaveMode.Append).jdbc(mysqlDriverUrl, "outDF", connectProperties)
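The write call takes three inputs: a JDBC URL, a table name, and a `Properties` object. The sketch below shows one possible way to assemble the first and last of these from plain Scala values. `JdbcTarget`, `url`, and `properties` are hypothetical names, and the helper assumes the parameter values need no URL encoding.

```scala
import java.util.Properties

// Hypothetical helper, not part of the original article.
object JdbcTarget {
  // Builds a MySQL JDBC URL and appends query parameters when any are given.
  // Assumption: keys and values contain no characters that need URL encoding.
  def url(host: String, port: Int, database: String,
          params: Seq[(String, String)] = Seq.empty): String = {
    val base = s"jdbc:mysql://$host:$port/$database"
    if (params.isEmpty) base
    else base + params.map { case (k, v) => s"$k=$v" }.mkString("?", "&", "")
  }

  // Copies the given settings into a java.util.Properties object.
  def properties(settings: Map[String, String]): Properties = {
    val props = new Properties()
    settings.foreach { case (k, v) => props.setProperty(k, v) }
    props
  }
}
```

For the values used in this article, `JdbcTarget.url("localhost", 3306, "unicom", Seq("useUnicode" -> "true", "characterEncoding" -> "utf-8"))` yields the same string as `mysqlDriverUrl`, and the result of `JdbcTarget.properties(Map("user" -> "root", "password" -> "199037"))` can be passed as the third argument of `write.jdbc`.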
2. Worked Example
Case: query the information of students with excellent scores
import java.util.Properties
import org.apache.spark.sql.functions._
import org.apache.spark.sql.{SQLContext, SaveMode}
import org.apache.spark.{SparkConf, SparkContext}
/**
 * Created by cuiyufei on 2018/3/8.
 * JDBC in practice
 */
object sparkStudy {
  /**
   * Joins the students' basic information with their scores and finds the excellent students (score greater than 85).
   */
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dataframeOperation").setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // Student basic information
    val stuInfoDF = sqlContext.read.format("jdbc")
      .options(Map(
        "url" -> "jdbc:mysql://localhost:3306/unicom",
        "user" -> "root",
        "password" -> "199037",
        "dbtable" -> "student_info"
      ))
      .load()
    // Student score information
    val stuScoreDF = sqlContext.read.format("jdbc")
      .options(Map(
        "url" -> "jdbc:mysql://localhost:3306/unicom",
        "user" -> "root",
        "password" -> "199037",
        "dbtable" -> "student_score"
      ))
      .load()
    // Join on "name" and keep the students whose score is greater than 85
    val outDF = stuInfoDF.join(stuScoreDF, "name").filter("score > 85")
    // Write the result to MySQL so that it is easy to inspect
    val connectProperties = new Properties()
    connectProperties.put("user", "root")
    connectProperties.put("password", "199037")
    val mysqlDriverUrl = "jdbc:mysql://localhost:3306/unicom?useUnicode=true&characterEncoding=utf-8"
    outDF.write.mode(SaveMode.Append).jdbc(mysqlDriverUrl, "outDF", connectProperties)
  }
}
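To make the join-then-filter step concrete without a database, the same logic can be sketched on plain Scala collections. Everything here is an assumption made for illustration: the case classes, the `age` column, and the sample rows are invented, since the article only names the `name` and `score` columns.

```scala
// In-memory illustration of an inner join on "name" followed by a score filter.
// Hypothetical types and sample data; not taken from the original tables.
object ExcellentStudents {
  case class StudentInfo(name: String, age: Int)
  case class StudentScore(name: String, score: Int)
  case class Result(name: String, age: Int, score: Int)

  // Inner join on name, then keep only rows with score strictly above the threshold.
  def select(infos: Seq[StudentInfo],
             scores: Seq[StudentScore],
             threshold: Int = 85): Seq[Result] =
    for {
      info <- infos
      s <- scores
      if info.name == s.name && s.score > threshold
    } yield Result(info.name, info.age, s.score)

  // Made-up sample rows for illustration only.
  val sampleInfos: Seq[StudentInfo] =
    Seq(StudentInfo("alice", 18), StudentInfo("bob", 19), StudentInfo("carol", 20))
  val sampleScores: Seq[StudentScore] =
    Seq(StudentScore("alice", 90), StudentScore("bob", 85), StudentScore("dave", 99))
}
```

On the sample rows, only `alice` survives: `bob` scores exactly 85, which fails the strict `> 85` test, while `carol` and `dave` each appear in only one of the two inputs and are dropped by the inner join.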
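The student basic information is shown in the figure below:
The student score information is shown in the figure below:
The excellent students, that is, the final result, are shown in the figure below: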