m is the following map:
scala> m
res119: scala.collection.mutable.Map[Any,Any] = Map(A -> 0.11164610291904906, B -> 0.11856755943424617, C -> 0.1023171832681312)
I want to get:
name score
A 0.11164610291904906
B 0.11856755943424617
C 0.1023171832681312
How do I get the final DataFrame?
Convert it to a Seq first, then you can use the toDF() method.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.getOrCreate()
import spark.implicits._
val m = Map("A"-> 0.11164610291904906, "B"-> 0.11856755943424617, "C" -> 0.1023171832681312)
val df = m.toSeq.toDF("name", "score")
df.show
which gives you:
+----+-------------------+
|name| score|
+----+-------------------+
| A|0.11164610291904906|
| B|0.11856755943424617|
| C| 0.1023171832681312|
+----+-------------------+
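Note that in the question m is a mutable Map[Any,Any], and toDF cannot derive a schema from Any. The entries need to be cast to concrete types first. A minimal sketch, assuming every key is really a String and every value a Double (as in the question's output):

```scala
import scala.collection.mutable

object CastExample {
  // The map from the question, deliberately typed as Map[Any, Any]
  val m: mutable.Map[Any, Any] = mutable.Map(
    "A" -> 0.11164610291904906,
    "B" -> 0.11856755943424617,
    "C" -> 0.1023171832681312
  )

  // Cast each entry to (String, Double) so toDF can infer the schema;
  // asInstanceOf will throw at runtime if a value is not actually a Double
  val typed: Seq[(String, Double)] = m.toSeq.map {
    case (k, v) => (k.toString, v.asInstanceOf[Double])
  }

  // Inside a Spark session you would then call:
  //   typed.toDF("name", "score")
}
```

After this cast, the toDF call works exactly as in the Map[String, Double] example above.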
Here is an example with a more complex schema:
import java.sql.Timestamp
import org.apache.spark.sql.types.{StringType, StructType, TimestampType}

import spark.implicits._

val schema: StructType = new StructType()
  .add("item_id", StringType, nullable = false)
  .add(
    "currency",
    new StructType()
      .add("regular_price", StringType)
      .add("sale_price", StringType))
  .add(
    "promotion",
    new StructType()
      .add("promo_key", StringType)
      .add("start_datetime", TimestampType)
      .add("end_datetime", TimestampType)
      .add("promo_price", StringType)
  )
val df1 = Seq(
  (
    "prod102480185",
    ("225.00", null),
    (
      "E16110WN",
      Timestamp.valueOf("2024-11-16 14:00:01.000000000"),
      Timestamp.valueOf("2024-12-08 14:00:04.000000000"),
      "180.00"
    )
  )
).toDF("item_id", "currency", "promotion")
// Re-create the DataFrame from df1's rows with the nested schema applied
val df = df1.sqlContext.createDataFrame(df1.rdd, schema)