How do I convert a Map to a DataFrame?


m is the following Map:

scala> m
res119: scala.collection.mutable.Map[Any,Any] = Map(A-> 0.11164610291904906, B-> 0.11856755943424617, C -> 0.1023171832681312)

I want to get:

name  score
A  0.11164610291904906
B  0.11856755943424617
C  0.1023171832681312

How can I get this as a DataFrame?

scala apache-spark dictionary apache-spark-sql
2 Answers

28 votes

First convert it to a Seq; you can then call toDF() on it.

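import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.getOrCreate()
// Brings toDF() into scope for local Scala collections.
import spark.implicits._

val m = Map("A" -> 0.11164610291904906, "B" -> 0.11856755943424617, "C" -> 0.1023171832681312)
// Each (key, value) pair becomes one row; the arguments name the two columns.
val df = m.toSeq.toDF("name", "score")
df.show()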

This gives you:

+----+-------------------+
|name|              score|
+----+-------------------+
|   A|0.11164610291904906|
|   B|0.11856755943424617|
|   C| 0.1023171832681312|
+----+-------------------+
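
Note that the map in the question is typed scala.collection.mutable.Map[Any,Any], and Spark has no encoder for Any, so toDF() cannot be called on m.toSeq directly. A minimal sketch of one way around this, assuming every key really is a String and every value a Double (an assumption based only on the printed contents), is to narrow the types in plain Scala first:

```scala
import scala.collection.mutable

// The map as it appears in the question: keys and values are typed Any.
val m: mutable.Map[Any, Any] = mutable.Map(
  "A" -> 0.11164610291904906,
  "B" -> 0.11856755943424617,
  "C" -> 0.1023171832681312
)

// Keep only the entries that are (String, Double); anything else is dropped.
// Sorting by name makes the row order deterministic, since map order is not.
val rows: Seq[(String, Double)] = m.toSeq.collect {
  case (name: String, score: Double) => (name, score)
}.sortBy(_._1)

// With a SparkSession in scope and `import spark.implicits._`:
// val df = rows.toDF("name", "score")
```

Entries that do not match the pattern are silently skipped here; if that would hide bad data, add a default case that throws instead.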

0 votes

Here is an example with a more complex schema:

import java.sql.Timestamp
import org.apache.spark.sql.types.{StringType, StructType, TimestampType}
import spark.implicits._

// Target schema with two nested structs.
val schema: StructType = new StructType()
  .add("item_id", StringType, nullable = false)
  .add(
    "currency",
    new StructType()
      .add("regular_price", StringType)
      .add("sale_price", StringType)
  )
  .add(
    "promotion",
    new StructType()
      .add("promo_key", StringType)
      .add("start_datetime", TimestampType)
      .add("end_datetime", TimestampType)
      .add("promo_price", StringType)
  )

// Nested tuples become struct columns. The null is typed as String so that
// an encoder can be derived for the tuple.
val df1 = Seq(
  (
    "prod102480185",
    ("225.00", null: String),
    (
      "E16110WN",
      Timestamp.valueOf("2024-11-16 14:00:01.000000000"),
      Timestamp.valueOf("2024-12-08 14:00:04.000000000"),
      "180.00"
    )
  )
).toDF("item_id", "currency", "promotion")

// Rebuild the DataFrame from the rows of df1 so that the struct fields
// take the names declared in schema.
val df = spark.createDataFrame(df1.rdd, schema)