Background
Data format
val factories = """
{
  "cities": {
    "name": "Sao Paulo",
    "areas": [
      {
        "code": "41939",
        "type": "downtown"
      },
      {
        "code": "48294",
        "type": "residential"
      }
    ]
  },
  "domains": [
    {
      "id": "19sk2nfb",
      "name": "defense"
    }
  ]
}
"""
Code
This fetches data from a Delta table and creates case class objects. fetchedData is a DataFrame retrieved using some conditions, and factoriesSchema is the JSON schema.
val structuredData =
  fetchedData.withColumn(
    "StructuredFactoryJson",
    from_json(col("FactoryData"), factoriesSchema)
  )
val factories = structuredData.collect().map { row =>
  val structJson = row.getAs[Row]("StructuredFactoryJson")
  val citiesRow = structJson.getAs[Row]("cities")
  val city = City(
    citiesRow.getAs[String]("name"),
    citiesRow
      .getAs[Seq[Row]]("areas")
      .map(areaRow =>
        Area(
          areaRow.getAs[String]("type"),
          areaRow.getAs[String]("code")
        )
      )
  )
  val domains = structJson
    .getAs[Seq[Row]]("domains")
    .map( area =>
      Area( area.getAs
  .
  .
  .
}
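Question
This works well, and I get a Seq. But the question is whether there is a way to get a List instead of a Seq and still build the larger object as is.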
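One possible direction, offered as a minimal sketch rather than a confirmed answer: any Scala Seq can be converted with .toList after the .map step, so the case class can declare a List field. The sketch below uses plain Scala so it runs without Spark. The Area and City definitions are hypothetical (the question does not show the real ones), the argument order (type, then code) follows the snippet in the question, and rawAreas is an assumed stand-in for the result of citiesRow.getAs[Seq[Row]]("areas").

```scala
// Hypothetical case classes; the real definitions are not shown in the question.
final case class Area(areaType: String, code: String)
final case class City(name: String, areas: List[Area])

object SeqToListSketch {
  // Assumed stand-in for citiesRow.getAs[Seq[Row]]("areas"):
  // each element is a (type, code) pair instead of a Spark Row.
  val rawAreas: Seq[(String, String)] = Seq(
    ("downtown", "41939"),
    ("residential", "48294")
  )

  // Map each element to an Area as before, then convert the Seq to a List.
  val areas: List[Area] =
    rawAreas.map { case (areaType, code) => Area(areaType, code) }.toList

  // The larger object is built exactly as before, now holding a List.
  val city: City = City("Sao Paulo", areas)

  def main(args: Array[String]): Unit = println(city)
}
```

In the Spark code from the question, the same idea would presumably mean appending .toList after the .map(...) call on the getAs[Seq[Row]]("areas") result, leaving the rest of the construction unchanged.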