In the code snippet below, I am reading a JSON file whose structure is similar to this:
{ "c7254865-87b5-4d34-a7bd-6ba6c9dbab14": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "36c18403-1707-48c4-8f19-3b2e705007d4": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "34a71a88-ae2d-4304-a1db-01c54fc6e4d8": "72119c87-7fce-4e17-9770-fcfab04328f5"}
Each line contains a single key-value pair, which should then be added to a map in Scala. This is the Scala code I am using for that:
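import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FSDataInputStream, Path}
import scala.collection.mutable
import scala.io.Source
import scala.util.parsing.json.JSON

val fs = org.apache.hadoop.fs.FileSystem.get(new Configuration())
def readFile(location: String): mutable.HashMap[String, String] = {
  val path: Path = new Path(location)
  val dataInputStream: FSDataInputStream = fs.open(path)
  val m = new mutable.HashMap[String, String]()
  // Parse each line as a JSON object and merge its entries into the map
  for (line <- Source.fromInputStream(dataInputStream).getLines) {
    val parsed: Option[Any] = JSON.parseFull(line)
    m ++= parsed.get.asInstanceOf[Map[String, String]]
  }
  m
}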
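Surely there must be a more elegant way to do this in Scala. In particular, it should be possible to get rid of the mutable map and ingest the lines directly into a map through a stream. How can this be done?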
val r: Map[String, String] = Source.fromInputStream(dataInputStream).getLines
  .map(line => JSON.parseFull(line).get)
  .flatMap { case m: Map[String, String] => m.map { case (k, v) => k -> v } }
  .toMap
Keep in mind that JSON (you mean scala.util.parsing.json.JSON, right?) is itself marked @deprecated as of Scala 2.11.
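One thing the pipeline above leaves open is closing the stream. Since getLines is lazy, the map has to be fully built before the source is closed. Below is a minimal sketch of that ordering, assuming a plain java.io.InputStream in place of the Hadoop FSDataInputStream. The names readAll, parseLine and keyValue are hypothetical, and the key=value splitter is only a stand-in for the JSON parsing step:

```scala
import java.io.{ByteArrayInputStream, InputStream}
import scala.io.Source

object CloseAfterRead {
  // readAll and parseLine are illustrative names, not part of any library.
  def readAll(in: InputStream)(parseLine: String => Iterable[(String, String)]): Map[String, String] = {
    val source = Source.fromInputStream(in, "UTF-8")
    try {
      // toMap consumes the lazy iterator, so every line is read before close() runs
      source.getLines().flatMap(parseLine).toMap
    } finally {
      source.close()
    }
  }

  // Stand-in for the JSON step: splits "key=value" lines and ignores anything else.
  def keyValue(line: String): Iterable[(String, String)] =
    line.split("=", 2) match {
      case Array(k, v) => List(k -> v)
      case _           => Nil
    }

  def main(args: Array[String]): Unit = {
    val in = new ByteArrayInputStream("a=1\nb=2\nnot a pair\n".getBytes("UTF-8"))
    println(readAll(in)(keyValue))
  }
}
```

In the question's setting, the stream returned by fs.open(path) would presumably be passed as in, with the JSON-based function as parseLine.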
Edit: following the suggestions from @SergGr and @Dima, this can be simplified further to
val r: Map[String, String] = Source.fromInputStream(dataInputStream).getLines
  .flatMap(line => JSON.parseFull(line))
  .collect { case m: Map[String, String] => m }
  .flatten.toMap
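This last revision also handles unexpected JSON more gracefully (for example, if an array is passed in): parseFull returns None for a line it cannot parse, and collect skips any parsed value that is not a map.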
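One caveat, as far as I can tell: the pattern `case m: Map[String, String]` is not checked at runtime because of type erasure, so a line whose value is not a string (say, a number) would still match and only fail later, when that value is used as a String. Here is a hedged sketch of a stricter variant that matches on Map[_, _] and keeps only String pairs. The names linesToMap, stringPairs and fakeParse are hypothetical, and fakeParse merely stands in for JSON.parseFull so that the snippet runs without the deprecated package:

```scala
object SafeCollect {
  // Keep only the entries whose key and value are both Strings.
  private def stringPairs(m: Map[_, _]): List[(String, String)] =
    (m.toList: List[(Any, Any)]).collect { case (k: String, v: String) => k -> v }

  // `parse` is a placeholder for JSON.parseFull (String => Option[Any]).
  def linesToMap(lines: Iterator[String], parse: String => Option[Any]): Map[String, String] =
    lines.flatMap { line =>
      parse(line) match {
        case Some(m: Map[_, _]) => stringPairs(m) // a JSON object: keep its String pairs
        case _                  => Nil            // unparseable line, array, etc.: skip
      }
    }.toMap

  // Hypothetical stand-in parser, for demonstration only.
  def fakeParse(line: String): Option[Any] = line match {
    case "object" => Some(Map("k1" -> "v1"))
    case "array"  => Some(List("x", "y"))
    case "mixed"  => Some(Map("k2" -> 1.0, "k3" -> "v3"))
    case _        => None
  }

  def main(args: Array[String]): Unit = {
    val result = linesToMap(Iterator("object", "array", "mixed", "garbage"), fakeParse)
    println(result)
  }
}
```

Whether silently dropping non-string values is the right behaviour depends on the data. Failing fast may be preferable if such lines indicate a corrupt file.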
val json = scala.io.Source.fromString("""
{ "c7254865-87b5-4d34-a7bd-6ba6c9dbab14": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "36c18403-1707-48c4-8f19-3b2e705007d4": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "34a71a88-ae2d-4304-a1db-01c54fc6e4d8": "72119c87-7fce-4e17-9770-fcfab04328f5"}
""")
Split each line, then map each entry of the resulting Array to a key and a value, and then convert to a Map. This returns a scala.collection.immutable.Map[String,String]:
scala> json.getLines.filter(_.trim.nonEmpty).map(x => x.split(":")).map(x => x(0) -> x(1)).toMap
res35: scala.collection.immutable.Map[String,String] = Map(
{ "c7254865-87b5-4d34-a7bd-6ba6c9dbab14" -> " "72119c87-7fce-4e17-9770-fcfab04328f5"}",
{ "36c18403-1707-48c4-8f19-3b2e705007d4" -> " "72119c87-7fce-4e17-9770-fcfab04328f5"}",
{ "34a71a88-ae2d-4304-a1db-01c54fc6e4d8" -> " "72119c87-7fce-4e17-9770-fcfab04328f5"}")
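Note that the keys and values in the output above still carry the braces and quotation marks from the original lines. Below is a rough sketch of the same split-based idea with that punctuation stripped. It assumes every line holds exactly one flat "key": "value" pair with no colons or escaped quotes inside the strings, so it is not a general JSON parser. The names SplitLines, clean and parse are made up for illustration:

```scala
import scala.io.Source

object SplitLines {
  // Strip surrounding whitespace, one brace, and one pair of double quotes.
  private def clean(s: String): String =
    s.trim.stripPrefix("{").stripSuffix("}").trim.stripPrefix("\"").stripSuffix("\"")

  def parse(text: String): Map[String, String] =
    Source.fromString(text).getLines()
      .map(_.trim)
      .filter(_.nonEmpty)                                    // skip blank lines
      .map(_.split(":", 2))                                  // split on the first colon only
      .collect { case Array(k, v) => clean(k) -> clean(v) }  // ignore lines without a colon
      .toMap

  def main(args: Array[String]): Unit = {
    val text = """
{ "c7254865-87b5-4d34-a7bd-6ba6c9dbab14": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "36c18403-1707-48c4-8f19-3b2e705007d4": "72119c87-7fce-4e17-9770-fcfab04328f5"}
{ "34a71a88-ae2d-4304-a1db-01c54fc6e4d8": "72119c87-7fce-4e17-9770-fcfab04328f5"}
"""
    parse(text).foreach { case (k, v) => println(s"$k -> $v") }
  }
}
```

For anything beyond this fixed one-pair-per-line shape, a real JSON parser is probably the safer choice.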