How do I concatenate column values in Spark SQL?

Question

I have the following table:

+-------+---------+---------+
|movieId|movieName|    genre|
+-------+---------+---------+
|      1| example1|   action|
|      1| example1| thriller|
|      1| example1|  romance|
|      2| example2|fantastic|
|      2| example2|   action|
+-------+---------+---------+

What I am trying to do is concatenate the genre values of rows that share the same ID and name. Like this:

+-------+---------+-----------------------+
|movieId|movieName|genre                  |
+-------+---------+-----------------------+
|      1| example1|action|thriller|romance|
|      2| example2|action|fantastic       |
+-------+---------+-----------------------+

Answer 1

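Use groupBy with collect_list to gather the genres of all rows that share the same movie ID and name into a list. Then use concat_ws to join that list into a single string (if the order matters, apply sort_array first). A small example with the given sample DataFrame: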

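import org.apache.spark.sql.functions.{collect_list, concat_ws, sort_array}

// The $"..." column syntax also needs: import spark.implicits._
val df2 = df.groupBy("movieId", "movieName")
  .agg(collect_list($"genre").as("genre"))                    // one list of genres per movie
  .withColumn("genre", concat_ws("|", sort_array($"genre")))  // sort, then join with "|"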

This gives:

+-------+---------+-----------------------+
|movieId|movieName|genre                  |
+-------+---------+-----------------------+
|1      |example1 |action|romance|thriller|
|2      |example2 |action|fantastic       |
+-------+---------+-----------------------+
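
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and runs the same aggregation twice, once through the DataFrame API and once as a Spark SQL query. The app name, the local master setting, the view name `movies`, and the value names `viaApi` and `viaSql` are my own placeholders, not something from the question. In spark-shell a `spark` session already exists, so the builder lines can be skipped there.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{collect_list, concat_ws, sort_array}

// Local session for experimentation; app name and master are placeholders.
val spark = SparkSession.builder()
  .appName("append-genres")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._ // enables toDF and the $"..." column syntax

// Sample data copied from the question.
val df = Seq(
  (1, "example1", "action"),
  (1, "example1", "thriller"),
  (1, "example1", "romance"),
  (2, "example2", "fantastic"),
  (2, "example2", "action")
).toDF("movieId", "movieName", "genre")

// DataFrame API: collect the genres per movie, sort them, join with "|".
val viaApi = df.groupBy("movieId", "movieName")
  .agg(concat_ws("|", sort_array(collect_list($"genre"))).as("genre"))

// Spark SQL: the same aggregation written as a query against a temp view.
df.createOrReplaceTempView("movies")
val viaSql = spark.sql(
  """SELECT movieId, movieName,
    |       concat_ws('|', sort_array(collect_list(genre))) AS genre
    |FROM movies
    |GROUP BY movieId, movieName""".stripMargin)

viaApi.orderBy("movieId").show(false)
viaSql.orderBy("movieId").show(false)
```

With sort_array the genres come out alphabetically, so movie 1 becomes `action|romance|thriller`. If you drop sort_array, keep in mind that collect_list makes no ordering guarantee after a shuffle, so the order inside each string may differ between runs. If duplicate genres should appear only once, collect_set can be used in place of collect_list.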
