I would like to know whether this is possible. For example, I have this table:
new_feed_dt | regex_to_apply | expr_to_apply
053021 | _(\d+) | date_format(to_date(new_feed_dt, 'yyyyMMdd'), 'yyyy-MM-dd')
053022 | _(\d+) | date_format(to_date(new_feed_dt, 'yyyyMMdd'), 'yyyy-MM-dd')
053023 | _(\d+) | date_format(to_date(new_feed_dt, 'yyyyMMdd'), 'yyyy-MM-dd')
053024 | [a-zA-Z]+(\d+) | date_format(to_date(new_feed_dt, 'MMddyyyy'), 'yyyy-MM-dd')
053025 | DT(\d+) | date_format(to_date(new_feed_dt, 'MMddyy'), 'yyyy-MM-dd')
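I need to apply the expression stored in the expr_to_apply column to the new_feed_dt column, so that each row is converted with its own expression: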
df_with_regex.withColumn(
    'new_feed_dt',
    f.expr("expr_to_apply")
)
How would you do this?
Thanks in advance.
Try this:
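from pyspark.sql import functions as f

# Note: every iteration overwrites 'new_feed_dt_transformed', so the
# expression from the last collected row ends up applied to all rows.
for row in df_with_regex.collect():
    df_with_regex = df_with_regex.withColumn(
        'new_feed_dt_transformed',
        f.expr(row['expr_to_apply'])
    )
df_with_regex.show(truncate=False)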
# +-----------+--------------+-----------------------------------------------------------+-----------------------+
# |new_feed_dt|regex_to_apply|expr_to_apply                                              |new_feed_dt_transformed|
# +-----------+--------------+-----------------------------------------------------------+-----------------------+
# |053021     |_(\d+)        |date_format(to_date(new_feed_dt, 'yyyyMMdd'), 'yyyy-MM-dd')|2021-05-30             |
# |053022     |_(\d+)        |date_format(to_date(new_feed_dt, 'yyyyMMdd'), 'yyyy-MM-dd')|2022-05-30             |
# |053023     |_(\d+)        |date_format(to_date(new_feed_dt, 'yyyyMMdd'), 'yyyy-MM-dd')|2023-05-30             |
# |053024     |[a-zA-Z]+(\d+)|date_format(to_date(new_feed_dt, 'MMddyyyy'), 'yyyy-MM-dd')|2024-05-30             |
# |053025     |DT(\d+)       |date_format(to_date(new_feed_dt, 'MMddyy'), 'yyyy-MM-dd')  |2025-05-30             |
# +-----------+--------------+-----------------------------------------------------------+-----------------------+
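If each row really has to be evaluated with its own expression, one possible variation is to fold the distinct expression strings into a single CASE expression and pass that to f.expr once. The following is only a sketch: build_case_expr and apply_row_expressions are hypothetical helper names, and it assumes the number of distinct expressions is small enough to collect to the driver and that Spark's default string-literal escaping is in effect. Rows whose value does not match their own date pattern may come back as null or raise an error, depending on the Spark version and parser settings.

```python
def build_case_expr(exprs, selector_col="expr_to_apply"):
    """Build one Spark SQL CASE expression with a branch per expression string."""
    if not exprs:
        raise ValueError("at least one expression is required")
    branches = []
    for e in exprs:
        # Escape backslashes and single quotes so the expression text can be
        # embedded as a SQL string literal (assumes Spark's default escaping).
        literal = e.replace("\\", "\\\\").replace("'", "\\'")
        # The branch fires only for rows whose selector column holds this text.
        branches.append(f"WHEN {selector_col} = '{literal}' THEN {e}")
    return "CASE " + " ".join(branches) + " END"


def apply_row_expressions(df, selector_col="expr_to_apply",
                          target_col="new_feed_dt_transformed"):
    """Add target_col, evaluating for each row the expression named in selector_col."""
    # Imported lazily so build_case_expr can be tried without Spark installed.
    from pyspark.sql import functions as f

    # Assumption: the distinct expressions fit comfortably on the driver.
    rows = df.select(selector_col).distinct().collect()
    exprs = [r[selector_col] for r in rows if r[selector_col] is not None]
    return df.withColumn(target_col, f.expr(build_case_expr(exprs, selector_col)))
```

Usage would then be df_with_regex = apply_row_expressions(df_with_regex), followed by df_with_regex.show(truncate=False). Unlike the loop, this builds a single column expression, so no row's expression overwrites another's.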