I'll try to illustrate the problem with an example.
So this:
Column1,Column2,Column3,Column4
Unique,Entry2,Entry3,Entry4
Unique,Entry2,Entry3,Entry4
Unique,Entry2,Entry3,Entry4
would become this:
Column1,Column2,Column3,Column4
Unique-test,Entry2,Entry3,Entry4
Unique-test,Entry2,Entry3,Entry4
Unique-test,Entry2,Entry3,Entry4
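for all of the 25,000+ files in my directory.
Is there a bash script that can accomplish this?
* EDIT: The CSVs later turned out to be comma-delimited with every field enclosed in double quotes ("). That change is addressed below.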
find . -type f -name '*.csv' -exec sed -e '2,$s/,/-test,/' -i {} +
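The find command builds command lines as long as the system permits, and sed -i makes the change in place: from line 2 to the end of each file, it replaces the first comma (,) with -test,.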
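Before running this against 25,000 files, it can be tried on a throwaway directory. The following is a minimal sketch, assuming GNU sed (BSD/macOS sed expects an argument after -i); the directory and the sample files a.csv and b.csv are made up for the demonstration.

```shell
# Build a scratch directory with two small sample files (hypothetical names).
tmpdir=$(mktemp -d)
printf '%s\n' 'Column1,Column2' 'Unique,Entry2' 'Unique,Entry2' > "$tmpdir/a.csv"
printf '%s\n' 'Column1,Column2' 'Other,Entry2' > "$tmpdir/b.csv"

# Same command as above, pointed at the scratch directory instead of "."
find "$tmpdir" -type f -name '*.csv' -exec sed -e '2,$s/,/-test,/' -i {} +

# Inspect the result: headers should be unchanged, data rows suffixed.
cat "$tmpdir/a.csv" "$tmpdir/b.csv"
```

Because GNU sed -i processes each file separately, the 2,$ address restarts for every file, so each header line is left alone even when find passes many files to a single sed invocation.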
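For fields enclosed in double quotes, match the closing quote together with the comma:
find . -type f -name '*.csv' -exec sed -e '2,$s/",/-test",/' -i {} +
To handle single-quoted, double-quoted and unquoted fields with one command:
find . -type f -name '*.csv' -exec sed -e '2,$s/\(["\o47]\?,\)/-test\1/' -i {} +
This last command works on CSV files such as: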
'Column1','Column2','Column3','Column4'
'Unique','Entry2','Entry3','Entry4'
'Unique','Entry2','Entry3','Entry4'

"Column1","Column2","Column3","Column4"
"Unique","Entry2","Entry3","Entry4"
"Unique","Entry2","Entry3","Entry4"

Column1,Column2,Column3,Column4
Unique,Entry2,Entry3,Entry4
Unique,Entry2,Entry3,Entry4
and produces:
'Column1','Column2','Column3','Column4'
'Unique-test','Entry2','Entry3','Entry4'
'Unique-test','Entry2','Entry3','Entry4'

"Column1","Column2","Column3","Column4"
"Unique-test","Entry2","Entry3","Entry4"
"Unique-test","Entry2","Entry3","Entry4"

Column1,Column2,Column3,Column4
Unique-test,Entry2,Entry3,Entry4
Unique-test,Entry2,Entry3,Entry4
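The quote-agnostic expression can be checked on standard input without touching any file. This is a small sketch assuming GNU sed, since \? and \o47 (the octal escape for a single quote) are GNU extensions; add_suffix is a hypothetical helper name used only for this demonstration.

```shell
# Wrap the substitution in a function that filters stdin to stdout.
# \o47 stands for a single quote, which avoids breaking the shell quoting.
add_suffix() {
    sed -e '2,$s/\(["\o47]\?,\)/-test\1/'
}

# One header line plus one data line for each quoting style.
printf '%s\n' "'Column1','Column2'" "'Unique','Entry2'" | add_suffix
printf '%s\n' '"Column1","Column2"' '"Unique","Entry2"' | add_suffix
printf '%s\n' 'Column1,Column2' 'Unique,Entry2' | add_suffix
```

The optional quote is captured together with the comma and put back through \1, so -test always lands inside the closing quote when there is one.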
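For variety, here is an awk approach (-i inplace requires GNU awk 4.1 or later). FNR>1 skips the header line of each file, and the trailing 1 prints every line.
A possible solution:
find /some/path -type f -name '*.csv' -exec awk -i inplace -v FS=, -v OFS=, 'FNR>1{$1=$1"-test"}1' '{}' +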
Update: a simple fix for quoted columns such as "Unique","Entry2","Entry3","Entry4" is to make the closing quote part of the field separator:
find /some/path -type f -name '*.csv' -exec awk -i inplace -v FS='",' -v OFS='",' 'FNR>1{$1=$1"-test"}1' '{}' +
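Where gawk's inplace extension is not available, the same idea can be approximated with plain awk and a temporary file. This is only a sketch under that assumption, for unquoted fields; add_suffix_awk and the sample file are hypothetical names, not part of the answer above.

```shell
# Rewrite one file through a temp file; plain POSIX awk, no inplace extension.
add_suffix_awk() {
    file=$1
    tmp=$(mktemp) || return 1
    # NR > 1 leaves the header untouched; the trailing 1 prints every line.
    # Copying back with cat keeps the original file's permissions.
    awk -v FS=, -v OFS=, 'NR > 1 { $1 = $1 "-test" } 1' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Try it on a scratch file first.
demo_dir=$(mktemp -d)
printf '%s\n' 'Column1,Column2' 'Unique,Entry2' > "$demo_dir/a.csv"
add_suffix_awk "$demo_dir/a.csv"
cat "$demo_dir/a.csv"
```

Since the helper handles one file per call, it would have to be invoked once per file (for example from a loop over the output of find), which is slower than the batched -exec ... + form.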