Using a blank line as the separator in awk

Question (votes: 0, answers: 5)

I need to know whether awk can use a newline as the field separator so that multiple lines are merged into a single line.

For example:

$ cat yo  
a aa aaa  
bb bbb bbb  
cccc ccccc cccc  
ddd dddd ddd  
eeeee eeeee eee  

fffff ffffff fffffff  
gggg ggggg  
hhhhhh hhhhhhh hhhhhhhhh  

iii iiiiiiiii iiiii  
jjjj jjjjj jjjjj  
kkkkk kkkkk  
lllllllll lll ll  

Here are a few things my small brain came up with, none of which helped.

cat file |awk -F'\n' '{print}'    
cat yo |awk 'NF'  '{print $NF}'  


cat yo |awk -F'/^$/d'  '{print $NF}'  
cat yo |awk -F'^$^[ \t]*$' '{print $NF}'  

cat yo |awk -F'^..' '{print $NF}'  

cat yo |awk -F'\t' '{print}'  

Desired output:

a aa aaa  bb bbb bbb cccc ccccc cccc ddd dddd ddd eeeee eeeee eee  
fffff ffffff fffffff gggg ggggg hhhhhh hhhhhhh hhhhhhhhh  
iii iiiiiiiii iiiii jjjj jjjjj jjjjj kkkkk kkkkk lllllllll lll ll  
linux unix awk sed
5 Answers
4 votes

The way to do this in any awk is simply:

$ awk -v RS= '$1=$1' file
a aa aaa bb bbb bbb cccc ccccc cccc ddd dddd ddd eeeee eeeee eee
fffff ffffff fffffff gggg ggggg hhhhhh hhhhhhh hhhhhhhhh
iii iiiiiiiii iiiii jjjj jjjjj jjjjj kkkkk kkkkk lllllllll lll ll
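
Why the one-liner above works: with RS set to the empty string, awk reads each blank-line-separated block as one record and treats newlines as field separators, and assigning $1 to itself makes awk rebuild $0 from its fields joined by OFS (a single space by default). The sketch below is only an illustration; the helper name join_blocks and the sample data are made up for it. It writes the rebuild as an action followed by an always-true pattern, because the bare `$1=$1` pattern would skip a block whose first field is 0.

```shell
# join_blocks is a hypothetical helper name, not part of awk.
# RS= (empty) selects paragraph mode: records are blank-line-separated blocks
# and a newline always separates fields, in addition to FS.
join_blocks() {
  # $1 = $1 makes awk rebuild $0 from the fields joined by OFS (one space);
  # the trailing 1 is an always-true pattern that prints the rebuilt record.
  awk -v RS= '{ $1 = $1 } 1' "$@"
}

# Throwaway sample input: two blocks separated by one blank line.
sample=$(mktemp)
printf 'a aa aaa  \nbb bbb bbb  \n\n0 fffff\ngggg ggggg\n' > "$sample"

joined=$(join_blocks "$sample")
printf '%s\n' "$joined"
rm -f "$sample"
```

With this sample the second block starts with 0 and is still printed, as "0 fffff gggg ggggg".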

3 votes

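You can set the record separator with RS=, which makes awk read the input paragraph by paragraph: every blank-line-separated block is one record, and newlines act as field separators alongside the usual whitespace: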

$ awk -v RS= '{for (i=1; i<=NF; i++) printf "%s%s", $i, (i==NF?"\n":" ")}' file
a aa aaa bb bbb bbb cccc ccccc cccc ddd dddd ddd eeeee eeeee eee 
fffff ffffff fffffff gggg ggggg hhhhhh hhhhhhh hhhhhhhhh 
iii iiiiiiiii iiiii jjjj jjjjj jjjjj kkkkk kkkkk lllllllll lll ll

This is effectively the same as:

awk -v RS= '{for (i=1; i<=NF; i++) printf "%s%s", $i, (i==NF?ORS:FS)}' file

2 votes

You probably want to redefine the record separator rather than the field separator.

Quoting man awk:

Records
   Normally, records are separated by newline characters.  You can control 
   how records are separated by assigning values to the built-in variable
   RS.  If RS is any single character, that character separates records.  
   Otherwise, RS is a regular expression.  Text in the input that  matches
   this  regular expression separates the record.  However, in compatibility 
   mode, only the first character of its string value is used for separating 
   records.  If RS is set to the null string, then records are separated by 
   blank lines.  When RS is set to the null string,  the  newline character 
   always acts as a field separator, in addition to whatever value FS may have.

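So try this, which prints the number of fields in each block (note that gawk treats a multi-character RS as a regular expression, but that is not portable to every awk):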

awk 'BEGIN{RS="\n\n";}{print NF;}'
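
To see what the quoted passage means in practice, here is a small sketch that uses the portable null-string form RS="" instead of "\n\n". The sample data is invented for the illustration, and its two blocks are deliberately separated by two blank lines to show that paragraph mode treats a run of blank lines as a single separator.

```shell
# Throwaway sample: two blocks separated by TWO blank lines.
sample=$(mktemp)
printf 'a b\nc d e\n\n\nf g\n' > "$sample"

# Paragraph mode (RS=""): a run of blank lines counts as one separator,
# and newlines inside a block split fields just like spaces do.
# For each record, print its number and its field count.
counts=$(awk 'BEGIN { RS = "" } { print NR ": " NF " fields" }' "$sample")
printf '%s\n' "$counts"
rm -f "$sample"
```

This should report 5 fields for the first block and 2 for the second, with no empty record in between.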

1 vote

If there is no requirement to use awk, this can be done with a plain shell loop that reads the file line by line:

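while read -r line ; do
  # A blank line ends the current block; otherwise append the line plus one
  # space (read strips trailing blanks, so echo -n would glue the words together).
  if [[ -z $line ]]; then echo ;  else printf '%s ' "$line";  fi
done <file
echo  # end the last line (assumes the file does not end with a blank line)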

1 vote

Using awk's gsub function:

$ awk -v RS="" '{gsub(/\n/,"")}1' file
a aa aaa  bb bbb bbb  cccc ccccc cccc  ddd dddd ddd  eeeee eeeee eee  
fffff ffffff fffffff  gggg ggggg  hhhhhh hhhhhhh hhhhhhhhh  
iii iiiiiiiii iiiii  jjjj jjjjj jjjjj  kkkkk kkkkk  lllllllll lll ll 
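
The double spaces in the output above are presumably there because gsub only deletes the newlines and the sample lines happen to end in spaces; on input without trailing spaces the last word of one line would run into the first word of the next. A minimal sketch of a variant that does not depend on this, using invented sample data, replaces each newline together with any blanks around it by a single space:

```shell
sample=$(mktemp)
# No trailing spaces here, unlike the file in the question.
printf 'a aa\nbb bbb\n\nccc\ndd\n' > "$sample"

# Turn each newline, plus any spaces or tabs around it, into one space.
merged=$(awk -v RS= '{ gsub(/[ \t]*\n[ \t]*/, " ") } 1' "$sample")
printf '%s\n' "$merged"
rm -f "$sample"
```

Unlike the `$1=$1` approach, this leaves spacing inside each original line untouched and only normalizes the joins.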