Remove all spaces in a line, but not the spaces between double quotes

Question (votes: 0, answers: 4)

Example:

Input =

This is an example text with    some      spaces. 
This should be 2nd line.
However the spaces between "quotes    should not    change".
last line.

Output =

Thisisanexampletextwithsomespaces. 
Thisshouldbe2ndline.
Howeverthespacesbetween"quotes    should not    change".
lastline.
regex perl sed awk pattern-matching
4 answers
5
votes
awk '
    BEGIN {FS = OFS = "\""}
    /^[[:blank:]]*$/ {next}
    {for (i=1; i<=NF; i+=2) gsub(/[[:space:]]/,"",$i)} 
    1
' file
Thisisanexampletextwithsomespaces.
Thisshouldbe2ndline.
Howeverthespacesbetween"quotes    should not    change".
lastline.
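
For readers who want to see the field-splitting idea outside awk, here is a minimal Python sketch of the same approach. It assumes quotes are balanced within each line, as in the sample input. The names `strip_outside_quotes` and `process` are illustrative and do not come from the answer.

```python
import re

def strip_outside_quotes(line: str) -> str:
    """Remove whitespace from the parts of `line` that lie outside double quotes."""
    fields = line.split('"')
    # Even-indexed fields (0, 2, 4, ...) lie outside quotes; odd-indexed ones lie inside.
    # This mirrors awk's 1-based loop over fields 1, 3, 5, ...
    for i in range(0, len(fields), 2):
        fields[i] = re.sub(r"\s", "", fields[i])
    return '"'.join(fields)

def process(text: str) -> str:
    """Apply strip_outside_quotes to every line, skipping blank lines as the awk script does."""
    kept = [strip_outside_quotes(line) for line in text.splitlines() if line.strip()]
    return "\n".join(kept)
```

Splitting on the quote character means no regex has to understand quoting at all: the position of a field in the split result already says whether it is inside or outside quotes.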

3
votes

GNU sed example:

$ sed -r 's/(\".*\")|\s*/\1/g' file
Thisisanexampletextwithsomespaces.
Thisshouldbe2ndline.
Howeverthespacesbetween"quotes    should not    change".
lastline.
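
The alternation trick (match a quoted string and put it back, or match whitespace and drop it) can also be sketched in Python. This is an illustration rather than a translation of the sed command: it uses `[^"]*` instead of `.*` so that several quoted strings on one line are each handled separately, and it is meant to be applied to one line at a time. The names `PATTERN` and `strip_outside_quotes` are made up for this example.

```python
import re

# Either a complete quoted string (captured as group 1) or a run of whitespace (no capture).
PATTERN = re.compile(r'("[^"]*")|\s+')

def strip_outside_quotes(line: str) -> str:
    """Keep quoted strings verbatim and delete whitespace everywhere else."""
    # For a quoted match, group 1 is put back unchanged.
    # For a whitespace match, group 1 is None, so the match is replaced by nothing.
    return PATTERN.sub(lambda m: m.group(1) or "", line)
```

Because the quoted alternative is tried first at every position, whitespace inside a quoted string is consumed as part of that string and never reaches the `\s+` branch.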

2
votes

This can be done with perl:

perl -pe 's{^\s*\n$}{}; s/ +(?=(([^"]+"){2})*[^"]*$)//g' file

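This removes every blank line (a line containing only zero or more whitespace characters) and strips the spaces that are not between double quotes. The lookahead succeeds only when an even number of double quotes follows the space, which means the space sits outside any quoted string.

Live demo: http://ideone.com/xizPNI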
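
To make the lookahead easier to follow, here is a small Python sketch of the same idea: a run of spaces is deleted only when the rest of the line contains an even number of double quotes. It assumes balanced quotes on each line and is not a literal port of the perl one-liner: the pattern uses `[^"]*` rather than `[^"]+` so that adjacent quotes such as `""` are also counted. The names `PATTERN` and `strip_outside_quotes` are illustrative.

```python
import re

# One or more spaces, followed (lookahead only) by zero or more *pairs* of quotes
# and then no further quote up to the end of the line.
PATTERN = re.compile(r' +(?=(?:[^"]*"[^"]*")*[^"]*$)')

def strip_outside_quotes(line: str) -> str:
    """Delete runs of spaces that have an even number of quotes to their right."""
    return PATTERN.sub("", line)
```

A space inside a quoted string has an odd number of quotes to its right (at least the closing one), so the lookahead fails there and the space is kept.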


0
votes

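Ouch!

I was worried that a quoted string might span more than one line. Although the question does not say so, this seems like a common occurrence. I added quotes that span multiple lines. Here is the new input file: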

This is an example text with    some      spaces. 
This should be 2nd line.
However the spaces between "quotes    should not    change".
last line.

Here is an addition, I was concerned about (startquote) "double quotes
that spanned
more than one
line." (endquote)  A reqular expression (startquote) "might not be the right
tool for this job. (OMG MISSING QUOTE)

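The previous solutions do not account for double quotes that span multiple lines. So I read the whole file at once instead of line by line, split it into a C-style character array, and track a $insideQuotes boolean. While $insideQuotes is true, spaces are left untouched; otherwise they are removed. Here is the code.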

#!/usr/bin/perl
use strict;
use warnings;

my $insideQuotes = 0;
undef $/;                            # slurp mode: read the entire file at once, since quotes may span newlines
my $output = '';
while (my $text = <>) {              # in slurp mode the loop runs exactly once per file
  for my $char (split //, $text) {   # split the file into a C-style character array
    if ($char eq '"') {              # current character is a quote, flip the boolean
      $insideQuotes = !$insideQuotes;
    }
    elsif (!$insideQuotes && $char eq ' ') {
      next;                          # a space outside quotes: drop it
    }
    $output .= $char;                # keep every other character, including "0"
  }
}
print $output;                       # done, print output

The output looks like this:

$ perl remove.spaces.except.in.dblquotes.pl remove.spaces.except.in.dblquotes.txt

Thisisanexampletextwithsomespaces.
Thisshouldbe2ndline.
Howeverthespacesbetween"quotes    should not    change".
lastline.

Hereisanaddition,Iwasconcernedabout(startquote)"double quotes
that spanned
more than one
line."(endquote)Areqularexpression(startquote)"might not be the right
tool for this job. (OMG MISSING QUOTE)