Print each word and its number of occurrences using pure `bash`

Question

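I've included my code below. I want to print each word and its number of occurrences without using external tools such as wc, awk, tr, etc.

I can count the total number of words, but there is a problem here as well: the output does not show the correct total word count; it is lower than it should be.

What should I do?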

#!/bin/bash
#v=1

echo -n "ENTER FILE NAME: "
read file
IFS=$'\n'
cnew_line=`echo -e "\n"`
cspace=`echo  " "`

if [ $# -ne 0 ] 
then

echo "You didn't entered a filename as a parameter"
exit

elif [ $# -eq 0 ] 
then
filename="$file"

num_line=0
num_word=0
num_char=0

while read -n1  w
do
if [ "$w" = "$cnew_line" ]
then
(( num_line++ ))
elif [ "$w" = "$cspace" ]
then

(( num_word++ ))

else
(( num_char++ ))
fi
done < "$filename"


echo "Line Number = $num_line"
echo "Word Number = $num_word"
echo "Character Number =$num_char"

fi


Answer 1

You can use an associative array to count the words, something like this:

$ cat foo.sh
#!/bin/bash

declare -A words

while read line
do
    for word in $line
    do
        ((words[$word]++))
    done
done

for i in "${!words[@]}"
do
    echo "$i:" "${words[$i]}"
done

Testing it:

$ echo this is a test is this | bash foo.sh
is: 2
this: 2
a: 1
test: 1

This answer was built almost entirely from these great answers: this and this. Don't forget to upvote them.
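On the total asked about in the question: the loop there increments `num_word` only when it reads a space, so the last word on each line seems never to be counted, which would explain the low figure. The following is a minimal sketch, not the original answer's code, that extends the associative-array idea with a running total. The function name `count_words` and the `total:` output line are assumptions made for illustration, and the sketch is not hardened against every possible key (a lone `@` or `*` may misbehave on older bash versions).

```shell
#!/bin/bash
# Sketch only: count_words is a hypothetical helper, not part of the answer above.
# Reads text on stdin; prints "word: count" lines, then "total: N".
count_words() {
    local -A counts=()
    local -a fields=()
    local line word total=0
    # "|| [[ -n $line ]]" keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # Split the line on whitespace into an array. Unlike an unquoted
        # $line, this does not expand glob characters during splitting.
        read -r -a fields <<< "$line"
        for word in "${fields[@]}"; do
            counts[$word]=$(( ${counts[$word]:-0} + 1 ))
            total=$(( total + 1 ))
        done
    done
    # Iteration order of an associative array is unspecified.
    for word in "${!counts[@]}"; do
        printf '%s: %d\n' "$word" "${counts[$word]}"
    done
    printf 'total: %d\n' "$total"
}
```

To use it on a file, redirect the file into the function, for example `count_words < "$filename"`. Counting words as they are split, rather than counting separators, means the last word on a line is included.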

Answer 2

Two improved versions of James Brown's answer (handling the punctuation around a word and breaking up double-quoted and single-quoted groups):

Punctuation is treated as part of the word:

#!/bin/bash
declare -A words
while read line ; do
    for word in ${line} ; do
        ((words[${word@Q}]++))
    done
done
for i in ${!words[@]} ; do
    echo ${i}: ${words[$i]}
done

Punctuation is not part of the word (as with wc):

#!/bin/bash
declare -A words
while read line ; do
    line="${line//[[:punct:]]}"
    for word in ${line} ; do
        ((words[${word}]++))
    done
done
for i in ${!words[@]} ; do
    echo ${i}: ${words[$i]}
done

Tested with text containing tricky quoting:

fortune -m "swear" | bash foo.sh
man bash | ./foo.sh | sort -gr -k2 | head
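The two variants in this answer rely on two parameter expansions: `${line//[[:punct:]]}`, which deletes every punctuation character, and `${word@Q}`, which shell-quotes a word so that quote characters inside it cannot break the array subscript. The sketch below isolates each expansion so its effect can be checked on its own. The helper names `strip_punct` and `quote_word` are hypothetical, and `${word@Q}` needs bash 4.4 or newer.

```shell
#!/bin/bash
# Sketch only: strip_punct and quote_word are hypothetical helper names.

# Delete every punctuation character, as the second variant does per line.
strip_punct() {
    local s=$1
    printf '%s\n' "${s//[[:punct:]]}"
}

# Shell-quote a word, as the first variant does for each array key.
# Requires bash 4.4 or newer for the @Q transformation.
quote_word() {
    local w=$1
    printf '%s\n' "${w@Q}"
}
```

Calling `strip_punct "don't stop, \"really\"!"` prints `dont stop really`. This also shows a side effect of the second variant: apostrophes inside words disappear, so `don't` is counted as `dont`. The output of `quote_word` is a quoted form that evaluates back to the original word, which is why the first variant prints its keys with quotes around them.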
