I have a CSV file containing 12 numbers, one for each of the 12 months. An example of the file:
$ cat data.csv
"3","5","6","5","4","6","7","6","4","4","3","3",
I want to plot the months on the x-axis as "January, February, March", and so on.
I found this script, but I don't know how to supply the months:
for FILE in data.csv; do
gnuplot -p << EOF
set datafile separator ","
set xlabel "xlabel"
set ylabel "ylabel"
set title "graphTitle"
plot "$FILE" using $xcolumn:$ycolumn
EOF
done
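The expected output is a plot with the months on the x-axis and the data from the CSV file on the y-axis. Note that the CSV file contains only the numbers, not the months. That is why I'm asking what the best way to do this is, without typing the months into the CSV by hand or looping over an array. Is there a gnuplot function that can add dates and format them?
Thank you.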
If you don't mind typing the month names, I think this is the simplest approach. For clarity, the data is shown inline rather than read from a file.
$DATA << EOD
"3","5","6","5","4","6","7","6","4","4","3","3",
EOD
set datafile sep comma
set xrange [0:13]
unset key
array Month[12] = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
plot for [N=1:12] $DATA using (N):(column(N)):xticlabel(Month[N]) with impulse lw 5
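If you don't want to type the month names, the following should generate equivalent names. "%b" produces the abbreviated month names used above; "%B" produces the full month names.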
Month(i) = strftime("%b", i * 3600.*24.*28.)
plot for [N=1:12] $DATA using (N):(column(N)):xticlabel(Month(N)) with impulse lw 5
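The multiplier of 28 days works because, for N from 1 to 12, N * 28 days after the Unix epoch always falls inside month N of 1970. A small shell sketch to check that arithmetic outside gnuplot; it assumes GNU date (the -d "@SECONDS" form), and month_abbr is a hypothetical helper name, not part of the answer:

```shell
#!/bin/sh
# Sketch: confirm that i * 28 days after the Unix epoch falls in month i.
# Assumes GNU date; month_abbr is a hypothetical helper.
month_abbr() {
    # $1 = month number 1..12
    LC_ALL=C date -u -d "@$(( $1 * 28 * 24 * 3600 ))" +%b
}

i=1
while [ "$i" -le 12 ]; do
    printf '%2d -> %s\n' "$i" "$(month_abbr "$i")"
    i=$((i + 1))
done
```

Replacing +%b with +%B in the helper prints the full month names instead.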
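If you don't want to use the loop syntax, there is a way to read the CSV file as a 1x12 matrix. Also, for long month names you can use gnuplot's strftime function with the format specifier "%B".
Here is the gnuplot script.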
set key noautotitle
set datafile separator comma
set yrange [0:10]
set xrange [-1:12]
set xtics rotate by -45
set grid xtics
# This function generates the names "January", "February", ...
# from the integer value 0, 1, ...
#
monthname(i) = strftime("%B",strptime("%m",sprintf("%i",i+1)))
# `matrix every ...` specifier tells to read the data as a 1x12 matrix.
#
plot "data.csv" matrix every :::0:11:0 using 1:3:xtic(monthname($1)) with linespoints pt 7
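To see why the script uses columns 1 and 3, it may help to list what a matrix read exposes: column 1 is the column index, column 2 the row index, and column 3 the value. The following shell sketch prints those triples for the sample line; matrix_triples is a hypothetical helper, and the quotes are stripped only for readability:

```shell
#!/bin/sh
# Sketch: list the (column index, row index, value) triples that a 1x12
# matrix read exposes as columns 1, 2 and 3 in "using".
# matrix_triples is a hypothetical helper name.
matrix_triples() {
    awk -F',' '{
        for (c = 1; c <= NF; c++) {
            v = $c
            gsub(/"/, "", v)                       # strip the CSV quotes
            if (v != "") print c - 1, NR - 1, v    # skip the empty field after the trailing comma
        }
    }' "$1"
}

tmp=$(mktemp)
printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' > "$tmp"
matrix_triples "$tmp"
rm -f "$tmp"
```

The column index runs from 0 to 11, which is why monthname() adds 1 before converting it to a month.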
Update: after looking at the OP's post and code again, I'm guessing the desired format is as follows:
January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",
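If that is the case, we can use the same solution (below) and pipe the final result through tr to turn the data back into a single-line, multi-column dataset, e.g.: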
$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv) | grep -v '^ $' | tr ' \n' ':,'
January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",
And updating the OP's code:
datfile=$(mktemp)
for FILE in data.csv
do
paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < "${FILE}") | grep -v '^ $' | tr ' \n' ':,' > "${datfile}"
gnuplot -p <<-EOF
set datafile separator ","
set xlabel "xlabel"
set ylabel "ylabel"
set title "graphTitle"
plot "${datfile}" using $xcolumn:$ycolumn
EOF
done
'rm' -rf "${datfile}" > /dev/null 2>&1
It looks like gnuplot can accept data in a variety of formats, including the following:
January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"
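Note: if the OP determines that this is not an acceptable file format, I'm sure we can come up with something else; the question just needs to be updated with an example of a valid file format that shows the months and the numbers.
So if we can generate this dataset on the fly, we can feed it to gnuplot ...
First we let locale generate the months for us: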
$ locale mon
January;February;March;April;May;June;July;August;September;October;November;December
Next we can transpose the single-line, multi-column datasets into multi-line, single-column datasets:
$ locale mon | tr ';' '\n'
January
February
March
April
May
June
July
August
September
October
November
December
$ tr ',' '\n' < data.csv
"3"
"5"
"6"
"5"
"4"
"6"
"7"
"6"
"4"
"4"
"3"
"3"
From here we can paste the two datasets together, using a space as the column delimiter:
$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv)
January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"
The last step is to write this to a (tmp) file, e.g.:
$ datfile=$(mktemp)
$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv) | grep -v '^ $' > "${datfile}"
$ cat "${datfile}"
January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"
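Note: the grep -v '^ $' is there to remove the extra line caused by the trailing comma (,) in data.csv.
From here "${datfile}" can be fed to gnuplot as needed and deleted once it is no longer needed, e.g.: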
$ gnuplot ... "${datfile}" ...
$ 'rm' -rf "${datfile}" > /dev/null 2>&1
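One possible variation on the cleanup step is to let a trap remove the temporary files, so they are deleted even if an intermediate command fails. This is only a sketch: build_datfile is a hypothetical helper, plain files replace the process substitutions so it runs under a plain sh, and LC_ALL=C is assumed to give English month names:

```shell
#!/bin/sh
# Sketch: same paste-based steps, but with a trap so the temporary
# directory is removed even if a later command fails.
# build_datfile is a hypothetical helper name.
build_datfile() {
    # $1 = CSV input, $2 = output file, $3 = scratch directory
    LC_ALL=C locale mon | tr ';' '\n' > "$3/months.txt"
    tr ',' '\n' < "$1" | grep -v '^$' > "$3/values.txt"   # drop the empty line after the trailing comma
    paste -d' ' "$3/months.txt" "$3/values.txt" > "$2"
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' > "$workdir/data.csv"
build_datfile "$workdir/data.csv" "$workdir/plot.dat" "$workdir"
cat "$workdir/plot.dat"
# "$workdir/plot.dat" could be handed to gnuplot here, before the script exits
```

Because the values are filtered before paste runs, the blank thirteenth line never appears and no grep on the combined output is needed.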
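Another solution. Because you have a trailing comma and gnuplot expects a number after it, you will get the warning
warning: matrix contains missing or undefined values
which you can ignore. For the same reason, you should limit the x maximum to a value smaller than 12.
For your case, replace $Data with your filename 'data.csv'. You may want to set another locale (check help locale) to get the month names in another language (it should work similarly to weekdays).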
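The effect of the trailing comma can be checked from the shell: splitting the sample line on commas gives 13 fields, the last of them empty. A minimal sketch, where count_fields is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: show that the trailing comma adds a 13th, empty field.
# count_fields is a hypothetical helper name.
line='"3","5","6","5","4","6","7","6","4","4","3","3",'

count_fields() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

count_fields "$line"
```

With matrix indices starting at 0, the twelve real values sit at x = 0 to 11 and the empty field at x = 12, which is why an x maximum below 12 keeps it off the plot.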
Script:
### plot monthly data
reset session
$Data <<EOD
"3","5","6","5","4","6","7","6","4","4","3","3",
EOD
set datafile separator comma
set boxwidth 0.8
set style fill solid 0.5
set yrange [0:10]
set xrange [-0.9:11.9]
MonthName(i) = strftime("%b",i*3600*24*31) # get month name, use %B for full name
plot $Data matrix u 1:3:xtic(MonthName($1)) w boxes title "my data"
### end of script
An awk solution built around the same logic as the paste answer, but eliminating some of the subprocesses (e.g., grep, multiple tr's) ...
awk -F'[;,]' ' # input field delimiters are ";" and ","
BEGIN { OFS=":" ; ORS="," } # set output field delimiter as ":" and output record delimiter as ","
FNR==NR { for (i=1 ; i<=NF ; i++) # loop through fields from first file ...
month[i]=$(i) # store in our month[] array
next # skip to next input line
}
{ for (i=1 ; i< NF ; i++) # loop through fields from second file ...
print month[i],$(i) # print month and current field
}
' <(locale mon) data.csv
This generates:
January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",
Putting it into the OP's code:
datfile=$(mktemp)
for FILE in data.csv
do
awk -F'[;,]' 'BEGIN{OFS=":";ORS=","} FNR==NR {for (i=1;i<=NF;i++) mon[i]=$(i); next} {for (i=1;i<NF;i++) print mon[i],$(i)}' <(locale mon) "${FILE}" > "${datfile}"
gnuplot -p <<-EOF
set datafile separator ","
set xlabel "xlabel"
set ylabel "ylabel"
set title "graphTitle"
plot "${datfile}" using $xcolumn:$ycolumn
EOF
done
'rm' -rf "${datfile}" > /dev/null 2>&1
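If a two-column layout (one month and one value per line) turns out to be easier to plot, the same awk logic can be adapted. The sketch below is a variation, not the answer's own code: months_and_values is a hypothetical helper, the month names are read from standard input instead of a process substitution, the quotes are stripped, and LC_ALL=C is assumed to give English month names:

```shell
#!/bin/sh
# Sketch: awk variant that writes one "Month value" pair per line and strips the quotes.
# months_and_values is a hypothetical helper name.
months_and_values() {
    LC_ALL=C locale mon | awk -F'[;,]' '
        FNR == NR { for (i = 1; i <= NF; i++) month[i] = $i; next }   # first input (stdin): month names
        {
            for (i = 1; i < NF; i++) {    # i < NF skips the empty field after the trailing comma
                v = $i
                gsub(/"/, "", v)
                print month[i], v
            }
        }
    ' - "$1"
}

tmp=$(mktemp)
printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' > "$tmp"
months_and_values "$tmp"
rm -f "$tmp"
```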