Months in gnuplot

Question (votes: 0, answers: 5)

I have a CSV file containing 12 numbers, one for each of the 12 months. A sample of the file:

$ cat data.csv
"3","5","6","5","4","6","7","6","4","4","3","3",

I want to plot the months on the x-axis, labelled "January, February, March, ...".

I found this script, but I don't know how to get the months into it:

for FILE in data.csv; do
 gnuplot -p << EOF
 set datafile separator ","
 set xlabel "xlabel"
 set ylabel "ylabel"
 set title "graphTitle"
 plot "$FILE" using $xcolumn:$ycolumn
EOF
done

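The expected output is a plot with the months on the x-axis and the data from the CSV file on the y-axis. Note that the CSV file contains no months, only numbers. That is why I am asking what the best way to achieve this is without typing the months into the CSV by hand or looping over an array. Is there a gnuplot function that can add dates and format them?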

Thank you.

bash awk gnuplot paste
5 answers

4 votes

If you don't mind typing in the month names, I think this is the simplest approach. For clarity, the data is shown inline rather than read from a file.

$DATA << EOD
"3","5","6","5","4","6","7","6","4","4","3","3",
EOD

set datafile sep comma
set xrange [0:13]
unset key

array Month[12] = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

plot for [N=1:12] $DATA using (N):(column(N)):xticlabel(Month[N]) with impulse lw 5

[Image: resulting plot]

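If you would rather not type in the month names, the following should generate equivalent ones. "%b" generates the abbreviated month names used above; "%B" generates the full month names.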

Month(i) = strftime("%b", i * 3600.*24.*28.)
plot for [N=1:12] $DATA using (N):(column(N)):xticlabel(Month(N)) with impulse lw 5
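
The factor 3600.*24.*28. is a stride of 28 days expressed in seconds. Counting from the Unix epoch (1 January 1970, which is where gnuplot 5 starts counting time, as far as I know), the N-th multiple falls inside the N-th month for N = 1 to 12, so each call returns a different month name. A minimal shell sketch to check this outside gnuplot, assuming GNU date (the -d @SECONDS form is not portable to BSD or macOS); month_abbr is a hypothetical helper name:

```shell
# month_abbr N: abbreviated month name for the instant N * 28 days after
# the Unix epoch. Hypothetical helper that mirrors the gnuplot expression
# strftime("%b", N * 3600 * 24 * 28).
# Assumes GNU date; LC_ALL=C pins the names to English.
month_abbr() {
    LC_ALL=C date -u -d "@$(( $1 * 28 * 86400 ))" +%b
}

# Print the twelve labels on one line.
n=1
while [ "$n" -le 12 ]; do
    printf '%s ' "$(month_abbr "$n")"
    n=$(( n + 1 ))
done
printf '\n'
```

The stride only has to be short enough never to skip a month and long enough never to repeat one within the first year, which 28 days satisfies for indices 1 to 12.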

4 votes

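If you don't want to use the loop syntax, there is a way to read the CSV file as a 1x12 matrix. Also, for long month names, you can use gnuplot's strftime function with the format specifier "%B".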

Here is the gnuplot script.

set key noautotitle
set datafile separator comma
set yrange [0:10]
set xrange [-1:12]
set xtics rotate by -45
set grid xtics

# This function generates the names "January", "February", ... 
#                from the integer value 0, 1, ...
#
monthname(i) = strftime("%B",strptime("%m",sprintf("%i",i+1)))

# `matrix every ...` specifier tells to read the data as a 1x12 matrix.
#
plot "data.csv" matrix every :::0:11:0 using 1:3:xtic(monthname($1)) with linespoints pt 7 

[Image: resulting plot]

3 votes

Update: after reviewing the OP's post and code, I am guessing the desired format is as follows:

January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",

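If that is the case, we can use the same solution (below) and pipe the final result through tr to turn the data back into a single-line, multi-column dataset, e.g.: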

$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv) | grep -v '^ $' | tr ' \n' ':,'
January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",

And updating the OP's code:

datfile=$(mktemp)
for FILE in data.csv
do
    paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv) | grep -v '^ $' | tr ' \n' ':,' > "${datfile}"

    gnuplot -p <<-EOF
    set datafile separator ","
    set xlabel "xlabel"
    set ylabel "ylabel"
    set title "graphTitle"
    plot "${datfile}" using $xcolumn:$ycolumn
EOF
done
'rm' -rf "${datfile}" > /dev/null 2>&1

It looks like gnuplot can accept data in a variety of formats, including the following:

January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"

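NOTE: if the OP determines that this is not an acceptable file format, I'm sure we can come up with something else; the question just needs to be updated with an example of a valid file format showing the months and the numbers.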

So if we can generate this dataset on the fly, we can feed it to gnuplot.

First we let locale generate the months for us:

$ locale mon
January;February;March;April;May;June;July;August;September;October;November;December

Next we can transpose each single-line, multi-column dataset into a multi-line, single-column dataset:

$ locale mon | tr ';' '\n'
January
February
March
April
May
June
July
August
September
October
November
December

$ tr ',' '\n' < data.csv
"3"
"5"
"6"
"5"
"4"
"6"
"7"
"6"
"4"
"4"
"3"
"3"

From here we can paste the two datasets together, using a space as the column delimiter:

$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv)
January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"

The last step is to write this to a (tmp) file, e.g.:

$ datfile=$(mktemp)
$ paste -d" " <(locale mon | tr ';' '\n') <(tr ',' '\n' < data.csv) | grep -v '^ $' > "${datfile}"
$ cat "${datfile}"
January "3"
February "5"
March "6"
April "5"
May "4"
June "6"
July "7"
August "6"
September "4"
October "4"
November "3"
December "3"

Note: grep -v '^ $' removes the extra line produced by the trailing comma (,) in data.csv.

From here, "${datfile}" can be fed to gnuplot as needed and deleted once it is no longer needed, e.g.:

$ gnuplot ... "${datfile}" ...
$ 'rm' -rf "${datfile}" > /dev/null 2>&1
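
The steps above could be wrapped up roughly as follows. This is a minimal sketch under stated assumptions, not the answer's own code: write_month_table is a hypothetical helper, the month names are hard-coded in English instead of coming from locale mon (so the result does not depend on the system locale), the quotes are stripped from the values, and a trap takes care of deleting the temp file. The gnuplot call is left as a comment, and using 0:2:xtic(1) is only one possible way to plot a label/value file.

```shell
# Hard-coded English month names (assumption: used instead of `locale mon`).
months='January February March April May June July August September October November December'

# write_month_table CSV_LINE OUTFILE   (hypothetical helper)
# Pairs each month with the matching value, one "Month value" row per line.
write_month_table() {
    monfile=$(mktemp)
    printf '%s\n' $months > "$monfile"      # unquoted on purpose: one name per line
    printf '%s\n' "$1" | tr -d '"' | tr ',' '\n' | grep -v '^$' |
        paste -d ' ' "$monfile" - > "$2"    # grep drops the empty field after the last comma
    rm -f "$monfile"
}

datfile=$(mktemp)
trap 'rm -f "$datfile"' EXIT                # clean up even if a later step fails

write_month_table '"3","5","6","5","4","6","7","6","4","4","3","3",' "$datfile"
cat "$datfile"

# With gnuplot installed, the file could then be plotted, for example:
#   gnuplot -p -e "plot '$datfile' using 0:2:xtic(1) with linespoints notitle"
```

Using trap ... EXIT instead of a final rm means the temp file is also removed when the script stops early.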

3 votes

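Another solution. Because you have a trailing comma and gnuplot expects a number after it, you will get the warning "warning: matrix contains missing or undefined values", which you can ignore. For the same reason, you should limit the x maximum to a value smaller than 12. In your case, replace $Data with your filename 'data.csv'. You might want to set a different locale (check help locale) to get the month names in another language (it should work similarly to weekdays).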

Script:

### plot monthly data
reset session

$Data <<EOD
"3","5","6","5","4","6","7","6","4","4","3","3",
EOD

set datafile separator comma
set boxwidth 0.8
set style fill solid 0.5
set yrange [0:10]
set xrange [-0.9:11.9]

MonthName(i) = strftime("%b",i*3600*24*31)   # get month name, use %B for full name

plot $Data matrix u 1:3:xtic(MonthName($1)) w boxes title "my data"
### end of script

Result:

[Image: resulting plot]
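
If the warning is unwanted, one option is to drop the trailing comma before gnuplot reads the data. A minimal sketch, assuming a POSIX sed; strip_trailing_comma is a hypothetical helper name and is not part of the script above:

```shell
# strip_trailing_comma: remove one trailing comma (plus any whitespace
# after it) from every line read on stdin.
strip_trailing_comma() {
    sed 's/,[[:space:]]*$//'
}

printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' | strip_trailing_comma
```

On systems where gnuplot can read from a pipe, the same sed expression could be used directly in the plot command, for example plot "< sed 's/,[[:space:]]*$//' data.csv" matrix ..., so that the file on disk stays untouched.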


1 vote

An awk solution built around the same logic as the paste answer, but eliminating some of the subprocesses (e.g., grep, multiple tr's):

awk -F'[;,]' '                             # input field delimiters are ";" and ","
BEGIN   { OFS=":" ; ORS="," }              # set output field delimiter as ":" and output record delimiter as ","
FNR==NR { for (i=1 ; i<=NF ; i++)          # loop through fields from first file ...
          month[i]=$(i)                    # store in our month[] array
          next                             # skip to next input line
        }
        { for (i=1 ; i< NF ; i++)          # loop through fields from second file ...
          print month[i],$(i)              # print month and current field
        }
' <(locale mon) data.csv

This generates:

January:"3",February:"5",March:"6",April:"5",May:"4",June:"6",July:"7",August:"6",September:"4",October:"4",November:"3",December:"3",

Pulling this into the OP's code:

datfile=$(mktemp)
for FILE in data.csv
do
    awk -F'[;,]' 'BEGIN{OFS=":";ORS=","} FNR==NR {for (i=1;i<=NF;i++) mon[i]=$(i); next} {for (i=1;i<NF;i++) print mon[i],$(i)}' <(locale mon) data.csv > "${datfile}"

    gnuplot -p <<-EOF
    set datafile separator ","
    set xlabel "xlabel"
    set ylabel "ylabel"
    set title "graphTitle"
    plot "${datfile}" using $xcolumn:$ycolumn
EOF
done
'rm' -rf "${datfile}" > /dev/null 2>&1
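
A variant of the same awk idea that writes one "Month value" row per line, which matches the two-column layout shown in the paste answer. This is a sketch under stated assumptions: the month names are hard-coded in English rather than read from locale mon, the quotes around the values are removed, and csv_to_month_rows is a hypothetical helper name.

```shell
# csv_to_month_rows: read CSV lines of up to 12 quoted values on stdin and
# print "Month value" rows, skipping the empty field after a trailing comma.
csv_to_month_rows() {
    awk -F',' '
        BEGIN {
            split("January February March April May June July August September October November December", month, " ")
        }
        {
            for (i = 1; i <= 12 && i <= NF; i++) {
                value = $i
                gsub(/"/, "", value)              # drop the surrounding quotes
                if (value != "") print month[i], value
            }
        }
    '
}

printf '%s\n' '"3","5","6","5","4","6","7","6","4","4","3","3",' | csv_to_month_rows
```

The output can be redirected to "${datfile}" in the same way as in the loop above.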