How do I get the last week of every past year from a Python DataFrame?

Question (votes: 1, answers: 2)

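I have the following daily stock price data from 2010/12 to 2017/12. How can I select the data for the last week of each year? I want to examine the performance during the last week of every year.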

2017-01-05   52.99  13018070.0   52.370   53.0600   51.4000
2017-01-04   52.86  12556860.0   50.770   53.3400   50.7300
2017-01-03   50.29  15794400.0   48.800   50.3000   48.4700
2016-12-30   46.75  13593420.0   48.365   48.4000   46.3600
2016-12-29   47.77  11728250.0   48.440   48.8600   47.1800
2016-12-28   48.51  14636340.0   50.580   50.7300   48.4700
2016-12-27   50.43   5594876.0   49.690   50.5500   49.6500
2016-12-23   49.59   6966559.0   49.250   49.7200   48.9900
2016-12-22   49.44  10918300.0   50.320   50.5500   49.1711
2016-12-21   50.34   9279635.0   49.820   50.4400   49.6700
2016-12-20   49.53   9533020.0   48.990   49.7900   48.9100
2016-12-19   48.55  10323930.0   47.450   48.6700   47.4300
...
2010-12-20 ...
python pandas dataframe
2 Answers
2
votes

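You can use groupby and pass it the year of the datetime column. First, though, we need to filter out the rows that do not meet your criteria. Also make sure your dates are actually stored as datetimes.

This code keeps the rows whose month is December (12) and whose day is greater than or equal to 25 (i.e. the last 7 calendar days of each year). If you want the last calendar week of the year instead, have a look at Wen's lambda function.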

data = '''\
2017-12-25   52.99  13018070.0   52.370   53.0600   51.4000
2017-01-04   52.86  12556860.0   50.770   53.3400   50.7300
2017-01-03   50.29  15794400.0   48.800   50.3000   48.4700
2016-12-30   46.75  13593420.0   48.365   48.4000   46.3600
2016-12-29   47.77  11728250.0   48.440   48.8600   47.1800
2016-12-28   48.51  14636340.0   50.580   50.7300   48.4700
2016-12-27   50.43   5594876.0   49.690   50.5500   49.6500
2016-12-23   49.59   6966559.0   49.250   49.7200   48.9900
2016-12-22   49.44  10918300.0   50.320   50.5500   49.1711
2016-12-21   50.34   9279635.0   49.820   50.4400   49.6700
2016-12-20   49.53   9533020.0   48.990   49.7900   48.9100
2016-12-19   48.55  10323930.0   47.450   48.6700   47.4300'''

import io
import pandas as pd

df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None, parse_dates=[0])
# Keep only 25-31 December (day >= 25, not <= 25)
df = df[df[0].dt.month.eq(12) & df[0].dt.day.ge(25)]

# Groupby year according to: https://stackoverflow.com/a/11397052/7386332
for idx, dfx in df.groupby(df[0].map(lambda x: x.year)):
    print('Dataframe containing {}\'s last week:'.format(idx))
    print(dfx)
    print()

This prints:

Dataframe containing 2016's last week:
           0      1           2       3      4      5
3 2016-12-30  46.75  13593420.0  48.365  48.40  46.36
4 2016-12-29  47.77  11728250.0  48.440  48.86  47.18
5 2016-12-28  48.51  14636340.0  50.580  50.73  48.47
6 2016-12-27  50.43   5594876.0  49.690  50.55  49.65

Dataframe containing 2017's last week:
           0      1           2      3      4     5
0 2017-12-25  52.99  13018070.0  52.37  53.06  51.4
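
If "last week" is meant as the last few trading sessions of each year rather than a fixed calendar window, one possible variation is to sort by date and take the tail of each yearly group. This is only a sketch, not part of the original answer: the helper name `last_n_sessions` and the parameter `n` are made up for illustration, and the sample rows are copied from the question.

```python
import io
import pandas as pd

# Sample rows copied from the question; the source does not label the columns.
data = '''\
2017-01-05   52.99  13018070.0   52.370   53.0600   51.4000
2017-01-04   52.86  12556860.0   50.770   53.3400   50.7300
2017-01-03   50.29  15794400.0   48.800   50.3000   48.4700
2016-12-30   46.75  13593420.0   48.365   48.4000   46.3600
2016-12-29   47.77  11728250.0   48.440   48.8600   47.1800
2016-12-28   48.51  14636340.0   50.580   50.7300   48.4700
2016-12-27   50.43   5594876.0   49.690   50.5500   49.6500
2016-12-23   49.59   6966559.0   49.250   49.7200   48.9900
2016-12-22   49.44  10918300.0   50.320   50.5500   49.1711'''

df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None, parse_dates=[0])


def last_n_sessions(frame, n=5):
    """Return the last `n` rows (by date) of each calendar year.

    Hypothetical helper: assumes column 0 holds datetimes.
    """
    ordered = frame.sort_values(0)  # oldest first
    return ordered.groupby(ordered[0].dt.year).tail(n)


# n=3 only because the sample is tiny; n=5 would approximate a trading week.
last_sessions = last_n_sessions(df, n=3)
print(last_sessions)
```

Note that an incomplete year (2017 in this sample, which only has January rows) returns its most recent rows rather than late-December ones, so it may be worth combining this with the month filter shown above.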

2
votes

Using Anton's data :-)

df[df.groupby(df[0].dt.year)[0].apply(lambda x : x.dt.week==x.dt.week.max())]
Out[1471]: 
           0      1           2       3      4      5
0 2017-12-25  52.99  13018070.0  52.370  53.06  51.40
3 2016-12-30  46.75  13593420.0  48.365  48.40  46.36
4 2016-12-29  47.77  11728250.0  48.440  48.86  47.18
5 2016-12-28  48.51  14636340.0  50.580  50.73  48.47
6 2016-12-27  50.43   5594876.0  49.690  50.55  49.65
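
`Series.dt.week`, used in the one-liner above, was deprecated in pandas 1.1 and removed in pandas 2.0. A rough equivalent for newer versions, assuming `Series.dt.isocalendar()` is available (pandas 1.1 or later), could look like the sketch below. The variable names `week`, `max_week` and `last_week` are illustrative only, and the sample is a shortened copy of Anton's data.

```python
import io
import pandas as pd

# Shortened copy of the sample data from the first answer.
data = '''\
2017-12-25   52.99  13018070.0   52.370   53.0600   51.4000
2017-01-04   52.86  12556860.0   50.770   53.3400   50.7300
2017-01-03   50.29  15794400.0   48.800   50.3000   48.4700
2016-12-30   46.75  13593420.0   48.365   48.4000   46.3600
2016-12-29   47.77  11728250.0   48.440   48.8600   47.1800
2016-12-28   48.51  14636340.0   50.580   50.7300   48.4700
2016-12-27   50.43   5594876.0   49.690   50.5500   49.6500
2016-12-23   49.59   6966559.0   49.250   49.7200   48.9900
2016-12-22   49.44  10918300.0   50.320   50.5500   49.1711'''

df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None, parse_dates=[0])

dates = df[0]
# ISO week number of every row, as plain integers
week = dates.dt.isocalendar().week.astype(int)
# Highest ISO week number seen within each calendar year, aligned to the rows
max_week = week.groupby(dates.dt.year).transform('max')
# Keep the rows that fall in that highest-numbered week
last_week = df[week == max_week]
print(last_week)
```

One caveat with any ISO-week approach: the last days of December can belong to ISO week 1 of the following year (for example, 2014-12-29 through 2014-12-31), in which case they would not be selected by taking the maximum week number of the calendar year.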