Processing a CSV in the background every hour with Pandas and APScheduler

Question

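I have a CSV file (ZN_15M) that I want to read with read_csv every hour. So I installed APScheduler and am trying to use it to read the CSV file once an hour (along with some other things not shown, but if I can get the read_csv part working, the rest will work too):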

import sys
from time import sleep
from apscheduler.schedulers.background import BackgroundScheduler


scheduler = BackgroundScheduler()
scheduler.start() 

def Run():
    f2 = open('C:\Users\cost9\OneDrive\Documents\PYTHON\Exported_Data\ZN_ES\ZN_15M.csv')
    ZN = pd.read_csv(f2)
    #Do stuff to the CSV File/DataFrame
    ZN.tocsv(path_or_buf = 'path')

def main():
    job = scheduler.add_interval_job(Run, minutes=60, args=())
    while True:
        sleep(60)
        sys.stdout.write('.'); sys.stdout.flush()

I get no errors when I run the script manually, but it does not run every hour as I expected. I'm not sure what I'm doing wrong here...

Update: I now get an error with the following code:

def process_csv(path_to_csv):
    ZN_ES_comb = pd.read_csv(path_to_csv)
    # Insert your CSV processing here
    ZN_ES_comb = pd.DataFrame(ZN_ES_comb)
    ZN_ES_comb.to_csv(path_to_csv.replace('.csv', '_modified_{timestamp}.csv').format(
        timestamp=time.strftime("%Y%m%d-%H%M%S")), index=False)

if __name__ == '__main__':
    # Create CSV for demonstrating purposes
    path_to_csv = 'C:\Users\cost9\OneDrive\Documents\PYTHON\Daily Tasks\ZN_ES\ZN_ES_15M\CSV\ZN_ES_comb.csv'
    pd.DataFrame(ZN_ES_comb).to_csv(path_to_csv, index=False)
    # Start scheduler
    scheduler = BackgroundScheduler()
    scheduler.start()
    scheduler.add_job(func=process_csv,
                      args=[path_to_csv],
                      trigger=IntervalTrigger(seconds=2))
    # Wait for 7 seconds so that scheduler can call process_csv 3 times
    time.sleep(7)

The error is on the line pd.DataFrame(ZN_ES_comb).to_csv(path_to_csv, index=False). It says:

NameError: name 'ZN_ES_comb' is not defined
Tags: python function pandas apscheduler

1 Answer

There are two problems in your code:

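  1. It should be ZN.to_csv() instead of ZN.tocsv() in def Run().
  2. The argument to time.sleep() is measured in seconds, not in minutes as you apparently assumed. As a result, Run() never ran while the script was sleeping.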
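
Both points can be checked in a few lines. This is a minimal sketch rather than part of the original answer; the DataFrame contents and the SECONDS_PER_HOUR name are made up for illustration:

```python
import pandas as pd

# Made-up data, only for illustration.
df = pd.DataFrame({"price": [1.0, 2.0]})

# DataFrame has a to_csv method; "tocsv" is neither a method nor a column,
# so calling df.tocsv(...) raises AttributeError.
has_to_csv = hasattr(df, "to_csv")
has_tocsv = hasattr(df, "tocsv")

# time.sleep() takes seconds, so a one-hour pause is 60 * 60 seconds.
SECONDS_PER_HOUR = 60 * 60

print(has_to_csv, has_tocsv, SECONDS_PER_HOUR)
```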

Below is a working solution that works with Python 3.5 and APScheduler 3.3.1. IntervalTrigger() also has an hours parameter, which you may want to use instead of seconds.

import time

import pandas as pd
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.interval import IntervalTrigger


def process_csv(path_to_csv):
    df = pd.read_csv(path_to_csv)
    # Insert your CSV processing here
    df.to_csv(path_to_csv.replace('.csv', '_modified_{timestamp}.csv').format(
        timestamp=time.strftime("%Y%m%d-%H%M%S")), index=False)

if __name__ == '__main__':
    # Create CSV for demonstrating purposes
    path_to_csv = 'made_up.csv'
    pd.DataFrame({'fruit': ['apple', 'banana'],
                  'number': [1, 2]}).to_csv(path_to_csv, index=False)
    # Start scheduler
    scheduler = BackgroundScheduler()
    scheduler.start()
    scheduler.add_job(func=process_csv,
                      args=[path_to_csv],
                      trigger=IntervalTrigger(seconds=2))
    # Wait for 7 seconds so that scheduler can call process_csv 3 times
    time.sleep(7)