Comparing values by date in a Python list

Question  Votes: -1  Answers: 2

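I have a CSV file and I want to create a function that compares the items in a list against each other. To make this clearer, here is an example.

I convert the CSV to a list: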

import csv
with open('test.csv', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=';', quotechar='|')
    lista = list(spamreader)
    print lista

>>>[['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"'],['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'], ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'], ['22/12/2017', 'Martin', '204.042', '48.526', '23,78%', '43,98', '0,91'], ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'], ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'], ['22/12/2017', 'Tom', '60.189', '12.654', '21,02%', '11,58', '0,92']]

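So, first, I need to compare all the values for Martin and for Tom. I mean item[2] of 20/12/2017 against item[2] of 21/12/2017, then item[2] of 21/12/2017 against item[2] of 22/12/2017. I need this for every item in these lists (item[2], item[3], item[4], item[5] and item[6]). The date is the most important value, because the idea is to compare one day with the next.

The result I would like is something like this: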

21/12/2017 Martin 
item[2]: smaller
item[3]: smaller
item[4]: bigger
item[5]: smaller
item[6]: smaller

22/12/2017 Martin
item[2]: smaller
item[3]: bigger
item[4]: bigger
item[5]: bigger
item[6]: bigger

21/12/2017 Tom
item[2]: smaller
item[3]: bigger
item[4]: bigger
item[5]: bigger
item[6]: bigger

22/12/2017 Tom
item[2]: smaller
item[3]: smaller
item[4]: smaller
item[5]: smaller
item[6]: bigger

And if I want to display the column name, such as "Subastas", instead of item[2], and likewise for all the other names, how could I do that?

python python-2.7 list csv
2 Answers
2
votes

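Let's start by observing that you have rows of data whose key is (date, name). A fairly obvious approach is to store the data in a dictionary keyed by (date, name).

So, putting the data you posted into mylist: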

mylist = [['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"'],['20/12/2017', 'Martin', '165.665', '3.777', '2,28%', '1,58', '0,42'], ['21/12/2017', 'Martin', '229.620', '18.508', '8,06%', '14,56', '0,79'], ['22/12/2017', 'Martin', '204.042', '48.526', '23,78%', '43,98', '0,91'], ['20/12/2017', 'Tom', '102.613', '20.223', '19,71%', '17,86', '0,88'], ['21/12/2017', 'Tom', '90.962', '19.186', '21,09%', '14,26', '0,74'], ['22/12/2017', 'Tom', '60.189', '12.654', '21,02%', '11,58', '0,92']]

Convert it (all but the first row, which holds the column labels) to a dictionary like this:

import datetime
mydict = {}
for row in mylist[1:]:
    date = datetime.datetime.strptime(row[0],'%d/%m/%Y')
    name = row[1]
    mydict[(date,name)] = row[2:]

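The slightly tricky part here is that your dates are strings in dd/mm/yyyy form, but later you want to compare one day with the next. That is no surprise, since you made it the subject of your question. So you need to convert the string dates into something you can compare properly. That is what strptime() does.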
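To see why the conversion matters, here is a small sketch comparing two dates as strings and as datetime objects. The two dates are made up for illustration and are not from your data:

```python
import datetime

# Two illustrative dates (not from the data above): 2 Jan 2018 is later than 31 Dec 2017.
earlier = '31/12/2017'
later = '02/01/2018'

# As dd/mm/yyyy strings the comparison goes character by character, so '0' < '3' decides it.
string_says_later_is_smaller = later < earlier

# As datetime objects the comparison is chronological.
fmt = '%d/%m/%Y'
d_earlier = datetime.datetime.strptime(earlier, fmt)
d_later = datetime.datetime.strptime(later, fmt)
datetime_says_later_is_bigger = d_later > d_earlier

# Date arithmetic also works, which is what finding "the previous day" needs.
gap = d_later - d_earlier  # a datetime.timedelta

print(string_says_later_is_smaller)
print(datetime_says_later_is_bigger)
print(gap.days)
```

Both comparisons print True, which shows that the string comparison gets the order backwards while the datetime comparison does not.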

Your data now looks like this:

>>> mydict
{(datetime.datetime(2017, 12, 20, 0, 0), 'Martin'): ['165.665', '3.777', '2,28%', '1,58', '0,42'],
 (datetime.datetime(2017, 12, 22, 0, 0), 'Tom'): ['60.189', '12.654', '21,02%', '11,58', '0,92'],
 (datetime.datetime(2017, 12, 21, 0, 0), 'Martin'): ['229.620', '18.508', '8,06%', '14,56', '0,79'], 
 (datetime.datetime(2017, 12, 21, 0, 0), 'Tom'): ['90.962', '19.186', '21,09%', '14,26', '0,74'],
 (datetime.datetime(2017, 12, 20, 0, 0), 'Tom'): ['102.613', '20.223', '19,71%', '17,86', '0,88'],
 (datetime.datetime(2017, 12, 22, 0, 0), 'Martin'): ['204.042', '48.526', '23,78%', '43,98', '0,91']}

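The next thing to note is that your data consists of floats and percentages, but they are represented as strings. That complicates things, because you want to do comparisons. Take the first two data points for Martin: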

    ['165.665', '3.777', ...
    ['229.620', '18.508', ...

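If you compare '165.665' with '229.620', the first will be smaller, which is what you expect. But if you compare '3.777' with '18.508', the first will be bigger: not what you expect. That is because strings are compared character by character, and '3' sorts after '1'.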
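A quick way to convince yourself, using the same two values (the variable names are only for illustration):

```python
# The two values from Martin's first rows that compare "wrongly" as strings.
a = '3.777'
b = '18.508'

# String comparison looks at the first characters: '3' sorts after '1'.
as_strings = a > b

# Numeric comparison gives the result you would expect.
as_numbers = float(a) > float(b)

print(as_strings)
print(as_numbers)
```

The first line printed is True and the second is False.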

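To make matters worse, your data sometimes writes the decimal point as a comma and sometimes does not.

So you will need a function that converts your strings to numbers. Here is a naive one that works for your data, but it may need to be more robust in real life: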

def convert(n):
    n = n.replace(",",".").replace("%","")
    try:
        return float(n)
    except ValueError:
        return 0e0
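As a quick check of what this converter does, here is a sketch that runs it on a few values from the first Martin row plus one non-numeric string. The function is repeated so the snippet runs on its own, and the samples list is only for illustration:

```python
def convert(n):
    # Same naive converter as above, repeated so this snippet is self-contained.
    n = n.replace(",", ".").replace("%", "")
    try:
        return float(n)
    except ValueError:
        return 0e0

# Values taken from the first data row for Martin, plus one non-numeric string.
samples = ['165.665', '2,28%', '1,58', 'Martin']
converted = [convert(s) for s in samples]
print(converted)
```

Note that non-numeric input silently becomes 0.0. That is one of the things a more robust version might handle differently, for example by raising an error instead.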

Now you are in a position to do the comparisons:

for (day, name) in mydict:
    previous_day = day - datetime.timedelta(days=1)
    if (previous_day,name) in mydict:
        print datetime.datetime.strftime(day,"%d/%m/%Y"), name
        day2_values = mydict[(day, name)]
        day1_values = mydict[(previous_day, name)]
        comparer = zip(day2_values, day1_values)
        for n,value in enumerate(comparer):
            print "item[%d]:" % (n+2,),
            if convert(value[1]) < convert(value[0]):
                print value[1], "smaller than", value[0]
            else:
                print value[1], "bigger than", value[0]
        print

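I have made the messages more explicit, for example item[2]: 165.665 smaller than 229.620. That way you can easily verify that the program is correct without having to dig back through the data, which is tedious and error-prone. You can make the messages less explicit afterwards if you want.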

22/12/2017 Tom
item[2]: 90.962 bigger than 60.189
item[3]: 19.186 bigger than 12.654
item[4]: 21,09% bigger than 21,02%
item[5]: 14,26 bigger than 11,58
item[6]: 0,74 smaller than 0,92

21/12/2017 Martin
item[2]: 165.665 smaller than 229.620
item[3]: 3.777 smaller than 18.508
item[4]: 2,28% smaller than 8,06%
item[5]: 1,58 smaller than 14,56
item[6]: 0,42 smaller than 0,79

21/12/2017 Tom
item[2]: 102.613 bigger than 90.962
item[3]: 20.223 bigger than 19.186
item[4]: 19,71% smaller than 21,09%
item[5]: 17,86 bigger than 14,26
item[6]: 0,88 bigger than 0,74

22/12/2017 Martin
item[2]: 229.620 bigger than 204.042
item[3]: 18.508 smaller than 48.526
item[4]: 8,06% smaller than 23,78%
item[5]: 14,56 smaller than 43,98
item[6]: 0,79 smaller than 0,91
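The blocks above come out in no particular order because a Python 2.7 dict does not keep its keys sorted. If you want the output grouped by name and in date order, one option is to sort the keys before looping. A minimal sketch, using a cut-down mydict with only the first value column and a hypothetical helper called sorted_keys:

```python
import datetime

# Cut-down version of mydict (first value column only) so the snippet is self-contained.
mydict = {
    (datetime.datetime(2017, 12, 22), 'Tom'): ['60.189'],
    (datetime.datetime(2017, 12, 20), 'Martin'): ['165.665'],
    (datetime.datetime(2017, 12, 21), 'Tom'): ['90.962'],
    (datetime.datetime(2017, 12, 21), 'Martin'): ['229.620'],
}

def sorted_keys(d):
    # Hypothetical helper: order the (date, name) keys by name first, then by date.
    return sorted(d, key=lambda key: (key[1], key[0]))

for day, name in sorted_keys(mydict):
    print("%s %s" % (day.strftime("%d/%m/%Y"), name))
```

In the comparison loop, that would mean writing for (day, name) in sorted_keys(mydict): instead of iterating over mydict directly.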

To display "Subastas" instead of item[2], recall that the column labels are in the first element of mylist:

>>> mylist[0]
['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"', '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"']

So to include them in the output, change this line:

print "item[%d]:" % (n+2,),

to

print mylist[0][n+2] + ":",
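The labels in mylist[0] still carry the literal double quotes from the CSV file (the reader was created with quotechar='|'), so the line above prints them with the quotes included. If you would rather not see them, one option is to strip them first. A small sketch, where labels is a name introduced only for this example:

```python
# Header row as it appears in mylist[0], quotes included.
header = ['"Fecha"', '"Cliente"', '"Subastas"', '"Impresiones_exchange"',
          '"Fill_rate"', '"Importe_a_pagar_a_medio"', '"ECPM_medio"']

# Remove the surrounding double quotes from every label.
labels = [h.strip('"') for h in header]

# n runs over 0..4 in the comparison loop, so n + 2 skips the date and name columns.
for n in range(5):
    print(labels[n + 2] + ":")
```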

0
votes

You can load lista into a DataFrame and do the comparison from there (this compares Martin against Tom on each date):

import pandas as pd
import numpy as np

headers = lista.pop(0)

df = pd.DataFrame(lista, columns = headers)

# Convert the value columns from strings such as '2,28%' to floats,
# so that the comparison is numeric rather than alphabetical
stats = headers[2:]
for x in stats:
    df[x] = df[x].str.replace(',', '.').str.replace('%', '').astype(float)

martin = df[df['"Cliente"'] == 'Martin']
tom = df[df['"Cliente"'] == 'Tom']

merge = pd.merge(martin, tom, on = '"Fecha"')

compare = ['"Fecha"']

# Compare whole columns at once, so each date gets its own result
for x in stats:
    merge[x+'_compare'] = np.where(merge[x+'_x'] > merge[x+'_y'], 'Martin', 'Tom')
    compare.append(x+'_compare')

print(merge[compare])

#output
"Fecha" "Subastas"_compare  "Impresiones_exchange"_compare  "Fill_rate"_compare "Importe_a_pagar_a_medio"_compare   "ECPM_medio"_compare
20/12/2017  Martin  Tom Tom Tom Tom
21/12/2017  Martin  Tom Tom Martin  Martin
22/12/2017  Martin  Martin  Martin  Martin  Tom