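I have a Kafka feed that I am parsing and writing to a database. As a side task, I need to group the results in a list of dictionaries and count the instances in each group. Then I need to aggregate the results from each additional message into a final result.
What I have so far: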
from collections import Counter
kafakmessage1 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]
kafakmessage2 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]
for d in kafakmessage1:
    freq = str(d['freq'])[:-12]
    power = int((d['power'])+100)
    occur = Counter(freq)
    print(freq, power, occur)
This gives:
4 -45 Counter({'4': 1})
4 -45 Counter({'4': 1})
5 -46 Counter({'5': 1})
5 -46 Counter({'5': 1})
6 -47 Counter({'6': 1})
7 -47 Counter({'7': 1})
7 -48 Counter({'7': 1})
8 -49 Counter({'8': 1})
What I need:
4 -90 2
5 -92 2
6 -47 1
7 -95 2
8 -49 1
When the outer loop (not shown in the example) consumes the next message (represented by kafakmessage2), the result should be:
4 -180 4
5 -184 4
6 -94 2
7 -190 4
8 -98 2
Thanks for any insights!
Here is a solution using collections.defaultdict.
from collections import Counter, defaultdict
kafakmessage1 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]
kafakmessage2 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]
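d_power = defaultdict(int)  # group key -> summed power
d_occur = defaultdict(int)  # group key -> number of readings
for d in kafakmessage1:
    freq = str(d['freq'])[:-12]    # drop the last 12 digits to get the group key
    power = int(d['power'] + 100)  # int() truncates toward zero
    d_power[freq] += power
    # Count one occurrence per reading. Counter(freq) counts the characters
    # of the key, so it only worked while the key was a single digit.
    d_occur[freq] += 1
for f in d_power:
    print(f, d_power[f], d_occur[f])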
# 4 -90 2
# 5 -92 2
# 6 -47 1
# 7 -95 2
# 8 -49 1
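The code above handles a single message. To carry the totals over to the next message, as the question asks, one option is to create the two dictionaries once, outside the consumer loop, and fold each message into them. The following is a minimal sketch under that assumption: the helper name update_totals and the names power_sum and count are hypothetical, and the list of two identical messages stands in for the real Kafka consumer loop, which is not shown in the question.

```python
from collections import defaultdict

def update_totals(message, power_sum, count):
    """Fold one decoded message (a list of dicts) into the running totals."""
    for d in message:
        freq = str(d['freq'])[:-12]               # same group key as above
        power_sum[freq] += int(d['power'] + 100)  # int() truncates toward zero
        count[freq] += 1                          # one occurrence per reading

# Sample payload copied from the question.
message = [
    {'power': -145.08474576271186, 'freq': 4000000000000},
    {'power': -145.38135593220343, 'freq': 4601079784043},
    {'power': -146.071186440678, 'freq': 5202159568086},
    {'power': -146.864406779661, 'freq': 5803239352129},
    {'power': -147.73728813559322, 'freq': 6404319136172},
    {'power': -147.9474576271186, 'freq': 7005398920215},
    {'power': -148.71016949152542, 'freq': 7606478704259},
    {'power': -149.52203389830507, 'freq': 8207558488302},
]

# The accumulators live outside the loop so they persist across messages.
power_sum = defaultdict(int)
count = defaultdict(int)

# Stand-in for the outer consumer loop: two identical messages.
for msg in [message, message]:
    update_totals(msg, power_sum, count)

for f in power_sum:
    print(f, power_sum[f], count[f])
# 4 -180 4
# 5 -184 4
# 6 -94 2
# 7 -190 4
# 8 -98 2
```

Passing the dictionaries in as arguments, rather than using globals, keeps the helper easy to test and lets the caller decide when to reset the totals.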