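I have some complex formatting saved in a template file, and I need to write data from a pandas DataFrame into it. The problem is that when I use pd.to_excel to save to this worksheet, pandas overwrites the formatting. Is there a way to somehow "paste values" from the df into the worksheet? I am using pandas 0.17.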
import openpyxl
import pandas as pd
wb = openpyxl.load_workbook('H:/template.xlsx')
sheet = wb.get_sheet_by_name('spam')
sheet.title = 'df data'
wb.save('H:/df_out.xlsx')
xlr = pd.ExcelWriter('df_out.xlsx')
df.to_excel(xlr, 'df data')
xlr.save()
openpyxl 2.4 comes with a utility for converting pandas DataFrames into something that openpyxl can work with directly. The code would look a bit like this:
from openpyxl.utils.dataframe import dataframe_to_rows
rows = dataframe_to_rows(df)
for r_idx, row in enumerate(rows, 1):
    for c_idx, value in enumerate(row, 1):
        ws.cell(row=r_idx, column=c_idx, value=value)
You can adjust the start of the enumerations to place the cells where you need them.
See the openpyxl documentation for more information.
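As a rough illustration of shifting the start of the enumerations, the sketch below writes a small DataFrame so that its top-left corner lands on cell C5. The helper name write_df_at, the sample data and the target cell are assumptions made up for this example; they are not part of openpyxl.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def write_df_at(ws, df, first_row=1, first_col=1):
    """Hypothetical helper: write df (without its index) starting at the given cell."""
    rows = dataframe_to_rows(df, index=False, header=True)
    for r_idx, row in enumerate(rows, first_row):
        for c_idx, value in enumerate(row, first_col):
            ws.cell(row=r_idx, column=c_idx, value=value)


# Made-up sample data; in practice ws would come from a loaded template workbook
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
wb = Workbook()
ws = wb.active
write_df_at(ws, df, first_row=5, first_col=3)  # header row goes to C5:D5
```

Cells outside the written block (A1, for instance) are left alone, so any formatting already on them stays as it was.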
I slightly modified @CharlieClark's great answer to leave out the index (the original Excel file has no index). Here is ready-to-run code:
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl import load_workbook
wb = load_workbook('test.xlsx') # load as openpyxl workbook; useful to keep the original layout
# which is discarded in the following dataframe
df = pd.read_excel('test.xlsx') # load as dataframe (modifications will be easier with pandas API!)
ws = wb.active
df.iloc[1, 1] = 'hello world' # modify a few things
rows = dataframe_to_rows(df, index=False)
for r_idx, row in enumerate(rows, 1):
    for c_idx, value in enumerate(row, 1):
        ws.cell(row=r_idx, column=c_idx, value=value)
wb.save('test2.xlsx')
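To see why this keeps the original layout, here is a small self-contained check. It builds a workbook in memory instead of loading test.xlsx and makes one cell bold as a stand-in for the template formatting; both the stand-in and the sample data are assumptions for the demo.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Font
from openpyxl.utils.dataframe import dataframe_to_rows

# Stand-in for a formatted template: a bold header cell in A1
wb = Workbook()
ws = wb.active
ws["A1"] = "old header"
ws["A1"].font = Font(bold=True)

df = pd.DataFrame({"name": ["x", "y"], "qty": [1, 2]})  # made-up data

# Assigning values cell by cell changes the value only, not the cell's style
for r_idx, row in enumerate(dataframe_to_rows(df, index=False), 1):
    for c_idx, value in enumerate(row, 1):
        ws.cell(row=r_idx, column=c_idx, value=value)

print(ws["A1"].value, ws["A1"].font.bold)
```

The value in A1 is replaced by the first column name while the font set beforehand is still attached to the cell.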
A solution using clipboard:
import openpyxl
import pandas as pd
import clipboard as clp
# Copy the dataframe to the clipboard
df.to_clipboard()
# Paste the clipboard into a variable
cells = clp.paste()
# Split the text into rows and columns (to_clipboard writes tab-separated text by default)
cells = [x.split('\t') for x in cells.splitlines()]
# Open the workbook
wb = openpyxl.load_workbook('H:/template.xlsx')
# Get the sheet
sheet = wb['spam']
sheet.title = 'df data'
# Paste the clipboard values into the sheet
# (enumerate covers every row and column; zip(range(1, len(...)), ...) dropped the last one)
for i, r in enumerate(cells, 1):
    for j, c in enumerate(r, 1):
        sheet.cell(row=i, column=j).value = c
# Save the workbook
wb.save('H:/df_out.xlsx')
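If the system clipboard is not available (on a headless server, for example), the same split-into-rows idea can be sketched with an in-memory string. Note that this swaps the clipboard for DataFrame.to_csv with a tab separator, so it is a variation rather than the clipboard approach itself. As with the clipboard, every value reaches the sheet as text. The sample data is made up.

```python
import pandas as pd
from openpyxl import Workbook

df = pd.DataFrame({"A": [1, 2], "B": ["x y", "z"]})  # made-up data

# Serialise to tab-separated text in memory (no clipboard involved)
text = df.to_csv(sep="\t", index=False)

# One list per line, one item per tab-separated field
cells = [line.split("\t") for line in text.splitlines()]

wb = Workbook()  # in practice, load the formatted template workbook instead
sheet = wb.active
for i, r in enumerate(cells, 1):
    for j, c in enumerate(r, 1):
        sheet.cell(row=i, column=j).value = c  # every value is a str here
```

Splitting on tabs rather than on any whitespace keeps values that contain spaces, such as "x y", in a single cell.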
Here is the same approach wrapped in a function with a signature similar to DataFrame.to_excel:
from openpyxl.utils.dataframe import dataframe_to_rows
def df_to_excel(df, ws, header=True, index=True, startrow=0, startcol=0):
    """Write DataFrame df to openpyxl worksheet ws"""
    rows = dataframe_to_rows(df, header=header, index=index)
    for r_idx, row in enumerate(rows, startrow + 1):
        for c_idx, value in enumerate(row, startcol + 1):
            ws.cell(row=r_idx, column=c_idx).value = value
Example usage. Note that openpyxl puts the index name on a second row, under the header row, which differs from the behavior of DataFrame.to_excel:
import pandas as pd
import openpyxl
import os
wb = openpyxl.Workbook()
df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"]).rename_axis("Index")
df_to_excel(df, wb.active)
wb.save("out.xlsx")
os.startfile("out.xlsx") # open the file in Excel (only works on Windows)
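To make the difference concrete, the sketch below collects what dataframe_to_rows yields for the same DataFrame and then folds the index-name row into the header row, which is closer to the layout DataFrame.to_excel writes. The helper rows_like_to_excel is a hypothetical name for illustration, and it assumes single-level column headers with both header and index enabled.

```python
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows


def rows_like_to_excel(df):
    """Hypothetical helper: yield rows with the index names merged into the header row.

    Assumes single-level columns, header=True and index=True.
    """
    rows = dataframe_to_rows(df, index=True, header=True)
    header = list(next(rows))       # header row, with a blank slot above the index
    index_names = list(next(rows))  # separate row holding the index name(s)
    header[:len(index_names)] = index_names
    yield header
    for row in rows:
        yield list(row)


df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"]).rename_axis("Index")

raw = [list(r) for r in dataframe_to_rows(df)]  # header, index-name row, then data
merged = list(rows_like_to_excel(df))           # header including index name, then data
```

The merged rows can be written with the same nested enumerate loop as in df_to_excel, at the cost of one row fewer than the raw output.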
You should first get the shape of your data to determine the range of the loop:
import pandas as pd
from openpyxl import load_workbook

# self.readfile, data_analysis1 and save_name are defined elsewhere (not shown here)
wb_formats = load_workbook("template.xlsx")
ws_index = wb_formats["index"]
daily_input = pd.read_excel(self.readfile, "input")
list_item = data_analysis1.groupby(["item_id"])["product_name"].unique()
list_item_size = pd.DataFrame(list_item, columns=["product_name"]).shape[0]
# Create the index sheet:
r = 2  # start at the 2nd row
c = 1  # column 'A'
for row in range(0, list_item_size):
    rows = list_item.iloc[row]
    for item in rows:
        ws_index.cell(row=r, column=c).value = item
        c += 1  # move to the next column
    c = 1
    r += 1
wb_formats.save(save_name)
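A minimal sketch of the shape-based loop on its own, using made-up data and an in-memory workbook rather than the template above. DataFrame.shape gives the row and column counts and iat reads one value at a time; the starting row and column are arbitrary choices for the example.

```python
import pandas as pd
from openpyxl import Workbook

# Made-up data standing in for the real input
df = pd.DataFrame({"item_id": [10, 20, 30], "product_name": ["a", "b", "c"]})

wb = Workbook()  # in practice, load the formatted template instead
ws = wb.active

n_rows, n_cols = df.shape    # the size of the data determines both loop ranges
start_row, start_col = 2, 1  # write from cell A2 downwards

for i in range(n_rows):
    for j in range(n_cols):
        value = df.iat[i, j]
        # .item() converts NumPy scalars to plain Python values
        ws.cell(row=start_row + i, column=start_col + j).value = (
            value.item() if hasattr(value, "item") else value
        )
```

Row 1 is left untouched here, so a formatted header row in a template would keep both its text and its styling.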