Questions tagged dataframe

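A data frame is a tabular data structure. Typically it holds data in which the rows are observations and the columns are variables of various types. While "data frame" or "dataframe" is the term used for this concept in several languages (R, Apache Spark, Deedle, Maple, the pandas library in Python, and the DataFrames library in Julia), "table" is the term used in MATLAB and SQL.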
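
As a minimal illustration of the description above, the sketch below builds a small pandas DataFrame in which each row is one observation and each column is a variable of a different type. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: each row is one observation, each column one variable.
people = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Chloe"],   # text
        "age": [34, 27, 45],               # integer
        "height_m": [1.68, 1.82, 1.75],    # float
        "member": [True, False, True],     # boolean
    }
)

print(people.dtypes)  # one dtype per column
print(people.shape)   # (number of rows, number of columns)
```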

Answers: 1 Votes: 0


Reordering columns in a dataframe

I would like to know whether there is a graphical way to change the order of columns in R or pandas.

Answers: 0 Votes: 0
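
The question above asks for a graphical approach, and no specific tool is named in the excerpt. For comparison, here is a minimal sketch of the programmatic route in pandas, assuming a small hypothetical frame with columns `b`, `c` and `a`.

```python
import pandas as pd

# Hypothetical frame whose columns are in an unwanted order.
df = pd.DataFrame({"b": [1, 2], "c": [3, 4], "a": [5, 6]})

# Reorder by listing the columns in the wanted order.
reordered = df[["a", "b", "c"]]

# Move one column to the front while keeping the others in their current order.
front = "c"
moved = df[[front] + [col for col in df.columns if col != front]]

print(reordered.columns.tolist())
print(moved.columns.tolist())
```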




Combining these .CSV files into a single dataframe

df = pd.DataFrame()
for file in files:
    if file.endswith('.csv'):
        df = df.append(pd.read_csv(file), ignore_index=True)
df.head()

Answers: 1 Votes: 0
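
`DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop in the question above fails on current versions. A minimal sketch of the same idea with `pd.concat` follows. The temporary directory and the file names `part1.csv`, `part2.csv` and `notes.txt` are hypothetical and exist only to make the example self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical input: two small CSV files and one non-CSV file in a temp directory.
tmp_dir = Path(tempfile.mkdtemp())
pd.DataFrame({"id": [1, 2], "value": [10, 20]}).to_csv(tmp_dir / "part1.csv", index=False)
pd.DataFrame({"id": [3], "value": [30]}).to_csv(tmp_dir / "part2.csv", index=False)
(tmp_dir / "notes.txt").write_text("not a CSV file")

# Read every .csv file, then concatenate once instead of appending in a loop.
csv_files = sorted(tmp_dir.glob("*.csv"))
frames = [pd.read_csv(path) for path in csv_files]
combined = pd.concat(frames, ignore_index=True)

print(combined.head())
```

Collecting the frames in a list and concatenating once also avoids copying the growing frame on every iteration.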

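How to gracefully decommission Spark executors

I am using df.cache() to cache a dataframe, and I am using Databricks autoscaling with min instances set to 1 and max instances set to 8. However, since some executors die in the midd...

Answers: 1 Votes: 0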

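Aggregated confusion matrix over 10 cross-validation folds: how to do this with a pandas dataframe?

I am trying to get the confusion matrix over 10 folds for any model (random forest, decision tree, naive Bayes, etc.). If I run a plain model, I can get each confusion matrix without problems, as shown below:

Answers: 2 Votes: 0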
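
One way to aggregate per-fold confusion matrices with pandas alone is to build each fold's table with `pd.crosstab` and sum the tables. The sketch below assumes the true and predicted labels for each fold are already available. The `fold_results` list, the two labels and the helper `fold_confusion` are hypothetical, and only three folds are shown to keep it short.

```python
import pandas as pd

labels = ["neg", "pos"]

# Hypothetical per-fold results: (true labels, predicted labels) for each fold.
# In practice these would come from the cross-validation loop.
fold_results = [
    (["pos", "neg", "pos", "neg"], ["pos", "neg", "neg", "neg"]),
    (["neg", "neg", "pos", "pos"], ["neg", "pos", "pos", "pos"]),
    (["pos", "pos", "neg", "neg"], ["pos", "pos", "neg", "neg"]),
]


def fold_confusion(y_true, y_pred):
    """Confusion matrix for one fold, with every label present on both axes."""
    table = pd.crosstab(
        pd.Series(y_true, name="actual"),
        pd.Series(y_pred, name="predicted"),
    )
    return table.reindex(index=labels, columns=labels, fill_value=0)


# Summing the per-fold tables gives one aggregated confusion matrix.
total = sum(fold_confusion(y_true, y_pred) for y_true, y_pred in fold_results)
print(total)
```

Reindexing each table to the full label list matters because a fold in which one class is never predicted would otherwise produce a table with a missing column.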

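Using dataframe.replace() inside a dataframe.map() function to replace strings with NaN returns a TypeError

I realize there are working alternatives; I just want to understand this for my own education and for anyone else who runs into it. df_test = pd.DataFrame({'test1':['blah1','b ...

Answers: 1 Votes: 0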
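
The excerpt above is truncated, so the exact failing call is unknown. One plausible cause, stated here as an assumption, is that a function applied element by element receives plain Python strings, and `str.replace` accepts only strings as the replacement value, not NaN. The sketch below shows two alternatives that do accept NaN. The values in `df_test` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the original example is truncated in the excerpt.
df_test = pd.DataFrame({"test1": ["blah1", "blah2", "blah3"]})

# DataFrame.replace works on the whole frame and accepts NaN as the new value.
cleaned = df_test.replace("blah1", np.nan)

# Per-element alternative: Series.map with a plain Python function.
mapped = df_test["test1"].map(lambda value: np.nan if value == "blah1" else value)

print(cleaned)
print(mapped)
```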

Julia: equivalent of Python's selection by multi-index level (especially for columns)

import numpy as np
import pandas as pd

#generating sample data
nsmpls = 10
smpls = [f'smpl{j}' for j in range(nsmpls)]
nfeats = 5
feats = [f'feat{j}' for j in range(nfeats)]
data = np.random.rand(nfeats, nsmpls)
countries = ['France'] * 2 + ['UK'] * 3 + ['US'] * 5
df = pd.DataFrame(data, index=feats, columns=pd.MultiIndex.from_tuples(zip(countries, smpls)))
df.to_csv('./data.tsv', sep='\t')

#---------------------------------------------------------------------
#loading dataset
df = pd.read_csv('./data.tsv', sep='\t', index_col=0, header=[0,1])

#extracting subset
dg = df.xs('France', level=0, axis=1)
print(dg.shape)

#iterating
for country, group in df.groupby(level=0, axis=1):
    print('#samples: {}'.format(group.shape[1]))

Answers: 1 Votes: 0

I have the following R Markdown document in R, which generates a flextable object.

library(tidyverse)
library(officer)
library(flextable)

ft3 = structure(list(
  "Project Number" = c(4107L, 1770L, 1979L, 9252L, 2581L, 8360L, 6290L, 1002L, 7300L, 2925L),
  "Client Company" = c("Dynamic Build Concept Agency", "Nova", "Alpha Corp",
    "Global Innovations", "Core Metrics", "Vision Group for Property Holdings",
    "United Firm for Urban Growth Projects",
    "Eastern Gate Real Estate Investment Group (EGRIG)",
    "Eastern Gate Real Estate Investment Group (EGRIG)",
    "Eastern Gate Real Estate Investment Group (EGRIG)"),
  `organizational growth planning` = c(5, 5, 4.83, 4.67, 4.17, 4, 3.83, 3.67, 3.5, 2.83),
  competency = c(5, 5, 4.83, 4.67, 4.27, 4.08, 4.25, 4, 3.5, 3.25),
  compression = c(5, 5, 5, 4.67, 4.38, 4.67, 4.67, 4, 3.67, 3),
  `International development project` = c(5, 4.57, 4.43, 4.43, 3.83, 4.17, 3.57, 3.14, 2.71, 2.71),
  `Team spirit` = c(5, 5, 5, 4.5, 4.21, 4.5, 4.5, 3.5, 3.5, 3),
  Plan = c(5, 5, 4, 4, 3.6, 2, 3, 4, 3, 3),
  PIR = c(5, 5, 4.17, 4.67, 4.07, 4.17, 4.33, 3.67, 3.67, 3.33),
  Success = c(5, 5, 4, 4, 4.08, 5, 3, 4, 2.67, 3),
  plant = c(100, 98.92, 90.65, 89.03, 81.6, 81.48, 77.88, 74.95, 65.55, 60.3)),
  row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))

Answers: 1 Votes: 0

Flatten a nested JSON list into a pandas DataFrame

Maybe someone can help me. I am trying to put the following list into a pandas dataframe: [{u'_id':u'2', u'_index':u'list', u'_score':1.4142135, u'_source':{u'name':u'name3'}, u'_typ ...

Answers: 1 Votes: 0
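
For a list of nested dictionaries like the one in the question above, `pd.json_normalize` is one option: it expands nested keys into dotted column names. The sketch below uses records shaped like the visible part of the excerpt. The second record is hypothetical, and the truncated `_typ ...` field is left out because its name and value are not visible.

```python
import pandas as pd

# Records shaped like the visible part of the excerpt; the second one is made up.
hits = [
    {"_id": "2", "_index": "list", "_score": 1.4142135, "_source": {"name": "name3"}},
    {"_id": "5", "_index": "list", "_score": 1.0, "_source": {"name": "name7"}},
]

# Nested dictionaries become columns named "<outer>.<inner>".
flat = pd.json_normalize(hits)

print(flat.columns.tolist())
print(flat)
```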



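Reassigning the values of multiple columns with the values of multiple other columns

For the following df, I would like to change the values in columns A, B and C to the values in columns X, Y and Z. columns = {"a": [1, 2, 3], ...

Answers: 0 Votes: 0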
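
A minimal sketch of one way to do what the question above describes in pandas, assuming all six columns exist and have the same length. The frame below is hypothetical because the original definition is truncated in the excerpt.

```python
import pandas as pd

# Hypothetical frame; the original definition is truncated in the excerpt.
df = pd.DataFrame(
    {
        "A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9],
        "X": [10, 20, 30], "Y": [40, 50, 60], "Z": [70, 80, 90],
    }
)

# .to_numpy() passes plain values, so the assignment is purely positional
# (A <- X, B <- Y, C <- Z) and does not depend on label alignment.
df[["A", "B", "C"]] = df[["X", "Y", "Z"]].to_numpy()

print(df)
```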

Inserting the appropriate number of rows into a data frame based on date differences

df1 <- read.table(text = "entity_id state last_changed DT.diff
sensor.kincony02_temperature03 20.4 '2025-02-04 23:00:15' 15.188
sensor.kincony02_temperature03 20.3 '2025-02-04 23:08:15' 479.849
sensor.kincony02_temperature03 20.2 '2025-02-04 23:10:15' 120.115
sensor.kincony02_temperature03 20.3 '2025-02-04 23:15:15' 300.136
sensor.kincony02_temperature03 20.4 '2025-02-04 23:18:15' 180.020
sensor.kincony02_temperature03 20.5 '2025-02-04 23:21:15' 180.020
sensor.kincony02_temperature03 20.6 '2025-02-04 23:22:15' 59.904
sensor.kincony02_temperature03 20.7 '2025-02-04 23:23:15' 59.904
sensor.kincony02_temperature03 20.8 '2025-02-04 23:25:15' 120.115
sensor.kincony02_temperature03 20.9 '2025-02-04 23:27:15' 119.809
sensor.kincony02_temperature03 21.0 '2025-02-04 23:30:15' 179.979
sensor.kincony02_temperature03 21.1 '2025-02-04 23:31:15' 60.252
sensor.kincony02_temperature03 21.2 '2025-02-04 23:35:15' 239.921
sensor.kincony02_temperature03 21.3 '2025-02-04 23:46:15' 659.865
sensor.kincony02_temperature03 21.2 '2025-02-04 23:47:15' 60.008
sensor.kincony02_temperature03 21.1 '2025-02-04 23:51:15' 240.025
sensor.kincony02_temperature03 21.2 '2025-02-04 23:53:15' 120.218
sensor.kincony02_temperature03 21.1 '2025-02-04 23:54:15' 59.903
sensor.kincony02_temperature03 21.0 '2025-02-05 00:02:15' 479.803
sensor.kincony02_temperature03 20.9 '2025-02-05 00:06:15' 239.999
sensor.kincony02_temperature03 20.8 '2025-02-05 00:11:15' 300.007
sensor.kincony02_temperature03 20.7 '2025-02-05 00:13:15' 119.997
sensor.kincony02_temperature03 20.6 '2025-02-05 00:14:15' 60.008
sensor.kincony02_temperature03 20.5 '2025-02-05 00:15:15' 60.002
sensor.kincony02_temperature03 20.4 '2025-02-05 00:17:15' 119.999
sensor.kincony02_temperature03 20.3 '2025-02-05 00:19:15' 119.996
sensor.kincony02_temperature03 20.2 '2025-02-05 00:20:15' 59.998
sensor.kincony02_temperature03 20.1 '2025-02-05 00:24:15' 240.009
sensor.kincony02_temperature03 20.0 '2025-02-05 00:27:15' 179.997", header = TRUE)

Answers: 0 Votes: 0
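
The question above is posed in R and its prose is not part of the excerpt, so the intended fill rule is unknown. As a pandas analogue, the sketch below assumes the goal is one row per minute, with the last known `state` carried forward into the inserted rows. It uses only the first three rows of the sample data.

```python
import pandas as pd

# First three rows of the sample data from the question.
readings = pd.DataFrame(
    {
        "state": [20.4, 20.3, 20.2],
        "last_changed": pd.to_datetime(
            ["2025-02-04 23:00:15", "2025-02-04 23:08:15", "2025-02-04 23:10:15"]
        ),
    }
).set_index("last_changed")

# Assumption: one row per minute is wanted between the first and last reading.
every_minute = pd.date_range(readings.index.min(), readings.index.max(), freq="min")

# Reindexing inserts the missing minutes as NaN rows; ffill carries the last state forward.
filled = readings.reindex(every_minute).ffill()

print(filled)
```

If the gaps should instead be interpolated or left empty, `ffill()` can be replaced or dropped; that choice depends on what the asker intended.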




© www.soinside.com 2019 - 2025. All rights reserved.