Changing the index of a pandas DataFrame in Python


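I am trying to reindex my DataFrame so that each row gets a value based on the class type it belongs to. Essentially, I am using numbers to categorize the rows so that they are easier to access later.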

import pandas as pd

pd.set_option('display.max_columns', None)

# read_html returns a list of DataFrames, one per HTML table on the page
d = pd.read_html("https://www.bu.edu/phpbin/course-search/section/?t=casma124")
d = pd.concat(d)
number_of_rows = len(d)  # number of rows in the DataFrame
index_range = list(range(number_of_rows))

# keep only the columns of interest
d = d.loc[:, ["Section", "Type", "Schedule", "Location"]]

print(d)

This code produces the following output:

   Section Type               Schedule Location
0       A1  LEC    MWF 1:25 pm-2:15 pm  STO B50
1       A1  NaN      R 6:30 pm-8:30 pm     ROOM
2       A2  LEC   MWF 12:20 pm-1:10 pm  STO B50
3       A2  NaN      R 6:30 pm-8:30 pm     ROOM
4       A3  LEC    TR 12:30 pm-1:45 pm  STO B50
5       A3  NaN      R 6:30 pm-8:30 pm     ROOM
6       B1  DIS      T 2:00 pm-3:15 pm  EPC 207
7       B2  DIS      T 3:30 pm-4:45 pm  EPC 207
8       B3  DIS      T 5:00 pm-6:15 pm  EPC 207
9       B4  DIS      R 2:00 pm-3:15 pm  EPC 207
10      B5  DIS      M 2:30 pm-3:45 pm  CAS 324
11      B6  DIS      W 2:30 pm-3:45 pm  CAS 324
12      B7  DIS      R 3:30 pm-4:45 pm  EPC 207
0      SA1  IND   MTWR 1:00 pm-3:00 pm  MCS B29
1      SA2  IND    MTR 6:00 pm-8:30 pm  COM 217
0      SB1  IND  MTWR 11:00 am-1:00 pm  PSY B51
1      SB2  IND    MTR 6:00 pm-8:30 pm  PSY B37
0       A1  LEC  MWF 11:15 am-12:05 pm      STO
1       A1  NaN      R 6:30 pm-8:30 pm      NaN
2       A2  LEC    MWF 2:30 pm-3:20 pm      STO
3       A2  NaN      R 6:30 pm-8:30 pm      NaN
4       A3  LEC     TR 8:00 am-9:15 am      STO
5       A3  NaN      R 6:30 pm-8:30 pm      NaN
6       B1  DIS      M 4:30 pm-5:45 pm      NaN
7       B2  DIS     T 12:30 pm-1:45 pm      NaN
8       B3  DIS      T 3:30 pm-4:45 pm      NaN
9       B4  DIS      W 8:30 am-9:45 am      CAS
10      B5  DIS      W 4:30 pm-5:45 pm      NaN
11      B6  DIS     R 12:30 pm-1:45 pm      NaN

I would like it to look like this:

   Section Type               Schedule Location
1       A1  LEC    MWF 1:25 pm-2:15 pm  STO B50
2       A1  NaN      R 6:30 pm-8:30 pm     ROOM
1       A2  LEC   MWF 12:20 pm-1:10 pm  STO B50
2       A2  NaN      R 6:30 pm-8:30 pm     ROOM
1       A3  LEC    TR 12:30 pm-1:45 pm  STO B50
2       A3  NaN      R 6:30 pm-8:30 pm     ROOM
3       B1  DIS      T 2:00 pm-3:15 pm  EPC 207
3       B2  DIS      T 3:30 pm-4:45 pm  EPC 207
3       B3  DIS      T 5:00 pm-6:15 pm  EPC 207
3       B4  DIS      R 2:00 pm-3:15 pm  EPC 207
3       B5  DIS      M 2:30 pm-3:45 pm  CAS 324
3       B6  DIS      W 2:30 pm-3:45 pm  CAS 324
3       B7  DIS      R 3:30 pm-4:45 pm  EPC 207
9      SA1  IND   MTWR 1:00 pm-3:00 pm  MCS B29
9      SA2  IND    MTR 6:00 pm-8:30 pm  COM 217
9      SB1  IND  MTWR 11:00 am-1:00 pm  PSY B51
9      SB2  IND    MTR 6:00 pm-8:30 pm  PSY B37
1       A1  LEC  MWF 11:15 am-12:05 pm      STO
2       A1  NaN      R 6:30 pm-8:30 pm      NaN
1       A2  LEC    MWF 2:30 pm-3:20 pm      STO
2       A2  NaN      R 6:30 pm-8:30 pm      NaN
1       A3  LEC     TR 8:00 am-9:15 am      STO
2       A3  NaN      R 6:30 pm-8:30 pm      NaN
3       B1  DIS      M 4:30 pm-5:45 pm      NaN
3       B2  DIS     T 12:30 pm-1:45 pm      NaN
3       B3  DIS      T 3:30 pm-4:45 pm      NaN
3       B4  DIS      W 8:30 am-9:45 am      CAS
3       B5  DIS      W 4:30 pm-5:45 pm      NaN
3       B6  DIS     R 12:30 pm-1:45 pm      NaN

That way, whenever the type is LEC (lecture), the index would be 1. For NaN, the index would be 2. DIS would be 3, and so on...
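
One way the mapping described above could be expressed is sketched below. It is only an illustration: the small DataFrame stands in for the scraped table, and the `type_codes` dictionary is a hypothetical name whose values (LEC=1, NaN=2, DIS=3, IND=9) are read off the desired output rather than taken from any official scheme.

```python
import numpy as np
import pandas as pd

# Small stand-in for the scraped table (the real data comes from pd.read_html)
d = pd.DataFrame({
    "Section": ["A1", "A1", "B1", "SA1"],
    "Type": ["LEC", np.nan, "DIS", "IND"],
})

# Hypothetical mapping inferred from the desired output
type_codes = {"LEC": 1, "DIS": 3, "IND": 9}

# Series.map leaves values without a dictionary entry (including NaN) as NaN;
# those rows are then given the code 2
codes = d["Type"].map(type_codes).fillna(2).astype(int)

# An index may hold duplicate labels, so the codes can serve as the index
d.index = codes.to_numpy()
print(d)

# Selecting with a list of labels always returns a DataFrame
print(d.loc[[3]])
```

Note that any type missing from the dictionary would also end up as 2 in this sketch, so the dictionary would need to cover every type that appears in the real data.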

I have tried reindexing like this, but I get an error.

d.reset_index()
d.reindex(index= range(len(d)))
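
The question does not show the error message, so the following is a guess at the cause rather than a confirmed diagnosis. After `pd.concat`, each table keeps its own 0-based labels, so the combined index contains duplicates, and `reindex` raises a `ValueError` on an axis with duplicate labels. In addition, `reset_index` returns a new DataFrame instead of modifying `d` in place. A minimal sketch, using two made-up frames in place of the scraped tables:

```python
import pandas as pd

# Two small frames standing in for the tables returned by pd.read_html
parts = [
    pd.DataFrame({"Section": ["A1", "A2"]}),
    pd.DataFrame({"Section": ["SA1", "SA2"]}),
]
d = pd.concat(parts)  # the index is now 0, 1, 0, 1 (duplicate labels)

try:
    d.reindex(index=range(len(d)))
except ValueError as exc:
    # reindex cannot align against an index with duplicate labels
    print("reindex failed:", exc)

# reset_index returns a new DataFrame; assign it back and drop the old labels
d = d.reset_index(drop=True)
print(d.index.tolist())
```

Passing `ignore_index=True` to `pd.concat` is another way to get a fresh 0..n-1 index at the point where the tables are combined.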
python pandas dataframe pycharm
1 Answer

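I asked a similar question about accessing rows based on a condition in a specific column and storing them in separate worksheets. You may find Manish Chaudhary's answer to my question helpful: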

Using Openpyxl to create multiple custom spreadsheets

In the end, I gave up on Openpyxl and carried out the task with pandas instead.
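
The linked answer is not reproduced here, so the following is only a rough sketch of how rows might be split by the value in one column using pandas alone. The sample data and the "UNKNOWN" label for missing types are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the course table from the question
d = pd.DataFrame({
    "Section": ["A1", "A1", "A2", "B1"],
    "Type": ["LEC", np.nan, "LEC", "DIS"],
})

# groupby skips NaN keys by default, so give missing types a placeholder label
key = d["Type"].fillna("UNKNOWN")

# One DataFrame per type, keyed by the type name
groups = {name: group for name, group in d.groupby(key)}

for name, group in groups.items():
    print(name)
    print(group)
```

Each resulting DataFrame could then be written to its own worksheet with `DataFrame.to_excel` and a `sheet_name` argument, if separate sheets are the goal.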
