How to manage row spans and column spans via secondary indices

Question

I have the following dataframe, which maps a one-to-many relationship between courses and lessons:

   course_id       course_name  lesson_id     lesson_title
0          0          Learn C           1              foo
1          0          Learn C           2              bar
2          0          Learn C           3              baz
3          1  Origami together          1        the crane
4          1  Origami together          2  crease patterns
5          2        WIP course          1        the first

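How can I format it so that:

each lesson row sits within the span of the course row it belongs to
the `lesson_id` and `lesson_title` columns sit under the span of a common `lessons` column

as shown below: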

                                            lessons
   course_id       course_name         id            title
0          0          Learn C#          1              foo
1                                       2              bar
2                                       3              baz
3          1  Origami together          1        the crane
4                                       2  crease patterns
5          2        WIP course          1        the first

and so that exporting to Excel produces a layout with the same spans?

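Looking at similar questions, I found that the accepted answers rely on a MultiIndex, but in that case the first level of the index would have to include every course-related column.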

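On top of that, the starting table is actually generated dynamically from the corresponding `Course` and `Lesson` dataclasses, so I worry that this approach will not scale well if I add attributes to the `Course` class.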

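Ideally, I would index by `course_id` and `lesson_id`, then specify which columns are indexed by the former and which by the latter, so that course attributes are not repeated for every lesson.

Is there a way to achieve this?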

Answer 1

IIUC, use:

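# Move the course-level columns into the row index; repeated values are shown once on display
out = df.set_index(['course_id','course_name'])
# Split the remaining 'lesson_*' names on '_' into two column levels: ('lesson', 'id'), ('lesson', 'title')
out.columns = out.columns.str.split('_', expand=True)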
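
For a self-contained run, here is a minimal sketch of the same idea applied to the table from the question. The `Course` dataclass below is a hypothetical stand-in for the asker's class, and the sketch assumes its field names match the course column names in the dataframe. Under that assumption, the index columns can be derived from the dataclass instead of being hard-coded, which may ease the scaling concern. Renaming the top column level to `lessons` is an extra step not in the snippet above.

```python
from dataclasses import dataclass, fields

import pandas as pd


# Hypothetical stand-in for the asker's Course class; only the field
# names matter, and they are assumed to match the dataframe columns.
@dataclass
class Course:
    course_id: int
    course_name: str


# The starting table from the question.
df = pd.DataFrame({
    "course_id": [0, 0, 0, 1, 1, 2],
    "course_name": ["Learn C"] * 3 + ["Origami together"] * 2 + ["WIP course"],
    "lesson_id": [1, 2, 3, 1, 2, 1],
    "lesson_title": ["foo", "bar", "baz", "the crane", "crease patterns", "the first"],
})

# Derive the course-level columns from the dataclass, so that a new
# Course attribute joins the row index without editing this code.
course_cols = [f.name for f in fields(Course)]

# Course attributes become the row index (one span per course on display).
out = df.set_index(course_cols)

# 'lesson_id' -> ('lesson', 'id'), 'lesson_title' -> ('lesson', 'title')
out.columns = out.columns.str.split("_", n=1, expand=True)

# Rename the shared top level to match the 'lessons' header in the question.
out = out.rename(columns={"lesson": "lessons"}, level=0)

print(out)
```

When printed, pandas shows each repeated index value only once. `out.to_excel(...)` writes MultiIndex rows and columns as merged cells by default (`merge_cells=True`), provided an Excel writer engine such as openpyxl is installed.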
