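I am learning pattern mining and want to do multi-level association mining. My dataset contains 25,035 unique transactions, and each transaction can contain 1 to 12 items. In total there are 3,788 products, 18 sub-categories, and 3 categories. My problem is adding the aggregation levels (the hierarchy) to the transaction data. Whichever approach I try, I always get an error message saying that the number of labels and the number of columns do not match.
I would like my data to end up looking like this: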
> head(itemInfo(Groceries))
             labels  level2           level1
1       frankfurter sausage meat and sausage
2           sausage sausage meat and sausage
3        liver loaf sausage meat and sausage
4               ham sausage meat and sausage
5              meat sausage meat and sausage
6 finished products sausage meat and sausage
The original dataset looks like this:
         Order.ID Sub.Category                            Product.Name Category
1    AG-2011-2040      Storage                     Tenex Lockers- Blue        2
2   IN-2011-47883     Supplies                Acme Trimmer- High Speed        2
3    HU-2011-1220      Storage                 Tenex Box- Single Width        2
4 IT-2011-3647632        Paper             Enermax Note Cards- Premium        2
5   IN-2011-47883  Furnishings              Eldon Light Bulb- Duo Pack        1
6   IN-2011-47883        Paper Eaton Computer Printout Paper- 8.5 x 11        2
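Since `itemInfo` holds one row per item, one thing that may be worth checking on the raw data is whether every product maps to exactly one sub-category and category. Below is a minimal sketch of that check on toy data. The data frame name `raw`, the helper names `lookup` and `ambiguous`, and the toy rows (including the product names) are assumptions for illustration only, not my real data.

```r
# Toy stand-in for the raw data; names and rows are hypothetical.
raw <- data.frame(
  Order.ID     = c("A-1", "A-1", "B-2", "C-3"),
  Sub.Category = c("Storage", "Paper", "Storage", "Supplies"),
  Product.Name = c("Product A", "Product B", "Product A", "Product B"),
  Category     = c(2, 2, 2, 2),
  stringsAsFactors = FALSE
)

# One row per distinct (product, sub-category, category) combination.
lookup <- unique(raw[, c("Product.Name", "Sub.Category", "Category")])

# Products that appear under more than one hierarchy path.
ambiguous <- unique(lookup$Product.Name[duplicated(lookup$Product.Name)])

# If these two counts differ, the lookup cannot be used as itemInfo as-is.
n_products <- length(unique(raw$Product.Name))
n_lookup   <- nrow(lookup)

print(ambiguous)
print(c(products = n_products, lookup_rows = n_lookup))
```

In the toy rows, "Product B" is deliberately listed under two sub-categories, so the lookup ends up with more rows than there are products. A mismatch of this kind would have to be resolved before the table can be aligned with the items.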
I tried following this tutorial, but without success.
So far I have managed to get my data to the stage shown below:
    items                                  transactionID
[1] {Epson Calculator, Red,
     Fellowes File Cart, Industrial}       AE-2011-9160
[2] {Accos Paper Clips, Bulk Pack,
     Bush Stackable Bookrack, Pine}        AE-2013-1130
[3] {Stiletto Letter Opener, High Speed,
     Tenex Folders, Blue}                  AE-2013-1530
[4] {Rogers File Cart, Industrial}         AE-2014-2840
[5] {Avery Binder Covers, Clear,
     Avery Binder Covers, Recycled,
     BIC Pencil Sharpener, Water Color,
     Eldon Lockers, Blue,
     Motorola Headset, VoIP,
     Rogers File Cart, Single Width}       AE-2014-3830
[6] {Hon Color Coded Labels, Adjustable}   AE-2014-4120
I understand the concept and the logic behind it, but I cannot get it to work in R with my actual dataset, only with small "dummy" data.
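To make the goal concrete, here is a minimal sketch of how the hierarchy columns could be attached, assuming the `arules` package and using only the six raw rows shown earlier as toy input. The names `raw`, `lookup`, `idx`, and `by_sub` are hypothetical, and this is one possible approach rather than the tutorial's method. The idea is to build the lookup table in the same order as `itemLabels()`, so that `itemInfo` gets exactly one row per item.

```r
library(arules)

# Toy input: the six raw rows shown earlier (the data frame name is an assumption).
raw <- data.frame(
  Order.ID     = c("AG-2011-2040", "IN-2011-47883", "HU-2011-1220",
                   "IT-2011-3647632", "IN-2011-47883", "IN-2011-47883"),
  Sub.Category = c("Storage", "Supplies", "Storage",
                   "Paper", "Furnishings", "Paper"),
  Product.Name = c("Tenex Lockers- Blue", "Acme Trimmer- High Speed",
                   "Tenex Box- Single Width", "Enermax Note Cards- Premium",
                   "Eldon Light Bulb- Duo Pack",
                   "Eaton Computer Printout Paper- 8.5 x 11"),
  Category     = c(2, 2, 2, 2, 1, 2),
  stringsAsFactors = FALSE
)

# One transaction per order; the items are the product names.
trans <- as(split(raw$Product.Name, raw$Order.ID), "transactions")

# One row per product with its sub-category and category.
lookup <- unique(raw[, c("Product.Name", "Sub.Category", "Category")])

# Reorder the lookup to follow the item order used inside `trans`.
idx <- match(itemLabels(trans), lookup$Product.Name)
stopifnot(!anyNA(idx))  # every item label must be found in the lookup

# Replace itemInfo with labels plus the two hierarchy levels.
itemInfo(trans) <- data.frame(
  labels = itemLabels(trans),
  level2 = factor(lookup$Sub.Category[idx]),
  level1 = factor(lookup$Category[idx]),
  stringsAsFactors = FALSE
)

print(itemInfo(trans))

# Aggregate the items to the sub-category level for multi-level mining.
by_sub <- aggregate(trans, by = "level2")
```

If `match()` returns `NA` for some labels, the product names in the transactions and in the lookup differ in some way (spelling, punctuation, or whitespace), and that would need to be resolved before assigning `itemInfo`.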
Thanks in advance for any tips on solving this problem.