How to aggregate sequences with WeightedCluster and use them in multichannel sequence analysis

Question

I have 54,399 cases and 2 channels (HOM and HOS), and I want to run a multichannel sequence analysis. A sample of the data:

ID	HOM1	HOM2	HOM3	HOM4	HOS1	HOS2	HOS3	HOS4
1	A	A	B	C	No	Yes	No	No
2	A	B	A	A	Yes	Not sure	Yes	Yes

The code I used:

HOM.seq<-seqdef(df[, 2:5])
HOS.seq<-seqdef(df[, 6:9])
channels<-list(HOM.seq, HOS.seq)
MDdist<-seqMD(channels, method="OM", sm=list("TRATE", "TRATE"), what="diss")

However, it produces the warning "52322 unique sequences exceeds the maximum allowed of 46340".

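My question is how to use wcAggregateCases to reduce the number of unique sequences, even though these 52322 already appear to be the result of aggregating the 54399 sequences. Or can I apply wcAggregateCases before putting HOM and HOS into the channels list? Thanks.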

I applied wcAggregateCases to HOM and HOS separately: HOM aggregates to about 10,000 cases and HOS to about 7,000.

Answer 1

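You can compute the weights and the unique sequences from a combined sequence object. The combined sequence merges, at each time position, the states from the different channels. Here is an example of how to do it: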

library(TraMineR)
data(biofam)

## Building one channel per type of event left home, married, and child
bf <- as.matrix(biofam[, 10:25])
left <- bf==1 | bf==3 | bf==5 | bf==6
married <- bf == 2 | bf== 3 | bf==6
children <-  bf==4 | bf==5 | bf==6

## Building sequence objects
left.seq <- seqdef(left)
marr.seq <- seqdef(married)
child.seq <- seqdef(children)
channels <- list(LeftHome=left.seq, Marr=marr.seq, Child=child.seq)

## Retrieving the MD sequences or combined sequence
MDseq <- seqMD(channels)
## Now you have one sequence made by combining the different channels. 
alphabet(MDseq)

## Use wcAggregateCases() on the combined sequence
library(WeightedCluster)
ac <- wcAggregateCases(MDseq)
print(ac)
## Retrieving unique cases in the original data set
uniqueChannels <- list(LeftHome=left.seq[ac$aggIndex, ], Marr=marr.seq[ac$aggIndex, ], Child=child.seq[ac$aggIndex, ])
## Distance on unique data
## One substitution-cost method per channel (three channels here)
MDdist <- seqMD(uniqueChannels, method="OM", sm=list("TRATE", "TRATE", "TRATE"), what="diss")
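
Once distances are computed on the unique cases only, the counts returned by wcAggregateCases() have to be carried into whatever comes next, otherwise frequent sequences are under-represented. Below is a minimal sketch of one way to do that. It assumes a weighted PAM clustering with wcKMedoids() and an arbitrary k = 4, neither of which is part of the example above, and the names pam4 and fullClusters are purely illustrative.

```r
library(TraMineR)
library(WeightedCluster)

data(biofam)
bf <- as.matrix(biofam[, 10:25])

## One logical channel per event, built the same way as in the example above
channels <- list(
  LeftHome = seqdef(bf == 1 | bf == 3 | bf == 5 | bf == 6),
  Marr     = seqdef(bf == 2 | bf == 3 | bf == 6),
  Child    = seqdef(bf == 4 | bf == 5 | bf == 6)
)

## Combined sequences, then aggregation of identical combined cases
MDseq <- seqMD(channels)
ac <- wcAggregateCases(MDseq)

## Keep one row per unique combined sequence in every channel
uniqueChannels <- lapply(channels, function(s) s[ac$aggIndex, ])

## Distances between unique cases only (one cost method per channel)
MDdist <- seqMD(uniqueChannels, method = "OM",
                sm = list("TRATE", "TRATE", "TRATE"), what = "diss")

## Weighted PAM (assumed choice, k = 4 is arbitrary):
## each unique case counts as many times as it occurs in the data
pam4 <- wcKMedoids(MDdist, k = 4, weights = ac$aggWeights)

## Expand the cluster labels back to all original cases
fullClusters <- pam4$clustering[ac$disaggIndex]
table(fullClusters)
```

ac$aggWeights holds the number of original cases behind each unique sequence, and ac$disaggIndex maps every original case to its row in the aggregated data, which is what allows the labels to be expanded at the end. Note that the "TRATE" costs here are estimated from the unique cases, so they may differ somewhat from costs estimated on the full data.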
