Python word2vec update


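I am trying to convert this old code snippet so that it works with the updated version of gensim. I was able to convert model.wv.vocab to model.wv.key_to_index, but I am having trouble with model[model.wv.vocab] and how to convert it.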

Here is the code:

model = Word2Vec(corpus, min_count = 1, vector_size = 5 )

#pass the embeddings to PCA
X = model[model.wv.vocab]

pca = PCA(n_components=2)
result = pca.fit_transform(X)

#create df from the pca results
pca_df = pd.DataFrame(result, columns = ['x','y'])

I tried this:

#pass the embeddings to PCA
X = model.wv.key_to_index
pca = PCA(n_components=2)
result = pca.fit_transform(X)

#create df from the pca results
pca_df = pd.DataFrame(result, columns = ['x','y'])

and I keep getting an error. This is what model.wv.key_to_index looks like:

{'the': 0,
 'in': 1,
 'of': 2,
 'on': 3,
 '': 4,
 'and': 5,
 'a': 6,
 'to': 7,
 'were': 8,
 'forces': 9,
 'by': 10,
 'was': 11,
 'at': 12,
 'against': 13,
 'for': 14,
 'protest': 15,
 'with': 16,
 'an': 17,
 'as': 18,
 'police': 19,
 'killed': 20,
 'district': 21,
 'city': 22,
 'people': 23,
 'al': 24,
 'came': 996,
 'donbass': 997,
 'resulting': 998,
 'financial': 999}
python pca word2vec
1 Answer

This code ended up working for me:

word_vectors = model.wv

# Accessing word vectors using the updated syntax
vectors = word_vectors.vectors
vocab = word_vectors.index_to_key

# Retrieving vectors for specific words (for instance, for the first 10 words)
selected_words = vocab[:10]
selected_vectors = [word_vectors[word] for word in selected_words]