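I am trying to convert a list into a DataFrame. The list comes from a document, with its words split out one per line, and it then needs to be converted into a DataFrame. However, after running the for loop, the DataFrame contains no data.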
import urllib.request
import pandas as pd
data = urllib.request.urlopen('https://www.w3.org/TR/PNG/iso_8859-1.txt')
wordlist = pd.DataFrame(columns = ['col1'])
for line in data:
    for word in line.split():
        print(word)
        wordlist.append({'col1': word}, ignore_index=True)
The words are split correctly:
b'The'
b'following'
b'are'
b'the'
b'graphical'
b'(non-control)'
b'characters'
But the DataFrame I appended to comes back empty:
print(wordlist)
Empty DataFrame
Columns: [col1]
Index: []
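I was using the wrong syntax. DataFrame.append does not modify the DataFrame in place; it returns a new DataFrame, so the result has to be assigned back to wordlist: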
for line in data:
    for word in line.split():
        print(word)
        wordlist = wordlist.append({'col1': word}, ignore_index=True)
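Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above only runs on older versions. A minimal sketch of an alternative that should work on both old and new versions: collect the words in a plain Python list and build the DataFrame once after the loop. The sample_lines data is a hypothetical stand-in for the byte lines yielded by urlopen, used so the example runs without network access.

```python
import pandas as pd

# Hypothetical stand-in for the byte lines yielded by urllib.request.urlopen(...)
sample_lines = [
    b"The following are the\n",
    b"graphical (non-control) characters\n",
]

words = []
for line in sample_lines:
    for word in line.split():
        # list.append mutates the list in place, unlike DataFrame.append
        words.append(word)

# Build the DataFrame once, after the loop has finished
wordlist = pd.DataFrame({"col1": words})
print(len(wordlist))
```

Building the DataFrame once also avoids copying the whole frame on every iteration, which is what repeated appends to a DataFrame do.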
You can also try this, which splits the data directly without a loop:
import urllib.request
import pandas as pd
data = urllib.request.urlopen('https://www.w3.org/TR/PNG/iso_8859-1.txt')
wordlist = pd.DataFrame(data.read().split(), columns = ['col1'])
Output:
             col1
0          b'The'
1    b'following'
2          b'are'
3          b'the'
4    b'graphical'
..            ...
853      b'SMALL'
854     b'LETTER'
855          b'Y'
856       b'WITH'
857  b'DIAERESIS'
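The b'...' prefixes in the output show that the column holds bytes objects, because urlopen returns raw bytes. If plain strings are wanted, one option is to decode before splitting. A minimal sketch, assuming the text is encoded as ISO 8859-1 (an assumption based on the file name, not verified against the server's response headers); raw is a hypothetical stand-in for data.read().

```python
import pandas as pd

# Hypothetical stand-in for data.read(); the real call returns the whole file as bytes
raw = b"The following are the graphical (non-control) characters"

# Assumed encoding: ISO 8859-1 (guessed from the file name)
text = raw.decode("iso-8859-1")

# str.split() with no arguments splits on any run of whitespace, including newlines
wordlist = pd.DataFrame(text.split(), columns=["col1"])
print(wordlist["col1"].tolist())
```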