Hi, I'm getting the error above. Why does it appear, what am I missing, and how do I fix it? Thanks.
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2
from html2text import html2text
sock = html2text(urllib2.urlopen('http://www.example.com'))
htmlSource = sock.read()
sock.close()
print (htmlSource)
I'm running IDLE 3.4.3 on Windows 7.
html2text expects the HTML code to be passed in as a string, so read the response first:
source = urllib2.urlopen('http://www.example.com').read()
text = html2text(source)
print(text)
Prints:
# Example Domain
This domain is established to be used for illustrative examples in documents.
You may use this domain in examples without prior coordination or asking for
permission.
[More information...](http://www.iana.org/domains/example)
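To make the "pass a string, not the response" point concrete, here is a minimal sketch of a helper that accepts a file-like object, bytes, or a string and always hands back a string. The name to_text and its encoding parameter are my own, not part of html2text or urllib, and io.BytesIO only stands in for a real response so the snippet runs without a network connection.

```python
import io


def to_text(source, encoding="utf-8"):
    """Return `source` as a str, whatever form it arrived in (hypothetical helper)."""
    # File-like objects (such as the one urlopen returns) expose read().
    if hasattr(source, "read"):
        source = source.read()
    # On Python 3 the body comes back as bytes; decode before text processing.
    if isinstance(source, bytes):
        source = source.decode(encoding)
    return source


# io.BytesIO stands in for a network response (an assumption for offline use).
fake_response = io.BytesIO(b"<h1>Example Domain</h1>")
html = to_text(fake_response)
print(type(html).__name__, html)
```

The result of to_text(...) is what you would then pass to html2text.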
I think I found the solution for Python 3.4. I just decoded the source as UTF-8 and it worked.
#!/usr/bin/python
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2
from html2text import html2text
source=urllib2.urlopen('http://www.example.com').read()
s=html2text(source.decode("UTF-8"))
print (s)
Output:
# Example Domain
This domain is established to be used for illustrative examples in documents.
You may use this domain in examples without prior coordination or asking for
permission.
[More information...](http://www.iana.org/domains/example)
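Hardcoding UTF-8 works for this page, but not every server sends UTF-8. As a rough sketch (not something from the answer above), the charset can be taken from the Content-Type header and UTF-8 used only as a fallback. The function names charset_from_content_type and decode_body are made up for illustration; the parsing relies on the standard library's email.message.Message.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value."""
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # Returns the lowercased charset, or `default` if none is declared.
    return msg.get_content_charset(failobj=default)


def decode_body(raw, content_type=None):
    """Decode response bytes using the declared charset, else UTF-8."""
    charset = charset_from_content_type(content_type)
    # errors="replace" substitutes undecodable bytes instead of raising.
    return raw.decode(charset, errors="replace")


# Offline check with a Latin-1 encoded body and a matching header value.
body = "caf\u00e9".encode("latin-1")
text = decode_body(body, "text/html; charset=ISO-8859-1")
print(text == "caf\u00e9")
```

With a real Python 3 response, the header value should be available as response.headers.get("Content-Type"). Note that a charset name Python does not recognise would still raise LookupError in this sketch.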
replace is an attribute of strings, but what you have is a file object:
obj=urllib2.urlopen('http://www.example.com')
print obj
<addinfourl at 3066852812L whose fp = <socket._fileobject object at 0xb6d267ec>>
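A quick way to see the difference for yourself, sketched with io.BytesIO standing in for the response object (my assumption, so it runs offline): the file-like object has read() but no replace(), while a string has replace().

```python
import io

# io.BytesIO stands in for the object urlopen returns (assumption for offline use).
response_like = io.BytesIO(b"<p>hello</p>")

# File-like objects have no replace(); their content must be read out first.
print(hasattr(response_like, "replace"))
print(hasattr(response_like, "read"))

# Strings do have replace(), which is what text-processing code relies on.
print(hasattr("<p>hello</p>", "replace"))
```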
This works (under Python 2, where read() returns a str):
#!/usr/bin/python
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2
from html2text import html2text
source=urllib2.urlopen('http://www.example.com').read()
s=html2text(source)
print s
Output:
This domain is established to be used for illustrative examples in documents.
You may use this domain in examples without prior coordination or asking for
permission.
[More information...](http://www.iana.org/domains/example)