UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128) with Python

Question

I ran into a big problem while testing data mining from Twitter by searching for tweets by keyword. The code below fails with: UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

retweet = "-filter:retweets"
query = "#Thailand" + retweet 

df = pd.DataFrame(columns = ["create_at","user","location","text", "retweet_count", "favourite_count","hashtag","follower","source"])
for tweet in tweepy.Cursor(api.search, q = query,result_type="recent", tweet_mode='extended').items(100):

    entity_hashtag = tweet.entities.get('hashtags')
    hashtag = ""
    for i in range(0, len(entity_hashtag)):
        hashtag = hashtag + "/" + entity_hashtag[i]["text"]
    re_count = tweet.retweet_count
    create_at = tweet.created_at
    user = tweet.user.screen_name
    source = tweet.source
    location = tweet.user.location
    follower = tweet.user.followers_count

    try:
        text = tweet.retweeted_status.full_text
        fav_count = tweet.retweeted_status.favorite_count 

    except AttributeError:  # the tweet is not a retweet, so it has no retweeted_status
        text = tweet.full_text
        fav_count = tweet.favorite_count  
    new_column = pd.Series([create_at,user,location,text, re_count, fav_count,hashtag,follower,source], index = df.columns)
    df = df.append(new_column, ignore_index = True)

df.to_csv(date_time+".csv")

Why does this problem occur?

Answer 1

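Try setting the system default encoding to utf-8 at the start of your script. The following should do that. Note that it only works on Python 2: on Python 3, reload is no longer a builtin and sys.setdefaultencoding does not exist.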

import sys
reload(sys)
sys.setdefaultencoding('utf-8')
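
An alternative that avoids changing the interpreter-wide default is to encode explicitly wherever text leaves the program. The sketch below reproduces the error with a three-character Thai string and then writes the same text through a file opened as UTF-8. The sample string and the file name tweets.csv are illustrative assumptions, not taken from the question. For the question's code, the equivalent would presumably be passing encoding='utf-8' to df.to_csv.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Illustrative sample: three non-ASCII characters, matching "position 0-2".
text = u"ไทย"

# Encoding with the ASCII codec fails for any code point above 127.
try:
    text.encode("ascii")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

# Encoding explicitly as UTF-8 succeeds.
encoded = text.encode("utf-8")

# Opening the output file with an explicit encoding avoids the implicit ASCII step.
# The file name is a placeholder for this sketch.
path = os.path.join(tempfile.mkdtemp(), "tweets.csv")
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(u"text\n" + text + u"\n")

# Read the file back to confirm the text survived the round trip.
with io.open(path, "r", encoding="utf-8") as handle:
    round_trip = handle.read()
```

io.open behaves the same way on Python 2 and Python 3, so the sketch does not depend on which version the question was using.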

Answer 2

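You didn't mention which version of Python you are using, but I would look at the Python documentation on this topic here: https://www.python.org/dev/peps/pep-0263/ (for Python 2)

From there:

To define a source code encoding, a magic comment must be placed into the source file either as the first or second line in the file, such as:

# coding=<encoding name>

or (using formats recognized by popular editors):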

#!/usr/bin/python
# -*- coding: <encoding name> -*-

or:

#!/usr/bin/python
# vim: set fileencoding=<encoding name> :

I have used this version in some cases:

#!/usr/bin/python
# -*- coding: <encoding name> -*-

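That said, in Python 2 some functions, notably str(), should not be used with unicode values. Use unicode() instead. When you use third-party libraries, you will have to check their documentation, and if the documentation is limited you may need to read their source.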
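
As a rough illustration of the str() versus unicode() point, here is a minimal sketch of a conversion helper that decodes byte strings explicitly instead of relying on the default codec. The names text_type, to_text and to_bytes are hypothetical and do not come from any library, and UTF-8 is assumed as the encoding.

```python
# -*- coding: utf-8 -*-
import sys

# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
# The conditional is lazy, so `unicode` is never looked up on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as text, decoding byte strings explicitly."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: return value as bytes encoded with the given codec."""
    return to_text(value, encoding).encode(encoding)


hashtag_text = to_text(u"ไทย".encode("utf-8"))  # bytes in, text out
follower_text = to_text(42)                      # non-string values are converted too
hashtag_bytes = to_bytes(u"ไทย")                 # text in, UTF-8 bytes out
```

Keeping every value as text inside the program and calling something like to_bytes only at the output boundary means the ASCII codec is never used implicitly.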
