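I need to clean up the text, but I'm stuck trying to get rid of the symbols at the end of the string, which are most likely Unicode symbols, emoji, and newline symbols: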
\\n#\\xe3\\x82\\xa2\\xe3
This is what I do:
pat1 = r'@[A-Za-z0-9]+' # remove any text starting with @ (mentions)
pat2 = r'https?://[A-Za-z0-9./]+' # remove the urls
pat3 = r'[^a-zA-Z0-9$]' # remove every other character except a-z, A-Z, 0-9 and $
combined_pat2 = r'|'.join((pat1, pat2, pat3)) # combine pat1, pat2 and pat3 to pass it in the cleaning steps
I still have all these n's and xe3's:
get $10 worth of amal nthis campaign will be final airdrop before official release n e n nrt please n n amanpuri amal n btc xrp eth n cryptocurrency n china bitcoin n xe3 x82 xa2 xe3 x83 x9e xe3 x83 xb3 xe3 x83 x97 xe3 x83 xaa
Could anybody suggest a Python regular expression for this purpose? Thanks in advance.
These are not characters. They are escape sequences. You can match them using the following regex:
If you want to remove them, use:
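A minimal sketch of one way to do this, assuming the text holds literal backslash sequences (a backslash followed by n, r, t or xNN) rather than real newlines or decoded characters. The names ESCAPE_PAT and strip_escapes are illustrative and not from the original post. It would need to run before the pat3 step, because pat3 strips the backslash itself and leaves the n or xe3 behind, which is presumably where the stray n's and xe3's come from.

```python
import re

# Assumption: the input contains literal escape text such as \n or \xe3
# (one or more backslashes followed by letters), not actual control characters.
ESCAPE_PAT = re.compile(r'\\+(?:x[0-9a-fA-F]{2}|[nrt])')

def strip_escapes(text):
    # Replace each literal escape sequence with a space so that
    # neighbouring words do not get glued together.
    without_escapes = ESCAPE_PAT.sub(' ', text)
    # Collapse the runs of whitespace left behind and trim the ends.
    return ' '.join(without_escapes.split())

sample = r'amal\n#\xe3\x82\xa2\xe3'
print(strip_escapes(sample))
```

If the data is really a bytes object that was converted with str(), decoding it properly (for example with .decode('utf-8')) before cleaning may be a better fix than stripping the escapes afterwards.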