How to process data internally so that it matches what is displayed when output to a terminal

Question

I have this string:

"birthday_balloons.\u202egpj"

If I run

print("birthday_balloons.\u202egpj")

it prints

birthday_balloons.jpg

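Note how the last three characters are reversed. I want to process the string "birthday_balloons.\u202egpj" so as to obtain the string "birthday_balloons.jpg", with the characters in the same order in which they are displayed.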

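I am looking for a way to process a piece of data internally so that it matches what appears when it is output to a terminal, without doing anything like literally capturing the terminal output.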

Answer 1

The \u202e character is the right-to-left override; its sibling, \u202d, is the left-to-right override.

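To interpret these characters so that every character ends up in the string in the order in which it is printed, you can use this function: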

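def apply_direction(s):
    # Rearrange s into display order by interpreting U+202E (RLO) and U+202D (LRO).
    def recur(s, ltr):
        try:
            # In an LTR run look for the next RLO; in an RTL run look for the next LRO.
            i = s.index("\u202d\u202e"[ltr])
        except ValueError:
            return s
        # The remainder runs in the opposite direction: resolve it, then reverse it.
        return s[:i] + recur(s[i+1:], not ltr)[::-1]
    # Drop any override characters that are left, e.g. a repeated RLO inside an RTL run.
    return recur(s, True).replace("\u202d", "").replace("\u202e", "")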

Applied to your example:

s = "birthday_balloons.\u202egpj"
s = apply_direction(s)
print(s)  # birthday_balloons.jpg (last three characters reordered)

Answer 2

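U+202E is RIGHT-TO-LEFT OVERRIDE (RLO). It marks the start of a bidirectional override, forcing the text that follows to be rendered right to left regardless of the characters' own directionality. It is closed by U+202C POP DIRECTIONAL FORMATTING (PDF).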
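To illustrate how an override run and its closing PDF interact, here is a minimal sketch that puts a string into visual order. The name resolve_overrides is hypothetical, and the sketch assumes every character other than RLO, LRO, and PDF simply follows the direction of the run it sits in. It is a simplified model, not an implementation of the full Unicode Bidirectional Algorithm.

```python
RLO, LRO, PDF = "\u202e", "\u202d", "\u202c"

def resolve_overrides(text):
    """Return text in visual order, honoring RLO/LRO runs closed by PDF.

    Assumption: all other characters follow the direction of their run.
    """
    def parse(i, rtl):
        items = []  # single characters or already-resolved nested runs
        while i < len(text):
            ch = text[i]
            i += 1
            if ch == PDF:
                break  # closes the current override run
            if ch in (RLO, LRO):
                nested, i = parse(i, ch == RLO)
                items.append(nested)
            else:
                items.append(ch)
        if rtl:
            items.reverse()  # nested runs move as a unit and keep their own order
        return "".join(items), i

    pieces = []
    i = 0
    while i < len(text):  # restart after a stray PDF at the top level
        piece, i = parse(i, False)
        pieces.append(piece)
    return "".join(pieces)

print(resolve_overrides("birthday_balloons.\u202egpj"))
```

Under these assumptions, text after a PDF returns to the enclosing direction, so "a\u202ebc\u202cd" resolves to "acbd" rather than reversing everything after the RLO.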

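Its presence in a file name would indicate malicious intent. In a terminal that supports bidirectional formatting, the string 'birthday_balloons.\u202egpj' visually appears as 'birthday_balloons.jpg', although most terminals do not fully support bidirectional text. In a web service or on a web page, the override is a bigger problem.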

The last five characters of the string are

002E 202E 0067 0070 006A

that is,

. RLO g p j
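The code points listed above can be checked with the standard unicodedata module. This is only a small inspection sketch; describe is a hypothetical helper name.

```python
import unicodedata

s = "birthday_balloons.\u202egpj"

def describe(ch):
    # name() raises ValueError for unnamed characters, so supply a fallback
    return f"{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"

tail = [describe(ch) for ch in s[-5:]]
for line in tail:
    print(line)
# 002E FULL STOP
# 202E RIGHT-TO-LEFT OVERRIDE
# 0067 LATIN SMALL LETTER G
# 0070 LATIN SMALL LETTER P
# 006A LATIN SMALL LETTER J
```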

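The simplest approach is to split the file name into components, test each for an override, and then use a list comprehension to repair the components that contain one: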

import re

# Test for presence of an RLO character
def override_exists(text):
    return re.search(r'\u202e', text)

# Remove RLO and PDF characters and reverse string
def repair_string(text):
    return re.sub(r'[\u202c\u202e]', '', text)[::-1]

# Split file name and use list comprehension to test and repair string.
def clean_file_name(file_name):
    components = file_name.split('.')
    cleaned = [repair_string(comp) if override_exists(comp) else comp for comp in components]
    return ".".join(cleaned)

s = 'birthday_balloons.\u202egpj'
print(clean_file_name(s))
# birthday_balloons.jpg

Nevertheless, this repair mechanism masks the problem and may create a security vulnerability.

A better approach is for the repair function to strip the override characters without reversing the text:

def repair_string(text):
    return re.sub(r'[\u202c\u202e]', '', text)

so that:

print(clean_file_name(s))
# birthday_balloons.gpj

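This removes the RLO and displays the file name in a way that shows the file extension is not .jpg and that the name is suspicious. Alternatively, detecting an override could raise or log an exception.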
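As a rough sketch of the raise-on-detection option, the following rejects any file name that contains an explicit bidirectional control character. The names check_file_name and SuspiciousFileName are hypothetical, and widening the check from RLO alone to all embedding, override, and isolate controls is an assumption of this sketch rather than part of the approach above.

```python
import unicodedata

# Bidi classes of the explicit embedding, override, and isolate controls
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

class SuspiciousFileName(ValueError):
    """Hypothetical exception type for names containing bidi controls."""

def check_file_name(file_name):
    # Collect the code points of every bidi control character in the name
    found = [f"U+{ord(ch):04X}" for ch in file_name
             if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES]
    if found:
        # repr() escapes the control characters so the message itself is not reordered
        raise SuspiciousFileName(
            f"bidi control characters in {file_name!r}: {', '.join(found)}")
    return file_name

try:
    check_file_name("birthday_balloons.\u202egpj")
except SuspiciousFileName as exc:
    print(exc)
```

A caller that prefers logging over failing could catch SuspiciousFileName and record the message instead.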
