How do I access Firefox's internal IndexedDB files using Python?

Question

I need to read Firefox's IndexedDB using Python.

I use the sqlite3 package to retrieve the contents of the IndexedDB:

import sqlite3

# indexeddb_file is the path to the .sqlite file inside the Firefox profile
with sqlite3.connect(indexeddb_file) as conn:
    c = conn.cursor()
    c.execute('select * from object_data;')
    rows = c.fetchall()
    for row in rows:
        print(row[2])

However, although I know the contents of the database are strings, they are stored as SQLite binary blobs. Is there a way to read strings stored as blobs from Python?

I have already tried:

The hex() and quote() SQL functions, which just encode the blob as hexadecimal
Writing the blob to a file
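
For reference, under Python 3 the sqlite3 module hands back a BLOB column as a bytes object, which can be inspected and decoded directly. The sketch below uses a throwaway in-memory table rather than a real profile file; the column names, the sample bytes, and the assumption that the blob is plain UTF-16LE text are illustrative only, and a real Firefox database may well not decode this cleanly.

```python
import sqlite3

# Hypothetical stand-in for the real file: an in-memory table with BLOB columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object_data (id INTEGER, key_value BLOB, data BLOB)")
conn.execute(
    "INSERT INTO object_data VALUES (?, ?, ?)",
    (1, b"\x03ij", "hi".encode("utf-16-le")),  # made-up key and value bytes
)

rows = conn.execute("SELECT * FROM object_data").fetchall()
blob = rows[0][2]        # a bytes object under Python 3
hex_dump = blob.hex()    # lowercase form of what SQL hex() returns
# errors="replace" turns undecodable byte sequences into U+FFFD instead of raising
text = blob.decode("utf-16-le", errors="replace")
conn.close()

print(type(blob).__name__, hex_dump, text)
```

If the decoded text is only partly readable, the blob is probably not plain UTF-16, and the raw bytes in hex_dump are the better starting point.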

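Update

Following the encoding scheme used by the IndexedDB implementation in the Firefox source code, which @paa pointed out in one of the comments on this question, I implemented part of Firefox's encoding method for database keys in Python. So far I have only implemented it for strings, but implementing it for the other types should be easier: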

import re

BYTE_LENGTH = 8  # bits per byte


def hex_to_bin(hex_str):
    """Return binary representation of hexadecimal string."""
    return trim_bin(int(hex_str, 16)).zfill(len(hex_str) * 4)


def byte_to_unicode(bin_str):
    """Return the character whose code point is the given binary string."""
    return chr(int(str(bin_str), 2))


def trim_bin(int_n):
    """Return int num converted to trimmed bin representation."""
    return bin(int_n)[2:]


def decode(key):
    """Return decoded idb key (key is the hex string of the key blob)."""
    decoded = key
    m = re.search("[1-9]", key)  # TODO: change to match any non-zero hex digit
    if not m:
        raise ValueError("no type marker found in key")
    i = m.start()
    typeoffset = int(key[i])
    data = key[i + 1:]
    if typeoffset == 1:
        # decode number
        pass
    elif typeoffset == 2:
        # decode date
        pass
    elif typeoffset == 3:
        # decode string
        bin_repr = hex_to_bin(data) if data else ""
        decoded = ""
        pos = 0
        # A while loop is needed because a character takes 1, 2 or 3 bytes.
        while pos < len(bin_repr):
            byte = bin_repr[pos:pos + BYTE_LENGTH]
            if int(byte, 2) == 0:
                break  # a zero byte terminates the string
            if byte[0] == '0':
                # 0xxxxxxx: one byte, stored as code point + 1
                decoded += byte_to_unicode(trim_bin(int(byte, 2) - 1))
                pos += BYTE_LENGTH
            elif byte[1] == '0':
                # 10xxxxxx xxxxxxxx: 14 payload bits, stored as code point - 127
                bits = bin_repr[pos + 2:pos + 2 * BYTE_LENGTH]
                decoded += byte_to_unicode(trim_bin(int(bits, 2) + 127))
                pos += 2 * BYTE_LENGTH
            else:
                # 11xxxxxx xxxxxxxx xx000000: 16 payload bits after the marker
                bits = bin_repr[pos + 2:pos + 2 + 2 * BYTE_LENGTH]
                decoded += byte_to_unicode(bits)
                pos += 3 * BYTE_LENGTH
        return decoded
    elif typeoffset == 4:
        # decode array
        pass
    else:
        # error: unknown type
        pass
    return decoded
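
One way to sanity-check the string branch is to write the inverse and round-trip a few values. The following is a minimal sketch that works on raw bytes instead of hex strings; it assumes the one-, two- and three-byte layout described above (code point + 1, code point - 127 with a `10` prefix, and 16 bits after a `11` prefix) and leaves out the leading type marker. The function names are made up for this example.

```python
import struct


def encode_string(text: str) -> bytes:
    """Encode text with the assumed 1/2/3-byte key scheme (no type marker)."""
    raw = text.encode("utf-16-le")
    units = struct.unpack("<%dH" % (len(raw) // 2), raw)  # UTF-16 code units
    out = bytearray()
    for c in units:
        if c <= 0x7E:
            out.append(c + 1)                              # 0xxxxxxx
        elif c <= 0x407E:
            out += (c - 0x7F + 0x8000).to_bytes(2, "big")  # 10xxxxxx xxxxxxxx
        else:
            out += ((c << 6) | 0xC00000).to_bytes(3, "big")  # 11 + 16 bits + 000000
    return bytes(out)


def decode_string(data: bytes) -> str:
    """Decode bytes produced by encode_string; stops at a zero byte."""
    units = []
    pos = 0
    while pos < len(data) and data[pos] != 0:
        first = data[pos]
        if first & 0x80 == 0:
            units.append(first - 1)
            pos += 1
        elif first & 0x40 == 0:
            value = int.from_bytes(data[pos:pos + 2], "big")
            units.append(value - 0x8000 + 0x7F)
            pos += 2
        else:
            value = int.from_bytes(data[pos:pos + 3], "big")
            units.append((value >> 6) & 0xFFFF)  # drop the marker and padding bits
            pos += 3
    return struct.pack("<%dH" % len(units), *units).decode("utf-16-le")


samples = ["abc", "caf\u00e9", "\u4e2d\u6587"]
round_trips = [decode_string(encode_string(s)) for s in samples]
print(encode_string("abc"), round_trips)
```

Working on bytes avoids the bit-string slicing entirely, which makes the three cases easier to compare against the C++ source.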

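However, I still can't decode the data fields of the IndexedDB. It seems to me that they don't use a scheme as elaborate as the one for keys, because when I decode them as UTF-16 I can read parts of the actual values.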

Answer 1

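(Typing this here because I can't comment yet...)

As for the data itself, I have been trying to do the same thing with the data blobs. In my case, I am trying to extract JSON strings. When I look at the database I am trying to sift through, most of the time I do see UTF-16 encoded characters. But in some odd cases I get something like this:

"there we go" is encoded as 7400 6800 6500 7200 6500 2000 77 [05060C] 6700 6F00. The [05060C] supposedly encodes the "e".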

https://mxr.mozilla.org/mozilla-release/source/dom/indexedDB/IDBObjectStore.cpp

I'm looking into this to see if there are any clues. There should be plenty of other source files in that directory that can help.
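
One possible reading of the odd bytes, offered as a guess from the sample above rather than anything confirmed in this thread: they look like elements of the Snappy compression format. In that format a tag byte whose low two bits are `01` means "copy earlier output" (length `4 + bits 2..4`, offset from bits 5..7 plus the next byte), and a tag whose low two bits are `00` starts a short literal of `(tag >> 2) + 1` bytes. The sketch below applies just those two element types to the quoted bytes; the helper names are made up for this example, and it is not a full decompressor.

```python
def apply_copy_1byte(out: bytearray, tag: int, offset_low: int) -> None:
    """Apply a Snappy 'copy with 1-byte offset' element to the output so far."""
    assert tag & 0b11 == 0b01, "not a 1-byte-offset copy tag"
    length = 4 + ((tag >> 2) & 0b111)
    offset = ((tag >> 5) << 8) | offset_low
    start = len(out) - offset
    for k in range(length):
        out.append(out[start + k])  # byte by byte, so overlapping copies work


def apply_short_literal(out: bytearray, tag: int, payload: bytes) -> None:
    """Apply a Snappy literal element whose length fits in the tag byte."""
    assert tag & 0b11 == 0b00, "not a literal tag"
    length = (tag >> 2) + 1  # only valid for literals shorter than 61 bytes
    out.extend(payload[:length])


# Bytes quoted in the answer: readable prefix, then 05 06, then 0C + 4 bytes.
prefix = bytes.fromhex("7400680065007200650020" + "0077")
out = bytearray(prefix)
apply_copy_1byte(out, 0x05, 0x06)                        # copy 5 bytes from 6 back
apply_short_literal(out, 0x0C, bytes.fromhex("67006f00"))  # literal "go"
text = out.decode("utf-16-le")
print(text)
```

If this reading holds, the whole blob would need to go through a Snappy decompressor before any UTF-16 decoding is attempted.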

Answer 2

I have a project that does this:
https://gitlab.com/ntninja/moz-idb-edit

It is written as a CLI tool, so you can do the following:

$ moz-idb-edit read --site https://mail.proton.me --userctx personal --sdb store

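...to dump the contents of the database named store from the site https://mail.proton.me, using a custom JSON superset. You can also specify filters, and there is a read-json command that outputs a JSON approximation of the database contents so that you can easily parse it with other tools.

Since the tool is written in Python, it can (in theory) also be used as a Python library, but there is no API for that (let alone documentation), so you would need to look at the source code to work out what you need.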
