Converting a huge JSON object to a Blob by stringifying "directly" into an ArrayBuffer/Blob, to avoid the maximum string length error

Question (votes: 0, answers: 2)

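Context

My application has some code like this:

let blob = new Blob([JSON.stringify(json)], {type: "application/json"});

However, it sometimes fails, because in Chrome the maximum string length is about 500MB, and the json object can sometimes serialize to more than that.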

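Question:

I am looking for a way to go directly from my json variable (i.e. a POJO) to a Blob, perhaps through some kind of streaming stringification that writes into an ArrayBuffer. Any other way of getting a large json object into a Blob without running into the "maximum string length" error would also do.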

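Notes:

  • The solution must run in the browser.
  • If an answer proposes an existing library, that library must not expect json to be just an array, since that case is easy to handle. It must instead accept an arbitrarily nested JSON object, where e.g. 90% of the data may sit in foo.bar.whatever rather than being evenly distributed across the top-level keys or anywhere else.
  • I am not looking for a solution that takes a stream as input and produces a stream of string chunks as output, e.g. json-stream-stringify or streaming-json-stringify. Instead, I want to pass in a POJO that is already in memory and get back a Blob containing the stringified JSON.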


javascript json arraybuffer
2 Answers
2 votes

We can actually work around that limit by building the Blob from chunks of strings.

const header = 24;
const bytes = new Uint8Array((512 * 1024 * 1024) - header);
const bigStr = new TextDecoder().decode(bytes);
const arr = [];
for (let i=0; i<5; i++) {
  arr.push(bigStr);
}
console.log(new Blob(arr).size); // 2.7GB

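Given that the Blob constructor also accepts other Blobs in its blobParts input, we can even reuse a simple recursive stringifier and replace every join() over its internal lists of temporary values, so that each one instead produces a list of Blob objects interleaved with DOMString separators.

So in the end we produce something like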

new Blob(["{", <Blob>, ":", <Blob>, "}"]);

and we stay safely clear of the 500MiB limit, since no single string ever has to hold the whole document.
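As a minimal sketch of that composition step (the interleave helper and the sample members are made up for illustration, they are not part of the patched code that follows), nested Blobs and short string separators can be combined like this:

```javascript
// Hypothetical helper: place `separator` between the items of `parts`.
function interleave(parts, separator) {
  const out = [];
  parts.forEach((part, i) => {
    if (i > 0) out.push(separator);
    out.push(part);
  });
  return out;
}

// Each member is already a Blob; only the short separators are strings.
const members = [
  new Blob(['"a":1']),
  new Blob(['"b":[true,null]']),
];
const objectBlob = new Blob(['{', ...interleave(members, ','), '}']);

console.log(objectBlob.size); // 23 bytes: {"a":1,"b":[true,null]}
```

Because the parts are only concatenated inside the Blob, the full JSON text never has to exist as one JavaScript string.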

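Here I quickly patched this implementation (json2.js) along those lines, but I have not tested it seriously, so you may want to do that yourself: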

/*
    json2.js
    2015-05-03

    Public Domain.

    NO WARRANTY EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK.

    See http://www.JSON.org/js.html


    This code should be minified before deployment.
    See http://javascript.crockford.com/jsmin.html

    USE YOUR OWN COPY. IT IS EXTREMELY UNWISE TO LOAD CODE FROM SERVERS YOU DO
    NOT CONTROL.


    This file creates a global JSON object containing two methods: stringify
    and parse. This file provides the ES5 JSON capability to ES3 systems.
    If a project might run on IE8 or earlier, then this file should be included.
    This file does nothing on ES5 systems.

        JSON.stringify(value, replacer, space)
            value       any JavaScript value, usually an object or array.

            replacer    an optional parameter that determines how object
                        values are stringified for objects. It can be a
                        function or an array of strings.

            space       an optional parameter that specifies the indentation
                        of nested structures. If it is omitted, the text will
                        be packed without extra whitespace. If it is a number,
                        it will specify the number of spaces to indent at each
                        level. If it is a string (such as '\t' or '&nbsp;'),
                        it contains the characters used to indent at each level.

            This method produces a JSON text from a JavaScript value.

            When an object value is found, if the object contains a toJSON
            method, its toJSON method will be called and the result will be
            stringified. A toJSON method does not serialize: it returns the
            value represented by the name/value pair that should be serialized,
            or undefined if nothing should be serialized. The toJSON method
            will be passed the key associated with the value, and this will be
            bound to the value

            For example, this would serialize Dates as ISO strings.

                Date.prototype.toJSON = function (key) {
                    function f(n) {
                        // Format integers to have at least two digits.
                        return n < 10 
                            ? '0' + n 
                            : n;
                    }

                    return this.getUTCFullYear()   + '-' +
                         f(this.getUTCMonth() + 1) + '-' +
                         f(this.getUTCDate())      + 'T' +
                         f(this.getUTCHours())     + ':' +
                         f(this.getUTCMinutes())   + ':' +
                         f(this.getUTCSeconds())   + 'Z';
                };

            You can provide an optional replacer method. It will be passed the
            key and value of each member, with this bound to the containing
            object. The value that is returned from your method will be
            serialized. If your method returns undefined, then the member will
            be excluded from the serialization.

            If the replacer parameter is an array of strings, then it will be
            used to select the members to be serialized. It filters the results
            such that only members with keys listed in the replacer array are
            stringified.

            Values that do not have JSON representations, such as undefined or
            functions, will not be serialized. Such values in objects will be
            dropped; in arrays they will be replaced with null. You can use
            a replacer function to replace those with JSON values.
            JSON.stringify(undefined) returns undefined.

            The optional space parameter produces a stringification of the
            value that is filled with line breaks and indentation to make it
            easier to read.

            If the space parameter is a non-empty string, then that string will
            be used for indentation. If the space parameter is a number, then
            the indentation will be that many spaces.

            Example:

            text = JSON.stringify(['e', {pluribus: 'unum'}]);
            // text is '["e",{"pluribus":"unum"}]'


            text = JSON.stringify(['e', {pluribus: 'unum'}], null, '\t');
            // text is '[\n\t"e",\n\t{\n\t\t"pluribus": "unum"\n\t}\n]'

            text = JSON.stringify([new Date()], function (key, value) {
                return this[key] instanceof Date 
                    ? 'Date(' + this[key] + ')' 
                    : value;
            });
            // text is '["Date(---current time---)"]'


        JSON.parse(text, reviver)
            This method parses a JSON text to produce an object or array.
            It can throw a SyntaxError exception.

            The optional reviver parameter is a function that can filter and
            transform the results. It receives each of the keys and values,
            and its return value is used instead of the original value.
            If it returns what it received, then the structure is not modified.
            If it returns undefined then the member is deleted.

            Example:

            // Parse the text. Values that look like ISO date strings will
            // be converted to Date objects.

            myData = JSON.parse(text, function (key, value) {
                var a;
                if (typeof value === 'string') {
                    a =
/^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2}(?:\.\d*)?)Z$/.exec(value);
                    if (a) {
                        return new Date(Date.UTC(+a[1], +a[2] - 1, +a[3], +a[4],
                            +a[5], +a[6]));
                    }
                }
                return value;
            });

            myData = JSON.parse('["Date(09/09/2001)"]', function (key, value) {
                var d;
                if (typeof value === 'string' &&
                        value.slice(0, 5) === 'Date(' &&
                        value.slice(-1) === ')') {
                    d = new Date(value.slice(5, -1));
                    if (d) {
                        return d;
                    }
                }
                return value;
            });


    This is a reference implementation. You are free to copy, modify, or
    redistribute.
*/

/*jslint 
    eval, for, this 
*/

/*property
    JSON, apply, call, charCodeAt, getUTCDate, getUTCFullYear, getUTCHours,
    getUTCMinutes, getUTCMonth, getUTCSeconds, hasOwnProperty, join,
    lastIndex, length, parse, prototype, push, replace, slice, stringify,
    test, toJSON, toString, valueOf
*/


// Create a JSON object only if one does not already exist. We create the
// methods in a closure to avoid creating global variables.

if (typeof JSON !== 'object') {
    JSON = {};
}

(function () {
    'use strict';
    
    var rx_one = /^[\],:{}\s]*$/,
        rx_two = /\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g,
        rx_three = /"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
        rx_four = /(?:^|:|,)(?:\s*\[)+/g,
        rx_escapable = /[\\\"\u0000-\u001f\u007f-\u009f\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/g,
        rx_dangerous = /[\u0000\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/g;

    function f(n) {
        // Format integers to have at least two digits.
        return n < 10 
            ? '0' + n 
            : n;
    }
    
    function this_value() {
        return this.valueOf();
    }

    if (typeof Date.prototype.toJSON !== 'function') {

        Date.prototype.toJSON = function () {

            return isFinite(this.valueOf())
                ? this.getUTCFullYear() + '-' +
                        f(this.getUTCMonth() + 1) + '-' +
                        f(this.getUTCDate()) + 'T' +
                        f(this.getUTCHours()) + ':' +
                        f(this.getUTCMinutes()) + ':' +
                        f(this.getUTCSeconds()) + 'Z'
                : null;
        };

        Boolean.prototype.toJSON = this_value;
        Number.prototype.toJSON = this_value;
        String.prototype.toJSON = this_value;
    }

    var gap,
        indent,
        meta,
        rep;

    const join = (arr, joint) => {
        return arr.map((v) => [v, joint]).flat().slice(0, -1);
    };

    function quote(string) {

// If the string contains no control characters, no quote characters, and no
// backslash characters, then we can safely slap some quotes around it.
// Otherwise we must also replace the offending characters with safe escape
// sequences.

        rx_escapable.lastIndex = 0;
        return rx_escapable.test(string) 
            ? '"' + string.replace(rx_escapable, function (a) {
                var c = meta[a];
                return typeof c === 'string'
                    ? c
                    : '\\u' + ('0000' + a.charCodeAt(0).toString(16)).slice(-4);
            }) + '"' 
            : '"' + string + '"';
    }


    function str(key, holder) {

// Produce a string from holder[key].

        var i,          // The loop counter.
            k,          // The member key.
            v,          // The member value.
            length,
            mind = gap,
            partial,
            value = holder[key];

// If the value has a toJSON method, call it to obtain a replacement value.

        if (value && typeof value === 'object' &&
                typeof value.toJSON === 'function') {
            value = value.toJSON(key);
        }

// If we were called with a replacer function, then call the replacer to
// obtain a replacement value.

        if (typeof rep === 'function') {
            value = rep.call(holder, key, value);
        }

// What happens next depends on the value's type.

        switch (typeof value) {
        case 'string':
            return quote(value);

        case 'number':

// JSON numbers must be finite. Encode non-finite numbers as null.

            return isFinite(value) 
                ? String(value) 
                : 'null';

        case 'boolean':
        case 'null':

// If the value is a boolean or null, convert it to a string. Note:
// typeof null does not produce 'null'. The case is included here in
// the remote chance that this gets fixed someday.

            return String(value);

// If the type is 'object', we might be dealing with an object or an array or
// null.

        case 'object':

// Due to a specification blunder in ECMAScript, typeof null is 'object',
// so watch out for that case.

            if (!value) {
                return 'null';
            }

// Make an array to hold the partial results of stringifying this object value.

            gap += indent;
            partial = [];

// Is the value an array?

            if (Object.prototype.toString.apply(value) === '[object Array]') {

// The value is an array. Stringify every element. Use null as a placeholder
// for non-JSON values.

                length = value.length;
                for (i = 0; i < length; i += 1) {
                    partial[i] = str(i, value) || 'null';
                }

// Join all of the elements together, separated with commas, and wrap them in
// brackets.

                v = partial.length === 0
                    ? new Blob(['[]'])
                    : gap
                        ? new Blob(['[\n', gap, ...join(partial, ',\n' + gap), '\n', mind, ']'])
                        : new Blob(['[', ...join(partial, ','), ']']);
                gap = mind;
                return v;
            }

// If the replacer is an array, use it to select the members to be stringified.

            if (rep && typeof rep === 'object') {
                length = rep.length;
                for (i = 0; i < length; i += 1) {
                    if (typeof rep[i] === 'string') {
                        k = rep[i];
                        v = str(k, value);
                        if (v) {
                            partial.push(new Blob([quote(k) + (
                                gap 
                                    ? ': ' 
                                    : ':'
                            ), v]));
                        }
                    }
                }
            } else {

// Otherwise, iterate through all of the keys in the object.

                for (k in value) {
                    if (Object.prototype.hasOwnProperty.call(value, k)) {
                        v = str(k, value);
                        if (v) {
                            partial.push(new Blob([quote(k), (
                                gap 
                                    ? ': ' 
                                    : ':'
                            ), v]));
                        }
                    }
                }
            }

// Join all of the member texts together, separated with commas,
// and wrap them in braces.
            v = partial.length === 0
                ? new Blob(['{}'])
                : gap
                    ? new Blob(['{\n', gap, ...join(partial, ',\n' + gap), '\n',  mind, '}'])
                    : new Blob(['{', ...join(partial, ','), '}']);
            gap = mind;
            return v;
        }
    }
// Always install JSON.blobify (json2.js originally guarded stringify behind a
// feature check here).

    if (true) {
        meta = {    // table of character substitutions
            '\b': '\\b',
            '\t': '\\t',
            '\n': '\\n',
            '\f': '\\f',
            '\r': '\\r',
            '"': '\\"',
            '\\': '\\\\'
        };
        JSON.blobify = function (value, replacer, space) {

// The blobify method takes a value and an optional replacer, and an optional
// space parameter, and returns a Blob containing the JSON text. The replacer
// can be a function that can replace values, or an array of strings that will
// select the keys. Use of the space parameter can produce text that is more
// easily readable.

            var i;
            gap = '';
            indent = '';

// If the space parameter is a number, make an indent string containing that
// many spaces.

            if (typeof space === 'number') {
                for (i = 0; i < space; i += 1) {
                    indent += ' ';
                }

// If the space parameter is a string, it will be used as the indent string.

            } else if (typeof space === 'string') {
                indent = space;
            }

// If there is a replacer, it must be a function or an array.
// Otherwise, throw an error.

            rep = replacer;
            if (replacer && typeof replacer !== 'function' &&
                    (typeof replacer !== 'object' ||
                    typeof replacer.length !== 'number')) {
                throw new Error('JSON.blobify');
            }

// Make a fake root object containing our value under the key of ''.
// Return the result of stringifying the value.

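// Primitives come back from str() as plain strings, so wrap them: blobify
// should hand back a Blob (or undefined, as JSON.stringify does) in every case.

            var result = str('', {'': value});
            return (result === undefined || result instanceof Blob)
                ? result
                : new Blob([result]);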
        };
    }


// The JSON.parse half of json2.js is left out; the demo below uses the native
// JSON.parse.

    
}());

(async () => {
  const asBlob = JSON.blobify({foo: {bar: "baz", bla: [new Date(), ()=>{},,1,null]}});
  console.log({asBlob});
  const asString = await asBlob.text();
  console.log({asString});
  console.log("parsed", JSON.parse(asString));
})();


0 votes

This works:

function jsonToBlob(json) {
  const textEncoder = new TextEncoder();
  const seen = new WeakSet();

  function processValue(value) {
    if(seen.has(value)) {
      throw new TypeError("Converting circular structure to JSON");
    }

    if(value && typeof value.toJSON === "function") {
      value = value.toJSON();
    }

    if(typeof value === 'object' && value !== null) {
      seen.add(value);

      const blobParts = [];
      const entries = Array.isArray(value) ? value : Object.entries(value);
      for(let i = 0; i < entries.length; i++) {
        if(Array.isArray(value)) {
          blobParts.push(processValue(entries[i]));
        } else {
          const [key, val] = entries[i];
          blobParts.push(textEncoder.encode(JSON.stringify(key) + ':'), processValue(val));
        }
        if(i !== entries.length - 1) blobParts.push(textEncoder.encode(','));
      }

      const startBracket = Array.isArray(value) ? '[' : '{';
      const endBracket = Array.isArray(value) ? ']' : '}';
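      // Forget the object again once it is fully serialized, so that a repeated
      // (non-circular) reference to it is not reported as a cycle.
      seen.delete(value);
      return new Blob([textEncoder.encode(startBracket), ...blobParts, textEncoder.encode(endBracket)]);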
    } else if(typeof value === 'function' || typeof value === 'undefined') {
      return textEncoder.encode("null");
    } else {
      // For primitives we just convert it to string and encode
      return textEncoder.encode(JSON.stringify(value));
    }
  }

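  // Wrap the result so that a top-level primitive also yields a Blob rather than a Uint8Array.
  return new Blob([processValue(json)]);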
}

✅ Test 1:

let blob = jsonToBlob([{hello:{foo:[1,2,3], a:1, bar:["a", 2, {$hi:[1,2,3, {a:3}]}]}}, 4, new Date(),, (()=>{})]);
console.log(JSON.parse(await blob.text()));

✅ Test 2:

let json = {};
for(let i = 0; i < 600000; i++) {
  json[Math.random()] = Math.random().toString().repeat(100);
}
let blob = jsonToBlob(json);
console.log(blob); // ~1 GB
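A Blob of this size cannot be checked with blob.text(), since that would presumably run into the same string length limit. A minimal sketch for spot-checking it instead, using a hypothetical peekBlob helper that is not part of the answer, is to decode only a small byte window:

```javascript
// Hypothetical helper: decode only a small byte window of a Blob.
// Offsets are in bytes, so a window may cut a multi-byte UTF-8 character;
// text() then shows U+FFFD at the edge, which is fine for a spot check.
async function peekBlob(blob, start = 0, length = 200) {
  const end = Math.min(blob.size, start + length);
  return blob.slice(start, end).text();
}

// Usage: look at the first and last bytes of a serialized Blob.
(async () => {
  const sample = new Blob(['{"foo":', new Blob(['[1,2,3]']), '}']);
  console.log(await peekBlob(sample, 0, 7));           // {"foo":
  console.log(await peekBlob(sample, sample.size - 1)); // }
})();
```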

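If I find any bugs or problems once this is in production, I will update this answer.

Update: a year later, the only problem I have had with the solution above is that it is somewhat slow in some cases. Here is a version that produces exactly the same output but was more than 10x faster in my real-world tests: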

function jsonToBlob(json) {
  const textEncoder = new TextEncoder();
  const seen = new WeakSet();
  let buffer = new Uint8Array(1024 * 1024); // Start with 1MB buffer
  let position = 0;
  let stringBuffer = '';

  function ensureCapacity(additionalBytes) {
    if (position + additionalBytes > buffer.length) {
      const newBuffer = new Uint8Array(Math.max(buffer.length * 2, position + additionalBytes));
      newBuffer.set(buffer);
      buffer = newBuffer;
    }
  }

  function writeToBuffer(str) {
    const encoded = textEncoder.encode(str);
    ensureCapacity(encoded.length);
    buffer.set(encoded, position);
    position += encoded.length;
  }

  function flushStringBuffer() {
    if (stringBuffer.length > 0) {
      writeToBuffer(stringBuffer);
      stringBuffer = '';
    }
  }

  function processValue(value) {
    if (seen.has(value)) {
      throw new TypeError("Converting circular structure to JSON");
    }

    if (value && typeof value.toJSON === "function") {
      value = value.toJSON();
    }

    if (typeof value === 'object' && value !== null) {
      seen.add(value);

      const isArray = Array.isArray(value);
      stringBuffer += isArray ? '[' : '{';

      let first = true;
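      // Arrays are walked with value.entries(), which yields holes as undefined (written
      // as null below); Object.entries() would skip them and change the output.
      for (const [key, val] of (isArray ? value.entries() : Object.entries(value))) {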
        if (!first) stringBuffer += ',';
        first = false;

        if (!isArray) {
          stringBuffer += JSON.stringify(key) + ':';
        }

        processValue(val);
      }

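      stringBuffer += isArray ? ']' : '}';
      // Forget the object again once it is fully serialized, so that a repeated
      // (non-circular) reference to it is not reported as a cycle.
      seen.delete(value);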
    } else if (typeof value === 'function' || typeof value === 'undefined') {
      stringBuffer += 'null';
    } else {
      stringBuffer += JSON.stringify(value);
    }

    // Flush the string buffer if it gets too large
    if (stringBuffer.length > 1024) {
      flushStringBuffer();
    }
  }

  processValue(json);
  flushStringBuffer();

  return new Blob([buffer.subarray(0, position)]);
}