Read and write YAML files without destroying anchors and aliases

Question  Votes: 0  Answers: 4

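This question has been asked before: Read and write YAML files without destroying anchors and aliases?

I would like to know how to solve that problem when the file contains many anchors and aliases.

Thanks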

ruby parsing yaml psych emit
4 Answers
11
votes

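The problem here is that anchors and aliases in Yaml are a serialization detail. They are not part of the data once it has been parsed, so the original anchor name is not known when the data is written back out to Yaml. To keep the anchor names across a round trip, you need to store them somewhere at parse time so that they are available later when serializing. In Ruby any object can have instance variables associated with it, so an easy way to achieve this is to store the anchor name in an instance variable of the object in question.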
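
To make the point concrete, here is a small sketch. The sample document and the helper name anchor_names are made up for illustration. It shows that the parsed node tree still carries the anchor names, while the loaded Ruby data is an ordinary Hash with nothing attached to it.

```ruby
require 'yaml'

# Hypothetical helper: collect the anchor names present in the parsed node tree.
def anchor_names(yaml)
  Psych.parse(yaml)
       .select { |node| node.respond_to?(:anchor) && node.anchor }
       .map(&:anchor)
       .uniq
end

source = <<~YAML
  defaults: &defaults
    adapter: postgres
  development: *defaults
YAML

p anchor_names(source) # => ["defaults"]

# After conversion to Ruby the alias is the very same object as the anchored
# hash, but the name "defaults" is not recorded anywhere on it.
data = Psych.parse(source).to_ruby
p data['defaults'].equal?(data['development']) # => true
p data['defaults'].instance_variables          # => []
```

The name only exists on the Psych::Nodes objects, which is why the rest of this answer copies it onto the Ruby object during parsing.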

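Continuing the example from the previous question, for hashes we can change our redefined revive_hash method so that, if the hash is an anchor, then as well as recording the anchor name in the @st variable (so that later aliases can be recognised) we also add it as an instance variable on the hash.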

class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end

    o.children.each_slice(2) { |k,v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end
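
The answer does not show how to run this visitor. One way, sketched below, is to parse the text into a node tree with Psych.parse and hand that tree to the visitor yourself. The class is repeated so that the snippet runs on its own. The *_rest parameter is an addition made here, on the assumption that newer Psych releases may pass an extra argument to revive_hash, and load_keeping_anchor_names is a hypothetical helper name.

```ruby
require 'yaml'

class ToRubyNoMerge < Psych::Visitors::ToRuby
  # *_rest is an assumption: it absorbs any extra argument a newer Psych passes.
  def revive_hash(hash, o, *_rest)
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set(:@_yaml_anchor_name, o.anchor)
    end
    o.children.each_slice(2) { |k, v| hash[accept(k)] = accept(v) }
    hash
  end
end

# Hypothetical helper: parse to a node tree, then convert it with the visitor.
def load_keeping_anchor_names(yaml)
  ToRubyNoMerge.create.accept(Psych.parse(yaml))
end

data = load_keeping_anchor_names(<<~YAML)
  defaults: &defaults
    adapter: postgres
  development:
    <<: *defaults
    database: dev
YAML

p data['defaults'].instance_variable_get(:@_yaml_anchor_name) # => "defaults"
p data['development'].keys # => ["<<", "database"], the merge key is kept as-is
```

Because this revive_hash replaces the stock one entirely, merge keys are not expanded: "<<" stays an ordinary key whose value is the anchored hash.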

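Note that this only affects yaml mappings that are anchors. If you want other types to keep their anchor names as well, you will need to look at psych/visitors/to_ruby.rb and make sure the name is added in all cases. Most types can be covered by overriding register, but there are a couple of others; search for @st.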
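
As a rough illustration of the register route, the sketch below assumes that ToRuby#register(node, object) is the private hook that records an anchored object in @st. Check that signature against your installed to_ruby.rb before relying on it. The class and helper names are hypothetical.

```ruby
require 'yaml'

class AnchorTaggingToRuby < Psych::Visitors::ToRuby
  private

  # Assumed hook: called with the YAML node and the Ruby object built from it.
  def register(node, object)
    # Integers, symbols, nil, true and false are frozen and cannot carry
    # instance variables, so they are skipped.
    if node.anchor && !object.frozen?
      object.instance_variable_set(:@_yaml_anchor_name, node.anchor)
    end
    super
  end
end

# Hypothetical helper: parse to a node tree, then convert it with the visitor.
def load_tagging_anchors(yaml)
  AnchorTaggingToRuby.create.accept(Psych.parse(yaml))
end

data = load_tagging_anchors(<<~YAML)
  ports: &ports
    - 80
    - 443
  backup_ports: *ports
  retries: &retries 3
YAML

p data['ports'].instance_variable_get(:@_yaml_anchor_name) # => "ports"
p data['retries']                                          # => 3
```

Values such as the anchored integer cannot be tagged this way, so their names would have to be kept somewhere else, for example in a separate lookup table.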

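Now that the hash has the desired anchor name associated with it, you need to make Psych use that name instead of the object id when serializing it. This can be done by subclassing YAMLTree. When YAMLTree processes an object, it first checks whether that object has been seen already, and emits an alias for it if it has. For any new object, it records that it has seen the object in case it needs to create an alias later. The object_id is used as the key here, so you need to override those two methods to check for the instance variable and use it instead if it exists: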

class MyYAMLTree < Psych::Visitors::YAMLTree

  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? anchor_name
        oid         = anchor_name
        node        = @st[oid]
        anchor      = oid.to_s
        node.anchor = anchor
        return @emitter.alias anchor
      end
    end

    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end

  # record object for future, using '@_yaml_anchor_name' rather
  # than object_id if it exists
  def register target, yaml_obj
    anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
    @st[anchor_name] = yaml_obj
    yaml_obj
  end
end
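
For comparison, the stock behaviour (an alias is emitted the second time the same object is seen) can be observed by building a node tree with the unmodified YAMLTree. This is only a sketch: anchors_and_aliases is a hypothetical helper, and it assumes a Psych version that provides YAMLTree.create.

```ruby
require 'yaml'

# Hypothetical helper: build the node tree for +data+ and return the anchored
# nodes and the alias nodes found in it.
def anchors_and_aliases(data)
  builder = Psych::Visitors::YAMLTree.create
  builder << data
  nodes    = builder.tree.to_a
  aliases  = nodes.grep(Psych::Nodes::Alias)
  anchored = nodes.select do |n|
    !n.is_a?(Psych::Nodes::Alias) && n.respond_to?(:anchor) && n.anchor
  end
  [anchored, aliases]
end

shared = { 'adapter' => 'postgres' }
anchored, aliases = anchors_and_aliases({ 'defaults' => shared, 'development' => shared })

p anchored.first.class                          # => Psych::Nodes::Mapping
p anchored.first.anchor == aliases.first.anchor # => true
```

The anchor value here is generated by Psych, which is exactly what the subclass above replaces with the stored name.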

Now you can use it like this (unlike in the previous question, you do not need to create a custom emitter in this case):

# on newer Psych releases YAMLTree.new takes arguments; use MyYAMLTree.create there
builder = MyYAMLTree.new
builder << data

tree = builder.tree

puts tree.yaml # returns a string

# alternatively write directly to a file ('w' creates or truncates it):
File.open('a_file.yml', 'w') do |f|
  tree.yaml f
end

1
votes

Here is a slightly modified version for newer versions of the psych gem. Before this change it gave me the following error:

NoMethodError - undefined method `[]=' for #<Psych::Visitors::YAMLTree::Registrar:0x007fa0db6ba4d0>

The register method has moved into Registrar, a class nested inside YAMLTree, so with the change below everything matt says in his answer works again:

class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end

    o.children.each_slice(2) { |k,v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end

class MyYAMLTree < Psych::Visitors::YAMLTree
  class Registrar
    # record object for future, using '@_yaml_anchor_name' rather
    # than object_id if it exists
    def register target, node
      anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
      @obj_to_node[anchor_name] = node
    end
  end

  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? anchor_name
        oid         = anchor_name
        node        = @st[oid]
        anchor      = oid.to_s
        node.anchor = anchor
        return @emitter.alias anchor
      end
    end

    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end

end

1
votes

I had to further modify the code that @markus posted to get it to work with Psych v2.0.17.

This is what I ended up with. I hope it saves others a good deal of time. :-)

class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end

    o.children.each_slice(2) do |k,v|
      key = accept(k)
      hash[key] = accept(v)
    end
    hash
  end
end

class Psych::Visitors::YAMLTree::Registrar
  # record object for future, using '@_yaml_anchor_name' rather
  # than object_id if it exists
  def register target, node
    @targets << target
    @obj_to_node[_anchor_name(target)] = node
  end

  def key? target
    @obj_to_node.key? _anchor_name(target)
  rescue NoMethodError
    false
  end

  def node_for target
    @obj_to_node[_anchor_name(target)]
  end

  private

  def _anchor_name(target)
    target.instance_variable_get('@_yaml_anchor_name') || target.object_id
  end
end

class MyYAMLTree < Psych::Visitors::YAMLTree
  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? target
        node        = @st.node_for target
        node.anchor = anchor_name
        return @emitter.alias anchor_name
      end
    end

    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end

  def visit_String o
    if o == '<<'
      style = Psych::Nodes::Scalar::PLAIN
      tag   = 'tag:yaml.org,2002:str'
      plain = true
      quote = false

      return @emitter.scalar o, nil, tag, plain, quote, style
    end

    # visit_String is a pretty big method, call super to avoid copying it all
    # here. super will handle the cases when it's a string other than '<<'
    super
  end
end
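
The pieces of this answer can be exercised end to end. Below is a reduced sketch of that round trip, not the code above verbatim. It assumes a Psych version whose @st registrar responds to key? and node_for, it calls super in revive_hash (so merge keys would be expanded as usual rather than preserved), and it leaves Registrar unpatched. The class names and round_trip are hypothetical.

```ruby
require 'yaml'

# Loading side: remember the anchor name on each anchored mapping.
class AnchorKeepingToRuby < Psych::Visitors::ToRuby
  def revive_hash(hash, node, *_rest)
    hash.instance_variable_set(:@_yaml_anchor_name, node.anchor) if node.anchor
    super
  end
end

# Dumping side: when an already registered object carries a stored name,
# use that name for both the anchor and the alias.
class AnchorKeepingYAMLTree < Psych::Visitors::YAMLTree
  def accept(target)
    name = target.instance_variable_get(:@_yaml_anchor_name)
    if name && @st.key?(target)
      @st.node_for(target).anchor = name
      return @emitter.alias(name)
    end
    super
  end
end

# Hypothetical helper: YAML text in, YAML text out.
def round_trip(yaml)
  data    = AnchorKeepingToRuby.create.accept(Psych.parse(yaml))
  builder = AnchorKeepingYAMLTree.create
  builder << data
  builder.tree.yaml
end

puts round_trip(<<~YAML)
  defaults: &defaults
    adapter: postgres
  development: *defaults
  test: *defaults
YAML
```

Only mappings are handled here. An anchored object that is referenced once and never aliased is written without its anchor, because the name is applied only when an alias is emitted.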

0
votes

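As of 2024, with psych 5.1.2, if you call YAML.load_file with symbolize_names: true, anchors and aliases seem to be retained, although as *1, *2 ... (rather than the anchor names you may have given them).

So if you can live with numbered aliases, the solution is fairly simple.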

# will retain anchors and aliases 
data = YAML.load_file(file_name, symbolize_names: true) 

# will expand aliases in-place - and lose references to them
data = YAML.load_file(file_name, symbolize_names: false) 

However, keys with empty values are not anchored:

data:
   user: &profile # not empty.
     name: something
   address: &anchor_to_blank # empty. So, no numbered anchor

more_data: 
   user: &profile
   address: &anchor_to_blank # no numbered alias
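
A small sketch of the numbered form follows. It works on a string instead of a file, and reload_and_dump is a hypothetical helper. As far as I know, Psych 4 and later reject aliases in YAML.load and YAML.load_file unless aliases: true is passed, so the sketch goes through YAML.unsafe_load where that method exists; treat the fallback branch as an assumption about older versions.

```ruby
require 'yaml'

# Hypothetical helper: load with aliases allowed, then dump the result again.
def reload_and_dump(yaml)
  data = if YAML.respond_to?(:unsafe_load)
           YAML.unsafe_load(yaml, symbolize_names: true)
         else
           YAML.load(yaml, symbolize_names: true) # older Psych: aliases allowed
         end
  YAML.dump(data)
end

puts reload_and_dump(<<~YAML)
  defaults: &defaults
    adapter: postgres
  development: *defaults
  address: &blank
  copy: *blank
YAML
```

The shared hash comes back with a generated numeric anchor and a matching alias, while the two empty values load as nil and are written back without any anchor.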
