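The issue here is that anchors and aliases in YAML are a serialization detail, so they are not part of the data once it has been parsed, and the original anchor name is not known when the data is written back out to YAML. To preserve anchor names across a round trip, you need to store them somewhere at parse time so that they are available later when serializing. In Ruby any object can have instance variables associated with it, so a simple way to achieve this is to store the anchor name in an instance variable of the object in question.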
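As a rough illustration of that point, the sketch below parses a small document to Psych's node tree, where the anchor name is still present, and then converts it to plain Ruby data, where only the shared object identity survives. The sample YAML and the variable names are made up for illustration.

```ruby
require 'psych'

src = <<~YAML
  base: &defaults
    adapter: postgres
  dev: *defaults
YAML

# The parse tree still carries the anchor names.
stream  = Psych.parse_stream(src)      # Psych::Nodes::Stream
doc     = stream.children.first        # Psych::Nodes::Document
mapping = doc.root                     # top-level Psych::Nodes::Mapping

base_node = mapping.children[1]        # value node for "base"
dev_node  = mapping.children[3]        # value node for "dev" (an alias node)
puts base_node.anchor
puts dev_node.class

# Emitting the untouched tree keeps the original names.
tree_yaml = stream.yaml
puts tree_yaml

# Converting to Ruby data keeps the sharing but not the name.
data = doc.to_ruby
puts data['base'].equal?(data['dev'])
puts data['base'].instance_variables.inspect
```

If you only need to edit a document while keeping its anchors, working on the node tree directly may be enough; the instance-variable approach is for the case where you want ordinary Ruby hashes in between.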
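Continuing the example from the earlier question, for hashes we can change our redefined `revive_hash` method so that, if the hash is an anchor, then as well as recording the anchor name in the `@st` variable (so that later aliases can be resolved) we also add it as an instance variable on the hash.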
    class ToRubyNoMerge < Psych::Visitors::ToRuby
      def revive_hash hash, o
        if o.anchor
          @st[o.anchor] = hash
          hash.instance_variable_set "@_yaml_anchor_name", o.anchor
        end

        o.children.each_slice(2) { |k,v|
          key = accept(k)
          hash[key] = accept(v)
        }
        hash
      end
    end
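A minimal sketch of how such a visitor might be used on the loading side. `AnchorNamingToRuby` is a hypothetical variation, not the class above: it only tags the hash and then defers to Psych's own `revive_hash` via `super`, so merge keys (`<<`) are still processed. It relies on `revive_hash`, which is an internal method and may change between Psych versions.

```ruby
require 'psych'

# Hypothetical variation: tag anchored hashes with their anchor name,
# then let Psych's own revive_hash fill in the hash.
class AnchorNamingToRuby < Psych::Visitors::ToRuby
  # *rest absorbs any extra arguments newer Psych versions pass along.
  def revive_hash(hash, o, *rest)
    hash.instance_variable_set(:@_yaml_anchor_name, o.anchor) if o.anchor
    super
  end
end

src = <<~YAML
  base: &defaults
    adapter: postgres
  dev: *defaults
YAML

tree = Psych.parse(src)                        # Psych::Nodes::Document
data = AnchorNamingToRuby.create.accept(tree)  # plain Ruby hash

puts data['base'].instance_variable_get(:@_yaml_anchor_name)
puts data['dev'].equal?(data['base'])
```

The alias resolves to the very same hash object, so the name is reachable from both `data['base']` and `data['dev']`.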
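Note that this only affects YAML mappings that are anchors. If you want other types to keep their anchor names, you will need to look through `psych/visitors/to_ruby.rb` and make sure the name is added in all cases. Most types can be covered by overriding `register`, but there are a couple of others; search for `@st`.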
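Now that the hash has the desired anchor name associated with it, you need to make Psych use it instead of the object id when serializing. This can be done by subclassing `YAMLTree`. When `YAMLTree` processes an object, it first checks whether that object has already been seen and, if it has, emits an alias for it. For any new object, it records that it has seen the object in case it needs to create an alias later. The `object_id` is used as the key here, so you need to override those two methods to check for the instance variable and use it instead if it exists: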
    class MyYAMLTree < Psych::Visitors::YAMLTree

      # check to see if this object has been seen before
      def accept target
        if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
          if @st.key? anchor_name
            oid         = anchor_name
            node        = @st[oid]
            anchor      = oid.to_s
            node.anchor = anchor
            return @emitter.alias anchor
          end
        end

        # accept is a pretty big method, call super to avoid copying
        # it all here. super will handle the cases when it's an object
        # that's been seen but doesn't have '@_yaml_anchor_name' set
        super
      end

      # record object for future, using '@_yaml_anchor_name' rather
      # than object_id if it exists
      def register target, yaml_obj
        anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
        @st[anchor_name] = yaml_obj
        yaml_obj
      end
    end
Now you can use it like this (unlike the previous question, you don't need to create a custom emitter in this case):
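    # on Psych 2.0 and later, build the visitor with MyYAMLTree.create instead
    builder = MyYAMLTree.new
    builder << data

    tree = builder.tree

    puts tree.yaml # returns a string

    # alternatively write directly to a file
    # ('w' truncates, so no stale content is left behind):
    File.open('a_file.yml', 'w') do |f|
      tree.yaml f
    end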
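For the dumping side on a recent Psych, a minimal sketch under some assumptions: `NamedAnchorTree` is a hypothetical name, the visitor is built with `YAMLTree.create`, and the internal registrar in `@st` is assumed to respond to `key?` and `node_for` with the object itself (as in the later versions discussed in this thread). The hash is tagged by hand here so the example stands alone.

```ruby
require 'psych'

# Hypothetical subclass: when an object seen before carries
# @_yaml_anchor_name, emit the alias under that name instead of a number.
class NamedAnchorTree < Psych::Visitors::YAMLTree
  def accept(target)
    name = target.instance_variable_get(:@_yaml_anchor_name)
    if name && @st.key?(target)
      node = @st.node_for(target)   # node emitted on the first visit
      node.anchor = name            # name the anchor retroactively
      return @emitter.alias(name)   # and emit an alias pointing at it
    end
    super                           # everything else: default behaviour
  end
end

shared = { 'adapter' => 'postgres' }
shared.instance_variable_set(:@_yaml_anchor_name, 'defaults')
data = { 'base' => shared, 'dev' => shared }

builder = NamedAnchorTree.create
builder << data
out = builder.tree.yaml
puts out
```

Because `@st` is internal to Psych, it is worth re-checking this against `psych/visitors/yaml_tree.rb` whenever the gem is upgraded.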
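Here is a slightly modified version for newer versions of the psych gem. Previously it gave me the following error:

    NoMethodError - undefined method `[]=' for #<Psych::Visitors::YAMLTree::Registrar:0x007fa0db6ba4d0>

The `register` method has moved into `Registrar`, a class nested inside `YAMLTree`, so this now works with everything matt says in his answer: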
    class ToRubyNoMerge < Psych::Visitors::ToRuby
      def revive_hash hash, o
        if o.anchor
          @st[o.anchor] = hash
          hash.instance_variable_set "@_yaml_anchor_name", o.anchor
        end

        o.children.each_slice(2) { |k,v|
          key = accept(k)
          hash[key] = accept(v)
        }
        hash
      end
    end

    class MyYAMLTree < Psych::Visitors::YAMLTree
      class Registrar
        # record object for future, using '@_yaml_anchor_name' rather
        # than object_id if it exists
        def register target, node
          anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
          @obj_to_node[anchor_name] = node
        end
      end

      # check to see if this object has been seen before
      def accept target
        if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
          if @st.key? anchor_name
            oid         = anchor_name
            node        = @st[oid]
            anchor      = oid.to_s
            node.anchor = anchor
            return @emitter.alias anchor
          end
        end

        # accept is a pretty big method, call super to avoid copying
        # it all here. super will handle the cases when it's an object
        # that's been seen but doesn't have '@_yaml_anchor_name' set
        super
      end
    end
I had to further modify the code posted by @markus to get it working with Psych v2.0.17.

This is what I ended up with. I hope it saves others a good deal of time. :-)
    class ToRubyNoMerge < Psych::Visitors::ToRuby
      def revive_hash hash, o
        if o.anchor
          @st[o.anchor] = hash
          hash.instance_variable_set "@_yaml_anchor_name", o.anchor
        end

        o.children.each_slice(2) do |k,v|
          key = accept(k)
          hash[key] = accept(v)
        end
        hash
      end
    end

    class Psych::Visitors::YAMLTree::Registrar
      # record object for future, using '@_yaml_anchor_name' rather
      # than object_id if it exists
      def register target, node
        @targets << target
        @obj_to_node[_anchor_name(target)] = node
      end

      def key? target
        @obj_to_node.key? _anchor_name(target)
      rescue NoMethodError
        false
      end

      def node_for target
        @obj_to_node[_anchor_name(target)]
      end

      private

      def _anchor_name(target)
        target.instance_variable_get('@_yaml_anchor_name') || target.object_id
      end
    end

    class MyYAMLTree < Psych::Visitors::YAMLTree
      # check to see if this object has been seen before
      def accept target
        if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
          if @st.key? target
            node        = @st.node_for target
            node.anchor = anchor_name
            return @emitter.alias anchor_name
          end
        end

        # accept is a pretty big method, call super to avoid copying
        # it all here. super will handle the cases when it's an object
        # that's been seen but doesn't have '@_yaml_anchor_name' set
        super
      end

      def visit_String o
        if o == '<<'
          style = Psych::Nodes::Scalar::PLAIN
          tag   = 'tag:yaml.org,2002:str'
          plain = true
          quote = false

          return @emitter.scalar o, nil, tag, plain, quote, style
        end

        # visit_String is a pretty big method, call super to avoid copying it all
        # here. super will handle the cases when it's a string other than '<<'
        super
      end
    end
As of 2024, with `psych 5.1.2`, if you call `YAML.load_file` with `symbolize_names: true`, anchors and aliases appear to be retained, albeit as `*1`, `*2`, ... (rather than the anchor names you may have given).

So if you can live with numbered aliases, the solution is fairly simple.

    # will retain anchors and aliases
    data = YAML.load_file(file_name, symbolize_names: true)

    # will expand aliases in-place - and lose references to them
    data = YAML.load_file(file_name, symbolize_names: false)

However, keys with empty values are not anchored:
    data:
      user: &profile # not empty.
        name: something
      address: &anchor_to_blank # empty. So, no numbered anchor

    more_data:
      user: &profile
      address: &anchor_to_blank # no numbered alias
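A small sketch for checking the numbered-anchor behaviour on your own Psych version. It loads from a string rather than a file, and it assumes aliases have to be enabled explicitly: Psych 4 and later reject aliases in a plain `load` by default, so `YAML.unsafe_load` is used where it exists. The sample document is made up for illustration.

```ruby
require 'yaml'

src = <<~YAML
  base: &defaults
    adapter: postgres
  dev: *defaults
YAML

# Psych 4+ needs aliases enabled explicitly; older versions allow them in load.
data = if YAML.respond_to?(:unsafe_load)
         YAML.unsafe_load(src, symbolize_names: true)
       else
         YAML.load(src, symbolize_names: true)
       end

# Both keys point at the same hash object after loading.
puts data[:base].equal?(data[:dev])

# Dumping a structure with a shared object emits a numbered anchor and alias.
dumped = YAML.dump(data)
puts dumped
```

The numbering comes from the dumper noticing the same object twice, which also suggests why an empty value gets no anchor: it loads as `nil`, which is not tracked as a shared object.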