我有一个哈希值,这是一个数组。如何以最高效的方式删除数组中的重复元素和相应的ID?
这是我的哈希的一个例子
hash = {
"id" => "sjfdkjfd",
"name" => "Field Name",
"type" => "field",
"options" => ["Language", "Question", "Question", "Answer", "Answer"],
"option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}
我的想法是这样的
hash["options"].each_with_index { |value, index |
h = {}
if h.key?(value)
delete(value)
delete hash["option_ids"].delete_at(index)
else
h[value] = index
end
}
结果应该是
hash = {
"id" => "sjfdkjfd",
"name" => "Field Name",
"type" => "field",
"options" => ["Language", "Question", "Answer"],
"option_ids" => ["12345", "23456", "45678"]
}
我知道我必须考虑到当我删除options和option_ids的值时,这些值的索引将会改变。但不知道该怎么做
我的第一个想法是压缩值并调用uniq,然后想办法返回初始形式:
h['options'].zip(h['option_ids']).uniq(&:first).transpose
#=> [["Language", "Question", "Answer"], ["12345", "23456", "45678"]]
h['options'], h['option_ids'] = h['options'].zip(h['option_ids']).uniq(&:first).transpose
h #=> {"id"=>"sjfdkjfd", "name"=>"Field Name", "type"=>"field", "options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}
这些是步骤:
h['options'].zip(h['option_ids'])
#=> [["Language", "12345"], ["Question", "23456"], ["Question", "34567"], ["Answer", "45678"], ["Answer", "56789"]]
h['options'].zip(h['option_ids']).uniq(&:first)
#=> [["Language", "12345"], ["Question", "23456"], ["Answer", "45678"]]
hash = {
"id" => "sjfdkjfd",
"name" => "Field Name",
"type" => "field",
"options" => ["L", "Q", "Q", "Q", "A", "A", "Q"],
"option_ids" => ["12345", "23456", "34567", "dog", "45678", "56789", "cat"]
}
我假设“重复元素”指的是连续的相等元素(2
仅在[1,2,2,1]
中)而不是“重复元素”(前一个例子中的1
和2
)。我确实展示了如果第二种解释适用,代码将如何被改变(事实上简化)。
idx = hash["options"].
each_with_index.
chunk_while { |(a,_),(b,_)| a==b }.
map { |(_,i),*| i }
#=> [0, 1, 4, 6]
hash.merge(
["options", "option_ids"].each_with_object({}) { |k,h| h[k] = hash[k].values_at(*idx) }
)
#=> {"id"=>"sjfdkjfd",
# "name"=>"Field Name",
# "type"=>"field",
# "options"=>["L", "Q", "A", "Q"],
# "option_ids"=>["12345", "23456", "45678", "cat"]}
如果“重复元素”被解释为表示"options"
和"option_ids"
的值仅包含上面显示的前三个元素,则按如下方式计算idx
:
idx = hash["options"].
each_with_index.
uniq { |s,_| s }.
map(&:last)
#=> [0, 1, 4]
请参阅Enumerable#chunk_while(可以使用Enumerable#slice_when)和Array#values_at。步骤如下。
a = hash["options"]
#=> ["L", "Q", "Q", "Q", "A", "A", "Q"]
e0 = a.each_with_index
#=> #<Enumerator: ["L", "Q", "Q", "Q", "A", "A", "Q"]:each_with_index>
e1 = e0.chunk_while { |(a,_),(b,_)| a==b }
#=> #<Enumerator: #<Enumerator::Generator:0x000055e4bcf17740>:each>
我们可以看到枚举器e1
将生成的值,并通过将其转换为数组传递给map
:
e1.to_a
#=> [[["L", 0]],
# [["Q", 1], ["Q", 2], ["Q", 3]],
# [["A", 4], ["A", 5]], [["Q", 6]]]
继续,
idx = e1.map { |(_,i),*| i }
#=> [0, 1, 4, 6]
c = ["options", "option_ids"].
each_with_object({}) { |k,h| h[k] = hash[k].values_at(*idx) }
#=> {"options"=>["L", "Q", "A", "Q"],
# "option_ids"=>["12345", "23456", "45678", "cat"]}
hash.merge(c)
#=> {"id"=>"sjfdkjfd",
# "name"=>"Field Name",
# "type"=>"field",
# "options"=>["L", "Q", "A", "Q"],
# "option_ids"=>["12345", "23456", "45678", "cat"]}
hash = {
"options" => ["Language", "Question", "Question", "Answer", "Answer"],
"option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}
hash.values.transpose.uniq(&:first).transpose.map.with_index {|v,i| [hash.keys[i], v]}.to_h
#=> {"options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}
OP编辑后:
hash = {
"id" => "sjfdkjfd",
"name" => "Field Name",
"type" => "field",
"options" => ["Language", "Question", "Question", "Answer", "Answer"],
"option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}
hash_array = hash.to_a.select {|v| v.last.is_a?(Array)}.transpose
hash.merge([hash_array.first].push(hash_array.last.transpose.uniq(&:first).transpose).transpose.to_h)
#=> {"id"=>"sjfdkjfd", "name"=>"Field Name", "type"=>"field", "options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}