How to merge arrays of hashes by matching a column value

Question  Votes: -2  Answers: 4

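I want to merge two arrays, array1 and array2, by matching a column value. The plot IDs in the two arrays may or may not match. The matching columns are Plot ID in array1 and Plotting ID in array2.

array1 takes precedence: the column values from array1 should come first in the expected output.

If array2 has no match for an array1 row, merge in the array2 column names with zero values.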

array1 = [
{"Date" => "2019-01-01", "Plot ID" => 234},
{"Date" => "2019-01-01", "Plot ID" => 235},
{"Date" => "2019-01-01", "Plot ID" => 236},
{"Date" => "2019-01-01", "Plot ID" => 237},
{"Date" => "2019-01-01", "Plot ID" => 238},
{"Date" => "2019-01-01", "Plot ID" => 239},
{"Date" => "2019-01-01", "Plot ID" => 240},
{"Date" => "2019-01-01", "Plot ID" => 241}
]

array2 = [
{"Date" => "2019-01-01", "Plotting ID" => 234, "size"=> 20, "visit" => 10, "price" => 103},
{"Date" => "2019-01-01", "Plotting ID" => 500,  "size"=> 40, "visit" => 22, "price" => 233},
{"Date" => "2019-01-01", "Plotting ID" => 236,  "size"=> 25, "visit" => 34, "price" => 423},
{"Date" => "2019-01-01", "Plotting ID" => 600,  "size"=> 79, "visit" => 55, "price" => 234}
]

Expected output:

[
{"Date" => "2019-01-01", "Plot ID" => 234, "size"=> 20, "visit" => 10, "price" => 103},
{"Date" => "2019-01-01", "Plot ID" => 235, "size"=> 0, "visit" => 0, "price" => 0},
{"Date" => "2019-01-01", "Plot ID" => 236, "size"=> 25, "visit" => 34, "price" => 423},
{"Date" => "2019-01-01", "Plot ID" => 237, "size"=> 0, "visit" => 0, "price" => 0},
{"Date" => "2019-01-01", "Plot ID" => 238, "size"=> 0, "visit" => 0, "price" => 0},
{"Date" => "2019-01-01", "Plot ID" => 239, "size"=> 0, "visit" => 0, "price" => 0},
{"Date" => "2019-01-01", "Plot ID" => 240, "size"=> 0, "visit" => 0, "price" => 0},
{"Date" => "2019-01-01", "Plot ID" => 241, "size"=> 0, "visit" => 0, "price" => 0}
]
arrays ruby hash merge
4 Answers
1 vote

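This answer only works as intended if array2 doesn't contain any duplicate Plotting ID values. (With duplicate Plotting IDs it still runs, but it uses the last record present in the array for that ID.)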
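The "last record wins" behaviour follows from how `Array#to_h` handles repeated keys: a later pair replaces an earlier one. A minimal sketch, using made-up rows (the `rows` data below is illustrative and not taken from the question):

```ruby
# Hypothetical data: two rows share Plotting ID 234.
rows = [
  { "Plotting ID" => 234, "size" => 20 },
  { "Plotting ID" => 234, "size" => 99 },
  { "Plotting ID" => 236, "size" => 25 }
]

# Build an ID => row lookup; for a repeated key the later pair replaces the earlier one.
lookup = rows.map { |row| [row["Plotting ID"], row] }.to_h

p lookup[234]["size"] # => 99 (the row that appeared last for ID 234)
p lookup.size         # => 2  (number of distinct IDs)
```

If duplicates need different handling (summing, keeping the first, raising), that decision has to be made before the lookup hash is built.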

array1 = [{"Date" => "2019-01-01", "Plot ID" => 234}, {"Date" => "2019-01-01", "Plot ID" => 235}, {"Date" => "2019-01-01", "Plot ID" => 236}, {"Date" => "2019-01-01", "Plot ID" => 237}, {"Date" => "2019-01-01", "Plot ID" => 238}, {"Date" => "2019-01-01", "Plot ID" => 239}, {"Date" => "2019-01-01", "Plot ID" => 240}, {"Date" => "2019-01-01", "Plot ID" => 241}]
array2 = [{"Date" => "2019-01-01", "Plotting ID" => 234, "size"=> 20, "visit" => 10, "price" => 103}, {"Date" => "2019-01-01", "Plotting ID" => 500,  "size"=> 40, "visit" => 22, "price" => 233}, {"Date" => "2019-01-01", "Plotting ID" => 236,  "size"=> 25, "visit" => 34, "price" => 423}, {"Date" => "2019-01-01", "Plotting ID" => 600,  "size"=> 79, "visit" => 55, "price" => 234}]

array2_lookup = array2.map(&:dup).map { |record| [record.delete('Plotting ID'), record] }.to_h
array2_lookup.default = { 'size' => 0, 'visit' => 0, 'price' => 0 }
pp array1.map { |record| array2_lookup[record['Plot ID']].merge(record) }
# [{"Date"=>"2019-01-01", "size"=>20, "visit"=>10, "price"=>103, "Plot ID"=>234},
#  {"size"=>0, "visit"=>0, "price"=>0, "Date"=>"2019-01-01", "Plot ID"=>235},
#  {"Date"=>"2019-01-01", "size"=>25, "visit"=>34, "price"=>423, "Plot ID"=>236},
#  {"size"=>0, "visit"=>0, "price"=>0, "Date"=>"2019-01-01", "Plot ID"=>237},
#  {"size"=>0, "visit"=>0, "price"=>0, "Date"=>"2019-01-01", "Plot ID"=>238},
#  {"size"=>0, "visit"=>0, "price"=>0, "Date"=>"2019-01-01", "Plot ID"=>239},
#  {"size"=>0, "visit"=>0, "price"=>0, "Date"=>"2019-01-01", "Plot ID"=>240},
#  {"size"=>0, "visit"=>0, "price"=>0, "Date"=>"2019-01-01", "Plot ID"=>241}]

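The solution above first loops over array2 and converts it into a hash by deleting the 'Plotting ID' key/value pair from each hash and using the deleted value as the key. For this reason I added the .map(&:dup) call, which prevents the original hashes in array2 from being mutated. If mutating the hashes isn't a problem for you, you can simply remove it.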

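After creating the lookup hash, I set a default value that is used when merging hashes. All that is left is to loop through array1, look up the matching record (if any) or fall back to the default, and merge it with the current element.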
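One detail worth keeping in mind with `Hash#default=` is that every missing key returns the same default object. The approach above stays safe because `merge` builds a new hash instead of modifying the default. A small sketch with illustrative data (the `lookup` contents here are assumptions for the demo):

```ruby
lookup = { 234 => { "size" => 20 } }
lookup.default = { "size" => 0 }

# Every missing key returns the very same default object...
same_object = lookup[1].equal?(lookup[2])

# ...so the non-destructive merge is the safe choice; merge! would alter the shared default.
merged = lookup[235].merge("Plot ID" => 235)

p same_object        # => true
p lookup.key?(235)   # => false (reading a missing key does not store it)
p merged["Plot ID"]  # => 235
p lookup.default.key?("Plot ID") # => false (the default itself is untouched)
```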

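This answer leaves the keys in a somewhat mixed-up order, but since hashes are looked up by key (not by key/value order), this shouldn't be a big problem. If you want all keys in the same order for every record, you can list every key in the default, setting the extra values to nil (or any other value, since they get overwritten):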

array2_lookup.default = { 'Date' => nil, 'size' => 0, 'visit' => 0, 'price' => 0 }
# ...                        ^ added placeholder for ordering purposes
# [{"Date"=>"2019-01-01", "size"=>20, "visit"=>10, "price"=>103, "Plot ID"=>234},
#  {"Date"=>"2019-01-01", "size"=>0, "visit"=>0, "price"=>0, "Plot ID"=>235},
#  {"Date"=>"2019-01-01", "size"=>25, "visit"=>34, "price"=>423, "Plot ID"=>236},
#  {"Date"=>"2019-01-01", "size"=>0, "visit"=>0, "price"=>0, "Plot ID"=>237},
#  {"Date"=>"2019-01-01", "size"=>0, "visit"=>0, "price"=>0, "Plot ID"=>238},
#  {"Date"=>"2019-01-01", "size"=>0, "visit"=>0, "price"=>0, "Plot ID"=>239},
#  {"Date"=>"2019-01-01", "size"=>0, "visit"=>0, "price"=>0, "Plot ID"=>240},
#  {"Date"=>"2019-01-01", "size"=>0, "visit"=>0, "price"=>0, "Plot ID"=>241}]
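If the key order of the question's expected output matters (array1's keys first, then size, visit, price), one possible variation is to start each result from the array1 record and layer the defaults and the match on top. This is a sketch under the same no-duplicates assumption, shown on a shortened copy of the data; the names `defaults`, `lookup` and `result` are my own:

```ruby
array1 = [
  { "Date" => "2019-01-01", "Plot ID" => 234 },
  { "Date" => "2019-01-01", "Plot ID" => 235 }
]
array2 = [
  { "Date" => "2019-01-01", "Plotting ID" => 234, "size" => 20, "visit" => 10, "price" => 103 },
  { "Date" => "2019-01-01", "Plotting ID" => 500, "size" => 40, "visit" => 22, "price" => 233 }
]

defaults = { "size" => 0, "visit" => 0, "price" => 0 }

# Index array2 by ID, dropping the ID key so it never leaks into the result.
lookup = array2.each_with_object({}) do |row, index|
  index[row["Plotting ID"]] = row.reject { |key, _| key == "Plotting ID" }
end

# Start from the array1 record so its keys come first. The block is only called
# for keys present on both sides and keeps array1's value in that case.
result = array1.map do |record|
  extras = defaults.merge(lookup.fetch(record["Plot ID"], {}))
  record.merge(extras) { |_key, from_array1, _from_array2| from_array1 }
end

p result.first.keys
p result.last["size"] # => 0
```

Neither array1 nor array2 is modified, and rows that exist only in array2 (ID 500 here) are ignored.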

1 vote

If I understood the question correctly, this may be a possible option.

template = {"size"=> 0, "visit" => 0, "price" => 0}
array1.map do |h|
  begin
    h.merge!(template, array2.find { |hh| hh["Plotting ID"] == h["Plot ID"] })
     .then { |hh| hh.delete("Plotting ID") }
  rescue TypeError
  end
end

I use rescue because find can return nil, and merge! raises a TypeError when given nil.


Option without rescue:
template = {"size"=> 0, "visit" => 0, "price" => 0}
array1.map do |h|
  h2 = array2.find { |hh| hh["Plotting ID"] == h["Plot ID"] } || {}  # fall back to {} when there is no match
  h.merge!(template, h2).then { |hh| hh.delete("Plotting ID") }
end

Or even as a one-liner:

array1.map { |h| h.merge!({"size"=> 0, "visit" => 0, "price" => 0}, array2.find { |hh| hh["Plotting ID"] == h["Plot ID"] } || {}).then { |hh| hh.delete("Plotting ID") } }


It modifies array1, so:
array1

# [{"Date"=>"2019-01-01", "Plot ID"=>234, "size"=>20, "visit"=>10, "price"=>103}, {"Date"=>"2019-01-01", "Plot ID"=>235, "size"=>0, "visit"=>0, "price"=>0}, {"Date"=>"2019-01-01", "Plot ID"=>236, "size"=>25, "visit"=>34, "price"=>423}, {"Date"=>"2019-01-01", "Plot ID"=>237, "size"=>0, "visit"=>0, "price"=>0}, {"Date"=>"2019-01-01", "Plot ID"=>238, "size"=>0, "visit"=>0, "price"=>0}, {"Date"=>"2019-01-01", "Plot ID"=>239, "size"=>0, "visit"=>0, "price"=>0}, {"Date"=>"2019-01-01", "Plot ID"=>240, "size"=>0, "visit"=>0, "price"=>0}, {"Date"=>"2019-01-01", "Plot ID"=>241, "size"=>0, "visit"=>0, "price"=>0}]
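Because `then` returns the value of its block, the `map` calls above yield whatever `delete` returned rather than the merged hashes, which is why the result is read back from array1. If mutating array1 is undesirable, a non-destructive variant could look like the following sketch. It assumes Ruby 2.6 or later (for `merge` with several arguments) and uses a shortened copy of the data; `result` and `match` are names I introduced:

```ruby
array1 = [
  { "Date" => "2019-01-01", "Plot ID" => 234 },
  { "Date" => "2019-01-01", "Plot ID" => 235 }
]
array2 = [
  { "Date" => "2019-01-01", "Plotting ID" => 234, "size" => 20, "visit" => 10, "price" => 103 }
]

template = { "size" => 0, "visit" => 0, "price" => 0 }

result = array1.map do |h|
  # find returns nil when nothing matches, so fall back to an empty hash.
  match = array2.find { |hh| hh["Plotting ID"] == h["Plot ID"] } || {}
  # merge (without !) builds a new hash; tap returns that hash, not delete's return value.
  h.merge(template, match).tap { |merged| merged.delete("Plotting ID") }
end

p result.map { |h| h["size"] } # => [20, 0]
p array1.first.key?("size")    # => false (array1 is left untouched)
```

Note that `find` scans array2 once per array1 element, so for large inputs a lookup hash keyed by ID is likely the better fit.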

1 vote
template = (array2.first.keys - array1.first.keys - ["Plotting ID"]).product([0]).to_h
  #=> {"size"=>0, "visit"=>0, "price"=>0}

h = array1.each_with_object({}) { |g,h| h[g["Plot ID"]] = g.merge(template) } 
  #=> {234=>{"Date"=>"2019-01-01", "Plot ID"=>234, "size"=>0, "visit"=>0, "price"=>0},
  #    235=>{"Date"=>"2019-01-01", "Plot ID"=>235, "size"=>0, "visit"=>0, "price"=>0}, 
  #    ...
  #    241=>{"Date"=>"2019-01-01", "Plot ID"=>241, "size"=>0, "visit"=>0, "price"=>0}}

array2.each_with_object(h) { |g,f| f.update(g["Plotting ID"]=>
    g.transform_keys { |k| k == "Plotting ID" ? "Plot ID" : k }) }.values
  #=> [{"Date"=>"2019-01-01", "Plot ID"=>234, "size"=>20, "visit"=>10, "price"=>103}, 
  #    {"Date"=>"2019-01-01", "Plot ID"=>235, "size"=> 0, "visit"=> 0, "price"=>  0},
  #    {"Date"=>"2019-01-01", "Plot ID"=>236, "size"=>25, "visit"=>34, "price"=>423},
  #    {"Date"=>"2019-01-01", "Plot ID"=>237, "size"=> 0, "visit"=> 0, "price"=>  0},
  #    {"Date"=>"2019-01-01", "Plot ID"=>238, "size"=> 0, "visit"=> 0, "price"=>  0},
  #    {"Date"=>"2019-01-01", "Plot ID"=>239, "size"=> 0, "visit"=> 0, "price"=>  0},
  #    {"Date"=>"2019-01-01", "Plot ID"=>240, "size"=> 0, "visit"=> 0, "price"=>  0},
  #    {"Date"=>"2019-01-01", "Plot ID"=>241, "size"=> 0, "visit"=> 0, "price"=>  0},
  #    {"Date"=>"2019-01-01", "Plot ID"=>500, "size"=>40, "visit"=>22, "price"=>233},
  #    {"Date"=>"2019-01-01", "Plot ID"=>600, "size"=>79, "visit"=>55, "price"=>234}] 

If desired, h can be substituted out in the last expression.

template would be simpler to define as follows:

template = (%w| size visit price |).product([0]).to_h

That, however, is fragile with respect to changes in the names and/or number of keys in the elements of array2.
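A related fragility is that deriving template from `array2.first` assumes the first element carries every key. If that cannot be guaranteed, the keys could be collected from all elements instead. A sketch with made-up rows where the first element is missing some keys (`id_keys` and `extra_keys` are hypothetical names):

```ruby
array1 = [{ "Date" => "2019-01-01", "Plot ID" => 234 }]
array2 = [
  { "Date" => "2019-01-01", "Plotting ID" => 234, "size" => 20 },
  { "Date" => "2019-01-01", "Plotting ID" => 236, "size" => 25, "visit" => 34, "price" => 423 }
]

id_keys = ["Plot ID", "Plotting ID"]

# Union of all keys seen in array2, minus the keys array1 already has and the ID columns.
extra_keys = array2.flat_map(&:keys).uniq - array1.flat_map(&:keys).uniq - id_keys

# Pair every remaining key with 0 and turn the pairs into a hash.
template = extra_keys.product([0]).to_h

p template.keys # => ["size", "visit", "price"]
```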

Though less efficient, the following may be clearer and easier to maintain.

a2 = array2.map { |h| h.transform_keys { |k| k == "Plotting ID" ? "Plot ID" : k } }
  # => [{"Date"=>"2019-01-01", "Plot ID"=>234, "size"=>20, "visit"=>10, "price"=>103},
  #     {"Date"=>"2019-01-01", "Plot ID"=>500, "size"=>40, "visit"=>22, "price"=>233},
  #     {"Date"=>"2019-01-01", "Plot ID"=>236, "size"=>25, "visit"=>34, "price"=>423},
  #     {"Date"=>"2019-01-01", "Plot ID"=>600, "size"=>79, "visit"=>55, "price"=>234}] 
template = (a2.first.keys - array1.first.keys).product([0]).to_h
  #=> <same as earlier value>
h = array1.each_with_object({}) { |g,h| h[g["Plot ID"]] = g.merge(template) }
  #=> <same as earlier value>
a2.each_with_object(h) { |g,f| f.update(g["Plot ID"]=>g) }.values
  #=> <same as earlier value>

1 vote
array1 = [{"Date" => "2019-01-01", "Plot ID" => 234}, {"Date" => "2019-01-01", "Plot ID" => 235}, {"Date" => "2019-01-01", "Plot ID" => 236}, {"Date" => "2019-01-01", "Plot ID" => 237}, {"Date" => "2019-01-01", "Plot ID" => 238}, {"Date" => "2019-01-01", "Plot ID" => 239}, {"Date" => "2019-01-01", "Plot ID" => 240}, {"Date" => "2019-01-01", "Plot ID" => 241}]
array2 = [{"Date" => "2019-01-01", "Plotting ID" => 234, "size"=> 20, "visit" => 10, "price" => 103}, {"Date" => "2019-01-01", "Plotting ID" => 500,  "size"=> 40, "visit" => 22, "price" => 233}, {"Date" => "2019-01-01", "Plotting ID" => 236,  "size"=> 25, "visit" => 34, "price" => 423}, {"Date" => "2019-01-01", "Plotting ID" => 600,  "size"=> 79, "visit" => 55, "price" => 234}]

grouped = (array2 + array1).group_by { |h| h["Plot ID"] || h["Plotting ID"] }
merged = grouped.values.map { |a| a.inject(:merge) }

# and in case you want exact formatting:
template = { "Date" => nil, "Plot ID" => nil, "size"=> 0, "visit" => 0, "price" => 0 }
# map (not each) so the normalized hashes are returned instead of being discarded
normalized = merged.map { |h| template.merge(h).slice(*template.keys) }

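You can see how each line of code performs a meaningful, independent transformation of the data. I find code like this easier to step through in a debugger.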
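Grouping the concatenated arrays also produces groups for IDs that occur only in array2 (500 and 600 in the question's data). If only array1's rows are wanted, as in the expected output, the same step-by-step pipeline could select groups by array1's IDs. A sketch on a shortened copy of the data, assuming Ruby 2.5 or later for `Hash#slice`; `wanted_ids` is a name I introduced:

```ruby
array1 = [
  { "Date" => "2019-01-01", "Plot ID" => 234 },
  { "Date" => "2019-01-01", "Plot ID" => 235 }
]
array2 = [
  { "Date" => "2019-01-01", "Plotting ID" => 234, "size" => 20, "visit" => 10, "price" => 103 },
  { "Date" => "2019-01-01", "Plotting ID" => 500, "size" => 40, "visit" => 22, "price" => 233 }
]

# Step 1: IDs to keep, in array1's order.
wanted_ids = array1.map { |h| h["Plot ID"] }

# Step 2: group all rows by whichever ID key they carry (array2 first, so array1 wins on merge).
grouped = (array2 + array1).group_by { |h| h["Plot ID"] || h["Plotting ID"] }

# Step 3: keep only array1's groups and fold each group into one hash.
merged = grouped.values_at(*wanted_ids).map { |rows| rows.inject(:merge) }

# Step 4: fill in defaults and fix the key order; slice also drops "Plotting ID".
template = { "Date" => nil, "Plot ID" => nil, "size" => 0, "visit" => 0, "price" => 0 }
normalized = merged.map { |h| template.merge(h).slice(*template.keys) }

p normalized.map { |h| h["Plot ID"] } # => [234, 235]
```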

Learn to build on the built-in Enumerable and Hash methods and you will simplify your code a great deal!
