Counting occurrences when using a LinkedHashSet

Question  Votes: 0  Answers: 2

I have an XML file in which I need to find the <year> tags and count how often each year occurs. For example:

Found year 2020 10 times.
Found year 2017 1 times.
Found year 2019 2 times. 
(...)

To avoid duplicate years I use a LinkedHashSet. Code:

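import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Publications {
    public static void main(String[] args) throws IOException {
        Set<String> publicationYears = new LinkedHashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("dblp-2020-04-01.xml"))) {
            Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
            for (String line; (line = reader.readLine()) != null; ) {
                Matcher matcher = pattern.matcher(line);
                if (matcher.find()) {
                    String year = matcher.group(1);
                    publicationYears.add(year);
                }
            }
        }
        publicationYears.forEach(System.out::println); // print each distinct year once
    }
}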

Result:

2010
2002
1992
1994
1993
2006
(...)

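But now I can't find an efficient way to count how often each year occurs. Building a multidimensional array and then searching it would be very slow. Any suggestions?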

java arrays xml hash
2 Answers

0 votes

There are many ways to do this. Some of them are:

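  1. Add all the years to a List and use Collections.frequency as follows. Each call to Collections.frequency scans the whole list, so this is simple but does one full pass per distinct year: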
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Set<Integer> yearSet = new LinkedHashSet<Integer>(years);
        for (Integer year : yearSet) {
            System.out.println("Found year " + year + " " + Collections.frequency(years, year) + " times");
        }
    }
}

Output:

Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
  2. Add all the years to a List, then build a frequency Map as follows:
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Map<Integer, Integer> frequencyMap = new LinkedHashMap<Integer, Integer>();
        for (Integer year : years) {
            if (frequencyMap.get(year) == null) {
                frequencyMap.put(year, 1);
            } else {
                frequencyMap.put(year, frequencyMap.get(year) + 1);
            }
        }
        for (Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
            System.out.println("Found year " + entry.getKey() + " " + entry.getValue() + " times");
        }
    }
}

Output:

Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
  3. Add all the years to a List, then build the frequency Map using Map::merge:
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Map<Integer, Integer> frequencyMap = new LinkedHashMap<Integer, Integer>();
        years.forEach(year -> frequencyMap.merge(year, 1, (oldValue, newValue) -> oldValue + newValue));
        for (Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
            System.out.println("Found year " + entry.getKey() + " " + entry.getValue() + " times");
        }
    }
}

Output:

Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
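
Another variant, not part of the original answer, is to let the Stream API build the frequency map. The sketch below assumes the same sample list as above; the class name YearFrequencyStream and the helper countYears are made up for illustration. Collectors.groupingBy groups equal years, Collectors.counting counts each group (as a Long), and passing LinkedHashMap::new keeps the years in first-seen order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class YearFrequencyStream {
    // Hypothetical helper: groups equal years and counts the size of each group.
    // LinkedHashMap::new keeps the keys in the order they were first seen.
    static Map<Integer, Long> countYears(List<Integer> years) {
        return years.stream()
                .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        countYears(years).forEach((year, count) ->
                System.out.println("Found year " + year + " " + count + " times"));
    }
}
```

For this list it should print the same five lines as the variants above. Like the two Map-based variants, it makes a single pass over the list.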

0 votes

Try this:

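  • I replaced the Set with a Map.
  • The statement that does the work is
        count.compute(year, (k,v)->v == null ? 1 : v + 1); 
  • If the year is seen for the first time, it simply stores 1 for that year; otherwise it adds 1 to that year's count.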
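    // Needs imports from java.io, java.nio.file, java.util and java.util.regex.
    public static void main(String[] args) throws IOException {
        // Maps each year to its running count; LinkedHashMap keeps first-seen order.
        Map<String, Integer> count = new LinkedHashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("dblp-2020-04-01.xml"))) {
            Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
            for (String line; (line = reader.readLine()) != null; ) {
                Matcher matcher = pattern.matcher(line);
                if (matcher.find()) {
                    String year = matcher.group(1);
                    count.compute(year, (k,v)->v == null ? 1 : v + 1);
                }
            }
        }
    }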
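
To try the counting logic without the real XML file, here is a minimal self-contained sketch. The class name YearCounter, the helper countYears and the sample lines are made up for illustration. It also differs from the code above in one respect: it loops with while (matcher.find()) rather than if, so several <year> tags on the same line are all counted. That only matters if the file can contain more than one tag per line, which is an assumption here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

class YearCounter {
    private static final Pattern YEAR = Pattern.compile("<year>(.+?)</year>");

    // Hypothetical helper: counts every <year>...</year> value in the given lines,
    // keeping the years in the order they were first seen.
    static Map<String, Integer> countYears(Stream<String> lines) {
        Map<String, Integer> count = new LinkedHashMap<>();
        lines.forEachOrdered(line -> {
            Matcher matcher = YEAR.matcher(line);
            // while (not if), so that several tags on one line are all counted
            while (matcher.find()) {
                count.compute(matcher.group(1), (k, v) -> v == null ? 1 : v + 1);
            }
        });
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the real XML file.
        Stream<String> sample = Stream.of(
                "<year>2010</year>",
                "<year>2002</year>",
                "<year>2010</year><year>1992</year>");
        System.out.println(countYears(sample)); // prints {2010=2, 2002=1, 1992=1}
    }
}
```

For the real file, the same helper could be fed from Files.lines(Paths.get("dblp-2020-04-01.xml")) inside a try-with-resources block, which reads the file lazily instead of loading it all into memory.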

To print the counts, do the following:

count.entrySet().forEach(System.out::println);
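
That statement prints each entry in key=value form (for example 2020=10). To get lines in the "Found year ... times." style shown in the question, Map.forEach can be used instead. A minimal sketch, where the class name YearReport, the helper format and the hand-filled sample map are assumptions for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class YearReport {
    // Hypothetical helper: builds one "Found year <year> <n> times." line per entry.
    static String format(Map<String, Integer> count) {
        StringBuilder out = new StringBuilder();
        count.forEach((year, n) ->
                out.append("Found year ").append(year).append(' ').append(n).append(" times.\n"));
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-filled sample values; in the real program this map comes from the counting loop.
        Map<String, Integer> count = new LinkedHashMap<>();
        count.put("2020", 10);
        count.put("2017", 1);
        System.out.print(format(count));
    }
}
```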