I have an XML file in which I need to find and count the occurrences of the year tag. For example:
Found year 2020 10 times.
Found year 2017 1 times.
Found year 2019 2 times.
(...)
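To avoid duplicate years I store them in a HashSet. Code:
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Publications {
    public static void main(String[] args) throws IOException {
        Set<String> publicationYears = new LinkedHashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("dblp-2020-04-01.xml"))) {
            Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
            for (String line; (line = reader.readLine()) != null; ) {
                Matcher matcher = pattern.matcher(line);
                if (matcher.find()) {
                    String year = matcher.group(1);
                    publicationYears.add(year);
                }
            }
        }
        // publicationYears now holds each distinct year once, in first-seen order
    }
}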
Result:
2010
2002
1992
1994
1993
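2006
(...)
But now I can't find an efficient way to count how many times each year appears. Building a multidimensional array and then searching it would be very slow. Any suggestions?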
There are many ways to do this. Some of them are as follows:
Put the years in a List and use Collections.frequency as follows:
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        // List.of requires Java 9 or later
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        // The LinkedHashSet keeps each distinct year once, in first-seen order
        Set<Integer> yearSet = new LinkedHashSet<Integer>(years);
        for (Integer year : yearSet) {
            // Collections.frequency scans the whole list once per distinct year
            System.out.println("Found year " + year + " " + Collections.frequency(years, year) + " times");
        }
    }
}
Output:
Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
Put the years in a List, then build a frequency Map as follows:
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Map<Integer, Integer> frequencyMap = new LinkedHashMap<Integer, Integer>();
        for (Integer year : years) {
            if (frequencyMap.get(year) == null) {
                // First time this year is seen
                frequencyMap.put(year, 1);
            } else {
                // Year already present: increment its count
                frequencyMap.put(year, frequencyMap.get(year) + 1);
            }
        }
        for (Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
            System.out.println("Found year " + entry.getKey() + " " + entry.getValue() + " times");
        }
    }
}
Output:
Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
Put the years in a List, then build a frequency Map using Map::merge as follows:
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Map<Integer, Integer> frequencyMap = new LinkedHashMap<Integer, Integer>();
        // merge stores 1 for a new year, otherwise adds 1 to the existing count
        years.forEach(year -> frequencyMap.merge(year, 1, (oldValue, newValue) -> oldValue + newValue));
        for (Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
            System.out.println("Found year " + entry.getKey() + " " + entry.getValue() + " times");
        }
    }
}
Output:
Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
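As one more variation on the Map-based approaches above, the same counts could probably be produced with the Stream API. This is only a sketch: the class name YearStreamCount and the helper countYears are made up for illustration and do not come from the answers above, and the counts come back as Long because that is what Collectors.counting() produces.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class YearStreamCount {
    // Hypothetical helper (not from the original answers): groups equal years
    // and counts each group; LinkedHashMap keeps the first-seen order.
    static Map<Integer, Long> countYears(List<Integer> years) {
        return years.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        countYears(years).forEach((year, n) ->
                System.out.println("Found year " + year + " " + n + " times"));
    }
}
```

Like the Map::merge version, this makes a single pass over the list, so it should avoid the repeated scans that Collections.frequency performs.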
Try this:
count.compute(year, (k, v) -> v == null ? 1 : v + 1);
Applied to your code:
Map<String, Integer> count = new LinkedHashMap<>();
try (BufferedReader reader = Files.newBufferedReader(Paths.get("dblp-2020-04-01.xml"))) {
    Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
    for (String line; (line = reader.readLine()) != null; ) {
        Matcher matcher = pattern.matcher(line);
        if (matcher.find()) {
            String year = matcher.group(1);
            // Store 1 the first time a year is seen, otherwise increment its count
            count.compute(year, (k, v) -> v == null ? 1 : v + 1);
        }
    }
}
To print the result, do the following:
count.entrySet().forEach(System.out::println);
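To get the "Found year ... times." wording from the question rather than the default key=value output, the counting and printing could be combined roughly as follows. This is a sketch under a few assumptions: the class YearCounter, the helper countYears and the sample lines are invented for illustration, the lines come from an in-memory list instead of the real file, and it uses while with Map.merge (in place of if with compute) so that several year tags on one line are all counted.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YearCounter {
    private static final Pattern YEAR = Pattern.compile("<year>(.+?)</year>");

    // Hypothetical helper: counts every <year> element found in the given lines.
    static Map<String, Integer> countYears(Iterable<String> lines) {
        Map<String, Integer> count = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher matcher = YEAR.matcher(line);
            // while (not if), so multiple tags on the same line are all counted
            while (matcher.find()) {
                count.merge(matcher.group(1), 1, Integer::sum);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the real XML file
        List<String> lines = List.of(
                "<article><year>2020</year></article>",
                "<article><year>2017</year></article>",
                "<title>No year here</title>",
                "<year>2020</year><year>2019</year>");
        countYears(lines).forEach((year, n) ->
                System.out.println("Found year " + year + " " + n + " times."));
    }
}
```

For the real file, the lines of the BufferedReader could be fed into the same loop. Each line costs one regex scan plus one hash map update, so the whole file is processed in a single pass.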