I have an Apache log file that looks like this:
64.242.88.10 - - [08/Mar/2004:07:56:34 -0800] "GET /twiki/bin/search/Main/SearchResult?scope=text&regex=on&search=Web%20*Index[^A-Za-z] HTTP/1.1" 200 4163
64.242.88.10 - - [08/Mar/2004:08:04:46 -0800] "GET /SpamAssassin.html HTTP/1.1" 200 7368
p5083cd5d.dip0.t-ipconnect.de - - [08/Mar/2004:08:09:32 -0800] "GET /SpamAssassin.html HTTP/1.0" 200 7368
64.242.88.10 - - [08/Mar/2004:08:12:50 -0800] "GET /twiki/bin/view/TWiki/ChangePassword?rev=r1.6 HTTP/1.1" 200 5181
64.242.88.10 - - [08/Mar/2004:08:14:15 -0800] "GET /twiki/bin/edit/TWiki/HaroldGottschalk?t=1078717948 HTTP/1.1" 401 12846
64.242.88.10 - - [08/Mar/2004:08:15:21 -0800] "GET /twiki/bin/edit/Main/Expand_owner_alias?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12846
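I am trying to read the byte count from each line (the last run of digits at the end of the line) and add them up to get the total number of bytes transferred.
My code looks like this: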
static String logEntryFormat = "^(\\S+) (\\S+) (\\S+) \\[(.*?)\\] \"(.*?)\" (\\S+) (\\S+)( \"(.*?)\" \"(.*?)\")?";

public static void countTotalBytes() throws IOException {
    // First read the file into an ArrayList
    Scanner scanner = new Scanner(accessLog);
    ArrayList<String> stringList = new ArrayList<String>();
    while (scanner.hasNextLine()) {
        stringList.add(scanner.nextLine());
    }
    scanner.close();
    // Join all lines into a single string
    String listString = String.join(" ", stringList);
    Pattern findBytes = Pattern.compile(logEntryFormat);
    Matcher matchBytes = findBytes.matcher(listString);
    // Test
    if (matchBytes.matches()) {
        System.out.println(matchBytes.group(7));
    }
}
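Ultimately I want to grab this value, convert it to an int, and compute the total. But my output only ever shows a single byte value, "7368".
I'm not sure whether my approach is completely wrong, but I've been stuck on this for hours.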
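Since the goal of the countTotalBytes() method is to sum the byte values found at the end of every data line in the file, it should probably return that result, for example: long totalBytes = countTotalBytes("LogFile.log"); You can then do whatever you like with totalBytes.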
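Also, you really don't need to load the file into an ArrayList for this task. Just sum the byte values as you read the file, and return the total once the file has been read completely.
Here is an example: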
// Requires java.io.File, java.io.IOException and java.util.Scanner.
public static long countTotalBytes(String accessLog) throws IOException {
    // long, to match the return type and avoid overflow on large logs:
    long totalByteCount = 0;
    // try-with-resources is used here to auto-close the reader:
    try (Scanner reader = new Scanner(new File(accessLog))) {
        String line;
        while (reader.hasNextLine()) {
            line = reader.nextLine().trim();
            // Skip the current line if it is blank:
            if (!line.isEmpty()) {
                // Split the data line on whitespace (one or more characters):
                String[] lineParts = line.split("\\s+");
                /* Get the bytes value from the end of the data line.
                   It is the last element of the array generated above: */
                String numBytes = lineParts[lineParts.length - 1];
                /* Is the value we pulled out a string representation of a
                   signed or unsigned integer? If so, convert it to a long
                   and add it to totalByteCount. A "-" placeholder (no
                   bytes sent) does not match and is skipped: */
                if (numBytes.matches("-?\\d+")) {
                    totalByteCount += Long.parseLong(numBytes);
                }
            }
        }
    }
    return totalByteCount;
}
To use the method above:
try {
    System.out.println(countTotalBytes("AccessLog.log"));
}
catch (IOException ex) {
    System.err.println(ex.getMessage());
}
With the sample data you posted, the console window should display: 49772
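If you would rather keep the regex from your question, the likely culprit is that the pattern is applied once to all lines joined together. Matcher.matches() has to match the entire input, so a single call can yield at most one group(7) value. A minimal sketch of the alternative, applying the pattern to each line separately, might look like this. The class name LogByteCounter and the method sumBytes are hypothetical names chosen for illustration, and the pattern is yours with its escapes restored:

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogByteCounter {
    // Pattern from the question; group 7 is the byte count field.
    private static final Pattern LOG_ENTRY = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[(.*?)\\] \"(.*?)\" (\\S+) (\\S+)( \"(.*?)\" \"(.*?)\")?");

    // Sums group 7 over all lines. Lines that do not match the pattern,
    // or whose byte field is not purely digits (e.g. "-"), are skipped.
    public static long sumBytes(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            Matcher m = LOG_ENTRY.matcher(line.trim());
            if (m.find() && m.group(7).matches("\\d+")) {
                total += Long.parseLong(m.group(7));
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "64.242.88.10 - - [08/Mar/2004:08:04:46 -0800] \"GET /SpamAssassin.html HTTP/1.1\" 200 7368",
            "64.242.88.10 - - [08/Mar/2004:08:12:50 -0800] \"GET /twiki/bin/view/TWiki/ChangePassword?rev=r1.6 HTTP/1.1\" 200 5181");
        System.out.println(sumBytes(sample)); // 7368 + 5181 = 12549
    }
}
```

For a real file you could pass in the result of Files.readAllLines(Paths.get("AccessLog.log")). The split-based method above is simpler if you only need the last field; the regex version is mainly useful if you also want the other groups, such as the status code.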