Log files: selecting specific log content within a log file by start and end date


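I am doing log analysis. I first need to extract the dates that appear in a log file, then use those dates to define a start date and an end date. Only the content that falls within the selected range should be shown, which effectively filters the log content by date.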

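I have successfully extracted the dates using a regular expression, but filtering the log content by the start and end dates is not working as expected.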

    @staticmethod
    def filter_log_entries(log_content, start_date, end_date):
        start_datetime = datetime.strptime(start_date, '%d/%b/%Y').replace(tzinfo=timezone.utc)
        end_datetime = datetime.strptime(end_date, '%d/%b/%Y').replace(tzinfo=timezone.utc)

        # Adjust end_datetime to include the entire end day
        end_datetime = end_datetime + timedelta(days=1) - timedelta(seconds=1)

        log_entry_pattern = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]')
        filtered_entries = []

        for line in log_content.split('\n'):
            match = log_entry_pattern.search(line)
            if match:
                entry_datetime_str = match.group(1)
                try:
                    entry_datetime = datetime.strptime(entry_datetime_str, '%d/%b/%Y:%H:%M:%S %z')
                    if start_datetime <= entry_datetime <= end_datetime:
                        filtered_entries.append(line)
                except ValueError:
                    st.write(f"Date parsing error for line: {line}")

        filtered_log_content = "\n".join(filtered_entries)

        return filtered_log_content

Log content (as displayed):

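Dates in the log file are formatted like [17/May/2015:10:05:03 +0000], and the log file ends at [20/May/2015:10:05:03 +0000]. I want to filter the log content so that if I select the date range 17/May/2015 to 18/May/2015, only the content within that period is selected.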

83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:43 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-dashboard3.png HTTP/1.1" 200 171717 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:47 +0000] "GET /presentations/logstash-monitorama-2013/plugin/highlight/highlight.js HTTP/1.1" 200 26185 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:12 +0000] "GET /presentations/logstash-monitorama-2013/plugin/zoom-js/zoom.js HTTP/1.1" 200 7697 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:07 +0000] "GET /presentations/logstash-monitorama-2013/plugin/notes/notes.js HTTP/1.1" 200 2892 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:34 +0000] "GET /presentations/logstash-monitorama-2013/images/sad-medic.png HTTP/1.1" 200 430406 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:57 +0000] "GET /presentations/logstash-monitorama-2013/css/fonts/Roboto-Bold.ttf HTTP/1.1" 200 38720 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:50 +0000] "GET /presentations/logstash-monitorama-2013/css/fonts/Roboto-Regular.ttf HTTP/1.1" 200 41820 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:24 +0000] "GET /presentations/logstash-monitorama-2013/images/frontend-response-codes.png HTTP/1.1" 200 52878 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [17/May/2015:10:05:50 +0000] 

Full link: https://github.com/linuxacademy/content-elastic-log-samples/blob/master/access.log
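The date-extraction step mentioned in the question is not shown. As a rough sketch of what it might look like, the snippet below collects the distinct calendar dates found in the log so they can be offered as start and end choices. The names `DATE_RE` and `available_dates` are hypothetical, the dates are taken as written in the log (the UTC offset is ignored), and `%b` assumes English month abbreviations under the current locale.

```python
import re
from datetime import datetime

# Capture only the date part of timestamps such as [17/May/2015:10:05:03 +0000]
DATE_RE = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}):\d{2}:\d{2}:\d{2} [+-]\d{4}\]')


def available_dates(log_content):
    """Return the distinct dates found in the log, oldest first (hypothetical helper)."""
    # findall returns the captured date strings; a set removes duplicates
    found = {datetime.strptime(d, '%d/%b/%Y').date()
             for d in DATE_RE.findall(log_content)}
    # Sort as real dates, then format back to the log's own notation
    return [d.strftime('%d/%b/%Y') for d in sorted(found)]


sample = (
    '83.149.9.216 - - [18/May/2015:12:05:03 +0000] "GET /b HTTP/1.1"\n'
    '83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /a HTTP/1.1"\n'
    '83.149.9.216 - - [17/May/2015:10:05:43 +0000] "GET /c HTTP/1.1"\n'
)
dates = available_dates(sample)
print(dates)
```

Because `findall` scans the whole text rather than one line at a time, this sketch also finds every timestamp when several entries end up on the same line.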

python linux ubuntu azure-log-analytics
1 Answer

To get the entries from the start date onward while excluding the end date, you can use the following code:

import re
from datetime import datetime, timezone


def test_rith_filter(rith_lg_cnt_test, ri_strt_dt, ri_ed_dt):
    # Parse both bounds as midnight UTC; the end bound is treated as exclusive
    rith_st_dt = datetime.strptime(ri_strt_dt, '%d/%b/%Y').replace(tzinfo=timezone.utc)
    rith_ed_dt = datetime.strptime(ri_ed_dt, '%d/%b/%Y').replace(tzinfo=timezone.utc)
    # Capture timestamps such as [17/May/2015:10:05:03 +0000]
    ri_lg_pt = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]')
    Tested_Result_Out = []
    for ri in rith_lg_cnt_test.split('\n'):
        cat = ri_lg_pt.search(ri)
        if cat:
            ri_entry_dt_str = cat.group(1)
            try:
                ri_entry_dt = datetime.strptime(ri_entry_dt_str, '%d/%b/%Y:%H:%M:%S %z')
                # Keep entries at or after the start and strictly before the end
                if rith_st_dt <= ri_entry_dt < rith_ed_dt:
                    Tested_Result_Out.append(ri)
            except ValueError:
                print(f"Date parsing error for line: {ri}")
    ri_res = "\n".join(Tested_Result_Out)
    return ri_res


rith_lg_cnt_test = """
83.149.9.216 - - [17/May/2015:10:05:03 +0000] "Test .1700.66 Safari/500.99"
83.149.9.216 - - [18/May/2015:12:05:03 +0000] " Rithwik afari/500.99"
83.149.9.216 - - [19/May/2015:14:05:03 +0000] "GET /presentations/logstash-monitorama Bojja .66 Safari/500.99"
"""

ri_strt_dt = "17/May/2015"
ri_ed_dt = "18/May/2015"

test_rith_call = test_rith_filter(rith_lg_cnt_test, ri_strt_dt, ri_ed_dt)
print(test_rith_call)

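Output (posted as an image, not reproduced here). With the inputs above, only the 17/May/2015 entry falls within the range, since the end date is excluded.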
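The code in this answer excludes the end date, whereas the question asks for everything up to and including 18/May/2015. One way to cover the whole end day, sketched here under some assumptions, is to keep the half-open comparison but move the upper bound to midnight of the following day. The names `TIMESTAMP_RE` and `filter_inclusive` are hypothetical, both bounds are interpreted as UTC days, unparseable lines are silently skipped, and `%b` assumes English month abbreviations under the current locale.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches timestamps such as [17/May/2015:10:05:03 +0000]
TIMESTAMP_RE = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]')


def filter_inclusive(log_content, start_date, end_date):
    """Return lines stamped from start_date through end_date, both days included."""
    start = datetime.strptime(start_date, '%d/%b/%Y').replace(tzinfo=timezone.utc)
    # Exclusive upper bound: midnight at the start of the day after end_date
    end_exclusive = (
        datetime.strptime(end_date, '%d/%b/%Y').replace(tzinfo=timezone.utc)
        + timedelta(days=1)
    )
    kept = []
    for line in log_content.splitlines():
        match = TIMESTAMP_RE.search(line)
        if not match:
            continue
        try:
            stamp = datetime.strptime(match.group(1), '%d/%b/%Y:%H:%M:%S %z')
        except ValueError:
            continue  # skip lines whose timestamp cannot be parsed
        if start <= stamp < end_exclusive:
            kept.append(line)
    return "\n".join(kept)


sample = """\
83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /a HTTP/1.1"
83.149.9.216 - - [18/May/2015:12:05:03 +0000] "GET /b HTTP/1.1"
83.149.9.216 - - [19/May/2015:14:05:03 +0000] "GET /c HTTP/1.1"
"""
result = filter_inclusive(sample, "17/May/2015", "18/May/2015")
print(result)
```

With this sample, the 17/May and 18/May lines are kept and the 19/May line is dropped. Using an exclusive bound at the next midnight avoids subtracting one second from the end of the day, so entries with sub-second timestamps in the last second of the day would not be lost.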
