Random FileNotFoundError: [Errno 2] No such file or directory when writing logs in Databricks


I created a logger that writes log files to a folder in a Databricks project:

import logging
import os


def configure_logger(logger, logfile, level=logging.DEBUG):
    """
    Configures a logger with both file and stream handlers.
    
    Parameters:
    - logger (logging.Logger): The logger to configure.
    - logfile (str): The path to the logfile.
    - level (int, optional): The logging level. Defaults to logging.DEBUG.
    
    Returns:
    - tuple: A tuple containing the configured logger and the file handler.
    """
    logger.setLevel(level)

    # Create a file handler with detailed formatting for log output
    file_handler = logging.FileHandler(logfile, mode="w")
    fformatter = logging.Formatter('%(name)s - %(levelname)s: %(message)s')
    file_handler.setFormatter(fformatter)
    logger.addHandler(file_handler)

    # Create a stream handler with simple formatting for cell output
    stream_handler = logging.StreamHandler()
    sformatter = logging.Formatter('%(levelname)s: %(message)s')
    stream_handler.setFormatter(sformatter)
    logger.addHandler(stream_handler)

    # Silence noisy Py4J/PySpark loggers
    logging.getLogger("py4j").setLevel(logging.ERROR)
    logging.getLogger("Comm").setLevel(logging.ERROR)

    return logger, file_handler


# Set up root logger
logger = logging.getLogger()
logfile = '/Workspace/Project/Logs/dev/dev_20SEP2024.log'  # example

# Ensure the directory exists
log_dir = os.path.dirname(logfile)
os.makedirs(log_dir, exist_ok=True)

logger, file_handler = configure_logger(logger, logfile=logfile)

The code works most of the time, but every once in a while I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/Workspace/Project/Logs/dev/dev_20SEP2024.log'
File <command-3660103723975015>, line 10
      7 if not os.path.exists(log_dir):
      8     os.makedirs(log_dir)
---> 10 logger, file_handler = configure_logger(logger, logfile=logfile)

But the folder does exist:

%sh
ls '/Workspace/Project/Logs/dev'
# returns dev_16SEP2024.log

So the folder definitely exists, and logs were written to it only days earlier. After restarting the notebook and waiting a while, it seems to start working again, apparently at random. I'm not sure what the problem is, but it looks like an issue either in Databricks or in the logging library. Has anyone dealt with this? Whatever it is, I need to figure out why it happens, because others are likely to hit the same problem.
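One thing worth checking (this is an assumption on my part, not something I've confirmed) is whether FileHandler objects from an earlier run of the setup cell are still attached to the root logger, since the root logger survives between cell executions in a notebook:

import logging

# List handlers left over from earlier runs of the setup cell; each rerun
# of configure_logger() adds new handlers on top of any existing ones.
for handler in logging.getLogger().handlers:
    print(type(handler).__name__, getattr(handler, "baseFilename", ""))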

python logging databricks
1 Answer

It seems the file handler gets stuck (or something along those lines) if I rerun the code without closing the handler first. I think the problem is fixed now after adding this:

import logging
import os


def configure_logger(logger, logfile=None, level=logging.DEBUG):
    logger.setLevel(level)

    try:
        # Create a file handler with detailed formatting for log output
        file_handler = logging.FileHandler(logfile, mode="w")
    except FileNotFoundError:
        # For some reason, sometimes this error is triggered even though the file
        # exists. Removing the file fixes this.
        os.remove(logfile) 
        # Create a file handler with detailed formatting for log output
        file_handler = logging.FileHandler(logfile, mode="w")
    
    fformatter = logging.Formatter('%(name)s - %(levelname)s: %(message)s')
    file_handler.setFormatter(fformatter)
    logger.addHandler(file_handler)

    # Create a stream handler with simple formatting for cell output
    stream_handler = logging.StreamHandler()
    sformatter = logging.Formatter('%(levelname)s: %(message)s')
    stream_handler.setFormatter(sformatter)
    logger.addHandler(stream_handler)

    # Silence noisy Py4J/PySpark loggers
    logging.getLogger("py4j").setLevel(logging.ERROR)
    logging.getLogger("Comm").setLevel(logging.ERROR)

    return logger, file_handler
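For what it's worth, a cleaner variant of the same idea (a sketch, not something I've tested on Databricks) is to close and detach any existing handlers before reconfiguring, which releases the stale file descriptor without having to delete the log file:

import logging

def reset_handlers(logger):
    # Close each handler to release its underlying file descriptor, then
    # detach it, so a rerun of the setup cell starts from a clean state.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)

# Example: call this before configure_logger() on every rerun.
# reset_handlers(logging.getLogger())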