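I'm running a Windows server on AWS that serves some data to IoT devices, but after a while the server stops responding to requests because it hangs on the s.accept() call. I managed to determine that this happens because the server has too many TCP connections open, so the OS won't allocate any more. That makes sense, but what doesn't make sense to me is why the connections are still open, since they should all have been closed. Here's a sample from my code, with parts omitted for security: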
def connection(conn, addr):
    conn.settimeout(10)
    data = None
    connection_time = datetime.now()
    n_items = 0
    try:
        print(connection_time.strftime("[%d/%m/%Y, %H:%M:%S] "), "new connection started:", addr)
        data = get_info(conn)
        print(addr, data)
        # serve client here, protocol omitted
    except Exception as e:
        print(f"{addr} connection error:" + str(e))
    if data is not None:
        add_connection_info(addr, data, connection_time)
    try:
        conn.close()
        print(connection_time.strftime("[%d/%m/%Y, %H:%M:%S] "), "connection ended:", addr)
    except Exception as e:
        print(f"close failed: {addr} ; {e}")

if __name__ == '__main__':
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain(cert, key, password=*omitted*)
    s = socket.socket()
    s = ssl_context.wrap_socket(s, server_side=True)
    host = "0.0.0.0"
    port = 12345  # not the actual port
    print('Server started:', host, port)
    s.bind((host, port))  # Bind to the port
    s.listen()  # Now wait for client connection.
    s.setblocking(False)
    # Join completed threads and check connection status
    threads = []
    while True:
        for thread in threads:
            thread.join(0)
        threads = [t for t in threads if t.is_alive()]
        print(f"{len(threads)} active connections")
        try:
            # Use select to wait for a connection or timeout
            rlist, _, _ = select.select([s], [], [], 100)  # 100 seconds timeout
            if s in rlist:
                s.settimeout(10)
                # TODO timeout here
                c, addr = s.accept()
                print(f"Accepted connection from {addr}")
                thread = Thread(target=connection, args=(c, addr))
                #thread.daemon = True
                thread.start()
                threads.append(thread)
                print("thread started")
            else:
                print("No connection within 100 second period")
        except BlockingIOError:
            print("No connection ready")
        except Exception as e:
            print("error", str(e))
            try:
                c.close()
                print(f"Connection from {addr} closed due to error.")
            except Exception as e_close:
                print(f"Failed to close connection after error: {str(e_close)}")
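
For comparison, here is a minimal sketch of a variation on the code above. It is an illustration under stated assumptions, not a confirmed fix. It assumes that, because the listening socket is SSL-wrapped, the TLS handshake runs inside s.accept() on the main thread (which I believe is the default behaviour of the ssl module), and that the indentation of connection() is as shown, so that an exception in add_connection_info would skip conn.close(). The sketch accepts a plain TCP socket, does the TLS wrap in the worker thread, and closes the socket in a finally block. The names serve_one, accept_once and handler are hypothetical and do not appear in my real code.

```python
import socket
import ssl
from threading import Thread


def serve_one(raw_conn, addr, handler, ssl_context=None):
    """Run handler(conn, addr) and always close the socket afterwards.

    handler is a hypothetical callable standing in for the omitted protocol.
    If ssl_context is given, the TLS handshake happens here, in the worker,
    bounded by the 10 second timeout set on the raw socket.
    """
    conn = raw_conn
    try:
        raw_conn.settimeout(10)
        if ssl_context is not None:
            # Wrapping an already connected socket performs the handshake.
            conn = ssl_context.wrap_socket(raw_conn, server_side=True)
        handler(conn, addr)
    except Exception as e:
        print(f"{addr} connection error: {e}")
    finally:
        # Reached on success and on every error path.
        conn.close()


def accept_once(listener, handler, ssl_context=None):
    """Accept one plain TCP connection and hand it to a worker thread.

    listener is an ordinary (not SSL-wrapped) listening socket.
    """
    raw_conn, addr = listener.accept()
    thread = Thread(target=serve_one,
                    args=(raw_conn, addr, handler, ssl_context))
    thread.start()
    return thread
```

In the main loop, accept_once(s, handler, ssl_context) would take the place of the s.accept() call and the Thread creation, with s left as the plain socket.socket(). Whether this changes the number of connections shown by the OS is something I have not verified.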
I'm logging the server's output, and when I last checked after a freeze, every
print(connection_time.strftime("[%d/%m/%Y, %H:%M:%S] "), "new connection started:", addr)
had a matching print(connection_time.strftime("[%d/%m/%Y, %H:%M:%S] "), "connection ended:", addr)
So as far as I can tell there should be no open connections, since print(f"{len(threads)} active connections")
reports 0 active threads. But when I open Windows Resource Monitor, there are more than 50 by python