How can I best integrate ML models into a web application?

Question

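I have no ML experience and I am a beginner at web application programming. I currently have a Flask application that uses Audiocraft and a classification model. At the moment I store these in the local application folder. I built a Docker image of this application and it came out at 7 GB.

Is there a way to store these models/frameworks somewhere else and only reference them when I need them?

Also, when I run the container on Docker, it takes about 10 minutes to generate 8 seconds of audio with Audiocraft. What would you suggest I do to speed this up?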

Music generation routes.py (snippet)

def load_model():
    model = MusicGen.get_pretrained('facebook/musicgen-small')
    return model

def generate_music_tensors(description, duration: int):
    model = load_model()
    model.set_generation_params(
        use_sampling=True,
        top_k=250,
        duration=duration
    )
    output = model.generate(
        descriptions=[description],
        progress=True,
        return_tokens=True
    )
    return output[0]

@music_generation_bp.route('/', methods=['POST'])
def generate_music():
    data = request.json

    description = data.get('description')
    duration = data.get('duration', 8)  # Default to 8 seconds if not provided
    print("Description:", description)
    print("Duration:", duration)

    if not description:
        return jsonify({'error': 'Description is required'}), 400
    # Generate unique key for the user
    user_id = str(uuid.uuid4())  # or use a user ID from your authentication system
    audio_key_prefix = f"generated_music_{user_id}_{description}"

    # Generate music tensors
    music_tensors = generate_music_tensors(description, duration)
    print("Music Tensors: ", music_tensors)
...

Image classification routes.py (snippet)

# Load the pre-trained model
model_path = os.path.join(MODELS_DIR, 'multi_output_model.h5')
model = tf.keras.models.load_model(model_path)

@image_classification_bp.route('/', methods=['POST'])
@limiter.limit("1/minute")
def classify_image():
    if 'file' not in request.files:
        return jsonify({"error": "No file part in the request"}), 400

    file = request.files['file']

    if file.filename == '':
        return jsonify({"error": "No selected file"}), 400

    #file = os.path.join(TEST_IMG_DIR, 'blue-dress2.png')

    if file:
        # Read the image file
        img = cv2.imdecode(np.frombuffer(file.read(), np.uint8), cv2.IMREAD_UNCHANGED)
        img = cv2.resize(img, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = preprocess_input(img)
        img = np.expand_dims(img, axis=0)

        # Perform prediction
        predictions = model.predict(img)

Answer 1

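Generally, if you want to host a web application that has to store large binary files, you will need blob storage. There are various enterprise solutions, such as AWS S3, Azure Blob Storage, and Google Cloud, that provide the infrastructure to store the data, so you only load it when you need it. However, when the models are very large, this can get somewhat expensive and is not very efficient, because the files still have to be transferred before they can be used.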
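As a rough illustration of the "load it only when needed" idea, here is a minimal sketch that keeps model files out of the image and pulls them into a local cache on first use. The helper name `ensure_local_copy` and the `fetch` callable are assumptions made for this example, not part of any library; `fetch` stands in for whatever download call your storage provider offers.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def ensure_local_copy(cache_path: Path, fetch: Callable[[Path], None]) -> Path:
    """Return cache_path, calling fetch(destination) only if the file is not cached yet.

    `fetch` is a hypothetical callable that writes the model file to the
    given destination path (for example by downloading it from blob storage).
    """
    if cache_path.exists():
        return cache_path  # already downloaded by an earlier call

    cache_path.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary file first so that a failed download
    # never leaves a half-written model file in the cache.
    fd, tmp_name = tempfile.mkstemp(dir=cache_path.parent)
    os.close(fd)
    try:
        fetch(Path(tmp_name))
        os.replace(tmp_name, cache_path)
    finally:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
    return cache_path
```

In a deployment, `fetch` might wrap something like the boto3 S3 client's `download_file`, and the cache directory could be a mounted Docker volume so the weights survive container restarts without being baked into the image.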

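Another option I can think of is to run a separate backend server that has direct access to the models. It accepts incoming requests from the front-end server containing the user's input parameters, generates the required output from them, and returns it as the response to the front-end server's request. That way you do not need to load the model every time a user hits a route of the web application; instead, you forward the request to a server that already has the model loaded. In Docker terms, this means creating a separate image that loads the model and handles incoming requests from the front end, and exposing its running container to the container that runs the Flask web application server.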
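The separate-server idea can be sketched with only the Python standard library. This is an illustration under stated assumptions, not a production setup: `load_model` is a stand-in for an expensive call such as `MusicGen.get_pretrained(...)`, and `start_model_service` and `request_generation` are hypothetical names chosen for this example. The point is that the model is loaded once when the service starts, and the web application only sends JSON to it.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def load_model():
    """Stand-in for an expensive model load (assumption for this sketch)."""
    return lambda description, duration: f"{duration}s of audio for {description!r}"


# Loaded once at service start-up, then reused by every request.
MODEL = load_model()


class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        description = payload.get("description")
        if not description:
            self._reply(400, {"error": "Description is required"})
            return
        result = MODEL(description, payload.get("duration", 8))
        self._reply(200, {"result": result})

    def _reply(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_model_service(host="127.0.0.1", port=0):
    """Start the model service in a background thread; port=0 picks a free port."""
    server = ThreadingHTTPServer((host, port), ModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_generation(host, port, description, duration=8):
    """What the Flask route would call instead of loading the model itself."""
    body = json.dumps({"description": description, "duration": duration}).encode("utf-8")
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        conn.request("POST", "/", body=body, headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()
```

With Docker, the service and the Flask application would run in separate containers on a shared network, and the Flask route would pass the service's container name and port to `request_generation`.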
