Hi all,

I've encountered an issue where, after loading the model and performing inference once, performance starts to degrade significantly. For example, when I load the model, run inference on an image, and then run inference on the same image again, these are the results I get:

[attached screenshots: first_inference, second_inference, third_inference timing results]

I don't want to call this a bug; it feels like a case where the .eval() flag is not being set and something in the model is changing between runs, but I don't have access to the torch Module here, so I can't set it myself. Has anyone encountered this issue, or is there something wrong with my usage of the mmdeploy API?
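
(To illustrate what I mean by the .eval() pattern: if the underlying torch module were exposed, I would do roughly the following. This is only a sketch using the plain mmdet 2.x API with placeholder config/checkpoint paths; the mmdeploy SDK doesn't give me this handle.)

import torch
from mmdet.apis import inference_detector, init_detector

# Placeholder paths; init_detector builds and returns the torch module.
model = init_detector("my_config.py", "my_checkpoint.pth", device="cuda:0")
model.eval()  # freeze BatchNorm/Dropout so repeated calls behave identically
with torch.no_grad():  # don't accumulate autograd state between calls
    result = inference_detector(model, "demo.jpg")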

Here's a code snippet for how I'm loading the model and running inference, followed by my environment details.

FYI I'm using mmdeploy_runtime "Detector" as opposed to "build_task_processor" because of this bug.

from __future__ import annotations

import math
from argparse import ArgumentParser
from pathlib import Path
from time import perf_counter

import cv2
import mmcv_custom  # noqa: F401  (imported for its registration side effects)
import mmdet_custom  # noqa: F401  (imported for its registration side effects)
import numpy as np
from mmdeploy_runtime import Detector  # noqa: E402


# Typed namespace for the parsed command-line arguments.
class NameSpace:
    input: Path
    output: Path
    model: str
    device_id: int
    score: float


parser = ArgumentParser()
parser.add_argument("input", type=Path, help="Path to the input directory")
parser.add_argument("output", type=Path, help="Path to the output directory")
parser.add_argument(
    "model", help="Path to the mmdeploy SDK model dumped by model converter"
)
parser.add_argument("--device-id", type=int, default=0, help="Device id for inference")
parser.add_argument("--score", type=float, default=0.3, help="Bbox score threshold")

args = parser.parse_args(namespace=NameSpace())

args.output.mkdir(parents=True, exist_ok=True)

detector = Detector(model_path=args.model, device_name="cuda", device_id=args.device_id)

if args.input.is_dir():
    paths = [
        path
        for path in args.input.iterdir()
        if not path.is_dir() and path.suffix in (".jpg", ".jpeg", ".png")
    ]
else:
    paths = [args.input, args.input, args.input, args.input, args.input, args.input]

counter = 0
for path in paths:
    print(f"Processing {path}", end="")
    start = perf_counter()

    image = cv2.imread(str(path))
    bboxes, labels, masks = detector(image)

    # Draw the raw bounding boxes on a separate canvas; a distinct name avoids
    # clobbering it with the per-instance masks unpacked in the loop below.
    bbox_mask = np.zeros_like(image)
    for bbox in bboxes:
        (left, top, right, bottom) = bbox[0:4].astype(int)
        bbox_mask = cv2.rectangle(
            bbox_mask,
            (left, top),
            (right, bottom),
            color=(0, 255, 0),
            thickness=2,
            lineType=cv2.LINE_AA,
        )

    mask: np.ndarray
    for bbox, label, mask in zip(bboxes, labels, masks):
        [left, top, right, bottom], score = bbox[0:4].astype(int), bbox[4]

        if score < args.score:
            continue

        cv2.rectangle(
            image,
            (left, top),
            (right, bottom),
            color=(0, 255, 0),
            thickness=2,
            lineType=cv2.LINE_AA,
        )

        if mask.size:
            # Overlay the instance mask onto the blue channel: a full-image
            # mask is OR-ed in directly; a cropped mask is placed at the
            # bbox's top-left corner.
            blue, green, red = cv2.split(image)
            if mask.shape == image.shape[:2]:
                mask_image = blue
            else:
                x0 = int(max(math.floor(left) - 1, 0))
                y0 = int(max(math.floor(top) - 1, 0))
                mask_image = blue[y0 : y0 + mask.shape[0], x0 : x0 + mask.shape[1]]
            cv2.bitwise_or(mask, mask_image, mask_image)
            image = cv2.merge([blue, green, red])

    cv2.imwrite(str(args.output / f"{path.stem}_{counter}.jpeg"), image)
    cv2.imwrite(str(args.output / f"{path.stem}_{counter}_mask.jpeg"), mask)
    counter += 1

    print(f"... Done in {perf_counter() - start:4.1f} s")

Here is my env check:

$ python3 tools/check_env.py 
2024-04-25 10:26:47,578 - mmdeploy - INFO - 

2024-04-25 10:26:47,578 - mmdeploy - INFO - **********Environmental information**********
2024-04-25 10:26:47,782 - mmdeploy - INFO - sys.platform: linux
2024-04-25 10:26:47,782 - mmdeploy - INFO - Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
2024-04-25 10:26:47,782 - mmdeploy - INFO - CUDA available: True
2024-04-25 10:26:47,782 - mmdeploy - INFO - GPU 0,1: NVIDIA RTX A6000
2024-04-25 10:26:47,782 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda-11.7
2024-04-25 10:26:47,782 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.7, V11.7.99
2024-04-25 10:26:47,782 - mmdeploy - INFO - GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
2024-04-25 10:26:47,782 - mmdeploy - INFO - PyTorch: 1.11.0+cu113
2024-04-25 10:26:47,782 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

2024-04-25 10:26:47,782 - mmdeploy - INFO - TorchVision: 0.12.0+cu113
2024-04-25 10:26:47,782 - mmdeploy - INFO - OpenCV: 4.9.0
2024-04-25 10:26:47,782 - mmdeploy - INFO - MMCV: 1.5.0
2024-04-25 10:26:47,782 - mmdeploy - INFO - MMCV Compiler: GCC 7.3
2024-04-25 10:26:47,782 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2024-04-25 10:26:47,782 - mmdeploy - INFO - MMDeploy: 0.14.0+c737563
2024-04-25 10:26:47,782 - mmdeploy - INFO - 

2024-04-25 10:26:47,782 - mmdeploy - INFO - **********Backend information**********
2024-04-25 10:26:47,790 - mmdeploy - INFO - tensorrt:        None
2024-04-25 10:26:47,841 - mmdeploy - INFO - ONNXRuntime:     None
2024-04-25 10:26:47,842 - mmdeploy - INFO - ONNXRuntime-gpu: 1.8.1
2024-04-25 10:26:47,842 - mmdeploy - INFO - ONNXRuntime custom ops:  Available
2024-04-25 10:26:47,842 - mmdeploy - INFO - pplnn:   None
2024-04-25 10:26:47,843 - mmdeploy - INFO - ncnn:    None
2024-04-25 10:26:47,844 - mmdeploy - INFO - snpe:    None
2024-04-25 10:26:47,845 - mmdeploy - INFO - openvino:        None
2024-04-25 10:26:47,845 - mmdeploy - INFO - torchscript:     1.11.0+cu113
2024-04-25 10:26:47,846 - mmdeploy - INFO - torchscript custom ops:  NotAvailable
2024-04-25 10:26:47,868 - mmdeploy - INFO - rknn-toolkit:    None
2024-04-25 10:26:47,868 - mmdeploy - INFO - rknn2-toolkit:   None
2024-04-25 10:26:47,868 - mmdeploy - INFO - ascend:  None
2024-04-25 10:26:47,869 - mmdeploy - INFO - coreml:  None
2024-04-25 10:26:47,869 - mmdeploy - INFO - tvm:     None
2024-04-25 10:26:47,869 - mmdeploy - INFO - 

2024-04-25 10:26:47,869 - mmdeploy - INFO - **********Codebase information**********
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmdet:   2.28.1
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmseg:   None
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmcls:   None
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmocr:   None
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmedit:  None
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmdet3d: None
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmpose:  None
2024-04-25 10:26:47,870 - mmdeploy - INFO - mmrotate:        None