MediaPipe face landmark detection on a static (stored) video


I have been trying to use MediaPipe to detect facial landmarks in a static (stored) video, but every guide and tutorial I can find online uses a live camera feed. It is easy in Python, but I have to do it in JavaScript.

These two guides are the most relevant I have found, but both use a live camera:

https://medium.com/@mamikonyanmichael/what-is-media-pipe-and-how-to-use-it-in-react-53ff418e5a68

https://github.com/jays0606/mediapipe-facelandmark-demo

Edit: my question is how to run MediaPipe's face detection on a static (locally stored) video rather than a live camera feed.

Any kind of help would be greatly appreciated.

A little background: I am building a 3D sign-language avatar. For a given sentence, my initial plan is to stitch together different word-level videos (not efficient, but hey, some progress), then use Ready Player Me and MediaPipe to build a 3D avatar that mimics them.

javascript python deep-learning artificial-intelligence mediapipe
1 Answer

Have a look at this project:

    # Assumed context for this excerpt (from the linked project):
    #   import cv2
    #   import mediapipe as mp
    #   from collections import deque
    #   mediaPipeFaceMesh = mp.solutions.face_mesh
    #   mediaPipeDraw = mp.solutions.drawing_utils
    def run(self):
        videoHandle = cv2.VideoCapture(self.videoSource)
        self.fps = videoHandle.get(cv2.CAP_PROP_FPS)
        # Heuristic: in a 30 FPS experiment, 7 consecutive closed-mouth frames indicated a pause
        self.pauseDuration = int((7 / 30) * self.fps)
        print(f"Video has {self.fps} FPS and pause detection duration = {self.pauseDuration} frames. Processing...")
        frameNumber = 0
        with mediaPipeFaceMesh.FaceMesh(min_detection_confidence=self.minimumDetectionConfidence, min_tracking_confidence=self.minimumTrackingConfidence) as detectedMesh:
            while videoHandle.isOpened():  # as long as there are frames
                frameExists, theImage = videoHandle.read()
                if not frameExists:  # reached end of video
                    break  # for a live stream, you'd use `continue` here instead
                # ---preprocess
                theImage = cv2.cvtColor(theImage, cv2.COLOR_BGR2RGB)
                theImage.flags.writeable = False  # optional performance improvement
                processedImage = detectedMesh.process(theImage)
                # ---extract desired points
                theImage.flags.writeable = True
                theImage = cv2.cvtColor(theImage, cv2.COLOR_RGB2BGR)
                if self.displayMesh:
                    self.displayVideo(theImage, f'FPS: {int(self.fps)}')
                timestamp = videoHandle.get(cv2.CAP_PROP_POS_MSEC) / Const.MILLISECONDS_IN_ONE_SECOND
                print(f"Frame {frameNumber}, timestamp {timestamp}")
                if processedImage.multi_face_landmarks:
                    for detectedFace in processedImage.multi_face_landmarks:
                        if self.hardCodedFaceID not in self.faces:  # face not present in dict
                            self.faces[self.hardCodedFaceID] = deque()  # add new face
                        if self.displayMesh:
                            mediaPipeDraw.draw_landmarks(image=theImage, landmark_list=detectedFace, connections=mediaPipeFaceMesh.FACEMESH_CONTOURS, landmark_drawing_spec=self.drawSettings, connection_drawing_spec=self.drawSettings)
                        landmarkObject = Landmark(timestamp)
                        for pointIterator, pointOnFace in enumerate(detectedFace.landmark):
                            if pointIterator in self.upperLipPoints or pointIterator in self.lowerLipPoints or pointIterator == self.topOfHead or pointIterator == self.tipOfChin:
                                landmarkObject.setPoint(pointIterator, pointOnFace.x, pointOnFace.y, pointOnFace.z)
                        self.faces[self.hardCodedFaceID].append(landmarkObject)
                frameNumber += 1
            videoHandle.release()  # note: the parentheses are required to actually release the file
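The pause heuristic above (7 consecutive closed-mouth frames observed at 30 FPS) is just a scaling of a frame count by the video's actual frame rate. A hypothetical helper (the function name is mine, not from the project) makes the arithmetic explicit:

```javascript
// Hypothetical helper: scale the answer's pause heuristic (7 closed-mouth
// frames observed at 30 FPS) to a video's actual frame rate.
// Mirrors the Python expression int((7/30) * fps).
function pauseDurationFrames(fps, framesAt30Fps = 7) {
  return Math.trunc((framesAt30Fps / 30) * fps);
}

console.log(pauseDurationFrames(30)); // 7
console.log(pauseDurationFrames(60)); // 14
```

So a 60 FPS clip needs twice as many consecutive frames to count as the same real-time pause.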

The video source can simply be a file path:

videoSource = "yourFile.mp4"
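Since the question asks for JavaScript specifically: the same idea (step through a stored file instead of reading a camera) can be sketched in the browser with MediaPipe's `@mediapipe/tasks-vision` package, by seeking a `<video>` element through the file and calling `detectForVideo` on each frame. This is a sketch under assumptions, not a drop-in implementation: the element id `clip`, the hard-coded `FPS` value, and the model/WASM URLs are placeholders you must adapt to your setup.

```javascript
// Sketch (browser-only): run FaceLandmarker over a locally stored video file.
// Assumes the page contains <video id="clip" src="yourFile.mp4" muted></video>.
import { FaceLandmarker, FilesetResolver } from "@mediapipe/tasks-vision";

const FPS = 30; // assumption: the clip's frame rate (read it from your file's metadata)

async function run() {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
  );
  const landmarker = await FaceLandmarker.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        "https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task",
    },
    runningMode: "VIDEO", // VIDEO mode expects monotonically increasing timestamps
    numFaces: 1,
  });

  const video = document.getElementById("clip");
  await new Promise((resolve) =>
    video.addEventListener("loadeddata", resolve, { once: true })
  );

  // Seek through the stored file one frame at a time instead of using a camera.
  for (let t = 0; t < video.duration; t += 1 / FPS) {
    video.currentTime = t;
    await new Promise((resolve) =>
      video.addEventListener("seeked", resolve, { once: true })
    );
    const result = landmarker.detectForVideo(video, t * 1000);
    if (result.faceLandmarks.length > 0) {
      // result.faceLandmarks[0] is an array of normalized {x, y, z} points
      console.log(`t=${t.toFixed(2)}s`, result.faceLandmarks[0][0]);
    }
  }
}

run();
```

Seeking frame by frame is slower than real-time playback but guarantees no frames are dropped, which matters if you want a complete landmark track to drive an avatar.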
