
Comments (2)

utility-aagrawal commented on September 27, 2024

@aguscas, you are the best! Makes sense. I'll try that, and thanks a lot for your help.


aguscas commented on September 27, 2024

Hello! That code seems fine! When cutting out the detections, you don't need to worry about the motion estimator: you are just extracting from the current frame the bounding box associated with a detection obtained by running the model on that same frame, so the camera movement doesn't matter at all. The way you are generating the embeddings is perfectly fine.
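
Just to illustrate that point, the cutout is nothing more than slicing the current frame with the detection's bounding box. A minimal sketch (assuming the detection stores its box as two [x, y] corner points, as norfair bounding-box detections do; the helper name is just for illustration):

def cut_out_detection(frame, detection):
    # Crop the current frame to the detection's bounding box.
    # No motion compensation is needed: the box was computed on this same frame.
    (x1, y1), (x2, y2) = detection.points.astype(int)
    return frame[y1:y2, x1:x2]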

What I will say next is optional, but it might improve the coord_transformation returned by the motion_estimator. You can also try masking out the detections (or the tracked_objects) and providing that mask to the MotionEstimator.update method. Basically, create a single-channel mask with the same width and height as the frame that is 1 everywhere except inside the detections (or tracked objects), where it is 0.

# 1 everywhere, 0 inside each detection's bounding box
mask = np.ones(frame.shape[:2], frame.dtype)
for d in detections:
    bbox = d.points.astype(int)
    mask[bbox[0, 1] : bbox[1, 1], bbox[0, 0] : bbox[1, 0]] = 0

So the whole code would look something like this:

import numpy as np
from deepface import DeepFace
from norfair.camera_motion import MotionEstimator

motion_estimator = MotionEstimator()
for i, cv2_frame in enumerate(video):
    if i % skip_period == 0:
        retinaface_detections = detect_faces(cv2_frame)
        detections = retinaface_detections_to_norfair_detections(
            retinaface_detections, track_points=track_points
        )

        frame = cv2_frame.copy()

        # here I am generating the mask from the detections (you can also use the tracked_objects if you want)
        mask = np.ones(frame.shape[:2], frame.dtype)
        for d in detections:
            bbox = d.points.astype(int)
            mask[bbox[0, 1] : bbox[1, 1], bbox[0, 0] : bbox[1, 0]] = 0

        # here I am passing that mask to the motion estimator
        coord_transformation = motion_estimator.update(frame, mask)

        for detection in detections:
            cut = get_cutout(detection.points, frame)
            if cut.shape[0] > 0 and cut.shape[1] > 0:
                # set the embedding of the detection here
                detection.embedding = DeepFace.represent(
                    img_path=cut,
                    model_name=embed_model,
                    enforce_detection=False,
                    detector_backend="retinaface",
                )[0]["embedding"]
            else:
                detection.embedding = None

        tracked_objects = tracker.update(
            detections=detections,
            period=skip_period,
            coord_transformations=coord_transformation,
        )
    else:
        tracked_objects = tracker.update()

The reason for this is that the MotionEstimator instance estimates the movement of the camera from the movement of a few randomly chosen pixels. Ideally, those pixels should be picked from the background, since background objects (a wall, a table, the corner of a room, etc.) move only because of the camera, and it is better to avoid picking objects that have an intrinsic movement (here, the faces, since people can move their heads independently of the camera). That is what the mask does: it tells the MotionEstimator where to look (i.e., ignore the movement of the pixels inside a detection).
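
If you would rather build the mask from the tracked_objects instead of the detections (for example, on frames where you skip the detector), a minimal sketch could look like this (it assumes each tracked object exposes its current bounding-box estimate through its estimate attribute, as norfair's TrackedObject does; the helper name is just for illustration):

import numpy as np

def mask_from_tracked_objects(frame, tracked_objects):
    # 1 = pixel usable for camera motion estimation, 0 = ignore it
    mask = np.ones(frame.shape[:2], frame.dtype)
    for obj in tracked_objects:
        bbox = obj.estimate.astype(int)
        mask[bbox[0, 1] : bbox[1, 1], bbox[0, 0] : bbox[1, 0]] = 0
    return mask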

Of course, this is just a suggestion you might want to try to see if it works better. Bear in mind that I haven't run the code I wrote in this example, so let me know if you run into any problems with it.

