Force from Motion: Decoding Control Force of Activity in a First-Person Video

Hyun Soo Park, Jianbo Shi

Research output: Contribution to journal (Article, peer-review)

Abstract

A first-person video delivers what the camera wearer (actor) experiences through physical interactions with surroundings. In this paper, we focus on the problem of Force from Motion - estimating the active force and torque exerted by the actor to drive her/his activity - from a first-person video. We use two physical cues inherent in the first-person video. (1) Ego-motion: the camera motion is generated by a resultant of force interactions, which allows us to understand the effect of the active force using Newtonian mechanics. (2) Visual semantics: the first-person visual scene is laid out to afford the actor's activity, which is indicative of the physical context of the activity. We estimate the active force and torque using a dynamical system that describes the transition (dynamics) of the actor's physical state (position, orientation, and linear/angular momentum), where the latent physical state is indirectly observed through the first-person video. We approximate the physical state with the 3D camera trajectory, which is reconstructed up to scale and orientation. The absolute scale factor and gravitational field are learned from the ego-motion and visual semantics of the first-person video. Inspired by optimal control theory, we solve the dynamical system by minimizing reprojection error. Our method yields reconstruction that is quantitatively equivalent to IMU measurements in terms of gravity and scale recovery, and it outperforms methods based on 2D optical flow on an active action recognition task. We apply our method to first-person videos of mountain biking, urban bike racing, skiing, speedflying with a parachute, and wingsuit flying, where inertial measurements are not accessible.
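
The Newtonian reasoning behind the ego-motion cue can be illustrated with a minimal sketch. This is not the authors' method, which solves a dynamical system by minimizing reprojection error; here the acceleration is simply approximated by finite differences on a trajectory that is assumed to be already metric and gravity-aligned. The function name `active_force` and the parameters `mass`, `dt`, `scale`, and `gravity` are hypothetical and chosen only for illustration.

```python
def active_force(positions, dt, mass, gravity=(0.0, 0.0, -9.81), scale=1.0):
    """Estimate the non-gravitational force at the interior samples of a trajectory.

    positions: list of (x, y, z) camera positions sampled every `dt` seconds.
    scale:     assumed factor converting reconstruction units to meters.
    gravity:   assumed gravity vector in the trajectory's coordinate frame.
    """
    forces = []
    for prev, cur, nxt in zip(positions, positions[1:], positions[2:]):
        # Central second difference approximates the acceleration at `cur`.
        accel = [scale * (p - 2.0 * c + n) / dt ** 2
                 for p, c, n in zip(prev, cur, nxt)]
        # Newton's second law: mass * accel = F_active + mass * gravity.
        forces.append(tuple(mass * (a - g) for a, g in zip(accel, gravity)))
    return forces


dt = 0.1
# Free fall: no active force is expected.
free_fall = [(0.0, 0.0, -0.5 * 9.81 * (k * dt) ** 2) for k in range(5)]
# Constant horizontal velocity: the active force must cancel gravity.
cruise = [(k * dt, 0.0, 0.0) for k in range(5)]

fall_forces = active_force(free_fall, dt, mass=70.0)
cruise_forces = active_force(cruise, dt, mass=70.0)
print(fall_forces)
print(cruise_forces)
```

In the free-fall case the recovered force is approximately zero, while in the constant-velocity case it is approximately `mass * 9.81` upward. Both depend on the scale and gravity direction being known, which is why the abstract treats their recovery as part of the problem.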

Original language: English (US)
Article number: 8543864
Pages (from-to): 622-635
Number of pages: 14
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 42
Issue number: 3
State: Published - Mar 1 2020

Keywords

  • First-person vision
  • optimal control
  • physical sensation

PubMed: MeSH publication types

  • Journal Article
