Force from Motion: Decoding Physical Sensation in a First Person Video

Hyun Soo Park; Jyh Jing Hwang; Jianbo Shi

doi:10.1109/CVPR.2016.416

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Scopus citations

Abstract

A first-person video can generate powerful physical sensations of action in an observer. In this paper, we focus on the problem of Force from Motion: decoding the sensation of 1) passive forces such as gravity, 2) the physical scale of the motion (speed) and space, and 3) active forces exerted by the observer, such as pedaling a bike or banking on a ski turn. The sensation of gravity can be observed in a natural image. We learn this image cue to predict a gravity direction in a 2D image and integrate the predictions across images to estimate the 3D gravity direction using structure from motion. The sense of physical scale is revealed to us when the body is in a dynamically balanced state. We compute the unknown physical scale of the 3D reconstructed camera motion by leveraging the torque equilibrium at a banked turn, which relates the centripetal force, gravity, and the body leaning angle. The active force and torque govern 3D egomotion through the physics of rigid body dynamics. Using an inverse dynamics optimization, we directly minimize the 2D reprojection error (in video) with respect to 3D world structure, active forces, and additional passive forces such as air drag and friction. We use structure from motion with the physical scale and gravity direction as an initialization of our bundle adjustment for force estimation. Our method produces reconstructions quantitatively equivalent to IMU measurements in terms of gravity and scale recovery, and outperforms a method based on 2D optical flow on an active action recognition task. We apply our method to first-person videos of mountain biking, urban bike racing, skiing, speedflying with a parachute, and wingsuit flying, where inertial measurements are not accessible.
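The banked-turn idea in the abstract can be illustrated with the textbook lean-angle balance, tan(theta) = v^2 / (g r), where theta is the lean angle measured from the gravity direction, v the speed, and r the turn radius. The sketch below is an assumption made for illustration, not the authors' formulation: it supposes that structure from motion gives speed and turn radius in arbitrary reconstruction units (with time known from the frame rate), so that one unknown scale factor converts both to metres. The function name, its parameters, and the constant `G` are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def metric_scale_from_banked_turn(speed_sfm, radius_sfm, lean_angle_rad):
    """Estimate metres per reconstruction unit from one banked turn.

    Assumes the textbook balance tan(theta) = v^2 / (g * r) with
    v = s * speed_sfm and r = s * radius_sfm, which rearranges to
    s = g * radius_sfm * tan(theta) / speed_sfm**2.

    speed_sfm      -- camera speed in reconstruction units per second
    radius_sfm     -- turn radius in reconstruction units
    lean_angle_rad -- lean angle from the gravity direction, in radians
    """
    if speed_sfm <= 0 or radius_sfm <= 0:
        raise ValueError("speed and radius must be positive")
    return G * radius_sfm * math.tan(lean_angle_rad) / speed_sfm ** 2


# Synthetic check: a 10 m/s turn of radius 20 m, reconstructed at 2.5 m per unit.
true_scale = 2.5
lean = math.atan2(10.0 ** 2, G * 20.0)  # lean angle implied by the balance
scale = metric_scale_from_banked_turn(10.0 / true_scale, 20.0 / true_scale, lean)
print(round(scale, 3))
```

A single turn gives one scale estimate; in practice several turns could be combined (for example by taking a median) to reduce the effect of noisy lean-angle estimates, though that choice is also an assumption here.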

Original language	English (US)
Title of host publication	Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Publisher	IEEE Computer Society
Pages	3834-3842
Number of pages	9
ISBN (Electronic)	9781467388504
DOIs	https://doi.org/10.1109/CVPR.2016.416
State	Published - Dec 9 2016
Event	29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 - Las Vegas, United States Duration: Jun 26 2016 → Jul 1 2016

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume	2016-December
ISSN (Print)	1063-6919

Conference

Conference	29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Country/Territory	United States
City	Las Vegas
Period	6/26/16 → 7/1/16

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Cite this

Park, H. S., Hwang, J. J., & Shi, J. (2016). Force from Motion: Decoding Physical Sensation in a First Person Video. In Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 (pp. 3834-3842). Article 7780785 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2016-December). IEEE Computer Society. https://doi.org/10.1109/CVPR.2016.416

@inproceedings{a7a3b67263e840af97769fdb6d1ed085,
title = "Force from Motion: Decoding Physical Sensation in a First Person Video",
author = "Park, {Hyun Soo} and Hwang, {Jyh Jing} and Jianbo Shi",
note = "Publisher Copyright: {\textcopyright} 2016 IEEE.; 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 ; Conference date: 26-06-2016 Through 01-07-2016",
year = "2016",
month = dec,
day = "9",
doi = "10.1109/CVPR.2016.416",
language = "English (US)",
series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
publisher = "IEEE Computer Society",
pages = "3834--3842",
booktitle = "Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016",
}

TY  - GEN
T1  - Force from Motion
T2  - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
AU  - Park, Hyun Soo
AU  - Hwang, Jyh Jing
AU  - Shi, Jianbo
PY  - 2016/12/9
Y1  - 2016/12/9
UR  - http://www.scopus.com/inward/record.url?scp=84986295283&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84986295283&partnerID=8YFLogxK
U2  - 10.1109/CVPR.2016.416
DO  - 10.1109/CVPR.2016.416
M3  - Conference contribution
AN  - SCOPUS:84986295283
T3  - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP  - 3834
EP  - 3842
BT  - Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
PB  - IEEE Computer Society
Y2  - 26 June 2016 through 1 July 2016
ER  -
