Dynamic network analysis with missing data: Theory and methods

Zack W. Almquist; Carter T. Butts

doi:10.5705/ss.202016.0108

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Statistical methods for dynamic network analysis have advanced greatly in the past decade. This article extends current estimation methods for dynamic network logistic regression (DNR) models, a subfamily of the Temporal Exponential-family Random Graph Models, to network panel data which contain missing data in the edge and/or vertex sets. We begin by reviewing DNR inference in the complete data case. We then provide a missing data framework for DNR families akin to that of Little and Rubin (2002) or Gile and Handcock (2010a). We discuss several methods for dealing with missing data, including multiple imputation (MI). We consider the computational complexity of the MI methods in the DNR case and propose a scalable, design-based approach that exploits the simplifying assumptions of DNR. We dub this technique the “complete-case” method. Finally, we examine the performance of this method via a simulation study of induced missingness in two classic network data sets.

Original language	English (US)
Pages (from-to)	1245-1264
Number of pages	20
Journal	Statistica Sinica
Volume	28
Issue number	3
DOIs	https://doi.org/10.5705/ss.202016.0108
State	Published - Jul 2018
Externally published	Yes

Bibliographical note

Funding Information:
This work was supported in part by ONR award N00014-08-1-1015, ARO awards W911NF-14-1-0577 (YIP) and W911NF-14-1-0552, NSF award IIS-1526736, and NIH/NICHD award 1R01HD068395-01.

Keywords

Dynamic network models
Dynamic network models with missing data
Dynamic network regression
ERGM
Exponential random graph models
Logistic regression
Missing data
Temporal exponential random graph models
TERGM

Cite this

@article{6ac86a4855fd400fa0536117cf9d25cb,
  title = "Dynamic network analysis with missing data: Theory and methods",
  abstract = "Statistical methods for dynamic network analysis have advanced greatly in the past decade. This article extends current estimation methods for dynamic network logistic regression (DNR) models, a subfamily of the Temporal Exponential-family Random Graph Models, to network panel data which contain missing data in the edge and/or vertex sets. We begin by reviewing DNR inference in the complete data case. We then provide a missing data framework for DNR families akin to that of Little and Rubin (2002) or Gile and Handcock (2010a). We discuss several methods for dealing with missing data, including multiple imputation (MI). We consider the computational complexity of the MI methods in the DNR case and propose a scalable, design-based approach that exploits the simplifying assumptions of DNR. We dub this technique the “complete-case” method. Finally, we examine the performance of this method via a simulation study of induced missingness in two classic network data sets.",
  keywords = "Dynamic network models, Dynamic network models with missing data, Dynamic network regression, ERGM, Exponential random graph models, Logistic regression, Missing data, Temporal exponential random graph models, TERGM",
  author = "Almquist, {Zack W.} and Butts, {Carter T.}",
  note = "Funding Information: This work was supported in part by ONR award N00014-08-1-1015, ARO awards W911NF-14-1-0577 (YIP) and W911NF-14-1-0552, NSF award IIS-1526736, and NIH/NICHD award 1R01HD068395-01.",
  year = "2018",
  month = jul,
  doi = "10.5705/ss.202016.0108",
  language = "English (US)",
  volume = "28",
  pages = "1245--1264",
  journal = "Statistica Sinica",
  issn = "1017-0405",
  publisher = "Institute of Statistical Science",
  number = "3",
}

TY - JOUR
T1 - Dynamic network analysis with missing data
T2 - Theory and methods
AU - Almquist, Zack W.
AU - Butts, Carter T.
N1 - Funding Information: This work was supported in part by ONR award N00014-08-1-1015, ARO awards W911NF-14-1-0577 (YIP) and W911NF-14-1-0552, NSF award IIS-1526736, and NIH/NICHD award 1R01HD068395-01.
PY - 2018/7
Y1 - 2018/7
N2 - Statistical methods for dynamic network analysis have advanced greatly in the past decade. This article extends current estimation methods for dynamic network logistic regression (DNR) models, a subfamily of the Temporal Exponential-family Random Graph Models, to network panel data which contain missing data in the edge and/or vertex sets. We begin by reviewing DNR inference in the complete data case. We then provide a missing data framework for DNR families akin to that of Little and Rubin (2002) or Gile and Handcock (2010a). We discuss several methods for dealing with missing data, including multiple imputation (MI). We consider the computational complexity of the MI methods in the DNR case and propose a scalable, design-based approach that exploits the simplifying assumptions of DNR. We dub this technique the “complete-case” method. Finally, we examine the performance of this method via a simulation study of induced missingness in two classic network data sets.

KW - Dynamic network models
KW - Dynamic network models with missing data
KW - Dynamic network regression
KW - ERGM
KW - Exponential random graph models
KW - Logistic regression
KW - Missing data
KW - Temporal exponential random graph models
KW - TERGM
UR - http://www.scopus.com/inward/record.url?scp=85048714929&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048714929&partnerID=8YFLogxK
U2 - 10.5705/ss.202016.0108
DO - 10.5705/ss.202016.0108
M3 - Article
AN - SCOPUS:85048714929
SN - 1017-0405
VL - 28
SP - 1245
EP - 1264
JO - Statistica Sinica
JF - Statistica Sinica
IS - 3
ER -
