Benchmark¶
We suggested in the paper that more than half of the (realizable) degreesize sequences falls in the “polynomial regime”. How does this hold in realworld data? The answer is likely affirmative.
Note
We use \(\tau_{\text{c}} = \tau_{\text{r}} + \tau_{\text{b}}\) to denote the time that is necessary to determine if the input integer sequences can be realized as a simplicial complex. We call a joint sequence polynomial if \(\tau_{\text{c}} = 0\) (if simplicial) or \(\tau_{\text{c}} = 1\) (if not simplicial).
In that case, no backtracking (\(\tau_{\text{b}} = 0\)) is necessary, the algorithm either finds an instance in linear time or rejects simpliciality immediately.
The following results are measured with simplicialtest v1.2
, using 1 thread,
for a machine with Intel(R) Core(TM) i78700B CPU @ 3.20GHz (6 cores & 12 threads) and 64GB 2667MHz DDR4 memory.
TABLE—Summary of the joint degree sequences derived from empirical datasets. Shown are the convergence time \(\tau_\text{c}\), wall time, instance size \(E\), number of nodes (resp. facets) \(n\) (resp. \(m\)), maximal/minimal facet size, and maximal/minimal node degree.
Dataset 
\(\tau_\text{c}\) 
Wall time 
\(E\) 
\(m\) 
\(n\) 
\((\textbf{s}_\text{max}, \textbf{s}_\text{min})\) 
\((\textbf{d}_\text{max}, \textbf{d}_\text{min})\) 
Reference 

crime 
0 
621 ms 
805 
194 
551 
(25, 1) 
(14, 1) 

flower and pollinator 
0 
295 ms 
1206 
91 
679 
(189, 1) 
(25, 1) 

human diseasome 
0 
2.1 s 
1416 
752 
1100 
(10, 1) 
(15, 1) 

senate committees 
0 
1.57 s 
5143 
275 
282 
(31, 4) 
(59, 1) 

house committees 
0 
6.53 s 
11640 
302 
1290 
(82, 3) 
(43, 1) 

contact high school 
3 
1min 59s [*] 
11770 
4862 
327 
(5, 2) 
(90, 2) 

contact primary school 
DNF 
N/A 
20615 
8010 
242 
(5, 2) 
(174, 20) 

senate bills 
DNF 
N/A 
92876 
3599 
294 
(99, 3) 
(1147, 1) 

mathoverflow answers 
0 
6h 15min [*] 
131406 
5296 
73851 
(1784, 2) 
(169, 1) 

walmart trips 
DNF 
N/A 
447347 
63687 
88860 
(25, 2) 
(5412, 1) 

house bills 
848 
52h 40min 
927075 
23267 
1494 
(399, 2) 
(3824, 1) 
[*] We also run on a Raspberry Pi 4 (8GB) for the “contact high school” dataset, the wall time is 8min 36s. However, for the “mathoverflow answers” dataset, MemoryError is raised (it has a memory use peak at around 32GB).
DNF: Did Not Finish with the default cutoff at \(\tau_\text{c} = 10^5\).
 deckerlouis1991
S. Decker, C. W. Kohfeld, R. Rosenfeld, and J. Sprague, “St. Louis Homicide Project: Local Responses to a National Problem.” University of Missouri, St. Louis, 1991
 katoinsect1990
Makoto Kato, Takehiko Kakutani, Tamiji Inoue, and Takao Itino, “Insectflower Relationship in the Primary Beech Forest of Ashu, Kyoto.” Contr. Biol. Lab. Kyoto Univ., 1990.
 gohhuman2007
KwangIl Goh, Michael E. Cusick, David Valle, Barton Childs, Marc Vidal, AlbertLászló Barabási, “The human disease network.” PNAS, 2007. DOI: 10.1073/pnas.0701361104 [scihub]
 stewartcongressional2017
Charles Stewart III and Jonathan Woon. “Congressional Committee Assignments, 103rd to 114th Congresses, 1993–2017: Senate.”
 stewartcongressional2017a
Charles Stewart III and Jonathan Woon. “Congressional Committee Assignments, 103rd to 114th Congresses, 1993–2017: House.”
 mastrandreacontact2015
Rossana Mastrandrea, Julie Fournet, and Alain Barrat, “Contact Patterns in a High School: A Comparison between Data Collected Using Wearable Sensors, Contact Diaries and Friendship Surveys.” PLOS ONE, 2015. DOI: 10.1371/journal.pone.0136497 [scihub], arXiv: 1506.03645.
 stehléhigh2011
Juliette Stehlé, Nicolas Voirin, Alain Barrat, Ciro Cattuto, Lorenzo Isella, JeanFrançois Pinton, Marco Quaggiotto, Wouter Van den Broeck, Corinne Régis, Bruno Lina, and Philippe Vanhems, “HighResolution Measurements of FacetoFace Contact Patterns in a Primary School.” PLOS ONE, 2011. DOI: 10.1371/journal.pone.0023176 [scihub], arXiv: 1109.1015.
 chodrowhypergraph2021(1,2)
Philip S. Chodrow, Nate Veldt, and Austin R. Benson, “Hypergraph clustering: from blockmodels to modularity.” 2021. arXiv: 2101.09611.
 veldtminimizing2020
Nate Veldt, Austin R. Benson, and Jon Kleinberg. “Minimizing Localized Ratio Cut Objectives in Hypergraphs.” Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2020. DOI: 10.1145/3394486.3403222 [scihub], arXiv: 2002.09441.
 amburgclustering2020
Ilya Amburg, Nate Veldt, and Austin R. Benson. “Clustering in graphs and hypergraphs with categorical edge labels.” Proceedings of the Web Conference (WWW), 2020. DOI: 10.1145/3366423.3380152 [scihub], arXiv: 1910.09943.