Matching user identities across social networks with limited profile data

Ildar NURGALIEV , Qiang QU , Seyed Mojtaba Hosseini BAMAKAN , Muhammad MUZAMMAL

Front. Comput. Sci. ›› 2020, Vol. 14 ›› Issue (6) : 146809

PDF (610KB)
Front. Comput. Sci. ›› 2020, Vol. 14 ›› Issue (6) : 146809 DOI: 10.1007/s11704-019-8235-9
RESEARCH ARTICLE

Matching user identities across social networks with limited profile data

Author information +
History +
PDF (610KB)

Abstract

Privacy preservation is a primary concern in social networkswhich employ a variety of privacy preservations mechanisms to preserve and protect sensitive user information including age, location, education, interests, and others. The task of matching user identities across different social networks is considered a challenging task. In this work, we propose an algorithm to reveal user identities as a set of linked accounts from different social networks using limited user profile data, i.e., user-name and friendship. Thus, we propose a framework, ExpandUIL, that includes three standalone algorithms based on (i) the percolation graph matching in ExpandFullName algorithm, (ii) a supervised machine learning algorithm that works with the graph embedding, and (iii) a combination of the two, ExpandUserLinkage algorithm. The proposed framework as a set of algorithms is significant as, (i) it is based on the network topology and requires only name feature of the nodes, (ii) it requires a considerably low initial seed, as low as one initial seed suffices, (iii) it is iterative and scalable with applicability to online incoming stream graphs, and (iv) it has an experimental proof of stability over a real ground-truth dataset. Experiments on real datasets, Instagram and VK social networks, show upto 75% recall for linked accounts with 96% accuracy using only one given seed pair.

Keywords

social networks / user identity linkage / graph structure learning / maximum subgraph matching / graph percolation

Cite this article

Download citation ▾
Ildar NURGALIEV, Qiang QU, Seyed Mojtaba Hosseini BAMAKAN, Muhammad MUZAMMAL. Matching user identities across social networks with limited profile data. Front. Comput. Sci., 2020, 14(6): 146809 DOI:10.1007/s11704-019-8235-9

登录浏览全文

4963

注册一个新账户 忘记密码

References

[1]

Shu K, Wang S,Tang J, Zafarani R, Liu H. User identity linkage across online social networks: a review. ACM SIGKDD Explorations Newsletter, 2017, 18(2): 5–17

[2]

Carmagnola F, Cena F. User identification for cross-system personalisation. Information Sciences, 2009, 179(1): 16–32

[3]

Madden M, Lenhart A, Cortesi S, Gasser U, Duggan M, Smith A, Beaton M. Teens, social media, and privacy. Pew Research Center, 2013, 21: 2–86

[4]

Goga O, Loiseau P, Sommer R, Teixeira R, Gummadi K P. On the reliability of profile matching across large online social networks. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015, 1799–1808

[5]

Korolova A, Motwani R, Nabar S U, Xu Y. Link privacy in social networks. In: Proceedings of the 17th ACM Conference on Information and Knowledge Management. 2008, 289–298

[6]

Chiasserini C F, Garetto M, Leonardi E. Social network deanonymization under scale-free user relations. IEEE/ACM Transactions on Networking, 2016, 24(6): 3756–3769

[7]

Narayanan A, Shmatikov V. De-anonymizing social networks. In: Proceedings of the 30th IEEE Symposium on Security and Privacy. 2009, 173–187

[8]

Sharad K. True friends let you down: benchmarking social graph anonymization schemes. In: Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security. 2016, 93–104

[9]

Chiasserini C F, Garetto M, Leonardi E. Social network deanonymization under scale-free user relations. IEEE/ACM Transactions on Networking, 2016, 24(6): 3756–3769

[10]

Kazemi E, Hassani S H, Grossglauser M. Growing a graph matching from a handful of seeds. Proceedings of the VLDB Endowment, 2015, 8(10): 1010–1021

[11]

Vosecky J, Hong D, Shen V Y. User identification across multiple social networks. In: Proceedings of the 1st International Conference on Networked Digital Technologies. 2009, 360–365

[12]

Vosoughi S, Zhou H, Roy D. Digital stylometry: linking profiles across social networks. In: Proceedings of International Conference on Social Informatics. 2015, 164–177

[13]

Hazimeh H, Mugellini E, Khaled O A, Cudr�-Mauroux P. Socialmatching++: a novel approach for interlinking user profiles on social networks. In: Proceedings of the 4th International Workshop on PROFILES Co-located with the 16th ISWC. 2017

[14]

Lorenzo Livi A R. The graph matching problem. Pattern Analysis and Applications, 2013, 16: 253–283

[15]

Cook D J, Holder L B. Graph-based data mining. IEEE Intelligent Systems, 2000, 15(2): 32–41

[16]

Singh R, Xu J, Berger B. Global alignment of multiple protein interaction networks with application to functional orthology detection. Proceedings of the National Academy of Sciences, 2008, 105(35): 12763–12768

[17]

Kazemi E, Grossglauser M. On the structure and efficient computation of isorank node similarities. Journal of CoRR, 2016, arXiv preprint arXiv:1602.00668

[18]

Al-Azizy D, Millard D E, Symeonidis I, O’Hara K, Shadbolt N. A literature survey and classifications on data deanonymisation. In: Proceedings of the 10th International Conference on Risks and Security of Internet and Systems. 2015, 36–51

[19]

Xie H, Gao K, Zhang Y, Li J, Ren H. Common visual pattern discovery via graph matching. In: Proceedings of the 19th International Conference on Multimedia. 2011, 1385–1388

[20]

Shaji A, Varol A, Torresani L, Fua P. Simultaneous point matching and 3D deformable surface reconstruction. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2010, 1221–1228

[21]

Sanfeliu A, Fu K. A distance measure between attributed relational graphs for pattern recognition. IEEE Transactions on System, Man, and Cybemetics, 1983, 13(3): 353–362

[22]

Messmer B T, Bunke H. A new algorithm for error-tolerant subgraph isomorphism detection. IEEE Transaction on Pattern Analysis & Machine Intelligence, 1998, 20(5): 493–504

[23]

Bunke H, Allermann G. Inexact graph matching for structural pattern recognition. Pattern Recognition Letters, 1983, 1(4): 245–253

[24]

Neuhaus M, Bunke H. An error-tolerant approximate matching algorithm for attributed planar graphs and its application to fingerprint classification. In: Proceedings of Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR). 2004, 180–189

[25]

Levi G. A note on the derivation of maximal common subgraphs of two directed or undirected graphs. Calcolo, 1973, 9(4): 341

[26]

Bunke H, Shearer K. A graph distance metric based on the maximal common subgraph. Pattern Recognition Letters, 1998, 19(3–4): 255–259

[27]

Fernández M L, Valiente G. A graph distance metric combining maximum common subgraph and minimum common supergraph. Pattern Recognition Letters, 2001, 22(6–7): 753–758

[28]

Wallis W D, Shoubridge P, Kraetzl M, Ray D. Graph distances using graph union. Pattern Recognition Letters, 2001, 22(6–7): 701–704

[29]

Bunke H. On a relation between graph edit distance and maximum common subgraph. Pattern Recognition Letters, 1997, 18(8): 689–694

[30]

Neuhaus M, Bunke H. A probabilistic approach to learning costs for graph edit distance. In: Proceedings of the 17th International Conference on Pattern Recognition. 2004, 389–393

[31]

Dempster A P, Laird N M, Rubin D B. Maximum likelihood from incomplete data via the em algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 1977, 39(1): 1–22

[32]

Redner R A, Walker H F. Mixture densities, maximum likelihood and the em algorithm. SIAM Review, 1984, 26(2): 195–239

[33]

Bianchi F M, Livi L, Rizzi A, Sadeghian A. A granular computing approach to the design of optimized graph classification systems. Soft Computing, 2014, 18(2): 393–412

[34]

Bianchi F M, Livi L, Rizzi A. Two density-based k-means initialization algorithms for non-metric data clustering. Pattern Analysis and Applications, 2016, 19(3): 745–763

[35]

Lozano M A, Escolano F. Graph matching and clustering using kernel attributes. Neurocomputing, 2013, 113: 177–194

[36]

Janson S, Łuczak T, Turova T, Vallier T. Bootstrap percolation on the random graph Gn,p. The Annals of Applied Probability, 2012, 22(5): 1989–2047

[37]

Yartseva L, Grossglauser M. On the performance of percolation graph matching. Proceedings of the 1st ACM Conference on Online Social Networks. 2013, 119–130

[38]

Erdös P, Rényi A. On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, 1960, 5(1): 17–60

[39]

Watts D J, Strogatz S H. Collective dynamics of ‘small-world’ networks. Nature, 1998, 393(6684): 440

[40]

Barabási A L, Albert R. Emergence of scaling in random networks. Science, 1999, 286(5439): 509–512

[41]

Chung F, Lu L. Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 2002, 6(2): 125–145

[42]

Sharad K. Change of guard: the next generation of social graph deanonymization attacks. In: Proceedings of ACM Workshop on Artificial Intelligence and Security. 2016, 105–116

[43]

Ho T K. Random decision forests. In: Proceedings of the 3rd International Conference on Document Analysis and Recognition. 1995, 278–282

RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature

AI Summary AI Mindmap
PDF (610KB)

Supplementary files

Article highlights

1265

Accesses

0

Citation

Detail

Sections
Recommended

AI思维导图

/