Literature review on the influence of social networks

The rapid development of social networks has fundamentally changed the way people communicate and greatly promoted interaction between people. It has also given rise to the concept of social network influence, which has attracted growing scholarly attention. The purpose of this article is to summarize current research progress and identify gaps by reviewing the existing research on social network influence. Specifically, this paper summarizes and analyzes the literature on the social network influence of a single Weibo post, of users, and of topics; sets out the research progress and existing problems; and, on that basis, proposes directions for future research. We believe it has considerable reference value for research on social network influence.


Introduction
The rapid development of social networks has provided more channels for human communication and fundamentally changed the way people communicate. It has greatly promoted interaction between people, which in turn gave rise to the concept of "influence" on social networks. Influence is an important aspect of social networks. Generally, social network influence refers to the ability of users on social networks to stimulate particular behaviors in other individuals [1]. It can be studied quantitatively on social network platforms by measuring user behavior in four respects: follows, likes, reposts, and comments [2]. Those who influence the behavior of other users are called influencers, and those whose behavior is influenced are called the affected. It is worth noting that the relation between the two groups is asymmetrical: there is no necessary connection between the influence of individual A on B and the influence of individual B on A.
By definition, high-impact users on social networks are those who can strongly influence other individuals or groups. The impact of these users matters for the spread and guidance of public opinion [3]. Research on social network influence is therefore valuable for the country, society, and public security. At the national level, it can help safeguard national security, economic stability, and development; at the social level, it can help us understand the social behavior of the public; at the public security level, it can provide a theoretical basis for making public decisions or guiding public opinion [4]. It is therefore of great practical and social significance to study the influence of social networks, and in particular to gain an overview of the current state of research in the field.
Previous research on the influence of social networks has mostly concentrated on Twitter, Weibo, TikTok, and Instagram. Based on the different characteristics of these data, different measurement indexes and models for social network influence have been proposed [5]. For example, Robert B et al. analyzed social network influence from a theoretical perspective, including its internal and external causes [6]; focusing on research methods, B. Martínez-López analyzed the research progress on social network nodes [7]. Sergio Barta studied the extent to which relationships established in other social networks are effective on the TikTok platform, and integrated the SOR and ELM models to better explain the influence of influencers on followers [8]. In sum, previous literature reviews have analyzed the research progress on social network influence quite systematically, covering both the factors that cause influence and the progress of research methods from the perspective of nodes.
However, previous literature reviews also have some shortcomings. One is that they are relatively outdated: they introduce few of the latest research methods, and the methods they cover are not varied enough, as they mainly analyze users' influence on social networks and neglect social network influence at the level of a single Weibo post or topic. Another deficiency is that some features of social network influence have received too little attention [7]. There are four main features. First, influence is dynamic. It is a variable that can increase with new experience (interaction or observation) or decrease as time goes by. Second, influence is transmissive. A chain of influence may form: when user B is influenced by user A, the influence may pass on to user C, a follower of user B. Third, influence is subjective. Different users may hold different opinions on the same piece of news. Fourth, influence is measurable. The degree of influence can be expressed as a continuous real number, which can be called the influence value. Although previous literature reviews have covered some of these traits, they have failed to integrate them into a systematic analysis of social network influence.
To address the above problems and to better understand the current evaluation methods for social network influence and their limitations, this paper reviews and analyzes the existing research on social network influence. Literature was retrieved from Web of Science, Google Scholar, and CNKI using the keywords "social influence", "Twitter influence", and "Sina Weibo influence". In total, 49 papers were selected, including high-impact-factor or highly cited classic literature as well as recent papers published in high-level journals. The paper analyzes in turn the research on the influence of a single post, of a blogger, and of a hot topic. By progressively broadening the scope, it summarizes the current research on social network influence in terms of content and methods, and then offers a comprehensive analysis. It is hoped that this review can provide a reference for future study in the field and lay a theoretical foundation for further research on the influence of social networks.

Analysis of research on the social network influence of a Weibo
At present, there is not much research on the influence of a single Weibo post. Many researchers use questionnaires, case studies, and similar methods to study it from a qualitative perspective. The problem with these methods is that the research content is random, subjective, and insufficiently systematic. With the advent of the big data era, questionnaires and case studies are no longer adequate for impact analysis. Some scholars therefore proposed quantitative methods for studying the influence of a given Weibo post. Kempe put forward an influence evaluation system comprising four indexes: follow, likes, comment, and repost [8]. Later, scholars represented by Romero found that one of these indexes, follow, links users relatively weakly compared with the other three. Most studies therefore leave out the follow index when studying the influence of a single Weibo post [9].
With the massive data generated by social media and the emergence of large numbers of "zombies" (invalid accounts registered to inflate celebrities' follower counts) and "water army" accounts (paid posters serving a particular purpose) on Weibo, researchers have gradually found it difficult to analyze the influence of a single Weibo post objectively along these three dimensions, and the coverage of the indexes has been widely questioned. Moreover, the research is complex: it requires studying not only the influence of a given Weibo post but also the emotional tendency of that influence, which is hard to infer from the three indexes. To solve these problems, researchers began to study social network influence through the attributes of a single Weibo post, such as its content, field, emotional tendency, region, text type, length, and originality. Different impact analysis models were created as a result. For instance, HH Fung added two variables, the content and the time of a Weibo post, to the original model in order to measure the potential influence of social network content more accurately [10]. Bakshy and Eckles analyzed the effectiveness of online advertising. Based on the relevance between a user's field and his or her posts, they put forward the concept of field influence and found that advertising effectiveness is highly correlated with the degree of match between users; that is, the influence of a Weibo post depends largely on how well its content fits the field of its users [11]. Bae and other scholars proposed analyzing the emotional tendency of a post as a factor in its impact. Through experiments, they found that posts with different emotional tendencies have significantly different influences on a user [12].
However, research on the influence of a single Weibo post is still not comprehensive enough. Many studies simply equate the influence of users with that of their posts, which is clearly unsound. In quantitative analyses of a single post's influence, the indexes are often chosen arbitrarily, especially the index concerning posting time. Our analysis therefore shows that current research on the influence of a single Weibo post lacks an analysis of the news dissemination mechanism, which undermines the credibility of its results [13].

Analysis of research on the influence of users on social networks
The analysis of research on the influence of users on social networks can be applied to many cases, such as high-impact user identification, online advertising, viral marketing, etc., so there has been a wide range of research on social network influence that is focused on users' influence, with detailed results.But the research perspectives and methods used are different.
Wu, Huberman, Newman, and Girvan proposed using complex network theory to study internet communities. By dividing social networks into communities, they made it possible to identify high-impact nodes [14]. Building on community division, Rogers and Cartano et al. proposed a method for evaluating the influence of high-impact users in a community, namely qualitative research through questionnaires. However, because data collection was difficult, the method was not very practical [15]. Moreover, because qualitative research is random and strongly subjective, its results lack persuasiveness and credibility. How to analyze social network influence through quantitative methods therefore became an urgent problem.
The introduction of quantitative methods from complex network research made it possible to quantify social network influence. Complex network theory regards all relations as interactive networks made up of nodes and edges. Nodes represent real individuals, and edges represent relations between individuals [16]. Newman et al. demonstrated the "small world" effect in social networks, characterized by a small average path length and a large clustering coefficient, which provides the theoretical basis for applying these quantitative methods [17]. With the introduction of complex network methods, it has also become common to use topological structure to analyze social network influence according to the importance of nodes and other characteristics. The disadvantage is that the diversity of relations between users and the characteristics of the nodes themselves are neglected [18]. Wu et al. put forward a method for evaluating the influence of government microblogging: they built the social network relationships between government users, combined the number of followers and the number of likes in Twitter accounts, and finally used the entropy weight method to determine the index weights [19].
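The entropy weight method mentioned above can be illustrated with a minimal sketch. It uses the textbook formulation (column-wise proportions, normalized Shannon entropy, weights proportional to one minus the entropy) and is not the specific procedure of Wu et al.; the function name `entropy_weights` and the sample data are hypothetical.

```python
import math

def entropy_weights(rows):
    """Return one weight per column of a non-negative indicator table.

    Assumes at least two rows and that every column has a positive total.
    """
    n = len(rows)
    m = len(rows[0])
    k = 1.0 / math.log(n)  # scales each column's entropy into [0, 1]
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        entropy = 0.0
        for x in column:
            if x > 0:  # 0 * ln(0) is treated as 0
                p = x / total
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    return [d / norm for d in divergences]

# Hypothetical data: rows are accounts, columns are (followers, likes).
accounts = [
    [1000, 50],
    [1000, 5],
    [1000, 500],
]
weights = entropy_weights(accounts)
```

In this made-up table every account has the same follower count, so that column carries no information and receives a weight close to zero, while the likes column receives nearly all of the weight.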
If the social network is viewed as a graph model represented by G={V, E}, in which V represents node-set, and E represents edge-set in social networks, then the mainstream user impact analysis methods can be explained as follows.

Analysis methods of social networks: degree centrality, closeness centrality, betweenness centrality
Degree centrality: In a social network, degree centrality is an index that measures a user's direct social relationships. For a node, it is obtained by counting the nodes directly connected to it. The more user nodes are directly connected to a user, the higher that user's degree centrality and, correspondingly, the greater the user's influence on other users in the network [20]. If A is the adjacency matrix of the network graph and degree(i) is the degree of node i, then the degree centrality of node i, $C_i^{DEG}$, is simply the degree of the node:

$$C_i^{DEG} = \mathrm{degree}(i) = \sum_{j} A_{ij} \qquad (1)$$

Degree centrality measures the influence of a node directly and at little computational cost. However, in a large-scale Weibo network it sometimes overlooks influential posters.
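As a small illustration, degree centrality can be computed as the row sum of the adjacency matrix A of an undirected network. The four-user network below is hypothetical and the function name is chosen here for illustration only.

```python
def degree_centrality(adjacency):
    """Degree of each node, taken as the row sum of a symmetric 0/1 matrix."""
    return [sum(row) for row in adjacency]

# Hypothetical undirected network of four users; user 0 is linked to everyone.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
degrees = degree_centrality(A)
print(degrees)  # [3, 2, 2, 1]
```

User 0 has the highest degree and would be ranked as the most influential node under this measure.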
Closeness centrality: Unlike degree centrality, closeness centrality measures the indirect influence of nodes and is based on the sum of the shortest distances between a node and all other nodes in the graph. In a social network, a high-impact node should have the smallest sum of shortest distances to the other nodes; that is, it should be closely connected to most nodes, so that it can reach them over relatively short paths and depends less on other nodes when disseminating information. If the closeness centrality of node i is denoted $C_i^{CLO}$, then in its common form, the reciprocal of that sum, it can be expressed as:

$$C_i^{CLO} = \frac{1}{\sum_{j \neq i} S_{ij}} \qquad (2)$$

In the formula, $S_{ij}$ represents the length of the shortest path between nodes i and j in the social network. When two nodes are directly connected, the distance is defined as 1.
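The following sketch computes closeness centrality with breadth-first search. It assumes an unweighted, connected, undirected network and the common convention that closeness is the reciprocal of the sum of shortest-path lengths; the graph and function names are hypothetical.

```python
from collections import deque

def shortest_path_lengths(graph, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness_centrality(graph):
    """Reciprocal of the summed distances to all reachable nodes."""
    result = {}
    for node in graph:
        total = sum(shortest_path_lengths(graph, node).values())
        result[node] = 1.0 / total if total > 0 else 0.0
    return result

# Hypothetical chain of four users: a - b - c - d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
closeness = closeness_centrality(graph)
```

The two middle users reach everyone else in a total of 4 steps and score 0.25, while the two end users need 6 steps and score lower.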
Betweenness centrality: In complex network theory, betweenness centrality measures the ability of a node to act as an intermediary. In social networks, a node with high betweenness plays an important role in information dissemination. Quercia et al. studied the network structure of Twitter and found that most intermediate nodes on Twitter are high-impact opinion leaders in a certain field [21]. This also shows that betweenness centrality can be used to analyze the influence of nodes on information dissemination. If the betweenness centrality of node i is denoted $C_i^{BET}$, then it can be expressed as:

$$C_i^{BET} = \sum_{j < k,\; j,k \neq i} \frac{b_{jik}}{b_{jk}} \qquad (3)$$

In the formula, $b_{jk}$ represents the number of shortest paths between node j and node k, and $b_{jik}$ represents the number of those shortest paths that pass through node i.
However, it is worth noticing that though betweenness centrality can be used to find these intermediate nodes in social networks, it is time-consuming.Therefore, it should be used with caution.
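To make the cost concrete, the sketch below computes betweenness centrality with Brandes' algorithm, a standard technique that is not named in the reviewed literature and is used here only as an illustration. It assumes an unweighted, undirected network given as adjacency lists. It runs one breadth-first search per node, so the running time grows with the number of nodes times the number of edges, which is why the measure is expensive on large networks.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                       # nodes in non-decreasing distance from s
        preds = {v: [] for v in graph}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in score.items()}

# Hypothetical star network: every path between leaves passes through "hub".
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
betweenness = betweenness_centrality(star)
```

In the star network the hub lies on the single shortest path of each of the three leaf pairs, so its betweenness is 3, while every leaf scores 0.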
The shortcoming of social network analysis is that there is not any rigid index for judging its influence, so it lacks in measurement of absolute influence of social networks, and research can only be done on its relative influence [22].

PageRank and HITS algorithms in the field of information search and their extension methods
The PageRank algorithm, a widely used and classic web page ranking algorithm, was proposed by Google founders Larry Page and Sergey Brin in 1998. It was originally used in search engines to sort pages, ranking each web page according to the number of hyperlinks between pages [23]. Researchers soon discovered that the method could also be used for social network impact analysis, and adopted it as a basic algorithm for measuring social network influence. According to Fabián Riquelme et al., half of the researchers doing quantitative research on user influence in social networks used PageRank or an improved variant of it [24]. Tunkelang et al. used PageRank to measure user influence on Twitter. They treated the following relation between Twitter users as analogous to the web page links in the original algorithm, and analyzed the influence of Twitter users according to the number and quality of a user's followers [25]. In the paper, social network influence is defined as the expected number of views (including via reposts) of one's post on a social network platform. In this model, assuming a constant repost probability p, a user's influence is:

$$\mathrm{Influence}(X) = \sum_{Y \in \mathrm{Followers}(X)} \frac{1 + p \cdot \mathrm{Influence}(Y)}{\lVert \mathrm{Following}(Y) \rVert} \qquad (4)$$

In the formula, Influence(X) represents the influence of user X on a social network platform, Y is a user in the follower set of X, and ‖Following(Y)‖ represents the number of users that Y follows.
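A minimal sketch of how such a follower-based influence score could be computed iteratively is given below. It assumes the recurrence in which each follower Y of X contributes (1 + p · Influence(Y)) divided by the number of accounts Y follows; the function name, the value p = 0.1, the fixed iteration count, and the sample accounts are all assumptions made for illustration.

```python
def follower_influence(following, p=0.1, iterations=50):
    """Iterate the influence recurrence; following[u] = accounts that u follows."""
    users = set(following)
    for followed in following.values():
        users |= set(followed)
    # Invert the relation: followers[x] = accounts that follow x.
    followers = {u: set() for u in users}
    for u, followed in following.items():
        for v in followed:
            followers[v].add(u)
    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        influence = {
            x: sum(
                (1.0 + p * influence[y]) / len(following[y])
                for y in followers[x]
            )
            for x in users
        }
    return influence

# Hypothetical accounts: bob and carol follow alice; dave follows bob and alice.
following = {
    "bob": {"alice"},
    "carol": {"alice"},
    "dave": {"bob", "alice"},
}
influence = follower_influence(following)
```

Here dave splits his attention between two accounts, so he contributes only 0.5 to each of them, and alice additionally inherits a fraction p of bob's own influence.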
Because low-impact users have limited influence on their followers, the algorithm gives more weight to being followed by high-impact users, which helps avoid interference from "zombies" and the "water army". However, researchers then found that the PageRank algorithm, because it models the propagation of influence between nodes, has to be computed iteratively, and that it considers only the influence passed between nodes while neglecting the characteristics and attributes of the nodes themselves. As a result, it is difficult to use the algorithm at finer levels, such as measuring user influence with respect to a topic. In addition, because time is not a factor in the algorithm, opinion leaders who registered their accounts earlier appear more influential, while newly registered opinion leaders, though quite influential at the time, turn out to have less impact according to PageRank [26].
To solve this problem, researchers proposed user influence measurement algorithms that build on the traditional PageRank algorithm and combine individual features with network structure. Lü et al. designed LeaderRank, a self-adaptive and parameter-free algorithm for quantifying the influence of users. It adds a new node, called the ground node, to the existing nodes and creates bidirectional connections between it and all other nodes. In this way, a new strongly connected network is constructed. A standard random walk process is then used to find high-impact disseminators. Verification on examples showed that LeaderRank ranks more effectively than PageRank. It is also robust to manipulation and noisy data and solves the PageRank problem of obtaining more than one ranking [27]. Luo et al. comprehensively considered three dimensions, namely users' basic attributes, interaction between users, and blog content, and integrated them into the original PageRank algorithm, effectively improving its accuracy [28].
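The ground-node idea can be sketched as follows. This is an illustrative reading of LeaderRank, not the authors' reference implementation: every user starts with one unit of score, the ground node starts with none, scores are repeatedly split evenly along outgoing links, and the ground node's final score is shared equally among the users. The function name, the iteration count, and the sample network are assumptions.

```python
def leaderrank(follows, iterations=200):
    """follows[u] = set of users that u points to (u is a fan of each)."""
    GROUND = object()  # hypothetical sentinel standing in for the ground node
    users = set(follows)
    for targets in follows.values():
        users |= set(targets)
    # Every user links to the ground node, and the ground node links back.
    out_links = {u: set(follows.get(u, ())) | {GROUND} for u in users}
    out_links[GROUND] = set(users)
    score = {u: 1.0 for u in users}
    score[GROUND] = 0.0
    for _ in range(iterations):
        nxt = {u: 0.0 for u in out_links}
        for u, targets in out_links.items():
            share = score[u] / len(targets)  # random walk: split score evenly
            for v in targets:
                nxt[v] += share
        score = nxt
    bonus = score[GROUND] / len(users)  # redistribute the ground node's score
    return {u: score[u] + bonus for u in users}

# Hypothetical network: three fans all point to one leader.
follows = {"f1": {"leader"}, "f2": {"leader"}, "f3": {"leader"}}
ranks = leaderrank(follows)
```

Because the ground node makes the network strongly connected, no damping parameter is needed, and the total score stays equal to the number of users.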
Besides the classic PageRank algorithm, another widely used web page ranking algorithm, HITS, has also been used by researchers to study social network influence. HITS stands for Hyperlink-Induced Topic Search. In this algorithm, each web page has two scores: a Hub value and an Authority value. The Hub value represents the weighted sum over the pages that a web page links to, that is, the weighted link-out value of a page that provides a set of links to authoritative pages. It is denoted by H[i]. The Authority value represents the weighted sum over the pages that cite an authoritative web page, that is, the weighted link-in value of the authoritative page, denoted by A[i]. The essential task of the HITS algorithm is to return high-quality Authority pages to users.
When HITS is applied to social networks, the number of a user's followers is analogous to the Authority value of the original algorithm, and the number of accounts a user follows is analogous to the Hub value. Unlike the PageRank algorithm, it considers the centrality and the authority of nodes at the same time, and it uses neighboring nodes to express the influence of a node through a random-walk-based influence measure, thus avoiding noise interference. Although it also ignores the attributes of the nodes themselves, and its iteration is more time-consuming than that of PageRank, it is still an effective influence measurement method [29]. Regrettably, current research on applying HITS to social networks is very limited and has produced few results, especially high-level academic ones.
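The mutual reinforcement between Hub and Authority values can be sketched on a follow graph as below. This is a minimal illustration of the standard HITS iteration with L2 normalization, treating "u follows v" as a link from u to v; the function name, the iteration count, and the sample accounts are assumptions.

```python
import math

def hits(follows, iterations=50):
    """follows[u] = set of accounts u follows; returns (hub, authority) dicts."""
    users = set(follows)
    for targets in follows.values():
        users |= set(targets)
    hub = {u: 1.0 for u in users}
    auth = {u: 1.0 for u in users}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the accounts that follow you.
        auth = {u: 0.0 for u in users}
        for u, targets in follows.items():
            for v in targets:
                auth[v] += hub[u]
        # Hub: sum of the authority scores of the accounts you follow.
        hub = {u: sum(auth[v] for v in follows.get(u, ())) for u in users}
        # Normalize both score vectors to unit length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(x * x for x in scores.values())) or 1.0
            for u in scores:
                scores[u] /= norm
    return hub, auth

# Hypothetical accounts: a follows c and d; b follows c.
follows = {"a": {"c", "d"}, "b": {"c"}}
hub, auth = hits(follows)
```

In this example c is followed by both a and b and ends up with the highest Authority value, while a, who follows both authorities, ends up with the highest Hub value.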
The three methods above are the most common ones in research on user influence in social networks. They clearly have some problems, so researchers have made improvements from different perspectives. To identify high-impact users on social network platforms, Q Li, T Zhou et al. proposed a LeaderRank algorithm combined with centrality, which considers not only the number of connected nodes but also the influence of those nodes, thus measuring user influence more accurately [30]. Cheng et al., by studying a large volume of user behavior on Twitter, found that 10% of Twitter users contributed over 86% of the activity on the platform. Hence, if influence is analyzed with the traditional PageRank algorithm without taking user activity into account, the result can be quite different [31]. Weng et al. found that the number of Weibo users and their relation obey the uniform distribution. Based on this, they proposed the topic-sensitive TwitterRank algorithm, which takes both the similarity between users and the link structure into account. It was verified to be more effective than the traditional PageRank [32].
User influence is a vague concept, but the changes in user behavior that it causes are measurable, which lays the foundation for analyzing user influence in social networks. However, despite the many forms of user behavior on social network platforms, most existing models consider only one of them, or analyze the behaviors separately and combine them through simple linear weighting. They therefore fail to consider the connection between user behavior and user information. The inherent complexity of human behavior also hinders the further development of social network impact analysis. For example, a user's high impact on other users may be reflected in their frequent interaction, in their being active on the platform at similar times, or in a high degree of interest matching between them. Current research lacks a deeper analysis of the logic of user behavior and its correlation with the influence of other users. This is a direction for future research [33].

Analysis of research on social network influence from the perspective of topics
The research on social network influence from the perspective of topics is similar, to some extent, to research on the influence of a single Weibo post and on user influence, as they share many evaluation indexes. The influence of a topic can be understood as the aggregate influence of the Weibo posts on that topic, or as a reflection of the influence of the users participating in it. Of course, apart from these similarities, research on the influence of a topic also has its own characteristics.
Based on infectious disease transmission models, Barbieri and Bonchi introduced a new topic-driven communication model that can explain the factors influencing a user's behavior on a social network platform and measure the impact of a topic [34]. Bonchi and Castillo et al. proposed three aspects for measuring the influence of a topic: its spread ability, its popularity, and its audience, which are also the three basic dimensions of topic impact analysis. Based on these three indexes, the TNIM (time network influence model) was established and compared with the traditional LDA model to test its validity [35]. Guo et al. proposed SRTM (social-relational topic model), a topic model based on social networks, and applied it to measuring a user's influence on a given topic. The advantage of this model is that, when learning the distribution of topics a user is interested in, it analyzes not only the content posted by the user but also the content posted by related users [36]. To identify how multiple topics interact as they propagate on Weibo, Zhang and Chen proposed a Weibo topic interaction propagation model with multi-topic intervention. The model integrates multiple interventions by online media and the government and considers the characteristics of unequal topic competition. Simulation experiments show that the model can better simulate the evolution of interacting topic propagation on the Weibo platform [37].
The spread ability of a Weibo topic means the communication effect of a topic posted by users on Weibo. It includes the scope of communication, the number of reposts, and the number of comments, which reflect how actively users spread a given Weibo topic.
The popularity of a Weibo topic refers to the participation and enthusiasm of the users taking part in it. Generally, it can be measured by the number of reposts and comments during a certain period of time. The popularity of a topic is in direct proportion to its chance of being spread. The more users participate in a topic, the more influential it will become [38].
The audience of a Weibo topic refers to the scope over which a topic has spread. It is usually measured by the number of people following a topic. The more influential the users following a topic are, the more users it will attract [38]. According to the three degrees of influence principle proposed by Fowler et al., user influence should be measured comprehensively within three layers of strong connections. This index reflects the audience of a topic [39]. Zhang mined topics along the three dimensions of text, time, and space, identified their propagation characteristics in each dimension, analyzed the differences in the spatial distribution of Weibo information flow, and noted the differences in the diffusion intensity of information flow between regions, which can provide a useful reference for analyzing public emergency events [40].
Canhui, Min, Liyun, et al. analyzed Weibo topics through the events mentioned in them. By studying the life cycle of topics, they captured the spread trend of hot topics from a macroscopic perspective, which lays a foundation for research on topic influence [41]. Wang and Zhang et al. found an inconsistency between the focus of the media and that of users, and proposed an automatic topic ranking algorithm whose results reflect the influence of time, media, and users. This method analyzes the meaning of topics from two directions, providing a new approach to topic impact analysis [42].
In summary, current research on the dynamic spread of Weibo topics has mostly focused on the quality of information and the influence of communicators, and few studies have considered the quality of the topics themselves or how topic influence changes over time [33].

Conclusions
To sum up, a great deal of research has been done on the influence of social networks. Many scholars have their own views on the definition and measurement of social network influence, and have made progress in applying their results in some fields. Nonetheless, current research has some shortcomings [33]: (1) Current research lacks a widely recognized standard, especially a model with valid evaluation criteria, which has led researchers to select different measurement methods. The index system therefore needs further improvement. (2) Research on social network influence focuses more on application and lacks a complete theoretical system. For example, the definition of social network influence remains an open problem. To some extent, the research lacks a macro-level study of its measurement indexes. (3) Current research on social network influence is mainly based on a topic or on content on a platform, which is a single-modal analysis. It lacks multi-modal analysis of other forms such as audio and video, in an age where multi-modal social network information is very common. (4) Most research on the influence of social networks concentrates on a single platform, such as China's Weibo or America's Twitter.
However, whether for a single post, a user, or a topic, influence is rarely limited to one platform, and current research lacks studies of cross-platform influence. Because of the rapid development of social network technology, research on social network influence also faces numerous challenges. Nevertheless, its high research value continues to attract researchers. With improvements in data acquisition technology, the optimization of indexes, and the updating of research methods, the research will continue to thrive. Research on and modeling of social network influence can not only help people further understand the logic and evolution of individual and group behavior on social network platforms, but also provide theoretical support for enterprises and governments in making public decisions or analyzing public opinion. It is also conducive to the safe and sound development of a country's economy, culture, and society. The research is therefore of great theoretical and practical value.