Abstract
Most rumours related to hot events or emergencies propagate rapidly on online social networks. To track the standpoints of the participants in rumour topics, and thereby help regulate the development of rumours, we propose a multi-feature model that combines classifiers to classify rumour standpoints. The task is defined as classifying each post in an online social network conversation as ‘agree’, ‘disagree’, ‘comment’ or ‘query’ with respect to the previous comment about the rumour. We test two combinatorial models, a decision tree with an adaptive boosting classifier and extremely randomised trees with an adaptive boosting classifier, on different features: a weight matrix built from a combination of the term frequency (TF), inverse document frequency (IDF) and term frequency – inverse document frequency (TFIDF) methods, and a feature vector constructed with the Word2vec method. The experiments show that the combinatorial classifiers that exploit different feature combinations in online social network conversations outperform binary classification; in particular, the topology of the social network has a highly positive impact on the classification results. Furthermore, the ‘comment’ and ‘query’ standpoints are classified better with features of different categories.
1. Introduction
A rumour is a type of opinion based on false facts that relates to hot events or emergencies. Its spread may cause social panic, through which the spreader may achieve some ulterior purpose [1–4]. In recent years, the rapid development of information technology [4], and especially the emergence of online social network platforms such as Twitter and Facebook [5], has increased the speed and breadth of rumour spreading more than ever before [6]. A classic example is the rumour that iodine can ward off nuclear radiation, which spread after the Fukushima nuclear disaster and caused people in China, the United States, Russia and Korea to rush for materials containing iodine [3,7]. This rumour led to a certain degree of social unrest [8,9]. It is therefore necessary to guide and control rumour spreading in order to maintain social stability, and to regulate public opinion in order to reduce the probability of social unrest caused by rumours [10–12]. As a result, researchers have paid increasing attention to rumours in online social networks, especially to determining whether hot events or breaking news are rumours (i.e. rumour detection or identification), which helps to control the spread of rumours or even eliminate them at the source. Zubiaga et al. [13] proposed a rumour-detection framework based on a rumour classification system, that is, determining whether an event is a rumour from the classification of the standpoints expressed in the posts of topic participants.
The classification of standpoints has been studied in many different scenarios and fields. The classification of rumour standpoints has also been addressed in preceding work. Qazvinian et al. [14] explored the effectiveness of three categories of features (content-based, network-based and microblog-specific memes) for correctly identifying rumours, performing two-way classification of each tweet as supporting or denying a long-standing rumour; however, two standpoints are not enough to cover most tweets, and the selection of the three feature categories is not specific. Hamidian and Diab [15] employed the tweet latent vector (TLV) feature, which creates a 100-dimensional vector representing each tweet, to increase the precision of the rumour retrieval task in a supervised rumour classification setting with a standard data set; their research extended the work on classifying rumour standpoints. Zeng et al. [16] explored the automatic classification of rumour stances expressed in crisis-related content posted on social media, using a data set of over 4300 manually coded tweets. However, in handling the data imbalance, too much data is duplicated between the training set and the testing set.
The focus of this article is to classify the opinions expressed on confirmed rumours, so as to analyse the characteristics and regularities of the standpoints on events and then help to detect and identify rumours. How do we determine the standpoint of each tweet? The basic rule is as follows: a tweet that replies with affirmative content such as ‘yes, I agree’ or ‘of course’ agrees with the previous opinion, whereas negative content such as ‘nonsense’ or ‘no, it’s not true’ indicates that the tweet disagrees with the preceding view; the other standpoints are determined similarly. Several researchers believe that the proportions of tweet standpoints in an event can help determine its truth and reliability [6,13]; that is, if more than a certain percentage of tweets are categorised as ‘disagree’, the event is judged to be a rumour and not credible [17–19]. However, tweets in online social networks contain many nonstandard expressions such as slang, dialect, abbreviations and even profanity, and their wording generally does not follow grammatical rules [20,21]. These properties make the classification of rumour standpoints more difficult, which is why we apply natural language processing and data mining techniques to tweets from online social networks.
To address these difficulties, this article assesses whether, and to what extent, tree-structured combination classifiers can help in rumour standpoint classification. Cerri et al. [22] experimentally analysed the behaviour of different decision tree–based hierarchical multilabel classification methods based on local and global classification approaches. However, under a threshold selection scheme, their method tended to favour empty predictions. Amin [23] presented a technique for recognising Arabic text that uses the C4.5 machine learning system to generate a decision tree for classifying each word. Nevertheless, the segmentation of words and sub-words was inconsistent. Simm et al. [24] proposed a tree-based ensemble multi-task learning method for classification and regression (MT-ExtraTrees), based on extremely randomised trees (ExtraTrees). However, these applications of decision trees and ExtraTrees are simple, and they cannot match the scattered characteristics of tweets. In our experiments, we apply tree-structured classifiers to simulate the network topology of different tweets in online social networks. We explore the performance of two classifier configurations in rumour standpoint classification, combining the adaptive boosting classifier with a decision tree and with ExtraTrees, respectively, and compare their performance with that of selected baseline classifiers. Many previous studies experimented only with features extracted from the tweets themselves [25–27]. We therefore apply a feature based on the network topology structure, and it achieves favourable classification results.
The ID of each published tweet is unique, so it can be used to identify the publisher and to analyse the associations between different IDs; the interactions between tweets exhibit the characteristics of a tree structure and of a complex network [28,29].
Zhang et al. [30] proposed a method for sentiment classification based on Word2vec and SVMperf, in which Word2vec shows the capability to capture semantic features in a selected domain and in the Chinese language. Chen [31] proposed a distance-based term weighting method to overcome the bias of the term frequency – inverse document frequency (TFIDF) algorithm when processing large volumes of news, based on the basic characteristic that each news article must be similar to or different from the others. We study the features of tweets, especially the interaction between tweets (e.g. the topology of online social networks), and explore whether, and how strongly, they help the combinatorial classifiers classify rumour standpoints in tree-structured tweet conversations. Based on a combination of TFIDF, term frequency (TF) and inverse document frequency (IDF), we process the initial tweet data and use Word2vec to obtain vector representations of the vocabulary and features. TFIDF is a commonly used weighting technique in information retrieval and data mining that reflects how important a word is to a document in a corpus, or to a sentence in a document [32,33].
The remainder of the article is structured as follows. Section 2 presents the methodology in four parts: the data set, the overall technology framework, the classifiers, and the baseline classifiers and experiment settings. Section 3, ‘Results and discussion’, presents the experimental results. Section 4 lists some conclusions and suggestions for future directions.
2. Methodology
The main purpose of our study is to analyse whether the statement structure and selected features of online social network conversations can be exploited to improve the classification of the standpoints expressed by different participants towards a trending event. Each post in a rumour topic represents the personal view of a participant, so it can be annotated with a standpoint on the rumour topic: ‘agree’, ‘disagree’, ‘comment’ or ‘query’. In addition, the posts in a rumour topic are related to one another. We therefore focus on improving the performance of the classifiers that determine the rumour standpoint of each post.
2.1. Data set
Our data set comprises eight rumour topics (A, B, C, D, E, F, G and H) corresponding to eight breaking news events, collected from tweets on the Twitter platform [6]; the data set is part of the PHEME project [34,35]. The rumour standpoint labels (i.e. ‘agree’, ‘disagree’, ‘comment’ or ‘query’) of the tweets in the eight rumour topics were all revised and annotated by journalists and a linguist on the basis of the existing tweet annotations.
One important characteristic of the eight rumour topics is the uneven distribution of the rumour standpoint labels: tweets annotated as ‘comment’ account for 63.91% of the total, whereas tweets labelled ‘query’ account for only 4.18%. The imbalance varies slightly across the eight rumour topics in the data set, as shown in Table 1 and Figure 1. This imbalanced distribution makes the classification of rumour standpoints more challenging, but it is also closer to the real-world scenario and therefore of great practical significance.
Standpoints distribution of eight rumour topics.

Standpoints distribution of eight rumour topics in each standpoint label.
As Table 1 shows, the data set contains 4738 tweets in total across the rumour topics. Our purpose is to classify the standpoint of each tweet in the rumour topics, which is described as follows: the collected data set includes eight rumour events, namely

Example of a tree-structured sub-topic discussing the Ottawa shooting incident, where the annotated labels (‘agree’, ‘disagree’, ‘comment’ and ‘query’) are the rumour standpoints of each ID.
2.2. The overall technology framework
The overall technology framework consists of two parts, the data processing part and the classifiers part, as shown in Figure 3. In the data processing stage, we apply the combinatorial TFIDF method (TF, IDF and TFIDF) and Word2vec to the tweets to obtain the M × N weight matrix and the 100-dimensional feature vector, respectively. The weight matrix and feature vector are then combined and passed to the combinatorial classifiers for the classification experiments. In Figure 3, ‘F’ denotes the 100-dimensional feature vector generated from the rumour tweets with the Word2vec method. The details are explained in the data processing part and the classifiers part, respectively.

The overall technology framework of classification of rumour standpoints.
First, preprocessing operations such as removing stop words and part of speech tagging are applied to the original rumour tweets corpus. Then, we combine the TF, IDF and TFIDF methods to construct the weight matrix of
where the
In the case of IDF, we calculate it as
where N is the number of tweets in the rumour topics and the
where the

The working process of combination for TF, IDF and TFIDF.
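The TF, IDF and TFIDF weights discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the common textbook definitions (TF as the relative frequency of a term in a tweet, IDF as log(N / n_t) with N the number of tweets and n_t the number of tweets containing the term, and TFIDF as their product), which may differ from the exact formulas and smoothing used in the experiments, and it does not reproduce the rule for combining the three matrices. The function name `tf_idf_matrices` and the toy tweets are hypothetical.

```python
import math
from collections import Counter

def tf_idf_matrices(tweets):
    """Return (vocab, TF, IDF, TFIDF) for a list of tokenised tweets.

    TF and TFIDF are M x N lists of rows (M tweets, N vocabulary terms);
    IDF is a list of N term weights. Textbook definitions are assumed.
    """
    vocab = sorted({tok for tweet in tweets for tok in tweet})
    n_tweets = len(tweets)
    # Document frequency: number of tweets that contain each term.
    df = Counter(tok for tweet in tweets for tok in set(tweet))
    idf = [math.log(n_tweets / df[term]) for term in vocab]
    tf, tfidf = [], []
    for tweet in tweets:
        counts = Counter(tweet)
        length = len(tweet) or 1  # guard against empty tweets
        row = [counts[term] / length for term in vocab]
        tf.append(row)
        tfidf.append([t * w for t, w in zip(row, idf)])
    return vocab, tf, idf, tfidf

# Hypothetical, already preprocessed tweets (stop words removed).
tweets = [["shooting", "confirmed"], ["not", "confirmed"], ["is", "this", "true"]]
vocab, tf, idf, tfidf = tf_idf_matrices(tweets)
```

Here 'confirmed' occurs in two of the three tweets, so its IDF is log(3/2), lower than that of terms occurring in a single tweet.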
To study combinatorial classifiers for rumour standpoint classification, we design our experiments with two types of features: original features, derived from the tweets themselves, and contextual features, derived from the contextual relationships between tweets. The original features are extracted from the current tweet only; using them to evaluate every classifier under the same conditions makes for a fairer comparison and lets us quantitatively assess the contribution of each feature to the classification results. The contextual features are generated from the related tweets in the same sub-topic and are intended to further improve classification performance. Both types include several subtypes of features, as shown in Table 2.
The description of original features and contextual features.
These features are explained as follows:
Word embedding: we use Word2vec to represent the content of each tweet as a vector. We build a 100-dimensional Word2vec model by training on a Wikipedia corpus and the corpus of the eight rumour topics. Based on this model, we represent each tweet by the average of the vectors of all the words in the tweet.
Negative words: the number of negative words in the tweet, where the negative words are the following: not, no, nobody, nothing, none, never, neither, nor, nowhere, hardly, scarcely, barely, don’t, isn’t, wasn’t, shouldn’t, wouldn’t, couldn’t, doesn’t, few, seldom, little, avoid, ban, cancel, deny, deprive, forbid, ignore, lack, lose, miss, neglect, stop and prohibit.
Bad words: the number of bad words present in the tweet, drawn from a stored list of 371 bad words.
Exclamation mark: the number of exclamation marks in the tweet.
Question mark: the number of question marks in the tweet.
URL link: the number of URL links in the tweet.
Pictures or video marks: the number of picture or video marks in the tweet.
Cosine similarity between current tweet and source tweet: the semantic correlation between the current tweet and the source tweet of its rumour sub-topic, calculated as the cosine similarity between the word vector of the current tweet and that of the source tweet. This feature helps us to study the impact of the source tweet on the current tweet.
Cosine similarity between current tweet and previous tweet: computed as in the previous feature, this is the semantic correlation between the current tweet and the previous tweet, that is, the tweet to which the current tweet directly responds.
Number of replies: the number of replies that a tweet has received.
Degree distribution: degree distribution is a concept from graph theory and complex network theory [36]. It describes the probability distribution of node degrees over the network. We can apply it to an online social network, which is a kind of complex network, where the degree of an ID is the number of other IDs connected to it. This feature therefore helps us to study the influence of the network structure on the classification of rumour standpoints. The overall network connection of the eight rumour topics is shown in Figure 5.
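As an illustration of how several of the features listed above could be computed, the following minimal Python sketch counts the surface features of a tweet, averages word vectors into a tweet vector and computes a cosine similarity. It is an assumption-laden example, not the implementation used in the experiments: the tokenisation, the URL pattern, the function names and the shortened `NEGATIVE_WORDS` subset are all hypothetical, and the embeddings are passed in as a plain dictionary rather than a trained Word2vec model.

```python
import math
import re

# Illustrative subset only; the full list of negative words is given in the text.
NEGATIVE_WORDS = {"not", "no", "never", "nothing", "deny"}

def surface_features(text, bad_words=frozenset()):
    """Count the lexical and punctuation features of a single tweet."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "negative_words": sum(tok in NEGATIVE_WORDS for tok in tokens),
        "bad_words": sum(tok in bad_words for tok in tokens),
        "exclamation_marks": text.count("!"),
        "question_marks": text.count("?"),
        "urls": len(re.findall(r"https?://\S+", text)),
    }

def average_vector(tokens, embeddings, dim):
    """Represent a tweet by the mean of the vectors of its known words."""
    vectors = [embeddings[tok] for tok in tokens if tok in embeddings]
    if not vectors:
        return [0.0] * dim  # no known word: fall back to the zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

features = surface_features("No way, this is NOT true!! http://t.co/x ?")
toy_embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0]}  # hypothetical 2-d vectors
tweet_vector = average_vector(["a", "b", "unknown"], toy_embeddings, dim=2)
```

The contextual similarity features would then be `cosine_similarity` applied to the vector of the current tweet and the vector of the source tweet or of the tweet being replied to.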

(a) The overall network connection figure of the eight rumour topics and (b) zoomed in version of part (a).
As Figure 5 shows, the size of a node depends on its degree: the greater the degree, the larger the node. A larger node is more important and has more influence on the classification of rumour standpoints, which is why we add the degree distribution as a feature in the experiment.
The vectors generated by Word2vec are much denser than sparse count-based representations [37]. We therefore use Word2vec to vectorise words, so that each word is mapped to a vector for mathematical computation and simulation. The Word2vec working process is illustrated in Figure 6.
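The two network topology features can be illustrated with a short sketch that derives the number of replies, the degree of each ID and the degree distribution from reply relations. The input format (a list of `(reply_id, parent_id)` pairs) and the function name `topology_features` are assumptions made for this example; how the degree enters the feature vector in the experiments is not specified here.

```python
from collections import Counter, defaultdict

def topology_features(reply_edges):
    """Compute reply counts, node degrees and the degree distribution.

    reply_edges: iterable of (reply_id, parent_id) pairs, one per reply.
    """
    replies = Counter()            # number of replies each ID has received
    neighbours = defaultdict(set)  # distinct IDs connected to each ID
    for reply_id, parent_id in reply_edges:
        replies[parent_id] += 1
        neighbours[reply_id].add(parent_id)
        neighbours[parent_id].add(reply_id)
    degree = {node: len(adjacent) for node, adjacent in neighbours.items()}
    # Degree distribution P(k): fraction of nodes whose degree equals k.
    histogram = Counter(degree.values())
    distribution = {k: n / len(degree) for k, n in histogram.items()}
    return replies, degree, distribution

# Hypothetical conversation: t2 and t3 reply to the source t1, t4 replies to t2.
edges = [("t2", "t1"), ("t3", "t1"), ("t4", "t2")]
replies, degree, distribution = topology_features(edges)
```

In this toy tree, t1 and t2 both have degree 2 and the leaves t3 and t4 have degree 1, so half of the nodes fall in each degree class.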

The Word2vec CBOW model working procedure.
The words are put into the Word2vec – continuous bag-of-words (CBOW) model to calculate the 100-dimensional word vector of the target word
The vector dimension:
2.3. Classifiers part
In this section, we describe the combinatorial classifiers applied in our experiments. As Figure 2 shows, the discussion of rumour topics in an online social network is communication between people represented by different ID numbers, and it has a tree structure. Given this tree structure, we apply the ExtraTrees classifier and the decision tree classifier. To further enhance classification performance, we use each of them as the base estimator of the adaptive boosting classifier.
The decision tree algorithm derives mainly from the ID3 algorithm [38], its improved version, the C4.5 algorithm [39], and the classification and regression tree (CART) [40]. We mainly experiment with decision trees based on the CART algorithm. CART constructs trees using the feature and threshold that yield the largest information gain at each node, based on the Gini coefficient described as follows
where
Based on the equations (5) and (6), we can give the Gini coefficient of D under the condition of feature
where
Calculating the Gini coefficient of D by the feature
Determining the feature
Calling the step 1 and step 2 at the optimal subtree.
Generating the CART tree.
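The split-selection steps above can be sketched as follows. The sketch assumes the standard textbook definitions, Gini(D) = 1 − Σ_k p_k² for a set D with class proportions p_k, and the size-weighted sum of the Gini values of the two subsets produced by a split; the notation may differ from that of the equations in this section. The function names are hypothetical, and only a single numeric feature is considered.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(values, labels, threshold):
    """Size-weighted Gini impurity after splitting on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_threshold(values, labels):
    """Return the threshold with the lowest weighted Gini, or None if no split exists."""
    # The largest value is skipped because it would leave the right side empty.
    candidates = sorted(set(values))[:-1]
    if not candidates:
        return None
    return min(candidates, key=lambda t: gini_split(values, labels, t))

# Hypothetical feature values (e.g. question mark counts) and standpoint labels.
values = [0, 1, 2, 3]
labels = ["comment", "comment", "query", "query"]
threshold = best_threshold(values, labels)
```

A full CART implementation would repeat this search over every feature and recurse on the two resulting subsets until a stopping criterion is met.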
Finally, the decision tree is pruned by minimising its overall loss function to avoid overfitting. The parameter settings are as follows:
The function to measure the quality of a split:
Geurts et al. [41] proposed the ExtraTrees algorithm derived from the random forest (RF) algorithm [42]. The main process of the ExtraTrees is described as follows:
Providing the sample data set
Generating the strong classifiers based on the step 1.
Sampling k times at the m iterations to acquire the data set
Then, it will randomly select a feature value for classifiers while training the m classifier based on the
This operation is the difference between ExtraTrees and RF, and it helps to enhance the generalisation ability of ExtraTrees. The parameter settings are as follows:
The number of trees in the forest:
Freund and Schapire [43] proposed the adaptive boosting algorithm (i.e. the AdaBoost classifier), which has the following characteristics: it is a classifier with high accuracy and provides a framework in which sub-classifiers can be constructed in a variety of ways; the weak classifiers constructed on the basis of AdaBoost are extremely simple; and it requires no feature screening and is relatively resistant to overfitting. Based on these characteristics, we use the AdaBoost classifier as the framework and the decision tree and ExtraTrees algorithms as the sub-classifiers, constructing the combination classifiers AdaBoost–DecisionTree (AB-DT) and AdaBoost–ExtraTrees (AB-ET).
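To make the boosting framework concrete, the following pure-Python sketch implements a multiclass AdaBoost loop in its SAMME form. It is a simplified stand-in rather than the AB-DT or AB-ET configuration: one-feature threshold stumps replace the CART and ExtraTrees base estimators, and the function names and toy data are hypothetical. It shows only the core idea that each round re-weights the misclassified samples and that the final prediction is a weighted vote.

```python
import math
from collections import defaultdict

def fit_stump(X, y, w, classes):
    """Find the one-feature threshold stump with the lowest weighted error."""
    best, total = None, sum(w)
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left, right = defaultdict(float), defaultdict(float)
            for row, label, wi in zip(X, y, w):
                side = left if row[j] <= thr else right
                side[label] += wi
            # Each side predicts its heaviest class.
            l_cls = max(classes, key=lambda c: left[c])
            r_cls = max(classes, key=lambda c: right[c])
            err = total - left[l_cls] - right[r_cls]
            if best is None or err < best[0]:
                best = (err, j, thr, l_cls, r_cls)
    return best

def predict_stump(stump, row):
    _, j, thr, l_cls, r_cls = stump
    return l_cls if row[j] <= thr else r_cls

def adaboost_samme(X, y, n_rounds=10):
    """Train a boosted ensemble of stumps; returns (classes, [(alpha, stump)])."""
    classes = sorted(set(y))
    k = len(classes)
    if k < 2:
        return classes, []
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w, classes)
        if stump[0] >= 1.0 - 1.0 / k:  # no better than random guessing
            break
        err = max(stump[0], 1e-10)     # avoid division by zero
        alpha = math.log((1.0 - err) / err) + math.log(k - 1.0)
        ensemble.append((alpha, stump))
        if stump[0] <= 1e-10:          # perfect stump: nothing left to re-weight
            break
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(alpha) if predict_stump(stump, row) != label else wi
             for row, label, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return classes, ensemble

def predict(model, row):
    """Weighted vote of all stumps in the ensemble."""
    classes, ensemble = model
    votes = defaultdict(float)
    for alpha, stump in ensemble:
        votes[predict_stump(stump, row)] += alpha
    return max(classes, key=lambda c: votes[c])

# Toy data: one feature, three standpoints that no single stump can separate.
X = [[0], [1], [2], [3], [4], [5]]
y = ["agree", "agree", "comment", "comment", "query", "query"]
model = adaboost_samme(X, y, n_rounds=3)
predictions = [predict(model, row) for row in X]
```

On this toy data a single stump can separate only two of the three standpoints, whereas three boosting rounds classify all six samples correctly.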
2.4. Baseline classifiers and experiment settings
Following our experimental plan, we adopt baseline classifiers from four categories: support vector machines (SVMs), RF, conditional random fields (CRFs) and logistic regression (LR). The training set is 60% of the total data and the testing set is 40%. We use the implementations in the scikit-learn Python package for all the experiments, and the parameters of each baseline classifier are left at their default values.
We experiment in a five-fold cross-validation setting: we run each classifier five times, each time using a different fold for testing and the other four folds for training. In this way, each fold is tested once, and aggregating all folds enables experimentation on all the data. The tweets in the data set contain much slang and many abbreviations that do not conform to grammar, which makes them extremely imbalanced and irregular. Because this article assesses whether, and to what extent, tree-structured combination classifiers can help in rumour standpoint classification under these conditions, we use the macro-averaged F1 score (Macro-F1) as the standard for comparison. Macro-F1 reflects the overall performance across categories: the F1 score is computed for each category first, and the scores are then averaged over all categories.
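The evaluation protocol can be illustrated with a small sketch of Macro-F1 and of a five-fold split. Both functions are simplified illustrations with hypothetical names: the fold assignment is a plain interleaved split without shuffling or stratification, which may differ from the split used in the experiments.

```python
def macro_f1(y_true, y_pred, labels):
    """Average of the per-class F1 scores, each class weighted equally."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)

def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) so that each sample is tested once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# A classifier that always predicts the majority class 'comment'.
y_true = ["comment", "comment", "comment", "query"]
y_pred = ["comment", "comment", "comment", "comment"]
score = macro_f1(y_true, y_pred, labels=["comment", "query"])
splits = list(k_fold_indices(10, k=5))
```

The majority-class predictor reaches 75% accuracy here but a Macro-F1 of only 3/7, because the F1 score of the ‘query’ class is zero; this is why Macro-F1 suits an imbalanced label distribution.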
3. Results and discussion
Following the experimental plan described above, we first assess the classifiers on the original features only. Because these features are extracted solely from each tweet in isolation, all classifiers are compared under the same conditions, which lets us quantitatively assess the contribution of each feature to the classification results and evaluate whether the same type of feature improves the results of different classifiers.
Table 3 shows the classification results of the classifiers under combinations of the original features: we first experiment with each subtype of original features, then with combinations of subtypes, and finally with all the subtypes together.
The performance results of Macro-F1 using only the original features (OFs).
OF1: lexicon; OF2: sentiment words; OF3: reliability proof; LR: logistic regression; SVM: support vector machine; CRF: conditional random field; RF: random forest; AB-DT: AdaBoost–DecisionTree; AB-ET: AdaBoost–ExtraTrees.
Bold values indicate the optimal results of different classifiers under each feature.
From the results in Table 3, we can present some observations as follows:
Compared with the baseline classifiers, the combinatorial classifiers AB-DT and AB-ET perform well overall.
The AB-ET and AB-DT classifiers each obtain three optimal Macro-F1 scores across the features. That is to say, AB-ET and AB-DT achieve the best and second-best classification results, respectively, under each feature, except for the result of the RF classifier under feature OF2. In addition, the RF classifier is better than the other baseline classifiers. The combinatorial classifiers therefore outperform the single classifiers, which corresponds to our presuppositions and experimental objectives.
The AB-ET classifier obtains the optimal Macro-F1 score of 0.471 under feature OF23, and the RF classifier obtains the second-best score of 0.456 under feature OF2. Moreover, the Macro-F1 scores under feature OF2 are generally better than those under the other features; for example, the AB-DT classifier scores 0.446 under OF2. By and large, the Macro-F1 score of each classifier increases with the number of features: OF2 consists of four features (negative words, bad words, exclamation mark and question mark), whereas OF1 contains only one (word embedding) and OF3 contains two (URL link, pictures or video marks). In addition, the Macro-F1 scores of feature combinations that contain OF2 are often better than those of combinations without OF2. In conclusion, combining multiple features contributes to improving the Macro-F1 score.
Although combining multiple features generally increases the Macro-F1 score, some combinations weaken this improvement. For instance, the Macro-F1 scores of the SVM classifier under features OF12, OF23 and OF123 are 0.336, 0.359 and 0.342, respectively, all lower than its score of 0.387 under feature OF2. More features therefore do not always yield a better Macro-F1 score.
In brief, the original features alone provide reasonable Macro-F1 scores. The best result is obtained with the combination of negative words, bad words, exclamation mark, question mark, URL link and pictures or video marks (i.e. the Macro-F1 score of 0.471 for the AB-ET classifier under feature OF23). On this basis, we next experiment with the combination of the original features and the contextual features.
From the results in Table 4, we can present some observations as follows:
The Macro-F1 scores of the classifiers under feature CF12 in Table 4 are overall better than those under feature OF123 in Table 3. The exception is feature CF2 in Table 4, whose Macro-F1 scores are inferior to those of feature OF2. Moreover, the Macro-F1 scores under feature CF1 are better than those under features OF1 and OF3. The contextual features therefore significantly improve the Macro-F1 scores compared with the original features.
The AB-ET and AB-DT classifiers obtain four and three optimal Macro-F1 scores across the features, respectively, that is, the best and second-best classification results under each feature. As with the results in Table 3, the combinatorial classifiers outperform the single classifiers under the combined OF and CF features.
Although the Macro-F1 scores under feature CF2 are inferior to those under feature OF2, CF2 achieves, with only two features (number of replies and degree distribution), Macro-F1 scores superior to those of the other features. Moreover, the best Macro-F1 score over all features, 0.476, comes from the AB-ET classifier with feature OF + CF2. That is to say, the features derived from the network topology (degree distribution and number of replies) have a strong positive impact on the Macro-F1 score.
The performance results of Macro-F1 using only the contextual features (CFs), as well as the combination of the OF and CF.
CF1: correlation; CF2: network topology; LR: logistic regression; SVM: support vector machine; CRF: conditional random field; RF: random forest; AB-DT: AdaBoost–DecisionTree; AB-ET: AdaBoost–ExtraTrees.
Bold values indicate the optimal results of different classifiers under each feature.
Combining the original features and the contextual features improves the Macro-F1 scores over either type alone; for example, under features OF + CF2 and OF + CF1, the AB-ET classifier achieves the best Macro-F1 score of 0.476 and the second-best score of 0.472, respectively. Notably, the network topology feature, introduced here for the first time, performs very well. The network topology fits the tree-structured characteristics well: the flow of information through the Twitter community resembles a complex network in which the nodes are individual user accounts and the flows are the edges. An example diffusion network is shown in Figure 5.
To analyse our experiments further, we give the confusion matrices of the different classifiers for each standpoint. Table 5 shows the confusion matrices of the deviations calculated for the different classifiers, which help us to study classification performance. The bold values on the diagonals are the probabilities of correct classification for each rumour standpoint.
Confusion matrices of deviations for different classifiers.
LR: logistic regression; SVM: support vector machine; CRF: conditional random field; RF: random forest; AB-DT: AdaBoost–DecisionTree; AB-ET: AdaBoost–ExtraTrees.
The bold values are the correct classification probability for each rumour standpoint.
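For reference, a confusion matrix whose diagonal holds the probability of correct classification for each standpoint can be computed as in the sketch below. It assumes that each row is normalised by the number of true instances of that standpoint, which is one common convention and may not match exactly how the deviations in Table 5 were calculated; the function name and the toy labels are hypothetical.

```python
from collections import Counter

def normalised_confusion(y_true, y_pred, labels):
    """Row-normalised confusion matrix as nested dicts: matrix[true][predicted]."""
    counts = Counter(zip(y_true, y_pred))
    matrix = {}
    for true_label in labels:
        row_total = sum(counts[(true_label, p)] for p in labels)
        matrix[true_label] = {
            p: (counts[(true_label, p)] / row_total if row_total else 0.0)
            for p in labels
        }
    return matrix

# Toy example: one of the two 'disagree' tweets is misclassified as 'comment'.
y_true = ["comment", "comment", "comment", "disagree", "disagree"]
y_pred = ["comment", "comment", "comment", "comment", "disagree"]
matrix = normalised_confusion(y_true, y_pred, labels=["comment", "disagree"])
```

Each row sums to 1 for a standpoint that occurs in the data, so an off-diagonal entry gives the share of that standpoint misclassified as another one.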
From the deviations, we can see that the combinational classifiers with tree-structured characteristics give the best results overall; the probability of 0.907 for AB-ET is the best result among the classifiers for the ‘comment’ standpoint. At the same time, a large number of standpoints are misclassified as ‘comment’, because ‘comment’ accounts for the largest proportion of the data set, 63.91%. Correspondingly, the probability for the ‘disagree’ standpoint reaches a maximum of only 0.367 for the AB-DT classifier and a minimum of 0.107 for the SVM classifier, as ‘disagree’ accounts for only 12.92% of the data set.
In other words, a large amount of training data is available for the ‘comment’ standpoint, which shows that the amount of data has a strong positive effect on the classification results. Balancing the training set by reducing the number of ‘comment’ tweets would disrupt the structure of the online social network and thus harm the classification of rumour standpoints. Instead, the problem can be addressed by increasing the amount of data or improving the experimental scheme (e.g. using more appropriate features, models or algorithms).
4. Conclusion and future work
Rumour stance classification in online social networks based on natural language processing has rarely been covered in previous work. Our study makes the following contributions for future study: we experiment with tweets that are extremely imbalanced and irregular; we consider not only the properties of the tweets themselves but also the tree-structured relationships between correlated tweets; and we take the social network topology into account as a feature, which yields favourable classification results. In contrast to a range of baseline classifiers, we apply the combinational classifiers AB-DT and AB-ET to construct tree-structured classifiers of tweets and obtain better results.
The experiments above show that considering tweets together with their features gives better results than considering the tweets alone, and that the effect of multi-feature combinations is particularly clear. In particular, the online social network topology feature that we introduce gives the best results. In future work, we will therefore experiment with the classification of rumour standpoints, as well as the identification of rumours, by considering further properties of complex networks, for example, the shortest path, average degree, out-degree, in-degree and clustering coefficient. At the same time, presenting the effect of each feature helps us to understand and analyse the influence of individual and combined features on classification, which will provide a reference for further research on the classification of rumour standpoints.
Moreover, the study of rumour stance classification for online social networks has great practical significance. Building on this research, our future work will not only further improve the effectiveness and accuracy of classification but also extend its applications, that is, combining natural language processing and data mining technology to detect and confirm rumour events in order to suppress or eliminate them, and analysing public views and opinions on hot events or breaking news to provide suggestions for administrators.
Footnotes
Acknowledgements
The authors are grateful for the helpful suggestions from the reviewers. J.M. and Y.L. contributed equally to this work.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
This research was supported by the National Natural Science Foundation of China (Grant No.: 71373123) and the Fundamental Research Funds for the Central Universities (Grant No.: NW2018004).
