# Table 1 Total similarity formula and the formulas used for each signal

The total similarity formula
$$\text{Sim}_{\text{Total}}(u_{i}, u_{j}) = \sum_{m = 1}^{7} \left( \text{Sim}_{m}(u_{i}, u_{j}) \cdot \text{weight}_{m} \right)$$
where
$$u_{i}$$ is the examined user, the user for whom TSim attempts to find similar users
$$u_{j}$$ is the candidate user, the user whose similarity to $$u_{i}$$ TSim is computing
$$\text{Sim}_{m}$$ is the similarity score of signal m between $$u_{i}$$ and $$u_{j}$$; $$\text{Sim}_{1}$$ through $$\text{Sim}_{7}$$ are explained below
$$\text{weight}_{m}$$ is the weight assigned to the score of signal m
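As an illustration of the weighted sum above, here is a minimal Python sketch. The function name `total_similarity`, the `Signal` type alias, and the toy signals and weights are assumptions made for this example only; the table does not specify how TSim implements or weights its signals.

```python
from typing import Callable, Sequence

# Hypothetical type: a signal maps (examined user, candidate user) to a score.
Signal = Callable[[str, str], float]


def total_similarity(
    u_i: str,
    u_j: str,
    signals: Sequence[Signal],
    weights: Sequence[float],
) -> float:
    """Weighted sum of per-signal scores: sum over m of Sim_m(u_i, u_j) * weight_m."""
    if len(signals) != len(weights):
        raise ValueError("each signal needs exactly one weight")
    return sum(w * sim(u_i, u_j) for sim, w in zip(signals, weights))


# Toy usage with two made-up signals (the table defines seven).
toy_signals = [lambda a, b: 2.0, lambda a, b: 0.5]
toy_weights = [0.5, 2.0]
toy_score = total_similarity("alice", "bob", toy_signals, toy_weights)
```

Passing the signals as a sequence of callables keeps the sum independent of how each individual signal is computed.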
Name Formula Explanation
Signal 1 or Sim1: Followings and Followers Relationship Similarity
$$\text{Sim}_{\text{Relationship}}(u_{i}, u_{j}) = \begin{cases} 1 & \text{if the candidate user appears in one list} \\ 2 & \text{if the candidate user appears in two lists} \\ \;\vdots & \\ n + k & \text{if the candidate user appears in all lists} \end{cases}$$
n is the number of $$u_{i}$$'s followers
k is the number of $$u_{i}$$'s friends
Signal 2 or Sim2: Mention Similarity
$$\text{Sim}_{\text{Mention}}(u_{i}, u_{j}) = \sum_{l = 1}^{w} \frac{\text{twtsThrd}(l, u_{i}, u_{j})}{\text{twtsThrdTot}(l, u_{i})} \cdot \frac{1}{\text{accntsTwt}(l, u_{i})}$$
twtsThrd is a function that returns the number of $$u_{i}$$'s tweets in communication thread l with $$u_{j}$$ that mention the account $$u_{j}$$
twtsThrdTot is a function that returns the total number of tweets in communication thread l
accntsTwt is the total number of accounts in the tweets in thread l
w is the total number of communication threads mentioning both $$u_{i}$$ and $$u_{j}$$
Signal 3 or Sim3: Retweet Similarity
$$\text{Sim}_{\text{Retweet}}(u_{i}, u_{j}) = \text{numOfTwtsInRetwtList}(u_{i}, u_{j})$$
numOfTwtsInRetwtList is the number of $$u_{j}$$'s tweets that $$u_{i}$$ retweeted
Signal 4 or Sim4: Favorite Similarity
$$\text{Sim}_{\text{Favorite}}(u_{i}, u_{j}) = \text{numOfTwtsInFavList}(u_{j}, u_{i})$$
numOfTwtsInFavList is the number of $$u_{j}$$'s tweets that $$u_{i}$$ favorited
Signal 5 or Sim5: Common Hashtags Similarity
$$\text{Sim}_{\text{Hashtag}}(u_{i}, u_{j}) = \sum_{l = 1}^{w} \frac{1}{1 + \text{HTOffset}(u_{i}, u_{j}, \text{HT}_{l})}$$
where
$$\text{HTOffset}(u_{i}, u_{j}, \text{HT}) = \left| \text{PT}(u_{i}, \text{HT}) - \text{PT}(u_{j}, \text{HT}) \right| + \left| \text{NT}(u_{i}, \text{HT}) - \text{NT}(u_{j}, \text{HT}) \right| + \left| \text{NTT}(u_{i}, \text{HT}) - \text{NTT}(u_{j}, \text{HT}) \right|$$
PT is a function that takes a user id and a hashtag HT and returns the number of the user's positive tweets in the hashtag
NT is a function that takes a user id and a hashtag HT and returns the number of the user's negative tweets in the hashtag
NTT is a function that takes a user id and a hashtag HT and returns the number of the user's neutral tweets in the hashtag
w is the total number of hashtags that both $$u_{i}$$ and $$u_{j}$$ tweeted in
Signal 6 or Sim6: Common Interests Similarity
$$\text{Sim}_{\text{Interests}}(u_{i}, u_{j}) = \text{count}\left( \text{ints}(u_{i}) \cap \text{ints}(u_{j}) \right)$$
ints is a function that takes a user id and returns the user's top 5 interests, obtained by performing topic analysis on the user's tweets
Signal 7 or Sim7: Profile Similarity
$$\begin{aligned} \text{Sim}_{\text{Profile}}(u_{i}, u_{j}) = {} & [\text{gender}(u_{i}) \text{ is equal to } \text{gender}(u_{j})] \\ & + [\text{language}(u_{i}) \text{ is equal to } \text{language}(u_{j})] \\ & + [\text{location}(u_{i}) \text{ is equal to } \text{location}(u_{j})] \end{aligned}$$
gender is a function that takes a user id and returns the user's gender from the user's profile on Twitter
language is a function that takes a user id and returns the user's language from the user's profile
location is a function that takes a user id and returns the user's location from the user's profile
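To make some of the per-signal formulas in the table concrete, the following Python sketch implements the mention, hashtag, interests, and profile signals on already-extracted counts. All function names, parameter names, and input shapes here are hypothetical; the table does not say how TSim collects or stores these values. The sketch also assumes that each bracketed condition in the profile formula contributes 1 when it holds and 0 otherwise, and that a missing profile field never counts as a match.

```python
from typing import Iterable, Mapping, Optional, Set, Tuple

# Assumed shape: (positive, negative, neutral) tweet counts of one user in one
# hashtag, i.e. the values of PT, NT and NTT.
SentimentCounts = Tuple[int, int, int]


def mention_similarity(threads: Iterable[Tuple[int, int, int]]) -> float:
    """Each thread is (twtsThrd, twtsThrdTot, accntsTwt) for one shared thread l."""
    return sum(
        (mentions / total) * (1 / accounts)
        for mentions, total, accounts in threads
    )


def hashtag_similarity(
    counts_i: Mapping[str, SentimentCounts],
    counts_j: Mapping[str, SentimentCounts],
) -> float:
    """Sum 1 / (1 + HTOffset) over the hashtags both users tweeted in."""
    score = 0.0
    for tag in set(counts_i) & set(counts_j):
        # HTOffset: absolute differences of positive, negative, neutral counts.
        offset = sum(abs(a - b) for a, b in zip(counts_i[tag], counts_j[tag]))
        score += 1 / (1 + offset)
    return score


def interests_similarity(ints_i: Set[str], ints_j: Set[str]) -> int:
    """Number of interests the two users have in common."""
    return len(ints_i & ints_j)


def profile_similarity(
    profile_i: Mapping[str, Optional[str]],
    profile_j: Mapping[str, Optional[str]],
) -> int:
    """Count matching profile fields (assumption: missing fields never match)."""
    fields = ("gender", "language", "location")
    return sum(
        profile_i.get(f) is not None and profile_i.get(f) == profile_j.get(f)
        for f in fields
    )


# Toy inputs, invented purely for illustration.
toy_threads = [(2, 4, 2), (1, 2, 1)]
toy_counts_i = {"#ai": (3, 1, 0), "#ml": (1, 0, 0)}
toy_counts_j = {"#ai": (3, 1, 0), "#ml": (2, 0, 0), "#py": (0, 0, 2)}
toy_profile_i = {"gender": "f", "language": "en", "location": "Riyadh"}
toy_profile_j = {"gender": "m", "language": "en", "location": "Riyadh"}
```

With these toy inputs, identical sentiment counts in a shared hashtag give the maximum contribution of 1 for that hashtag, and any difference in counts shrinks the contribution toward 0, which is the behaviour the HTOffset term is designed to produce.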