Abstract
The wide penetration of location-aware mobile devices and location-based services makes location-based social media a reliable proxy for studying real-world geographic space. Language diversity is an important indicator of a city's level of internationalization. People communicate in different languages in the cyberspace of social media just as they do in geographic space. Location-based social media therefore provides an innovative lens through which to map language diversity and study the internationalization of cities. In the enclosed graphics, based on a collection of geo-tagged Twitter posts, we generated a fine-resolution map of a language diversity index for Hong Kong to illustrate the potential of location-based social media in city research.
Hong Kong, as one of the top five world cities, attracts numerous businesspeople, tourists, and students from around the globe who speak different languages every day (GaWC, 2017; Gott et al., 2014). Adapting the Shannon index, a popular diversity index in ecology that was originally proposed to quantify entropy in text (Peet, 1974; Shannon, 1948), we devised an index to quantify language diversity and produced a map (Figure 1) to visualize its spatial variation within Hong Kong.
Figure 1. Language diversity index derived from geo-tagged tweets.
Specifically, we collected geo-tagged tweets posted between 1 July 2013 and 31 December 2013 within a geographic extent of 113.77884°E to 114.42154°E and 22.15725°N to 22.55478°N, which covers the whole of Hong Kong and the southern part of Shenzhen in mainland China. These geo-tagged tweets were posted by 165,098 users in 39 different languages. We divided the extent into 500 m × 500 m cells and assigned each tweet to a cell according to the location from which it was posted. Within each cell, multiple tweets posted by the same user in the same language were counted as one tweet. Based on the proportions of tweet languages in each cell, we computed a language diversity index (H) for that cell.
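The gridding, de-duplication, and index computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes the index to have the standard Shannon form, H = −Σ p_i ln p_i, where p_i is the proportion of de-duplicated tweets in language i within a cell; it assumes the natural logarithm (the text does not state the base); and it assumes tweet locations have already been projected to metric coordinates (the projection is not given in the text). The function names and the tuple layout are hypothetical.

```python
import math
from collections import Counter

CELL_SIZE_M = 500.0  # cell edge length stated in the text


def cell_of(x_m, y_m, cell_size=CELL_SIZE_M):
    """Map projected coordinates in metres to a (column, row) grid cell."""
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))


def language_diversity(tweets, cell_size=CELL_SIZE_M):
    """Return {cell: H} for an iterable of (user_id, language, x_m, y_m).

    Assumption: H = -sum(p_i * ln(p_i)), the standard Shannon form.
    """
    # Several tweets by one user in one language within a cell count once.
    unique = {
        (cell_of(x, y, cell_size), user, lang)
        for user, lang, x, y in tweets
    }

    # Count the de-duplicated tweets per language in each cell.
    counts = {}
    for cell, _user, lang in unique:
        counts.setdefault(cell, Counter())[lang] += 1

    # Convert the counts to proportions and apply the Shannon form.
    diversity = {}
    for cell, per_language in counts.items():
        total = sum(per_language.values())
        diversity[cell] = -sum(
            (n / total) * math.log(n / total)
            for n in per_language.values()
        )
    return diversity
```

Under these assumptions, a cell containing tweets in a single language has H = 0, and a cell whose de-duplicated tweets are split evenly between two languages has H = ln 2 ≈ 0.69.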
We produced an auxiliary map (Figure 2) showing the number of tweeters in each grid cell. Comparing Figures 1 and 2 visually, one can see that both the number of tweeters and the value of the language diversity index are notably large in the vicinity of Hong Kong International Airport. On the campus of the University of Hong Kong, however, the number of tweeters is very small (<50), while the value of the language diversity index is fairly large (>2.00). A Pearson correlation test revealed that although the language diversity index correlates significantly (p < 0.01) with the number of tweeters, the correlation coefficient (0.25) is small. Previous studies have demonstrated that the number of tweeters is a good proxy for population (Jurdak et al., 2015; Mislove et al., 2011), and tweets in different languages can be expected to indicate tweeters from different countries and cultural backgrounds. Thus, the visual comparison and the statistical result together suggest that the intensity of internationalization varies dramatically within a world city and that a higher intensity of internationalization does not necessarily correspond to a higher population density.
Figure 2. The number of tweeters per 0.25 km².
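The comparison between the per-cell language diversity index and the per-cell number of tweeters rests on a Pearson correlation. The following is a minimal sketch of that calculation, not the authors' code; the function names are hypothetical. It also computes the usual t statistic for testing whether the coefficient differs from zero, which shows why a small coefficient can still be statistically significant when the number of cells is large (the number of cells is not reported in the text).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: r = 0, with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

Here `xs` would hold the number of tweeters per cell and `ys` the corresponding values of H. Because the t statistic grows with the square root of n − 2 for a fixed r, a coefficient as small as 0.25 can reach significance over a large grid.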
At present, world cities are identified and ranked by the international connectedness of their economy, polity, and culture (GaWC, 2017), whereas few studies have assessed internationalization at geographic scales finer than the city level. In the future, a language diversity index derived from social media such as Twitter posts may be used to measure the intensity of internationalization across an individual city.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Guofeng Cao gratefully acknowledges the funding provided by the Transdisciplinary Research Academy of Texas Tech University to support this research.
