Exploring Tourist Experience of Island Tourism Based on Text Mining: A Case Study of Jiangmen, China

Abstract: Island tourism is an important part of the marine economy, and understanding the tourist experience of island tourism helps promote the development of marine tourism. This study takes the main island tourism attractions in Jiangmen, China, as a case and analyzes the tourist experience of island tourism by applying text mining to tourists' reviews on Ctrip. The study shows that beach, hotel, attraction, seafood and seawater form the core vocabulary of tourists' evaluations of island tourism. Tourists' sentiment toward island tourism is generally positive. Nice, convenient, clean, cheap and comfortable are the main sentiment words used by tourists. The results of LDA topic model analysis show that tourists' island tourism experience falls mainly into four categories: coastal scenery, seafood cuisine, beach environment and entrance service.


INTRODUCTION
The ocean is of great significance and value to the world's political and economic peace, stability and development. In recent years, China has paid increasing attention to the development of the marine industry. Jiangmen's 14th Five-Year Plan and the outline of the 2035 Vision also clearly put forward the need to actively expand the blue development space and vigorously develop the marine economy, which provides new opportunities for the high-quality development of Jiangmen's economy. Island tourism is one of the important industries of the marine economy, and its development can give strong impetus to the sustainable development of the marine economy. Current research on island tourism mainly focuses on development strategies for island tourism [1][2][3], the environment of island tourism destinations [4][5], the value of island tourism resources [6], etc. However, this research generally lacks depth, and few scholars have studied island tourism from the perspective of tourist experience.
With the development of the experience economy, tourists increasingly value the specific experiences they obtain during travel that meet their individual psychological needs and preferences [7]. Tourist experience is also increasingly valued by researchers in rural tourism, ecotourism, dark tourism, outbound tourism and other areas [8][9][10][11]. Research methods are also becoming more diverse, moving from traditional interviews and questionnaire surveys to the analysis of online text data [8][9][10][11]. The boom in the online tourism industry has not only affected the way tourists consume, but also changed the way travel information is disseminated. With the rapid development and maturity of online travel agencies (OTAs), more and more tourists order travel products and share their experiences through OTA platforms such as Ctrip and Qunar. The resulting large volume of user-generated content (UGC) has become an important source of information and a basis for consumers' decisions. Compared with traditional questionnaires, such unstructured, open-ended reviews better reflect tourists' real experiences and concerns from different perspectives, and more faithfully reveal their sentiments and demands. Big data analysis also gives companies the opportunity to capture consumer preferences and identify market trends. Research on big data in tourism has become a hot topic of academic attention [7][10][11]. However, studies on the tourist experience of island tourism remain scarce, and even fewer use web texts to analyze tourists' perceptions of the island tourism experience.
In view of this, this paper selects three representative island tourism attractions in Jiangmen, namely Shangchuan Island, Xiachuan Island and Langqin Bay-Naqin Peninsula, as the case study, collects tourists' reviews of these three attractions on Ctrip, and explores tourists' island tourism experience through word frequency analysis, sentiment analysis, experience topic classification and other text mining methods. The purpose is to provide theoretical support and a decision-making basis for the market segmentation of island tourism and for the development and management of Jiangmen's island tourism destinations, and to promote the high-quality development of Jiangmen's marine economy.

Case Selection
Jiangmen is located in the western part of the Pearl River Delta of China. It has a large sea area, a long coastline and rich marine resources, and its 561 islands rank second in number in Guangdong province. According to statistics, the total marine economy of Jiangmen has grown steadily in recent years, accounting for about 16% of the city's GDP. Shangchuan Island, Xiachuan Island and Langqin Bay-Naqin Peninsula are the three main island tourism areas in Jiangmen. All three are national 4A scenic spots, and they receive the most tourists among Jiangmen's islands every year. Therefore, this study selects Shangchuan Island, Xiachuan Island and Langqin Bay-Naqin Peninsula as cases for studying the tourist experience of island tourism, which gives the study a degree of representativeness and reference value.

Data Collection
This study uses Ctrip as the main source of review texts for the following reasons. Firstly, Ctrip, as the largest OTA website in China, ranks first among online travel websites in terms of monthly active users [12]; in 2019, Ctrip accounted for 43.35% of monthly active users on online travel platforms, so the review data on Ctrip are well representative. Secondly, Ctrip has a "Travel Guides" section, which facilitates both the publication of tourists' reviews of destinations and the collection of text data. On this basis, the study uses the "Travel Guides" section on Ctrip to search for reviews of the scenic spots Shangchuan Island, Xiachuan Island and Langqin Bay-Naqin Peninsula, and a crawler program written in Python is used to obtain the relevant tourists' reviews. In total, 4621 reviews containing 243234 words are collected. Data collection ended in September 2021. The details of the collected data are shown in Table 1.

Data Analysis
Before text analysis, the data must be pre-processed, because pre-processing has a strong impact on the analysis results. Firstly, the data are screened to remove duplicate, meaningless and irrelevant information and so improve the quality of text mining; 4260 valid reviews are finally obtained, totaling 236461 words. Then a program written in PyCharm calls the jieba package to perform word segmentation and stop-word removal on the collected data. After pre-processing, further programs written in PyCharm are used to analyze the word frequency of tourists' reviews. Next, the ROST EA software is used to perform sentiment analysis on the overall reviews in order to characterize tourists' sentiment toward the island tourism experience. Finally, the LDA model is used to explore the topics of tourists' island tourism experience.
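The screening and segmentation steps described above can be sketched as follows. This is a minimal illustration rather than the study's actual program: the function name `preprocess`, the `min_chars` threshold and the toy English sample are assumptions, and the segmenter is passed in as a function so that the sketch runs without any third-party package.

```python
from typing import Callable, Iterable, List, Set


def preprocess(
    reviews: Iterable[str],
    tokenize: Callable[[str], List[str]],
    stop_words: Set[str],
    min_chars: int = 2,  # hypothetical cutoff for "meaningless" reviews
) -> List[List[str]]:
    """Screen raw reviews, then segment them and drop stop words."""
    seen = set()
    docs = []
    for raw in reviews:
        text = raw.strip()
        # Screening: skip exact duplicates and reviews too short to be useful.
        if len(text) < min_chars or text in seen:
            continue
        seen.add(text)
        # Segmentation, then removal of stop words and whitespace-only tokens.
        tokens = [t for t in tokenize(text) if t.strip() and t not in stop_words]
        if tokens:
            docs.append(tokens)
    return docs


# Toy English input with a whitespace tokenizer, only to show the flow.
sample = ["the beach is clean", "the beach is clean", " ", "seafood is fresh"]
docs = preprocess(sample, str.split, stop_words={"the", "is"})
print(docs)
```

For Chinese review text, `tokenize` would typically be `jieba.lcut`, and `stop_words` would be loaded from a stop-word list of the analyst's choosing.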

Frequency Analysis
Using a program written in PyCharm, word frequency statistics are computed separately for the nouns and the adjectives in the review text data. The results are shown in Table 2.
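A minimal sketch of counting nouns and adjectives separately is given below. It assumes the reviews have already been segmented and part-of-speech tagged into (word, flag) pairs, and that noun flags start with "n" and adjective flags with "a", as in jieba's tag set. The function name `pos_frequencies` and the toy pairs are illustrative only.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def pos_frequencies(
    tagged: Iterable[Tuple[str, str]], top_n: int = 10
) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Return the top_n most frequent nouns and adjectives."""
    nouns, adjectives = Counter(), Counter()
    for word, flag in tagged:
        if flag.startswith("n"):    # n, nr, ns, nt, nz, ...
            nouns[word] += 1
        elif flag.startswith("a"):  # a, ad, an
            adjectives[word] += 1
    return nouns.most_common(top_n), adjectives.most_common(top_n)


# Toy (word, flag) pairs standing in for tagged review text.
tagged = [
    ("beach", "n"), ("clean", "a"), ("beach", "n"), ("hotel", "n"),
    ("go", "v"), ("clean", "a"), ("cheap", "a"),
]
top_nouns, top_adjectives = pos_frequencies(tagged)
print(top_nouns)
print(top_adjectives)
```

With jieba, the pairs can be produced by iterating over `jieba.posseg.cut(text)`, whose items unpack into a word and its flag.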

Sentiment Analysis
This study uses ROST EA to perform sentiment analysis on the overall reviews and to calculate the sentiment score of each review. The results show that 3218 reviews carry positive sentiment, accounting for 75.54% of the total; 24 reviews carry neutral sentiment, accounting for 0.56% of the total; and 1018 reviews carry negative sentiment, accounting for 23.90% of the total. This shows that tourists' evaluation of the island travel experience tends to be positive. The segmented statistics of positive sentiment show 1005 reviews with general positive sentiment (23.59% of the total), 885 reviews with moderate positive sentiment (20.77% of the total) and 1328 reviews with high positive sentiment (31.17% of the total). Tourists' evaluation of the island tourism experience is therefore generally high, indicating that tourists are largely satisfied with island tourism in Jiangmen. Word frequency statistics for the positive reviews show that adjectives such as "good", "convenient", "clean", "comfortable", "cheap" and "beautiful" appear most frequently; these are the main positive sentiment words in tourists' experience of island tourism.
In addition, the segmented statistics of negative sentiment show 830 reviews with general negative sentiment (19.48% of the total), 110 reviews with moderate negative sentiment (2.58% of the total) and 78 reviews with high negative sentiment (1.83% of the total). Tourists' negative sentiment toward the Jiangmen island tourism experience is therefore mostly mild, and extremely negative sentiment is rare. Among the negative reviews of Jiangmen island tourism, "inconvenient", "dirty", "hard", "expensive" and "boring" are the main negative sentiment words.
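The reported shares can be reproduced from the review counts given above. The following is a short arithmetic check, assuming every percentage is taken over the 4260 valid reviews; the variable and function names are illustrative.

```python
TOTAL_REVIEWS = 4260  # valid reviews after screening

overall = {"positive": 3218, "neutral": 24, "negative": 1018}
positive_segments = {"general": 1005, "moderate": 885, "high": 1328}
negative_segments = {"general": 830, "moderate": 110, "high": 78}


def shares(counts, total=TOTAL_REVIEWS):
    """Percentage of the total for each label, rounded to two decimals."""
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# The segment counts should add up to their parent categories.
assert sum(overall.values()) == TOTAL_REVIEWS
assert sum(positive_segments.values()) == overall["positive"]
assert sum(negative_segments.values()) == overall["negative"]

overall_shares = shares(overall)
positive_shares = shares(positive_segments)
negative_shares = shares(negative_segments)
print(overall_shares, positive_shares, negative_shares)
```

The three overall categories sum to the number of valid reviews, and each set of segments sums to its parent category, so the reported figures are internally consistent.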

Identified Topics in Online Reviews
This part uses the LDA model for topic mining of tourists' reviews. First, the nouns and adjectives in the reviews are extracted with the jieba package for Python; high-frequency words that carry no reference meaning, such as "Taishan" and "Ctrip", are removed; a dictionary is built; and word frequency statistics are computed. Then, the gensim package is called and a topic coherence measure is used to calculate the coherence score for each candidate number of topics. The higher the topic coherence score, the more closely related the words within a topic and the less ambiguous the topic [14]. The results are shown in Figure 1, and the optimal number of topics is determined to be 4 based on the maximum coherence score. Finally, the model is trained, yielding four topics for tourists' island travel experience with 10 topic words each; the topic words are sorted in descending order of posterior probability, and the results are shown in Table 3. Since the topics obtained have only serial numbers and no names, each topic must be named inductively according to its topic words and the logical relationships among them. As can be seen from Table 3, the four topics have different focuses: coastal scenery, seafood cuisine, beach environment, and entrance service. Among them, topic 1 contains words such as "scenery", "trestle", "beach", "wind" and "seascape", reflecting tourists' attention to the coastal scenery. Topic 2 involves "seafood", "cost-effective", "price", "fresh", "discount", "delicacy", "affordable", etc., which reflects tourists' attention to seafood delicacies and their prices.
Topic 3 mainly includes "beach", "environment", "seawater", "scenery", "seaside" and "sand", which reflect tourists' attention to the beach and its surrounding environment. Topic 4 contains keywords such as "attraction", "queue", "hour", "ticket" and "parking", highlighting tourists' attention to the ticketing service at the entrance of the attraction. It is worth noting that "queue" and "hour" appear more frequently, which indicates to a certain extent that the convenience of the ticketing service at the entrance of scenic spots is crucial to tourists' satisfaction. Parking service is also an important concern. At present, tourists who travel to Jiangmen's islands come mainly from Guangdong province, and self-driving tourists account for a considerable share of them. Therefore, the quality of parking service at the attraction also matters to tourists.

CONCLUSION
Tourists' attention to island tourism mainly focuses on characteristic seaside scenery, characteristic marine food, accommodation conditions and so on. Whether the scenic environment (including the beach, seawater, etc.) is clean, whether the scenery is beautiful, whether the surrounding hotel accommodation is good, whether the local seafood catering is hygienic, and whether prices are affordable and reasonable all greatly affect the quality of tourists' island tourism experience. Therefore, the island tourism management department should strengthen the supervision and management of island tourism destinations in terms of scenic spot and hotel environment, cost performance, marine food, etc., optimize the marketing plan, improve the cost performance of island tourism consumption, optimize the scenic spot environment, and improve the construction of hotels and other surrounding facilities, in order to improve tourists' island tourism experience and build a good reputation. In addition, although tourists' evaluation of the island tourism experience is generally positive, the impact of negative reviews cannot be ignored. Destination marketers should use negative reviews to understand the problems reported by tourists and deal with them promptly.
Tourists' experience of island tourism can be divided into four categories: coastal scenery, seafood cuisine, beach environment, and entrance service. Therefore, the island tourism management department should focus on developing island tourism in these four aspects. Firstly, in terms of coastal scenery, traditional and popular island tourism elements should be fully combined to create a characteristic island tourism IP. Secondly, the types of island tourism projects and products should be enriched and the island tourism industry chain extended, so as to provide tourists with diverse and affordable marine products. Thirdly, the beach environment and entertainment facilities should be optimized and the beach kept clean at all times, providing tourists with a good environment for rest and amusement. Finally, ticket purchasing and ticket checking procedures should be further optimized, and tourists should be given clear guidance.