HOTEL SELECTION UTILIZING ONLINE REVIEWS: A NOVEL DECISION SUPPORT MODEL BASED ON SENTIMENT ANALYSIS AND DL-VIKOR METHOD

With the considerable development of the tourism market and the expanding scale of e-commerce platforms, an increasing number of tourists prefer to select tourism products such as services or hotels online. An efficient decision support model is therefore needed to help tourists select tourism products. Online reviews based on user experience can help tourists improve the efficiency of their decisions on tourism products. In this study, a quantitative method for hotel selection with online reviews is proposed. First, by analyzing the sentiment words in online reviews, tourists' sentiment preferences are transformed into distribution linguistic evaluations with respect to sentiment levels. Second, from a theoretical perspective, we propose a method to determine the ideal solution and nadir solution for distribution linguistic evaluations. Next, based on the frequency of the words used to evaluate hotels and on the distribution linguistic evaluations, the weight vector of the evaluation features is determined. Further, a novel DL-VIKOR method is developed to rank and then select hotels. Finally, a realistic hotel selection case from TripAdvisor.com is used to demonstrate the practicality and feasibility of the proposed model.


Introduction
With the arrival of the era of big data, e-commerce sites have expanded rapidly. Coupled with the rapid growth of the tourism market, this gives a new opportunity to the e-commerce platforms of the tourism industry. For most tourists, the choice of accommodation is related to the joy of traveling (Sparks & Browning, 2011; Song, 2015; Sabokbar et al., 2016; Chiu & Lin, 2018). Thanks to these platforms, tourists can access tourism websites to obtain information about accommodation at their travel destinations and find a desirable hotel. Nowadays, many websites, such as TripAdvisor.com (https://www.tripadvisor.com/), Tuniu.com (http://www.tuniu.com/) and Qunar.com (https://www.qunar.com/), provide channels for tourists to choose tourism services and post online reviews of their trips. For example, suppose a tourist wants to visit Macao for a few days. Before arriving in Macao, he can browse online reviews of Macao hotels on a tourism website such as TripAdvisor.com. Furthermore, some tourism websites give tourists an opportunity to view and collect other tourists' reviews. Figure 1 shows a list of tourism hotels in Macao from TripAdvisor.com. Because of the complexity and subjectivity of online reviews on e-commerce platforms, it is not easy for a tourist to identify the most desirable tourism hotel. Therefore, using online reviews to evaluate and select a desirable tourism hotel is a significant topic.
At present, the selection of tourism hotels based on online reviews has drawn the attention of many scholars at home and abroad. Li, Ye, and Law (2013) investigated a method for determining customer satisfaction with hotels in which online reviews are considered. Many consumers look through the reviews provided by other consumers before making a travel decision. Thus, Vermeulen and Seegers (2009) studied the impact of consumers' online reviews of hotels by applying consideration set theory. Jacob and Guéguen (2015) investigated the effect of the geographic proximity of products on customers' decisions. Sparks, So, and Bradley (2016) investigated the views of potential customers with respect to negative online reviews and the accompanying responses of hotels. Amaro, Duarte, and Henriques (2016) researched the influence of social media on travel plans and constructed a cluster analysis. Xiang, Du, Ma, and Fan (2017) used a text analysis method to investigate three e-commerce websites: Yelp, Expedia and TripAdvisor. With respect to the problem of selecting hotels, Li, Law, Vu, and Rong (2013) proposed a fuzzy decision model using the Choquet Integral (CI) aggregation operator. Ye, Li, and Wang (2012) investigated the impact of price on service quality and value from the perspective of customer perceptions. That study indicates that price has positive effects on perceived service but negative effects on perceived value. Later on, based on online reviews, a mathematical model was designed to select appropriate hotels on websites (Yu, J. Wang, J. Q. Wang, & Li, 2018). Bi, Liu, Fan, and Zhang (2019) proposed a methodology for conducting importance-performance analysis (IPA) through online reviews. C. B. Zhang, H. Y. Zhang, and Wang (2018) investigated the personalized restaurant recommendation problem and, considering group correlations and customer preferences, proposed a new recommendation method.
Figure 1. A list of tourism hotels in Macao from TripAdvisor.com
Besides studies on the evaluation and selection of tourism hotels, some other studies on processing online reviews are also worth consulting. These processing techniques provide technical support for subsequent decision research with online reviews. With respect to the problem of selecting mobile services, Daekook and Yongtae (2014) analyzed online reviews provided by consumers and proposed a decision model combining text mining and sentiment analysis technology. In order to support online users in making decisions, Li and Lai (2014) proposed a social appraisal mechanism, which combines social companionship analysis, collective opinion analysis, and consensus decision analysis with the micro-blogosphere. In order to provide a desirable decision model for consumers, Fan, Xi, and Liu (2018) established a model based on online ratings in which stochastic dominance theory and the PROMETHEE-II method are used. Later on, Liu, Bi, and Fan (2017a) proposed a decision model through online reviews. In this model, the sentiment orientations of the online reviews are identified by a sentiment analysis technique so as to select products.
Although the existing literature has contributed a lot to decision making with online reviews, the methods of processing online reviews are still at an initial stage. Sentiment words in online reviews have typically been processed by assigning scores and adding them up. This is equivalent to taking the average of the sentiment words, which cannot fully reflect all the information provided by the online reviews. In fact, due to the complexity, fuzziness and uncertainty of online reviews, their reasonable quantification is a complex problem. Moreover, it is essential to build a reasonable method to solve the hotel selection problem with the complex format of online reviews.
Compared with traditional problems under uncertainty, the problem of selecting hotels with online reviews on e-commerce sites is more complex. Therefore, this research aims at building a quantitative method to select the desirable travel hotel(s). On the one hand, since different evaluation levels are expressed in different online reviews by different consumers, distributed evaluations are easier to derive. On the other hand, previous studies have shown that the distribution linguistic evaluation is an effective tool for assessment. To accomplish the overall objective of selecting the desirable travel hotel(s), and inspired by studies on processing online reviews on e-commerce sites, the following tasks are needed. First, by adopting feature extraction and sentiment analysis, the online reviews in text format are transformed into quantified information, namely distribution linguistic information. Then the basic theory for distribution linguistic evaluations should be presented, such as the operational rules and the distance formula. Additionally, the weights of the evaluation features for selecting hotels should be determined. Further, a novel decision model for ranking the alternative hotels should be proposed. Finally, a realistic selection case is used to illustrate the practicability and effectiveness of the decision support model in this paper.
The remaining sections of this study are organized as follows. Some basic concepts of the VIKOR (VIseKriterijumska Optimizacija I Kompromisno Resenje) method and the basic theory of distribution linguistic evaluations are introduced in Section 1. In the same section, we develop a novel method to determine the ideal solution and nadir solution and construct a method for calculating the distance between two distribution linguistic variables. In Section 2, the formulation of the problem and the resolution process for ranking and selecting hotels are presented. In Section 3, we build a novel decision support model: the online reviews in text format are processed; the weight vector of the evaluation features is determined according to feature extraction and sentiment analysis; the distribution linguistic evaluations of the hotels are constructed; and a distribution linguistic VIKOR (DL-VIKOR) method is proposed to rank the travel hotels. A realistic case of selecting travel hotel(s) is used to illustrate the practicability and effectiveness of the method in Section 4. Conclusions are drawn in the last section.

The VIKOR method
In some practical problems with multiple attributes or criteria, conflicts among the criteria usually mean that no solution is optimal with respect to all criteria simultaneously. The VIKOR method was originally proposed by Opricovic (Opricovic, 1998; Opricovic & Tzeng, 2002); its Serbian name, VIseKriterijumska Optimizacija I Kompromisno Resenje, means multi-criteria optimization and compromise solution. The method is a classical and efficient tool for dealing with multiple criteria decision making (MCDM) problems with conflicting criteria (Alimardani, Hashemkhani Zolfani, Aghdaie, & Tamošaitienė, 2013; Zhang & Xing, 2017). The VIKOR method considers the distances to both the ideal solution and the nadir solution, and sorts the alternatives by using utility values and regret degrees.
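Suppose an MCDM problem has $m$ alternatives $A_1, \ldots, A_m$ and $n$ criteria $F_1, \ldots, F_n$, and let $f_{ij}$ denote the evaluation of alternative $A_i$ on criterion $F_j$. As a matter of convenience, we suppose that all criteria are of the benefit type. The classical VIKOR method is described step by step in the following.
Step 2. Construct the ideal solution $f_j^* = \max_i f_{ij}$ and the nadir solution $f_j^- = \min_i f_{ij}$ for each criterion $F_j$.
Step 3. Compute the maximum group utility value $S_i$ of alternative $A_i$, i.e.,
$S_i = \sum_{j=1}^{n} w_j \frac{f_j^* - f_{ij}}{f_j^* - f_j^-}$,
where $w_j$ is the weight of criterion $F_j$.
Step 4. Calculate the minimum individual regret value $R_i$ of alternative $A_i$, i.e.,
$R_i = \max_j \left[ w_j \frac{f_j^* - f_{ij}}{f_j^* - f_j^-} \right]$.
Step 5. Calculate the composite value $Q_i$ of alternative $A_i$, i.e.,
$Q_i = h \frac{S_i - S^*}{S^- - S^*} + (1 - h) \frac{R_i - R^*}{R^- - R^*}$,
where $S^* = \min_i S_i$, $S^- = \max_i S_i$, $R^* = \min_i R_i$, $R^- = \max_i R_i$, $h$ is the preference weight of the maximum group utility value, and $1 - h$ is the preference weight of the individual regret degree.
Step 6. Sort the alternatives by the maximum group utility value $S_i$, the individual regret degree $R_i$ and the composite value $Q_i$ in ascending order. The alternative $A^{(1)}$, ranked first by $Q_i$, is the best solution if the following two principles are satisfied:
D1. Acceptable advantage: $Q(A^{(2)}) - Q(A^{(1)}) \ge 1/(m - 1)$, where $A^{(2)}$ is the alternative in the second position of the ranking by $Q_i$.
D2. Acceptable stability in decision making: the alternative $A^{(1)}$ must also be the best ranked by $S_i$ or/and $R_i$.
If one of the principles is not met, then a set of compromise solutions is proposed, which consists of: (a) alternatives $A^{(1)}$ and $A^{(2)}$ if only principle D2 is not met; or (b) alternatives $A^{(1)}, A^{(2)}, \ldots, A^{(k)}$ if principle D1 is not met, where $A^{(k)}$ is determined by $Q(A^{(k)}) - Q(A^{(1)}) < 1/(m - 1)$ for maximum $k$ (the positions of these alternatives are "in closeness").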
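The computation of the three VIKOR measures can be illustrated with a short sketch. It assumes benefit-type criteria and the standard textbook form of the utility, regret and composite values; the function name `vikor` and the example decision matrix are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def vikor(matrix: List[List[float]], weights: List[float],
          h: float = 0.5) -> Tuple[List[float], List[float], List[float]]:
    """Return (S, R, Q) for benefit-type criteria; smaller values are better."""
    n = len(weights)
    best = [max(row[j] for row in matrix) for j in range(n)]   # ideal value per criterion
    worst = [min(row[j] for row in matrix) for j in range(n)]  # nadir value per criterion

    S, R = [], []
    for row in matrix:
        # Weighted, normalized gap to the ideal value on each criterion.
        terms = [
            weights[j] * (best[j] - row[j]) / (best[j] - worst[j])
            if best[j] != worst[j] else 0.0
            for j in range(n)
        ]
        S.append(sum(terms))  # group utility
        R.append(max(terms))  # individual regret

    def scale(values: List[float]) -> List[float]:
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) if hi != lo else 0.0 for v in values]

    Q = [h * s + (1 - h) * r for s, r in zip(scale(S), scale(R))]
    return S, R, Q


# Hypothetical example: three alternatives, three criteria.
S, R, Q = vikor([[7, 8, 6], [9, 6, 7], [8, 9, 9]], [0.4, 0.3, 0.3], h=0.5)
ranking = sorted(range(len(Q)), key=Q.__getitem__)  # ascending Q, best first
print(ranking)
```

In this toy example the third alternative has the smallest composite value and is therefore ranked first.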

The distribution linguistic evaluation
Fuzzy evaluations, stochastic evaluations (Liang, Jiang, & Liu, 2018; Sun, Hrušovský, Zhang, & Lang, 2018) and probability linguistic evaluations are useful uncertain evaluations in practical decision making (Ji, Zhang, & Wang, 2018; Luo, Zhang, Wang, & Li, 2019). As a useful format of uncertain evaluation, linguistic evaluations can represent qualitative information more accurately (Herrera, Herrera-Viedma, & Verdegay, 1995; Herrera & Herrera-Viedma, 1996).
Definition 1 (Zhang, Dong, & Xu, 2014; Guo, Huynh, & Sriboonchitta, 2017). Suppose $S = \{s_0, s_1, \ldots, s_g\}$ is a linguistic term set. Then $y = \{(s_k, \beta_k) \mid k = 0, 1, \ldots, g\}$, where $s_k \in S$, $\beta_k \ge 0$ and $\sum_{k=0}^{g} \beta_k = 1$, is called a distribution linguistic evaluation on $S$, in which $\beta_k$ is the proportion associated with the linguistic term $s_k$.
In order to solve MCDM problems with distribution linguistic evaluations, we may need to determine the ideal distribution linguistic evaluation $y^+$ and the nadir distribution linguistic evaluation $y^-$ for a sequence of distribution linguistic evaluations. Inspired by Jiang, Liang, and Sun (2015), we propose three definitions to construct the ideal and nadir distribution linguistic evaluations, as well as the distance between two distribution linguistic evaluations.
Definition 2. [...] Then the ideal distribution linguistic evaluation $y^+$ is defined as [...] such that [...]. In Definition 2, the index $k$ is the smallest index that satisfies [...].
Definition 3. [...] Then the nadir distribution linguistic evaluation $y^-$ is defined as [...] such that [...]. In Definition 3, the index $k$ is the largest index that satisfies [...]. Note that the value of $k$ in Definition 3 is independent of that in Definition 2.
Definition 4. Suppose $y_1$ and $y_2$ are two distribution linguistic evaluations. Then the normalized distance $d(y_1, y_2)$ between $y_1$ and $y_2$ is given by Eq. (9), which involves a matrix $Q = (q_{ij})$. Obviously, $Q$ is a symmetric positive definite matrix. It can be seen that the smaller the difference between $y_1$ and $y_2$ is, the larger the value of $q_{ij}$ is.
For distance between two distribution linguistic evaluations, the following properties are provided.
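Property 2 (Symmetry). For any two distribution linguistics $y_1$ and $y_2$, we have $d(y_1, y_2) = d(y_2, y_1)$.
Property 3 (Triangle inequality). For any three distribution linguistics $y_1$, $y_2$ and $y_3$, we have $d(y_1, y_3) \le d(y_1, y_2) + d(y_2, y_3)$.
Property 4 (Degeneration). If $y_1$ and $y_2$ are two deterministic linguistics, then the distance between $y_1$ and $y_2$ degenerates into the Euclidean distance.
Consider three distribution linguistic evaluations $y_1$, $y_2$ and $y_3$, as shown in Table 1.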
Intuitively, in the distribution linguistic $y_1$ the full probability is spread evenly over the linguistic terms, while $y_2$ and $y_3$ are two extreme distribution linguistics. Thus, the distance between $y_2$ and $y_3$ should be the maximum, and the distances between $y_1$ and $y_2$ and between $y_1$ and $y_3$ should be equal. Based on Definition 4 and Eq. (9), we obtain $d(y_2, y_3) = 1$, which satisfies both the triangle inequality and this intuition. Meanwhile, the distance between any two of these distribution linguistic evaluations can also be calculated with the distance measure provided by Yu et al. (2018). Unfortunately, that measure yields $d(y_1, y_2) \ne d(y_1, y_3)$, which is not consistent with our intuition. Therefore, the proposed distance measure is a relatively reasonable formula for measuring the degree of separation between two distribution linguistic evaluations.

Formulation of the problem
Consider the problem of selecting the most reasonable travel hotel(s) on an e-commerce platform. After a preliminary screening, some acceptable travel hotels are initially identified. However, because of limited knowledge and expertise, the tourist hesitates among the several alternative hotels. In order to rank the hotels and select a reasonable one, some evaluation features provided by the e-commerce platform are considered. To support the consumer's decision, a large number of online reviews about the hotels, associated with the evaluation features, are crawled from the related e-commerce platform. Therefore, the objective of this paper is to rank and select the travel hotels with the online reviews.
As a matter of convenience, the following notation is used throughout this study.
Suppose $A = \{A_1, A_2, \ldots, A_m\}$ is the set of alternative hotels, where $A_i$ denotes the $i$th hotel. Generally, the set $A$ can be pre-screened by the limits on location and price set by the tourist.
Suppose $F = \{F_1, F_2, \ldots, F_n\}$ is the set of evaluation features, where $F_j$ denotes the $j$th feature, $j \in N = \{1, 2, \ldots, n\}$. We suppose the set $F$ can be obtained from the e-commerce platform according to its online reviews.
Assume $W = (w_1, w_2, \ldots, w_n)^T$ is a vector of feature weights, in which $w_j$ is the weight given to feature $F_j$, with $w_j \ge 0$ for all $j \in N$ and $\sum_{j=1}^{n} w_j = 1$. The weight vector of the evaluation features can be provided directly by the tourist or obtained indirectly using some classical objective method.
For the $r$th tourist, a vector collects the online reviews that he or she provides concerning the alternative hotels.
The problem addressed in this paper is how to rank and select the most reasonable alternative hotel(s) from the alternative set $A$ using the online review information for each alternative hotel $A_i$ and the feature weight vector $W$. In Figure 2, the formulation of ranking alternative hotels with online reviews is described concretely.

Framework and processing for the problem
In order to solve the selection problem above, a resolution process is set up, as shown in Figure 3. Figure 3 shows that the resolution process can be divided into two modules: (1) crawling the online reviews associated with the tourist hotels and identifying the sentiment orientations of the online reviews by sentiment analysis technology; (2) ranking the hotels by using the DL-VIKOR method. A brief description of each module is given below.
Preparatory phase. In the former module, after the initial alternative hotels and the evaluation features provided by the website have been determined, the online reviews about the tourist hotels on the evaluation features are crawled using crawler software. Then the online reviews are analyzed with the Stanford parser. Next, using the sentiment dictionary and the online reviews of the hotels, the positive, negative and neutral sentiment dictionaries are constructed. Further, the sentiment orientation towards the hotel on each evaluation feature in each review is recognized by identifying the orientations of its sentiment words.
Decision analysis phase. In the latter module, based on the sentiment orientations of the reviews associated with the different hotels on the evaluation features, the assessments of the hotels on each evaluation feature are constructed in the format of distribution linguistic evaluations. Then, based on the number of online reviews on each evaluation feature and on the distribution linguistic evaluations, the weight vector of the evaluation features is determined. Further, according to the distribution linguistic evaluations of the hotels, we propose a generalized VIKOR method to obtain a ranking of the alternative hotels.

The proposed method
Based on Figure 3, the details of the model for ranking and selecting hotels through online reviews are described in the following. In Subsection 3.1, we apply sentiment analysis technology to derive the structured data. Then the method for determining the weight vector of the criteria is proposed in Subsection 3.2. Considering the format of distribution linguistic evaluations, a DL-VIKOR method is then presented to sort the alternative hotels.

Determining the structure evaluation information
Based on sentiment analysis technology, the sentiment orientations towards the hotels on each evaluation feature in the online reviews can be obtained. This module can be further divided into three parts: (a) crawling the online reviews and preprocessing the related data with respect to the hotels; (b) establishing sentiment dictionaries for the alternative hotels and identifying the positive, negative and neutral sentiment orientations of the online reviews; (c) determining the distribution linguistic evaluation of the hotels on each evaluation feature. The details of each part are given as follows.

Crawling and preprocessing online reviews concerning the alternative hotels
Nowadays, some websites allow tourists to publish their experiences and opinions about products online (Bi et al., 2019; Liu et al., 2017b; Fan, Che, & Chen, 2017). For example, TripAdvisor establishes evaluation features for hotels and encourages tourists to post their reviews of tourist hotels. Thus, in the following, we suppose that the online reviews associated with the hotels on the evaluation features can be crawled from the website TripAdvisor.com. For example, the online reviews provided by tourists for Hotel Okura Macao on TripAdvisor.com are shown in Figure 4.
In this subsection, we first use crawler software, such as Octopus harvester, to obtain the online reviews. Then we preprocess the online reviews with the Stanford parser. Concretely, the online reviews of all alternative hotels can be crawled from the related website. A number of web crawlers are now available that can be used to obtain online reviews; an introduction to web crawlers can be found on Wikipedia (https://en.wikipedia.org/wiki/Web_crawler). In fact, an online review posted by a tourist may contain several sentences concerning different features of the hotel. Thus, to identify the customers' sentiment orientations and sentiment strengths towards a certain feature, the sentences concerning each feature need to be extracted from the online reviews. Specifically, each online review is first divided into several sentences according to the punctuation. Each resulting sentence is then treated as the online review of a virtual tourist. In this way, the online reviews of alternative hotel $A_i$ provided by the $r$th virtual tourist can be expressed.
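The splitting of each review into per-sentence "virtual tourist" reviews can be sketched as follows. This is a minimal illustration only: the choice of punctuation marks and the function name `split_review` are assumptions, and the paper itself relies on the Stanford parser for preprocessing.

```python
import re
from typing import List


def split_review(review: str) -> List[str]:
    """Split one online review into sentences at '.', '!', '?' or ';'.

    Each returned sentence is then treated as the review of one
    virtual tourist.
    """
    # Split on whitespace that directly follows a sentence-ending mark.
    parts = re.split(r"(?<=[.!?;])\s+", review.strip())
    return [p for p in parts if p]


sentences = split_review("Great service. The room was clean! Would stay again")
print(sentences)
```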

Establishing sentiment dictionaries and sentiment analysis
Based on the sentiment dictionary HowNet, supplemented with buzzwords, an improved sentiment dictionary is established. Let $V^+$, $V^-$ and $V'$ represent the sets of positive, negative and neutral sentiment words, which are shown in Table 2. In the following, the positive, negative or neutral sentiment orientation towards the hotel on each evaluation feature in every review can be identified. We suppose the sentiment orientation of a review is mainly embodied in the sentiment words it contains. If there are only positive or only negative sentiment words in a review, then the sentiment orientation of the review is deemed positive or negative, respectively; if there are no obvious sentiment words, or both positive and negative sentiment words, then the sentiment orientation of the review is regarded as neutral; if there are negation words in the review, then the sentiment orientation of the review should be reversed.
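The orientation rule described above can be sketched in a few lines. The word lists below are hypothetical stand-ins for the dictionaries of Table 2 (which are built from HowNet in the paper), and the tokenization is deliberately naive.

```python
import re

# Hypothetical dictionary entries, for illustration only.
POSITIVE = {"great", "clean", "friendly"}
NEGATIVE = {"dirty", "noisy", "rude"}
NEGATIONS = {"no", "not", "none"}


def orientation(sentence: str) -> int:
    """Return 1 (positive), -1 (negative) or 0 (neutral) for one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    has_pos = any(t in POSITIVE for t in tokens)
    has_neg = any(t in NEGATIVE for t in tokens)
    if has_pos == has_neg:
        # No sentiment word at all, or both kinds present: neutral.
        return 0
    sign = 1 if has_pos else -1
    # An odd number of negation words reverses the orientation.
    n_negations = sum(t in NEGATIONS for t in tokens)
    return -sign if n_negations % 2 == 1 else sign


print(orientation("Great service"), orientation("The room was not clean"))
```

A real implementation would also need to check that a negation word actually modifies the sentiment word, which this sketch does not attempt.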
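Let $C_{ij}^{rq}$ represent the sentiment word of $e_{ri}^{q}$ on evaluation feature $F_j$, and let $P(C_{ij}^{rq})$ represent the orientation of the sentiment word $C_{ij}^{rq}$, i.e.
$P(C_{ij}^{rq}) = \begin{cases} 1, & \text{if } C_{ij}^{rq} \in V^+; \\ 0, & \text{if } C_{ij}^{rq} \in V'; \\ -1, & \text{if } C_{ij}^{rq} \in V^-. \end{cases}$
Degree words can enhance the expression of sentiment levels. In this paper, based on the sentiment dictionary, the degree words are divided into two levels according to their intensity of expression. Let $\deg(C_{ij}^{rq})$ denote the degree value of the degree word attached to the sentiment word $C_{ij}^{rq}$, as shown in Table 3.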
If negation words, such as "no", "none" or "really not", appear with the sentiment words, then the sentiment orientation may be reversed. In fact, there are two situations to consider: (1) if a negation word negates another negation word, so that together they express a positive meaning, then the orientation of the sentiment word is unchanged; (2) if the negation word negates the sentiment word, then its orientation is reversed. Generally, a negation word appears once or twice. Let $N$ denote the number of negation words for the sentiment word $C_{ij}^{rq}$; then $N = 1, 2$. To sum up, after the treatment of the sentiment orientation, degree adverb and negation word, the sentiment score $Score_{ij}^{rq}$ of online review $e_{ri}^{q}$ on feature $F_j$ is given by Eq. (11). The linguistic term of the online review on feature $F_j$, denoted by $y_{ij}^{q}$, is then calculated from this score.
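A possible way to combine orientation, degree word and negation into a single score is sketched below. Because the scoring equation is not legible in the text, the product form used here, the degree values, and the mapping from score to linguistic term index are all assumptions made for illustration; they are chosen so that scores fall in {-2, ..., 2} and map onto a five-term set $s_0, \ldots, s_4$.

```python
from typing import Optional

# Hypothetical two-level degree dictionary (stand-in for Table 3).
DEGREE = {"very": 2, "extremely": 2, "quite": 1, "fairly": 1}


def sentiment_score(orientation: int, degree_word: Optional[str],
                    negations: int) -> int:
    """Assumed score: (-1)^N * degree * orientation.

    orientation is 1, 0 or -1; negations is the number N of negation words.
    A sentiment word without a degree word gets degree 1.
    """
    degree = DEGREE.get(degree_word, 1) if degree_word else 1
    return (-1) ** negations * degree * orientation


def linguistic_index(score: int) -> int:
    """Assumed mapping of a score in {-2,...,2} to the index k of s_k in {0,...,4}."""
    return score + 2


print(linguistic_index(sentiment_score(1, "very", 0)))   # "very clean"
print(linguistic_index(sentiment_score(1, None, 1)))     # "not clean"
```

With this convention a single negation flips the sign of the score and a double negation leaves it unchanged, which matches the two situations described in the text.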

Determining the distribution linguistic evaluation of each alternative hotel concerning each feature
We suppose $S = \{s_0, s_1, \ldots, s_g\}$ is a set of sentiment orientation levels. In order to describe the differences in the levels of sentiment orientation accurately, the evaluations of the alternative hotels in this study take the format of distribution linguistic evaluations.
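One natural way to obtain a distribution linguistic evaluation for a hotel on a feature is to take the relative frequency of each sentiment level among the reviews that mention the feature. The sketch below illustrates this under that assumption; the function name and the input format (a list of term indices, one per review sentence) are hypothetical.

```python
from collections import Counter
from typing import List


def distribution(term_indices: List[int], g: int = 4) -> List[float]:
    """Relative frequency of each linguistic term s_0..s_g among the reviews."""
    if not term_indices:
        raise ValueError("at least one review is required")
    counts = Counter(term_indices)
    total = len(term_indices)
    return [counts.get(k, 0) / total for k in range(g + 1)]


# Five review sentences rated s_4, s_4, s_3, s_2 and s_0.
print(distribution([4, 4, 3, 2, 0]))
```

The returned proportions are non-negative and sum to one, as required of a distribution over the linguistic term set.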

Determination of evaluation features' weights
Online reviews generally fall into two types. One type expresses the overall impression of the hotel, such as "One of my favorite hotels in the world". In this situation, the evaluation on every feature is taken to be the same and equal to the overall evaluation. The other type expresses the evaluation of the hotel with respect to a particular attribute, such as "Great service". In order to aggregate the assessments of the hotels over all evaluation features, suitable weights of the evaluation features should be determined. In general, the weights of the evaluation features are regarded as subjective weights, that is, they are predefined by the decision maker. However, the subjective weights of the evaluation features are sometimes completely unknown. In order to determine the feature weights that reflect what tourists focus on, we propose a method based on online reviews.
1) The more times a feature appears in the online reviews, the higher the weight of the feature is. A statistical method is used to count the number of online reviews that mention each evaluation feature. Thus, we can obtain the weight $w_j^1$ of feature $F_j$ by Eq. (14), i.e. as the number of times $F_j$ appears in the online reviews divided by the total number of appearances of all features.
2) In the practical process of decision making, the larger the degree of deviation of the evaluations on an evaluation feature, the more important the role that feature plays in the aggregation of the assessments, so a larger weight should be attached to it. Thus, we can obtain the weight $w_j^2$ of evaluation feature $F_j$:
$w_j^2 = \frac{\sum_{i=1}^{m} \sum_{l=1}^{m} d(k_{ij}, k_{lj})}{\sum_{j=1}^{n} \sum_{i=1}^{m} \sum_{l=1}^{m} d(k_{ij}, k_{lj})}$,
in which $d(k_{ij}, k_{lj})$ is the distance between the two distribution linguistic evaluations of hotels $A_i$ and $A_l$ with respect to evaluation feature $F_j$.
Based on the two weight vectors of the evaluation features, we integrate $w_j^1$ and $w_j^2$ into the combined weight vector of the evaluation features, denoted by
$w_j = a w_j^1 + (1 - a) w_j^2$,
where $a$ ($0 \le a \le 1$) is the importance of $w_j^1$ in the combined weight of evaluation feature $F_j$, which is provided by the decision maker.
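The two kinds of weights can be sketched as follows. The sketch assumes that the frequency weight is the share of mentions of each feature and that the deviation weight is the share of total pairwise distance contributed by each feature; the pairwise distances are taken as given, since the distance of Definition 4 is not reproduced here. All names and numbers are hypothetical.

```python
from typing import List


def frequency_weights(mention_counts: List[int]) -> List[float]:
    """Weight of each feature as its share of all feature mentions."""
    total = sum(mention_counts)
    if total == 0:
        raise ValueError("no feature is mentioned in any review")
    return [c / total for c in mention_counts]


def deviation_weights(pairwise: List[List[List[float]]]) -> List[float]:
    """Weight of each feature as its share of the total pairwise distance.

    pairwise[j][i][l] is the distance between the evaluations of hotels
    i and l on feature j.
    """
    per_feature = [sum(sum(row) for row in matrix) for matrix in pairwise]
    total = sum(per_feature)
    if total == 0:
        raise ValueError("all evaluations coincide; deviation weights undefined")
    return [v / total for v in per_feature]


# Hypothetical data: three features, and two features with two hotels.
print(frequency_weights([20, 30, 50]))
print(deviation_weights([[[0.0, 0.1], [0.1, 0.0]],
                         [[0.0, 0.3], [0.3, 0.0]]]))
```

A feature on which the hotels are evaluated very differently discriminates more between them, which is why it receives the larger deviation weight in the second call.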

Ranking the hotels based on the distribution linguistic information
After obtaining the distribution linguistic evaluations of the alternative hotels on the evaluation features, we propose a novel DL-VIKOR method to rank the alternative hotels.
First, we need to determine the ideal distribution linguistic solution and the nadir distribution linguistic solution by Eqs. (17) and (18). Then, based on the normalized distance between two distribution linguistic evaluations, the distribution linguistic maximum group utility value $DLS_i$ of alternative hotel $A_i$ is computed by Eq. (19), in which $w_j$ is the weight of evaluation feature $F_j$. Further, the distribution linguistic individual regret degree $DLR_i$ of alternative hotel $A_i$ can be computed by Eq. (20). Based on the classical VIKOR method, the compromise solution is characterized by a maximum group utility and, simultaneously, a minimum individual regret degree. Therefore, the compromise degree $DLQ_i$ of the alternative hotel $A_i$ can be derived by Eq. (21), where $0 \le h \le 1$ is the preference weight of the maximum group utility value, while $1 - h$ is the weight of the individual regret degree.
According to the ascending order of $DLS_i$, $DLR_i$ and $DLQ_i$, three ranking results for the alternative hotels are obtained. The compromise solution is the hotel $A^{(1)}$ (the alternative hotel in the first position of the ranking by $DLQ_i$) if the following two principles are satisfied.
D1. $DLQ(A^{(2)}) - DLQ(A^{(1)}) \ge 1/(m - 1)$, where $A^{(2)}$ is the alternative hotel in the second position of the ranking by $DLQ_i$.
D2. The alternative hotel $A^{(1)}$ must also be the best ranked by $DLS_i$ or/and $DLR_i$.
If one of the principles is not met, then a set of compromise solutions is determined, which consists of: (a) alternative hotels $A^{(1)}$ and $A^{(2)}$ if only principle D2 is not satisfied, or (b) alternative hotels $A^{(1)}, A^{(2)}, \ldots, A^{(k)}$ if principle D1 is not satisfied, where $A^{(k)}$ is determined by $DLQ(A^{(k)}) - DLQ(A^{(1)}) < 1/(m - 1)$ for maximum $k$.
According to the above analysis, the procedure of the proposed method for solving the selection problem with online reviews can be summarized as follows:
Step 1. Crawl the online reviews from the related travel website using web crawler software. Preprocess the obtained online reviews with the Stanford parser, in which the notional words of each review are denoted by $e_i^k$.
Step 2. Construct the positive, negative and neutral sentiment dictionaries $V^+$, $V^-$ and $V'$ concerning feature $F_j$ based on Tables 2-3.
Step 3. Determine the distribution linguistic evaluations $k_{ij}$ of alternative hotel $A_i$ on evaluation feature $F_j$ using Eqs. (10)-(13).
Step 4. Determine the weight vector of the evaluation features using Eqs. (14)-(16).
Step 5. Establish the ideal distribution linguistic solution and the nadir distribution linguistic solution using Eqs. (17) and (18).
Step 6. Compute the distribution linguistic group utility value $DLS_i$, the distribution linguistic individual regret degree $DLR_i$ and the compromise value $DLQ_i$ of alternative $A_i$ by using Eqs. (19)-(21).
Step 7. Determine the compromise solution(s) based on $DLS_i$, $DLR_i$ and $DLQ_i$.
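The final step, choosing the compromise solution(s) from the three measures, can be sketched as follows. The function takes the values $DLS_i$, $DLR_i$ and $DLQ_i$ as given and applies principles D1 and D2 with the acceptance threshold $1/(m-1)$; the function name and the example values are hypothetical.

```python
from typing import List


def compromise_solutions(dls: List[float], dlr: List[float],
                         dlq: List[float]) -> List[int]:
    """Return the indices of the compromise hotel(s), best first."""
    m = len(dlq)
    if m < 2:
        raise ValueError("at least two alternatives are required")
    threshold = 1.0 / (m - 1)
    order = sorted(range(m), key=dlq.__getitem__)  # ascending DLQ
    first = order[0]

    # D1: acceptable advantage of the first over the second alternative.
    d1 = dlq[order[1]] - dlq[first] >= threshold
    # D2: the first alternative is also best by DLS or by DLR.
    d2 = (first == min(range(m), key=dls.__getitem__)
          or first == min(range(m), key=dlr.__getitem__))

    if d1 and d2:
        return [first]
    if d1:
        # Only D2 fails: the first two alternatives form the compromise set.
        return order[:2]
    # D1 fails: all alternatives "in closeness" to the first one.
    return [i for i in order if dlq[i] - dlq[first] < threshold]


# Hypothetical values for three hotels.
print(compromise_solutions([0.5, 0.2, 0.6], [0.3, 0.1, 0.35], [0.7, 0.0, 1.0]))
```

With five alternatives, as in the case study, the threshold is $1/(5-1) = 0.25$.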

A case study
A case of selecting a desirable hotel with online reviews on TripAdvisor.com is used to illustrate the proposed model. A tourist wants to take a holiday trip to Macao. However, since the tourist has limited prior knowledge of foreign hotels, it is difficult to select an optimal hotel in time. Thus, the decision needs to be made through a tourism services platform. TripAdvisor.com (https://www.tripadvisor.com/) is one of the world's leading tourism websites. TripAdvisor.com compares prices from more than 200 booking sites to help tourists find the lowest price on the right hotel, restaurant or vacation rental. TripAdvisor.com includes over 200 million online reviews provided by tourists from all regions. As a popular tourism website, TripAdvisor.com takes "Latest reviews, lowest prices" as its objective. It provides many kinds of travel reviews, covering vacation rentals, restaurants, hotels, etc.
By limiting the location and price of hotels, five alternative hotels in Macao were initially screened by a tourist in TripAdvisor.com. The tourist needs to select a desirable hotel from the following five hotels.
$A_1$: Galaxy Hotel
$A_2$: Conrad Macao Cotai Central
$A_3$: Hotel Okura Macao
$A_4$: Grand Hyatt Macao
$A_5$: Sheraton Grand Macao Hotel
Based on the information on this tourism services platform, five features associated with the alternative hotels are considered, i.e. Location ($F_1$), Service ($F_2$), Sleep Quality ($F_3$), Cleanliness ($F_4$) and Rooms ($F_5$). The weight vector of the five evaluation features, denoted by $W = (w_1, w_2, w_3, w_4, w_5)^T$, remains to be determined. The method for determining the weight vector described in detail in Section 3 is applied in ranking these five alternative hotels. The calculation process and discussion are given below.

Preprocessing data
First, using Octopus harvester, we crawl the reviews of these five hotels from the TripAdvisor.com website. Because the original information obtained from online review websites is complex textual data, we must further transform the original data into quantifiable evaluations. We obtain the sentiment words of all reviews by preprocessing the online reviews with the Stanford parser. For each online review, the orientation of the tourist towards the hotel is a positive, negative or neutral sentiment. Thus, we establish the positive, negative and neutral sentiment dictionaries $V^+$, $V^-$ and $V'$ based on Tables 2-3. According to the sentiment dictionaries, the distribution linguistic evaluations $k_{ij}$ of alternative hotel $A_i$ concerning feature $F_j$ are obtained by using Eqs. (10)-(13). In this practical example, we suppose $S = \{s_0, s_1, s_2, s_3, s_4\}$ is the set of sentiment orientation levels. Thus, we construct the distribution linguistic decision matrix of the five alternative hotels on the five features, which is shown in Table 4.

Methodology and results
After the online reviews have been processed, structured data in the format of distribution linguistic evaluations are obtained. Then, based on the number of times each feature appears in the online reviews and on the distribution linguistic evaluations, the weight vector of the evaluation features can be determined. From the number of appearances of the features in the online reviews and Eq. (14), the weight vector of the features is calculated as $W^1 = (0.2, 0.23, 0.22, 0.17, 0.18)^T$. On the other hand, from the distribution linguistic evaluations of all alternative hotels on each feature, the objective weight vector of the evaluation features is calculated as $W^2 = (0.32, 0.09, 0.2, 0.24, 0.15)^T$. Combining $W^1$ and $W^2$, we calculate the overall weight vector of the evaluation features, denoted by $W = (0.26, 0.16, 0.21, 0.21, 0.16)^T$. Here $a = 0.5$ is provided by the tourist.
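The reported overall weights can be checked with a few lines of code. The check assumes that the two weight vectors are combined linearly, with weight $a$ on the frequency-based vector and $1 - a$ on the deviation-based vector; under this assumption the result agrees with the reported vector up to the rounding of the published two-decimal values.

```python
from typing import List


def combine_weights(w1: List[float], w2: List[float], a: float) -> List[float]:
    """Assumed linear combination a * w1 + (1 - a) * w2, feature by feature."""
    if len(w1) != len(w2):
        raise ValueError("weight vectors must have the same length")
    return [a * x + (1 - a) * y for x, y in zip(w1, w2)]


W1 = [0.20, 0.23, 0.22, 0.17, 0.18]          # frequency-based weights (reported)
W2 = [0.32, 0.09, 0.20, 0.24, 0.15]          # deviation-based weights (reported)
W_reported = [0.26, 0.16, 0.21, 0.21, 0.16]  # overall weights (reported)

W = combine_weights(W1, W2, a=0.5)
max_gap = max(abs(x - y) for x, y in zip(W, W_reported))
print([round(w, 3) for w in W], max_gap)
```

The largest gap is about 0.005, which is consistent with the inputs and outputs having been rounded to two decimals.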
Further, by the proposed DL-VIKOR method, the ranking result for the alternative hotels can be obtained. Based on the decision making matrix for the alternative hotels, the ideal distribution linguistic solution and the nadir distribution linguistic solution are determined by Eqs (17) and (18). Then the distribution linguistic group utility values, the distribution linguistic individual regret degrees and the compromise values of the five hotels are computed by Eqs (19)-(21).
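For orientation, the group utility and individual regret of classical VIKOR can be sketched as follows. This is the crisp textbook form, not the DL-VIKOR of Eqs (19)-(20): it assumes that each entry `distances[i][j]` is already a normalized distance of alternative i from the ideal solution on feature j (0 at the ideal, 1 at the nadir). The distance matrix is hypothetical and is not taken from Table 4; only the weights come from the text.

```python
def group_utility_and_regret(distances, weights):
    """Classical VIKOR: S_i = sum_j w_j * d_ij and R_i = max_j w_j * d_ij."""
    S, R = [], []
    for row in distances:
        weighted = [w * d for w, d in zip(weights, row)]
        S.append(sum(weighted))   # group utility (smaller is better)
        R.append(max(weighted))   # individual regret (smaller is better)
    return S, R


weights = [0.26, 0.16, 0.21, 0.21, 0.16]  # overall weights reported in the text

# Hypothetical normalized distances for three alternatives on five features.
distances = [
    [0.0, 0.5, 1.0, 0.0, 0.5],
    [1.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],  # coincides with the ideal on every feature
]
S, R = group_utility_and_regret(distances, weights)
```

An alternative that coincides with the ideal solution on every feature receives zero utility and zero regret, which is the best value of both measures.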
Concretely, the distribution linguistic group utility values of the five alternative hotels are computed by Eq. (19), the distribution linguistic individual regret degrees by Eq. (20), and the distribution linguistic compromise measures by Eq. (21). Based on the decreasing orders of $DLS_i$, $DLR_i$ and $DLQ_i$, three rankings of the five alternative hotels are derived. The compromise solution in this decision making process is $A_2$, because it meets the following two principles: (a) principle D1, i.e., $DLQ(A_5) - DLQ(A_2) = 0.398 \geq 0.25$; (b) principle D2, i.e., hotel $A_2$ also takes the first position in the ranking derived from $DLR_i$. Therefore, the best hotel for this trip is $A_2$, i.e. Conrad Macao Cotai Central.
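The two acceptance principles can be expressed as a short check. The sketch assumes the classical VIKOR conventions: smaller values are better, the advantage threshold is 1/(m - 1) for m alternatives (0.25 for five alternatives, matching the threshold in principle D1), and stability holds when the best alternative by the compromise measure is also best by group utility or by individual regret. The labels X1 to X5 and all values are hypothetical, not the paper's results.

```python
def compromise_solution(Q, S, R):
    """Return (best alternative, advantage condition, stability condition)."""
    order = sorted(Q, key=Q.get)          # ascending: smaller Q is better
    best, second = order[0], order[1]
    threshold = 1.0 / (len(Q) - 1)        # 0.25 when there are five alternatives
    advantage = (Q[second] - Q[best]) >= threshold
    stable = best == min(S, key=S.get) or best == min(R, key=R.get)
    return best, advantage, stable


# Hypothetical measures for five alternatives (illustrative only).
Q = {"X1": 0.10, "X2": 0.55, "X3": 0.80, "X4": 1.00, "X5": 0.62}
S = {"X1": 0.30, "X2": 0.45, "X3": 0.50, "X4": 0.70, "X5": 0.48}
R = {"X1": 0.12, "X2": 0.20, "X3": 0.22, "X4": 0.26, "X5": 0.21}

best, advantage, stable = compromise_solution(Q, S, R)
```

When either condition fails, classical VIKOR proposes a set of compromise solutions instead of a single alternative; that case is not handled in this sketch.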

Discussion and comparison
In this subsection, we conduct a sensitivity analysis and compare our proposal with the methods proposed by Daekook and Yongtae (2014) and Yu et al. (2018), so as to show the advantages of the model in this study.
Let the parameter h be assigned different values, such as h = 0, h = 0.5 and h = 1. With the proposed method, different values of h yield different results, which are shown in Table 5.
when h = 1. Thus, the weight of the maximum group utility reflects the decision maker's preference, which contributes directly to the similar ranking results.
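The role of h can be illustrated with the classical VIKOR compromise measure, $Q_i = h\,(S_i - S^*)/(S^- - S^*) + (1 - h)\,(R_i - R^*)/(R^- - R^*)$, where $S^*$, $R^*$ are the smallest and $S^-$, $R^-$ the largest values. This crisp form is an assumption and may differ from Eq. (21); the values of S and R below are hypothetical.

```python
def _scale(values):
    """Min-max scale a list to [0, 1]; all-equal inputs map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def compromise_values(S, R, h):
    """Classical VIKOR compromise measure with weight h on group utility."""
    return [h * s + (1.0 - h) * r for s, r in zip(_scale(S), _scale(R))]


def ranking(values):
    """Indices of alternatives ordered from best (smallest value) to worst."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical group utilities and regrets for three alternatives.
S = [0.40, 0.30, 0.60]
R = [0.15, 0.25, 0.20]

rankings = {h: ranking(compromise_values(S, R, h)) for h in (0.0, 0.5, 1.0)}
```

With h = 0 the ranking follows the individual regret alone, and with h = 1 it follows the group utility alone, so intermediate values of h trade off the two criteria.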
In the following, the results obtained by three different methods for this case are compared in Table 6 (h = 0.5).

Table 6. Comparison of ranking results by using three methods (columns: Methods; Ranking results; first row: the DL-VIKOR method in this study)

As shown in Table 6, the major trend of the ranking results obtained by these three methods is basically consistent. Encouragingly, the alternative hotel $A_2$, which is the best alternative under our proposed method, has the top score of 5 points on TripAdvisor.com. The ranking of the other hotels by the proposed method is also similar to their order among the top 15 hotels of Macao on TripAdvisor.com.
However, a slight difference among the results of these three methods can be seen. Because of the two principles of the VIKOR method, the final rankings of the five alternative hotels differ, although the rankings by group utility values and compromise measures are similar across the three methods. The ranking of alternative hotels $A_2$, $A_4$ and $A_5$ cannot be distinguished using the method presented by Daekook and Yongtae (2014). The main reason is that the evaluations constructed from online reviews in that method are somewhat coarse, and some evaluation information about different levels is partially ignored or cancelled out. For example, in the proposed method, the distribution linguistic evaluations of $A_5$ concerning features $F_2$, $F_3$ and $F_4$ share a common character: the probabilities of both the low linguistic terms and the high linguistic terms are large. Because evaluations on different linguistic terms compensate for one another, the method provided by Daekook and Yongtae (2014) partially loses evaluation information for alternative hotels $A_3$, $A_4$ and $A_5$.
On the other hand, a similar problem arises in the method provided by Yu et al. (2018). Although the full evaluations of the alternative hotels are preserved so as to avoid information loss, the ranking result also differs from that of the other methods. The main reason is that the distance measure in this generalized VIKOR method is different from that of the proposed method. The two methods were contrasted in detail in Subsection 1.2. Therefore, the comparison of results shows that the proposed DL-VIKOR method can avoid information loss when handling online reviews. Furthermore, more accurate overall compromise values of the alternative hotels are derived, which leads to a clear ranking of the alternative hotels. It is worth mentioning that the method provided by Yu et al. (2018) can only solve problems with linguistic or crisp evaluations from each customer. In practice, customers often give complex textual online reviews instead of crisp evaluations to express their feelings about consumption. The method proposed in this study can evaluate and select products with such online reviews effectively.

Conclusions
The problem of selecting tourism products with online reviews has an extensive practical application background. For evaluating and selecting travel hotels, a decision support model is provided in this study, based on a data processing method and the distribution linguistic VIKOR (DL-VIKOR) method. In the proposed method, the text data are processed into distribution linguistic evaluations by sentiment analysis. For assessments in the format of distribution linguistic evaluations, an approach to determining the weight vector of evaluation features is presented. Then the method for determining the ideal solution and the nadir solution, and the distance between two distribution linguistic evaluations, are proposed. Further, the ranking method for alternative hotels is presented based on DL-VIKOR.
Compared to previous works, this study has several distinguishing characteristics. From a practical perspective, the problem of selecting tourism service products with online reviews is investigated so as to propose a novel decision support model. From a theoretical point of view, in order to make as much use of the textual data as possible, we processed the text data into distribution linguistic evaluations. This step can avoid information loss or distortion compared to previous models. Then some basic theory about distribution linguistic evaluations is introduced to support the selection, such as the ideal solution and nadir solution, the distance formula and the DL-VIKOR method.
For future research, how to obtain the evaluation features associated with alternatives from online reviews could be further investigated. Further, evaluation or selection problems with online reviews from different sources could be investigated, such as e-commerce platforms, micro-blogs and public service platforms. Additionally, to support the choices of decision makers in real problems, it is essential to apply the proposed method to real decision problems (Guan, Zhao, & Du, 2017; Sun, Li, Liang, & Zhang, 2019).