MODELLING CONSUMER SATISFACTION BASED ON ONLINE REVIEWS USING THE IMPROVED KANO MODEL FROM THE PERSPECTIVE OF RISK ATTITUDE AND ASPIRATION

With the development of e-commerce, the growing volume of online reviews is a promising data source for enterprises seeking to improve online products. This paper proposes a method for modelling consumer satisfaction based on online reviews using an improved Kano model from the perspective of risk attitude and aspiration. Firstly, the attributes that consumers care about are extracted from online reviews, and sentiment analysis of the extracted attributes is carried out using Stanford CoreNLP. Secondly, to identify the types of product attributes, an improved Kano model is proposed based on the effects of product attributes on consumer total utility. On this basis, different attribute types are illustrated from the perspective of risk attitude. Then, consumer aspirations are mined based on the risk attitudes toward different attributes and the impact of each attribute on consumer satisfaction. According to the risk attitudes and aspirations of different attributes, quantified satisfaction functions are constructed to provide more objective and accurate improvement suggestions. Finally, the proposed method is applied to hotel service improvement to illustrate its effectiveness.


Introduction
Effective product improvement through the analysis of consumer satisfaction has become key for enterprises to remain competitive in a fierce market (Bi et al., 2020b). Enterprises can enhance consumer satisfaction by improving products, but the available resources are often limited (Xu et al., 2009). Therefore, it is particularly important to maximize consumer satisfaction through product improvements with limited resources (Violante & Vezzetti, 2017). Many scholars have studied product improvement strategies through the analysis of consumer satisfaction and have put forward theoretical methods with practical significance, such as DINESERV (Marković et al., 2010), Importance Performance Analysis (IPA) (Bi et al., 2020a) and the Kano model (Kano et al., 1984). The Kano model, which can distinguish the non-linear relationships between consumer demand and consumer satisfaction, has been widely applied in many industries and studies (Ou et al., 2018). However, most studies analyze consumer satisfaction through surveys, which have the following disadvantages. Firstly, data collection is expensive in terms of time and money (Jiang et al., 2016). Secondly, due to the subjectivity of questionnaires and the subjective willingness of participants, the quality of data cannot be guaranteed (Groves, 2006). Finally, it is difficult to ensure the timeliness of the collected data (Culotta & Cutler, 2016).
With the continuous development of social media, online reviews given by consumers can be found on various websites (Wang et al., 2018). Online reviews contain a wealth of information, such as consumer preferences (Jin et al., 2019), attributes that consumers care about (Popescu & Etzioni, 2007), and sentimental preferences for attributes (Ahani et al., 2019). As a resource that is low-cost, easily available and updated in a timely manner, online reviews have the potential to address the shortcomings of traditional survey methods (Bi et al., 2019b; Gao et al., 2018). At present, online reviews have been recognized as a data source in many fields, such as product recommendation (Siering et al., 2018), product improvement (Liu et al., 2018), product ranking (Li et al., 2010) and consumer preference analysis (Xiao et al., 2016). These studies have shown that online reviews are not only important for consumers making purchase decisions, but also provide a low-cost and time-efficient data source for enterprises seeking to improve products (Liu et al., 2015).
However, few studies address product improvement using online reviews. Some have studied consumer satisfaction based on online reviews to provide improvement strategies, but several gaps in the literature remain.
Firstly, some studies classify product attributes according to the effect of attribute performance on consumer satisfaction (Xiao et al., 2016; Qi et al., 2016; Bi et al., 2019a), but quantifying the relationship between attribute performance and consumer satisfaction would better support improvement strategies (Xu et al., 2009). Secondly, consumer satisfaction is not only related to attribute performance, but is also influenced by risk attitude and aspiration (Simon, 1956). Consumers will be differently satisfied with the same attribute performance under different risk attitudes, and once attribute performance reaches the aspiration, consumer satisfaction no longer changes with attribute performance. Therefore, it is necessary to construct quantitative consumer satisfaction functions from the perspective of risk attitude and aspiration. Thirdly, because consumers have different risk attitudes towards different attribute types, the emphasis they place on positive and negative reviews differs. Most existing studies do not consider this point and aggregate attribute performance by directly averaging the sentiment scores of online reviews (Martí Bigorra et al., 2019). Finally, consumers often express different degrees of sentiment when evaluating attribute performance, but most studies only divide the attribute sentiments of online reviews into positive and negative and do not quantify the degree of positive or negative sentiment (Bi et al., 2019a).

Related studies on opinion mining from online reviews
With the continuous development of social media, online reviews serve as a promising data source for decision analysis (Wang et al., 2018; Lee et al., 2019). However, as online reviews are unstructured qualitative data that cannot be directly applied to decision analysis, they must first be processed quantitatively through opinion mining (Ahani et al., 2019). Many scholars have studied opinion mining from online reviews, focusing mainly on extracting product attributes and analyzing the sentiment information related to those attributes (Fan et al., 2020).
Attribute extraction is the basis of sentiment analysis. Existing methods of attribute extraction can be divided into three types: supervised, semi-supervised and unsupervised. Since supervised techniques require labelled data, and constructing labelled data often requires a large amount of resources, enterprises usually adopt unsupervised techniques to extract attributes. The most frequently used unsupervised techniques are frequency-based (Quan & Ren, 2014), bootstrapping (Li et al., 2015), and heuristic or rule-based (Rana & Cheah, 2017b). These attribute extraction methods are available in many Python toolkits, such as Gensim and NLTK (Natural Language Toolkit). In addition, many text analytics tools use these methods for attribute extraction; they are powerful and easy to operate, and have been applied in many academic studies (Culotta & Cutler, 2016).
Sentiment analysis mines and analyzes the sentiment information expressed in text in order to provide enterprises with more comprehensive opinions on product attributes. Some scholars have studied methods of sentiment analysis (Cao et al., 2011). Existing studies are mainly divided into two categories: lexicon-based sentiment analysis (Medhat et al., 2014) and sentiment analysis based on machine learning (Chang et al., 2019). Lexicon-based sentiment analysis is suitable for sentence-level sentiment analysis and can be further divided into dictionary-based (Liang et al., 2019) and corpus-based (Liu et al., 2017b) approaches. Machine learning based sentiment analysis is suitable for document-level sentiment analysis and can be further divided into supervised (Fan et al., 2017; Liu et al., 2017a) and unsupervised (Khan et al., 2016) approaches.
The above studies have made an important contribution to opinion mining from online reviews. By extracting product attributes and analyzing the sentiment information about those attributes, the useful information that consumers provide in online reviews can be mined to support actual decision problems. Kano et al. (1984) pointed out that consumers have different attitudes toward different product attributes, because different product attributes have different influences on consumer satisfaction. The Kano model proposed by Kano et al. (1984) can qualitatively evaluate the impact of attribute performance on consumer satisfaction, and it divides product attributes into five types: must-be attributes, one-dimensional attributes, attractive attributes, indifference attributes and reverse attributes. The five attribute types are shown in Figure 1.

Related studies on Kano model
However, the original Kano model is qualitative in nature and cannot reflect consumer satisfaction accurately (Berger et al., 1993). Due to the lack of quantitative measurement of consumer satisfaction, the Kano model cannot play a key role in product improvement and service management (Violante & Vezzetti, 2017). Subsequently, some scholars have improved the Kano model from a quantitative perspective (Brandt, 1988; Berger et al., 1993; Xu et al., 2009; Lee & Huang, 2009). These studies provide important references for enterprises to understand consumer satisfaction and market structure quantitatively. In practical decision-making, studies on the Kano model usually aim to meet the demands of consumers and improve overall consumer satisfaction (Ting & Chen, 2002). Therefore, they usually focus on three attribute types: must-be attributes, one-dimensional attributes and attractive attributes.
[Figure 1. The five attribute types of the Kano model; recoverable labels: must-be attributes, indifference attributes, reverse attributes]
With the continuous development of social media, online reviews have potential as a data source for implementing the Kano model (Bi et al., 2019b; Gao et al., 2018). A few scholars have begun to take online reviews as the data source and have made quantitative improvements to the Kano model based on the characteristics of online reviews (Ou et al., 2018). Against the background of e-commerce, Qi et al. (2016) applied online reviews to improve the Kano method in order to develop appropriate product improvement strategies. Xiao et al. (2016) proposed a modified ordered choice model to extract consumer preferences from online reviews, and a marginal effect-based Kano model to classify attribute types. Martí Bigorra et al. (2019) proposed a method to automatically classify attributes extracted from online reviews into Kano categories. Bi et al. (2019a) proposed an ensemble neural network to measure the effects of consumer sentiments toward different attributes on consumer satisfaction, with the category of each attribute identified by an effect-based Kano model.
The above quantitative studies on the Kano model based on online reviews are of great practical significance, but some problems still need further study. Firstly, some studies only distinguish attribute types according to the relationship between attribute performance and consumer satisfaction (Xiao et al., 2016; Qi et al., 2016; Bi et al., 2019a), whereas it would be better to quantify that relationship. Secondly, consumer satisfaction is not only related to attribute performance, but is also influenced by risk attitude and aspiration (Simon, 1956). Therefore, how to construct a quantitative consumer satisfaction function from the perspective of risk attitude and aspiration needs further exploration. Thirdly, existing studies often aggregate attribute performance by directly averaging the sentiment scores of online reviews (Martí Bigorra et al., 2019), but because consumers have different risk attitudes towards attributes, the emphasis on positive and negative reviews is not the same when aggregating single online reviews. Weight curves that reflect the emphasis consumers place on positive and negative reviews should therefore be used to aggregate single online reviews. Finally, when establishing an extended Kano model to classify attribute types, most studies only divide the attribute sentiments of online reviews into positive and negative, without quantifying the degree of sentiment (Bi et al., 2019a). However, consumers often express different degrees of sentiment when evaluating attribute performance, and this degree should be quantified to understand consumer satisfaction accurately.

Processing of online reviews
This section introduces the processing of online reviews. Leximancer is first used for text analysis to obtain product attributes and related keywords, which are regarded as rules for sentiment analysis. Stanford CoreNLP is then used to perform sentiment analysis on the online reviews, and the degree of sentiment information is expressed as a probabilistic linguistic term set (PLTS). Next, a score function of PLTS is proposed to calculate the sentiment score. Finally, different weight curves for aggregating single online reviews are established.

Extraction of product attributes
Leximancer is an automated text analysis software that can extract important concepts and achieve information visualization by analyzing the conceptual structure of a large amount of text content. Leximancer does not use the coding of word frequency or phrases, but uses a Bayesian algorithm to analyze text relationships, which has been shown to be stable and repeatable (Boo & Busser, 2018; Tseng et al., 2015). Therefore, Leximancer can extract the important product attributes from online reviews.
Leximancer works in the following steps. Firstly, it detects the frequency with which a word appears together with other words to form a frequency matrix. Secondly, based on the frequency matrix, the related concepts are generated by the algorithm. Finally, concepts are divided into different themes by analyzing the correlation between the concepts, and the theme maps are formed. The closer the concepts in the theme map are, the higher their correlation is. A theme is composed of several concepts that are relatively close, and it is named after the most representative concept. In addition, Leximancer can not only identify the main concepts from the text, but also generate word frequency statistics tables related to the concepts, so as to help readers understand the text content more intuitively. This paper uses Leximancer to conduct online text analysis and select the key concepts as the product attributes for subsequent analysis.
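The first step above, counting how often words appear together, can be illustrated with a minimal Python sketch. It is not Leximancer's algorithm: the stop-word list, the review-level co-occurrence window and the function names `tokenize` and `cooccurrence_counts` are assumptions made only for illustration.

```python
from collections import Counter
from itertools import combinations
import re

# Tiny illustrative stop-word list (an assumption, not Leximancer's own list).
STOP_WORDS = {"i", "is", "of", "the", "a", "and", "was", "to", "in", "but"}


def tokenize(review):
    """Lowercase a review, split it into words and drop stop words."""
    words = re.findall(r"[a-z]+", review.lower())
    return [w for w in words if w not in STOP_WORDS]


def cooccurrence_counts(reviews):
    """Count how often each unordered word pair appears in the same review."""
    pair_counts = Counter()
    for review in reviews:
        unique_words = sorted(set(tokenize(review)))
        # combinations() over a sorted list yields each pair once,
        # always in alphabetical order.
        pair_counts.update(combinations(unique_words, 2))
    return pair_counts


reviews = [
    "The room was clean and the staff was friendly",
    "Friendly staff but the room is small",
    "Great location and clean room",
]
counts = cooccurrence_counts(reviews)
for pair, count in counts.most_common(3):
    print(pair, count)
```

Pairs with high counts are candidates for belonging to the same concept; a real tool would go on to weight and cluster these counts rather than read them directly.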
The text analysis process of Leximancer is introduced below, taking the online reviews on the tourism website "TripAdvisor" as an example. Firstly, the online reviews on "TripAdvisor" are obtained, and words such as "I", "is" and "of" are removed using a stop-word list. Secondly, the online reviews are imported into Leximancer and analyzed after a series of settings have been configured. Finally, the analysis results are visualized after manual adjustments. Part of the visualization results are shown in Figure 2, from which it can be seen that consumers mention "room", "staff", "service", "lounge", "executive" and "location" with high frequency, indicating that consumers pay more attention to these aspects.
Figure 3 shows the statistics of some of the themes generated from concepts. Leximancer generates the themes according to the relationship between the concepts. As can be seen from Figure 3, the most obvious themes are "service", "room", "location" and "sleep". Figure 4 shows the concepts related to the theme "Location" and the online reviews about it. The key concepts related to the theme "Location" mainly include "location", "restaurants", "distance", "mall", etc. The online reviews referring to the theme "Location" are also shown in Figure 4, which illustrates that the theme extraction of Leximancer is quite accurate.
After the analysis of the themes and concepts, Leximancer can form theme maps to give readers a clearer understanding of the relationship between the various themes and concepts. Figure 5 shows some major themes and the concepts related to each theme. For example, the concepts related to the theme "sleep" include "noise", "bed", "quiet", etc. The closer two themes are on the map, the closer their relationship. It is worth noting that some themes overlap, so some themes need to be adjusted manually based on the analysis results. For example, although "location", "airport" and "traffic" are different themes, they are relatively close and related. Besides, there are few concepts in "airport" and "traffic". Therefore, when defining the concepts related to the themes, "airport" and "traffic" can be incorporated into the theme "location". In addition, the concepts of the themes "room" and "sleep" partly intersect, so the concepts that belong to both themes also need to be distinguished manually.
By using Leximancer, the word frequency, concepts and themes of online reviews are analyzed, and the final themes and related concepts are manually selected. The final themes are the product attributes that consumers pay attention to, and the related concepts are the keywords related to the attributes, which provide rules for subsequent sentiment analysis.
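How such keyword rules might route review sentences to attributes can be sketched as follows. The themes and keywords are the examples mentioned in the text ("Location", "sleep", and "breakfast" under "Food"); the sentence splitter and the function names are illustrative assumptions, not part of Leximancer or Stanford CoreNLP.

```python
import re

# Themes and keywords taken from the examples in the text; in practice they
# would come from the manually adjusted Leximancer output.
ATTRIBUTE_KEYWORDS = {
    "Location": {"location", "restaurants", "distance", "mall", "airport", "traffic"},
    "Sleep": {"noise", "bed", "quiet"},
    "Food": {"breakfast"},
}


def split_sentences(review):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [p for p in parts if p]


def attributes_of(sentence):
    """Return every attribute whose keyword set intersects the sentence's words."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return [name for name, keywords in ATTRIBUTE_KEYWORDS.items() if words & keywords]


def sentences_by_attribute(review):
    """Group the sentences of one review under the attributes they mention."""
    grouped = {}
    for sentence in split_sentences(review):
        for name in attributes_of(sentence):
            grouped.setdefault(name, []).append(sentence)
    return grouped


review = "The breakfast was cold. Great location near the mall! The bed was comfortable and quiet."
print(sentences_by_attribute(review))
```

Each group of sentences can then be passed to the sentiment analyzer separately, so that every attribute receives its own sentiment distribution.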

Sentiment analysis of online reviews
Online reviews given by consumers often include various sentiment preferences and different preference degrees. Therefore, this paper uses probabilistic linguistic term sets (PLTS) to represent online reviews. A PLTS can not only represent multiple sentiment preferences, but also describe the distribution of sentiment preferences (Pang et al., 2016). Stanford CoreNLP (Manning et al., 2014) (https://stanfordnlp.github.io/CoreNLP/) is a natural language analysis toolkit with a variety of tools, including the part-of-speech (POS) tagger, the named entity recognizer (NER), the parser, the coreference resolution system, sentiment analysis, bootstrapped pattern learning and the open information extraction tools. The Recursive Neural Tensor Network (RNTN) (Socher et al., 2013) is used for sentiment analysis in Stanford CoreNLP; it builds up a representation of whole sentences based on the sentence structure. It divides the results of sentiment analysis into five sentiment labels: very negative, negative, neutral, positive and very positive, corresponding to sentiment values of 0, 1, 2, 3 and 4, respectively. The probability of each label can be calculated by the RNTN model.
Previous studies have shown that the RNTN model pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4% (Socher et al., 2013). The RNTN model obtains 80.7% accuracy on predicting fine-grained sentiment labels for all phrases and captures different sentiment preferences and sentiment degrees more accurately than previous models (Song & Chambers, 2014). Therefore, this paper uses Stanford CoreNLP for sentiment analysis of online reviews. Figure 6 provides an example of the sentiment treebank. In the sentence "This movie was actually neither that funny, nor super witty", "funny" and "witty" are defined as positive words, but the sentiment of the sentence is negative. Therefore, it is very important to analyze sentiment information in semantic space based on grammatical structures. Stanford CoreNLP uses the sentiment treebank and the RNTN model, which can effectively analyze grammatical structures and quantify the sentiment information of online reviews, to make the sentiment analysis more detailed (Socher et al., 2013).
Figure 6. Example of the sentiment treebank
By analyzing the sentence under a certain attribute, Stanford CoreNLP can obtain the sentiment score with a value of 0-4 and the sentiment distribution over the different sentiment labels. By calculating the probability of the different sentiment labels, the sentiment distribution under this attribute can be transformed into a PLTS. Figure 7 shows an example of sentiment analysis using Stanford CoreNLP. Since the sentence contains the concept word "breakfast" under the theme "Food", the online review is classified as a description of the attribute "Food". The total sentiment score of this sentence is 1, which means its sentiment is "negative". The corresponding sentiment distribution from 0 to 4 is [0.4143, 0.4623, 0.0985, 0.0125, 0.0124].
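The transformation of the example distribution into a PLTS can be sketched in Python. The data structure (a list of value/probability pairs) and the function names are assumptions for illustration; `normalized_expectation` is only a simple stand-in score and is not the score function proposed in this paper.

```python
LABELS = ["very negative", "negative", "neutral", "positive", "very positive"]


def to_plts(distribution):
    """Pair each sentiment value 0..4 with its probability, dropping zero-probability terms."""
    if len(distribution) != len(LABELS):
        raise ValueError("expected one probability per sentiment label")
    return [(value, p) for value, p in enumerate(distribution) if p > 0]


def most_likely_label(distribution):
    """Return the sentiment value with the highest probability and its label."""
    value = max(range(len(distribution)), key=lambda k: distribution[k])
    return value, LABELS[value]


def normalized_expectation(plts, top=4):
    """Illustrative stand-in score in [0, 1]: probability-weighted mean value divided by `top`.

    This is NOT the score function proposed in the paper.
    """
    total = sum(p for _, p in plts)
    return sum(value * p for value, p in plts) / (top * total)


distribution = [0.4143, 0.4623, 0.0985, 0.0125, 0.0124]
plts = to_plts(distribution)
print(most_likely_label(distribution))
print(normalized_expectation(plts))
```

For the example sentence, the most likely label is value 1 ("negative"), which agrees with the total sentiment score reported in the text, while the PLTS retains the information that "very negative" is almost as probable.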

The score function of PLTS
By referring to the closeness degree of alternatives in TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) and the score function proposed by Liao et al. (2020), a new score function of PLTS is proposed in this paper.In order to facilitate subsequent analysis, this paper will use the score function of PLTS to calculate the sentiment score of online reviews.
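Definition 1 (Pang et al., 2016). Let $S = \{s_\alpha \mid \alpha = 0, 1, \ldots, \tau\}$ be a linguistic term set. Then a probabilistic linguistic term set (PLTS) can be defined as:
$$L(P) = \left\{ L^{(k)}\left(p^{(k)}\right) \;\middle|\; L^{(k)} \in S,\ p^{(k)} \ge 0,\ k = 1, 2, \ldots, \#L(P),\ \sum_{k=1}^{\#L(P)} p^{(k)} \le 1 \right\}$$
where $L^{(k)}\left(p^{(k)}\right)$ is the linguistic term $L^{(k)}$ associated with the probability $p^{(k)}$, and $\#L(P)$ is the number of all different linguistic terms in $L(P)$. Note that if $\sum_{k=1}^{\#L(P)} p^{(k)} = 1$, the complete probabilistic information of the possible linguistic terms can be obtained.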
, the score function of L(P) is:

Aggregation of the single online review based on risk attitude
There is often no absolute sentiment preference in online reviews generated by consumers: some attributes in a review receive positive sentiment and others negative sentiment. If the total sentiment scores of online reviews are used directly for analysis, a large amount of valuable information will be lost in processing. Therefore, in order
If the consumers are risk averse, they will be more concerned about negative online reviews. It is more practical to amplify the impact of negative reviews when aggregating the online reviews. The weight curve for the single online review is as follows:
If the consumers are risk neutral, they will not distinguish between positive and negative online reviews. The weight curve for the single online review is as follows:
If the consumers are risk seeking, they will be more concerned about positive online reviews. It is more practical to amplify the impact of positive reviews when aggregating the online reviews. The weight curve for the single online review is as follows:
where $w_{nj}$ represents the weight of the $n$th single online review on the attribute $c_j$, and $d_j^1$ represents the sentiment score of the overall consumer towards the attribute $c_j$. $d_{nj}^2$ denotes the sentiment score of the $n$th single online review and $N$ is the total number of online reviews towards the attribute $c_j$. With reference to prospect theory (Tversky & Kahneman, 1992), this paper sets $a_1 = b_1 = 2.25$ and $a_2 = b_2 = 0.88$. We normalize the weights of online reviews, and obtain the final weight of the single text review:
Figure 8 shows the weight curve of single online reviews when consumers are risk averse and have different sentiment scores (where $d_j^1$ is between 0.2 and 0.8). As can be seen from Figure 8, when the sentiment score of a single online review is greater than 0.5, the weight is fixed. When the sentiment score of a single online review is less than 0.5, the weight changes with the sentiment score. The changing trend reflects the degree of risk aversion, which is embodied in the following two points. Firstly, when the sentiment score $d_j^1$ is fixed, the smaller the score $d_{nj}^2$ is, the greater the weight $w_{nj}$ is. In other words, when the risk attitude of overall consumers is fixed, the more negative the sentiment of a single online review is, the greater its weight. Secondly, when the sentiment score $d_{nj}^2$ is fixed, the smaller the score $d_j^1$ is, the greater the weight $w_{nj}$ is. In other words, when the sentiment score of a single online review is fixed, the more negative the sentiment of overall consumers is, the greater the weight. These two points show that the weight curve can amplify the influence of negative reviews. Similarly, when consumers are risk seeking, the weight curve can amplify the impact of positive reviews.
(3) According to the weight of the single online review, the attribute performance can be obtained by the DAWA operator: $\mathrm{DAWA}\left(d_{1j}^2, d_{2j}^2, \ldots, d_{nj}^2, \ldots, d_{Nj}^2\right)$.
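The aggregation step can be illustrated with a short Python sketch under two loudly stated assumptions. First, the aggregation is treated as a plain weighted average with normalized weights. Second, `illustrative_raw_weight` is a hypothetical placeholder curve, not the weight curve of this paper: it only mimics the qualitative behaviour described for Figure 8 (fixed weight on one side of 0.5, growing weight on the other) and ignores the dependence on the overall sentiment score.

```python
def illustrative_raw_weight(score, attitude):
    """Hypothetical weight curve on scores in [0, 1]; NOT the paper's formula.

    Risk averse: reviews scoring below 0.5 gain weight as they get more negative.
    Risk seeking: reviews scoring above 0.5 gain weight as they get more positive.
    Risk neutral: every review has the same weight.
    """
    if attitude == "averse":
        return 1.0 + max(0.0, 0.5 - score)
    if attitude == "seeking":
        return 1.0 + max(0.0, score - 0.5)
    if attitude == "neutral":
        return 1.0
    raise ValueError(f"unknown risk attitude: {attitude}")


def aggregate(scores, attitude):
    """Normalize the raw weights and return (weighted average, normalized weights)."""
    raw = [illustrative_raw_weight(s, attitude) for s in scores]
    total = sum(raw)
    weights = [w / total for w in raw]
    performance = sum(w * s for w, s in zip(weights, scores))
    return performance, weights


scores = [0.1, 0.5, 0.9]  # sentiment scores of three single reviews on one attribute
for attitude in ("averse", "neutral", "seeking"):
    performance, weights = aggregate(scores, attitude)
    print(attitude, round(performance, 4))
```

With the same three reviews, the risk-averse aggregate falls below the plain average and the risk-seeking aggregate rises above it, which is the amplification effect the section describes.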

Methodology
In fierce market competition, accurately understanding consumer demands and satisfaction in order to develop products or improve services has become the key for enterprises to remain competitive (Yang et al., 2019). Previous studies are mainly based on questionnaires or interviews with consumers, which have disadvantages such as high cost, poor data quality and data that quickly become outdated (Culotta & Cutler, 2016). Online reviews are generated by consumers to express their experiences and opinions about online products. The opinions on product attributes extracted from online reviews are often more reliable than the results of questionnaires (Martí Bigorra et al., 2019; Xiao et al., 2016). In this section, the proposed methodology is introduced, which uses online reviews as the data source for product improvement analysis. The proposed framework based on online reviews is illustrated in Figure 9.
In this section, the effects of product attributes on consumer total utility are first measured by the consumer utility model based on online reviews. Then, based on these effects, an improved Kano model is proposed to identify the types of product attributes. Thirdly, different attribute types are illustrated from the perspective of risk attitude, and consumer aspirations are mined based on the risk attitudes toward different attributes and the impact of each attribute on consumer satisfaction. The quantified satisfaction functions are constructed according to the risk attitudes and aspirations of different attributes. Finally, objective and accurate improvement suggestions are provided by considering attribute types, consumer aspirations and consumer attribute satisfaction functions comprehensively.

Consumer utility model based on online reviews
To analyze the impact of different product attributes on consumer utility, the consumer utility model is constructed. Consumer satisfaction with online products is usually determined not by a single attribute, but by comprehensive consideration of the performance of all attributes. Therefore, the consumer utility of a product can be seen as the result of the combined effect of multiple product attributes. The conjoint analysis method (Green & Srinivasan, 1978), as a quantitative research method, can analyze the impact of different product attributes on consumer utility and the relative importance of product attributes.

[Figure 9. The proposed framework based on online reviews; recoverable box label: aggregate attributes performance]
There are three types of models in the conjoint analysis method: vector models, ideal-point models and part-worth function models. Unlike the first two, the part-worth function model allows a different function shape for each attribute, which is more in line with the actual situation. The formula of the utility function model is as follows:
where $y$ represents the total utility value of the consumer, $x_j^{pos}$ represents the positive degree when consumers have positive sentiment towards the attribute $c_j$, and $x_j^{neg}$ represents the negative degree when consumers have negative sentiment towards the attribute $c_j$. When the attribute $c_j$ is not mentioned, both $x_j^{pos}$ and $x_j^{neg}$ are 0. $b_j^{pos}$ represents the influence of the attribute $c_j$ on consumer total utility when consumers have positive sentiments towards it, and $b_j^{neg}$ represents the influence of the attribute $c_j$ on consumer total utility when consumers have negative sentiments towards it.
Previous studies on consumer utility models based on online reviews usually set the utility values to 1 (positive sentiment), 0 (neutral sentiment) or -1 (negative sentiment) according to the sentiment of online reviews. However, the proposed consumer utility model quantifies the sentiment degrees of different sentiments. The total utility value $y$ is the total sentiment score of the single online review. The positive degree $x_j^{pos}$ and the negative degree $x_j^{neg}$ are sentiment scores about the attribute $c_j$ extracted from the single online review.
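One way to estimate the influence coefficients is sketched below, assuming that the utility model is linear and additive in the positive and negative degrees, that it includes an intercept, and that it is fitted by ordinary least squares. These are assumptions for illustration; the estimation procedure and the exact functional form used in this paper are not reproduced here, and the synthetic data are not results.

```python
import numpy as np


def fit_part_worths(x_pos, x_neg, y):
    """Least-squares fit of y on an intercept plus positive and negative degrees.

    x_pos, x_neg: arrays of shape (reviews, attributes); y: array of shape (reviews,).
    Returns (intercept, b_pos, b_neg).
    """
    n_reviews, n_attrs = x_pos.shape
    design = np.hstack([np.ones((n_reviews, 1)), x_pos, x_neg])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:1 + n_attrs], coef[1 + n_attrs:]


# Synthetic, noise-free data with known coefficients (purely illustrative).
rng = np.random.default_rng(0)
n_reviews, n_attrs = 200, 3
degree = rng.uniform(0.0, 1.0, size=(n_reviews, n_attrs))
# 0 = attribute not mentioned, 1 = positive mention, 2 = negative mention
kind = rng.integers(0, 3, size=(n_reviews, n_attrs))
x_pos = np.where(kind == 1, degree, 0.0)
x_neg = np.where(kind == 2, degree, 0.0)
true_pos = np.array([0.3, 0.1, 0.5])
true_neg = np.array([-0.6, -0.1, -0.2])
y = 0.5 + x_pos @ true_pos + x_neg @ true_neg

intercept, b_pos, b_neg = fit_part_worths(x_pos, x_neg, y)
print(np.round(b_pos, 3), np.round(b_neg, 3))
```

In this synthetic example the first attribute has a much larger negative than positive coefficient, the pattern the text associates with must-be attributes, while the third shows the opposite pattern.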

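Attribute classification based on the improved Kano model
[Figure: characteristics of the three attribute types in online reviews. Must-be attributes: the impact degree of negative sentiments on consumer total utility is much stronger than that of positive sentiments. One-dimensional attributes: the impact degree of positive sentiments on consumer total utility is similar to that of negative sentiments. Attractive attributes: the impact degree of positive sentiments on consumer total utility is much stronger than that of negative sentiments.]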
According to the characteristics of the three attribute types in online reviews, the improved Kano model is proposed based on the impact degree of positive and negative sentiments on consumer total utility. The steps of attribute classification are as follows. Firstly, the influence index and the satisfaction index are defined according to the impact degree of positive and negative sentiments on consumer total utility.
Influence index:
Secondly, before the attributes are classified, the importance of each attribute is determined by the influence index.
If $l_j < l_0$, then the attribute is considered undifferentiated. The performance of this attribute will not have a significant impact on consumer satisfaction, so there is no need to classify the attribute type. If $l_j > l_0$, then the performance of this attribute is considered to have a significant impact on consumer satisfaction, and the attribute type should be distinguished.
Finally, three attribute types are divided according to the satisfaction index. The specific division is shown in Figure 11.

[Figure 11. Division of attribute types; recoverable labels: one-dimensional attributes, attractive attributes, must-be attributes, indifference attributes]
(1) Must-be attributes. Consumers seldom mention must-be attributes in online reviews unless they are dissatisfied, so the impact degree of negative sentiments on consumer total utility is much stronger than that of positive sentiments. The influence index must exceed $l_0$ and the satisfaction index must be below $l_1$.
(2) One-dimensional attributes. Consumers often mention one-dimensional attributes in online reviews. The impact degree of positive sentiments on consumer total utility is similar to that of negative sentiments. The influence index must exceed $l_0$ and the satisfaction index must lie between $l_1$ and $l_2$.
(3) Attractive attributes. Attractive attributes are attributes that consumers do not expect excessively, so the impact degree of positive sentiments on consumer total utility is much stronger than that of negative sentiments. The influence index must exceed $l_0$ and the satisfaction index must exceed $l_2$.
Kano et al. (1984) divided attributes into different types according to the influence of attribute performance on consumer satisfaction, but they did not quantify the relationship between attribute performance and consumer satisfaction. This section will mine the consumer aspirations according to the impact of different attribute types on consumer satisfaction and construct three quantified satisfaction functions to provide more objective and accurate management suggestions.
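The threshold rule in items (1) to (3) can be sketched in Python as follows. The function takes the two indices as inputs, since their formulas are not reproduced here; the threshold values in the example and the handling of exact ties (which the text does not specify) are assumptions.

```python
def classify_attribute(influence, satisfaction, l0, l1, l2):
    """Classify one attribute from its influence and satisfaction indices.

    l0 is the influence threshold; l1 < l2 are the satisfaction thresholds.
    Assumption: an influence index equal to l0 counts as significant, and a
    satisfaction index equal to l1 or l2 is treated as one-dimensional.
    """
    if not l1 < l2:
        raise ValueError("thresholds must satisfy l1 < l2")
    if influence < l0:
        return "indifference"
    if satisfaction < l1:
        return "must-be"
    if satisfaction > l2:
        return "attractive"
    return "one-dimensional"


# Hypothetical threshold values, chosen only for illustration.
L0, L1, L2 = 0.2, 0.8, 1.25
examples = {
    "wifi": (0.05, 1.0),
    "cleanliness": (0.6, 0.4),
    "service": (0.7, 1.0),
    "lounge": (0.5, 2.0),
}
for name, (influence, satisfaction) in examples.items():
    print(name, classify_attribute(influence, satisfaction, L0, L1, L2))
```

Screening on the influence index first means that an attribute with negligible impact is never assigned one of the three Kano types, whatever its satisfaction index.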

Construction of satisfaction functions based on aspirations
According to the relationship between consumer satisfaction and attribute performance for the different attribute types, the risk attitudes of consumers toward the three attribute types can be obtained. (1) Attractive attributes are attributes that consumers do not expect excessively. According to the relationship between consumer satisfaction and attribute performance in the Kano model, consumers will not show extreme dissatisfaction when the attribute performance is poor, but consumer satisfaction will increase sharply when the attribute performance is good. This characteristic of attractive attributes indicates that consumers are risk seeking. When evaluating attractive attributes of online products, consumers tend to provide positive online reviews. This also reflects that consumers are more tolerant of the performance of attractive attributes and are risk seeking towards them. (2) One-dimensional attributes are attributes for which consumer satisfaction is in direct proportion to the attribute performance. According to the Kano model, the better the one-dimensional attributes perform, the higher the consumer satisfaction will be. The proportions of positive and negative online reviews given by consumers for one-dimensional attributes are almost equal. Therefore, consumers are risk neutral toward one-dimensional attributes. (3) Must-be attributes are considered basic attributes by consumers. Consumers will not be extremely satisfied with good attribute performance, but they will be strongly dissatisfied when the attribute performance is poor. When evaluating must-be attributes of online products, consumers have higher requirements for the attribute performance, which leads to more negative reviews than positive ones. All of the above indicates that consumers are risk averse toward must-be attributes.
Prospect theory (Kahneman & Tversky, 1979) is one of the most influential descriptive models for analyzing decision behavior, and it describes the risk attitude of people when making decisions. The value function is defined as follows:

$$v(x) = \begin{cases} x^{\alpha}, & x \geq 0 \\ -\theta(-x)^{\beta}, & x < 0 \end{cases}$$

where $\alpha$ and $\beta$ are the coefficients of the decision-maker's risk attitude and $\theta$ indicates the loss-averse coefficient. Tversky and Kahneman (1992) proposed that when $\alpha = \beta = 0.88$ and $\theta = 2.25$, the results obtained by the value function are more consistent with the original data.
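As an illustration of how the value function treats gains and losses asymmetrically, the following minimal sketch evaluates it with the parameter values reported by Tversky and Kahneman (1992). The function name and the argument names `alpha`, `beta` and `theta` (the two risk-attitude coefficients and the loss-averse coefficient) are chosen here for readability and are not taken from the paper.

```python
def prospect_value(x, alpha=0.88, beta=0.88, theta=2.25):
    """Prospect-theory value of an outcome x relative to the reference point 0."""
    if x >= 0:
        return x ** alpha             # gains: concave when 0 < alpha < 1
    return -theta * (-x) ** beta      # losses: scaled up by the loss-averse coefficient


gain = prospect_value(0.5)    # value of a gain of 0.5
loss = prospect_value(-0.5)   # value of a loss of the same size
# With these parameters a loss weighs more than an equally sized gain.
print(gain, loss)
```

With equal exponents, the ratio between the magnitudes of `loss` and `gain` is exactly `theta`, which is why the curve is steeper below the reference point.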
Based on the description of risk attitudes in prospect theory and the characteristics of the three attribute types, the consumer satisfaction function is constructed. The functions for the different attribute types are set as follows, and the specific function graphs are shown in Table 1.
[...], when $c_j$ is an attractive attribute;
[...], when $c_j$ is a one-dimensional attribute;
[...], when $c_j$ is a must-be attribute. (12)

Besides, consumer satisfaction is affected not only by attribute types, risk attitude and attribute performance, but also by consumer aspiration. The concept of aspiration was proposed by Simon (1956) in the satisfaction model considering psychological behavior. Simon believed that consumer satisfaction depends on the degree to which the attribute performance satisfies the aspiration. Since the proposed method aims to provide suggestions for product improvement, the aspiration mentioned in this paper is the expectation level of consumers overall for attribute performance, not the expectation level of a particular consumer.
Theoretically, when the attribute performance reaches the optimal level ($\delta_j = 1$), the consumer is completely satisfied. Suppose that the consumer utility is ES at this point. When the attribute performance is at the worst level ($\delta_j = 0$), the consumer is completely dissatisfied and the consumer utility is DS. However, because of consumer aspiration, consumer satisfaction does not always change with the attribute performance: once the attribute performance meets the aspiration, consumer satisfaction no longer varies with it. Therefore, it is very important to obtain the consumer aspirations for different attributes in order to construct the consumer satisfaction function.
The following is the specific process of constructing the consumer satisfaction function.
(1) The risk attitudes of consumers toward different attributes are determined by the attribute types.
(2) The weights of attributes are determined according to the influence range of different attributes. The influence range of the attribute is defined as follows: [...]

When the satisfaction reaches 1, it will not increase further no matter how much the attribute performance increases. When the satisfaction reaches -1, it will not decrease further. Therefore, $ES_j$ and $DS_j$ should satisfy $0 \leq ES_j \leq 1$ and $-1 \leq DS_j \leq 0$.
If $ES_j > 1$ ($DS_j < -1$), the consumer satisfaction (dissatisfaction) has reached its maximum before $\delta_j = 1$ ($\delta_j = 0$). In this case, the satisfaction function should be a piecewise function, and the attribute performance corresponding to the inflection point is the consumer aspiration. The positive aspiration $\delta_j^+$ can be solved by letting [...], and the negative aspiration $\delta_j^-$ can be obtained by letting [...].
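The idea of an aspiration as the inflection point of a piecewise satisfaction function can be sketched numerically. The code below is only an illustration under stated assumptions: it takes any non-decreasing, unclipped satisfaction function on the performance range [0, 1], finds by bisection the performance level at which satisfaction first reaches its upper bound of 1, and clips the function at the bounds. The helper names and the linear example function are hypothetical and are not the functional forms of the paper.

```python
def find_aspiration(satisfaction, cap=1.0, tol=1e-9):
    """Smallest performance level in [0, 1] at which a non-decreasing
    satisfaction function reaches `cap`; None if it never does."""
    if satisfaction(1.0) < cap:
        return None                   # bound not reached: no inflection point
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if satisfaction(mid) >= cap:
            hi = mid                  # bound already reached at mid
        else:
            lo = mid                  # bound reached later than mid
    return hi


def clipped(satisfaction, lower=-1.0, upper=1.0):
    """Piecewise version: satisfaction stops changing once a bound is hit."""
    return lambda d: max(lower, min(upper, satisfaction(d)))


# Hypothetical unclipped function whose value at d = 1 is 1.25, i.e. above 1.
raw = lambda d: 2.0 * d - 0.75
d_plus = find_aspiration(raw)         # positive aspiration of the example
final = clipped(raw)                  # constant at 1 beyond d_plus
```

The negative aspiration could be located in the same way by searching for the level below which the function stays at -1.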

Improvement strategy analysis
The improvement strategy mainly considers the priority of attribute improvement in the case of limited resources. After considering attribute types, attribute aspirations and consumer satisfaction functions, a more reasonable improvement strategy can be developed. Firstly, across different attribute types, the priority of attribute improvement is formulated based on the attribute types. According to the different influences of attribute types on consumer satisfaction, attributes should be improved in the order of must-be attributes, one-dimensional attributes and attractive attributes.
Secondly, within the same attribute type, the priority of attribute improvement is assigned according to the attribute aspiration and the quantified consumer satisfaction function.
Finally, it is necessary to consider not only the attribute types and the quantified consumer satisfaction function, but also the current attribute performance and the attribute satisfaction. The attributes with poor performance and low satisfaction should be improved first.
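The ordering rules above can be sketched as a two-level sort. This is a simplified illustration, not the full procedure: it ranks by attribute type first and, within a type, by the lowest current satisfaction, and it omits the comparison of satisfaction gains derived from the satisfaction functions. The attribute names and satisfaction values are hypothetical.

```python
# Type priority stated in the text: must-be, then one-dimensional, then attractive.
TYPE_RANK = {"must-be": 0, "one-dimensional": 1, "attractive": 2}


def improvement_order(attributes):
    """attributes maps a name to (attribute type, current satisfaction).
    Returns names sorted by type priority, then by lowest satisfaction."""
    return sorted(
        attributes,
        key=lambda name: (TYPE_RANK[attributes[name][0]], attributes[name][1]),
    )


# Hypothetical attributes and satisfaction values, for illustration only.
example = {
    "a": ("attractive", 0.7),
    "b": ("must-be", -0.2),
    "c": ("one-dimensional", 0.1),
    "d": ("must-be", -0.5),
    "e": ("one-dimensional", -0.1),
}
order = improvement_order(example)
print(order)
```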

Application and discussion
In this section, a practical example and a discussion of the results are presented to illustrate the effectiveness of the proposed method.

Case study
TripAdvisor is one of the most famous e-commerce sites in the tourism industry and encourages users from all around the world to share experiences about travel destinations and hotel services (Neirotti et al., 2016). The users often evaluate different attributes based on their preferences and share their experiences through online reviews (Tan et al., 2018). The online reviews have an important impact on the reputation of hotels, so they are often used by hotel managers to analyze consumer satisfaction (Yen & Tang, 2015). The practical example in this paper uses online reviews crawled from TripAdvisor to analyze product improvement.
Step 1: Crawl online reviews and extract useful information from online reviews.
Figure 12 shows the online information on TripAdvisor, including the personal details, overall rating, online reviews and the online ratings of different attributes. The online reviews are an important information source for hotels to understand consumer satisfaction. In this example, 2106 online reviews of the Hilton Beijing Wangfujing hotel were obtained through crawling. After preprocessing the obtained reviews and eliminating the reviews with fewer than 5 words, 2022 valid online reviews remained for subsequent analysis. The extraction of useful information from online reviews is divided into two parts: attribute extraction and sentiment analysis.
Firstly, Leximancer is used to extract the attributes that consumers are concerned about and the corresponding keywords from online reviews. According to the analysis results of Leximancer, seven product attributes $C = \{c_1, c_2, c_3, c_4, c_5, c_6, c_7\}$ are determined. The seven product attributes and corresponding keywords are shown in Table 2. Based on the attributes and corresponding keywords, the sentiment information of online reviews is analyzed by Stanford CoreNLP, and the analysis results are presented as PLTS. The final result of sentiment analysis is composed of three parts: the sentiment PLTS of the different attributes mentioned in each single online review, the total sentiment PLTS of each single online review, and the sentiment PLTS of consumers overall toward the different attributes. Since there are often several sentences and multiple attributes in a single online review, only part of the data under the "location" attribute is shown in Table 3.
The PLTS score function proposed in Section 2.3 is used to calculate the total utility of each single online review and the performance scores of the different attributes. According to the sentiment of consumers toward each attribute in the single online review, the performance score of the attribute $c_j$ is divided into two categories: the positive sentiment performance score $x_j^{pos}$ and the negative sentiment performance score $x_j^{neg}$. Some results are shown in Table 4.

Step 2: According to the total utility and sentiment performance scores of all single online reviews, we construct the consumer utility model by Eq. (8).

The online reviews have already been quantified as numeric sentiment scores after sentiment analysis and the calculation of the PLTS score function. According to the sentiment scores of different attributes and the total utility in each single online review, the influences of each attribute on the consumer utility, $\beta_j^{pos}$ and $\beta_j^{neg}$, can be analyzed through the conjoint analysis method. In this paper, the multiple linear regression method is adopted to estimate the coefficients in the consumer utility model, and the specific process is implemented in SPSS. The final expression of consumer utility is as follows: [...]

The positive influence degree $\beta_j^{pos}$ and the negative influence degree $\beta_j^{neg}$ on total consumer utility are shown in Table 5.

Step 3: Classify attributes according to the coefficients of the consumer utility model.

According to the coefficients of the consumer utility model, the normalized positive and negative influence degrees $\beta'^{pos}_j$ and $\beta'^{neg}_j$ of each attribute can be obtained. The influence index $\gamma_j$ and the satisfaction index $\lambda_j$ of each attribute are calculated according to Eq. (9) and Eq. (10), and the attribute types are determined from these two indices. The index values, computed from Table 5, are shown in Table 6. The thresholds of attribute classification are defined subjectively. Following existing research (Bi et al., 2020; Caber et al., 2013), one tenth of the maximum utility value is taken as the threshold that determines whether an attribute is influential, i.e., the threshold for $\gamma_j$ is 0.1. The low and high thresholds of the satisfaction index are $\lambda_1 = 0.8$ and $\lambda_2 = 1.6$, respectively. The classification results are as follows: $c_2$ and $c_6$ are must-be attributes; $c_1$, $c_3$ and $c_4$ are one-dimensional attributes; $c_5$ is the attractive attribute; $c_7$ is the indifference attribute.
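The regression in Step 2 can be illustrated with a small least-squares sketch. It is not the SPSS procedure used in the paper: it assumes a linear utility model without an intercept whose predictors are the positive and negative performance scores of each attribute, and it solves the normal equations in plain Python. The data, the two-attribute layout and the "true" coefficients are synthetic and serve only to show that the coefficients are recovered.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_utility(rows, utilities):
    """Least-squares coefficients of y = sum_k coef_k * x_k (no intercept)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, utilities)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic reviews; columns are [x_pos_1, x_neg_1, x_pos_2, x_neg_2].
rows = [
    [0.8, 0.0, 0.5, 0.0],
    [0.0, 0.6, 0.7, 0.0],
    [0.9, 0.0, 0.0, 0.4],
    [0.0, 0.3, 0.0, 0.9],
    [0.5, 0.0, 0.2, 0.0],
    [0.0, 0.7, 0.0, 0.1],
]
true_coefs = [0.4, -0.6, 0.3, -0.2]                      # hypothetical values
utilities = [sum(c * x for c, x in zip(true_coefs, r)) for r in rows]
estimated = fit_utility(rows, utilities)
```

In practice a statistics package would also report standard errors and significance tests, which this sketch leaves out.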
Step 4: Calculate the weight of each attribute and the ES and DS values.
According to $\beta'^{pos}_j$ and $\beta'^{neg}_j$, the attribute weights can be calculated by Eq. (14), and the values of ES and DS for each attribute satisfaction function can be calculated according to Eq. (15) and Eq. (16). The final results are shown in Table 7.

Step 5: Calculate the weights of the single online reviews under each attribute according to the sentiment score of consumers overall, and then aggregate the single online reviews to determine attribute performances.

Firstly, we obtain the total sentiment score of each single online review under different attributes by using Stanford CoreNLP. Then, we determine the probabilities of the different sentiment labels over all single online reviews and obtain the overall PLTS of consumers under different attributes. The sentiment scores of the PLTS can be calculated by the PLTS score functions. Finally, the weight of each single online review can be obtained by the weight curves proposed in Section 2.4, and the final performance scores of different attributes can be obtained by the DAWA operator. The results are shown in Table 8.

Step 6: Determine the aspirations for consumer satisfaction functions according to the attribute types and the values of ES and DS, from which the final expressions of the satisfaction functions can be determined.

Check whether the ES and DS values of different attributes exceed the actual range ($0 \leq ES_j \leq 1$, $-1 \leq DS_j \leq 0$). If $ES_j > 1$ or $DS_j < -1$, there is an inflection point beyond which consumer satisfaction no longer changes with the attribute performance. As can be seen from Table 7, the ES and DS values of all attributes exceed the actual range. According to the attribute types and the corresponding forms of the consumer satisfaction function, the inflection points of the satisfaction functions can be determined. The final satisfaction functions of the attributes are shown in Table 9, and their graphs are shown in Figure 13.
[Table 9 residue: a piecewise satisfaction function on the intervals 0 to 0.028, 0.028 to 0.9265, and 0.9265 to 1, with recoverable coefficients 2.2260 and 1.0624]

Step 7: Considering attribute types, attribute aspirations and consumer satisfaction functions, the suggestions for product improvement are derived; they are presented in Section 4.3.

Summary of results
This study proposes an improved Kano model and defines the quantitative satisfaction functions based on the Kano model. Firstly, Leximancer was used to extract 7 product attributes from the 2022 valid online reviews from TripAdvisor. These attributes are basically consistent with those extracted in previous studies (Bi et al., 2019b). Then, Stanford CoreNLP was used to conduct sentiment analysis on the 7 attributes, and the consumer utility model was established according to the results of sentiment analysis. It can be seen from the results of the consumer utility model in Table 5 that the attribute $c_7$ has the smallest effect on consumer total utility and $c_5$ has the largest effect. According to Table 5 and Table 6, the classification results of the attributes can be obtained. The must-be attributes are $c_2$ and $c_6$; the one-dimensional attributes are $c_1$, $c_3$ and $c_4$; the attribute $c_5$ is the attractive attribute; and the attribute $c_7$ is the indifference attribute. The final classification of attribute types is basically consistent with the impact of each attribute on consumer utility. Table 8 shows the final attribute performances: the performances of the attributes $c_2$ and $c_6$ are poor, and that of the attribute $c_5$ is the best. Table 9 shows the satisfaction function expressions of the different attributes, and Figure 13 depicts the graphs of the consumer satisfaction functions. The purpose of this paper is to provide decision support for product improvement. Therefore, the final suggestions are made based on the above attribute types, attribute aspirations and consumer satisfaction functions.

The suggestions for product improvement
Considering three aspects of attribute types, attribute aspirations and corresponding satisfaction functions, the product improvement suggestions for the above case are given.
According to the attribute types, $c_2$ and $c_6$ are must-be attributes. When they perform poorly, consumers become strongly dissatisfied. Therefore, must-be attributes should be improved first. Combined with Table 9 and Figure 13, it can be seen that the performance scores of the attributes $c_2$ and $c_6$ are less than 0.5 and that consumers are dissatisfied with the current attribute performances. As can be seen from Figure 13, consumers are more dissatisfied with the attribute $c_2$ than with the attribute $c_6$. Therefore, it is better to improve the performance of the attribute $c_2$ first.

$c_1$, $c_3$ and $c_4$ are one-dimensional attributes, meaning that consumer satisfaction increases with attribute performance. In this case, an attribute whose performance causes dissatisfaction should be improved first. If no attribute causes dissatisfaction, the priority of attribute improvement should be determined according to how much satisfaction increases for the same improvement in attribute performance. Combined with Table 9 and Figure 13, the current performance score of $c_4$ has caused consumer dissatisfaction, so $c_4$ should be improved first. The consumer satisfaction of $c_1$ is worse than that of $c_3$, and for the same improvement in attribute performance, the gain in satisfaction for the attribute $c_1$ is greater than that for the attribute $c_3$. Therefore, these attributes should be improved in the order $c_4$, $c_1$, $c_3$.

$c_5$ is the attractive attribute. Its current attribute performance and consumer satisfaction are the highest among all attributes, so $c_5$ can be considered last when improving attribute performance.
To sum up, the suggestion for product improvement is that the attributes should be improved in the order $c_2$, $c_6$, $c_4$, $c_1$, $c_3$ and $c_5$.

Discussion of results
In this paper, the multiple linear regression method is adopted to estimate the coefficients in the consumer utility model. In order to better analyze the positive and negative sentiment degrees of consumers toward the attributes and to eliminate attributes that have little impact on the total utility, stepwise regression is used. Stepwise regression enters into the equation the independent variable that contributes most to the dependent variable and removes independent variables with a small F value; the process is repeated until all independent variables have been examined. The result is shown in Table 10, which summarizes the fitting process. There are seven attributes in this case, and each attribute is divided into two variables, $x_j^{pos}$ and $x_j^{neg}$, so there are 14 independent variables in total. The stepwise regression produces 14 models, which means that all the variables enter the equation. The coefficient $R^2$ increases gradually during the stepwise regression; this is mainly due to the increasing number of variables in the equation and does not by itself mean that the model is getting better. The adjusted $R^2$, however, corrects for the number of variables and reflects the goodness of fit more accurately. From Table 10, it can be concluded that the regression model fits well.

In addition, there are 14 independent variables in the regression model of this case. Although these independent variables affect the dependent variable, whether they are related to each other still requires a collinearity diagnosis, which is therefore also performed during the regression analysis. It can be seen from Table 10 that all Tolerance values are greater than 0.6 and the VIF (Variance Inflation Factor) values are not large (less than 5), so it can be concluded that there is no obvious collinearity among the 14 independent variables. That is to say, the performances of the seven attributes have a significant impact on the consumer utility, but they do not affect each other.
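The two diagnostics mentioned above follow from standard formulas, sketched below. The adjusted $R^2$ formula assumes a model with an intercept, and VIF is the reciprocal of Tolerance. The sample size (2022 reviews) and the number of predictors (14) come from the case study; the $R^2$ value of 0.80 is a hypothetical input, not a result reported in the paper.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2: penalises R^2 for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def vif_from_tolerance(tolerance):
    """VIF is the reciprocal of tolerance."""
    return 1 / tolerance


# Hypothetical R^2 with the case-study sample size and predictor count.
adj = adjusted_r2(0.80, n_obs=2022, n_predictors=14)

# A tolerance above 0.6 implies a VIF below 1 / 0.6, well under the cut-off of 5.
worst_case_vif = vif_from_tolerance(0.6)
print(adj, worst_case_vif)
```

With about two thousand observations the penalty for 14 predictors is small, so $R^2$ and adjusted $R^2$ stay close to each other.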
Figure 14(a) is the residual histogram, from which it can be seen that the residuals approximately follow a normal distribution. Figure 14(b) is the cumulative-probability plot (P-P plot), in which the points all lie around the diagonal line. In other words, the residuals of the data (the curve in the plot) are distributed around the assumed line (normal distribution), which indicates that the residuals basically conform to the normal distribution. In conclusion, the distribution of residuals passes the normality test. Figure 15 shows the scatterplot of the predicted values and the studentized residuals. There is no obvious relationship between the predicted value and the studentized residual, and most observations are randomly distributed within ±2 on the horizontal and vertical axes. It can be concluded that the regression equation conforms to the homoscedasticity hypothesis and fits well.

Comparative analysis
This section will carry out some comparative analysis to illustrate the effectiveness of the proposed method.

The comparative analysis with another method
In order to better illustrate the effectiveness of the proposed method, this section will compare the proposed method with the Kano model based on online reviews proposed by Qi et al. (2016).
The formula of the utility function in the model of Qi et al. (2016) is the same as Eq. (8) in Section 3.1, but the parameters have different meanings. In the model proposed by Qi et al. (2016), $y$ is the consumer utility, and the value of $y$ is 1, 0 or -1 according to the overall sentiment of consumers. $x_j^{pos} = 1$ and $x_j^{neg} = 1$ denote that the consumer has positive or negative sentiment toward the attribute $c_j$, respectively. If the attribute $c_j$ is not mentioned, $x_j^{pos} = x_j^{neg} = 0$. $\beta_j^{pos}$ and $\beta_j^{neg}$ are the preferences for the positive or negative sentiment, respectively. Using the case above, the comparative results of the two models are presented in Table 12.
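The indicator coding described for the model of Qi et al. (2016) can be sketched as follows. The function name, the dictionary input format and the attribute labels are assumptions made for this illustration; only the coding rule itself (1 for a mentioned positive or negative sentiment, both indicators 0 when the attribute is not mentioned) is taken from the description above.

```python
def encode_review(attribute_sentiments, attributes):
    """attribute_sentiments maps each mentioned attribute to 'pos' or 'neg'.
    Returns the 0/1 indicators [x_pos_1, x_neg_1, x_pos_2, x_neg_2, ...]."""
    row = []
    for name in attributes:
        label = attribute_sentiments.get(name)    # None when not mentioned
        row.extend([int(label == "pos"), int(label == "neg")])
    return row


attributes = ["c1", "c2", "c3"]
# A review that praises c1, does not mention c2, and criticises c3.
row = encode_review({"c1": "pos", "c3": "neg"}, attributes)
print(row)
```

Because every indicator is 0 or 1, this coding records only the direction of a sentiment and not its degree.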
From Tables 12 and 13, the following observations can be drawn. The impact of attribute performance (positive and negative) on customer satisfaction obtained with the proposed method is larger than that obtained with the method proposed by Qi et al. (2016). The attribute types of the seven attributes classified by the two methods are roughly the same, but differences remain; for example, the types of the attributes $c_1$ and $c_4$ differ. This result indicates that, by quantifying the sentiment degree, the method presented in this paper differentiates more strongly when analyzing the impact of attribute performance on consumer satisfaction, which in turn affects the attribute classification.
There are two main reasons for the differences. Firstly, the total consumer utility in the method proposed by Qi et al. (2016) is divided into three grades, namely -1, 0 and 1. This setting is too simple to quantify the degree of consumer utility. Secondly, the sentiment of consumers toward attributes is classified strictly as positive or negative, which is inconsistent with the actual situation. Quantifying the sentiment degree allows the influence of attributes to be analyzed accurately because consumers tend to have different degrees of sentiment toward different attributes. The method proposed in this paper addresses both points. In addition, attributes are ranked by comparing their weights in the method proposed by Qi et al. (2016), which is not directly useful for product improvement. On the basis of attribute classification, the method proposed in this paper further considers the attribute aspirations and consumer satisfaction functions, and can therefore directly put forward effective suggestions for product improvement.

The comparative analysis with different aggregation of online reviews
The paper proposes three weight curves that consider the different risk attitudes of consumers toward attribute types. In order to illustrate the effectiveness of the weight curves, we compare this method with the previous method, which uses the average operator to aggregate online reviews. The weight curves, which reflect the emphasis of consumers on positive and negative reviews under different attribute types, directly affect the final performance scores. The final performance scores under different aggregations of online reviews are given in Table 14.
From Table 14, we can draw the following observations. The performance scores of the one-dimensional attributes ($c_1$, $c_3$, $c_4$) remain unchanged when different aggregation methods are used. The performance scores of the must-be attributes ($c_2$, $c_6$) using the weight curves are smaller than those using the average operator. The performance score of the attractive attribute ($c_5$) using the weight curve is larger than that using the average operator. From the perspective of risk attitude, these observations can be explained as follows. According to the relationship between consumer satisfaction and attribute performance of one-dimensional attributes, consumers tend to be risk neutral toward one-dimensional attributes. Therefore, there is no need to magnify the impact of positive or negative reviews when aggregating online reviews, and the weight curves have no effect on the performance scores of these attributes. However, the characteristic of must-be attributes indicates that consumers are risk averse, which means that consumers are more sensitive to negative reviews. The weight curves of must-be attributes magnify the influence of negative reviews, so the performance scores of the must-be attributes are smaller. Moreover, consumers are risk seeking toward attractive attributes, which means that consumers pay more attention to positive reviews. The weight curves of attractive attributes magnify the influence of positive reviews, so the performance scores of the attractive attributes are larger than those obtained with the average operator.
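The direction of these effects can be reproduced with a toy weighted mean. The sketch below does not implement the weight curves of Section 2.4 or the DAWA operator; it uses hypothetical linear weights that emphasise low scores for a risk-averse attitude and high scores for a risk-seeking attitude, and the review scores are invented for illustration.

```python
def aggregate(scores, attitude="neutral", emphasis=1.0):
    """Weighted mean of review scores in [0, 1].
    averse: low scores get larger weights; seeking: high scores do;
    neutral: all weights equal, i.e. the plain average."""
    if attitude == "averse":
        weights = [1 + emphasis * (1 - s) for s in scores]
    elif attitude == "seeking":
        weights = [1 + emphasis * s for s in scores]
    else:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


scores = [0.2, 0.5, 0.9]                 # hypothetical review scores
plain = aggregate(scores)                # average operator
must_be = aggregate(scores, "averse")    # negative reviews magnified
attractive = aggregate(scores, "seeking")  # positive reviews magnified
print(must_be, plain, attractive)
```

For any set of scores that are not all equal, the risk-averse weighting yields a value below the plain average and the risk-seeking weighting a value above it, matching the pattern described for Table 14.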
Because it affects the attribute performance score, the weight curve further affects the risk attitude coefficients and attribute aspirations of consumers in the satisfaction functions. The results are presented in Table 15. It can be seen from Table 15 that using the weight curve to aggregate online reviews affects the risk attitude coefficients and attribute aspirations of consumers in the satisfaction functions. The values of $\alpha_j$ and $\delta_j^-$ for the must-be attributes using the weight curves are smaller than those using the average operator. In other words, using the weight curves to aggregate online reviews describes the degree of risk aversion of consumers toward must-be attributes better than the average operator does. As for the attractive attributes, the values of $\alpha_j$ and $\delta_j$ using the weight curves are larger than those using the average operator, which means that the weight curves describe the degree of risk preference of consumers for attractive attributes better than the average operator does.
To sum up, using weight curves to aggregate attribute performance not only yields attribute performance scores that fully take attribute types into account, but also helps to further analyze the risk attitude coefficients and attribute aspirations of consumers toward different attribute types.

Conclusions
Based on the consumer sentiment reflected in online reviews, this paper gives the quantified satisfaction functions using the improved Kano model from the perspective of risk attitude and aspiration. The main innovations are as follows.
Firstly, this paper proposes an improved Kano model based on online reviews to identify attribute types, and the identified attribute types are illustrated from the perspective of risk attitude. Previous Kano model studies based on online reviews only identified attribute types, but did not further analyze the impact of attribute types on consumer satisfaction from the perspective of risk attitude.
Secondly, this paper mines the consumer aspirations based on the risk attitudes of different attributes and the attribute impact on consumer satisfaction. According to the risk attitudes and aspirations of different attributes, the quantified satisfaction functions are constructed to provide more objective and accurate improvement suggestions. Previous studies only analyzed the relationship between attribute performance and consumer satisfaction qualitatively, so their improvement suggestions are imprecise.
Thirdly, considering that consumers have different risk attitudes toward different attribute types, three weight curves that reflect the emphasis of consumers on positive and negative reviews under different attribute types are proposed. In order to obtain final attribute performances that are more in line with reality, different weight curves are used to aggregate the single online reviews for different attribute types.
Finally, this paper not only discriminates the positive and negative sentiments of attributes in online reviews, but also quantifies the degrees of positive and negative sentiments. The existing Kano model studies based on online reviews only identified positive and negative sentiments, but ignored their degrees.
This paper also has limitations. For example, it only considers the positive and negative sentiments of attributes for classification and does not consider the frequency with which consumers refer to attributes, which can be studied in further research.

Figure 1. The Kano model

Figure 7. An example of sentiment analysis using Stanford CoreNLP

Figure 8. The weight curve when the consumers are risk averse

Figure 9. The proposed framework based on online reviews Kano model

Martí Bigorra et al. (2019) pointed out that consumers have different demands for different product attributes, and different attributes have different influences on consumer satisfaction. The characteristics of different attribute types in the traditional Kano model can be reflected by the sentiment characteristics reflected in online reviews. The characteristics of the three attribute types are shown in Figure 10.

Figure 10. Feature comparison of three attribute types

Figure 11. Attribute types division based on the sentiment information (axes: the influence degree of negative sentiment; the influence degree of positive sentiment)

The consumer aspirations, which are the inflection points of the satisfaction functions, are calculated from the weights of attributes and the effect of attribute performance on consumer utility ($\beta'^{pos}_j$ and $\beta'^{neg}_j$). When analyzing the satisfaction function of each attribute, it is necessary to analyze the influence of the single attribute performance on the consumer utility. In the consumer utility model proposed in Section 3.1, $\beta_j^{pos}$ [...] attribute $c_j$. The consumer utility is affected by the weight of each attribute, so $\beta_j^{pos}$ and $\beta_j^{neg}$ can be considered as the consumer satisfaction and dissatisfaction about attribute $c_j$ under the influence of all attributes. According to the consumer utility model, $ES_j$, $DS_j$ and [...] of $ES_j$ and $DS_j$ can be solved by the above equation.

The consumer satisfaction function can be first obtained by [...] the current attribute performance, which can be obtained by the weight curve proposed in Section 2.4. $S_j^*$ is the current attribute satisfaction and can be obtained by quantifying the overall evaluation of the attribute $c_j$.

(4) The final consumer satisfaction functions can be obtained by calculating the aspirations.

Figure 12. The online information on TripAdvisor website

Figure 13. The graphs of satisfaction functions

Figure 14. Normality test of residuals: (a) residual histogram; (b) P-P plot

Table 1. Three types of consumer satisfaction functions

Table 2. The attributes and keywords extracted from online reviews

Table 3. Partial data under the attribute "location"

Table 4. Total utility and performance scores for attributes of each online review

Table 5. The positive and negative influence degree of each attribute

Table 6. The influence index $\gamma_j$ and the satisfaction index $\lambda_j$ of each attribute

Table 7. The results of each attribute

Table 8. Final performance scores of different attributes

Table 9. Satisfaction functions of different attributes

Table 10. Model summary

Table 11. The coefficients of the final regression equation

Table 12. The comparative result of the two models

Table 13. The comparison of the attribute types in different methods (columns: Attributes; Attribute types in the method proposed by Qi; Attribute types in the proposed method)

Table 14. Final performance scores under different aggregation of online reviews

Table 15. The risk attitude coefficients and aspirations under different aggregation of online reviews