A COMPARISON BETWEEN MRA AND ROUGH SET THEORY FOR MASS APPRAISAL. A CASE IN BARI

Rough Set Theory (RST) is a property valuation methodology recently applied to property market data (d'Amato, 2002). It may be applied in property markets where few market data are available or where econometric analysis may be difficult or unreliable. The methodology was introduced by a Polish mathematician (Pawlak, 1982). The model permits the appraiser to estimate the value of a property without defining an econometric model, although it does not give any estimate of marginal or hedonic prices. In the first version of RST it was necessary to organize the data in classes before the valuation. The relationships between these classes defined if-then rules: if a property belongs to a specific group, then it will belong to a class of value. The relationship between the property and the class of value is dichotomous. This paper offers a second version that improves RST with a "value tolerance relation" in order to make the rules more flexible. In this case the results come from an explicit and specific relationship. The methodology has been tested on 69 transactions in the zone of Carrassi-Poggiofranco in the residential property market of Bari.


INTRODUCTION
Mass appraisal is the systematic appraisal of groups of properties at a given date using standardized procedures and statistical testing. In the property market, econometric modelling tries to replicate market behaviour through a representative model. In particular, hedonic price modelling looks for an econometric relationship between the price and the property characteristics. This methodology (Griliches, 1971; Rosen, 1974) is based on demand-side analysis in a static framework. For this reason statistical data analysis has theoretical weaknesses (Lentz and Wang, 1998) and may not be efficient in those markets where uncertainty is high because of the unreliability of information sources. In these cases the relationship between the dependent variable (normally the property price) and the independent variables (the property characteristics) may not be econometrically modelled. In some property markets the price dynamic could be described through non-monotonic processes; therefore alternative "heretic approaches" (Kauko and d'Amato, 2004) have been proposed, such as neural networks (Borst, 1992; McCluskey et al., 1997; Rossini, 1997; Nguyen et al., 2001) or AHP (Kauko, 2002). Within this group, Rough Set Theory was applied for the first time (d'Amato, 2002) to a small sample of residential property transactions in the real estate market of Bari. In that first application of RST the final output was an if-then rule that indicates the right class of value for the property to be estimated. This work shows an evolution in the application of Rough Set Theory to mass appraisal problems. The valuation methodology has been applied to appraise a sample of 69 residential property transactions using a Value Tolerance Relation in order to avoid a crisp relationship between the class of value and the value. The data come from the Real Estate Market Observatory of the 1st School of Engineering of the University Polytechnic of Bari. This contribution represents a research cooperation between the Real Estate Market Observatory and the AICI real estate research center. The work is organized as follows: the next section offers a brief presentation of RST, and the second section offers a comparison between RST and an MRA model. Final remarks and an analysis of future directions of research conclude the work.

International Journal of Strategic Property Management
ISSN 1648-715X http://www.vtu.lt/english/editions

ROUGH SET THEORY AND VALUE TOLERANCE RELATION
Rough Set Theory was proposed in a previous work as a methodology to appraise the value of properties through if-then rules without econometric modelling. The application of this methodology is quite similar to regression analysis. RST is a rule-based approach to uncertain information developed by Zdzislaw Pawlak in two famous works (Pawlak, 1982; Pawlak, 1991). In the application of RST to property market data, a real estate transaction is considered as an element (or object), and the available information consists of the specific characteristics (attributes) of the property. Therefore the price of a property, its technical characteristics and the tenant characteristics can all be considered "attributes" of a "real estate transaction". A property may or may not have these characteristics. Therefore the relationship between an "object" and its "attributes" can be described by the following three "regions" of knowledge: "certainly", "possibly" and "certainly not". For example, the relationship between the object and its attribute is "certainly not" for a property (object) without parking (attribute) inside a group of property transactions (universe). Among the property transactions (universe), those properties which have the same attributes can be considered indiscernible at a certain level of information. A group of mutually indiscernible elements forms an "elementary set", which cannot be confused with any other elementary set. If two properties (objects) are very similar in their technical features (attributes), lie in the same area and have the same price, then they are indiscernible. The first stage of the valuation process is an "informative table", developed in order to show the relationship between the objects (property transactions) and their attributes (property characteristics).
The rows of this table contain the units or objects of the universe (the real properties considered), while the columns list all the attributes belonging to those objects (panoramic quality, maintenance, area, etc.), each of them measured in a different domain. Every cell may hold a quantitative or qualitative description of the relationship between an object (real property) and its attribute. The presence or absence of parking (attribute) within a property (object) is marked with a dummy variable, while the area (attribute) of a property (object) is expressed in square meters. An "information function" links each object to its attributes. Each property transaction inside the universe of all transactions is described by a row, which can also be called a vector. There is an indiscernibility, or equivalence, relationship among the objects that belong to the same universe U (property transactions) when their respective attributes are identical. For example, two properties with an area of 120 sq. m. are indiscernible as regards this attribute. In other terms, two objects (properties) can be defined as indiscernible if they have identical characteristics. If all the objects of the universe U were analysed according to the attribute set N and turned out to be similar to each other (for example, all the properties located near downtown have an area of 110 sq. m.), then they would be indiscernible. Two or more real estate properties may differ in only one attribute but show a relevant difference in price. On the contrary, two real estate properties may differ in two or more attributes but have the same price. For this reason two important concepts must be added.
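The informative table and the indiscernibility relation described above can be illustrated with a minimal sketch. The property names, attribute names and values below are hypothetical and serve only to show how objects with identical attribute values fall into the same elementary set.

```python
from collections import defaultdict

# Hypothetical informative table: one row per property transaction (object),
# one key per attribute. All values are invented for illustration.
table = {
    "p1": {"sqm": 120, "parking": 1, "elevator": 1},
    "p2": {"sqm": 120, "parking": 0, "elevator": 1},
    "p3": {"sqm": 110, "parking": 1, "elevator": 0},
    "p4": {"sqm": 120, "parking": 1, "elevator": 1},
}


def indiscernibility_classes(table, attributes):
    """Group objects whose values coincide on every attribute in `attributes`."""
    classes = defaultdict(list)
    for name, row in table.items():
        key = tuple(row[a] for a in attributes)  # the object's attribute vector
        classes[key].append(name)
    return sorted(sorted(group) for group in classes.values())


# Looking only at the area, p1, p2 and p4 cannot be told apart.
by_area = indiscernibility_classes(table, ["sqm"])
# With all three attributes, only p1 and p4 remain indiscernible.
by_all = indiscernibility_classes(table, ["sqm", "parking", "elevator"])
print(by_area)
print(by_all)
```

The choice of attributes determines how fine the partition is: adding attributes can only split elementary sets, never merge them.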
Let U be the universe, that is, the set containing all the transactions; X a set of objects of the universe (real estate properties whose price is known); Q the attribute set describing the universe; and N a subset of the attributes. Writing $[x]_N$ for the elementary set of the objects indiscernible from $x$ with respect to N, the Lower Approximation can be defined as follows: $\underline{N}X = \{x \in U : [x]_N \subseteq X\}$. If the elementary set of a real property is entirely included in X, the property is part of the positive or lower region. It is also possible to define the following relation: $\overline{N}X = \{x \in U : [x]_N \cap X \neq \emptyset\}$. The Upper Approximation is thus formed by the elementary sets that have a non-empty intersection with X. If some elements of an elementary set belong to X and others do not, the set is captured only by the upper approximation. RST evaluates each uncertain phenomenon through these approximations. The difference between the upper and lower regions is represented by the "boundary region" of the rough set. The boundary region is expressed formally as: $BN_N(X) = \overline{N}X - \underline{N}X$. If the boundary region is not empty, X is a rough set, described jointly by its upper approximation and its lower approximation. This valuation methodology depends on several characteristics, such as the quality of the information, the capacity to classify the information, and the ability to single out the attributes apt to describe it. After the "informative table" a "decisional table" is necessary, dividing the attributes into conditional attributes (set C) and decisional attributes (set D). The distinction between conditional and decisional attributes expresses a causal relation between the attributes. In the application of this methodology to the property market, the decisional variable is the price and the other attributes are the conditional variables.
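The lower approximation, upper approximation and boundary region can be computed directly from the elementary sets. The sketch below assumes a small hypothetical table and a hypothetical target set X (for instance, the properties sold above a given price); none of the values come from the Bari sample.

```python
from collections import defaultdict

# Hypothetical informative table and target set, invented for illustration.
table = {
    "p1": {"sqm": 120, "elevator": 1},
    "p2": {"sqm": 120, "elevator": 1},
    "p3": {"sqm": 110, "elevator": 0},
    "p4": {"sqm": 90, "elevator": 0},
}
X = {"p1", "p3"}  # e.g. the properties sold above a given price


def elementary_sets(table, attributes):
    """Partition the universe into sets of mutually indiscernible objects."""
    groups = defaultdict(set)
    for name, row in table.items():
        groups[tuple(row[a] for a in attributes)].add(name)
    return list(groups.values())


def approximations(table, attributes, target):
    """Return (lower, upper, boundary) approximations of `target`."""
    lower, upper = set(), set()
    for block in elementary_sets(table, attributes):
        if block <= target:   # block entirely inside X: certainly in X
            lower |= block
        if block & target:    # block overlaps X: possibly in X
            upper |= block
    return lower, upper, upper - lower


lower, upper, boundary = approximations(table, ["sqm", "elevator"], X)
print(lower, upper, boundary)
```

Here p1 and p2 are indiscernible but only p1 belongs to X, so both fall in the boundary region: the available attributes are not enough to decide their membership.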
As a consequence, RST shows how the conditional attributes (property characteristics) influence the decisional attribute (price) by determining a lower and an upper approximation, based on the relationship between the price of the set of elements (decisional attribute) and the set containing the other attributes (conditional attributes) which have an influence on the property price. The conditional attributes are selected in the same way as the independent variables in regression analysis; in the end, the relationship between conditional and decisional attributes generates the if-then rules. There are two general kinds of decisional rules: the former is the "exact decisional rule", also called deterministic, and the latter is the "approximate decisional rule". If the decisional set (the price) contains the conditional attributes (area or other features), then an exact decisional rule is generated. On the other hand, an approximate decisional rule is derived if only some conditional attributes (area or other features) are included in the decisional set (price). For mass appraisal purposes the deterministic rules are more important than the approximate ones, because they define a certain causal relationship between the price and the other characteristics. The property value is obtained by comparing the property characteristics with the rules defined by the comparable properties. In the previous experience, classes of value were defined in a first stage; the property value was therefore reached by finding the right class of value for each property to be estimated. This work represents a further step in the application of RST to property valuation for mass appraisal purposes. The previous work was based on a crisp indiscernibility relation (a complete, reflexive, symmetric and transitive relation valued in the domain {0, 1}).
In this work the application shows the opportunity to use a Value Tolerance Relation (Tsoukias and Vincke, 2000) instead of a crisp tolerance relation as in the traditional version of RST. This modification allows a comparison between RST and one of the most important approaches in mass appraisal, MRA. For this reason real transactions have been obtained from the Real Estate Market Observatory of the 1st Faculty of Engineering of the University Polytechnic of Bari. They have been selected in the residential sector of the city of Bari, using a cross-sectional model, in a zone called Picone. A linear Multiple Regression Analysis has been developed using SPSS 10 for Windows.
The price in euro depends on the date of the transaction, measured in months relative to September 2004: the variable DATE in the regression model is the difference in months between the date of the transaction and September 2004.
In addition to the date, the price depends on the commercial square meters of the property (SQM). Two dummy variables, indicating the presence or absence of an elevator and of autonomous heating, complete the proposed regression model. The results of the regression analysis are shown in appendix 1. The model is significant, as indicated by the F-test; the $R^2$ is quite high (0.86) and the adjusted $R^2$ is 0.85. The coefficients appear to be significant according to the t-test. The valuation accuracy is indicated in the table below.
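Formula (d) itself is not reproduced in this text. Given the four explanatory variables described above, a linear model of this kind would take the following general form; the coefficient symbols $\beta_0, \dots, \beta_4$ and the error term $\varepsilon$ are notation assumed here for illustration, and the estimated values are those reported in appendix 1.

```latex
\mathrm{PRICE} = \beta_0 + \beta_1\,\mathrm{DATE} + \beta_2\,\mathrm{SQM}
               + \beta_3\,\mathrm{ELEVATOR} + \beta_4\,\mathrm{HEATING} + \varepsilon
```

Under this reading, $\beta_2$ would be the marginal price of one commercial square meter, while $\beta_3$ and $\beta_4$ would be the price differences associated with the elevator and with autonomous heating.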
Table 1. Valuation accuracy of the linear MRA model in formula (d)

In the previous application the rules had some conditional attributes (property characteristics) and a decisional attribute (the price). The group of properties considered was examined in order to choose the right class according to the indiscernibility relationship. This relationship was crisp: the element (property transaction) was either included or not included in a class of prices. In the property market this may be considered a strong assumption. As a consequence, the appraiser was compelled to define a class of prices instead of a value. In this work the indiscernibility relation between the element and the universe is not crisp. The property value is based on a relationship between an object and a rule, which can be generated either by the whole group of properties considered or by a part of them. The Value Tolerance Relation is a functional extension of RST and allows the appraiser to develop upper or lower approximations with different degrees of indiscernibility. The formal relation is indicated below (Tsoukias and Vincke, 2000): $R_j(x, y) = \frac{\max\left(0,\ \min(c_j(x), c_j(y)) + k - \max(c_j(x), c_j(y))\right)}{k}$.
This relation $R_j$ may assume any value in the interval [0, 1], not only 0 or 1. It is a kind of ratio based on fuzzy sets, whose membership function may take values in the interval [0, 1]. In this context the minimum of the membership function values represents the intersection between two sets, while the maximum of the membership function values is the union of the two sets. Two objects $x$ and $y$ may have different levels of indiscernibility according to a discriminant threshold $k$ applied to the measure of the characteristic $c_j$. This threshold can be applied to the different measures of this characteristic for all the objects. For example, the indiscernibility relation between two objects such as properties A and B, whose areas are 100 and 50 sq. m., for a threshold of 10 sq. m., is calculated as follows: $R_j(A, B) = \max(0,\ 50 + 10 - 100)/10 = 0$. In this case the two elements cannot be considered similar. If a value tolerance relation with the same threshold is applied to two elements (properties) whose areas are 110 and 115 sq. m., the result is: $R_j = \max(0,\ 110 + 10 - 115)/10 = 0.5$. In fact, in this case the difference in sq. m. between the two properties falls within the discriminant threshold. As one can see, the measure of the indiscernibility relation is not crisp but may have different degrees. If the value of $R_j$ is equal to 1, the two objects can be considered indiscernible in that characteristic for the defined threshold; if $R_j$ is equal to 0, the two objects are completely different in that characteristic for that threshold. Clearly this indicator must be calculated for each conditional attribute considered. The same formula can also be used for the relationship between an object of the universe (a property) and a set of rules developed for valuation purposes. The relationship compares the characteristics of the object with the conditional part of the rule.
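The two worked cases in this paragraph (areas of 100 and 50 sq. m., then 110 and 115 sq. m., both with a 10 sq. m. threshold) can be checked with a short sketch. It assumes the valued tolerance formula in the form given by Stefanowski and Tsoukias (2000), namely max(0, min(a, b) + k - max(a, b)) / k; the function name is chosen here for illustration.

```python
def value_tolerance(a, b, k):
    """Degree in [0, 1] to which values a and b are indiscernible.

    Assumed form: max(0, min(a, b) + k - max(a, b)) / k,
    where k is the discriminant threshold for the characteristic.
    """
    if k <= 0:
        raise ValueError("the discriminant threshold k must be positive")
    return max(0.0, min(a, b) + k - max(a, b)) / k


far = value_tolerance(100, 50, 10)    # difference of 50 sq. m., beyond k
near = value_tolerance(110, 115, 10)  # difference of 5 sq. m., within k
same = value_tolerance(120, 120, 10)  # identical areas
print(far, near, same)
```

The degree falls linearly from 1 (identical values) to 0 (difference equal to or larger than k), which is what makes the relation graded rather than crisp.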
In this case it is modified as follows (Stefanowski and Tsoukias, 2000): $R_j(x, \rho) = \frac{\max\left(0,\ \min(c_j(x), c_j(\rho)) + k - \max(c_j(x), c_j(\rho))\right)}{k}$, where $\rho$ denotes a rule and $c_j(\rho)$ the value of the characteristic $c_j$ in its conditional part. As a consequence, the output of the formula is a level of indiscernibility between the object and the rule, given a threshold $k$ for the measure of the characteristic. Across all the attributes, the relationship between an object and the conditional part of the "rules" is calculated as the "intersection" of all the sets (Stefanowski and Tsoukias, 2000): $R(x, \rho) = \min_j R_j(x, \rho)$. $R(x, \rho)$ gives a flexible (not crisp) measure of the relationship between a single element and each rule developed on the group of properties considered. It must be stressed that the rules may be developed on the entire group of properties considered or on a part of it. As an object may have more than one attribute, the appraiser takes into account the minimum $R_j$ among all the attributes for each rule and selects the rule for which this minimum is highest, that is, the rule with the highest level of approximation. It may happen that more than one rule has the same minimum $R_j$; in this case the appraiser considers the rule with the highest sum of $R_j$.
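The selection procedure just described (minimum $R_j$ over the attributes of each rule, highest minimum wins, ties broken by the highest sum of $R_j$) can be sketched as follows. The rules, prices and thresholds are hypothetical and are not taken from Table 2; the value tolerance formula is assumed in the form max(0, min(a, b) + k - max(a, b)) / k.

```python
def value_tolerance(a, b, k):
    """Degree in [0, 1] to which values a and b are indiscernible for threshold k."""
    return max(0.0, min(a, b) + k - max(a, b)) / k


# Hypothetical thresholds (one per conditional attribute) and rules.
thresholds = {"sqm": 10, "date": 6, "heating": 1, "elevator": 1}
rules = [
    {"if": {"sqm": 100, "date": 12, "heating": 1, "elevator": 1}, "then_price": 150000},
    {"if": {"sqm": 105, "date": 10, "heating": 1, "elevator": 1}, "then_price": 160000},
    {"if": {"sqm": 60, "date": 3, "heating": 0, "elevator": 0}, "then_price": 80000},
]


def match(obj, rule, thresholds):
    """Return (min of R_j, sum of R_j) between an object and a rule's conditions."""
    degrees = [
        value_tolerance(obj[attr], value, thresholds[attr])
        for attr, value in rule["if"].items()
    ]
    return min(degrees), sum(degrees)


def best_rule(obj, rules, thresholds):
    """Pick the rule with the highest minimum R_j; ties go to the highest sum."""
    # Tuples compare element by element, so the sum only matters on a tie.
    return max(rules, key=lambda rule: match(obj, rule, thresholds))


subject = {"sqm": 103, "date": 11, "heating": 1, "elevator": 1}
estimate = best_rule(subject, rules, thresholds)["then_price"]
print(estimate)
```

With a threshold of 1, a dummy attribute yields 1 when object and rule agree and 0 when they differ, so a mismatch on heating or elevator rules the candidate out under the minimum operator.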

COMPARING REGRESSION ANALYSIS WITH THE ROUGH SET THEORY
The regression model indicated in formula (d) has been applied to the entire sample of 69 property transactions. For the application of RST the sample has been divided into two parts. The first part, 19 real property transactions, has been selected for the determination of the rules, while the rules have been tested on the second part. Therefore, in comparing RST and MRA, both valuation accuracy and valuation variation have been calculated (Brown, 1985; Brown, Shepherd and Matysiak, 1998) on the same group of 50 residential properties whose price is known.
While regression analysis can be tested on the same sample used to create the model, this is not possible for RST. In the RST application the first sample of 19 properties has been used for the analysis and the creation of the if-then rules. These rules have then been used to appraise the remaining 50 properties, which are listed in appendix 2. The rules refer to the following four conditional attributes: SQM (commercial square meters); DATE (date of transaction in months); HEATING, a dummy variable indicating the presence or absence of this technical equipment; and ELEVATOR, a dichotomous variable indicating the presence or absence of an elevator in the building. The rules, generated using the ROSETTA software, are defined in the following table.
Table 2. The list of the 19 rules developed through ROSETTA, the software for the application of RST. Only the deterministic rules have been selected. The prices are in euro.

For example, the first rule may be read as an if-then statement. The former part of the rule shows the conditional attributes, while the latter part of the rule is the decisional attribute. As one can see, no "class" of value has been considered, unlike in the previous work on RST. In order to analyse the "quality" of a rule there are two important indexes: the "coverage" of the rule and the "accuracy" of the rule. The former index is the ratio between the number of properties which satisfy both the conditional and the decisional part of a rule and the number of properties which satisfy the decisional part (Pawlak, 1997). The latter index measures the probability that the decisional part is exact. In other terms, it is the ratio between the number of properties which satisfy both the conditional part and the decisional part of a rule and the number of properties which satisfy the conditional part. All the rules have the highest level of accuracy and coverage (equal to 1); therefore in this case no rule is "better" or "worse" than another, and all the rules give an equal contribution for valuation purposes. The table below indicates the selected values of k. These are the so-called discriminant thresholds used to define the indiscernibility relation between two objects. In the value tolerance relation formula there is a threshold for each element (object or real property) of the universe. It must be stressed that this threshold is based on the analysis of the rules and of the objects. The choice of the discriminant threshold is an important problem that will be analysed in depth. Each conditional attribute of the objects has been compared with the conditional part of the rules indicated in table 2 in order to define which rule is more "suitable" for valuation purposes.
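The two quality indexes defined above can be computed by simple counting. The sketch below uses a hypothetical rule ("if the area is at least 100 sq. m. then the price class is high") and invented transactions; it only illustrates the two ratios and does not reproduce the ROSETTA output.

```python
def rule_quality(objects, condition, decision):
    """Return (accuracy, coverage) of a rule over a list of objects.

    accuracy = |condition and decision| / |condition|
    coverage = |condition and decision| / |decision|
    """
    cond = [o for o in objects if condition(o)]
    dec = [o for o in objects if decision(o)]
    both = [o for o in cond if decision(o)]
    accuracy = len(both) / len(cond) if cond else 0.0
    coverage = len(both) / len(dec) if dec else 0.0
    return accuracy, coverage


# Hypothetical transactions, invented for illustration only.
transactions = [
    {"sqm": 120, "price_class": "high"},
    {"sqm": 110, "price_class": "high"},
    {"sqm": 100, "price_class": "low"},
    {"sqm": 80, "price_class": "high"},
    {"sqm": 75, "price_class": "high"},
]

accuracy, coverage = rule_quality(
    transactions,
    condition=lambda o: o["sqm"] >= 100,            # conditional part
    decision=lambda o: o["price_class"] == "high",  # decisional part
)
print(accuracy, coverage)
```

A deterministic rule has an accuracy of 1: every object matching its conditional part also matches its decisional part.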
The comparison has been carried out through an Excel spreadsheet for the 50 properties to be appraised. In order to analyse the valuation accuracy of the RST model, the market price of each property has been compared with the value predicted from the RST rules. The error has been measured with the Mean Absolute Percentage Error (MAPE), given by the following formula: $\mathrm{MAPE} = \frac{100}{m} \sum_{i=1}^{m} \left| \frac{P_i - A_i}{A_i} \right|$, where $P_i$ and $A_i$ are the predicted selling price and the actual selling price of property $i$ in the set of $m$ properties. Table 4 below shows the difference between the property prices and the property values assessed through RST, measured through the MAPE defined above.
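The error measure can be reproduced in a few lines. The sketch below assumes the usual definition of the Mean Absolute Percentage Error, with the actual price in the denominator; the prices are hypothetical and do not come from appendix 2.

```python
def mape(predicted, actual):
    """Mean Absolute Percentage Error, in percent, of predicted vs actual prices."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty price lists of equal length")
    total = sum(abs(p - a) / a for p, a in zip(predicted, actual))
    return 100.0 * total / len(actual)


# Hypothetical prices in euro, invented for illustration only.
predicted_prices = [110000, 90000, 200000]
actual_prices = [100000, 100000, 200000]

error = mape(predicted_prices, actual_prices)
print(round(error, 2))  # errors of 10%, 10% and 0% average to about 6.67%
```

Because each error is divided by the actual price, the measure is scale-free, which is what allows the RST and MRA results to be compared on the same 50 properties.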
Table 4. A comparison between the market prices and the RST estimated prices

The second column indicates the rule of table 2 used to appraise the property. The valuation of the properties with RST shows a difference of 18 percentage points from the market price. This percentage falls inside the interval between 10% and 20% indicated as the range of valuation accuracy in several empirical analyses. Although these studies can never be considered conclusive (Matysiak et al., 1995), the difference between property valuation and property price generally falls within 15%-20% (Adair et al., 1996; Parker, 1998; Newell and Kishore, 1998). The calculation of the proportions of errors is reported in table 5. In a similar way, the valuation accuracy of the regression model has been measured on the same sample of property prices through the MAPE (Mean Absolute Percentage Error). In the previous application the valuation results were classes of value instead of a crisp value; in this case, through the value tolerance relation, RST can give a single estimated value.
As one can see, the MRA has a MAPE of 11%, showing a better performance than RST. The valuation variation of the model is 15.31%. In general terms RST may be considered an emerging approach in mass appraisal (Kauko et al., 2004). Unfortunately, while regression analysis allows the appraiser to define the marginal (in some cases hedonic) price of each property characteristic considered in the model, Rough Set analysis does not give any information of this kind. Therefore the quality of the outputs of statistical mass appraisal methodologies still remains superior to that obtainable from the Rough Set approach. Multiple Regression Analysis relies on econometric modelling, which reproduces market behaviour within a probability framework. Rough Set Theory is not based on behaviour modelling; in fact the results of this mass appraisal technique depend on the simple observation of market data. Regression theory has statistical control indexes because of the assumptions of the model. In RST no assumptions are made, and the control indexes are limited to two main ratios, the "accuracy" and the "coverage" of the rules. In regression analysis even a small sample is composed of at least 30 observations. On the contrary, RST has no data limits and gives results based on rules originated by 10, 15 or 20 observations. It seems important to highlight that, although there are no data limits, a high number of observations allows the appraiser to develop rules with the highest level of coverage and accuracy. The two valuation procedures also have similarities. As one can see, both the application of RST and that of MRA are based on a cross-sectional analysis. The valuation process starts with the definition of the "attributes" in Rough Set Theory and of the independent variables in Multiple Regression Analysis. In fact a causal relationship is assumed both in Multiple Regression Analysis and in RST.
In the first case the output is a mathematical model, while in the second case the output is a Boolean sum or an if-then rule. There is no risk of different results coming from different "algorithms", as happens for example with neural networks (Worzala, Lenk and Silva, 1995; McGreal et al., 1998). The application of RST may be recommended for mass appraisal in those markets where property transaction data are not abundant. Property markets in Eastern Europe, Italy, Greece and other European countries may, for several reasons (tax burden, social organization), not have a great amount of property data. In this case a mass appraisal methodology like RST may help to reach a property value without econometric modelling.

FINAL REMARKS AND FUTURE DIRECTIONS OF RESEARCH
This paper showed an application of Rough Set Theory to mass appraisal problems. In this work the valuation procedure has been improved through the Value Tolerance Relation.
The application showed that RST may have the potential to give results close to those of Multiple Regression Analysis. In particular, RST may become a useful tool in those markets where econometric modelling cannot be applied because data are not abundant. Although the application of Rough Set Theory combined with the Value Tolerance Relation may represent an interesting evolution, two main problems remain. The former is the determination of the sample, that is, the criteria and the number of transactions to be considered to generate the rules. The latter is the determination of the k value. In a forthcoming work both problems are analysed and a solution is proposed. An interesting future direction of research may be the comparison between MRA and RST in other urban contexts in order to confirm (or not) the empirical results obtained in the residential property market of Bari. Another interesting direction of research may be the analysis of the relationship among MRA, RST and other emerging approaches to mass appraisal.