EVOLUTION OF EFFICIENCY AND ITS DETERMINANTS IN THE RETAIL SECTOR IN SPAIN: NEW EVIDENCE

The purpose of this work is twofold. On the one hand, recent methodologies are used to estimate technical efficiency and its determinants in Spain's retail sector. In particular, the order-m approach, which is based on the concept of the expected minimum input function, is used to measure efficiency, and quantile regression is used to analyze the determinants of efficiency. On the other hand, the results obtained by applying these methods to the Spanish retail sector can contribute to opening up a new field of analysis, since they may be compared with those obtained by the methodologies that already exist in the literature. The paper uses stochastic data envelopment analysis (order-m) to measure efficiency and, in the second stage, quantile regression to take account of the heterogeneity between firms with different characteristics. We find that firm size, age and market concentration are positively related to efficiency along the quantiles considered in the analysis. The relationship between efficiency and both capital intensity and better-trained employees is curvilinear. There are also significant differences according to the region to which the firm belongs. The main contribution of this paper is to provide an efficiency analysis of the Spanish retail sector using a non-parametric approach with a robust estimator, with quantile regression analysis for the second stage. This methodology allows for a more careful analysis of what happens at the firm level.


Introduction
The debate on the effects of retail trade legislation in Spain, from the enactment of the Law of 1996 to the legislative reforms currently taking place, is almost permanently present in the media and in academia in Spain. Examples of these processes of regulation and deregulation include, among others: Real Decreto-Ley 6/2000, which extended weekly opening hours from 72 to 90; Law 1/2004, which returned to a more restrictive situation; and Law 1/2010, which introduced freedom of establishment and the removal of licenses. Specifically, Law 1/2004 set at twelve the number of Sundays and holidays on which shops could remain open to the public, although the regions (CCAA) could increase or decrease this number, provided it was no less than eight, while the minimum total weekly opening hours over all working days were reduced from 90 to 72.
Interest in the effects that the law has had on competition in the sector is reflected in several articles published both nationally and internationally, using micro- and macro-level data. At the micro level and from an international perspective, some authors (Waldorf 1966; Goldman 1992) analyzed the average productivity of labor, while others (Hall, Knapp, Winsten 1961; Good 1984) studied the factors that explain it. Other studies (Athanassopoulos 1995; Donthu, Yoo 1998; Thomas et al. 1998; Ratchford 2003; Barros, Alves 2003, 2004; Barros 2005; De Jorge 2010) focus on the analysis of efficiency levels and their causes. From a national perspective, we find papers addressing the relationship between efficiency and regulation (De Jorge 2006; De Jorge, Suárez 2007, 2010), differences in efficiency between entrants and established firms considering size and geographical location (De Jorge 2008), the relationship between efficiency, size and market concentration (Sellers-Rubio, Mas-Ruiz 2006) and the analysis of productivity (De Jorge 2009). These studies usually apply non-parametric techniques (Donthu, Yoo 1998; Keh, Chu 2003; Sellers-Rubio, Mas-Ruiz 2006; De Jorge 2006, 2009, 2010). Other studies use parametric techniques (Ratchford 2003; Barros 2005; Sellers-Rubio, Mas-Ruiz 2007; De Jorge, Suárez 2007; De Jorge 2008). Some authors, such as De Jorge and Suárez (2010) and De Jorge and Sanz-Triguero (2010b), have used newer methodologies based on stochastic data envelopment analysis.
At the macro level, there is an extensive literature that has analyzed the impact of retail regulation on different macroeconomic variables. A compilation of several studies from the nineties can be found in Boylaud and Nicoletti (2001). Among the more recent empirical literature, Bertrand and Kramarz (2002), FMI (2004), Burda and Weil (2005), Skuterud (2005), Viviano (2006), Hoffmaister (2006), Orea (2008) and Schivardi and Viviano (2008) should also be mentioned. Bertrand and Kramarz (2002) for France, Viviano (2006) for Italy and the FMI (2004) all found that the more restrictive the trade policy, the lower the employment in the sector. For a discussion of the results obtained in these works, see Matea and Mora (2009).
The dynamism of public intervention in the Spanish retail sector, with the implications it may have for the competitive market structure, provides a solid foundation for further research. We are interested in analyzing the determinants of technical efficiency in the Spanish retail sector at the microeconomic level for the period 2001-2010. Specifically, the purpose of this paper is to analyze the effect on efficiency of some organizational factors related to the managerial ability to use and adjust capital and labour properly according to environmental conditions, such as the changes in the regulatory environment that occurred. Size and market share are included in the analysis as two of the most important factors that condition the organization of firms and hence their degree of efficiency. First, our paper contributes to the empirical evidence on efficiency in Spain, adding to previous papers the relevance of changes affecting the factors of production and the way these factors are used and combined. Second, our paper differs from the previous literature on Spain in that we use an improved methodology for the non-parametric efficiency model.
For this purpose, we first use the order-m approach proposed by Cazals et al. (2002). This methodology, which is explained later, makes it possible to avoid the limitations of DEA (Data Envelopment Analysis) techniques, namely the deterministic (non-probabilistic) character of these models, the curse of dimensionality and the high sensitivity to outliers. Second, we want to explain differences in efficiency between firms. We use quantile regressions, given the nature of the efficiency variable obtained.
The structure of this paper is as follows. Section 1 presents a descriptive analysis of the data and variables. Section 2 describes the methodology. Section 3 discusses the results. The last section provides some concluding remarks.

Data and variables
The statistical information comes from the SABI database, produced jointly by Bureau van Dijk and Informa from the financial information that firms must present to the Companies Registration Office (BORME). This database covers all sectors of Spanish business activity. It is highly representative of firms from the 17 Spanish autonomous "communities" (i.e. regions). The study sample taken from SABI includes all the firms belonging to the Retail sale of clothing in specialized stores (CNAE 4711), downloaded in February 2012. Our sample includes 738 firms from the SABI database and refers to an unbalanced panel from which we have eliminated those firms for which we do not have two consecutive years of data. Summary statistics of the data are presented in Table 1. The variables used as outputs and inputs are: Output - Sales revenue as a measure of production (Donthu, Yoo 1998; Thomas et al. 1998; Zhu 2000; Barros, Alves 2003, 2004; Sellers-Rubio, Mas-Ruiz 2007; De Jorge, Suárez 2010). Input - Employees (Bucklin 1978; Ingene 1982; Pilling et al. 1995; Yoo et al. 1997; Thomas et al. 1998; Sellers-Rubio, Mas-Ruiz 2007; Perrigot, Barros 2008). For the efficiency analysis discussed later, it would have been desirable for the variables, such as the consumption of materials and the flow of services, to be expressed in physical units, but the limitations of the available information require us to use accounting variables directly, expressed in constant currency. Given the timeframe of the study, all variables are deflated and expressed in thousands of euros. The conversion to constant euros has been made using the implicit GDP deflator.

Methodology
In this section we present the methodology used to carry out the objective. Section 2.1 explains the methodology for estimating order-m efficiency. Section 2.2 presents the empirical model, which uses interquantile regressions to explain the determinants of efficiency in the Spanish retail sector. Cazals et al. (2002) proposed the non-parametric order-m estimator as an alternative based on the expected minimum input frontier of order m (alternatively, the expected maximum output frontier). According to Wheelock and Wilson (2007), order-m estimators do not impose the assumption that the production set is convex, and in addition they permit noise (with zero expected value) in input measures. Note that DEA estimates of the production frontier can be severely distorted by extreme values. Further, for given numbers of inputs and outputs, the order-m estimator requires far less data than DEA in order to produce meaningful efficiency estimates. The core idea of order-m is to set up a conditional frontier that does not envelop all firms in the population, but just a share of them. This share is determined by the integer value m ≥ 1, which can be fixed by the researcher (see Note 1).

Order-m estimator
In the input-oriented case, the firm of interest is compared only with firms whose output level is equal to or greater than its own. The radial distance of a firm $(x_o, y_o)$ interior to the order-m frontier represents the proportional reduction in inputs it needs in order to become efficient relative to a randomly drawn sample of firms with an output level of $Y \ge y_o$. For a multivariate setting, consider $X_1, \dots, X_m$, which are m (p-dimensional) random firms drawn from the conditional distribution function of X given $Y \ge y_o$. The random variable

$$\tilde{\theta}_m(x_o, y_o) = \min_{i=1,\dots,m} \; \max_{j=1,\dots,p} \left( \frac{X_i^{j}}{x_o^{j}} \right) \tag{1}$$

with $X_i^{j}$ ($x_o^{j}$) as the j-th component of $X_i$ (of $x_o$ respectively), measures the distance between the point $x_o$ and the free disposal hull of $X_1, \dots, X_m$. The latter are generated from the conditional distribution function of X given $Y \ge y_o$. The order-m efficiency measure of firm o $(x_o, y_o)$ (see Note 2) is then defined as

$$\theta_m(x_o, y_o) = E\left[ \tilde{\theta}_m(x_o, y_o) \mid Y \ge y_o \right] \tag{2}$$

Because the distribution of the population is unknown, the calculation of the order-m frontiers requires the use of empirical distribution functions. In the multivariate case this calculation involves a numerical integration, which is easier to solve by Monte Carlo approximation. For details of the methodologies see Simar (2003).
In short, the order-m estimation of an input-oriented score is straightforward. For a particular observation, all sample observations with an output level at least as high as that of the observation to be evaluated ($Y \ge y_o$) are selected. From this sub-sample, a number of samples of size m are drawn with replacement. Note that a drawn sample does not automatically include the observation itself. For each drawn sample the distance in (1) is calculated, and $\theta_m$ is obtained as the average over the samples. Because the observation itself is not necessarily part of the order-m sample, and because there will not necessarily be any other observation dominating the observation to be evaluated in its inputs, scores greater or less than unity may result.
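The Monte Carlo procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the default values of m and of the number of draws B, and the seed are hypothetical and are not taken from the paper, which does not report its implementation.

```python
import numpy as np

def order_m_input_efficiency(X, Y, o, m=25, B=200, seed=0):
    """Monte Carlo approximation of the input-oriented order-m score of firm o.

    X: (n, p) array of inputs; Y: (n, q) array of outputs.
    m (sample size) and B (number of draws) are illustrative defaults.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X = X.reshape(X.shape[0], -1)
    Y = Y.reshape(Y.shape[0], -1)
    x_o, y_o = X[o], Y[o]

    # Firms producing at least as much of every output as firm o (Y >= y_o).
    # Firm o always satisfies this, so the peer set is never empty.
    peers = X[np.all(Y >= y_o, axis=1)]

    scores = np.empty(B)
    for b in range(B):
        # Draw m peers with replacement; firm o may or may not be drawn.
        sample = peers[rng.integers(0, len(peers), size=m)]
        # Distance (1): min over drawn firms of the max input ratio.
        scores[b] = np.min(np.max(sample / x_o, axis=1))

    # Averaging over the B draws approximates the expectation in (2).
    return scores.mean()
```

With a small m the drawn sample often misses the best peers, so the score of a firm can exceed unity; as m grows, the score approaches the one obtained when every firm with $Y \ge y_o$ is used as a reference.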

Econometric model
To analyze the determinants of efficiency we carry out a quantile regression (RQ) analysis (Koenker, Bassett 1978; Koenker 2001), with the estimated technical efficiency as the dependent variable and the independent variables shown in equation (3):

$$\mathit{Eff}_{it} = \beta_0 + \beta_1 A_{it} + \beta_2 MS_{it} + \sum_{k} \gamma_k\, Size_{k,it} + \beta_3\, K\_W_{it} + \beta_4\, K\_W_{it}^{2} + \beta_5\, S\_W_{it} + \beta_6\, S\_W_{it}^{2} + \sum_{r} \delta_r\, Reg_{r,i} + \sum_{s} \lambda_s\, t_{s} + \varepsilon_{it} \tag{3}$$

where A is the firm's age and MS represents market share (the ratio of the sales of an individual firm to sector sales, by year). Since we are interested in analyzing the effects of size on efficiency, we split the sample into four percentile groups of fixed assets ($Size_k$) and investigate the different sizes. Authors such as Athanassopoulos (2003) use the same variable to control for size. K_W and S_W represent capital by worker (the ratio of fixed assets over employment) and salary by worker (the ratio of personnel cost over employment), respectively (Diaz, Sanchez 2008). Reg and t are dummy variables for region and time, respectively.
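The explanatory variables defined above are simple ratios of accounting data. The following Python sketch shows one way they could be built. The function and argument names are hypothetical, and the size groups are assumed here to be quartiles of fixed assets pooled over all years, since the text does not state whether the split is pooled or made year by year.

```python
import numpy as np

def build_regressors(sales, fixed_assets, personnel_cost, employees, year):
    """Return MS, K_W, S_W and a size group (1 to 4) for each firm-year row."""
    sales = np.asarray(sales, dtype=float)
    fixed_assets = np.asarray(fixed_assets, dtype=float)
    personnel_cost = np.asarray(personnel_cost, dtype=float)
    employees = np.asarray(employees, dtype=float)
    year = np.asarray(year)

    # MS: sales of the firm over total sector sales in the same year.
    ms = np.empty_like(sales)
    for y in np.unique(year):
        rows = year == y
        ms[rows] = sales[rows] / sales[rows].sum()

    k_w = fixed_assets / employees    # capital by worker
    s_w = personnel_cost / employees  # salary by worker

    # Assumption: size groups are pooled quartiles of fixed assets.
    cuts = np.quantile(fixed_assets, [0.25, 0.5, 0.75])
    size_group = np.searchsorted(cuts, fixed_assets, side="left") + 1

    return {"MS": ms, "K_W": k_w, "S_W": s_w, "size_group": size_group}
```

The squared terms of K_W and S_W and the region and year dummies would then be added before estimation.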
The advantages of this type of regression have been highlighted in numerous works, such as Coad and Rao (2008), Coad and Hölzl (2009) or Reichstein et al. (2010), among others. The main contribution of this methodology is that it does not estimate an average effect, as ordinary least squares (OLS) does; instead, the estimation is performed at different points (quantiles) of the distribution. That is, the effect of a given variable on efficiency may not be significant on average, yet it could be significant for, say, the most efficient observations. An additional advantage of quantile regression is that the estimators are more robust to the failure of certain OLS assumptions, such as the non-normality of the efficiency indices or the dependency between them that arises because they were obtained through linear programming problems.
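As described in Koenker and Bassett (1978) and, in greater detail, in Koenker (2005), the estimation is performed by minimizing the following expression for the θ-th quantile ($0 < \theta < 1$):

$$\min_{\beta} \left[ \sum_{i:\, y_i \ge x_i \beta} \theta \,\lvert y_i - x_i \beta \rvert \; + \sum_{i:\, y_i < x_i \beta} (1 - \theta) \,\lvert y_i - x_i \beta \rvert \right] \tag{4}$$

which shows that β may differ depending on the quantile estimated.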
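The objective minimized in quantile regression can be illustrated numerically. The Python sketch below, with hypothetical function names and made-up numbers, implements the check function of Koenker and Bassett (1978) and applies it to the simplest case, a model containing only a constant, where the minimizer is a sample quantile. It is not the Stata routine used in the paper.

```python
import numpy as np

def check_loss(residuals, q):
    """Sum of the check (pinball) loss over the residuals for quantile q."""
    r = np.asarray(residuals, dtype=float)
    # Positive residuals are weighted by q, negative ones by (1 - q).
    return float(np.sum(np.where(r >= 0, q * r, (q - 1.0) * r)))

def best_constant(y, q):
    """Constant-only quantile regression: try each observed value as the fit."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, q) for c in y]
    return float(y[int(np.argmin(losses))])

# Made-up efficiency-like data with one extreme value.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(y, 0.5)  # the extreme value does not move the median
upper_fit = best_constant(y, 0.9)   # a high quantile picks up the upper tail
```

Because the loss weights positive and negative residuals asymmetrically, changing the quantile changes the fitted value, which is the sense in which β may differ across quantiles.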
There are also disadvantages associated with the RQ estimator. Probably the main problem is that only the asymptotic properties of the estimators are known, which raises the issue of how the parameters behave in finite samples. This may not be too problematic in our case, since we have a large sample. Although estimating quantile regressions is computationally demanding, this is less of a problem given the availability of powerful computers and software such as Stata 11.0, which allows the estimates to be performed relatively easily.

Estimating efficiency by order-m
This section presents the efficiency estimates based on order-m. Table 2 shows descriptive statistics for the efficiency scores by year. A distance estimate larger than 1.0 indicates that the retail store uses more than the expected minimum, whereas a distance estimate less than 1.0 indicates that the firm uses less than the expected minimum. The average efficiency level over the period is 0.7557, and for the period 2005 to 2010 we see a downward trend in the average.
As a complement, Figures 1a and 1b show the evolution of technical efficiency. Figure 1a represents the entire distribution using box and violin plots for all the years, which enables the features of the distributions to be detected more thoroughly. The white dot indicates the median and the black bar the first and third quartiles. The analysis so far has provided us with information about the external form of the distribution and its average. For further analysis, we use the stacked and highest conditional density approach developed by Hyndman et al. (1996). This technique allows us to consider the probability of moving between any two levels of efficiency in the range of values considered and to examine intra-distribution dynamics. It is based on the estimation of the so-called stacked and highest conditional density plots. The stacked plot facilitates viewing the changes in the shape of the distribution of the variable observed at time t+k (2010) over the range of the same variable observed at time t (2001). Each univariate density plot describes transitions over 10 years from a given efficiency value in period t.
The second type of plot proposed by Hyndman et al. (1996) is the highest conditional density region (HDR) plot. A high-density region is the smallest region of the sample space containing a given probability. These regions allow a visual summary of the characteristics of a probability distribution function. In the case of unimodal distributions, the HDRs coincide with the usual probability regions around the mean value; however, in the case of multimodal distributions, the HDR consists of several disjoint sub-regions.
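The definition of a high-density region can be made concrete with a small numerical sketch. The Python function below is a hypothetical illustration, not the routine of Hyndman et al. (1996): it assumes the density has already been evaluated on an equally spaced grid and selects the grid cells with the highest density until the requested probability is covered.

```python
import numpy as np

def hdr_mask(density, coverage):
    """Boolean mask of the grid cells forming the highest density region.

    density: density values on an equally spaced grid (assumption).
    coverage: probability the region must contain, e.g. 0.5 or 0.9.
    """
    d = np.asarray(density, dtype=float)
    p = d / d.sum()                        # cell probabilities
    order = np.argsort(-p, kind="stable")  # highest density first
    cum = np.cumsum(p[order])
    # Smallest number of cells whose total probability reaches the coverage.
    k = int(np.searchsorted(cum, coverage - 1e-12)) + 1
    mask = np.zeros(d.shape, dtype=bool)
    mask[order[:k]] = True
    return mask

# A made-up bimodal density: the 50% HDR splits into two disjoint pieces.
bimodal = [1, 4, 1, 0, 1, 4, 1]
region_50 = hdr_mask(bimodal, 0.5)
region_90 = hdr_mask(bimodal, 0.9)
```

In this example the 50 percent region consists of the two separate peaks, which is the disjoint sub-region behaviour described for multimodal distributions.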
The results for order-m approach, employing optimal bandwidths in the two directions x and y, are shown in Figure 2. The stacked density plot is on the left side and the highest conditional plot is on the right side of Figure 2.
With reference to the stacked conditional density plot, it seems that the firms under study with low initial efficiency levels have improved their relative efficiency, although some signs of mobility appear in the initial and middle parts of the distribution. A more informative way to represent the distribution changes is to look at the HDR plot. Each vertical strip on the right side of Figure 2 shows the highest density portion of the probability distribution for a given firm efficiency level in 2001. In particular, the figure shows the highest density regions for probabilities of 25, 50, 75 and 90 percent (passing from darker to lighter shades). In addition, a bullet (•) indicates the mode (maximum value) of each conditional probability distribution.
As can be seen on the right side of Figure 2, the modes of the estimated distributions lie above the diagonal. The only exception is found at the top right of the distribution. Therefore, firms improve their efficiency levels in a widespread manner. In addition, if we observe the probability mass (dark areas), we see that the regions representing probabilities of 25 and 50 percent (and even 75 and 90 percent in some cases) do cross the diagonal in some parts; this shows the existence of a certain mobility within the efficiency distribution.

Econometric results
In order to analyze the determinants of efficiency we estimate equation (3) using data for the Spanish retail sector. The results are reported in Table 3. In the first column we present the results obtained using the standard OLS procedure, i.e., the estimation of the conditional mean by least squares. In the remaining columns we report the regression results for five different quantiles of the efficiency distribution. We estimate the equation for the 0.15, 0.25, 0.50 (i.e., median), 0.75 and 0.9 quantiles.
Looking at the standard OLS results, we find that age, size and market share all exert statistically significant positive effects on efficiency. These results are as expected: more experience and a better reputation lead to higher efficiency levels. Similarly, larger firms and firms with market power benefit from economies of scale and scope, which have a positive influence on efficiency. Moreover, both capital by worker and salary by worker show curvilinear (U-shaped) behavior, as shown by the negative and positive signs, respectively, of the statistically significant linear and quadratic parameters of these variables. It is therefore necessary to reach a certain threshold before capital intensity (the ratio of fixed assets over employment) and qualifications (the ratio of personnel cost over employment) have a positive influence on efficiency.
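The threshold implied by a U-shaped relationship can be written out explicitly. As a sketch, denote the contribution of capital by worker to efficiency by $b_1 K\_W + b_2 K\_W^{2}$, where $b_1 < 0$ and $b_2 > 0$ are hypothetical labels for the linear and quadratic coefficients (they are not the paper's notation and no estimated values are implied). Setting the marginal effect to zero gives the turning point:

```latex
\frac{\partial\, \mathit{Eff}}{\partial\, K\_W} = b_1 + 2\, b_2\, K\_W = 0
\quad\Longrightarrow\quad
K\_W^{*} = -\frac{b_1}{2\, b_2} > 0 .
```

Below $K\_W^{*}$ the marginal effect of capital intensity on efficiency is negative, and above it the effect is positive. The same calculation applies to salary by worker with its own pair of coefficients.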
While the OLS regression describes the central tendency of the data, the quantile regression results give a more accurate picture of the importance of the explanatory variables at the different quantiles. Comparing the results in Table 3 across quantiles shows that the magnitude of the coefficients does not change much as we move along the efficiency distribution of firms. The sign and statistical significance of the effects remain constant throughout all quantiles. Therefore, the determinants of efficiency shown in Table 3 are equally relevant regardless of the area (quantile) of the efficiency distribution chosen. Table 4 presents the results of interquantile range regressions, i.e., regressions of the difference in quantiles, which allow us to examine whether the effects of the variables are the same at the respective quantiles. The results show that the effects of size and capital by worker differ between the quantiles compared.
The effects of market share do not appear to differ between the 0.75 and 0.50 quantiles. For salary by worker, we only find different effects when comparing the highest and lowest quantiles. The effect of age only differs between the 0.25 and 0.50 quantiles. Finally, we find statistically significant differences in efficiency levels for two environmental factors: the region in which the firm is located and the temporal evolution.
In relation to the first factor, these differences were expected because of the different regulatory contexts of the retail sector across Spain, as mentioned in the introduction. The relationship between efficiency and regulation in Spain is a topic that has received special attention, as we remarked. However, the different processes of regulation and deregulation that have taken place in this sector over the last decade make this a factor that must be considered continuously. Regarding the second factor, it is interesting to observe that, in general, there is little difference between the years before the 2008 crisis, while a deterioration can be seen in the period 2008-2010 (at some quantiles, 0.25, 0.5 and 0.75, this efficiency loss begins to be perceived some years in advance).

Conclusions
The main contribution of this paper is to provide an efficiency analysis of the Spanish retail sector for the period 2001-2010 using a non-parametric approach with a robust estimator recently suggested by Cazals et al. (2002), together with quantile regression analysis (Koenker, Bassett 1978; Koenker 2001, 2005) for the second stage. In relation to the order-m analysis, Daraio and Simar (2007) discuss the managerial interpretation of order-m measures of efficiency. In particular, the parameter m has a dual nature. It is defined as a trimming parameter for the robust non-parametric estimation. It also defines the level of benchmarking to be carried out over the population of firms. As the cited authors indicate, the parameter m can therefore be understood in a dual sense, as providing both robust estimates and an analysis of potential competitors.
In relation to the second stage, we present further empirical evidence on the determinants of efficiency. We take account of heterogeneity between the different characteristics of firms by using quantile regression techniques. We find that firm size, age and market concentration are positively related to efficiency along the quantiles considered in the analysis. Firm size may be approximated by the percentile of fixed assets: firms in the first percentile correspond to traditional stores, those in the second to small supermarkets, those in the third to medium-sized or large supermarkets, and those in the fourth to very large supermarkets and hypermarkets. The relationship between efficiency and both capital intensity and better-trained employees is curvilinear (U-shaped). There are also significant differences according to the region to which the company belongs.
Some extensions of this work follow from the results obtained. One example is the correct definition of retail formats, against which the results obtained in this work could be compared. This is a complicated issue, since this information is not currently available in Spain.
The curvilinear relationship of efficiency with capital per employee and with wages per employee calls for deeper future study. In particular, in the decreasing region, what are the implications of the inverse relationship between capital intensity or salary by worker and efficiency? In the case of capital intensity, this variable picks up the effect on efficiency of the combination of inputs. One possible explanation is that changes in efficiency generated by a technical innovation depend on its nature and diffusion. If it is easy for firms to adopt, then the change affects efficiency positively, whereas if it requires an important investment as well as organizational modification, then it could cause a shift in the frontier, thus increasing the relative distance. This means that even if an increase in the capital stock improves efficiency, undertaking it at a different time from the rest of the firms could cause productivity losses derived from the capital adjustment in the short run. In the case of salary by worker, this inverse relationship could be linked with the minimum wages received, or with the excessive training of employees and the consequent frustration of performing tasks below their qualifications. Reflecting on these issues in future studies could help bring out best business practices and help policy makers to make decisions with more and better information.

Notes
1. The choice of the tuning parameter m will determine the level of robustness of the analyses and at m = z we strike the balance between being restrictive enough to reduce the effect of outliers and allowing for meaningful comparisons across the sample at the same time.
2. We have worked under variable returns to scale (VRS) as the type of technology. The advantage of the VRS assumption is that we can look at the relative performance of firms of a similar size within the same methodology.
3. As can be seen in Figure 1b the distribution of the efficiency is not normal. The Shapiro-Wilk normality test computed for efficiency distribution gives a Z statistic of 13.87 significant at 1%.