Response Surface Methodology (RSM) consists of a collection of statistical and mathematical techniques useful for the development and interpretation of polynomial equations (see Box & Draper, 1987; Myers & Montgomery, 1995). Although this technique has been adopted mainly in the engineering and industrial world, it can also be applied in education (see Meyer, 1963), and over the last decade it has become increasingly interesting for the social sciences (see Arnau, 1997). RSM is particularly advantageous in situations where several independent variables (*x*_{1}, *x*_{2}, ... , *x*_{k}) are supposed to have a potential influence on a performance measure called the response (*y*). Figures 1 and 2 show the relationship between a response (*y*) and two independent variables (*x*_{1} and *x*_{2}). For each value of *x*_{1} and *x*_{2} there is a corresponding value of *y*, and the values of *y* can be viewed as a surface lying above the *x*_{1}, *x*_{2} plane. It is this graphical perspective of the problem that led to the term RSM.
The main objectives of RSM are to investigate the space of the independent variables, to test empirical models for developing an appropriate approximating relationship between the response and the independent variables, and to find, through optimization methods, the values of *x*_{1}, *x*_{2}, ... , *x*_{k} that produce the most desirable values of *y*.
When we investigate a response *y* that depends on the input variables *x*_{1}, *x*_{2}, ..., *x*_{k}, the postulated relationship is:
*y = f (x*_{1},x_{2},...,x_{k}) + e
This empirical model is called a response surface model. The form of the true response function *f* is unknown and *e* represents other sources of variability not accounted for in *f*. If we assume that *e* is a statistical error with a normal distribution with mean 0 and variance σ^{2}, then:
*E(y)= E[f (x*_{1},x_{2},...,x_{k})] + E(e)= f (x_{1},x_{2},...,x_{k})
Because the form of the true response function *f* is unknown, we must approximate it. In fact, developing a suitable approximation for *f* is one of the main purposes of RSM. This can be done by estimating first-, second- or *n*th-order polynomials. For the case of two independent variables, the first-order model corresponds to the following polynomial equation:
*y= b*_{0}+ *b*_{1}x_{1}+b_{2}x_{2} + e (1)
where *y* is the response, the *b*_{i} are a set of unknown parameters, *x*_{1} and *x*_{2} are the independent variables, and *e* is the random error. Equation 1 is sometimes called the main-effects model because it includes only the main effects of the independent variables. If there is an interaction, it can be added to this model, yielding a first-order model with interaction. That is:
*y = b*_{0} + b_{1}x_{1} + b_{2}x_{2} + b_{12}x_{1}x_{2} + e
Adding the interaction term (*x*_{1}x_{2}) introduces curvature into the response function. Sometimes the curvature is strong enough that the first-order model is inadequate and a second-order model is required. For the case of two variables, the second-order model is:
*y = b*_{0} + b_{1}x_{1} + b_{2}x_{2} + b_{11}x^{2}_{1} + b_{22}x^{2}_{2}+ b_{12}x_{1}x_{2} + e (2)
This model includes the interaction term (*x*_{1}x_{2}) and the quadratic terms of the independent variables (*x*^{2}_{1}, *x*^{2}_{2}), and it is the most widely used model in RSM.
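As an illustration, the second-order model of equation 2 can be fitted by ordinary least squares. The following is a minimal sketch in Python on simulated data; the true coefficients, the noise level and the use of numpy are assumptions made for the example, not part of the original study.

```python
import numpy as np

# Illustrative data only: a second-order surface with assumed coefficients
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(-1, 1, n)
x2 = rng.uniform(-1, 1, n)
# Assumed true surface: y = 1 + 2*x1 - x2 + 0.5*x1**2 + noise
y = 1 + 2 * x1 - x2 + 0.5 * x1**2 + rng.normal(0, 0.1, n)

# Design matrix with intercept, linear, quadratic and interaction terms
# (the columns mirror equation 2: b0, b1, b2, b11, b22, b12)
X = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # estimates of b0, b1, b2, b11, b22, b12
```

With enough data and modest noise, the estimates recover the coefficients of the generating surface, which is the sense in which the polynomial "approximates" the true response function.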
It should be noted that there is a close connection between RSM and linear regression analysis. That is, the polynomial models described above are linear functions and represent multiple regression models. For example, the first-order model defined in equation 1 can be expressed as a regression equation by:
*y = b*_{0} + *b*_{1}x_{1} + *b*_{2}x_{2} + e (3)
And the second-order model defined in equation 2 (if we change the terms: *x*^{2}_{1} = x_{3}, *x*^{2}_{2} =*x*_{4}, *x*_{1}*x*_{2}= *x*_{5}, *b*_{11} = *b*_{3},* b*_{22} = *b*_{4} and* b*_{12} = *b*_{5}) can be expressed as a linear regression equation by:
*y = b*_{0} + *b*_{1}x_{1} + *b*_{2}x_{2} + *b*_{3}x_{3} + *b*_{4}x_{4} + *b*_{5}x_{5} + e (4)
As in multiple linear regression models, the *b*_{i} parameters can be estimated by using the *method of least squares*. The fitted regression model can be written in matrix notation as:
*y = Xb + e*

where *y* is the vector of observed responses, *X* is the matrix of the levels of the independent variables, *b* is the vector of regression coefficients, and *e* is the vector of random errors.
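The least-squares estimator solves the normal equations *X'Xb = X'y*. A minimal numerical sketch, on simulated data for the first-order model (the data values and the use of numpy are assumptions for illustration):

```python
import numpy as np

# Illustrative data for the first-order model y = b0 + b1*x1 + b2*x2 + e
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 3 + 1.5 * x1 - 2 * x2 + rng.normal(0, 0.2, n)

X = np.column_stack([np.ones(n), x1, x2])   # design matrix with intercept
b_hat = np.linalg.solve(X.T @ X, X.T @ y)   # normal equations: (X'X) b = X'y
print(b_hat)
```

In practice a QR- or SVD-based solver (e.g. `np.linalg.lstsq`) is preferred over forming *X'X* explicitly, but the normal-equations form makes the matrix notation above concrete.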
RSM allows the development and interpretation of models of different complexity, and it is exploratory and sequential in nature. That is, the experimenter seeks the values of the independent variables for which the response *y* is a maximum and estimates polynomials of different orders to approximate the form of the true response function. A graphical interpretation of the problem may help to detect the maximum response straightforwardly. When there are two independent variables, this can be viewed in a three-dimensional space.
RSM and the study of the person-organization fit
Person-organization (P-O) fit is a topic that is becoming widely popular among organizational researchers (see Kristof, 1996; O’Reilly et al., 1991; Ximénez, 1998). It is based on the person-environment (P-E) fit approach to the study of behavior, where P-O fit is defined as the degree of correspondence or compatibility between the person and the organization. Most studies have found a relationship between P-O fit and work outcomes such as job satisfaction, performance, stress, intent to leave and turnover (see Kristof, 1996 for a review). However, they struggle with methodological problems, the main ones being the measurement of fit and the examination of its relationship with a work outcome. Fit is generally measured by calculating a single index that expresses the degree of correspondence between separate, commensurate measures of P and O. Most studies use fit indices based on difference scores (i.e. an algebraic difference (O - P) or a quadratic difference (O - P)^{2}). However, several authors have noted that difference scores introduce substantive and methodological problems (see Cronbach & Furby, 1970; Johns, 1981). For example, they confound the individual contributions of the P and O measures to the overall score, discard information about the absolute levels of P and O, and impose restrictive constraints on the joint relationship of P and O with an outcome that are seldom empirically demonstrated, leaving the exact nature of that relationship ambiguous.
Recent work by Edwards (1991; 1994), Edwards and Harrison (1993) and Edwards and Parry (1993) has revealed that polynomial regression analysis is a procedure capable of overcoming these methodological problems. Its use is based on the assumptions that: (1) the relationship between two entities (P and O) and an outcome should be considered in three dimensions, (2) the analysis should use three-dimensional response surfaces to depict the joint relationship of P and O with the outcome, and (3) the constraints implied by fit indices should be treated as hypotheses that must be tested and which, if supported, would lend credibility to the proposed model.
For example, if we consider the algebraic difference between O and P as a single predictor of a work outcome (say *Z*), the corresponding equation is:
*Z= b*_{0} + b_{1} (O - P) + e (5)
Expanding this equation yields:
*Z= b*_{0} + b_{1} O - b_{1}P + e (6)
This shows that the algebraic difference index (labeled *d*) constrains the coefficients on P and O to be equal in magnitude and opposite in sign. If we consider P and O as separate predictors of the outcome Z the corresponding unconstrained equation is the first-order polynomial:
*Z= b*_{0} + b_{1} O + b_{2}P + e (7)
Equation 7 is mathematically equivalent to equation 6, but it relaxes the constraints on the O and P coefficients, allowing them to independently take on whatever values maximize the variance explained in Z, and allowing the constraint (*b*_{1} = -*b*_{2}) to be tested before lending credibility to the proposed model.
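The constraint test described here can be sketched as a nested-model F test: fit the constrained equation 5 and the unconstrained equation 7 and compare their residual sums of squares. The data below are simulated so that the constraint actually holds; the data values and the use of numpy/scipy are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Simulated O, P and outcome Z; Z is generated so that b1 = -b2 holds
rng = np.random.default_rng(2)
n = 300
O = rng.normal(size=n)
P = rng.normal(size=n)
Z = 0.5 - 0.8 * (O - P) + rng.normal(0, 1, n)

def sse(X, y):
    """Residual sum of squares of the OLS fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

ones = np.ones(n)
sse_u = sse(np.column_stack([ones, O, P]), Z)    # unconstrained (eq. 7)
sse_c = sse(np.column_stack([ones, O - P]), Z)   # constrained (eq. 5)

q, df_u = 1, n - 3                               # 1 constraint; n - 3 residual df
F = ((sse_c - sse_u) / q) / (sse_u / df_u)
p_value = stats.f.sf(F, q, df_u)
print(F, p_value)  # a large p-value means the constraint is not rejected
```

The same comparison logic underlies the tests of constraints reported in the Results section, although there they were carried out with SYSTAT's hypothesis subroutine.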
The same can be asserted for the quadratic difference between P and O as a single predictor. The following expression corresponds to the constrained equation:
*Z= b*_{0} + b_{1} (O-P)^{2} + e (8)
Expanding and rearranging this equation yields:
*Z= b*_{0} + b_{1} O^{2}- 2b_{1}PO + b_{1}P^{2} + e (9)
If we consider an equation with *O*, *P*, *O*^{2}, *PO* and *P*^{2} as separate predictors (*P* and *O* are included because they are components of the terms *O*^{2}, *P*^{2} and *PO*; see Cohen & Cohen, 1983) we have the second-order polynomial:
*Z= b*_{0} + b_{1}O + b_{2}P + b_{3}O^{2} + b_{4}PO + b_{5}P^{2} + e (10)
Comparing equations 10 and 9 reveals that the quadratic difference index (labeled *d*^{2}) imposes four constraints: (1) the coefficients on *O*^{2} and *P*^{2} are equal (*b*_{3}= *b*_{5}), (2) the coefficient on *PO* is two times the one on *O*^{2} and opposite in sign (*b*_{4}= -2*b*_{3}), (3) and (4) the coefficients on *O* and *P* are 0 (*b*_{1}= *b*_{2}= 0).
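These four constraints can also be tested as a set with a nested-model F test, comparing equation 8 against equation 10. A hedged sketch on simulated data generated under the constraints (the data values and the use of numpy/scipy are assumptions for illustration):

```python
import numpy as np
from scipy import stats

# Simulated O, P and Z; Z depends on (O - P)^2 only, so all four constraints hold
rng = np.random.default_rng(3)
n = 300
O = rng.normal(size=n)
P = rng.normal(size=n)
Z = 1.0 + 0.6 * (O - P) ** 2 + rng.normal(0, 1, n)

def sse(X, y):
    """Residual sum of squares of the OLS fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

ones = np.ones(n)
X_u = np.column_stack([ones, O, P, O**2, P * O, P**2])  # unconstrained (eq. 10)
X_c = np.column_stack([ones, (O - P) ** 2])             # constrained (eq. 8)

q = 4                        # constraints: b3 = b5, b4 = -2*b3, b1 = 0, b2 = 0
df_u = n - 6                 # residual df of the unconstrained model
sse_u, sse_c = sse(X_u, Z), sse(X_c, Z)
F = ((sse_c - sse_u) / q) / (sse_u / df_u)
p_value = stats.f.sf(F, q, df_u)
print(F, p_value)
```

Rejecting this joint test indicates that the *d*^{2} index does not adequately summarize the joint relationship of O and P with Z, which is the logic applied to the empirical data below.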
In sum, this procedure consists of applying the fundamentals of RSM to the study of P-O fit. It allows the exploration of different models of the relationship between P-O fit and a work outcome and a more reliable approximation of its three-dimensional nature. Using this technique, Edwards and Harrison (1993) reanalyzed the data from a traditional study of P-E fit and strain, demonstrating that when the imposed constraints were relaxed, the average proportion of variance explained nearly tripled, and that the relationship between P, E and strain was more complex than the restricted models underlying the difference-score measures.
In this article, we present a study that constitutes an application of RSM to the study of P-O fit. We explore the relationship between two parallel P and O measures and a measure of intent to leave using the Edwards procedure. The work outcome measure included here was chosen because it is one of the most studied in P-O fit research. Empirical studies have found support for a negative and direct relationship between P-O fit and intent to leave, but the three-dimensional relationship between P, O and intent to leave has not been examined in such depth.
Method
Sample and data collection procedure
The study presented here is part of a research project in which a set of attitudinal questionnaires was mailed to a randomly selected sample of 2,000 former students of the Universidad Autónoma de Madrid working in Spanish organizations (see Ximénez, 1998). Data were collected using a self-administered questionnaire that was answered anonymously. A total of 591 subjects returned valid responses: 296 men and 295 women, with ages ranging from 24 to 50 years (mean 35, standard deviation 6.21). All subjects held a bachelor's degree: 18% in sciences, 13% in law, 26% in economics & management, 14% in humanities & philosophy, 13% in psychology, 14% in medicine and 3% in elementary teaching. 8% of the subjects had been working for less than 1 year and the rest between 1 and 30 years (mean 8, standard deviation 6.16). 20% work in public administration, 17% in banks, 18% in teaching/training, 17% in hospitals, 9% in communication, 10% in industry and 9% in research centers.
Measures
*Person (P) and organization (O)* components are measured with 11 parallel items selected from the Minnesota Importance Questionnaire (MIQ; Gay, et al., 1971) and the Minnesota Job Description Questionnaire (MJDQ; Borgen, et al., 1968). These items are representative of four of the six commensurate dimensions found in the MIQ and MJDQ: *Safety* (4 item pairs), *Autonomy* (3 item pairs), *Status* (2 item pairs) and *Achievement* (2 item pairs). Items were presented in the center of a three-column table and subjects rated their importance in terms of P and O separately on a 5-point Likert-type scale (see Ximénez & San Martín, 1998 for more details about the P and O measures and the results of the commensurate analysis). Responses were averaged to yield P and O scores. Following the suggestion of Edwards and Harrison (1993), before the analyses we scale-centered the P and O measures by subtracting the midpoint of their scales (2.50). This transformation reduced multicollinearity in later regression analyses and provided a meaningful interpretation of the regression coefficients when higher-order terms were included.
*Intent to leave* is measured with the four items of the O’Reilly, et al. (1991) intent to leave measure, rated on a 5-point Likert-type scale (e.g. ‘to what extent would you prefer a more ideal job than the one in which you now work?’). The four items were averaged to yield an overall score.
Analysis procedure
Calculation of fit indices
P-O fit is defined here as ‘the perceived correspondence between the person’s preferences (*P*) and the degree to which the organization rewards them (*O*)’ and is operationalized with two of the classical fit measures used in P-E fit studies (see Edwards, 1991). The first is the *d* index, the algebraic difference between O and P (O - P). This index reflects a monotonic and linear relationship between O, P and an outcome Z. The second is the *d*^{2} index, the squared difference between O and P ((O - P)^{2}). *d*^{2} represents a curvilinear relationship between O, P and Z in which either the maximum or the minimum of the response Z occurs when O equals P. Results of previous studies suggest that this is the expected relationship for P-O fit and intent to leave (i.e. Z reaches its minimum when O = P). Once the fit indices are calculated, we compute their correlations with intent to leave.
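Computing the two indices and their correlations with an outcome is straightforward. A minimal sketch on simulated O, P and Z (the values and the use of numpy are assumptions for illustration, not the study's data):

```python
import numpy as np

# Illustrative O, P and outcome Z; Z decreases linearly with (O - P)
rng = np.random.default_rng(5)
n = 100
O = rng.normal(size=n)
P = rng.normal(size=n)
Z = -0.5 * (O - P) + rng.normal(0, 1, n)

d = O - P            # algebraic difference index
d2 = (O - P) ** 2    # quadratic difference index

r_d = np.corrcoef(d, Z)[0, 1]    # correlation of d with the outcome
r_d2 = np.corrcoef(d2, Z)[0, 1]  # correlation of d^2 with the outcome
print(r_d, r_d2)
```

With a linear generating relationship, the *d* index correlates negatively with Z while the *d*^{2} index captures little, illustrating how the choice of index presupposes a functional form.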
Estimating constrained and unconstrained equations
Each fit measure has a corresponding equation representing the relationship between P-O fit and a work outcome Z. The first step thus consists of identifying the constrained and unconstrained equations for each fit measure (see table 1). The *b*_{i} parameters are estimated through multiple linear regression analysis using the statistical package SPSS 7.15.2S.
Evaluating the suitability of each model
The decision whether each model is appropriate to explain the relationship between O, P and Z is made following the criteria proposed by Edwards and Harrison (1993). Support is inferred if: (1) the overall unconstrained equation is significant, (2) the coefficients are significant and in the right direction, (3) the imposed constraints are not rejected, and (4) no significant higher-order terms beyond those specified by the unconstrained equation are found. The constraints for each fit model are tested as a set using the hypothesis subroutine of the Multivariate General Linear Hypothesis module of the statistical package SYSTAT 6.0 (Wilkinson, 1990b). Tests of higher-order terms included terms one order higher than those in the unconstrained equations (i.e. *O*^{2}, *P*^{2} and *PO* for the *d* index). Finally, a 3-D graphical interpretation of the problem is offered, representing the observed relationship between O, P and intent to leave (raw data) as compared to the approximated constrained and unconstrained models. The graphs were made with SYSGRAPH (Wilkinson, 1990a).
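Criterion (4) can be sketched as a hierarchical F test on the R^{2} increment obtained when terms one order higher (*O*^{2}, *P*^{2}, *PO* for the *d* index) are added to the unconstrained linear equation. The data below are simulated to be purely linear; the values and the use of numpy/scipy are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Simulated O, P and Z with a purely linear (planar) relationship
rng = np.random.default_rng(4)
n = 300
O = rng.normal(size=n)
P = rng.normal(size=n)
Z = 0.5 - 0.4 * O + 0.4 * P + rng.normal(0, 1, n)

def r2(X, y):
    """R-squared of the OLS fit of y on X (X includes the intercept)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return 1 - (r @ r) / ((y - y.mean()) @ (y - y.mean()))

ones = np.ones(n)
r2_linear = r2(np.column_stack([ones, O, P]), Z)
r2_full = r2(np.column_stack([ones, O, P, O**2, P**2, P * O]), Z)

q, df_full = 3, n - 6   # 3 added terms; residual df of the full model
F = ((r2_full - r2_linear) / q) / ((1 - r2_full) / df_full)
p_value = stats.f.sf(F, q, df_full)
print(F, p_value)  # a large p-value: no significant higher-order terms
```

A non-significant increment supports the planar surface implied by the first-order model, mirroring how criterion (4) is applied to the safety and achievement results below.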
Results
Table 2 shows the descriptive statistics, reliability estimates and correlation coefficients for the P, O and intent to leave measures. All measures yielded reliabilities greater than .70. Corresponding P and O measures were all positively related, with correlations ranging from .09 (ns) for achievement to .20 (p < .05) for status. All O measures were positively and significantly (p < .01) related, with correlations ranging from .58 for achievement and status to .67 for safety and achievement, as were their respective P measures. The intent to leave measure obtained higher correlations with the O measures than with their parallel P measures. Correlations were significant and opposite in sign for all corresponding P and O measures, with the exception of autonomy/P. Table 2 also shows the correlations between the fit measures and intent to leave. For all dimensions except autonomy, the *d* index obtained a slightly stronger relationship with intent to leave. Since these correlations were all negative, they suggest a monotonic relationship. It should be noted that in some instances these correlations were equal to or smaller than those for the O and P measures, indicating that collapsing O and P into a single measure may confound the predictive power of each of them taken separately.
Results for the estimation of constrained and unconstrained equations and the tests of constraints and higher-order terms for both the first- and second-order fit models appear in Table 3. As can be seen, all regression equations reached significance (p < .01). The variance explained by the unconstrained equations ranges between .07 for status and .11 for safety. These coefficients are congruent with the results found in the meta-analysis by Assouline and Meir (1987). Even though all equations were significant, the expected pattern of coefficients was found only for the first-order model explaining the relationship of safety and achievement with intent to leave. Tests of constraints for the *d* model were rejected both for autonomy (F= 5.65, p < .01) and status (F= 20.08, p < .01). The same happened for the second-order model (F= 4.54, p < .01, and F= 9.34, p < .01, for autonomy and status respectively). Therefore, even though the estimated models reached significance for autonomy and status in relation to intent to leave, the restricted models underlying the fit measures do not adequately represent the relationship between O, P and intent to leave. Thus, the significance seems to be due to the O measure and the interaction term PO.
Since the relationship of safety and achievement with intent to leave yielded the strongest correlations and the most comprehensive pattern of results, we will consider them in more detail. The relationship of safety predicting intent to leave yielded O and P coefficients that were significant and opposite in sign (*b*_{1}= -.42, *b*_{2}= .39, p < .01). Tests of the constraints imposed by the *d* index indicated that the coefficients on O and P did not differ in absolute magnitude and were opposite in sign (F= 1.23, p > .05). No significant higher-order terms were found; accordingly, the tests of the constraints imposed by the *d*^{2} index were rejected (F= 5.33, p < .01) and the variance explained by its corresponding unconstrained model (R^{2}= .11, p < .01) did not increase at all. However, it should be noted that the linear coefficients remained significant and opposite in sign (*b*_{1}= -.39, *b*_{2}= .45, p < .01). These results indicate that the data imply a linear negative relationship between O, P and intent to leave. This can be seen more clearly in a graphical representation. Figure 1 presents three-dimensional plots of the raw data, the surfaces predicted by the fit measures and their corresponding unconstrained equations. The raw data (see figure 1a) indicate that intent to leave increases as O decreases and P increases (that is, as the difference O - P becomes larger and negative) and is medium in the region where O and P are equal. This surface is very close to the surfaces predicted by the constrained and unconstrained *d* model (see figures 1b and 1c) but differs substantially from those predicted by the *d*^{2} model (see figures 1d and 1e). In conclusion, even though all equations were significant, the first-order model is the only one that adequately represents the relationship of safety predicting intent to leave.
The relationship of achievement predicting intent to leave also yielded O and P coefficients that were significant and opposite in sign (*b*_{1}= -.28, *b*_{2}= .26, p < .01), and the tests of the constraints imposed by the *d* index were accepted (F= .08, p > .05). However, in this case, even though there was no support for the second-order model (F= 5.07, p < .01), there are significant higher-order terms (i.e. O^{2}= -.09, PO= -.18, p < .05), indicating that the data deviated from the planar surface implied by the linear model. This can explain the form of the surface observed in the raw data (see figure 2a), which presents curvature in the O, P plane, showing that intent to leave increases as the difference (O - P) becomes more negative and that there is a negative moderation effect. This surface is closer to the surfaces predicted by the constrained and unconstrained *d* model (see figures 2b and 2c) than to those predicted by the *d*^{2} model (see figures 2d and 2e), but it cannot be concluded that the first-order model adequately represents the achievement-intent to leave relationship.
Discussion and conclusion
The results of this study have revealed that, even though the two fit measures used here yielded 8 significant correlations with intent to leave, the constraints they impose were supported in only two instances and solely for the linear first-order model. Thus, contrary to what was expected, the relationship between P-O fit and intent to leave is better explained by the linear model than by the curvilinear one. Relaxing the constraints imposed by each of the models increased the adjusted squared multiple correlations from an average of .067 for the constrained equations to an average of .086 for the unconstrained equations, a 25% increment in explained variance. This means that collapsing the P and O measures into a single index confounds the predictive power of each of them taken separately and leaves ambiguous the inherent three-dimensional relationship of P and O with a work outcome.
The results of the study presented here are similar to those found in the study by Edwards and Harrison (1993), demonstrating that RSM is a procedure capable of overcoming the methodological problems in the study of P-O fit. Future research should continue applying the fundamentals of RSM to explore the relationship between O, P and a work outcome and to determine whether it follows the functional form of the conceptual model of interest or of more complex ones. Since the use of more sophisticated mathematical models to describe the underlying process that generates observed data is growing in the social sciences (see e.g. Myung & Pitt, 1997), future research should also include other criteria for model selection to explain the nature of the three-dimensional relationship between O, P and a work outcome.