FDIC Banking Review

Evaluating the Vulnerability of Banks and Thrifts to a Real Estate Crisis
Charles Collier, Sean Forbush, and Daniel A. Nuxoll*

As part of its extensive off-site monitoring efforts, the Federal Deposit Insurance Corporation (FDIC) has evaluated banks’ and thrifts’ vulnerability to the stress of a real estate crisis similar to the crisis that occurred in New England in the early 1990s.1  Asking what would happen to banks and thrifts today if the real estate market were to experience a downturn similar to the one in New England a decade ago, we developed the history of the collapse of the New England real estate market into a stress test, the Real Estate Stress Test (REST), which produces ratings comparable to the CAMELS ratings.2  The REST ratings indicate the severity of the exposure to real estate and therefore identify institutions that appear vulnerable to real estate problems.  The ratings direct the attention of examiners to particular institutions and indicate that the FDIC should be especially concerned about the management of real estate lending at these institutions.  Poor practices there could expose the FDIC to substantial losses.

In addition, REST is able to identify particular areas of the country where a high fraction of the banks and thrifts are vulnerable—areas where the real estate markets might be of concern to bank examiners.  Although these markets may be healthy at the moment, the extent of bank lending in them means that the FDIC must pay particular attention to conditions there.

The results of our research with REST indicate that the institutions most vulnerable to real estate crises today are headquartered in the West and a handful of southern cities.3  The real estate markets in these locations are currently healthy, but because banks—and by extension the FDIC—have substantial exposure to these markets, bank supervisors need to be especially alert to any indication of problems there.

We also find that the most critical risk factor is construction lending, a finding that confirms the conventional wisdom that construction lending is particularly risky.  Many accounts of the savings and loan crisis of the late 1980s and early 1990s discuss commercial and residential construction projects that went awry.4

Because the stress test was developed on data from New England, it may well reflect the distinctive characteristics of events in that region.  However, when REST was backtested on data from Southern California in the late 1980s and early 1990s, it was successful in identifying institutions that later had problems.  More importantly, REST was also successful in identifying troubled banks in parts of the country where real estate downturns were moderate.  These successes suggest that even if a repetition of the severe problems of New England or Southern California is very unlikely, REST can still help identify banks that might suffer difficulties during less severe real estate downturns.

The REST model should not be interpreted as a condemnation of construction lending.  The model does, however, emphasize that risk control is especially important for these loans.  The success of a construction loan depends on the future, not the present, of the real estate market, so construction lending is intrinsically more risky than forms of lending that are secured by liens on existing real property.

The obvious question is why one should focus on New England.  There are three reasons.  First, problems among the banks in New England can be traced directly to the real estate market.5  Second, the number of banks in the region was large enough that statistical models could be estimated relatively easily.  Third, the New England experience is hardly unique.  As the FDIC (1997) documents, commercial real estate was a factor in several distinct sets of banking problems during the 1980s and early 1990s.6  In addition, commercial real estate has been a factor in bank crises in a number of other countries.7  Thus, events in New England constitute a relatively clear case of a problem that is endemic to banking.

Importantly, REST uses Call Report data, so it cannot evaluate pricing, terms, or underwriting—factors critical to controlling the risk of real estate lending.  Moreover, REST does not estimate the condition of the real estate market in any region, state, or metropolitan statistical area (MSA); it identifies markets where banks are exposed to potential real estate problems, not markets where such problems actually exist.  What the REST model can do is identify the banks that are most at risk in the event real estate problems should occur.  In so doing, it sharpens the focus of questions about risk control and real estate markets and therefore makes an important contribution to the FDIC’s off-site monitoring.

This article explains how the model was built with the use of New England data and was tested with the use of data from other historical real estate crises.  The REST results for December 2002 are presented and analyzed, and recent trends—both nationally and for selected states—are discussed.

Method of Examining New England

The central question for the team that built the REST model was whether any model could detect those healthy banks that would be in most danger during periods when real estate became a problem.  To answer this question, we examined the New England real estate crisis of the early 1990s.  In 1987, the economy and the banking industry in New England could have been described as vibrant, but by 1990 the problems were obvious.8  The first stage in developing the REST model involved comparing the banks in New England in 1987 with the banks there in 1990.  All the banks were healthy in 1987, but by 1990 a substantial fraction of them were troubled.  Our analysis used statistical procedures and data from 1987 to find the traits common to the institutions that later had severe difficulties.  This approach seeks to answer the question whether as early as 1987 one could have identified the riskiest banks in New England.

Because the purpose of our project was to evaluate banks’ ability to withstand a crisis such as the one in New England in 1991–1993, banks that had a special function or were somehow atypical were eliminated from the analysis.  Banks considered atypical were those that had equity-to-asset ratios greater than 30 percent or loan-to-asset ratios less than 25 percent.  A total of 13 special-purpose or atypical (or new) banks were eliminated from the December 1987 sample.9

In addition, consolidation just before the crisis had to be taken into account.  In December 1987, 289 New England banks filed Call Reports, but in December 1990 the number had shrunk to 255.  Much of the consolidation appears to have been achieved by mergers of different banks owned by the same holding company.  Regardless of the reason for the consolidation, the performance of the bank resulting from a merger was undoubtedly affected by the characteristics of the banks absorbed in the merger.  Consequently, this project used data adjusted for mergers.10
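To make the merger adjustment described in footnote 10 concrete, the following minimal sketch combines pre-merger Call Report filings under the surviving bank’s identifier.  It assumes the data sit in a pandas DataFrame; the column names (cert, report_date) and the survivor_of mapping are hypothetical, not the FDIC’s actual data layout.

    import pandas as pd

    def merger_adjust(calls: pd.DataFrame, survivor_of: dict) -> pd.DataFrame:
        """Combine pre-merger filings so history reflects the surviving bank."""
        calls = calls.copy()
        # Re-key each absorbed charter to the bank that ultimately absorbed it.
        calls["cert"] = calls["cert"].replace(survivor_of)
        # Balance-sheet and income items simply add across the merged charters,
        # as in footnote 10's example of a January 1988 merger.
        return calls.groupby(["cert", "report_date"], as_index=False).sum(numeric_only=True)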

Finally, because growth rates between 1985 and 1987 were included in the model, only banks that had been in existence for five years (1985–1990) were part of the sample.

The sample contained a total of 203 banks.11
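Pulling the sample screens together, a minimal sketch of the filters applied to the merger-adjusted December 1987 data (again, the column names are hypothetical):

    import pandas as pd

    def build_sample(calls: pd.DataFrame) -> pd.DataFrame:
        """Apply the REST sample filters to merger-adjusted 1987 data."""
        # Drop special-purpose or atypical institutions.
        ratios_ok = ((calls["equity"] / calls["assets"] <= 0.30)
                     & (calls["loans"] / calls["assets"] >= 0.25))
        # Keep banks in existence for the full 1985-1990 window so that
        # 1985-1987 growth rates can be computed.
        seasoned = calls["first_report_date"] <= pd.Timestamp("1985-12-31")
        return calls[ratios_ok & seasoned]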

In the first stage—comparing the conditions and balance sheets of banks at the end of 1987 with the same banks’ conditions and balance sheets at the end of 1990—the model considers 12 variables as measures of health at the end of 1990 (previous work has shown that these variables are closely related to CAMELS ratings, and the FDIC has developed a Statistical CAMELS Off-site Rating [SCOR] model using them).12  The 1987 data include the same 12 variables as well as 12 variables that measure (as a fraction of assets) the types of loans made by the bank, a variable that measures the bank’s growth rate between 1985 and 1987, and a variable that measures the bank’s size in 1987.

The basic results appear in tables 1 and 2.  The reason the two tables differ is that only 3 of the 12 SCOR variables (equity, provisions for loan losses, and net income) can be less than zero, while another of the variables (loans and long-term securities) can be zero in principle but in fact was substantially greater than zero for the whole sample.  These 4 variables were handled by the usual regression technique—ordinary least squares (OLS).  Table 1 reports the results for these 4 variables.

Table 1 The REST Model: Coefficients Estimated with OLS

The other 8 SCOR variables cannot, in principle, be less than zero. These variables—loan-loss reserves, loans past due 30–89 days, past due 90+ days, nonaccruals, other real estate, charge-offs, volatile liabilities, and liquid assets—were fit with a Tobit model.13  Table 2 reports the results for these variables.  For a number of the 8 variables, the results do not differ appreciably from OLS because, as reported in table 2, there were very few zero values.14
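To make the two estimation strategies concrete, here is a minimal sketch in Python (the article’s actual estimation used SAS).  It assumes y holds one 1990 SCOR ratio and X the matrix of 1987 regressors; the Tobit uses the standard left-censored-at-zero likelihood, which approximates, but is not necessarily identical to, the routine the authors used.

    import numpy as np
    import statsmodels.api as sm
    from scipy import optimize, stats

    def fit_ols(y, X):
        # For the four ratios that can be negative (e.g., equity, net income).
        return sm.OLS(y, sm.add_constant(X)).fit()

    def fit_tobit(y, X):
        # For the eight ratios that cannot, in principle, be less than zero.
        X = sm.add_constant(X)
        k = X.shape[1]

        def negloglik(params):
            beta, log_sigma = params[:k], params[k]
            sigma = np.exp(log_sigma)  # keeps sigma positive during optimization
            xb = X @ beta
            ll = np.where(
                y <= 0,
                stats.norm.logcdf(-xb / sigma),                   # censored at zero
                stats.norm.logpdf((y - xb) / sigma) - log_sigma,  # uncensored density
            )
            return -ll.sum()

        start = np.append(np.linalg.lstsq(X, y, rcond=None)[0], 0.0)
        return optimize.minimize(negloglik, start, method="BFGS")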

As mentioned above, the independent variables for the REST model include all the 1987 values for SCOR variables, 12 categories of loans as a fraction of assets, asset growth, and bank size.

Table 2 The REST Model: Coefficients Estimated with Tobit

The SCOR variables represent the condition of the bank in 1987.  In fact, the condition of the bank results partly from the characteristics of the bank, so these 12 variables are proxies for the characteristics of the bank.  For example, one cannot directly observe the quality of a bank’s underwriting, but presumably tighter underwriting results in fewer past-due loans—so the data on loans past due 30–89 days can be seen as a proxy for underwriting standards.

The loan-type variables are important because the New England crisis was a real estate crisis.  Our project included data on several types of real estate loans (1–4 family residential, multifamily housing, agricultural, construction and development, and other nonresidential) as well as other loans (unsecured commercial, to municipalities, to depository institutions, credit card, other consumer, agricultural production).  Presumably banks that held large amounts of real estate loans would be the ones most severely affected by the crisis.

Asset growth between 1985 and 1987 was included because rapidly growing banks are considered especially risky.  Total assets were included in the model because it is usually thought that larger banks can more easily diversify risk away.15

All estimations were done with a stepwise procedure.16  This method starts with all 26 variables (12 SCOR variables, 12 loan-type variables, asset growth, and size) and eliminates those that are not statistically significant.  The stepwise method was necessary because some variables have coefficients that are very large but statistically insignificant.  Although inclusion of these variables improves the in-sample fit of the model, it does so only very slightly.  If the coefficients are large, however, inclusion of these variables in out-of-sample forecasting would almost certainly have an effect on the forecasts despite the complete absence of statistical evidence that these variables matter at all.  Their elimination made very little difference to the fit of the model.
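A sketch of the backward-elimination idea for the OLS equations, assuming the regressors are in a pandas DataFrame and using the 15 percent threshold mentioned in footnote 16 (the actual estimation used SAS’s stepwise routine, with a modified procedure for the Tobit equations):

    import statsmodels.api as sm

    def backward_eliminate(y, X, alpha=0.15):
        """Refit, dropping the least significant variable, until all clear alpha."""
        cols = list(X.columns)
        while cols:
            fit = sm.OLS(y, sm.add_constant(X[cols])).fit()
            pvals = fit.pvalues.drop("const")
            worst = pvals.idxmax()
            if pvals[worst] <= alpha:
                return fit, cols          # every remaining variable is significant
            cols.remove(worst)            # eliminate the weakest and refit
        raise ValueError("no variable is significant at the chosen level")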

New England Results

As noted above, all the results were estimated with a stepwise procedure, and this procedure did not result in a significantly worse fit than if all the variables had been used.  In general, the two sets of estimates are completely consistent with each other.  In fact, most of the coefficients estimated with a stepwise procedure are very similar to those estimated when all the variables are used.17

Although alternative methods produced similar estimates, one should be cautious about interpreting these results.  For example, one cannot conclude that the ratio of loans and long-term securities to assets did not affect asset quality, although that variable has a zero coefficient in all the asset-quality equations (loans past due 30–89 days, loans past due 90+ days, nonaccrual loans, other real estate, charge-offs, and provisions for loan loss).  The effect might be small or inconsistent.  Statistical tests reveal correlation, not causation.18  When the correlation is strong and consistent with theory, however, there is good reason to take statistical results seriously.  With that in mind, one should note several features of the results.

First, this approach captures much of the variation between banks.  Near the bottom of both table 1 and table 2 there is a line reporting that R² is between 0.30 and 0.60 for most of the results.19  This means that 1987 data can account for about 30–60 percent of the differences between banks in 1990.  The major exception to this result is that the variable “loans past due 90+ days” has an R² of only 0.1306.20

Second, most variables are mean-reverting.  That is, the banks that were exceptional in 1987 tended to resemble the average (mean) bank more closely by 1990.  The coefficients on the lagged variables show this effect.  For example, consider the effect of lagged equity on equity.  From table 1, the estimated coefficient is 0.535.  This means that an extra 1 percent equity in 1987 would lead to an extra 0.535 percent equity in 1990.  Importantly, the coefficient is between 0 and 1, indicating that banks with unusually high levels of equity in 1987 still had unusually high levels of equity in 1990, but other things being equal, differences in equity levels shrank during those three years.  An inspection of tables 1 and 2 shows that the only variables without a strong mean-reverting component are provisions, reserves, nonaccrual loans, and other real estate.
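A worked example of the mean-reversion reading of that coefficient, holding all other variables at their means (the 2-percentage-point gap is hypothetical):

    coef = 0.535                # estimated effect of 1987 equity on 1990 equity
    gap_1987 = 2.0              # bank starts 2 percentage points above the average
    gap_1990 = coef * gap_1987  # gap shrinks to about 1.07 points by 1990
    print(f"equity gap: {gap_1987:.2f} pts in 1987 -> {gap_1990:.2f} pts in 1990")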

This observation suggests that most of the SCOR variables reflect something fundamental about the operations of a bank.  Banks with higher than average levels of loans past due 30–89 days tend to have higher than average levels even three years later.  Conceivably, high levels of past-due loans may reflect a less cautious underwriting philosophy.

This interpretation of the SCOR variables is supported by the data.  For example, high levels of loans past due 30–89 days might be considered a sign that the bank is more willing to take risks.  In fact, high levels of loans past due 30–89 days in 1987 are associated with lower net income and more nonaccrual loans, more other real estate, more charge-offs, and more provisions in 1990.

The third feature of our New England REST results is that most of the loan-type variables have the expected coefficients.  High levels of commercial real estate loans in 1987 were associated with poor performance in 1990.  It should be noted that construction and development loans in particular were problems for New England banks.  Although other types of commercial real estate (nonresidential real estate and multifamily housing) were associated with problems, construction loans were the major problem: they were significant in almost every regression, and they generally had a larger effect than other types of commercial real estate loans.  High levels of commercial and industrial (C&I) loans and other consumer loans also seem to have been a risk factor.  Credit card loans were not a special problem, and loans to municipalities helped shield banks from the downturn.

Fourth, high asset growth between 1985 and 1987 also resulted in poor performance by 1990.  The signs on log assets are consistent with the theory that larger institutions were more diversified and more aggressive in facing their problems in 1990.  Large institutions had fewer past-due loans; on the other hand, they had more nonaccrual loans, reserves, charge-offs, and provisions.  They also had lower net income, but that result seems to be driven completely by the higher provisions.

There are some other interesting features of the results.  Banks with high net income in 1987 tended to have higher equity in 1990.  Banks with high levels of reserves in 1987 performed better in 1990.  This last finding is consistent with the interpretation that more-conservative banks tend to recognize losses more quickly and reserve against them.  This interpretation, in turn, is consistent with the observation that charge-offs in 1987 are negatively correlated with loans past due 30–89 days in 1990.  Banks that relied on noncore liabilities also tended to have more difficulties in 1990 (lower income, more loans past due 30–89 days, and more other real estate).  This result is consistent with the notion that banks that use noncore liabilities may be more aggressive and take more risks.

And there are some anomalies.  High levels of other real estate in 1987 are correlated with low levels of loans past due 90+ days and nonaccrual loans in 1990.  This might reflect differences in workout policies.

Out-of-Sample Testing

Although these results are intrinsically interesting as an analysis of past events in New England, the goal of our project was to develop a forecasting tool that could identify banks most likely to have difficulties during future real estate downturns.  To test whether REST had forecasting power, we applied it to other real estate crises.  Because these tests involved banks that were not in the sample used to build the model, they are called “out-of-sample” tests.

Southern California experienced a real estate crisis at about the same time as New England.  To test the validity of the New England results, we forecasted 1991 SCOR ratios on the basis of 1988 data for Southern California banks.  The banks included all California banks overseen by the FDIC’s Los Angeles East, Los Angeles West, and Orange County field offices.  Again, all institutions with loan-to-asset ratios less than 25 percent or equity-to-asset ratios greater than 30 percent were excluded.  The sample contained 242 banks, 173 of which had a composite CAMELS rating of 1 or 2 as of year-end 1988.

The banks in California differed from those in New England in a number of ways.  First, California banks in 1988 were generally in worse shape than New England banks in 1987.  None of the New England banks in the sample had a CAMELS rating worse than 3 at year-end 1987, but 20 (8.3 percent) of the Southern California banks were rated 4 at year-end 1988.  Second, the shock to the New England economy and therefore to the region’s banks was both shorter and more severe.  Of the 203 New England banks, 33—16.3 percent—failed, a percentage slightly higher than the percentage in California (33 of 240, or 13.8 percent).  Moreover, in New England the bulk of the failures (29) were concentrated in a two-year period (1991 and 1992), whereas in Southern California the failures were spread out over three years (1992–1994).  Third, structural differences between the two regions’ banking industries were significant: California had permitted statewide branching for decades, whereas the banking industry in New England was more segmented.

The stress test did not do particularly well at forecasting individual ratios.  For example, the model was not able to identify those banks that experienced large increases in nonaccrual loans.  This is not too surprising, because management’s decisions about handling problems determine how the problems affect the bank’s balance sheet and income statement.  If bank management delays dealing with real estate problems, the bank will tend to have higher levels of other real estate owned or nonaccrual loans; if management deals with the problems aggressively, those same problems may affect the bank’s provisions, charge-offs, income, and capital.  Even a perfect model cannot forecast how management will deal with problems.

However, bank supervisors do not evaluate banks in terms of individual ratios but in terms of the overall condition of the bank.  Consequently, the major issue is whether the stress test can forecast bank condition.  The SCOR model can be used to translate the 12 SCOR ratios into a forecasted CAMELS rating.21  These ratings are the REST ratings.
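Schematically, and assuming (as Collier et al. 2003 describe for SCOR) that the model assigns each institution a probability of each composite rating, the translation can be sketched as follows.  The scor_probabilities argument is a hypothetical stand-in for the full SCOR model.

    def rest_rating(forecast_ratios, scor_probabilities):
        """Turn the 12 forecasted SCOR ratios into a single REST rating.

        scor_probabilities stands in for the full SCOR model: it maps the
        ratios to P(rating = 1), ..., P(rating = 5).
        """
        probs = scor_probabilities(forecast_ratios)
        # The rating is the probability-weighted average, so it runs from 1 to 5.
        return sum(r * p for r, p in zip(range(1, 6), probs))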

Table 3 compares 1988 REST ratings with CAMELS ratings and with failures between 1992 and 1995.  All the banks used for compiling table 3 were 1 or 2 rated as of December 1988, and all banks survived until at least December 1991.  If the bank failed between 1992 and 1995, the bank is identified as a failure.  Otherwise, the bank’s reported CAMELS rating is the worst rating it received between 1992 and 1995.22

Table 3 Performance of Stress Test in Southern California

Several considerations underlie this approach.  First, the ultimate concern of supervision is troubled banks; hence, one should concentrate on the worst ratings.  Second, banks that are rated 3 or worse have already been identified as potential problems, and the critical question is which banks currently regarded as sound are likely to develop problems.23  Third, as noted above, events in Southern California evolved over a number of years.  Problems at a bank that were obvious at the end of 1993 might not have been evident at the end of 1991.  Using the worst rating during the crisis years 1992–1995 avoids the issue of timing.  This method considers the banks that encountered difficulties, regardless of when the problems actually occurred.

Forecasting models have two types of error: they fail to identify the banks that are downgraded (Type I error), and they identify banks that are not downgraded (Type II error).24  This article analyzes the number of banks that the model correctly identified, so it refers to Type I accuracy and Type II accuracy.  The emphasis is on problem banks (banks with a CAMELS rating of 4 or 5) and failures.  Failures cost the FDIC money to resolve, and problem banks are in danger of failing and require considerably more supervisory resources.
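In code, the two accuracy measures used below can be sketched as follows, where each bank is a (rest_rating, outcome) pair and higher REST ratings are worse; the cutoff and the outcome labels are illustrative choices, not the article’s data format.

    def type1_accuracy(banks, cutoff=3.5, bad=("fail",)):
        """Share of bad-outcome banks that the model flagged in advance."""
        bad_banks = [(r, o) for r, o in banks if o in bad]
        flagged = [(r, o) for r, o in bad_banks if r > cutoff]
        return len(flagged) / len(bad_banks)

    def type2_accuracy(banks, cutoff=3.5, bad=("fail", "problem")):
        """Share of flagged banks whose outcome was actually bad."""
        flagged = [(r, o) for r, o in banks if r > cutoff]
        hits = [(r, o) for r, o in flagged if o in bad]
        return len(hits) / len(flagged)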

Panel A of table 3 shows the raw numbers, while panel B reports Type I accuracy and panel C reports Type II accuracy.  Ideally all banks that failed would have REST ratings worse than 4.5 (100 percent Type I accuracy), and the “failed” line in panel B would have a 100 percent in the last column.  In addition, ideally all the banks with REST ratings of 4.5 or worse would eventually have a CAMELS rating of 5 or would fail (100 percent Type II accuracy).  In that case, the column for REST ratings greater than 4.5 in panel C would have numbers that sum to 100 percent in the lines for CAMELS ratings of 4 or 5 or for failures.

Table 3 shows that the model is not perfect but that it does correctly identify a large percentage of problem banks and failures.  Consider Type I accuracy first.  Panel A indicates that 15 banks failed; 4 of the 15 had REST ratings worse than 4.5, while 8 of the 15 had REST ratings between 3.5 and 4.5.  Panel B shows this is 27 percent and 53 percent of the failures, respectively. Thus, if banks with REST ratings of 3.5 or worse are targeted, the Type I accuracy for failures is 80 percent.  A similar analysis shows that for problem banks, REST has a Type I accuracy of 68 percent.

The analysis of Type II accuracy also shows that REST is quite accurate.   Panel C indicates that among the banks with REST ratings of 4.5–5, 62 percent became problem banks and 12 percent failed; put differently, 74 percent either failed or were in danger of failing.  For REST ratings of 3.5–4.5, 28 percent were problem banks, and 13 percent failed.  In other words, just over 40 percent had severe difficulties.

Several points about this backtesting should be emphasized.  All the banks were rated 1 or 2 at the time of the December 1988 Call Report, and the examination ratings were given three to seven years after the Call Report.  In short, REST did a reasonably good job of identifying which sound banks were most likely to encounter difficulties three to seven years later.

However, the example of Southern California was chosen precisely because real estate problems were severe there.  This is a critical piece of information.  The stress test identifies banks that could become problems if there were a real estate crisis.  REST does not identify real estate markets that are susceptible to crisis.  The backtest was successful because the Southern California market did in fact have a crisis; REST did not identify that market as one vulnerable to crisis.  In the jargon of forecasting, the stress test provides conditional, not unconditional, forecasts.25

Both New England and Southern California suffered from extremely bad real estate problems.  REST has also been backtested on episodes of less severe real estate problems.  For example, table 4 reports the results (based on December 1987 data and examination ratings from the period 1991–1994) for banks headquartered in the Atlanta MSA.

Table 4 Performance of Stress Test in Atlanta

The problems in Atlanta were clearly less severe than those in New England or Southern California.  Only one bank failed, and no banks received a CAMELS 5 rating.  Nonetheless, institutions identified by the stress test were more likely to have severe difficulties.  Only 2 of the 17 institutions (12 percent) with REST ratings better than 3.5 received a CAMELS 4 rating, but 11 of the 30 institutions (37 percent) with REST ratings worse than 3.5 later became problem banks or failed.  Again, all these banks were CAMELS rated 1 or 2 at year-end 1987.26

Forecasts Based on December 2002 Data

The stress test has been run at the FDIC since 1999, and the ratings are distributed every quarter to FDIC examiners and analysts as well as to the other banking regulatory agencies.  Tables 5, 6, and 7 summarize a recent set of ratings—those based on the December 31, 2002, Call Report data.  In contrast to the backtests, these tables report on all institutions regardless of CAMELS rating.27  However, institutions with equity-to-asset ratios exceeding 30 percent or loan-to-asset ratios less than 25 percent are omitted.  Table 5 reports the results by FDIC region, table 6 by state (omitting U.S. territories), and table 7 by selected MSAs.

Table 5 shows that the banks in the San Francisco and Atlanta regions are unusually vulnerable to real estate problems.  Of the 699 institutions in the San Francisco region, 162 (23.2 percent) had ratings of 3.5–4.5, and 250 (35.8 percent) fell into the worst category, with ratings of 4.5–5.0.  This last number is the most significant, since these are the institutions that the model identifies as especially vulnerable to real estate problems.  In the Atlanta region, 244 (21.2 percent) had ratings between 3.5 and 4.5, and 332 (28.8 percent) had ratings worse than 4.5.  In the rest of the nation, only 12.7 percent were rated between 3.5 and 4.5, and only 9.5 percent were rated worse than 4.5.  In short, the model indicates that institutions in the West and Southeast are approximately three times more likely to be vulnerable to a real estate crisis than institutions in other parts of the country.

Table 5 REST Ratings by FDIC Region

Table 5 also indicates some regions of secondary concern, notably the Dallas and Memphis regions.

In table 6 (the results reported by state) the states are ranked by the percentage of institutions with stress-test ratings worse than 4.5.28  This table clearly indicates that the vulnerable institutions are concentrated geographically, with 6 of the top 10 states being in the San Francisco region.  In addition, there are only 11 states in which 30 percent or more of the institutions are extremely vulnerable, and only 4 more in which the percentage is between 20 percent and 30 percent.

Table 6 REST Ratings by State

Table 7 presents the data by MSA, though it includes only MSAs where at least 10 banks or thrifts are headquartered.29  Again, the MSAs are ranked by the percentage of institutions with stress-test ratings worse than 4.5.  Only the top 20 MSAs are reported in the table, and the table confirms that these MSAs are unusual.  On average, REST assigns almost 60 percent of the banks and thrifts in these MSAs a rating of 4.5 or worse, whereas for all other MSAs the comparable number is approximately 20 percent.  Clearly, the FDIC should be especially concerned about the health of real estate markets in these MSAs.30

Table 7 REST Ratings by MSA

Analysis of the December 2002 Forecasts

The results of the stress test can be analyzed much as the results of the SCOR model are.  With the SCOR model, one can attribute the reasons for a forecasted CAMELS downgrade to specific variables by comparing the bank’s ratios with the median ratios of all banks currently rated 2.  The same technique can be used with REST.31

For purposes of examining the REST ratings, we defined the benchmark as the median ratios of all institutions currently rated 1 or 2.  This standard of comparison cannot be identified with any existing institution; it is a composite—the “average” institution with a 1 or 2 rating.32  This benchmark is used to calculate “weights” that trace the reason for poor ratings back to specific ratios.  The weights are in terms of percentages so they necessarily sum to 100 percent.  The percentages can be negative if the ratio is better than the standard.  Importantly, the weights are not used in the estimation; they are merely a method of comparing an institution that has received a poor rating with an average institution.
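A sketch of the weighting idea: replace one ratio at a time with the benchmark value, measure how much the rating improves, and normalize the contributions to 100 percent.  The actual REST calculation is somewhat more involved (see footnote 31), and the rate function here stands in for the full model.

    def rating_weights(bank, benchmark, rate):
        """Attribute a REST rating to individual ratios, as in tables 8 and 9."""
        contributions = {}
        for name in bank:
            swapped = dict(bank, **{name: benchmark[name]})
            # Improvement in the rating when this one ratio is set to the
            # benchmark; negative if the bank is better than the benchmark.
            contributions[name] = rate(bank) - rate(swapped)
        total = sum(contributions.values())
        # Normalize so the weights sum to 100 percent.
        return {name: 100.0 * c / total for name, c in contributions.items()}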

Table 8 Sample Stress-Test Rating, Hypothetical Bank A

Tables 8 and 9 illustrate how the weighting procedure can be used to analyze a result.  Each table is for a hypothetical institution.  The institution described in table 8 has a stress-test rating of 4.86 but a CAMELS rating of 2 and a SCOR rating of 1.51.  However, almost 12 percent of the institution’s assets are construction loans, and those loans make up about 81 percent of the difference between this institution and the typical 1- or 2-rated bank.  Other factors contributing to the poor REST rating are nonresidential real estate (18.52 percent of the portfolio, with a weight of 6.38 percent), multifamily housing loans (weight 5.47 percent), and C&I loans (weight 4.71 percent).  This institution does have some strong points, though they are not important enough to change the stress-test rating.  It holds 0.89 percent of its assets in its loan-loss reserves.  These reserves have a weight of –0.64 percent, indicating that although they are a positive factor, they are negligible in comparison with the size of the construction loan portfolio.

Table 9 shows an institution that has a stress-test rating of 4.88.  In contrast to the rating of the bank in table 8, this rating is not driven by construction loans.  In fact, the bank illustrated in table 9 has no construction loans, and the stress test evaluates this as a positive factor (weight –10.27 percent).  However, the institution is concentrated in multifamily housing (weight 74.07 percent).  Secondary factors include a concentration in nonresidential real estate (weight 21.33 percent) and a reliance on noncore liabilities (weight 16.74 percent).  This institution is relatively large (assets of $500 million), and on balance its size is a slight negative factor (weight 7.60 percent).

Table 9 Sample Stress-Test Rating, Hypothetical Bank B

Table 10 presents an overview of the weights for banks that are currently rated CAMELS 1 or 2 but have REST ratings of 4.5 or worse.  The variables are ordered by the median weight.  Construction loans, with a median weight of almost 75 percent, are clearly the most important factor in the model.  Of the 800 institutions with ratings of 4.5 or worse, 777 have weights for construction loans that exceed 5 percent.  In 16 cases (as for the hypothetical bank in table 9), construction loans are a significant positive factor, with weights of –5 percent or below.  The median bank that is identified as extremely vulnerable holds 13.05 percent of its assets as construction loans, compared with 0.50 percent for banks that receive REST ratings between 1.50 and 2.50.

Table 10 Reasons for Ratings of 4.5–5.0

In some cases nonresidential real estate loans, C&I loans, and multifamily housing loans are also significant risk factors.  In addition, large weights are regularly assigned to low levels of liquid assets, high levels of noncore liabilities, and high levels of loans past due 30–89 days.  Moreover, banks with poor ratings tend to be larger and to have grown more rapidly.  Most variables seldom, if ever, have significant positive or negative weights.  Mortgages on 1–4 family homes generally have a positive weight, but it is never significant.

Table 11 shows that although construction loans have the most weight, they are not the only factor driving the ratings.  All institutions holding construction loans exceeding 20 percent of their total assets are identified as extremely vulnerable, but 12 institutions that have no construction loans received REST ratings of 4.5 or worse.

Table 11 REST Ratings and Construction Loans

Table 11 also shows the reason that using the ratio of construction loans to total assets by itself is inadequate.  A bank could have 7 percent of its assets in construction loans and receive almost any REST rating.  If the bank has no other risk factors, it will receive a rating of 1–1.5, but if other risk factors are present, it may receive a rating of 5.  Before assigning ratings, the stress test considers several aspects of a bank’s operations, allowing for both mitigating and exacerbating factors.  A single ratio is only one number and is meaningful only after it has been put in a broader context.33

Trends in Stress-Test Ratings

Figures 1 and 2 show the history of stress-test ratings since December 1986 for the United States as well as some individual states.  Both figures show institutions receiving ratings of 3.5 or worse as a percentage of all institutions with REST ratings.34  Figure 1 shows ratings in the United States and in two states that have already been discussed—Massachusetts and California.  Figure 2 shows ratings in Arizona, Georgia, and Illinois.  Both figures also show a definite trend in stress-test ratings: since 1993, the ratings for the United States and for all five states have become worse.

Figure 1 REST Ratings Worse Than 3.5, 1986–2002 (USA, CA, and MA)

Figure 2 REST Ratings Worse Than 3.5, 1986–2002 (AZ, GA, and IL)

In figure 1 the effects of the real estate crises in Massachusetts and California are clear.  Large percentages of the financial institutions in both states were vulnerable in the late 1980s, and the percentages of vulnerable institutions then declined dramatically.  Figure 1 also shows that institutions in the two states have followed quite different paths in the last decade.  Whereas the REST ratings for California banks and thrifts have again become substantially worse than those for the United States as a whole, ratings for Massachusetts banks have generally become better.

Figure 2 shows that Arizona banks and thrifts have followed a pattern similar to California’s, with very poor ratings in the mid-1980s, a very rapid improvement, and a subsequent deterioration.  Ratings in Georgia, in contrast, have gradually deteriorated up to the present.  Georgia today has a very high percentage of banks and thrifts with poor ratings.  Ratings in Illinois have followed the national pattern quite closely, with some increase before the recession of the early 1990s, a decline during the recession, and a gradual but definite increase in the percentage of poor ratings after 1993.  However, ratings in Illinois have generally been a little better than ratings in the rest of the country.  Both figures illustrate quite clearly that although national trends may be significant, each state has a story of its own.

Conclusion

This article has explained the development of a real estate stress test and the test’s most significant results.  The stress test highlights institutions whose lending practices deserve scrutiny; it therefore spotlights markets that should be inspected for evidence of incipient real estate problems.  REST indicates that a large fraction of banks and thrifts in the West and the Southeast may be vulnerable to problems in the real estate market, mostly because of large concentrations in construction and development lending.  REST does not, however, show that any real estate market is either overbuilt or on the verge of a crisis.  There are, after all, a multitude of ways for institutions to manage and mitigate the risk of construction lending.

This article raises the questions of whether institutions that have exposures to the real estate market have adequately protected themselves and whether the real estate markets in the West and Southeast are inherently healthy.  The history of banking suggests that these questions are vitally important to the FDIC.

References

Collier, Charles, Sean Forbush, Daniel A. Nuxoll, and John O’Keefe.  2003.  The SCOR System of Off-Site Monitoring: Its Objectives, Functioning, and Performance.  FDIC Banking Review 15, no. 3:17–32.

Federal Deposit Insurance Corporation (FDIC).  1997.  History of the Eighties—Lessons for the Future.  Vol. 1, An Examination of the Banking Crises of the 1980s and Early 1990s.  FDIC.

Gilbert, R. Alton, Andrew P. Meyer, and Mark D. Vaughan.  1999.  The Role of Supervisory Screens and Econometric Models in Off-Site Surveillance.  Federal Reserve Bank of St. Louis Review (November–December): 31–56.

Herring, Richard J., and Susan M. Wachter.  1999.  Real Estate Booms and Banking Busts—An International Perspective.  Occasional Paper No. 58.  Group of Thirty.

Mayer, Martin.  1990.  The Greatest-Ever Bank Robbery.  Charles Scribner’s Sons.

FOOTNOTES:

* All the authors are on the staff of the Federal Deposit Insurance Corporation (FDIC).  Charles Collier and Sean Forbush are with the Division of Supervision and Consumer Protection (DSC), Collier as chief of the Information Management Section and Forbush as a senior financial analyst.  Daniel Nuxoll is with the Division of Insurance and Research (DIR) as a senior economist. 

This article reports the results of a close collaboration among numerous people in both the DSC and the DIR.  In addition, the staff of the FDIC’s San Francisco Regional Office encouraged the project and provided the authors with helpful comments. The opinions expressed here are those of the authors and do not necessarily reflect the views of the FDIC.

1See Collier et al. (2003) for a more general discussion of the objectives and methods of the FDIC’s off-site models.

2CAMELS ratings are based on examiners’ assessments of Capital, Asset quality, Management, Earnings, Liquidity, and market Sensitivity.  The ratings range from 1 to 5, with 1 being the best.  Banks and thrifts with a rating of 1 or 2 are considered sound, whereas supervisors have definite concerns about institutions with a rating of 3.  Institutions with a rating of 4 or 5 are considered problem banks.  The Sensitivity rating was added only in 1997, so strictly speaking, ratings before that year are CAMEL ratings.  This article uses “CAMELS” throughout, despite the anachronism.

3Clearly, our project is most directly related to the FDIC’s function as an insurer, not a supervisor.  Consequently, this article discusses all banks and thrifts, whether or not they are supervised by the FDIC.

 It must also be observed that banks are identified by their headquarters.  Consequently, for purposes of this stress test, the Bank of America is located in Charlotte, N.C., although the vast majority of its business is outside the Charlotte metropolitan statistical area and outside the state of North Carolina.  However, the number of megabanks is relatively small, and few of the banks in our project have many operations that are outside a small area.

4A number of popular accounts—for example, see Mayer (1990), chapter 5—report that Edwin Gray, the chairman of the Federal Home Loan Bank Board from 1983 to 1987, became aware of the depth of the S&L crisis while watching a videotape of abandoned projects in the Dallas area.

5See FDIC (1997), chapter 10, for a discussion of this issue.  In contrast, the Texas banking crisis during the late 1980s and early 1990s was caused only partly by commercial real estate.

6Ibid., especially chapter 3.

7See Herring and Wachter (1999).

8We could have used data from years other than 1987 and 1990 to develop the REST model, but for a terminal date, 1990 is the obvious choice.  The problems in New England were not that apparent until 1990, yet in 1991 a significant number of banks failed.  We are especially interested in banks that are so troubled they eventually fail; thus, a later terminal date would ignore some important information.  The start date of 1987 corresponds closely to the peak in the New England economy, but 1986 or 1988 could equally well have been used.  Experiments indicate that the REST results would have been similar for any of those three years.

9Also excluded was a Connecticut bank that at the end of 1988 apparently sold its regular banking operations and continued as a special-purpose institution.

10To adjust the data, we combined the data for separate institutions that later merged.  For example, if two banks merged in January 1988, the 1987 data for the resulting bank would be the combined balance sheets and income statements for the two banks as of December 1987.

11Our discussion of New England does not refer to thrifts because the savings banks were excluded from the sample.  During this period, savings banks filed a slightly different Call Report from the one filed by commercial banks, so some data provided by commercial banks are missing for savings banks.  More importantly, during this period many mutual savings banks converted to stockholder-owned savings banks, and after conversion, these institutions behaved quite differently.  See FDIC (1997).  The development of the stress test assumes that the institutions in the sample had a generally stable strategy, and clearly many of the savings banks in New England did not.

 Our discussion of Southern California does not include thrifts because before 1991, data on thrifts in that region are limited.

12See Collier et al. (2003).  A model could be developed that would forecast CAMELS ratings directly.  However, the deterioration among banks in New England was extremely sudden, and CAMELS ratings change only after an examination (or, occasionally, after an off-site review).  CAMELS ratings at the end of 1990 probably do not reflect the extent of the problems in New England because examiners were overwhelmed and had not changed the ratings at some troubled institutions.  We developed a model to forecast CAMELS ratings directly, and although it identified the same types of institutions as the REST model, in backtests it was found to be slightly less accurate than the REST model.

13Other real estate consists mostly of real estate that banks own because of foreclosures.  Charge-offs are gross, not net, so they cannot be less than zero.

14In fact, all banks had some loans past due 30–89 days, but the OLS estimates differ from Tobit because of a handful of values that are close to zero.  Tobit considers the possibility that these values are greater than zero by chance.

15The number actually used is the logarithm of total assets.

16The statistical software SAS supports a stepwise method for OLS but not for Tobit.  The variables with the Tobit specification were also estimated with stepwise OLS and with a full Tobit model (one that includes all 26 variables).  The variables that were insignificant in both the stepwise OLS and the full Tobit specification were dropped.  The Tobit was reestimated, and the more insignificant variables were dropped.  In the final estimation, all variables were significant at least at the 15 percent level.

17It should be noted that because these equations were estimated with a stepwise procedure, the coefficients and t-statistics cannot be interpreted in the textbook manner.  However, the estimated coefficients and t-statistics are very similar when all the variables are included.

18The stepwise procedure complicates the usual warning about reasoning from correlation to causality.  The coefficient on a correlated variable might well incorporate the effect of an omitted variable.

19The numbers reported for the Tobit are pseudo-R²s.  They are calculated in a manner analogous to the manner in which OLS R²s are calculated, except that with the Tobit numbers the calculation allows for the fact that the variables can never be less than zero.

20The test statistics for the hypothesis that the omitted variables have a zero coefficient are also included.  By way of comparison, the 5 percent significance level for a Chi-squared statistic with 15 degrees of freedom is 25.00, while the comparable F-statistic with 20 and 200 degrees of freedom is 1.62.  However, because the model was fitted with a stepwise procedure, the statistics in the tables are not useful for classical hypothesis testing.  They merely indicate that excluding the variables has very little effect on the fit of the model.

21Our project focuses on the information that could have been known at the time.  Consequently, the REST ratings are computed with the same coefficients that could have been used to produce the December 1988 SCOR ratings.  There is one complication: the coefficients were estimated using revised Call Report data and a complete set of examination ratings.  Neither would have been available if someone had estimated the SCOR model in 1989.

22Three banks are excluded because although they survived until December 1991, they merged before they were examined.  The mergers were not assisted; that is, the banks did not fail.

23The results are not materially different if one includes banks that were rated 3, 4, or 5 as of 1988.

24For a more extended explanation of Type I and Type II errors, see Collier et al. (2003).

25Earlier in the same period Texas had a major crisis, which we did not use for two reasons.  First, large bank-holding companies present a number of difficulties because of the connections between banks in the holding company.  Second, the real estate problems in Texas began after many banks in the state had already gotten into trouble because of loans to the oil and gas industry.  However, tests on the 1986 data from Texas show results similar to those presented in the text for Southern California.  As of December 1986, only 34 banks had a composite CAMELS rating of 1 or 2 and a REST rating of 5.  Of those 34, 13 (38 percent) failed and 13 (38 percent) became problem banks.  Only 1 maintained a 1 or 2 rating until 1993.  In contrast, 338 banks had a REST rating of 2, and only 12 (3 percent) failed, while 43 (13 percent) became problem banks.

26A handful of other backtests have been done and have produced similar results.

27There is a second difference as well: thrifts are included in the December 2002 data.

28The totals in table 5 include banks and thrifts in U.S. territories.

29Unfortunately, some cities with very high percentages of poor REST ratings (for example, Provo, Utah, and Fort Collins, Colo.) are excluded from the table because too few institutions are headquartered in them.

30Some preliminary work also shows that new banks have unusually poor REST ratings.  As a group, banks that are less than three years old have REST ratings comparable to those in the MSAs listed in table 7.

31See appendix 2 in Collier et al. (2003) for an explanation of the method for deriving SCOR weights.  The method used by REST is slightly more complicated because some variables (for example, nonaccruing loans) can never be less than zero.

32SCOR uses the median ratios of the banks that received a rating of 2 within the previous year.

33Gilbert, Meyer, and Vaughan (1999) make this point forcefully.

34REST uses the SCOR model to assign ratings that are comparable to CAMELS ratings.  Using the data on the characteristics of banks assigned CAMELS 5 ratings after actual examinations, SCOR estimates coefficients that describe the characteristics of a 5-rated bank.  In 1998, there were few banks with CAMELS 5 ratings, so for that year the SCOR characterization of a 5-rated bank relies on very little data and is consequently imprecise.  This imprecision affects REST ratings worse than 4 because a rating midway between 4 and 5 draws on the characterizations of both 4-rated and 5-rated banks.  The imprecision in SCOR (and REST) resulted in better ratings for banks with very poor financials.  If one takes a set of very poor financial ratios and assigns a rating based on pre-1997 coefficients or coefficients estimated on data from 1999 or later, the ratings would all be similar.  However, the 1998 coefficients produce better ratings for the weakest financial ratios (that is, those ratios that would have been assigned a rating worse than 4 by coefficients from other periods).  The data for the worst ratings are misleading in 1998 because the coefficients for 1998 are imprecise, and the ratings based on those coefficients do not reflect the innate weakness of the banks in the worst condition.
