the Vulnerability of Banks and Thrifts to a Real Estate Crisis Charles Collier, Sean Forbush, and Daniel A. Nuxoll*
As part of its extensive
off-site monitoring efforts, the Federal Deposit Insurance Corporation (FDIC)
has evaluated banks’ and thrifts’ vulnerability to the stress of a real estate
crisis similar to the crisis that occurred in New England in the early 1990s.1
Asking what would happen to banks and thrifts today if the real estate market
were to experience a downturn similar to the one in New England a decade ago,
we developed the history of the collapse of the New England real estate market
into a stress test, the Real Estate Stress Test (REST), that produces ratings
comparable to the CAMELS ratings.2 The REST ratings indicate the
severity of the exposure to real estate and therefore identify institutions
that appear vulnerable to real estate problems. The ratings direct the
attention of examiners to particular institutions and indicate that the FDIC
should be especially concerned about the management of real estate lending at
these institutions. Poor practices there could expose the FDIC to substantial
losses.
In addition, REST is able
to identify particular areas of the country where a high fraction of the banks
and thrifts are vulnerable—areas where the real estate markets might be of
concern to bank examiners. Although these markets may be healthy at the moment,
the extent of bank lending in them means that the FDIC must pay particular
attention to conditions there.
The results of our research with REST indicate that the
institutions most vulnerable to real estate crises today are headquartered in
the West and a handful of southern cities.3 The real estate markets
in these locations are currently healthy, but because banks—and by extension
the FDIC—have substantial exposure to these markets, bank supervisors need to
be especially alert to any indication of problems there.
We also find that the most critical risk factor is
construction lending, a finding that confirms the conventional wisdom that
construction lending is particularly risky. Many accounts of the savings and
loan crisis of the late 1980s and early 1990s discuss commercial and
residential construction projects that went awry.4
Because the stress test was developed on data from New
England, it may well reflect the distinctive characteristics of events in that
region. However, when REST was backtested on data from Southern California in
the late 1980s and early 1990s, it was successful in identifying institutions
that later had problems. More importantly, REST was also successful in
identifying troubled banks in parts of the country where real estate downturns
were moderate. These successes suggest that even if a repetition of the severe
problems of New England or Southern California is very unlikely, REST can still
help identify banks that might suffer difficulties during less severe real
estate downturns.
The REST model should not be interpreted as a
condemnation of construction lending. The model does, however, emphasize that
is especially important for these loans. The success of a construction loan
depends on the future, not the present, of the real estate market, so
construction lending is intrinsically more risky than forms of lending that are
secured by liens on real property.
The obvious question is why one should focus on New
England. There are three reasons. First, problems among the banks in New
England can be traced directly to the real estate market.5 Second,
the number of banks in the region was large enough that statistical models can
be estimated relatively easily. Third, the New England experience is hardly
unique. As the FDIC (1997) documents, commercial real estate was a factor in
several distinct sets of banking problems during the 1980s and early 1990s.6
In addition, commercial real estate has been a factor in bank crises in a
number of other countries.7 Thus, events in New England constitute
a relatively clear case of a problem that is endemic to banking.
Importantly, REST uses Call Report data, so it cannot
evaluate pricing, terms, or underwriting—factors critical to controlling the
risk of real estate lending. Moreover, REST does not estimate the condition of
the real estate market in any region, state, or metropolitan statistical area
(MSA); it identifies markets where banks are exposed to potential real estate
problems, not markets where such problems actually exist. What the REST model
can do is identify the banks that are at most risk in the event real estate
problems should occur. In so doing, it sharpens the focus of questions about
risk control and real estate markets and therefore makes an important
contribution to the FDIC’s off-site monitoring.
This article explains how the model was built with the
use of New England data and was tested with the use of data from other
historical real estate crises. The REST results for December 2002 are
presented and analyzed, and recent trends—both nationally and for selected
of Examining New England
The central question for the team that built the REST
model was whether any model could detect those healthy banks that would be in
most danger during periods when real estate became a problem. To answer this
question, we examined the New England real estate crisis of the early 1990s.
In 1987, the economy and the banking industry in New England could have been
described as vibrant, but by 1990 the problems were obvious.8 The
first stage in developing the REST model involved comparing the banks in New
England in 1987 with the banks there in 1990. All the banks were healthy in 1987,
but by 1990 a substantial fraction of them were troubled. Our analysis used
statistical procedures and data from 1987 to find the traits common to the
institutions that later had severe difficulties. This approach seeks to answer
the question whether as early as 1987 one could have identified the riskiest
banks in New England.
Because the purpose of our project was to evaluate
banks’ ability to withstand a crisis such as the one in New England in
1991–1993, banks that had a special function or were somehow atypical were
eliminated from the analysis. Banks considered atypical were those that had
equity-to-asset ratios greater than 30 percent or loan-to-asset ratios less
than 25 percent. A total of 13 special-purpose or atypical (or new) banks were
eliminated from the December 1987 sample.9
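The screen described above can be sketched in a few lines of Python. This is only an illustration: the `Bank` record, its field names, and the three sample banks are hypothetical, and only the two thresholds (30 percent equity-to-assets, 25 percent loans-to-assets) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    # Hypothetical record; amounts are in any common unit (e.g., $ millions).
    name: str
    total_assets: float
    equity: float
    loans: float


def is_atypical(bank, max_equity_ratio=0.30, min_loan_ratio=0.25):
    """Return True if the bank fails either screen described in the text."""
    equity_ratio = bank.equity / bank.total_assets
    loan_ratio = bank.loans / bank.total_assets
    return equity_ratio > max_equity_ratio or loan_ratio < min_loan_ratio


# Made-up banks used only to exercise the screen.
banks = [
    Bank("Typical Savings", 100.0, 8.0, 65.0),
    Bank("Trust Company", 100.0, 45.0, 10.0),      # equity ratio above 30 percent
    Bank("Securities Holder", 100.0, 9.0, 20.0),   # loan ratio below 25 percent
]

sample = [b for b in banks if not is_atypical(b)]
print([b.name for b in sample])
```

A bank is dropped if it fails either test, which matches the "or" in the description of atypical banks.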
In addition, consolidation just before the crisis had to
be taken into account. In December 1987, 289 New England banks filed Call
Reports, but in December 1990 the number had shrunk to 255. Much of the
consolidation appears to have been achieved by mergers of different banks owned
by the same holding company. Regardless of the reason for the consolidation,
the performance of the bank resulting from a merger was undoubtedly affected by
the characteristics of the banks absorbed in the merger. Consequently, this
project used data adjusted for mergers.10
Finally, because growth
rates between 1985 and 1987 were included in the model, only banks that had
been in existence for five years (1985–1990) were part of the sample.
In the first
stage—comparing the conditions and balance sheets of banks at the end of 1987
with the same banks’ conditions and balance sheets at the end of 1990—the model
considers 12 variables as measures of health at the end of 1990 (previous work
has shown that these variables are closely related to CAMELS ratings, and the
FDIC has developed a Statistical CAMELS Off-site Rating [SCOR] model using
them).12 The 1987 data include the same 12 variables as well as 12
variables that measure (as a fraction of assets) the types of loans made by the
bank, a variable that measures the bank’s growth rate between 1985 and 1987,
and a variable that measures the bank’s size in 1987.
The basic results appear in tables 1 and 2. The reason
the two tables differ is that only 3 of the 12 SCOR variables (equity,
provisions for loan losses, and net income) can be less than zero, while
another of the variables (loans and long-term securities) can be zero in
principle but in fact was substantially greater than zero for the whole
sample. These 4 variables were handled by the usual regression
technique, ordinary least squares (OLS). Table 1 reports the results for these
variables.
Table 1 The REST Model: Coefficients Estimated with OLS
The other 8 SCOR variables cannot, in principle, be less
than zero. These variables—loan-loss reserves, loans past due 30–89 days, past
due 90+ days, nonaccruals, other real estate, charge-offs, volatile
liabilities, and liquid assets—were fit with a Tobit model.13 Table
2 reports the results for these variables. For a number of the 8 variables,
the results do not differ appreciably from OLS because, as reported in table 2,
there were very few zero values.14
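For reference, the textbook form of a Tobit model with censoring at zero is shown below. Whether the REST estimation uses exactly this specification is an assumption; the symbols (the latent variable $y_i^{*}$, the 1987 regressors $x_i$, and the normal error $\varepsilon_i$) are standard notation rather than notation taken from the article.

```latex
\[
\begin{aligned}
y_i^{*} &= x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^{2}),\\
y_i &= \max\bigl(0,\; y_i^{*}\bigr).
\end{aligned}
\]
```

Here $y_i$ is the observed 1990 ratio for bank $i$. When almost no observations sit at the zero bound, the censored part of the likelihood contributes little, which is why the Tobit and OLS estimates are close for several of the 8 variables.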
As mentioned above, the independent variables for the
REST model include all the 1987 values for SCOR variables, 12 categories of
loans as a fraction of assets, asset growth, and bank size.
Table 2 The REST Model: Coefficients Estimated with Tobit
The SCOR variables represent the condition of the bank
in 1987. In fact, the condition of the bank results partly from the
characteristics of the bank, so these 12 variables are proxies for the
characteristics of the bank. For example, one cannot directly observe the
quality of a bank’s underwriting, but presumably tighter underwriting results
in fewer past-due loans—so the data on loans past due 30–89 days can be seen as
a proxy for underwriting standards.
The loan-type variables are important because the New
England crisis was a real estate crisis. Our project included data on several
types of real estate loans (1–4 family residential, multifamily housing, agricultural,
construction and development, and other nonresidential) as well as other loans
(unsecured commercial, to municipalities, to depository institutions, credit
card, other consumer, agricultural production). Presumably banks that held
large amounts of real estate loans would be the ones most severely affected by
the crisis.
Asset growth between 1985 and 1987 was included because
rapidly growing banks are considered
especially risky. Total assets were included in the model because it is
usually thought that larger banks can more easily diversify risk away.15
All estimations were done with a stepwise procedure.16
This method starts with all 26 variables (12 SCOR variables, 12 loan-type
variables, asset growth, and size) and eliminates those that are not
statistically significant. The stepwise method was necessary because some
variables have coefficients that are very large but statistically
insignificant. Although inclusion of these variables improves the in-sample
fit of the model, it does so only very slightly. If the coefficients are
large, however, inclusion of these variables in out-of-sample forecasting would
almost certainly have an effect on the forecasts despite the complete absence
of statistical evidence that these variables matter at all. Their elimination
made very little difference to the fit of the model.
As noted above, all the results were estimated with a
stepwise procedure, and this procedure did not result in a significantly worse
fit than if all the variables had been used. In general, the two sets of
estimates are completely consistent with each other. In fact, most of the
coefficients estimated with a stepwise procedure are very similar to those
estimated when all the variables are used.17
Although alternative methods produced similar estimates,
one should be cautious about interpreting these results. For example, one
cannot conclude that the ratio of loans and long-term securities to assets did
not affect asset quality, although that variable has a zero coefficient in all
the asset-quality equations (loans past due 30–89 days, loans past due 90+
days, nonaccrual loans, other real estate, charge-offs, and provisions for loan loss). The effect
might be small or inconsistent.
Statistical tests reveal correlation, not causation.18 When the
correlation is strong and consistent with theory, however, there is good reason
to take statistical results seriously. With that in mind, one should note
several features of the results.
First, this approach captures much of the variation
between banks. Near the bottom of both table 1 and table 2 there is a line
reporting that R² is between 0.30 and 0.60 for most of the results.19
This means that 1987 data can account for about 30–60 percent of the
differences between banks in 1990. The major exception to this result is that
the variable “loans past due 90+ days” has an R² of only 0.1306.20
Second, most variables are mean-reverting. That is, the
banks that were exceptional in 1987 tended to resemble the average (mean) bank
more closely by 1990. The coefficients on the lagged variables show this
effect. For example, consider the effect of lagged equity on equity. From
table 1, the estimated coefficient is 0.535. This means that a bank with an
extra 1 percentage point of equity (relative to assets) in 1987 had, on
average, an extra 0.535 percentage point of equity in 1990.
Importantly, the coefficient is between 0 and 1, indicating that banks with
unusually high levels of equity in 1987 still had unusually high levels of
equity in 1990, but other things being equal, differences in equity levels
shrank during those three years. An inspection of tables 1 and 2 shows that
the only variables without a strong mean-reverting component are provisions,
reserves, nonaccrual loans, and other real estate.
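The arithmetic behind mean reversion can be made concrete with the lagged-equity coefficient of 0.535 from table 1. The sketch below is illustrative only: the function name and the 2-percentage-point starting gap are hypothetical, and it assumes two banks that are identical in every other 1987 variable.

```python
LAGGED_EQUITY_COEF = 0.535  # table 1: effect of 1987 equity on 1990 equity


def expected_gap_1990(gap_1987, coef=LAGGED_EQUITY_COEF):
    """Expected 1990 equity gap (percentage points of assets) between two
    otherwise identical banks, given their equity gap in 1987."""
    return coef * gap_1987


# Hypothetical example: one bank holds 2 percentage points more equity in 1987.
gap_1987 = 2.0
gap_1990 = expected_gap_1990(gap_1987)
print(f"1987 gap: {gap_1987:.2f}, expected 1990 gap: {gap_1990:.2f}")
```

Because the coefficient lies between 0 and 1, the expected gap stays positive but shrinks, which is what the text means by mean reversion.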
This observation suggests that most of the SCOR
variables reflect something fundamental about the operations of a bank. Banks
with higher than average levels of loans past due 30–89 days tend to have
higher than average levels even three years later. Conceivably, high levels of
past-due loans may reflect a less cautious underwriting philosophy.
This interpretation of the SCOR variables is supported
by the data. For example, high levels of loans past due 30–89 days might be
considered a sign that the bank is more willing to take risks. In fact, high
levels of loans past due 30–89 days in 1987 are associated with lower net
income and more nonaccrual loans, more other real estate, more charge-offs, and
more provisions in 1990.
The third feature of our New England REST results is
that most of the loan-type variables have the expected coefficients. High
levels of commercial real estate loans in 1987 were associated with poor
performance in 1990. It should be noted that construction and development
loans in particular were problems for New England banks. Although other types
of commercial real estate (nonresidential real estate and multifamily housing)
were associated with problems, construction loans were the major problem: they
were significant in almost every regression, and they generally had a larger
effect than other types of commercial real estate loans. High levels of
commercial and industrial (C&I) loans and other consumer loans also seem to
have been a risk factor. Credit card loans were not a special problem, and
loans to municipalities helped shield banks from the downturn.
Fourth, high asset growth between 1985 and 1987 was also associated with
poor performance by 1990. The signs on log assets are consistent
with the theory that larger institutions were more diversified and more
aggressive in facing their problems in 1990. Large institutions had fewer
past-due loans; on the other hand, they had more nonaccrual loans, reserves,
charge-offs, and provisions. They also had lower net income, but that result
seems to be driven completely by the higher provisions.
There are some other interesting features of the
results. Banks with high net income in 1987 tended to have higher equity in
1990. Banks with high levels of reserves in 1987 performed better in 1990.
This last finding is consistent with the interpretation that more-conservative
banks tend to recognize losses more quickly and reserve against them. This
interpretation, in turn, is consistent with the observation that charge-offs in
1987 are negatively correlated with loans past due 30–89 days in 1990. Banks
that relied on noncore liabilities also tended to have more difficulties in
1990 (lower income, more past dues 30–89 days, and more other real estate).
This result is consistent with the notion that banks that use noncore liabilities
may be more aggressive and take more risks.
And there are some anomalies. High levels of other real
estate in 1987 are correlated with low levels of loans past due 90+ days and
nonaccrual loans in 1990. This might reflect differences in workout policies.
Although these results are intrinsically interesting as
an analysis of past events in New England, the goal of our project was to
develop a forecasting tool that could identify banks most likely to have
difficulties during future real estate downturns. To test whether REST had
forecasting power, we applied it to other real estate crises. Because these
tests involved banks that were not in the sample used to build the model, they
are called “out-of-sample” tests.
Southern California experienced a real estate crisis at
about the same time as New England. To test the validity of the New England
results, we forecasted 1991 SCOR ratios on the basis of 1988 data for Southern
California banks. The banks included all California banks overseen by the
FDIC’s Los Angeles East, Los Angeles West, and Orange County field offices.
Again, all institutions with loan-to-asset ratios less than 25 percent or
equity-to-asset ratios greater than 30 percent were excluded. The sample
contained 242 banks, 173 of which had a composite CAMELS rating of 1 or 2 as of
year-end 1988.
The banks in California differed from those in New
England in a number of ways. First, California banks in 1988 were generally in
worse shape than New England banks in 1987. None of the New England banks in
the sample had a CAMELS rating worse than 3 at year-end 1987, but 20 (8.3
percent) of the Southern California banks were rated 4 at year-end 1988.
Second, the shock to the New England economy, and therefore to the region’s
banks, was both shorter and more severe. Of the 203 New England banks, 33 (16.3
percent) failed, a percentage slightly higher than the percentage in California
(33 of 240, or 13.8 percent). Moreover, in New England the bulk of the
failures (29) were concentrated in a two-year period (1991 and 1992), whereas
in Southern California the failures were spread out over three years
(1992–1994). Third, structural differences between the two regions’ banking
industries were significant: California had permitted statewide branching for
decades, whereas the banking industry in New England was more segmented.
The stress test did not do particularly well at
forecasting individual ratios. For example, the model was not able to identify
those banks that experienced large increases in nonaccrual loans. This is not
too surprising because management’s decisions about handling problems determine
how the problems affect the bank’s balance sheet and income statement, and if
bank management delays dealing with real estate problems, the bank will tend to
have higher other real estate owned or nonaccrual loans. If management deals
with the problems aggressively, those same problems may affect the bank’s
provisions, charge-offs, income, and capital. Even a perfect model cannot
forecast how management will deal with problems.
However, bank supervisors do not evaluate banks in terms
of individual ratios but in terms of the overall condition of the bank.
Consequently, the major issue is whether the stress test can forecast bank
condition. The SCOR model can be used to translate the 12 SCOR ratios into a
forecasted CAMELS rating.21 These ratings are the REST ratings.
Table 3 compares 1988 REST ratings with CAMELS ratings
and with failures between 1992 and 1995. All the banks used for compiling
table 3 were 1 or 2 rated as of December 1988, and all banks survived until at
least December 1991. If the bank failed between 1992 and 1995, the bank is
identified as a failure. Otherwise, the bank’s reported CAMELS rating is the
worst rating it received between 1992 and 1995.22
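The outcome rule used for table 3 (failure if the bank failed in 1992–1995, otherwise the worst CAMELS rating received in that window) can be written as a small function. The sketch below is one way such a rule might be coded, not the FDIC's implementation; the function name, the dictionary layout, and the sample histories are hypothetical.

```python
def crisis_outcome(ratings_by_year, failure_year=None, start=1992, end=1995):
    """Classify a bank for the backtest window [start, end].

    ratings_by_year maps an examination year to a CAMELS composite (1-5).
    Returns "failed", the worst (numerically highest) rating in the window,
    or None if the bank has no rating in the window.
    """
    if failure_year is not None and start <= failure_year <= end:
        return "failed"
    in_window = [r for year, r in ratings_by_year.items() if start <= year <= end]
    if not in_window:
        return None
    return max(in_window)  # CAMELS 5 is the worst rating


# Hypothetical examination histories.
print(crisis_outcome({1992: 2, 1993: 4, 1995: 3}))
print(crisis_outcome({1992: 3}, failure_year=1994))
```

Taking the maximum over the window sidesteps the timing problem: a bank downgraded in 1993 and one downgraded in 1995 are treated alike.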
Table 3 Performance of Stress Test in Southern California
Several considerations underlie this approach. First,
the ultimate concern of supervision is troubled banks; hence, one should
concentrate on the worst ratings. Second, banks that are rated 3 or worse have
already been identified as potential problems, and the critical question is
which banks currently regarded as sound are likely to develop problems.23
Third, as noted above, events in Southern California evolved over a number of
years. Problems at a bank that were obvious at the end of 1993 might not have
been evident at the end of 1991. Using the worst rating during the crisis
years 1992–1995 avoids the issue of timing. This method considers the banks
that encountered difficulties, regardless of when the problems actually
occurred.
Forecasting models have two types of error: they fail to
identify the banks that are downgraded (Type I error), and they identify banks
that are not downgraded (Type II error).24 This article analyzes
the number of banks that the model correctly identified, so it refers to Type I
accuracy and Type II accuracy. The emphasis is on problem banks (banks with a
CAMELS rating of 4 or 5) and failures. Failures cost the FDIC money to resolve
the bank, and problem banks are in danger of failing and take considerably more
Panel A of table 3 shows the raw numbers, while panel B
reports Type I accuracy and panel C reports Type II accuracy. Ideally all
banks that failed would have REST ratings worse than 4.5 (100 percent Type I
accuracy), and the “failed” line in panel B would have a 100 percent in the
last column. In addition, ideally all the banks with REST ratings of 4.5 or
worse would eventually have a CAMELS rating of 5 or would fail (100 percent
Type II accuracy). In that case, the column for REST ratings greater than 4.5
in panel C would have numbers that sum to 100 percent in the lines for CAMELS
ratings of 4 or 5 or for failures.
Table 3 shows that the model is not perfect but that it
does correctly identify a large percentage of problem banks and failures.
Consider Type I accuracy first. Panel A indicates that 15 banks failed; 4 of
the 15 had REST ratings worse than 4.5, while 8 of the 15 had REST ratings
between 3.5 and 4.5. Panel B shows this is 27 percent and 53 percent of the
failures, respectively. Thus, if banks with REST ratings of 3.5 or worse are
targeted, the Type I accuracy for failures is 80 percent. A similar analysis
shows that for problem banks, REST has a Type I accuracy of 68 percent.
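The Type I accuracy figures for failures can be reproduced from the counts quoted above (15 failures, of which 4 had REST ratings worse than 4.5 and 8 had ratings between 3.5 and 4.5). The variable and function names in this sketch are illustrative; only the counts come from the text.

```python
failed_total = 15
# Failed banks by REST rating band, as quoted from panel A of table 3.
failed_by_rest = {"worse than 4.5": 4, "3.5 to 4.5": 8}


def share(count, total):
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total


shares = {band: share(n, failed_total) for band, n in failed_by_rest.items()}
type_i_accuracy = share(sum(failed_by_rest.values()), failed_total)

for band, pct in shares.items():
    print(f"REST {band}: {pct:.0f} percent of failures")
print(f"Type I accuracy at REST 3.5 or worse: {type_i_accuracy:.0f} percent")
```

The remaining 3 of the 15 failures had REST ratings better than 3.5, which is the Type I error at this cutoff.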
The analysis of Type II accuracy also shows that REST is
quite accurate. Panel C indicates that among the banks with REST ratings of
4.5–5, 62 percent became problem banks and 12 percent failed; put differently,
74 percent either failed or were in danger of failing. For REST ratings of
3.5–4.5, 28 percent were problem banks, and 13 percent failed. In other words,
just over 40 percent had severe difficulties.
Several points about this backtesting should be
emphasized. All the banks were rated 1 or 2 at the time of the December 1988
Call Report, and the examination ratings were given three to seven years after
the Call Report. In short, REST did a reasonably good job of identifying which
sound banks were most likely to encounter difficulties three to seven years
later.
However, the example of
Southern California was chosen precisely because real estate problems were
severe there. This is a critical piece of information. The stress test
identifies banks that could become problems if there were a real estate
crisis. REST does not identify real estate markets that are susceptible to
crisis. The backtest was successful because the Southern California market did
in fact have a crisis; REST did not identify that market as one vulnerable to
crisis. In the jargon of forecasting, the stress test provides conditional,
not unconditional, forecasts.25
Both New England and
Southern California suffered from extremely bad real estate problems. REST has
also been backtested on episodes of less severe real estate problems. For
example, table 4 reports the results (based on December 1987 data and
examination ratings from the period 1991–1994) for banks headquartered in the
The problems in Atlanta were clearly less severe than
those in New England or Southern California. Only one bank failed, and no
banks received a CAMELS 5 rating. Nonetheless, institutions identified by the
stress test were more likely to have severe difficulties. Only 2 of the 17
institutions (12 percent) with REST ratings better than 3.5 received a CAMELS 4
rating, but 11 of the 30 institutions (37 percent) with REST ratings worse than
3.5 later became problem banks or failed. Again, all these banks were CAMELS
rated 1 or 2 at year-end 1987.26
Based on December 2002 Data
The stress test has been run at the FDIC since 1999, and
the ratings are distributed every quarter to FDIC examiners and analysts as
well as to the other banking regulatory agencies. Tables 5, 6, and 7 summarize
a recent set of ratings—those based on the December 31, 2002, Call Report
data. In contrast to the backtests, these tables report on all institutions
regardless of CAMELS rating.27 However, institutions with
equity-to-asset ratios exceeding 30 percent or loan-to-asset ratios less than
25 percent are omitted. Table 5 reports the results by FDIC region, table 6 by
state (omitting U.S. territories), and table 7 by selected MSAs.
Table 5 shows that the banks in the San Francisco and
Atlanta regions are unusually vulnerable to real estate problems. Of the 699
institutions in the San Francisco region, 162 (23.2 percent) had ratings of
3.5–4.5, and 250 (35.8 percent) fell into the worst category, with ratings of
4.5–5.0. This last number is the most significant, since these are the
institutions that the model identifies as especially vulnerable to real estate
problems. In the Atlanta region, 244 (21.2 percent) had ratings between 3.5
and 4.5, and 332 (28.8 percent) had ratings worse than 4.5. In the rest of the
nation, only 12.7 percent were rated between 3.5 and 4.5, and only 9.5 percent
were rated worse than 4.5. In short, the model indicates that institutions in
the West and Southeast are approximately three times more likely to be
vulnerable to a real estate crisis than institutions in other parts of the
country.
Table 5 also indicates some regions of secondary
concern, notably the Dallas and Memphis regions.
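The regional comparison drawn from table 5 rests on simple ratios, which can be checked as follows. The counts and percentages are those quoted in the text for the San Francisco region, the Atlanta region, and the rest of the nation; the variable names are illustrative.

```python
def pct(count, total):
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total


sf_total = 699
sf_moderate = pct(162, sf_total)   # REST ratings of 3.5-4.5
sf_worst = pct(250, sf_total)      # REST ratings of 4.5-5.0

atlanta_worst = 28.8               # percent, as reported in the text
rest_of_nation_worst = 9.5         # percent, as reported in the text

sf_ratio = sf_worst / rest_of_nation_worst
atlanta_ratio = atlanta_worst / rest_of_nation_worst

print(f"San Francisco: {sf_moderate:.1f} and {sf_worst:.1f} percent")
print(f"Worst-category ratio, San Francisco vs. rest: {sf_ratio:.1f}")
print(f"Worst-category ratio, Atlanta vs. rest: {atlanta_ratio:.1f}")
```

Both ratios are at least 3, consistent with the statement that institutions in the West and Southeast are roughly three times as likely to fall in the worst category.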
In table 6 (the results reported by state) the states
are ranked by the percentage of institutions with stress-test ratings worse
than 4.5.28 This table clearly indicates that the vulnerable
institutions are concentrated geographically, with 6 of the top 10 states being
in the San Francisco region. In addition, there are only 11 states in which 30
percent or more of the institutions are extremely vulnerable, and only 4 more
in which the percentage is between 20 percent and 30 percent.
Table 7 presents the data by MSA, though it includes
only MSAs where at least 10 banks or thrifts are headquartered.29
Again, the MSAs are ranked by the percentage of institutions with stress-test
ratings worse than 4.5. Only the top 20 MSAs are reported in the table, and
the table confirms that these MSAs are unusual. On average, REST assigns
almost 60 percent of the banks and thrifts in these MSAs a rating of 4.5 or
worse, whereas for all other MSAs the comparable number is approximately 20
percent. Clearly, the FDIC should be especially concerned about the health of
real estate markets in these MSAs.30
The results of the stress test can be analyzed much as those of the SCOR model are. With the
SCOR model, one can attribute the reasons for a forecasted CAMELS downgrade to
specific variables by comparing the bank’s ratios with the median ratios of all
banks currently rated 2. The same technique can be used with REST.31
For purposes of examining
the REST ratings, we defined the benchmark as the median ratios of all
institutions currently rated 1 or 2. This standard of comparison cannot be
identified with any existing institution; it is a composite—the “average”
institution with a 1 or 2 rating.32 This benchmark is used to
calculate “weights” that trace the reason for poor ratings back to specific
ratios. The weights are in terms of percentages so they necessarily sum to 100
percent. The percentages can be negative if the ratio is better than the
standard. Importantly, the weights are not used in the estimation; they are
merely a method of comparing an institution that has received a poor rating
with an average institution.
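The article does not spell out the formula for the weights. One simple possibility, shown here purely as an assumption, is to treat the rating as driven by a linear index and to express each variable's contribution (coefficient times the gap between the bank and the benchmark) as a share of the total. All coefficients and ratios below are hypothetical and are not taken from tables 1, 2, 8, or 9.

```python
def contribution_weights(coefs, bank_ratios, benchmark_ratios):
    """Percentage share of each variable in the gap between a bank and the
    benchmark, assuming a linear index (an assumption, not the FDIC method).

    A positive coefficient means a higher ratio worsens the rating.
    """
    contributions = {
        name: coefs[name] * (bank_ratios[name] - benchmark_ratios[name])
        for name in coefs
    }
    total = sum(contributions.values())
    if total == 0:
        raise ValueError("bank matches the benchmark; weights are undefined")
    return {name: 100.0 * c / total for name, c in contributions.items()}


# Hypothetical coefficients and ratios (percent of assets).
coefs = {"construction_loans": 0.20, "multifamily_loans": 0.10,
         "loan_loss_reserves": -0.50}
bank = {"construction_loans": 12.0, "multifamily_loans": 3.0,
        "loan_loss_reserves": 1.2}
benchmark = {"construction_loans": 2.0, "multifamily_loans": 1.0,
             "loan_loss_reserves": 1.0}

weights = contribution_weights(coefs, bank, benchmark)
for name, w in weights.items():
    print(f"{name}: {w:.2f} percent")
```

Under this construction the weights sum to 100 percent by design, and a variable on which the bank is better than the benchmark (here, higher reserves) receives a negative weight, matching the two properties described in the text.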
Table 8 Sample Stress-Test Rating, Hypothetical Bank A
Tables 8 and 9 illustrate
how the weighting procedure can be used to analyze a result. Each table is for
a hypothetical institution. The institution described in table 8 has a
stress-test rating of 4.86 but a CAMELS rating of 2 and a SCOR rating of 1.51.
However, almost 12 percent of the institution’s assets are construction loans,
and those loans make up about 81 percent of the difference between this
institution and the typical 1- or 2-rated bank. Other factors contributing to
the poor REST rating are nonresidential real estate (18.52 percent of the
portfolio, with a weight of 6.38 percent), multifamily housing loans (weight
5.47 percent), and C&I loans (weight 4.71 percent). This institution does
have some strong points, though they are not important enough to change the
stress-test rating. It holds 0.89 percent of its assets in its loan-loss
reserves. These reserves have a weight of –0.64 percent, indicating that
although they are a positive factor, they are negligible in comparison with the
size of the construction loan portfolio.
Table 9 shows an
institution that has a stress-test rating of 4.88. In contrast to the rating
of the bank in table 8, this rating is not driven by construction loans. In
fact, the bank illustrated in table 9 has no construction loans, and the stress
test evaluates this as a positive factor (weight –10.27 percent). However, the
institution is concentrated in multifamily housing (weight 74.07 percent).
Secondary factors include a concentration in nonresidential real estate (weight
21.33 percent) and a reliance on noncore liabilities (weight 16.74 percent).
This institution is relatively large (assets of $500 million), and on balance
its size is a slight negative factor (weight 7.60 percent).
Table 9 Sample Stress-Test Rating, Hypothetical Bank B
Table 10 presents an
overview of the weights for banks that are currently rated CAMELS 1 or 2 but
have REST ratings of 4.5 or worse. The variables are ordered by the median
weight. Construction loans, with a median weight of almost 75 percent, are
clearly the most important factor in the model. Of the 800 institutions with
ratings of 4.5 or worse, 777 have weights for construction loans that exceed 5
percent. In 16 cases (as for the hypothetical bank in table 9), construction
loans are a significant positive factor and have weights below –5
percent. The median bank that is identified as extremely vulnerable holds
13.05 percent of its assets as construction loans, compared with 0.50 percent
for the banks that receive REST ratings of between 1.50 and 2.50.
In some cases
nonresidential real estate loans, C&I loans, and multifamily housing loans
are also significant risk factors. In addition, large weights are regularly
assigned to low levels of liquid assets, high levels of noncore liabilities,
and high levels of loans past due 30–89 days. Moreover, banks with poor
ratings tend to be larger and to have grown more rapidly. Most variables
seldom, if ever, have significant positive or negative weights. Mortgages on
1–4 family homes generally have a positive weight, but it is never significant.
Table 11 shows that
although construction loans have the most weight, they are not the only factor
driving the ratings. All institutions holding construction loans exceeding 20
percent of their total assets are identified as extremely vulnerable, but 12
institutions that have no construction loans received REST ratings of 4.5 or
worse.
Table 11 also shows the
reason that using the ratio of construction loans to total assets by itself is
inadequate. A bank could have 7 percent of its assets in construction loans
and receive almost any REST rating. If the bank has no other risk factors, it
will receive a rating of 1–1.5, but if other risk factors are present, it may
receive a rating of 5. Before assigning ratings, the stress test considers
several aspects of a bank’s operations, allowing for both mitigating and
exacerbating factors. A single ratio is only one number and is meaningful only
after it has been put in a broader context.33
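A cross-tabulation of the kind described for table 11 can be sketched as follows. The bank records, bin edges, and labels are all hypothetical; they are chosen only to mimic the pattern described above, in which banks with about 7 percent of assets in construction loans fall on both sides of the 4.5 cut-off.

```python
from bisect import bisect_right
from collections import Counter

# HYPOTHETICAL (construction loans as % of assets, REST rating) pairs.
banks = [
    (0.0, 1.2), (0.0, 4.6),
    (7.0, 1.4), (7.2, 3.1), (6.8, 5.0),
    (22.0, 4.8), (25.5, 5.0),
]

# Illustrative bin edges for the construction-loan ratio (not the article's).
share_edges = [5.0, 10.0, 20.0]
share_labels = ["<5%", "5-10%", "10-20%", "20%+"]


def share_bin(share_pct):
    """Return the label of the bin that contains share_pct."""
    return share_labels[bisect_right(share_edges, share_pct)]


def rating_bin(rating, cutoff=4.5):
    """Split REST ratings at the 'extremely vulnerable' cut-off."""
    return "4.5 or worse" if rating >= cutoff else "better than 4.5"


table = Counter((share_bin(s), rating_bin(r)) for s, r in banks)
for share_label in share_labels:
    row = [table[(share_label, col)] for col in ("better than 4.5", "4.5 or worse")]
    print(f"{share_label:7s} better: {row[0]}  4.5 or worse: {row[1]}")
```

In this toy sample the 5-10% row has banks in both columns, which is the reason a single ratio cannot stand in for the full rating.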
Trends in Stress-Test Ratings
Figures 1 and 2 show the history of stress-test ratings since December 1986 for the
United States as well as some individual states. Both figures show the
institutions receiving ratings of 3.5 or worse as a percentage of
all institutions with REST ratings.34 Figure 1 shows ratings in the
United States and in two states that have already been discussed—Massachusetts
and California. Figure 2 shows ratings in Arizona, Georgia, and Illinois.
Both figures also show a definite trend in stress-test ratings: since 1993, the
ratings for the United States and for all five states have become worse.
Figure 1 REST Ratings Worse Than 3.5, 1986-2002 (USA, CA, and MA)
Figure 2 REST Ratings Worse Than 3.5, 1986-2002 (AZ, GA, and IL)
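The series plotted in both figures is a simple share: institutions rated 3.5 or worse divided by all institutions with a REST rating, computed separately for each area and date. A minimal sketch of that calculation follows; the records are hypothetical and the function name is an assumption made for illustration.

```python
from collections import defaultdict


def share_rated_poor(records, cutoff=3.5):
    """Return {(area, period): percent of institutions rated cutoff or worse}.

    records: iterable of (area, period, rest_rating) tuples.
    A higher REST rating is a worse rating, so "3.5 or worse" means >= 3.5.
    """
    totals = defaultdict(int)
    poor = defaultdict(int)
    for area, period, rating in records:
        key = (area, period)
        totals[key] += 1
        if rating >= cutoff:
            poor[key] += 1
    return {key: 100.0 * poor[key] / totals[key] for key in totals}


# HYPOTHETICAL ratings, for illustration only.
records = [
    ("CA", "2002-12", 4.0), ("CA", "2002-12", 2.0),
    ("CA", "2002-12", 3.5), ("CA", "2002-12", 1.5),
    ("MA", "2002-12", 2.0), ("MA", "2002-12", 3.6),
    ("MA", "2002-12", 1.0), ("MA", "2002-12", 2.5),
]

shares = share_rated_poor(records)
for (area, period), pct in sorted(shares.items()):
    print(f"{area} {period}: {pct:.1f}% rated 3.5 or worse")
```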
In figure 1 the effects of the real estate crises in
Massachusetts and California are clear. Large percentages of the financial
institutions in both states were vulnerable in the late 1980s, and the
percentages of vulnerable institutions then declined dramatically. Figure 1
also shows that institutions in the two states have followed quite different
paths in the last decade. Whereas the REST ratings for California banks and
thrifts have again become substantially worse than those for the United States
as a whole, ratings for Massachusetts banks have generally become better.
Figure 2 shows that Arizona banks and thrifts have
followed a pattern similar to California’s, with very poor ratings in the
mid-1980s, a very rapid improvement, and a subsequent deterioration. Ratings
in Georgia, in contrast, have gradually deteriorated up to the present.
Georgia today has a very high percentage of banks and thrifts with poor
ratings. Ratings in Illinois have followed the national pattern quite closely,
with some increase before the recession of the early 1990s, a decline during
the recession, and a gradual but definite increase in the percentage of poor
ratings after 1993. However, ratings in Illinois have generally been a little
better than ratings in the rest of the country. Both figures illustrate quite
clearly that although national trends may be significant, each state has a
story of its own.
This article has explained the development of a real
estate stress test and the test’s most significant results. The stress test
highlights institutions whose lending practices deserve scrutiny; it therefore
spotlights markets that should be inspected for evidence of incipient real
estate problems. REST indicates that a large fraction of banks and thrifts in
the West and the Southeast may be vulnerable to problems in the real estate
market, mostly because of large concentrations in construction and development
lending. REST does not, however, show that any real estate market is either
overbuilt or on the verge of a crisis. There are, after all, a multitude of
ways for institutions to manage and mitigate the risk of construction lending.
This article raises the questions of whether
institutions that have exposures to the real estate market have adequately
protected themselves and whether the real estate markets in the West and
Southeast are inherently healthy. The history of banking suggests that these
questions are vitally important to the FDIC.
Collier, Charles, Sean Forbush, Daniel A. Nuxoll, and John O’Keefe. 2003. The SCOR
System of Off-Site Monitoring: Its Objectives, Functioning, and Performance. FDIC
Banking Review 15, no. 3:17–32.
Federal Deposit Insurance Corporation (FDIC). 1997. History of the
Eighties—Lessons for the Future. Vol. 1, An Examination of the Banking
Crises of the 1980s and Early 1990s. FDIC.
Gilbert, R. Alton, Andrew P. Meyer, and Mark D. Vaughan. 1999. The Role of Supervisory
Screens and Econometric Models in Off-Site Surveillance. Federal Reserve Bank
of St. Louis Review (November–December): 31–56.
Herring, Richard J., and Susan M. Wachter. 1999. Real Estate Booms and Banking
Busts—An International Perspective. Occasional Paper No. 58. Group of Thirty.
Mayer, Martin. 1990. The Greatest-Ever Bank Robbery. Charles Scribner’s Sons.
* All the authors are on the
staff of the Federal Deposit Insurance Corporation (FDIC). Charles Collier and Sean Forbush are with the Division of Supervision and Consumer Protection (DSC), Collier as chief of the Information Management Section and Forbush as a senior financial analyst. Daniel Nuxoll is with the Division of Insurance and Research (DIR) as a senior economist.
This article reports the results of a close collaboration among numerous people in both the DSC and the DIR. In addition, the staff of the FDIC’s San
Francisco Regional Office encouraged the project and provided the authors with
helpful comments. The opinions expressed here are those of the authors and do not
necessarily reflect the views of the FDIC.
1See Collier et al. (2003) for a more general discussion of the
objectives and methods of the FDIC’s off-site models.
2CAMELS ratings are based on
examiners’ assessments of Capital, Asset quality, Management,
Earnings, Liquidity, and market Sensitivity. The ratings range from 1 to 5, with 1 being
the best. Banks and thrifts with a
rating of 1 or 2 are considered sound, whereas supervisors have definite
concerns about institutions with a rating of 3.
Institutions with a rating of 4 or 5 are considered problem banks. The Sensitivity rating was added only in
1997, so strictly speaking, ratings before that year are CAMEL ratings. This article uses “CAMELS” throughout,
despite the anachronism.
3Clearly, our project is most
directly related to the FDIC’s function as an insurer, not a supervisor. Consequently, this article discusses all
banks and thrifts, whether or not they are supervised by the FDIC.
It must also be observed that banks are identified by their
headquarters. Consequently, for purposes
of this stress test, the Bank of America is located in Charlotte, N.C.,
although the vast majority of its business is outside the Charlotte
metropolitan statistical area and outside the state of North Carolina. However, the number of megabanks is
relatively small, and few of the banks in our project have many operations that
are outside a small area.
4A number of popular accounts—for example, see Mayer (1990),
chapter 5—report that Edwin Gray, the chairman of the Federal Home Loan Bank
Board from 1983 to 1987, became aware of the depth of the S&L crisis while
watching a videotape of abandoned projects in the Dallas area.
5See FDIC (1997), chapter 10, for a discussion of this issue. In contrast, the Texas banking crisis during the
late 1980s and early 1990s was caused only partly by commercial real estate.
8We could have used data from years other than 1987 and 1990 to
develop the REST model, but for a terminal date, 1990 is the obvious
choice. The problems in New England were
not that apparent until 1990, yet in 1991 a significant number of banks
failed. We are especially interested in
banks that are so troubled they eventually fail; thus, a later terminal date
would ignore some important information.
The start date of 1987 corresponds closely to the peak in the New
England economy, but 1986 or 1988 could equally well have been used. Experiments indicate that the REST results
would have been similar for any of those three years.
9Also excluded was a Connecticut bank that at the end of 1988
apparently sold its regular banking operations and continued as a
10To adjust the data, we combined the data for separate institutions
that later merged. For example, if two
banks merged in January 1988, the 1987 data for the resulting bank would be the
combined balance sheets and income statements for the two banks as of December 1987.
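The merger adjustment described in this note can be sketched as follows. This is not the authors' procedure, only a minimal illustration under stated assumptions: the bank identifiers, item names, and dollar amounts are hypothetical, and the mapping of acquired banks to survivors is assumed to be known in advance.

```python
from collections import defaultdict


def final_survivor(bank_id, mergers):
    """Follow the chain of mergers until a bank that was not acquired."""
    seen = set()
    while bank_id in mergers and bank_id not in seen:
        seen.add(bank_id)  # guards against a circular mapping
        bank_id = mergers[bank_id]
    return bank_id


def merger_adjust(reports, mergers):
    """Sum the pre-merger dollar items of all banks into the surviving bank.

    reports: {bank_id: {item: dollar amount}} for the earlier date.
    mergers: {acquired_bank_id: acquiring_bank_id} for later mergers.
    """
    combined = defaultdict(lambda: defaultdict(float))
    for bank_id, items in reports.items():
        survivor = final_survivor(bank_id, mergers)
        for item, amount in items.items():
            combined[survivor][item] += amount
    return {bank: dict(items) for bank, items in combined.items()}


# HYPOTHETICAL data: bank B is later absorbed by bank A; bank C is unaffected.
reports_1987 = {
    "A": {"assets": 300.0, "construction_loans": 30.0},
    "B": {"assets": 100.0, "construction_loans": 30.0},
    "C": {"assets": 50.0, "construction_loans": 0.0},
}
adjusted = merger_adjust(reports_1987, {"B": "A"})
ratio_a = 100.0 * adjusted["A"]["construction_loans"] / adjusted["A"]["assets"]
print(adjusted)
print(f"combined construction-loan ratio for A: {ratio_a:.1f}%")
```

Ratios are recomputed from the combined dollar amounts rather than averaged across the two banks, so the larger institution carries proportionally more weight.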
11Our discussion of New
England does not refer to thrifts because the savings banks were excluded from
the sample. During this period, savings
banks filed a slightly different Call Report from the one filed by commercial
banks, so some data provided by commercial banks are missing for savings
banks. More importantly, during this
period many mutual savings banks converted to stockholder-owned savings banks,
and after conversion, these institutions behaved quite differently. See FDIC (1997). The development of the stress test assumes
that the institutions in the sample had a generally stable strategy, and
clearly many of the savings banks in New England did not.
Our discussion of
Southern California does not include thrifts because before 1991, data on
thrifts in that region are limited.
12See Collier et al. (2003). A
model could be developed that would forecast CAMELS ratings directly. However, the deterioration among banks in New
England was extremely sudden, and CAMELS ratings change only after an
examination (or, occasionally, after an off-site review). CAMELS ratings at the end of 1990 probably do
not reflect the extent of the problems in New England because examiners were
overwhelmed and had not changed the ratings at some troubled institutions. We developed a model to forecast CAMELS
ratings directly, and although it identified the same types of institutions as
the REST model, in backtests it was found to be slightly less accurate than the
13Other real estate consists mostly of real estate that banks own
because of foreclosures. Charge-offs are
gross, not net, so they cannot be less than zero.
14In fact, all banks had some loans past due 30–89 days, but the OLS
estimates differ from Tobit because of a handful of values that are close to
zero. Tobit considers the possibility
that these values are greater than zero by chance.
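The difference between OLS and Tobit mentioned in notes 13 and 14 comes from how observations at zero enter the likelihood. The sketch below writes out the standard textbook log-likelihood for a Tobit model censored from below at zero; it is not the authors' SAS code, and the function names and sample values are assumptions made for illustration.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, xb, sigma):
    """Log-likelihood of a Tobit model that is left-censored at zero.

    y:     observed values, each >= 0 (for example, gross charge-offs)
    xb:    fitted latent means x_i'beta, one per observation
    sigma: standard deviation of the latent error, > 0
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for yi, mi in zip(y, xb):
        if yi > 0:
            # Uncensored observation: ordinary normal density term.
            total += math.log(norm_pdf((yi - mi) / sigma) / sigma)
        else:
            # Observation at zero: probability that the latent value is <= 0.
            total += math.log(norm_cdf(-mi / sigma))
    return total


# HYPOTHETICAL values: one censored and two uncensored observations.
print(tobit_loglik([0.0, 0.8, 1.5], [0.1, 0.9, 1.2], 0.5))
```

OLS treats a zero as an exact observation, whereas the second branch treats it as a bound on the latent value. The two estimators therefore differ most when many observations sit at or near zero.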
15The number actually used is the logarithm of total assets.
16The statistical software SAS
supports a stepwise method for OLS but not for Tobit. The variables with the Tobit specification
were also estimated with stepwise OLS and with a full Tobit model (one that
includes all 26 variables). The
variables that were insignificant in both the stepwise OLS and the full Tobit
specification were dropped. The Tobit
was reestimated, and the more insignificant variables were dropped. In the final estimation, all variables were
significant at least at the 15 percent level.
17It should be noted that because these equations were estimated
with a stepwise procedure, the coefficients and t-statistics cannot be
interpreted in the textbook manner.
However, the estimated coefficients and t-statistics are very similar
when all the variables are included.
18The stepwise procedure complicates the usual warning about
reasoning from correlation to causality.
The coefficient on a correlated variable might well incorporate the
effect of an omitted variable.
19The numbers reported for the Tobit are pseudo-R²s. They are calculated in a manner analogous to
the manner in which OLS R²s are calculated, except that with the
Tobit numbers the calculation allows for the fact that the variables can never
be less than zero.
20The test statistics for the hypothesis that the omitted variables
have a zero coefficient are also included.
By way of comparison, the 5 percent significance level for a Chi-squared
statistic with 15 degrees of freedom is 25.00, while the comparable F-statistic
with 20 and 200 degrees of freedom is 1.62.
However, because the model was fitted with a stepwise procedure, the
statistics in the tables are not useful for classical hypothesis testing. They merely indicate that excluding the
variables has very little effect on the fit of the model.
21Our project focuses on the information that could have been known
at the time. Consequently, the REST
ratings are computed with the same coefficients that could have been used to
produce the December 1988 SCOR ratings.
There is one complication: the coefficients were estimated using revised
Call Report data and a complete set of examination ratings. Neither would have been available if someone
had estimated the SCOR model in 1989.
22Three banks are excluded because although they survived until
December 1991, they merged before they were examined. The mergers were not assisted; that is, the
banks did not fail.
23The results are not materially different if one includes banks
that were rated 3, 4, or 5 as of 1988.
24For a more extended explanation of Type I and Type II errors, see
Collier et al. (2003).
25Earlier in the same period Texas had a major crisis, which we did
not use for two reasons. First, large
bank-holding companies present a number of difficulties because of the
connections between banks in the holding company. Second, the real estate problems in Texas
began after many banks in the state had already gotten into trouble because of
loans to the oil and gas industry.
However, tests on the 1986 data from Texas show results similar to those
presented in the text for Southern California.
As of December 1986, only 34 banks had a composite CAMELS rating of 1 or
2 and a REST rating of 5. Of those 34,
13 (38 percent) failed and 13 (38 percent) became problem banks. Only 1 maintained a 1 or 2 rating until 1993. In contrast, 338 banks had a REST rating of
2, and only 12 (3 percent) failed, while 43 (13 percent) became problem banks.
26A handful of other backtests have been done and have produced
27There is a second difference as well: thrifts are included in the
December 2002 data.
28The totals in table 5 include banks and thrifts in U.S. territories.
29Unfortunately, some cities with very high percentages of poor REST
ratings (for example, Provo, Utah, and Fort Collins, Colo.) are excluded from
the table because too few institutions are headquartered in them.
30Some preliminary work also shows that new banks have unusually
poor REST ratings. As a group, banks
that are less than three years old have REST ratings comparable to those in the
MSAs listed in table 7.
31See appendix 2 in Collier et al. (2003) for an explanation of the
method for deriving SCOR weights. The
method used by REST is slightly more complicated because some variables (for
example, nonaccruing loans) can never be less than zero.
32SCOR uses the median ratios of the banks that received a rating of
2 within the previous year.
33Gilbert, Meyer, and Vaughan (1999) make this point forcefully.
34REST uses the SCOR model to
assign ratings that are comparable to CAMELS ratings. Using the data on the characteristics of banks assigned CAMELS 5 ratings after actual examinations, SCOR estimates coefficients that describe the characteristics of a 5-rated bank. In 1998, there were few banks with CAMELS 5 ratings, so for that year the SCOR characterization of a 5-rated bank relies on very little data and is consequently imprecise.
This imprecision affects REST ratings worse than 4 because a rating
midway between 4 and 5 draws on the characterizations of both 4-rated and
5-rated banks. The imprecision in SCOR (and REST) resulted in better ratings for banks with very poor financials. If one takes a set of very poor financial ratios and assigns a rating based on pre-1997 coefficients or coefficients estimated on data from 1999 or later, the ratings would all be similar. However, the 1998
coefficients produce better ratings for the weakest financial ratios (that is,
those ratios that would have been assigned a rating worse than 4 by
coefficients from other periods). The data for the worst ratings are misleading in 1998 because the coefficients for 1998 are imprecise, and the ratings based on those coefficients do not reflect the innate weakness of the banks in the worst condition.