Bank Loan Underwriting Practices: Can Examiners’ Risk Assessments Contribute to Early-Warning Systems?
John O’Keefe, Virginia Olin, and Christopher A. Richardson*
Working Paper 2003-06
* John O’Keefe is Chief of the Financial
Risk Measurement Section, Division of Insurance and Research, FDIC. Virginia
Olin is a Senior Financial Economist, Division of Insurance and Research,
FDIC. Christopher A. Richardson is an Economist with the Center for Responsible
Lending, Washington, DC and was at the U.S. Department of Justice, Civil Rights
Division when this paper was written. The authors thank Andrew Davenport
and Michael Jenkins of the FDIC and the participants in the 2003 annual meetings
of the Western Economic Association for helpful comments and suggestions.
The views expressed here are those of the authors and not necessarily those
of the Federal Deposit Insurance Corporation or the Department of Justice.
Loan underwriting practices are
the primary determinant of bank credit risk and bank credit availability.
For this reason, U.S. bank supervisors conduct periodic surveys to assess
bank underwriting practices and their riskiness. In early 1995 the Federal Deposit Insurance Corporation (FDIC)
introduced a comprehensive examination questionnaire, or survey, of bank underwriting
practices at FDIC-supervised banks; FDIC bank examiners complete
the survey at the end of each FDIC-supervised bank examination. The survey
covers lending practices both in general and in specific loan categories.
This study investigates (1) the relationships between examiners’ assessments
of the riskiness of bankers' lending practices and subsequent changes in bank
condition, and (2) the question of whether these relationships can enhance
supervisors’ early-warning systems. We find that higher (lower) risk in underwriting
practices is associated with subsequent increases (decreases) in nonperforming
assets generally. We also find that assessments of underwriting risk contribute
to off-site surveillance models that predict safety-and-soundness examination
ratings. However, this contribution is largely subsumed by that of concurrent
safety-and-soundness examination ratings. Thus, underwriting survey data
are best used as diagnostic measures of the sources of financial distress.
1. Supervisory Tools for Monitoring Bank Loan Underwriting
To maintain public confidence in insured depository institutions
and to protect the deposit insurance funds (the Bank Insurance Fund and the
Savings Association Insurance Fund), federal regulatory agencies must promote
the safety and soundness of commercial banks and savings associations. To
a great extent, the risk to an institution's soundness and to the insurance
funds is determined by the quality of the institution's loan portfolio. Loans
typically make up the largest portion of the institution's asset structure,
and they ordinarily present the greatest credit risk and therefore the greatest
potential loss exposure to banks.
The most common supervisory tools
used by the regulatory agencies in promoting safety and soundness are on-site
examinations and off-site surveillance systems. Each serves a unique purpose.
For the on-site examination, the objective is to evaluate effectively the
safety and soundness of the FDIC-insured depository institution (hereinafter
"bank") and to analyze all aspects of the bank’s operations,
including loan portfolios and policies on making sound loans. For off-site
surveillance systems, the objective is to provide bank supervisors with an
early warning of potential problems in banks’ condition. Some off-site systems
model financial data that banks file with regulators (the modeling is to determine
whether additional supervisory attention is warranted before the next regularly
scheduled on-site examination; examination frequency is discussed below in
the section on sample and data). Other off-site systems use different indicators
to predict changes in banks’ condition.
In early 1995 the FDIC introduced a third tool, a hybrid of on-site
examinations and off-site surveillance systems: a questionnaire, or survey,
that supplements on-site examinations. Because loan underwriting practices
are the primary determinant of a bank’s credit risk and credit availability,
the survey’s focus was on current underwriting practices. The survey is intended
to provide an early warning of potential credit-quality problems. When examiners
look at credit administration during the examination, they identify weaknesses
that may exist, and they note that if these practices are not improved, the
bank’s condition will probably worsen because of deterioration in the quality
of loans. On the basis of this review, they complete the survey. Thus, the
underwriting survey is a way to anticipate future problems by relying on examiners’
risk assessments of current underwriting practices. In this study we assess
the survey’s contribution to early-warning systems by examining the predictive
content of survey data, using two models: one forecasts changes in banks’
supervisory ratings, and the other forecasts banks’ nonperforming assets.
In the rest of this section, we briefly explain
on-site examinations, off-site surveillance systems, and the FDIC supplemental
questionnaire. In the next section we discuss three areas of
related empirical research: underwriting cycles in property and casualty insurance
markets, bank supervisors’ off-site surveillance systems, and the predictive
content of bank supervisors’ surveys of underwriting practices. Then we describe
our sample and data. After that we present the methodology used to examine
the potential contribution of the FDIC underwriting survey to early-warning
models, and the results of our empirical tests. Finally, we summarize and conclude.
1.1 On-Site Examinations
There are four fundamental reasons
for on-site examinations. First, they help maintain public confidence in
the integrity of the banking system and individual banks. The existence of
unhealthy or deteriorating conditions, which may threaten this integrity,
should be disclosed through the examiners’ evaluations. Second, periodic
on-site examinations provide the best means of determining banks’ adherence
to laws and regulations. Third, the examination process can help prevent
problem situations from remaining uncorrected and deteriorating to the point
at which costly financial assistance by the FDIC becomes unavoidable. Finally,
examinations supply supervisors with an understanding of the nature, relative
seriousness, and ultimate cause of a bank’s problems and thus provide a sound
factual foundation on which to base corrective measures and recommendations.
The appraisal of lending and collection policies
and of the bank’s adherence to those policies, as well as the evaluation of
individual loans, is only part of the on-site examination. To be sure, examiners
are instructed to review loan policies and portfolios, but they are also capturing
a kind of information different from that captured in off-site surveillance
systems. Specifically, they review lending policies to ensure that the policies
are clearly defined and explicit enough to provide the directors and senior
officers with effective supervision. They check to see that loan policies
are up-to-date and have been approved by the board of directors. And they
check to see that the actions taken by officers and employees adhere to established
policies. The examiners’ manual contains an extensive list of broad areas
of consideration and concern that lending policies should address. 
At the end of each on-site examination,
the bank is assigned a safety-and-soundness rating. The basis for this rating
is the Uniform Financial Institutions Rating System (UFIRS) designed to evaluate
banks’ condition on a uniform basis and to identify banks requiring special
attention or concern. Each examined bank is assigned a composite rating that
uses six essential components of the bank's financial condition and operations.
These components address the adequacy of capital (C), the quality of
assets (A), the capability of management (M), the quality and
level of earnings (E), the adequacy of liquidity (L), and the
sensitivity to market risk (S). Hence, the composite rating is called
a CAMELS rating. Composite and component ratings are assigned on a 1 to 5
numerical scale. A "1" indicates the highest rating (strongest
performance and risk management practices, and the least degree of supervisory
concern), while a "5" indicates the lowest rating (weakest performance,
inadequate risk management practices, and therefore the highest degree of supervisory concern).
In sum, on-site examinations are the best way for supervisors
to track the condition of banks; however, since examiners cannot be continuously
on-site, regulators also use off-site surveillance to help span the gap between
regularly scheduled on-site examinations.
1.2 Off-Site Surveillance
Off-site surveillance provides supervisors with an early warning of potential
problems in the bank’s condition. Some off-site systems use statistical techniques
that analyze previous financial data that banks file with regulators (Call
Reports) to predict future CAMELS composite and component ratings. Other
systems are not statistically based, but they still have predictive qualities.
Both kinds of systems provide information that helps regulators determine
whether additional supervisory attention is warranted before the next regularly
scheduled on-site examination. Use of off-site surveillance, however, is
not a substitute for periodic on-site examinations. Instead, it is a valuable
complement to the examination process. Off-site surveillance has several
advantages. For instance, it is far less intrusive and uses fewer personnel
than on-site examinations. And since off-site surveillance can identify banks
that show signs of financial distress, it also helps regulators allocate on-site
examination resources efficiently.
1.3 The FDIC Supplemental Questionnaire
The supplemental questionnaire, or survey, introduced in 1995 asks examiners
to respond to questions only about underwriting practices. Underwriting practices
can generally be characterized by the criteria used to qualify borrowers,
loan pricing, repayment terms, sources of repayment, and collateral requirements.
Underwriting practices also encompass the management and administration of
the loan portfolio, including its growth, concentrations in specific markets,
out-of-area lending, and adherence to written underwriting policies. 
FDIC bank examiners
complete the survey at the end of each FDIC-supervised bank examination.
They assess material changes in underwriting practices and evaluate
them in relation to supervisory standards, rating the risk associated with
a bank’s underwriting practices in absolute terms: low, medium, or high. FDIC examiners also classify the frequency of specific risky underwriting
practices for overall lending as "never or infrequently," "frequently enough
to warrant notice," or "commonly or as standard procedure." Specifically, the FDIC survey
asks examiners about the risk in current underwriting practices, loan portfolios,
purchased loan participations, loan growth and/or significant changes in lending
activities, and loan administration.
In addition, examiners
rate the frequency of the following risky practices in overall lending: lending
in amounts that result in concentrations to one industry or borrower, out-of-area
lending, failing to adjust loan pricing for risk, failing to require principal
reductions before renewing loan terms, and deviating from written lending
policies. Finally, examiners classify the frequency of specific risky underwriting
practices in seven major loan categories: business, residential and commercial
(nonresidential) real estate construction, commercial real estate, home equity,
agriculture, consumer, and credit cards. 
The designers of the first survey
were staff from the FDIC’s former Division of Research and Statistics and
Division of Supervision (now the Division of Insurance and Research and the
Division of Supervision and Consumer Affairs, respectively). Before selecting
the questions, the designers reviewed the underwriting practices of banks
that had failed during the banking crisis of the 1980s. 
Once questions were chosen, test surveys were sent to FDIC examiners and
Division of Supervision regional directors for comment and feedback. Although
some of the survey questions have changed over time to account for emerging
underwriting concerns and to incorporate feedback from examiners, the majority
of questions have remained intact.
Other U.S. bank regulators also conduct
underwriting surveys; however, the FDIC’s survey is unique in the extent to
which it quantifies the level of risk and the frequency of specific risky
underwriting practices. The Federal Reserve Board conducts a Senior Loan
Officer Opinion Survey on Bank Lending Practices, but its primary
focus is credit availability. The Comptroller of the Currency’s annual Survey
of Credit Underwriting Practices is closer in spirit to the FDIC’s survey
in that it surveys bank examiners and has questions on credit risk, but it
does not request information about the frequency of specific risky practices.
2. Empirical Studies
Regulatory assessments of underwriting standards
become especially important as banks respond to incentives to change their
standards from time to time. In our view, one incentive that may induce changes
in underwriting standards over time is capacity constraints—periodic constraints
on the amount of risk banks can safely assume and, consequently, on the amount
of credit banks can extend. Capacity constraints in the banking industry
can arise from various sources: binding risk-based capital constraints, limited
opportunities for deposit growth in particular markets, increases in large
time-deposit funding costs, or macroeconomic weakness. In addition,
examiner responses to the FDIC survey on underwriting practices indicate that
the level of market competitiveness influences underwriting standards. Examiners
have typically commented that the main reason banks loosen underwriting practices
is competitive forces.  Here we offer a possible connection
between periodic capacity constraints, the level of market competitiveness,
and underwriting cycles in banking by drawing on research in property-casualty insurance markets.
Gron (1994) applies the theory of capacity constraints
to explain underwriting cycles in property-casualty insurance markets. She
describes four stages in a property-casualty insurance underwriting cycle.
The first stage is characterized by low profitability (and prices) and relatively
high insurance capacity. In the second stage, profitability and prices rise
abruptly in response to an unanticipated crisis, while the availability of
insurance falls as a result of capacity constraints. In the third—post-crisis—stage,
profitability and prices remain high while industry capacity gradually increases
to more normal levels as new firms enter the market and existing firms expand.
Finally, in the fourth stage, competitive pressures cause profitability and
prices to fall while capacity remains relatively high. In our adaptation
of the theory of capacity constraints to banking, we anticipate that banks
loosen underwriting standards at the fourth stage rather than reduce the price
of the loan (which directly affects profits). Gron’s finding of a
negative relationship between capacity and underwriting margins (a measure
of profitability) supports the capacity constraint theory in insurance markets
over arbitrage theories that suggest no systematic relationship between capacity
and underwriting margins. Although we do not empirically test the capacity
constraint theory in this paper, we propose that banks’ responses to capacity
constraints can create an underwriting cycle in which banks tighten or loosen
their underwriting standards in response to competitive pressures.
Haskel and Martin (1994) provide additional empirical
support for the theory of capacity constraints as applied to industrial firms.
Haskel and Martin find a positive relationship between profits and direct
measures of capacity constraints for industrial firms in the United Kingdom.
Previous empirical research to develop off-site
surveillance models has accounted for underwriting standards indirectly through
their impact on financial variables. These models estimate the probability
of individual banks’ failure or their downgrade in CAMELS rating. With the
rapid increase in the number of bank and thrift failures in the late 1980s
and early 1990s, bank regulators took particular interest in off-site models.
Several researchers have built models based on the experience of those years.
Gilbert, Meyer, and Vaughan (2002) empirically compare the predictive ability
of the Federal Reserve Board’s SEER failure prediction model with a model
they develop to predict downgrades in bank condition from CAMELS ratings 1
or 2 to ratings 3, 4, or 5. They find that over one- and two-year test windows,
their specialized downgrade model outperformed the SEER failure model by only
a small margin. They conclude that a model specified to predict downgrades
adds little predictive value to existing off-site failure prediction models.
One limitation on applying these results more generally is that their study
estimates the models on the basis of 1989 data and considers out-of-sample
performance for a period—1991 to 1998—that was relatively stable. The stability
of their test period makes it difficult to discern how their model would perform
in more volatile economic states.
Kolari et al. (2002) develop failure prediction
models for national banks using both parametric logit and nonparametric trait
recognition methods. They find that although both perform well with regard
to in-sample classification of failing and nonfailing banks, trait recognition
outperforms logit in predicting failures out of sample. Our study, in contrast,
uses data on FDIC-supervised institutions, thus providing us with more data
for both in-sample estimation and out-of-sample prediction.
Estrella, Park, and Peristiani (2000) use three types of capital ratios—risk-weighted, leverage,
and gross revenue—to predict bank failure. They find that the most complex of the ratios—the risk-weighted ratio—is
the strongest predictor of bank failure over long time horizons. However,
the risk-weighted ratio is sometimes outperformed by the leverage ratio and
the gross revenue ratio over horizons of less than two years. Their analysis
illustrates the centrality of the capital ratio as a determinant of bank failure:
the use of the capital ratio alone in a logistic regression model results
in pseudo R-squared values as high as 13.5 percent.
Data from underwriting surveys also inform monetary and supervisory
policy makers about current banking industry practices and conditions and
therefore have the potential to contribute to econometric forecasting models.
Lown, Morgan, and Rohatgi (2000) and Lown and Morgan (2003) use the
Federal Reserve’s Senior Loan Officer Opinion Survey to analyze the
importance of credit availability to the economy. They conclude that shocks
to lending standards (credit tightening or loosening) have a significant impact
on both the volume of commercial loans and real output. To date, however,
there has been no published research on the potential contribution of underwriting
survey data to statistical models that predict bank performance and financial
distress. We propose that survey data can shed light not only on the potential
future effect of credit availability on the economy (as shown by Lown, Morgan,
and Rohatgi and by Lown and Morgan) but also on future bank performance.
Using the FDIC survey data for underwriting practices, we test this hypothesis
by controlling directly for bank examiners’ assessments of underwriting standards
in off-site surveillance systems.
3. Sample and Data
To investigate whether FDIC underwriting survey data can contribute to bank
supervisors’ early-warning models, we used FDIC survey data from the period
August 1, 1995, through December 31, 2002, a period that includes 6,937 survey
observations. This period
was relatively benign for U.S. banks (the few bank failures that occurred
were often attributed to idiosyncratic factors), so it should provide a useful test of examiners’
ability to detect idiosyncratic weaknesses in underwriting practices as opposed
to broad-based problems of credit quality. During this period, the requirements
set by the Federal Deposit Insurance Corporation Improvement Act (FDICIA)
for examination frequency were in effect. Specifically, annual safety-and-soundness
examinations were required for all FDIC-insured banks except those that had
assets under $250 million and composite CAMELS ratings of 1 or 2; these banks
had to be examined once every 18 months. The FDIC is the federal supervisor
for state-chartered banks that are not members of the Federal Reserve System,
and to avoid duplication of examination expenses and the associated burdens
on banks, the FDIC generally alternates examinations with state supervisory
authorities. As a result, FDIC survey data are typically available for every
second examination of FDIC-supervised banks, or about one examination every
two to three years.
Table 1 presents the survey questions
and responses for the 1995–2001 period. As discussed above, the survey has
undergone revision over time, and we control for this in our empirical analysis.
The most significant survey change was made in October 1998, when the possible
responses to several questions on risk in general underwriting practices were
changed from relative to absolute risk rankings. That is, the possible responses
were changed from characterizing risk as above average, average, or below
average to high, medium,
or low based on supervisory standards. The change was made to improve the
clarity and ease of interpretation of survey results. Relative risk rankings
can be difficult to interpret, especially when overall industry risk is changing.
To see this, one need only consider the likely differences in the absolute
risk levels between banks with "above-average" risk in current underwriting
practices during a banking crisis and banks with "above-average"
risk during a period of record profitability. More is said below about these
changes and how we control for them in the analysis.
Since the purpose of this study is to investigate
the potential contribution of FDIC survey data to off-site surveillance models,
we use FDIC off-site surveillance models as our basis for comparison. The
primary off-site surveillance model used by the FDIC is the Statistical CAMELS
Off-Site Rating (SCOR) model. SCOR uses the balance-sheet and income-statement
information that banks are required to report to their primary federal supervisor
each calendar quarter-end (Call Reports). The SCOR model attempts to predict
CAMELS ratings by relating 12 financial ratios to each bank’s future composite
rating (similar models are developed for the component ratings). In addition
to the 12 SCOR variables, we include controls for economic conditions in banks’
local markets by using current and lagged growth rates of state personal income.
Previous research by the FDIC (Nuxoll, O'Keefe, and Samolyk) suggests
that the boom and bust economic cycle that contributed to bank failures in
the 1980s and 1990s is captured by state personal-income growth.
4. Methodology
We examine the predictive content of
the FDIC survey data using two types of models. The first type forecasts
banks’ examination ratings (CAMELS ratings) received during a given year on
the basis of prior year-end values of variables that we hypothesize determine
bank safety and soundness. The second type models nonperforming assets using
similar explanatory variables. We selected these two models because the dependent
variables (CAMELS ratings and nonperforming assets) are the broadest and,
from a supervisory perspective, the most significant indicators of bank safety
and soundness. The two dependent variables are also measured in very different
ways; that is, discrete CAMELS ratings are reported by bank supervisors, and
continuous nonperforming assets are reported by bank management. As a result,
the two models provide fairly robust tests of the potential contribution of
survey responses to off-site surveillance models.
For the first type of model, we estimate
four alternative specifications of the CAMELS-rating forecast model to show
the relative contributions of groups of explanatory variables. Specification
1, which serves as our basis of comparison, uses as explanatory variables
(a) past financial variables contained in the FDIC’s SCOR model, and (b) past
measures of state economic conditions, as measured by state personal-income
growth lagged one to five years. Specification 2 differs from specification
1 by including as additional explanatory variables (c) past FDIC underwriting
survey responses. Specification 3 expands the set of explanatory variables
used in specification 2 to include (d) past CAMELS ratings. Equation 1 presents
the broadest specification (specification 3) of the model in general form:
Specification 4 of the model is used to test the
potential overlap between the information contained in the past FDIC surveys
and past CAMELS ratings. This specification excludes past FDIC surveys as
explanatory variables but includes all the other items in the third, or full, specification.
The SCOR model uses the same set of financial variables to predict
the CAMELS composite and six component ratings (Capital, Asset
quality, Management, Earnings, Liquidity, and Sensitivity
to market risk). A review of the FDIC survey questions as presented in table
1, however, indicates that they should have a narrower scope of influence
on examination ratings than the SCOR financial variables. We anticipate
that underwriting practices are most closely related to examiners’ assessments
of the quality of bank management, as measured by the management component
rating, although the timing of the relationship would seem to be concurrent,
since the survey focuses on "current" lending practices. It is reasonable,
however, to expect that current underwriting practices affect future loan
quality and future overall safety and soundness. For these reasons, we estimate
equation 1 using three dependent variables: composite CAMELS ratings, the
asset component ratings, and the management component ratings. We estimate
all specifications of the prediction model in equation 1 using ordered logistic
regression. All financial measures are expressed as percentages of contemporaneously
measured bank assets. In specifications where lagged CAMELS ratings are used
as explanatory variables, the ratings are from examinations occurring in the
year from which year-end financial data are used. Similarly, the CAMELS ratings
used as dependent variables are from examinations occurring within one year
of the date of the financial data used as explanatory variables.
CAMELS composite and component ratings
are ordinal indices ranging from 1 (best rating) to 5 (worst rating). Because
banking conditions were very favorable during the 1995–2002 period for which
we have FDIC survey data, there have been very few poorly rated banks. Between
1995 and 2002, of the approximately 6,900 examinations that make up our sample,
the distribution of composite CAMELS ratings was as follows: 40.7 percent
rated 1, 50.9 percent rated 2, 7.6 percent rated 3, 0.7 percent rated 4,
and 0.1 percent rated 5. The small size of some CAMELS rating groups made
it impossible to estimate ordinal logistic regressions using all five rating
levels, as would normally be done. We therefore grouped CAMELS ratings of
3, 4, and 5 into one category and maintained the CAMELS 1 and 2 rating categories.
We did this for both the composite and the component ratings. Since the focus
of bank supervisors’ off-site surveillance is potential deterioration in CAMELS
1- and 2-rated banks, we believe the use of a three-level CAMELS rating does
not compromise our analysis. 
As shown in table 1, FDIC survey responses
for general underwriting practices are three-level ordinal responses, with
level 1 representing the lowest perceived risk (below-average risk, low absolute
risk, or low frequency of risky practices) and level 3 representing the highest
perceived risk (above-average risk, high absolute risk, or frequent risky
practices). To allow for differences in outcomes (future condition or CAMELS
rating) across perceived risk levels, each survey question was measured using
two dummy variables: a dummy set equal to one for low-risk responses, and
zero otherwise; and a dummy set equal to one for high-risk responses, and
zero otherwise. The middle or average responses were the omitted dummy.
For our second type of model (nonperforming
assets), equation 2 presents the basic model in general form:
Our approach to estimation of
equation 2 is parallel to the approach we used for equation 1. Specifically,
we use the same four specifications of the model (explanatory variables) as
before. The major difference in estimating equation 2 is that we model only
total nonperforming assets, expressed as a percentage of bank assets, as the
dependent variable. Nonperforming assets are defined as the sum of all loans
and leases past due 30–89 days, loans and leases past due 90 days or more,
nonaccrual loans and leases, and all other real estate owned (including repossessed
real estate). Equation 2 is estimated using ordinary least squares.
5. Empirical Results
The FDIC survey could contribute
to off-site surveillance in at least two ways. The first is by contributing
to models that forecast supervisory ratings and other quantifiable bank risk
measures, such as nonperforming assets. These contributions are perhaps the
easiest to quantify. Here we document them and show that they are statistically
and economically meaningful. The second way the FDIC survey could contribute
to off-site surveillance is by helping bank supervisors understand the actual,
current sources of bank risk. We also document this latter contribution and
show that individual survey questions have persistent and logical relationships
with future bank risk.
5.1 Prediction of CAMELS Ratings: In-Sample Results
Tables 2–4 (table 2a, table 2b,
table 3a, table 3b,
table 4a, table 4b) show
the results of logistic regressions of the four specifications of equation
1 discussed above. The regressions are designed to predict the likelihood
of receiving a poorer future CAMELS rating. To account for the changes in
possible responses to questions in October 1998, we estimate the models using
survey data from two periods: January 1996 – September 1998 and October 1998
– December 2001 (hereinafter "1996–1998" and "1998–2001,"
respectively). We combine all responses within each of the two periods and
estimate the resulting panels of data using ordered logistic regression. We are interested in panel estimations for
both periods for several reasons. First, any test of the relationships between
survey results and bank CAMELS ratings is also a test of the ability of the
survey to accurately measure the risks in underwriting practices. The change
from relative to absolute risk rankings in the survey was intended to improve
the survey’s measurement of underwriting risk, and we wish to see if this
intended improvement is supported by the data. Second and more importantly,
we wish to see if the relationships between underwriting practices and banks’
CAMELS ratings are robust over time.
Generally speaking, as we explain
in the pages that follow, all results are in close agreement with our expectations.
Discussed here are the results for composite CAMELS ratings and then for the
asset quality and management component ratings. Because of the large number
of explanatory variables included in the models as well as the number of model
specifications tested, this section summarizes results and addresses only
statistically significant relationships (statistical significance levels between
1 and 10 percent).
Tables 2a and 2b show
that for both the 1996–1998 and 1998–2001 periods, the likelihood of receiving
a poorer composite CAMELS rating is negatively related to equity capitalization
and income before taxes and extraordinary items. The likelihood of receiving
a poorer composite CAMELS rating is also negatively related to liquid assets
and gross loan charge-offs in the 1998–2001 period, but not consistently so
for all model specifications. For both the 1996–1998 and 1998–2001 periods,
the likelihood of receiving a poorer composite CAMELS rating is positively
related to the allowance for loan losses, loans past due 30–89 days, loans
past due 90 days or more, nonaccrual loans and leases, other real estate owned,
volatile liabilities, and provisions for loan losses. State personal-income
growth rates show a less-consistent relationship with future composite CAMELS
ratings, but one- and two-year lagged growth rates are positively related
to the likelihood of receiving poorer composite CAMELS ratings in the 1998–2001
period.
Similarly, the relationships between FDIC survey results and the likelihood
of receiving a poorer composite CAMELS rating are in strong agreement with
our expectations. Tables 2a and 2b
show that the likelihood of receiving a poorer composite CAMELS rating is
positively associated with the dummy for high perceived risk in underwriting
practices (above-average risk, high absolute risk, or frequent risky practices)
and negatively related to the dummy for low perceived risk in underwriting
practices (below-average risk, low absolute risk, or low frequency of risky
practices). Further, for 1996–1998 and 1998–2001 we find that FDIC survey
data add significant explanatory power to the SCOR model. Comparisons of
model specifications 1 and 2 in tables 2a and 2b
show that for both periods, the FDIC survey data increase the overall explanatory
power of the model. This improvement is evident in both the Akaike Information
Criterion and the Somers' D statistic.
A Chi-Square test of the joint significance of the survey data in specification
2 shows that for both periods, the survey explanatory variables are jointly
significant at the 95 percent confidence level. One interesting result is
that the changes in the survey responses from relative to absolute risk levels
(as seen when one compares results for the two periods) do not alter the significance
of the survey data in the forecasting models. In addition, comparisons of
specifications 2 and 3 show that the inclusion of dummy variables for initial
composite CAMELS ratings in the models reduces the number of statistically
significant survey explanatory variables. It is reasonable to expect that
since survey data characterize one portion of the information used by examiners
to determine CAMELS ratings, initial CAMELS ratings will subsume some of the
informational content of survey data. 
While specifications 3 and 4 are extremely close in overall explanatory power,
a Chi-Square test of the joint significance of the survey data in specification
3 shows that for both periods, the survey explanatory variables are jointly
significant at the 95 percent confidence level.
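The joint-significance tests described above can be illustrated with a likelihood-ratio chi-square sketch; the function names and the closed-form survival function (valid only for even degrees of freedom) are our own, since the paper does not specify its test implementation:

```python
import math

def lr_test_stat(llf_restricted, llf_full):
    """Likelihood-ratio statistic: twice the log-likelihood gain from
    adding the survey variables to the restricted model."""
    return 2.0 * (llf_full - llf_restricted)

def chi2_sf_even_df(x, df):
    """Chi-square survival function (p-value) for even df, using the
    closed form sf(x) = exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    assert df % 2 == 0 and df > 0
    term, total = 1.0, 0.0
    for k in range(df // 2):
        if k > 0:
            term *= (x / 2.0) / k
        total += term
    return math.exp(-x / 2.0) * total
```

The survey variables are jointly significant at the 95 percent confidence level when the p-value of the statistic, evaluated at the number of survey variables as degrees of freedom, falls below 0.05.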
Tables 3a and 3b show
the results for models that attempt to predict asset component ratings. The
relationships between the SCOR explanatory variables and future asset component
ratings are similar to those seen with future composite CAMELS ratings. Equity
capitalization and income before taxes and extraordinary items are negatively
related to the likelihood of receiving a poorer asset component rating for
both periods, and for the 1998–2001 period the negative relationship is true
for liquid assets and loans plus securities with maturities of five years
or more. For both the 1996–1998 and 1998–2001 periods the likelihood of receiving
a poorer asset component rating is positively related to allowance for loans
and lease losses, loans past due 30–89 days, loans past due 90 days or more,
nonaccrual loans and leases, and other real estate owned. Similarly, the
relationships between the survey data and asset component ratings are very
similar to those between the survey data and CAMELS composite ratings. Finally,
changes in the explanatory power of the model across all four specifications
of the asset component forecast model are very similar to those seen with
the CAMELS composite rating prediction. For brevity, therefore, we will not
repeat that discussion.
The results of estimation of the management component rating model are presented
in tables 4a and 4b.
The relationships that the SCOR and survey explanatory variables have with
future management component ratings are very similar to those seen with future
composite CAMELS and asset component rating forecast models. In addition,
the marginal changes in model explanatory power across the four specifications
are the same as those seen with the future composite CAMELS and asset component
ratings forecast models. One result worth highlighting is that the forecast
models have less explanatory power when predicting management component ratings
than when predicting composite CAMELS or asset component ratings.
At this point, one
may question the value of the underwriting survey responses for off-site surveillance,
given that the explanatory power of survey responses is largely subsumed by
CAMELS ratings that are contemporaneous with survey responses. The FDIC does
not, however, include CAMELS ratings as explanatory variables in the SCOR
off-site surveillance model. CAMELS forecast models that include lagged CAMELS
ratings as explanatory variables tend to produce forecasts that are clustered
around those lagged CAMELS values, since CAMELS ratings change infrequently.
We expect, however, that the underlying riskiness of banks is more evenly
distributed than the clustered results would suggest. That is, not all CAMELS
1-rated banks are of equal risk; the same can be said for other CAMELS ratings.
In addition, if a bank’s financial condition has changed since the last examination,
off-site models that omit lagged CAMELS ratings will place greater weight
on this changed condition. 
For these and other reasons, from our perspective the most important tests
of the contribution of underwriting survey data to off-site surveillance models
are those based on models without lagged CAMELS ratings as explanatory variables,
that is, model specifications 1 and 2. 
5.2 Prediction of CAMELS Ratings: Out-of-Sample Results
The true measure of the contribution
of survey data to off-site surveillance models is their contribution to out-of-sample
forecasts. This section presents the results of out-of-sample forecasts of
CAMELS ratings, using essentially the same approach as for the in-sample forecasts.
To allow for several forecast periods, we use an annual forecast model. That
is, forecast models are designed to show how explanatory variables measured
during one calendar year predict CAMELS ratings received over the next calendar
year. For example, we estimate the ordered logistic regression for model
specification 2 using survey data for 1996, year-end 1996 Call Report values
of the SCOR variables, and lagged state personal-income growth rates. The
dependent variable in this estimation is the composite CAMELS rating received
during 1997. We next apply these estimated relationships (estimated explanatory
variables’ coefficients) to survey data and SCOR variables for 1997 to predict
the composite CAMELS ratings that banks would receive in 1998. The model
specifications are otherwise identical to those discussed in the section on
in-sample results, with one exception. Since data on one survey question
(the frequency with which the bank fails to require a material principal reduction
before renewing loan terms) did not become available until October 1996, we
drop this variable from the forecast model. Dropping it allows us to extend
the sample period back to include all of 1996 and adds another forecast year
to our tests. For the sake of brevity, we do not present the estimated model
coefficients for the annual logistic regressions. Rather, we proceed directly
to measures of predictive accuracy.
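The annual estimate-then-forecast procedure can be sketched generically; here `fit_slope` is a deliberately simple one-variable stand-in for the ordered logistic estimation, and both function names are our own:

```python
def fit_slope(x, y):
    """Toy estimator: OLS slope through the origin, standing in for
    the ordered logistic estimation used in the paper."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def rolling_annual_forecasts(panels, fit, predict):
    """panels maps a year to its (X, y) data.  The model is fit on
    year t and applied to year t+1's explanatory variables, giving a
    strictly out-of-sample forecast for every year after the first."""
    years = sorted(panels)
    forecasts = {}
    for t, t_next in zip(years, years[1:]):
        model = fit(*panels[t])
        x_next, _ = panels[t_next]
        forecasts[t_next] = predict(model, x_next)
    return forecasts
```

Each forecast uses only coefficients estimated on the prior year's data, which is what makes the accuracy measures that follow genuinely out-of-sample.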
Figure 1 shows the overall
predictive accuracy of the CAMELS rating forecasts for all model specifications.
In measuring overall accuracy, we round the predicted CAMELS rating to the
nearest integer value and use that value as the forecast. The results generally
parallel those seen with the in-sample results discussed above. The inclusion
of survey data (specification 2) enhances the overall accuracy of models that
predict CAMELS composite, asset component, and management component ratings,
relative to specification 1. In addition, specifications 3 and 4 have very
similar predictive accuracy for all categories of CAMELS ratings tested.
As shown in figures 2–4, similar
results are also generally found when assessed by initial CAMELS rating groups.
These latter comparisons, however, require us to consider Type 1 and 2 error
rates. For example, a greater proportion of CAMELS 1-rated banks might be
forecast correctly at the expense of forecasting fewer CAMELS 2- or 3-rated
banks correctly. Figures 2–4 do not show the Type 1 and Type 2 error rates,
but only the percentage of forecasts that are correct. Figures 2–4 show that
in cases when model specification 2 had greater accuracy than specifications
3 or 4 in predicting any one CAMELS-rating group, it was less accurate in
forecasting some other CAMELS rating group.
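The per-group accuracy comparison can be sketched as follows; the rounding convention matches the nearest-integer rule used above, and the function name is our own:

```python
import math

def accuracy_by_group(actual, predicted):
    """Share of correct forecasts within each actual-rating group,
    after rounding each predicted rating to the nearest integer
    (half rounds up) and clamping to the 1-5 CAMELS scale."""
    correct, total = {}, {}
    for a, p in zip(actual, predicted):
        r = min(5, max(1, int(math.floor(p + 0.5))))
        total[a] = total.get(a, 0) + 1
        correct[a] = correct.get(a, 0) + (r == a)
    return {g: correct[g] / total[g] for g in total}
```

A fuller error analysis would also tabulate the off-diagonal cells of the resulting confusion matrix, since gains in one rating group can come at the expense of another.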
As an alternative test of the predictive accuracy of the models, we compute
the Pearson’s partial correlation coefficient between actual future CAMELS
ratings and the predicted ratings (without rounding the predicted ratings
to integer values). We combine forecasts across both the 1996–1998 and 1998–2001
periods, since previous results indicate similar predictive accuracy for both
periods. As shown in table 5, the correlation coefficient
increases when survey data are added to the model (compare specifications
1 and 2) for all CAMELS rating forecasts. In addition, the correlations between
actual and predicted CAMELS are greatest for specifications 3 and 4.
5.3 Nonperforming-Asset Models
Equation 2 provides an additional test of the potential contribution of FDIC
survey data to off-site surveillance models. If lapses in underwriting practices
occur, we expect the bank to be more exposed to borrowers who are unable to
meet interest and principal payments. Tables 6
and 7 present the results of ordinary least squares
regressions of the relationships between nonperforming assets and the same
types of explanatory variables used for CAMELS prediction. Nonperforming
assets are hypothesized to be determined by prior year-end nonperforming assets,
prior year-end values of all SCOR model financial variables, state personal-income
growth rates lagged one to five years, initial examination ratings, and survey
data. We also include as an explanatory variable the interval (in days) between
the date of the survey data and the date of the Call Report from which the
dependent variable (nonperforming assets) was measured. Since it may take
some time for lending practices to result in problem loans, a control for
this interval is needed. In all other respects, the four specifications we
use to estimate the nonperforming asset prediction model (equation 2) are
the same as those for CAMELS prediction.
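The nonperforming-asset regression can be sketched with a stripped-down design matrix; a single financial covariate stands in for the full set of SCOR variables, and the function name is our own:

```python
import numpy as np

def fit_npa_model(npa, lagged_npa, financial, interval_days):
    """OLS of nonperforming assets on lagged nonperforming assets, a
    financial covariate, and the interval (in days) between the survey
    date and the Call Report date from which the dependent variable
    was measured.  Returns the coefficient vector
    [intercept, lagged NPA, financial, interval]."""
    X = np.column_stack([np.ones_like(npa), lagged_npa,
                         financial, interval_days])
    beta, *_ = np.linalg.lstsq(X, npa, rcond=None)
    return beta
```

Including the survey-to-Call-Report interval as a regressor is the control, discussed above, for the time it takes lending practices to show up as problem loans.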
Tables 6 and 7 show that
nonperforming assets are positively related to lagged nonperforming assets
and equity capitalization for both the 1996–1998 and 1999–2001 periods but
show no consistent relationship with other financial variables across both
periods. However, nonperforming assets increase with high-risk underwriting
practices and decrease with low-risk underwriting practices for both the 1996–1998
and 1999–2001 periods. Finally, tests of the joint significance of the survey
explanatory variables in specification 1 show that the survey data are jointly
significant, and similar tests of specification 3 yield the same result.
We did not perform out-of-sample tests of the nonperforming-asset prediction
models because of the limited number of forecasts that might be produced.
As constructed, our models would have allowed for only two out-of-sample estimations,
and we do not feel this provides enough information to assess the out-of-sample
predictive accuracy of the models.
6. Summary and Conclusions
We conclude that the examiners’
risk assessments of underwriting practices do contribute to early-warning
systems in at least two ways. First, these assessments, as captured by the
FDIC underwriting survey, contribute to models that forecast CAMELS ratings.
Survey responses can be used to predict not only composite CAMELS ratings
but also asset component ratings and, to a lesser extent, management component
ratings. Moreover, our results indicate that inclusion of survey responses
in the FDIC’s SCOR off-site surveillance model would improve SCOR’s predictive
accuracy. In addition, we find that survey responses contribute to models
that attempt to forecast nonperforming assets. Specifically, we find that
higher (lower) risk in underwriting practices is associated with subsequent
increases (decreases) in nonperforming assets generally. Second, we argue
that survey responses add to the usefulness of off-site surveillance models
by pointing to additional potential sources of bank risk. Bank supervisors
need to know not only which banks have increased in riskiness but also what
the likely causes of the increase in risk are. It is important for bank supervisors
to know the possible reasons for increased bank risk so that they know where
to direct their efforts. Our results show that survey responses give clear
and consistent signals about the contribution of underwriting risk to overall
risk and therefore provide useful direction to bank supervisors.
Collier, Charles, Sean Forbush, Daniel Nuxoll, and
John O’Keefe. 2003. The SCOR System of Off-Site Monitoring. FDIC Banking
Review 15, no. 3:17–32.
Estrella, Arturo, Sangkyun Park, and Stavros Peristiani.
2000. Capital Ratios as Predictors of Bank Failure. Federal Reserve Bank
of New York Economic Policy Review 6, no. 2:33–52.
Federal Deposit Insurance Corporation (FDIC). 1997. History
of the Eighties—Lessons for the Future. Vol. 1, An Examination of
the Banking Crises of the 1980s and Early 1990s. FDIC.
———. February 2002. DOS Manual of Examination Policies, Section 3.
Gilbert, R. A., Andrew P. Meyer, and Mark D. Vaughan. 2002.
Could a CAMELS Downgrade Model Improve Off-Site Surveillance? Federal Reserve
Bank of St. Louis Economic Review 84, no. 1:47–64.
Gron, Anne. 1994. Capacity Constraints and Cycles in Property-Casualty
Insurance Markets. RAND Journal of Economics 25, no. 1:110–23.
Haskel, Jonathan, and Christopher Martin. 1994. Capacity
and Competition: Empirical Evidence on U.K. Panel Data. Journal of Industrial
Economics 42, no. 1:23–44.
Kolari, James, Dennis Glennon,
Hwan Shin, and Michele Caputo. 2002. Predicting Large U.S. Commercial Bank
Failures. Journal of Economics and Business 54:361–87.
Lown, Cara S., and Donald P. Morgan. Forthcoming. The Credit
Cycle and the Business Cycle: New Findings Using the Survey of Senior Loan
Officers. In Financial Market Behavior and Appropriate Regulation over
the Business Cycle, Proceedings of the 38th Annual Conference on Bank Structure
and Competition. Federal Reserve Bank of Chicago.
Lown, Cara S., Donald P. Morgan, and Sonali Rohatgi. 2000.
Listening to Loan Officers: The Impact of Commercial Credit Standards on Lending
and Output. Federal Reserve Bank of New York Economic Policy Review
6, no. 2:1–15.
Nuxoll, Daniel, John O’Keefe, and Katherine Samolyk. 2003. Do Local Economic
Data Improve Bank Off-Site Monitoring Models? FDIC Banking Review
15, no. 2:39–53.
Figure 1. Overall Accuracy of CAMELS Rating Predictions
Figure 2. Accuracy of CAMELS Composite Rating Predictions by Rating Group
Figure 3. Accuracy of CAMELS Asset Quality Rating Predictions by Rating Group
Figure 4. Accuracy of CAMELS Management Rating Predictions by Rating Group
 Part 364 of the FDIC Rules and Regulations covers "Credit
Underwriting" in a general sense, and Part 365 provides a list of specific
factors for underwriting standards. Part 365 also covers loan administration.
Low: the level of risk imposed on the institution
does not warrant notice by bank supervisors even when factors that might offset
the risk are ignored. Medium: the level of risk should be brought
to the attention of bank supervisors. There may or may not be factors that
offset the risk incurred by the institution; however, the level of risk raises
concerns when considered apart from these offsetting factors. High:
the level of risk is high and therefore should be brought to the immediate
attention of bank supervisors. There may or may not be factors that offset
the risk incurred by the institution; however, the level of risk is high when
viewed in isolation.
Never or infrequently: the institution does
not engage in the practice, or does so only to an extent that does not warrant
notice by bank supervisors. Frequently enough to warrant notice: the
institution engages in the practice often enough for it to be brought to the
attention of bank supervisors. There may or may not be factors that offset
the risks the practice imposes on the institution. Commonly or as standard
procedure: the practice is either common or standard at the institution
and therefore should be brought to the attention of bank supervisors. There
may or may not be factors that offset the risks the practice imposes on the
institution.
 The survey also asks examiners to identify potentially
risky loan categories in which the bank is actively lending, such as unguaranteed
portions of Small Business Administration loans, subprime loans, dealer paper
loans, low-documentation business loans, high loan-to-value home equity loans,
or any other category not listed. In addition, the survey asks examiners
to comment on which, if any, of these loan categories pose more-than-normal
risk to the bank.
For a complete history of this crisis, see FDIC (1997).
FDIC, Report on Underwriting Practices, various issues.
Estrella, Park, and Peristiani (2000) define the three capital
ratios as follows: The risk-weighted capital ratio is the ratio
of Tier 1 capital (which includes common stock, common stock surplus, retained
earnings, and some perpetual preferred stock) to risk-weighted assets. The
leverage ratio is Tier 1 capital divided by the quarterly average of
total tangible assets. The gross revenue ratio is Tier 1 capital
divided by total interest and noninterest income before the
deduction of any expenses.
 Although the survey began in February 1995, not until
August of that year were several key questions added to it.
 Between 1995 and 2002, 46 FDIC-insured banks failed.
Fraud was considered a significant factor in 7 (15.2 percent) of the failures,
whereas of the 472 FDIC-insured bank failures that occurred between 1990 and
1994, fraud was found to be a significant factor in 27 (5.7 percent).
 As noted above, the survey was designed in consultation
with senior examiners. They agreed that examiners had a general feel for
what was considered "below average," "average," and "above average" risk.
No formal definitions were written.
 Once a bank is CAMELS-rated 3 or worse, supervisors
increase surveillance and rely more on on-site examinations than on off-site
surveillance. See Collier, Forbush, Nuxoll, and O’Keefe (2003) for additional
discussion of this focus.
 Because we control for changes in economic conditions
using state personal-income growth, we do not include dummy variables for
time or geographic region in the models.
 Larger values for the Somers’ D statistic indicate better
model fit, while smaller values for the Akaike Information Criterion indicate
better model fit.
 The initial composite CAMELS rating and survey data
are taken from the same bank examination, so the overlap in information is
expected.
 Robustness tests were
performed on model specifications 2 and 3 for all CAMELS rating models. Specifically,
we included only those survey questions that were consistently defined for
both the 1996–1998 and 1998–2001 periods. The consistently defined survey
questions measured risk in terms of the frequency of risky practices (see
table 1). The model specifications were identical
to those shown in tables 2–4 (tables 2a, 2b, 3a, 3b, 4a, and 4b)
in all other respects. Our results regarding
the marginal contribution of the survey responses are essentially the same
as those seen in tables 2–4. Finally, pooled regressions that combined data
for both periods also yielded similar results.
 Since bank examinations are required only every 12 to
18 months, there can be a considerable difference between the dates of lagged
CAMELS ratings and the financial data that are used as explanatory variables
in off-site surveillance models.
 The reasons lagged CAMELS ratings are excluded from
the FDIC’s SCOR model are discussed at length in Collier, Forbush, Nuxoll,
and O’Keefe (2003), appendix 1.
 For example, a predicted CAMELS rating of 2.55 is rounded
up to 3, while a predicted CAMELS rating of 2.45 is rounded down to 2. Since
actual CAMELS ratings are integer values ranging from 1 to 5, some method
of translating predicted CAMELS ratings into integers is needed in order to
assess the accuracy of predictions. While rounding is the simplest way to
accomplish this, other approaches could have been used. For example, one
might round up all values that are more than 0.25 from the nearest integer,
so that predicted CAMELS of 1.26 and 2.39 are translated into ratings of 2
and 3, respectively.
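The two rounding conventions described in this note can be sketched as follows (the function names are our own):

```python
import math

def round_nearest(p):
    """Nearest-integer rule used in the paper: 2.55 -> 3, 2.45 -> 2
    (ties round up), clamped to the integer CAMELS scale of 1-5."""
    return min(5, max(1, int(math.floor(p + 0.5))))

def round_up_past_quarter(p):
    """Alternative rule from this note: round up whenever the
    prediction exceeds the next-lower integer by more than 0.25, so
    1.26 -> 2 and 2.39 -> 3; otherwise round down."""
    base = math.floor(p)
    r = base + 1 if p - base > 0.25 else base
    return min(5, max(1, r))
```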