I’ve included URLs for the .do file (for examples of the commands used) and the data set.

http://www.megafileupload.com/63h1/assemble2007callreports.do

http://www.megafileupload.com/63h2/call20071231.dta

### Overview

Every quarter, banks file detailed public reports of their balance sheets, called Call Reports. Bank regulators use these data and a simple logit model to find which banks are at risk of failure and should be monitored closely. This project replicates that type of screening model by estimating the relationship between Call Report variables and eventual bank failure in historical data. It then applies the model to recent data to find which banks currently seem most risky.

### Procedure and Details

Start with the data set callreport20071231 posted on Blackboard. (It was prepared from the December 2007 Call Reports available from the Federal Financial Institutions Examination Council using the Stata program assemble2007callreports.do, which is available on Blackboard as an example.) Merge in a list of bank failures downloaded from the Federal Deposit Insurance Corporation (http://www.fdic.gov/bank/individual/failed/banklist.html) using the “cert” identifier as the merge key. Create an indicator variable for whether the bank failed in 2008 or 2009.
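The merge-and-flag step can be sketched as follows. This is only an illustration in plain Python, not the assignment's Stata code: the records, the `closing_date` field, and the helper name `add_failure_indicator` are all hypothetical, and the real files would be read from the .dta file and the FDIC download rather than typed in.

```python
from datetime import date

# Hypothetical call-report rows; "cert" is the merge key named in the assignment.
call_report = [
    {"cert": 101, "name": "Bank A"},
    {"cert": 102, "name": "Bank B"},
    {"cert": 103, "name": "Bank C"},
]

# Hypothetical failure list: cert -> closing date (made-up values).
failures = {
    101: date(2008, 7, 11),
    103: date(2011, 3, 4),  # failed, but outside the 2008-2009 window
    999: date(2009, 5, 1),  # failure with no matching call report
}

def add_failure_indicator(banks, failures, first_year=2008, last_year=2009):
    """Left-join the failure list onto the banks and flag failures in the window."""
    merged = []
    for bank in banks:
        closed = failures.get(bank["cert"])  # None when the bank is not on the list
        failed = int(closed is not None and first_year <= closed.year <= last_year)
        merged.append({**bank, "closing_date": closed, "failed": failed})
    return merged

merged = add_failure_indicator(call_report, failures)
for row in merged:
    print(row["cert"], row["failed"])
```

Keeping every call-report bank (a left join) matters here: banks that never appear on the failure list are the zeros of the dependent variable, so they must stay in the sample.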

Replicate the regulators’ screening model by estimating a logit regression using the failure indicator as the dependent variable. The explanatory variables are the following, all as proportions of total assets (rcon2170):

- Equity (riad3210)
- Income (riad4340)
- Investment securities (rcon1773)
- Large time deposits (rcon2604 or the sum of rconj473 and rconj474)
- Loans 30 days past due (the sum of rcon1606, rcon2759, and rconb575 in the 2007 data or the sum of rcon1606, rconf172, rconf173, and rconb575 in the most recent data)
- Loans 90 days past due (the sum of rcon1607, rcon2769, and rconb576 in the 2007 data or the sum of rcon1607, rconf174, rconf175, and rconb576 in the most recent data)
- Nonaccrual loans (the sum of rcon1608, rcon3492, and rconb577 in the 2007 data or the sum of rcon1608, rconf176, rconf177, and rconb577 in the most recent data)
- Other real estate owned (rcon2150)
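As a rough sketch of how the regressors in the list above might be built, the snippet below sums the component fields for each variable and divides by total assets (rcon2170). The variable codes come from the list; the short names such as `past_due_30`, the helper `to_asset_shares`, the sample figures, and the choice to treat a missing component as zero are all assumptions made for illustration.

```python
# Components of each regressor in the 2007 layout, copied from the list above.
COMPONENTS_2007 = {
    "equity": ["riad3210"],
    "income": ["riad4340"],
    "securities": ["rcon1773"],
    "large_time_deposits": ["rcon2604"],
    "past_due_30": ["rcon1606", "rcon2759", "rconb575"],
    "past_due_90": ["rcon1607", "rcon2769", "rconb576"],
    "nonaccrual": ["rcon1608", "rcon3492", "rconb577"],
    "other_real_estate": ["rcon2150"],
}

def to_asset_shares(row, components=COMPONENTS_2007, assets_field="rcon2170"):
    """Return each regressor as a proportion of total assets, or None if assets are missing."""
    total_assets = row.get(assets_field)
    if not total_assets:  # missing or zero total assets: ratios are undefined
        return None
    return {
        # Assumption: a missing component counts as zero.
        name: sum(row.get(field) or 0.0 for field in fields) / total_assets
        for name, fields in components.items()
    }

# Made-up balance-sheet figures for one bank.
example = {
    "rcon2170": 1000.0, "riad3210": 100.0, "riad4340": 10.0, "rcon1773": 200.0,
    "rcon2604": 150.0, "rcon1606": 10.0, "rcon2759": 5.0, "rconb575": 5.0,
    "rcon1607": 4.0, "rcon2769": 1.0, "rconb576": None,
    "rcon1608": 2.0, "rcon3492": 0.0, "rconb577": 0.0, "rcon2150": 3.0,
}
shares = to_asset_shares(example)
print(shares)
```

For the most recent data, the same function could be reused with a second dictionary listing the newer component codes.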

Report the results of the logit regression.
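For readers who want to see what the logit estimator does, here is a minimal hand-rolled sketch of maximum-likelihood estimation by Newton-Raphson, using only the Python standard library. It is not how the assignment expects the model to be fit (Stata's `logit` command or an equivalent routine would normally be used, and would also report standard errors); the function names and the toy data are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logit(X, y, iterations=25, tol=1e-10):
    """Return logit coefficients [constant, b1, ...]; rows of X exclude the constant."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k                      # score vector
        hess = [[0.0] * k for _ in range(k)]  # negative Hessian of the log-likelihood
        for xi, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += p * (1.0 - p) * xi[a] * xi[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy data: one binary regressor; 1 of 4 fails when x = 0, 3 of 4 when x = 1.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
y = [0, 0, 0, 1, 0, 1, 1, 1]
beta = fit_logit(X, y)
print(beta)
```

With a single binary regressor the fitted probabilities match the group failure rates, so the constant should equal log(1/3) and the slope 2·log(3), which gives a quick sanity check on the routine.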

Calculate a predicted probability of failure within two years for each bank using the 2007 data. Create a table of all the banks that failed in that two-year period with the model’s failure probability for each. Create a table of the ten banks with the highest predicted failure probability that did not fail.
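Once each bank has a predicted probability, the two tables reduce to a sort and a filter. The sketch below assumes each bank is a record with a failure indicator `failed` and a predicted probability `p_fail`; those field names, the helper `failure_tables`, and all values are made up for illustration.

```python
# Hypothetical scored banks: "p_fail" is the model's predicted probability,
# "failed" is the 2008-2009 failure indicator. All values are made up.
banks = [
    {"cert": 101, "name": "Bank A", "failed": 1, "p_fail": 0.62},
    {"cert": 102, "name": "Bank B", "failed": 0, "p_fail": 0.71},
    {"cert": 103, "name": "Bank C", "failed": 0, "p_fail": 0.03},
    {"cert": 104, "name": "Bank D", "failed": 1, "p_fail": 0.08},
]

def failure_tables(banks, top_n=10):
    """Return (all failed banks, top_n riskiest non-failed banks), highest risk first."""
    by_risk = sorted(banks, key=lambda b: b["p_fail"], reverse=True)
    failed = [b for b in by_risk if b["failed"] == 1]
    riskiest_survivors = [b for b in by_risk if b["failed"] == 0][:top_n]
    return failed, riskiest_survivors

failed, riskiest_survivors = failure_tables(banks)
for title, table in (("Failed banks", failed), ("Riskiest survivors", riskiest_survivors)):
    print(title)
    for b in table:
        print(f"  {b['cert']:>6}  {b['name']:<10}  {b['p_fail']:.3f}")
```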

Download the most recent Call Report from https://cdr.ffiec.gov/public/PWS/DownloadBulkData.aspx. Find the same balance sheet variables used previously (using the labels in the list above) and standardize them as proportions of total assets. The variables needed will be spread across several files in the compressed folder that downloads. Use the new data to create a table of the ten banks with the highest predicted failure probability now. (Use the latent-index parameters estimated from the 2008-2009 failure data, together with the current data, to calculate a predicted latent index for each bank. Then transform that index into a probability of failure.)
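The out-of-sample step amounts to computing the latent index x'β for each current bank with the historical coefficients and passing it through the logistic function, p = 1 / (1 + exp(-x'β)). A minimal sketch follows; the coefficient values are placeholders and not estimates, only two regressors are shown for brevity, and the bank names and ratios are invented.

```python
import math
from heapq import nlargest

# Placeholder coefficients, NOT estimates: substitute the values from the historical logit.
COEFFICIENTS = {"const": -4.0, "equity": -20.0, "nonaccrual": 30.0}

def latent_index(ratios, coefs=COEFFICIENTS):
    """Constant plus the coefficient-weighted sum of a bank's asset-share ratios."""
    return coefs["const"] + sum(
        coefs[name] * ratios[name] for name in coefs if name != "const"
    )

def probability(index):
    """Logistic transform of the latent index into a failure probability."""
    return 1.0 / (1.0 + math.exp(-index))

# Invented current-quarter ratios (proportions of total assets).
current = {
    "Bank X": {"equity": 0.10, "nonaccrual": 0.01},
    "Bank Y": {"equity": 0.02, "nonaccrual": 0.08},
}

scores = {name: probability(latent_index(r)) for name, r in current.items()}
riskiest = nlargest(10, scores.items(), key=lambda kv: kv[1])
for name, p in riskiest:
    print(f"{name}: {p:.4f}")
```

The model is not re-estimated on the current data; only the regressors change, which is why the new variables have to be defined and scaled exactly as in the 2007 sample.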

### Submission

Submit the following in a single file using the Blackboard link for Main Answer file:

- logit regression results
- table of all the banks that failed in 2008-2009 with their predicted failure probabilities
- table of the ten banks with the highest predicted failure probability that did not fail
- table of the ten banks with the highest predicted failure probability using current data (and coefficients estimated from the historical data)

This project (unlike most) does not require a narrative description. Submit a script or program (such as a .do, .R, or .sas file) that allows instant replication of your work.