variable | description | units |
---|---|---|
CCRETT01 | relative consumer price indices | |
CCUS | currency exchange rates | monthly average |
IR3TIB | short-term interest rates | percent per annum |
IRLT | long-term interest rates | percent per annum |
IRSTCI | immediate interest rates, call money, interbank rate | percent per annum |
MABM | broad money (M3) | index, seasonally adjusted |
MANM | narrow money (M1) | index, seasonally adjusted |
SP | share prices | index |
OECD Economies
Australia, Austria, Belgium, Canada, Chile, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Japan, Korea, Latvia, Luxembourg, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Slovenia, Spain, Sweden, Switzerland, Turkey, United Kingdom, United States
Non-OECD Economies
Argentina, Brazil, China (People's Republic of), Colombia, Costa Rica, India, Indonesia, Lithuania, Russia, Saudi Arabia, South Africa
Monthly Data
Jan. 1950 to Nov. 2017
download
This dataset contains financial market data from stats.oecd.org covering 35 OECD countries and 11 non-OECD countries. You may use this data to explore the effect that interest rates, exchange rates, inflation rates and the money supply have on share prices.
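Specifically, you may test the null hypotheses that interest rates, exchange rates, inflation rates and the money supply each have no effect on share prices.
Then focus on the variables for which you rejected the null hypothesis. In those cases, the evidence favors the alternative hypothesis that there is a relationship, so the next question is how large that relationship is.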
For example, if we think that interest rates will rise one percentage point next month, then how much will share prices fall in response to that change?
As you conduct your analysis, you must remember that one of the Gauss-Markov assumptions is that your residuals ("error terms") must not be correlated with each other. Related to this assumption is the concept of "stationary residuals" -- the mean and variance of your residuals must be constant over time.
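One way to check whether residuals are correlated with each other is the Durbin-Watson statistic. It is a common diagnostic that the text above does not name, so treat this as a minimal sketch rather than a required method. The residual values below are invented for illustration and do not come from the OECD data.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a list of regression residuals.

    Values near 2 suggest little first-order autocorrelation,
    values near 0 suggest positive autocorrelation,
    and values near 4 suggest negative autocorrelation.
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals (invented, not from the dataset).
trending = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]      # each error resembles the last
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]   # each error flips sign

print(durbin_watson(trending))      # well below 2
print(durbin_watson(alternating))   # well above 2
```

A statistic far from 2 in either direction is a signal that the "uncorrelated errors" assumption deserves a closer look.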
Taking the difference in value between one time period and the next will usually make a series stationary, so if you difference each variable in your regression model, your residuals will usually be stationary too.
The alternative is to find a co-integrating relationship among your variables that makes the residuals stationary. In practice, however, it is difficult to find such co-integrating relationships, so I encourage you to work with the differenced variables.
But also -- from the perspective of an investor -- the share price itself is not important. What is important to the investor is the change in share price (i.e. the difference in share price).
So from an investment perspective, you want to develop a model that predicts changes in share price. What predicts those changes?
And how large is the effect of a change in each predictor on the change in share price?
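The differencing approach described above can be sketched in a few lines. This is only an illustration with one predictor: the helper names `diff` and `ols` are hypothetical, and the two short series are invented stand-ins for `IR3TIB` and `SP`, not real observations. A real analysis would use the downloaded data, several predictors and a regression package that reports standard errors.

```python
def diff(series):
    """First difference: each period's value minus the previous period's value."""
    return [b - a for a, b in zip(series, series[1:])]

def ols(x, y):
    """Least-squares regression of y on a single x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical monthly levels (invented for illustration).
rate = [2.0, 2.5, 2.25, 3.0, 2.75, 3.5, 3.25]           # stands in for IR3TIB
shares = [100.0, 98.5, 99.5, 97.0, 98.0, 95.5, 96.5]    # stands in for SP

# Regress the change in share prices on the change in the interest rate.
d_rate = diff(rate)
d_shares = diff(shares)
intercept, slope = ols(d_rate, d_shares)

# Predicted change in the share index if the rate rises one percentage point.
predicted_change = intercept + slope * 1.0
print(slope, predicted_change)
```

The slope answers the "how large" question: it is the estimated change in the share price index associated with a one percentage point change in the interest rate.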
...
variable | description | units |
---|---|---|
gwagegap | gender wage gap | percentage |
minwage | minimum wage in 2014 constant prices | 2014 USD PPPs |
rgdpcap | GDP per head (expenditure approach) at constant prices, constant PPPs, OECD base year = 2010 | US dollar |
dln_cpi | CPI inflation rate (no food, no energy) | percentage |
uniondens | union density -- percentage of wage and salary earners that are trade union members | percentage |
lrem25fe | Employment rate, Aged 25-54, Females | percentage |
lrem25ma | Employment rate, Aged 25-54, Males | percentage |
lrem25tt | Employment rate, Aged 25-54, All Persons | percentage |
lrem64fe | Employment rate, Aged 15-64, Females | percentage |
lrem64ma | Employment rate, Aged 15-64, Males | percentage |
lrem64tt | Employment rate, Aged 15-64, All Persons | percentage |
lfwa25fe | Working age population, Aged 25-54, Females | percentage |
lfwa25ma | Working age population, Aged 25-54, Males | percentage |
lfwa25tt | Working age population, Aged 25-54, All Persons | percentage |
lfwa64fe | Working age population, Aged 15-64, Females | percentage |
lfwa64ma | Working age population, Aged 15-64, Males | percentage |
lfwa64tt | Working age population, Aged 15-64, All Persons | percentage |
eprc_v1 | employment protection -- individual and collective dismissals (regular contracts), version 1 | 0 to 6 |
eprc_v2 | employment protection -- individual and collective dismissals (regular contracts), version 2 | 0 to 6 |
eprc_v3 | employment protection -- individual and collective dismissals (regular contracts), version 3 | 0 to 6 |
epr_v1 | employment protection -- individual dismissals (regular contracts), version 1 | 0 to 6 |
epr_v3 | employment protection -- individual dismissals (regular contracts), version 3 | 0 to 6 |
epc | employment protection -- collective dismissals (additional provisions) | 0 to 6 |
ept_v1 | employment protection -- temporary employment, version 1 | 0 to 6 |
ept_v3 | employment protection -- temporary employment, version 3 | 0 to 6 |
OECD Economies
Australia, Austria, Belgium, Canada, Chile, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Japan, Korea, Luxembourg, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Slovenia, Spain, Sweden, Switzerland, Turkey, United Kingdom, United States
Annual Data
1985 to 2014
download
This dataset contains labor market data from stats.oecd.org covering 34 OECD countries. It is the same dataset that I use in class with the addition of "gender wage gap" -- the percentage difference between male and female wages.
In class, we discuss the effect of labor market regulation on male and female employment rates. You might extend that analysis by exploring the effect of labor market regulation on the gender wage gap.
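As a starting point for that extension, here is a minimal sketch that relates `gwagegap` to `eprc_v1` using only the variation within each country. Removing country means (a "fixed effects" approach) is my assumption about a reasonable specification for country-by-year data; it is not necessarily the model used in class. The rows are invented and the helper names are hypothetical.

```python
from collections import defaultdict

def demean_by_group(groups, values):
    """Subtract each group's own mean, removing fixed differences between groups."""
    totals = defaultdict(lambda: [0.0, 0])
    for g, v in zip(groups, values):
        totals[g][0] += v
        totals[g][1] += 1
    return [v - totals[g][0] / totals[g][1] for g, v in zip(groups, values)]

def slope_through_origin(x, y):
    """Least-squares slope for data that has already been demeaned."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical rows of (country, eprc_v1, gwagegap); the numbers are invented.
rows = [
    ("A", 1.0, 20.0), ("A", 1.5, 19.0), ("A", 2.0, 18.0),
    ("B", 3.0, 12.0), ("B", 3.5, 11.0), ("B", 4.0, 10.0),
]
countries = [r[0] for r in rows]
x = demean_by_group(countries, [r[1] for r in rows])
y = demean_by_group(countries, [r[2] for r in rows])

# Change in the wage gap associated with a one-point change in the index,
# comparing each country only with itself over time.
within_slope = slope_through_origin(x, y)
print(within_slope)
```

Demeaning keeps permanent differences between countries from being mistaken for an effect of regulation.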
...
variable | description | units |
---|---|---|
NODEID | intersection identifier | |
Casualties | sum of "Fatalities" and "Injuries" during month | count |
Fatalities | total number of fatalities during month | count |
PedFatalit | number of pedestrian fatalities during month | count |
BikeFatali | number of bicyclist fatalities during month | count |
MVOFatalit | number of motorist fatalities during month | count |
Injuries | total number of injuries during month | count |
PedInjurie | number of pedestrian injuries during month | count |
BikeInjuri | number of bicyclist injuries during month | count |
MVOInjurie | number of motorist injuries during month | count |
CasualBefore | total "Casualties" from 2009 to 2013 | sum |
CasualAfter | total "Casualties" from 2014 to 2017 | sum |
InjurBefore | total "Injuries" from 2009 to 2013 | sum |
InjurAfter | total "Injuries" from 2014 to 2017 | sum |
FatalBefore | total "Fatalities" from 2009 to 2013 | sum |
FatalAfter | total "Fatalities" from 2014 to 2017 | sum |
Monthly Data
Jan. 2009 to Dec. 2017
download
nyc-dot_by-mon_with-zeroes.csv.zip (Zipped CSV file)
lib_crosstabs.r (R library)
nyc-dot_crosstabs_v3.r (R script)
This NYC DOT dataset contains information on traffic fatalities and injuries at 30,754 New York City intersections over 9 years.
In 2014, New York City launched an initiative called "Vision Zero" with the goal of eliminating traffic fatalities and injuries, so the most recent 4 years of data (from Jan. 2014 to Dec. 2017) fall under that initiative. Vision Zero reduced the default speed limit throughout the city from 30 to 25 miles per hour and changed traffic rules at many intersections.
You may use this dataset to test the null hypothesis that Vision Zero did not reduce fatalities or injuries. And to conduct such a hypothesis test, you might use regression analysis. But because this dataset is so large (3,347,028 observations) we can also create cross-tabulations that directly examine the empirical distribution.
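A cross-tabulation of this kind can be sketched as follows. This is an independent Python illustration, not a translation of the R script listed above. It assumes one `(CasualBefore, CasualAfter)` pair per intersection, converts each total to a per-year rate because the "before" period covers 5 years and the "after" period covers 4, and uses bucket boundaries that I chose arbitrarily. The input pairs are invented.

```python
from collections import Counter

YEARS_BEFORE = 5   # 2009 to 2013
YEARS_AFTER = 4    # 2014 to 2017

def bucket(per_year):
    """Place an annual casualty rate into a coarse category (arbitrary cutoffs)."""
    if per_year == 0:
        return "none"
    return "under 1 per year" if per_year < 1 else "1 or more per year"

def crosstab(rows):
    """Count intersections by (before bucket, after bucket)."""
    table = Counter()
    for casual_before, casual_after in rows:
        before = bucket(casual_before / YEARS_BEFORE)
        after = bucket(casual_after / YEARS_AFTER)
        table[(before, after)] += 1
    return table

# Hypothetical (CasualBefore, CasualAfter) pairs, one per intersection.
rows = [(0, 0), (3, 1), (10, 4), (6, 2), (0, 1), (12, 8)]
table = crosstab(rows)
for (before, after), count in sorted(table.items()):
    print(f"{before:>20} -> {after:<20} {count}")
```

Cells where the "after" bucket is lower than the "before" bucket count the intersections that improved, which gives a direct look at the empirical distribution without fitting a model.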
The one requirement is that you must use a high-memory computer for this analysis. With this many observations, the R script took 5 minutes to run on my computer with 8 GB of RAM.
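If memory is tight, simple totals can be computed by reading the zipped CSV one row at a time instead of loading it whole. The sketch below is an assumption-heavy illustration: it assumes the archive holds a single CSV whose header row uses the column names from the table above, and `stream_column_totals` is a hypothetical helper name. To stay runnable without the real download, it demonstrates the function on a tiny archive that it builds itself.

```python
import csv
import io
import os
import tempfile
import zipfile

def stream_column_totals(zip_path, columns):
    """Sum the named columns row by row, without loading the whole file."""
    totals = {name: 0.0 for name in columns}
    with zipfile.ZipFile(zip_path) as archive:
        member = archive.namelist()[0]   # assumption: the archive holds one CSV
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for row in csv.DictReader(text):
                for name in columns:
                    totals[name] += float(row[name])
    return totals

# Demonstration on a tiny invented archive (not the NYC DOT data).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.csv.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr(
            "demo.csv",
            "NODEID,Fatalities,Injuries\n1,0,2\n1,1,0\n2,0,3\n",
        )
    demo_totals = stream_column_totals(demo_zip, ["Fatalities", "Injuries"])

print(demo_totals)
```

With the real file, the call would look like `stream_column_totals("nyc-dot_by-mon_with-zeroes.csv.zip", ["Fatalities", "Injuries"])`, provided the assumptions about the archive hold.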
...
Copyright © 2002-2025 Eryk Wdowiak