The list below includes all research that has been published or made available as a working paper. Where possible, I have included a link to download the open-access working paper version of the publication. For current research projects, see the Research section of this site.
This paper estimates returns to education using a dynamic model of educational choice that synthesizes approaches in the structural dynamic discrete choice literature with approaches used in the reduced-form treatment effect literature. It is an empirically robust middle ground between the two approaches that estimates economically interpretable and policy-relevant dynamic treatment effects, accounting for heterogeneity in cognitive and non-cognitive skills and for the continuation values of educational choices. Graduating college is not a wise choice for all. Ability bias is a major component of observed educational differentials. For some, there are substantial causal effects of education at all stages of schooling.
This paper develops robust models for estimating and interpreting treatment effects arising from both ordered and unordered multi-stage decision problems. Identification is secured through instrumental variables and/or conditional independence (matching) assumptions. We decompose treatment effects into direct effects and continuation values associated with moving to the next stage of a decision problem. Using our framework, we decompose the IV estimator, showing that IV generally does not estimate economically interpretable or policy-relevant parameters in prototypical dynamic discrete choice models, unless policy variables are instruments. Continuation values are an empirically important component of estimated total treatment effects of education. We use our analysis to estimate the components of what LATE estimates in a dynamic discrete choice model.
The option to obtain a General Educational Development (GED) certificate changes the incentives facing high school students. This article evaluates the effect of three different GED policy innovations on high school graduation rates. A 6-point decrease in the GED pass rate produced a 1.3-point decline in high school dropout rates. The introduction of a GED certification program in high schools in Oregon produced a 4% decrease in high school graduation rates. Introduction of GED certificates for civilians in California increased the dropout rate by 3 points. The GED program induces students to drop out of high school.
This paper discusses and illustrates identification problems in personality psychology. The measures used by psychologists to infer traits are based on behaviors, broadly defined. These behaviors are produced from multiple traits interacting with incentives in situations. In general, measures are determined by these multiple traits and do not identify any particular trait unless incentives and other traits are controlled for. Using two data sets, we show, as an example, that substantial portions of the variance in achievement test scores and grades, which are often used as measures of cognition, are explained by personality variables.
Intelligence quotient (IQ), grades, and scores on achievement tests are widely used as measures of cognition, but the correlations among them are far from perfect. This paper uses a variety of datasets to show that personality and IQ predict grades and scores on achievement tests. Personality is relatively more important in predicting grades than scores on achievement tests. IQ is relatively more important in predicting scores on achievement tests. Personality is generally more predictive than IQ for a variety of important life outcomes. Both grades and achievement tests are substantially better predictors of important life outcomes than IQ. The reason is that both capture personality traits that have independent predictive power beyond that of IQ.
The General Educational Development (GED) credential is issued on the basis of an eight-hour subject-based test. The test claims to establish equivalence between dropouts and traditional high school graduates, opening the door to college and positions in the labor market. In 2008 alone, almost 500,000 dropouts passed the test, amounting to 12% of all high school credentials issued in that year. This chapter reviews the academic literature on the GED, which finds that the certificate has minimal value in terms of labor market outcomes and that only a few individuals successfully use it as a path to post-secondary credentials. Although the GED establishes cognitive equivalence on one measure of scholastic aptitude, recipients still face limited opportunity due to deficits in noncognitive skills such as persistence, motivation, and reliability. The literature finds that the GED testing program distorts social statistics on high school completion rates, minority graduation gaps, and sources of wage growth. Recent work demonstrates that, through its availability and low cost, the GED also induces some students to drop out of school. The GED program is unique to the United States and Canada, but provides policy insight relevant to any nation's educational context.
Objective: To design and implement a tool that creates a secure, privacy-preserving linkage of electronic health record (EHR) data across multiple sites in a large metropolitan area in the United States (Chicago, IL), for use in clinical research.
Methods: The authors developed and distributed a software application that performs standardized data cleaning, preprocessing, and hashing of patient identifiers to remove all protected health information. The application creates seeded hash code combinations of patient identifiers using a SHA-512 algorithm, compliant with the Health Insurance Portability and Accountability Act (HIPAA), that minimizes re-identification risk. The authors subsequently linked individual records using a central honest broker with an algorithm that assigns weights to hash combinations in order to generate high-specificity matches.
Results: The software application successfully linked and de-duplicated 7 million records across 6 institutions, resulting in a cohort of 5 million unique records. Using a manually reconciled set of 11,292 patients as a gold standard, the software achieved a sensitivity of 96% and a specificity of 100%, with a majority of the missed matches accounted for by patients with both a missing Social Security number and a last name change. Using 3 disease examples, the authors demonstrate that the software can reduce duplication of patient records across sites by as much as 28%.
Conclusions: Software that standardizes the assignment of a unique seeded hash identifier merged through an agreed-upon third-party honest broker can enable large-scale secure linkage of EHR data for epidemiologic and public health research. The software algorithm can improve future epidemiologic research by providing more comprehensive data, given that patients may make use of multiple healthcare systems.
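As a rough illustration of the seeded hashing step described in the Methods above, the following Python sketch cleans a few patient identifiers and produces one SHA-512 digest per identifier combination. It is not the authors' software. The field names, the three combinations, the cleaning rules, and the way the seed is joined to the identifiers are all assumptions made for this example, since the abstract does not specify them.

```python
import hashlib
import unicodedata

# Hypothetical identifier combinations; the abstract does not list the real ones.
COMBINATIONS = [
    ("first_name", "last_name", "dob", "ssn"),
    ("first_name", "last_name", "dob"),
    ("last_name", "dob", "ssn"),
]


def normalize(value: str) -> str:
    """Assumed cleaning rule: strip accents, uppercase, keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.upper() if ch.isalnum())


def seeded_hashes(record: dict, seed: str) -> dict:
    """Return one SHA-512 hex digest per combination whose fields are all present.

    The seed is simply joined to the cleaned values before hashing (an assumption);
    a keyed construction such as HMAC would be another option.
    """
    hashes = {}
    for fields in COMBINATIONS:
        values = [normalize(record.get(name) or "") for name in fields]
        if not all(values):
            continue  # skip combinations with a missing identifier
        payload = "|".join([seed, *values]).encode("utf-8")
        hashes[fields] = hashlib.sha512(payload).hexdigest()
    return hashes


if __name__ == "__main__":
    site_a = {"first_name": "Ann", "last_name": "O'Brien",
              "dob": "1980-01-02", "ssn": "123-45-6789"}
    site_b = {"first_name": " ann", "last_name": "OBrien",
              "dob": "19800102", "ssn": "123456789"}
    shared_seed = "example-seed"  # hypothetical value agreed on by all sites
    print(seeded_hashes(site_a, shared_seed) == seeded_hashes(site_b, shared_seed))
```

Because every site applies the same cleaning rules and the same seed, records for the same patient yield identical digests even when the raw formatting differs, so only digests need to be shared. A record with a missing Social Security number yields only the combinations that do not require it, which suggests why a linkage step might weight combinations differently.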