This paper presents a new family of Stata functions devoted to small area estimation. This is accomplished by incorporating information from outside sources. Such target data sets are becoming increasingly available and can take the form of a traditional population census, but also of large-scale administrative records from tax administrations or geospatial information produced using remote sensing.
The strength of these target data sets is their granularity on the subpopulations of interest; however, in many cases they do not include analytically relevant variables such as welfare or caloric intake. The goal is to study the relationship between health care expenditures and population health, with a focus on specific elements, such as disease prevalence and trends; treatment, intervention, and prevention effectiveness; and mortality, quality of life, and other health outcomes.
The relationships are examined using Bayesian network modeling, and microsimulations are performed to evaluate hypothetical alternative scenarios. Given that no existing data set contains all of the desired measures, Raghunathan and his colleagues are working on combining data from a variety of sources.
For example, the team identified disease categories with major impact on expenditures. Related data for subsets of diseases are available from. Although information on past and current disease conditions is available from self-report data, the claims data represent current conditions, so to combine the information, both types of data are converted to a measure of whether the person ever had the disease.
Respondents are matched on the covariates and then the missing self-report in the MCBS is imputed, so that the overall self-report rates in the two surveys agree. Raghunathan concluded by saying that although there are a lot of challenges related to the portability of information from one survey to another, including differences in the data collection instruments, protocols, and timing, often combining several data sets is the best option.
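The matching and imputation step mentioned above can be illustrated with a minimal sketch. It uses a simple hot-deck draw within exact covariate cells, which is a stand-in chosen for illustration and not necessarily the procedure the team used. The function name, the record layout, and the covariate names are all hypothetical.

```python
import random
from collections import defaultdict


def hot_deck_impute(recipients, donors, covariates, target, seed=0):
    """Fill missing `target` values in recipients from matched donors.

    Records are plain dicts. A donor matches a recipient when every
    covariate agrees exactly. Recipients with no matching donor keep
    their missing value (None).
    """
    rng = random.Random(seed)

    # Group observed donor values by covariate cell.
    pools = defaultdict(list)
    for d in donors:
        if d.get(target) is not None:
            pools[tuple(d[c] for c in covariates)].append(d[target])

    imputed = []
    for r in recipients:
        row = dict(r)  # copy so the input records are not modified
        if row.get(target) is None:
            pool = pools.get(tuple(row[c] for c in covariates))
            if pool:
                row[target] = rng.choice(pool)
        imputed.append(row)
    return imputed


# Hypothetical records: "ever" = ever had the disease (1) or not (0).
donors = [
    {"age": "65-74", "sex": "F", "ever": 1},
    {"age": "75+", "sex": "M", "ever": 0},
]
recipients = [
    {"age": "65-74", "sex": "F", "ever": None},
    {"age": "75+", "sex": "M", "ever": None},
    {"age": "85+", "sex": "F", "ever": None},  # no matching donor
    {"age": "75+", "sex": "M", "ever": 1},     # already observed
]
filled = hot_deck_impute(recipients, donors, ["age", "sex"], "ever")
```

Drawing from donors in the same covariate cell tends to pull the imputed rates toward the donor survey's rates within each cell, which is the intuition behind making the overall rates in the two surveys agree. A production approach would more likely use a model-based or multiple-imputation framework.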
When data from multiple sources are synthesized, statistical modeling and an imputation framework are particularly useful tools to create the necessary infrastructure. However, data sharing has to be made easier for these types of approaches to be successful. In an ideal world, all data collection would be part of a large matrix, with a unified sampling frame and complementary content that could be filled in by different agencies or researchers working on different pieces. He prefers to think of everything as models, and, in that sense, the design-based model can be described as a saturated fixed-effects model, in which one does not make strong structural assumptions, so there are no random effects.
One can also consider unsaturated hierarchical models, so to the extent that there are any distinctions, they are in terms of the different types of models. Little argued that hierarchical models are the right way to think about this conceptually. The advantage of hierarchical models is that it is not necessary to use either the direct estimates or the model-based estimates, because they provide a compromise between the direct estimate from the saturated model and the model-based estimate from the unsaturated model.
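The compromise described above can be made concrete with a minimal sketch. It assumes an area-level model of the Fay-Herriot type with known variances, which the discussion itself does not name. Each area has a direct estimate with a sampling variance, a model-based prediction, and a shared between-area model variance. All names and numbers are hypothetical.

```python
def composite_estimates(direct, sampling_var, synthetic, model_var):
    """Shrink each direct estimate toward its model-based prediction.

    Assumes an area-level (Fay-Herriot type) model with known, positive
    variances. gamma = model_var / (model_var + sampling_var) is the
    weight on the direct estimate; 1 - gamma is the weight on the model.
    Returns a list of (composite_estimate, gamma) pairs.
    """
    results = []
    for y, psi, x in zip(direct, sampling_var, synthetic):
        gamma = model_var / (model_var + psi)
        results.append((gamma * y + (1.0 - gamma) * x, gamma))
    return results


# Illustrative inputs for three areas (not real data).
direct = [10.0, 20.0, 30.0]
sampling_var = [1.0, 4.0, 16.0]
synthetic = [12.0, 18.0, 22.0]
estimates = composite_estimates(direct, sampling_var, synthetic, model_var=4.0)
for est, gamma in estimates:
    print(f"estimate={est:.2f}  weight on direct={gamma:.2f}")
```

Areas with precise direct estimates (small sampling variance) stay close to the direct estimate, while areas with noisy direct estimates are pulled toward the model prediction.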
In some cases, Bayes modeling may be a little better because it does a better job with the uncertainty and the variance components. While the calibrated approach is a weighted combination of the two, the weights can be poorly estimated, and in simulation studies the calibrated approach can provide worse inference than the model-based approach when the hierarchical model is reasonable. Little finished by stating that the challenge is to come up with some kind of index of what he called structural assumption dependence.
For example, when the weights allocated to the model get too high, it might be possible to use that as a criterion for whether to publish an estimate. Other aspects of this include how well the model is predicting and the extent to which the model is highly parametric. He said that research is needed to develop the right indices. The number of people who can implement the design-based theory of inference is much larger than those with the skills described by Little, so that represents a very practical boundary.
Identifying the boundary will help those who have to decide whether they want to pursue a small-area application that requires considerable effort and buildup of staff. Little responded that, since this is a forward-looking workshop, the emphasis is not on how things are now, but on thinking about how things might be in the future. Graham Kalton asked Raghunathan whether using Medicare administrative records was considered when producing estimates about the population ages 65 and older.
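One way to picture the weight-based publication criterion is the sketch below. It takes, for each area, the share of the estimate that comes from the model and flags areas where that share exceeds a cutoff. The cutoff value of 0.75 and the function name are hypothetical. The discussion leaves the choice of index open as a research question.

```python
def flag_model_dependence(model_weights, max_model_weight=0.75):
    """Return True for each area whose estimate leans too heavily on the model.

    model_weights: share of each estimate contributed by the model-based
        component, between 0 and 1.
    max_model_weight: hypothetical publication cutoff (an assumption,
        not a value taken from the workshop).
    """
    flags = []
    for w in model_weights:
        if not 0.0 <= w <= 1.0:
            raise ValueError(f"model weight must be in [0, 1], got {w}")
        flags.append(w > max_model_weight)
    return flags


# Illustrative model weights for three areas.
weights = [0.2, 0.5, 0.8]
flags = flag_model_dependence(weights)
```

A fuller index would presumably also reflect predictive performance and how parametric the model is, as noted in the discussion.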
He agreed that the quality of the auxiliary data is very important in order to borrow strength for small-area estimation. In his third project, he and his team worked hard on obtaining county-level data from a wide variety of sources, not only census data, but also marketing data and data about how active the public health department is.
He added that they also went through a lot of soul searching in terms of whether the estimates are publishable. They had a meeting at the Centers for Disease Control and Prevention with state health department representatives and presented the estimates. Most said that the numbers looked reasonable. The few who did not came around as well after they went back and thought about it. The fact is that researchers have to rely on the best information available to solve a particular problem, and the modeling framework provides an opportunity to move forward with research on these topics.
Raghunathan commented that in some areas of statistics modeling is widely used, but the techniques are less common in the survey field. He argued that the distinctions made by survey researchers between model-based, model-assisted, and design-based approaches are not particularly helpful. In his research, they relied on the latest computational and statistical developments in the field as a whole, and that allowed them to solve the problems at hand.
Quoting George Box, he said that all models are wrong, but some are useful. Viewing models as a succinct summary of the information available and using that to make projections about the population helps scientific areas, such as health policy, move forward. A related issue is the number of sampled PSUs in each small area; if there is not a sizable number of PSUs in an area, direct variance estimates will be too imprecise, leading to the need to model the variances.
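The point about the number of sampled PSUs can be illustrated with a rough calculation. The sketch below uses two common approximations that are assumptions here, not statements from the workshop: the degrees of freedom of a design-based variance estimate are taken as the number of PSUs minus the number of strata, and under normality the relative standard error of a variance estimate with d degrees of freedom is about sqrt(2/d). The function name is hypothetical.

```python
import math


def variance_estimate_cv(n_psus, n_strata):
    """Approximate relative standard error of a direct variance estimate.

    Uses the rule of thumb df = n_psus - n_strata and the normal-theory
    approximation CV = sqrt(2 / df). Both are simplifying assumptions.
    """
    df = n_psus - n_strata
    if df <= 0:
        raise ValueError("need more PSUs than strata to estimate the variance")
    return math.sqrt(2.0 / df)


few = variance_estimate_cv(n_psus=4, n_strata=2)    # df = 2
many = variance_estimate_cv(n_psus=52, n_strata=2)  # df = 50
print(f"4 PSUs: CV of variance estimate ~ {few:.2f}")
print(f"52 PSUs: CV of variance estimate ~ {many:.2f}")
```

With only a handful of PSUs, the variance estimate is about as variable as the quantity it estimates, which is why modeled or generalized variances are often used for small areas.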
Fay responded that the problem of degrees of freedom raised was a common one.
In the case of the eight years of data in California, about half of the PSUs were self-representing, which means a lot of degrees of freedom for that half of the population. The study did poorly in the remaining part of the state. He agreed that a distinction can be made between design-based estimation and design-based inference, adding that the variances may have to proceed out of some form of generalization.
This was true for the CPS case as well, because for the averages it was only a guess what the true variances were. Kalton quoted Gordon Brackstone, former assistant chief statistician at Statistics Canada, who many years ago said that the problem with small-area estimates is that the local people know better, and they will always challenge model-based estimates that appear wrong. Kalton said that it turns out that the local people do not necessarily know better, and that surprisingly they tend to react to the estimates presented by constructing a rationalization for why the estimates are correct.
Bell said that he believes that when Kalton spoke of large errors, he was referring to the school district estimates and also some counties. The issue was the paucity of the data they had at the school district level. In the case of the estimates that the panel chaired by Kalton was reviewing (National Research Council), the updates were coming from the county level, and there were no school district level updates, so the quality of those estimates was not as good as the data that were available for higher levels of geography.
The general principle is that the smaller the area, the worse the data are going to be, and that is an issue. Regarding challenges, Bell commented that they are continuing to get challenges, although he does not deal with them directly himself. Often they come from very small school districts, where it is easier for the locals to have a good sense of what is going on. Occasionally the challenges make reference to other data, such as free and reduced price lunch data, a situation that indicates that there is some confusion, given that these are not the same as the poverty estimates.
There were also a lot of challenges based on the census data, using the census numbers to estimate the school district to county shares of poverty and making reference to what the previous estimates were. Generally, data users compare the current estimates to something else, and they tend to react when they see a large discrepancy, even though it is clearly possible that the other estimate was incorrect. Sometimes they have a legitimate case and it.
Zaslavsky added that if the general feeling is that there are not enough people who can do this type of analysis, then it is important to think about the implications for new directions in training.
He thinks it would be useful for researchers to continue to pursue this research and talk to the data users in contexts similar to that described by Raghunathan. To the extent that researchers are able to communicate their methods and demonstrate a commitment to accuracy, it is likely that data users will embrace these techniques, in the same way they accepted the classical estimators that they do not fully understand. Federal household surveys today face several significant challenges including: increasing costs of data collection, declining response rates, perceptions of increasing response burden, inadequate timeliness of estimates, discrepant estimates of key indicators, inefficient and considerable duplication of some survey content, and instances of gaps in needed research and analysis.
Census Bureau, was designed to address the increasing concern among many members of the federal statistical system that federal household data collections in their current form are unsustainable. The workshop brought together leaders in the statistical community to discuss opportunities for enhancing the relevance, quality, and cost-effectiveness of household surveys sponsored by the federal statistical system.
The Future of Federal Household Surveys is a factual summary of the presentations and related discussions that transpired during the workshop. This summary includes a number of solutions that range from methodological approaches, such as the use of administrative data, to emphasis on interagency cooperative efforts.