Data from the 1998 Medicare Current Beneficiary Survey indicate that for otherwise similar individuals with heart disease, the likelihood and extent of utilization of heart medications are independent of supplemental insurance and drug coverage, whereas total and out-of-pocket expenses are not. Yet, a large share of heart patients does not use heart medications, as many lack drug coverage. Nonusers without drug coverage are disproportionately represented in the subsample that reports a recent inpatient hospital stay for heart disease. This paper discusses these findings.
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: A Weighted Jackknife Method for the Fay-Herriot Model with an Application in the SAIPE Program
- Speaker: P. Lahiri, University of Nebraska-Lincoln & University of Maryland at College Park
- Date/Time: April 23, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
We present a weighted jackknife method to estimate the mean square error (MSE) of the empirical best linear unbiased predictor (EBLUP) of a small-area mean for the well-known Fay-Herriot model. The proposed MSE estimator improves on the existing MSE estimators and is robust under a variety of situations. We illustrate our methodology for the U.S. Census Bureau's SAIPE program.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
Title: Combination of Information from Several Sources: The Case of t and F Tests
- Speaker: Professor Benjamin Kedem, Chair of Statistics Program, University of Maryland, College Park, Maryland
- Chair: Jai Choi, PhD, mathematical statistician, Office of Research and Methodology, National Center for Health Statistics (NCHS), 301-458-4144
- Date/Time: Wednesday, April 24, 2002, 10:00 - 11:30 a.m.
- Location: National Center for Health Statistics Auditorium, Room 1110
- Sponsor: WSS Public Health and Biostatistics Program and the Office of Research and Methodology, NCHS
Abstract:
We consider the following general problem. Suppose there are several sources of information regarding a certain quantity, where some of the sources are reliable and some are distorted. How can we combine all the data, reliable as well as distorted, to improve the reliability of the "good data"? A case in point is the classical analysis of variance. It will be demonstrated that the idea of combining poor and reliable data can improve and generalize the classical t and F tests without the usual normality assumption.
Topic: Including Families with Limited English Proficiency in the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B)
- Speaker: Brad Edwards, Westat
- Date: Thursday, April 25, 2002, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics Conference Rooms 9
- Sponsor: WSS Data Collection Methods Section and AAPOR-DC
Abstract:
Language minority families present special challenges for the Early Childhood Longitudinal Study's Birth Cohort (ECLS-B). Data collection methods include CAPI interviews with parents, direct assessments of children, self-administered paper questionnaires for fathers, and CATI interviews with child care providers. In Round 1 of the study, data will be collected from about 1,800 Asian, 1,400 Hispanic, and 900 American Indian births, part of a national sample of about 13,000 children born throughout 2001. The approach to language minority issues is to make every reasonable effort to include these families in the study, to collect their data without compromising quality in any major way, and to be sensitive to cultural differences presented by these families. At the same time, fixed resources are available to the project and there are tradeoffs in reaching out to minority language families without jeopardizing the overall study design. Specific criteria and decision rules have been developed, so that the procedures for including language minority families are not arbitrary and their data are collected in a standardized manner. Although much of the focus in developing the ECLS-B language minority protocol has been on the first two data collection points, the general approach incorporates a longitudinal perspective, and this presentation addresses issues that are likely to occur over the course of all waves of data collection, ending when the children are in first grade.
University of Maryland
Statistics Program
Department of Mathematics Seminar
Title: Monte Carlo Approximation and the Bootstrap
- Speaker: Jim Booth, Department of Statistics, University of Florida
- Date/Time: Thursday, April 25, 3:30 p.m.
- Location: Dean's Conference Room, Van Munching Hall 3300, University of Maryland. For directions, please visit http://www.rhsmith.umd.edu/visitors/planning.html
Abstract:
The bootstrap can be thought of as a simple plug-in rule that may be stated as follows: "Estimate any functional characteristic of an unknown distribution by the same characteristic of a fitted or empirical distribution." In particular, given an i.i.d. sample from an unknown distribution, the bootstrap can be used to estimate the bias, the variance and quantiles of the sampling distribution of any statistic. In most cases exact computation of bootstrap estimates is either analytically intractable or computationally infeasible. Thus, in practice bootstrap estimates are usually approximated by Monte Carlo methods. In this talk I will discuss the amount of Monte Carlo simulation necessary for accurate approximation of bootstrap standard errors and confidence intervals and argue that the answer is more than is generally thought.
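To make the point about Monte Carlo error concrete, here is a minimal sketch in Python (standard library only). It is not taken from the talk: the sample, the choice of the median as the statistic, the resample counts and the helper names `bootstrap_se` and `se_spread` are all assumptions made for illustration. It approximates the bootstrap standard error many times over and measures how much the approximation itself varies for a small and a large number of resamples.

```python
import random
import statistics


def bootstrap_se(sample, statistic, n_resamples, rng):
    """Monte Carlo approximation to the bootstrap standard error of `statistic`."""
    n = len(sample)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the empirical distribution.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return statistics.stdev(replicates)


def se_spread(sample, statistic, n_resamples, n_repeats, seed=0):
    """Standard deviation of the approximated SE over repeated Monte Carlo runs."""
    rng = random.Random(seed)
    estimates = [
        bootstrap_se(sample, statistic, n_resamples, rng) for _ in range(n_repeats)
    ]
    return statistics.stdev(estimates)


# Hypothetical data: 30 draws from a standard normal distribution.
data_rng = random.Random(42)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(30)]

spread_small = se_spread(sample, statistics.median, n_resamples=25, n_repeats=40)
spread_large = se_spread(sample, statistics.median, n_resamples=400, n_repeats=40)
print(f"spread of SE estimate, B=25: {spread_small:.4f}  B=400: {spread_large:.4f}")
```

The spread reported for the small resample count is pure simulation noise added on top of the sampling error that the bootstrap is meant to estimate, which is one way to see why the number of resamples matters.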
University of Maryland
Statistics Program
Department of Mathematics Seminar
Title: On the Correlation Structure of Transformed Gaussian Random Fields
- Speaker: Victor De Oliveira
- Date/Time: Thursday, April 25, 3:30 p.m.
- Location: Room 1313, Mathematics Building, University of Maryland. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
Abstract:
Transformed Gaussian random fields can be used to model continuous time series and spatial data when the Gaussian assumption is not appropriate. The main features of these random fields are specified in a transformed scale, while for modeling and parameter interpretation it is useful to establish connections between these features and those of the random field in the original scale. This work provides evidence that, for many "normalizing" transformations and under certain conditions, the correlation function of a transformed Gaussian random field depends little on the transformation that is used. Hence many commonly used transformations of correlated data have little effect on the original correlation structure. The property is shown to hold for some kinds of transformed Gaussian random fields, and a statistical explanation based on the concept of parameter orthogonality is provided. The property is also illustrated using two spatial data sets and several "normalizing" transformations. Some consequences of this property for modeling and inference are also discussed.
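One case where the claim can be checked by hand is the log transformation. If (X, Y) is bivariate normal with common standard deviation sigma and correlation rho, the correlation between exp(X) and exp(Y) is (exp(rho * sigma^2) - 1) / (exp(sigma^2) - 1). The sketch below evaluates this standard lognormal identity; the function name and the parameter values are assumptions chosen for illustration and are not taken from the talk.

```python
import math


def lognormal_correlation(rho, sigma):
    """Correlation of exp(X) and exp(Y) when (X, Y) is bivariate normal with
    common standard deviation `sigma` and correlation `rho`."""
    return math.expm1(rho * sigma ** 2) / math.expm1(sigma ** 2)


# Compare the correlation in the Gaussian scale with the correlation in the
# original (exponentiated) scale for a moderate, hypothetical sigma.
for rho in (0.2, 0.5, 0.8):
    original_scale = lognormal_correlation(rho, sigma=0.5)
    print(f"Gaussian scale: {rho:.2f}  original scale: {original_scale:.3f}")
```

For moderate sigma the two correlations stay close; the gap widens as sigma grows, which fits the abstract's qualification that the property holds under certain conditions.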
University of Maryland
Statistics Program
Department of Mathematics Seminar
Title: Application of the Sanov Large Deviation Theorem to the Density Estimation and Screening Significant Factors
- Speaker: Mikhail B. Malioutov, Northeastern University, Boston
- Date/Time: Thursday, May 2, 2002, 3:30 p.m.
- Location: Room 1313, Mathematics Building, University of Maryland College Park. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
Abstract:
Two remarkable applications of the Sanov theorem will be outlined. The first one deals with large Lp deviations of general regular density estimates. The exponential rate of the Lp decay turns out to be free of the underlying density function and the estimator. The second application proves the asymptotic optimality of the famous Jaynes principle in finding significant inputs of an unknown noisy function.
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: YOU ARE HERE: Information Architecture and Web Navigation
- Speaker: Jonathan Lazar, Professor of Computer and Information Sciences, Towson University
- Date/Time: May 8, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
In large web sites, intranets, extranets and other information spaces, users tend to get lost and disoriented among the hundreds and thousands of web pages. It is frustrating to users when they cannot reach their task goal because they cannot find the content that they need. Information architecture and web navigation focus on structuring information in a manner so that users can find what they need with relative ease. With appropriate architecture and navigation, users are aware of what information is available on the web site and can reach the maximum amount of content with minimal effort. In turn, this will increase user satisfaction and productivity. This presentation will focus on what web designers need to know about information architecture and web navigation to design effective sites for users.
Topic: Survey Automation: The Promise and the Reality
- Speaker: Jesse Poore, Ericsson-Harlan D. Mills Chair in Software Engineering, University of Tennessee
- Date/Time: May 10, 2002, 3:00 - 4:30 p.m. (See below: RSVP by May 3rd required)
- Location: Auditorium at the National Academy of Sciences, 2100 C Street, NW, Washington, DC. Please arrive early, as parking is limited, and be prepared to show identification to enter the building. Please note that the entrance to the National Academy of Sciences building at 2101 Constitution Avenue, NW, is closed to the public. Guests wishing to take Metro to the seminar are encouraged to take the National Academy's shuttle, which departs from the Foggy Bottom/GWU Metro station every 30 minutes.
Abstract:
A tea, from 2:30 to 3:00 p.m., will precede the afternoon session, which will begin with a discussion of recent developments in national statistics, followed by a seminar on the challenges of automating complex survey questionnaires and how statistical agencies may benefit from the computer sciences to make survey automation more efficient and effective. (The seminar is based on a recent CNSTAT workshop on survey automation, which brought together leading computer scientists and survey methodologists.) The seminar will include a brief overview of why the replacement of paper questionnaires by computerized instruments, so promising in theory, can be so difficult in practice, and will feature a presentation by Jesse Poore, Ericsson-Harlan D. Mills Chair in Software Engineering, University of Tennessee, on computer science tools for management, documentation, and testing of complex software. Discussion will follow the presentation. A reception will follow from 4:30 to 5:15 p.m. in the Members' Room.
All are welcome, but for security purposes, you must RSVP by May 3rd. To RSVP, or if you need further information, please contact Danelle Dessaint at (202) 334-3096 or email ddessain@nas.edu.
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: The One-Way Fixed and Random Models under Heteroscedasticity
- Speaker: Aref N. Dajani, Statistical Research Division, U.S. Census Bureau
- Date/Time: May 14, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - the Henry Gannett and Herman Hollerith Rooms, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
For testing the equality of several treatment effects in a one-way fixed effects model, or for testing the significance of the treatment variance component in a one-way random effects model, the usual F test is appropriate when error variances are assumed to be equal. When this assumption is violated, the F test may not be appropriate.
Many alternative tests have been suggested in the literature. When applied to actual data, the different tests can yield drastically different p-values and opposing conclusions. This brings up the issue of which test should be chosen for practical use. To address this, the different tests are compared in terms of their Type I error probability and power, estimated by Monte Carlo simulation. It turns out that there are scenarios where many of the tests have Type I error probabilities far greater than the nominal level. Based on the numerical results, recommendations are made on the choice of the test for practical use.
For the one-way random model, a test is also derived for testing the more general hypothesis that the random effect variance component is below a known bound. Interval estimation is also addressed in this context. The results are applied to several examples.
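The kind of Monte Carlo comparison described in this abstract can be sketched for the classical F test alone. The example below is an illustration under stated assumptions, not the speaker's study: the group sizes, the standard deviations, the number of simulations and the helper names are hypothetical, and the 5% critical value is calibrated by simulation under equal variances instead of being read from an F table.

```python
import random
import statistics


def f_statistic(groups):
    """Classical one-way ANOVA F statistic (derived assuming equal error variances)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.fmean(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (between / (k - 1)) / (within / (n_total - k))


def simulate_f(sizes, sds, n_sims, rng):
    """F statistics for normal data generated with equal group means (null is true)."""
    return [
        f_statistic([[rng.gauss(0.0, sd) for _ in range(n)] for n, sd in zip(sizes, sds)])
        for _ in range(n_sims)
    ]


rng = random.Random(2002)
sizes = (5, 10, 20)   # hypothetical unbalanced design
n_sims = 4000

# Calibrate a nominal 5% critical value under equal variances by simulation.
null_stats = sorted(simulate_f(sizes, (1.0, 1.0, 1.0), n_sims, rng))
critical = null_stats[int(0.95 * n_sims)]


def rejection_rate(sds):
    """Estimated Type I error probability of the F test for the given group sds."""
    stats = simulate_f(sizes, sds, n_sims, rng)
    return sum(s > critical for s in stats) / n_sims


rate_equal = rejection_rate((1.0, 1.0, 1.0))
rate_unequal = rejection_rate((3.0, 1.0, 1.0))   # the smallest group is the noisiest
print(f"equal variances: {rate_equal:.3f}  unequal variances: {rate_unequal:.3f}")
```

In this setup the rejection rate under unequal variances comes out well above the nominal level, the kind of behavior the abstract reports for many of the tests it compares.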
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: An "Optimal" Data Swapping Procedure
- Speakers:
Krish Muralidhar
School of Management
Gatton College of Business & Economics
University of Kentucky, Lexington KY 40506
Rathindra Sarathy
Department of Management
College of Business Administration
Oklahoma State University, Stillwater OK 74078
- Date/Time: May 20, 2002, 10:00 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - the Henry Gannett and Herman Hollerith Rooms, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
Data swapping can be described in simple terms as a process by which the values of two records in the microdata are interchanged (or swapped). Reiss (1980) was one of the first proponents of data swapping. The objective of data swapping is to mask the original data while maintaining its characteristics. Compared to other methods of masking, data swapping provides two major advantages: (1) when analyzing a single masked attribute, data swapping preserves its statistical characteristics, while most other masking methods are subject at least to sampling error; (2) from a human perspective, it is likely to be more acceptable to users than other masking methods that involve use of noise, since data swapping uses only the original (true) values.
The two major objectives of masking procedures are accuracy and security. In broad terms, accuracy can be defined as the extent to which the masked values faithfully replicate the characteristics of the original values in the microdata set, while security can be defined as the extent to which a snooper can gain information about the confidential attributes and/or the identity of a particular record using the masked data. Ideally, an "optimal" masking procedure would replicate the information in the original data and would provide a snooper with no additional information. Most masking procedures have a theoretical basis for their implementation, enabling modifications that provide improvements in their performance. This is not the case with data swapping, although Moore (1996) has provided some theoretical results regarding the efficacy of rank-based proximity swap in achieving the two objectives of masking. This has resulted in limited advancements in swapping techniques.
In this study, we propose a new data swapping procedure for continuous numerical data that is capable of achieving both objectives of masking, leading to an "optimal" masking procedure. The new approach has a strong theoretical basis, and theoretically achieves both the accuracy and security objectives. We illustrate the application of the new procedure by using simulated microdata sets having a multivariate normal distribution (with and without non-confidential categorical data) and other distributions (with and without non-confidential categorical data). We also hope to extend the results of this study and investigate the suitability of this approach for confidential categorical data as well.
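For readers new to the idea, the basic swapping operation described at the start of this abstract can be sketched as follows. This is a naive random pairwise swap of a single attribute, not the "optimal" procedure proposed by the speakers; the function name `random_swap` and the income values are made up for illustration.

```python
import random


def random_swap(values, rng):
    """Return a masked copy of `values` in which records are paired at random
    and the two records in each pair exchange their values."""
    order = list(range(len(values)))
    rng.shuffle(order)
    masked = list(values)
    # Walk the shuffled record indices two at a time and swap each pair.
    for i, j in zip(order[0::2], order[1::2]):
        masked[i], masked[j] = masked[j], masked[i]
    return masked


rng = random.Random(7)
income = [31.5, 48.2, 27.9, 85.0, 52.3, 39.4, 61.7, 44.1]   # made-up microdata
masked_income = random_swap(income, rng)

# The masked attribute contains exactly the original values, so its mean,
# variance and quantiles are unchanged.
print(sorted(masked_income) == sorted(income))
```

Because only original values are used, every single-attribute statistic is preserved exactly. Unrestricted random swapping of this kind generally weakens the relationship between the swapped attribute and the other attributes in each record, which is one reason the accuracy objective is harder to meet than this sketch suggests.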
Topic: Analyzing patterns of killings and migration flow in Kosovo, March-June 1999
- Speaker: Patrick Ball, American Association for the Advancement of Science
- Discussant: Mary Gray, American University
- Chair: Fritz Scheuren, Urban Institute
- Date: Friday, May 24th, 12:30-2:00 p.m.
- Location: Bureau of Labor Statistics Conference Rooms 7 and 8
- Sponsor: WSS Data Collection Methods Section, WSS Methodology Section and AAPOR-DC
Abstract:
During the conflict between NATO and Yugoslavia, thousands of people were killed and hundreds of thousands more fled their homes. Not surprisingly, NATO and Yugoslavia advanced quite different explanations for the violence. Yugoslavia claimed that the deaths and migration were the result of NATO's airstrikes and local actions by the ethnic Albanian insurgents (the KLA). NATO claimed that the deaths and migration were the result of a coordinated campaign by Yugoslav authorities to "ethnically cleanse" Kosovo of Albanians.
This report used techniques from historical demography as well as multiple systems estimation to model patterns of killing and migration flow. When killings and migration are compared with patterns of KLA activity and NATO airstrikes, the hypotheses advanced by the Yugoslav government are rejected. Key coincidences observed in the data are consistent with the hypothesis that Yugoslav forces were responsible for the violence.
This analysis was presented in the trial of Slobodan Milosevic at the International Criminal Tribunal for Former Yugoslavia (ICTY) in The Hague on 13-14 March 2002.
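Multiple systems estimation uses the overlap between independently compiled lists to estimate how many cases appear on no list. Its simplest form is the two-list Lincoln-Petersen estimator, sketched below. The counts are made up, the function name is hypothetical, and the estimator assumes the two lists are independent; the analysis described in this abstract is not limited to this two-list case.

```python
def two_list_estimate(n_first, n_second, n_both):
    """Lincoln-Petersen estimate of the total number of cases, given the size
    of each list and the number of cases that appear on both lists."""
    if n_both <= 0:
        raise ValueError("the lists must overlap to estimate the total")
    return n_first * n_second / n_both


# Made-up counts: 400 cases on the first list, 300 on the second, 120 on both.
n_first, n_second, n_both = 400, 300, 120
estimated_total = two_list_estimate(n_first, n_second, n_both)
documented = n_first + n_second - n_both
print(f"documented: {documented}  estimated total: {estimated_total:.0f}")
```

The difference between the estimated total and the documented count is the estimated number of cases missed by both lists.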
Topic: Why Are Semiconductor Prices Falling So Fast? Industry Estimates and Implications for Productivity Measurement
- Speaker: Ana Aizcorbe, Federal Reserve Board
- Discussant: Marshall Reinsdorf, Bureau of Economic Analysis
- Chair: Linda Atkinson, Economic Research Service, USDA
- Date/Time: Thursday, June 13, 2002, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center Room 2, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB. To gain entrance to BLS, please see "Notice" at the top of this page.
- Sponsor: Economics Section
Abstract:
By any measure, price deflators for semiconductors fell at a staggering pace over much of the last decade. These rapid price declines are typically attributed to technological innovations that lower constant-quality manufacturing costs. But, given Intel's dominance in the microprocessor market, those price declines may also reflect changes in Intel's profit margins. Disaggregate data on Intel's operations are used to explore these issues. There are three basic findings. First, the industry data show that Intel's markups from its microprocessor segment shrank substantially from 1993-99. Second, about 3-1/2 percentage points of the average 24 percent price decline in a price index for Intel's chips can be attributed to declines in these profit margins over this period. And, finally, the data suggest that virtually all of the remaining price declines can be attributed to quality increases associated with product innovation.
Statistics For A New Century: Meeting The Needs Of A World Of Data
- Speaker: Richard L. Scheaffer, Professor Emeritus, University of Florida, and Past ASA President
- Date/Time: June 18, 2002
- Location: Maggiano's Little Italy, 5333 Wisconsin Ave., N.W., Washington, DC.
Abstract:
The world is awash in data. Many are aware of the importance and power of data in their professional and personal lives, but few are educated in ways that would allow them to more fully comprehend the vast array of uses (and misuses) of data or to effectively use the quantitative information that confronts them daily. Even fewer are aware of the fact that formal study of statistics can serve to strengthen their own academic preparation for a wide variety of careers. Some successes are being achieved, however, through recent efforts to infuse statistics into the school (K-12) curriculum and to enhance opportunities for undergraduates to learn more statistics. The goals of these efforts are to empower students through improved quantitative literacy and to provide strong foundations for careers that depend increasingly on data.
Modern statistics education has generated terrific interest among educators and students at all levels; it now must prove itself by making effective use of this opportunity to produce new generations of graduates that will not drown in their world of data.
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: Leonardo's Laptop: Human Needs and the New Computing Technologies
- Speaker: Ben Shneiderman, Professor of Computer Science, University of Maryland at College Park
- Date/Time: June 20, 2002, 10:30 - 11:30 a.m.
- Location: Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
The old computing was about what computers could do; the new computing is about what users can do. Attention is shifting from making computers intelligent to making users creative. Leonardo da Vinci could help as an inspirational muse for the new computing to push for improved quality through scientific study and more elegant design through visual thinking. We can follow Leonardo's example by integrating text and graphics, functionality and esthetics.
The new computing emphasizes empowerment and collaboration. We must reduce user frustration with annoying crashes, incomprehensible dialog boxes, and incompatible attachments. Then we can promote universal usability through interfaces that are more customizable for diverse users, more tailorable to a wide range of hardware, software, and networks, and designed to bridge the gap between what users know and what they need to know.
With these basics in place, the new computing principle is that human needs should shape technology. Four circles of human relationships and four human activities map out the human needs for mobility, ubiquity, creativity and community. Million-person communities will be accessible through desktop, palmtop and fingertip devices that support e-learning, e-business, e-healthcare, and e-government.
This talk will present an agenda of what is needed to bring about The New Computing (www.cs.umd.edu/hcil/newcomputing).
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: Bootstrap Approximation to Prediction MSE for State-Space Models with Estimated Parameters
- Speaker: Danny Pfeffermann, Professor of Statistics, Hebrew University and University of Southampton
Joint work with Dr. Richard Tiller, Bureau of Labor Statistics
- Date/Time: August 7, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
We propose a simple but general method for approximating the prediction Mean Square Error (PMSE) of the state vector predictors in a state-space model when the unknown model parameters are estimated from the observed series. As is well known, replacing the model parameters with their sample estimates in the theoretical MSE expressions that assume known parameter values results in underestimation of the true MSE. Methods proposed in the literature to deal with this problem are inadequate and may not even be operational when fitting complex models, or when some of the parameters are close to their boundary values. Application of the method to a model fitted to sample estimates of employment ratios in the U.S.A. that contains eighteen unknown parameters estimated by a three-step procedure yields accurate results. The method may be applied to a wide variety of problems, including many of the time series and mixed linear models used for Small Area Estimation problems. This will be illustrated using the Fay-Herriot model.
Topic: Confidentiality Audit on Suppressed Entries in Multi-Dimensional Contingency Tables
- Speaker: Lawrence H. Cox, National Center for Health Statistics
- Discussant: Paul B. Massell, Bureau of the Census
- Chair: Virginia de Wolf
- Date/Time: Tuesday, July 16, 12:30 to 1:45 p.m.
- Location: Bureau of Labor Statistics, Conference Center, Conference Room 3, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB. To gain entrance to BLS, please see "Notice" at the top of this announcement.
- Sponsor: WSS Methodology Section
Abstract:
Disclosure limitation in contingency tables amounts to thwarting the ability of the data intruder to infer or make narrow estimates of small cell values. The Census Bureau adopts the base value five for "small"; the Statistics of Income Program and Statistics New Zealand prefer base value three. In two-dimensional tables, for disclosure limitation the statistical office traditionally has chosen either to round the counts to the base value, to perturb (add noise to) the counts, or to suppress small counts together with additional cell values known as complementary suppressions. In multi-dimensional tables, a suggested approach is massive suppression, such as suppressing all internal entries, leaving only (some) marginals. Suppressed values must be subjected to a confidentiality audit to ensure that confidentiality protection has been achieved. This amounts to computing, for every suppressed small value x, the interval [min(x), max(x)] subject to all released and suppressed cell values and marginal totals. This is easily accomplished in two dimensions using standard, efficient methods and software from linear programming. The purpose of this talk is to explore the difficulties of performing a confidentiality audit in multiple dimensions. Preliminaries on mathematical properties of multi-dimensional contingency tables will be introduced, followed by an examination, based on examples, of the utility of linear programming for confidentiality audit in multi-dimensional contingency tables. The talk consists of examples illustrating good, bad and ugly behaviors of contingency tables in two, three and four dimensions.
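The audit interval can be worked out by hand in the smallest two-dimensional case, where four suppressed cells form a rectangle. Once one cell is fixed, the other three follow from the row and column totals, so the two linear programs reduce to non-negativity constraints on a single unknown. The sketch below covers only this special case; the function name and the cell values are hypothetical, and a general audit would need a linear programming solver.

```python
def audit_rectangle(row_residuals, col_residuals):
    """Feasible interval for each of four suppressed cells forming a 2x2
    rectangle. The residuals are the published row and column totals minus
    the published cells, i.e. the sums of the suppressed cells in each line."""
    (r1, r2), (c1, c2) = row_residuals, col_residuals
    if r1 + r2 != c1 + c2:
        raise ValueError("row and column residuals must have the same total")
    # Write x11 = t; then x12 = r1 - t, x21 = c1 - t, x22 = r2 - c1 + t.
    # Requiring all four cells to be non-negative bounds t from both sides.
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return {
        "x11": (lo, hi),
        "x12": (r1 - hi, r1 - lo),
        "x21": (c1 - hi, c1 - lo),
        "x22": (r2 - c1 + lo, r2 - c1 + hi),
    }


# Hypothetical suppressed values x11=2, x12=7, x21=9, x22=4 give these residuals.
intervals = audit_rectangle(row_residuals=(9, 13), col_residuals=(11, 11))
for cell, (low, high) in intervals.items():
    print(cell, low, high)
```

Here the small value x11 = 2 is bounded only by the interval from 0 to 9, while an intruder learns that x21 and x22 are each at least 2. Whether such intervals are wide enough to count as protection is the question the audit answers.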
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: Parameter Estimation in Logistic Regression -- Not an Easy Matter
- Speaker: Thomas P. Ryan, Consultant
- Date/Time: August 19, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - Room 3225, FOB 4. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
Logistic regression is a popular statistical tool that is used primarily in health and medical applications, but is also used in many other applications, including modeling data from complex sample surveys. Because parameter estimation is straightforward in linear regression, it would be easy to assume the same thing for logistic regression. Unfortunately, parameter estimation in logistic regression is problematic. This is known for maximum likelihood, the usual estimation method, in the case of rare events, but is apparently not known in the case of near separation of the data. The latter can cause serious problems, as will be illustrated. One alternative is to use exact logistic regression, which is generally preferable, but which also has some shortcomings. What is a user to do? Some insight will be given, and needed research will also be discussed.
(This talk will be based primarily on the paper "A Preliminary Investigation of Maximum Likelihood Logistic Regression versus Exact Logistic Regression" by E. N. King and T. P. Ryan, The American Statistician, August, 2002, 163-170.)
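The difficulty with separation can be seen in a toy case. Under complete separation, the limiting case of the near separation mentioned in the abstract, the log-likelihood keeps increasing as the coefficient grows, so no finite maximum likelihood estimate exists. The sketch below uses a no-intercept model and four made-up observations; it is an illustration only and is not drawn from the cited paper.

```python
import math


def log_likelihood(beta, xs, ys):
    """Log-likelihood of the no-intercept logistic model
    P(y = 1 | x) = 1 / (1 + exp(-beta * x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = beta * x
        # Numerically stable evaluation of log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += y * eta - log1pexp
    return total


# Made-up data with complete separation: y = 1 exactly when x > 0.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

betas = [1.0, 5.0, 10.0, 20.0]
values = [log_likelihood(b, xs, ys) for b in betas]
for b, v in zip(betas, values):
    print(f"beta = {b:5.1f}  log-likelihood = {v:.3e}")
```

The log-likelihood climbs toward zero without reaching it, so an iterative fitting routine would drive the coefficient estimate upward until it hits a stopping rule.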
U.S. Bureau of Census
Statistical Research Division Seminar
Topic: Combined Survey Sampling Inference: Compromise or Consummation?
- Speaker: Kenneth R.W. Brewer, Australian National University
- Date: Tuesday, August 20, 2002
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
Abstract:
Part 1: The Why and the How
(10:30 - 12:00 Noon - Morris Hansen Auditorium/FOB3)
Design (or randomization) inference is particularly appropriate for large samples and populations, and model (or prediction) inference for small ones. It is useful to combine them, if for no other reason than because large populations are usually made up of small domains, but there are certain spinoffs as well. These include (for the design approach) circumventing the need for asymptotics when justifying the use of the Classical Ratio Estimator, and (for the prediction approach) being easily able to avoid unacceptably small case weights. The combination of the two is achieved by equating a design-based (GREG) estimator and a prediction-based (PRED) estimator, and then imposing the resulting condition on the estimator of the relevant regression coefficient. The imposition of that condition involves both approaches in something of a compromise, but it will be shown that this is seldom of any material consequence for either of them.
Part 2: Some Simple Variance Formulas and Estimators
(2:00 - 3:30 p.m. - the Herman Hollerith Room, FOB 3)
The sampling literature has long been heavily sprinkled with theoretical and empirical comparisons of alternative variance estimators, some of which involve rather complex formulas and/or logic that is difficult to follow. Some even require ad hoc adjustments as well. The combination of the two approaches seems at first to make matters even worse, because there are then three types of variance to consider: the design variance, the prediction variance and the "anticipated variance", the last involving a double expectation (over all possible samples and over all possible realizations of a prediction model). In the event, however, the three are so intimately related that the transition from one to another is simple and obvious. Some surprising spinoffs include a simplification of the prediction variance (and its estimator) that can only be made when the estimator (of mean or total) is also supported by design inference. These spinoffs so closely resemble the "emergent phenomena" of modern complexity theory that the bringing together of the two approaches can arguably be viewed more appropriately as a fruitful consummation than as a mere compromise.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
The George Washington University
Department of Statistics
Title: Partial Volume Correction for Neuroimaging using Tensor Based Statistical Algorithms
- Speaker: Dr. John Aston, Bureau of the Census
- Date/Time: 11:00 a.m.-12:00 noon, September 20, 2002
- Location: Funger Hall 321, 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
Abstract:
The partial volume effect in Positron Emission Tomography (PET) is a problem for quantitative radiotracer studies. These studies can be used to study many well-known diseases, such as epilepsy, but partial volume effects can cause misinterpretation of the data. The partial volume effect results from the limited spatial resolution of the imaging device (a few mm) and results in a blurring of the data. Two factors are involved for pre-defined regions: spillover of radioactivity into neighboring regions, and the underlying tissue inhomogeneity (mixed tissue types) of the particular region. Linear modelling methods are currently used to correct for this effect on a regional level, using tissue classification from higher resolution imaging modalities, e.g. Magnetic Resonance Imaging, and anatomically defined regions which are assumed to contain homogeneous tracer concentrations. We extend these methods to incorporate the underlying noise structure of the PET tomograph measurements, and develop fast tensor based algorithms to facilitate the computation of true tracer concentration estimates and their associated errors. This allows calculation of linear models in the case of massive data sets with inherent spatial correlation structure. We also investigate the possibility of using the developed noise models to infer whether the defined regions were homogeneous, using Krylov subspace based approximate estimates for the regional errors associated with the fits.
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Topic: Robust Seasonal Adjustment using Heavy-Tailed Distributions
- Speakers:
John Aston, Statistical Research Division, Census Bureau
Siem Jan Koopman, Free University Amsterdam, Netherlands
- Discussant: Stuart Scott, Bureau of Labor Statistics
- Chair: David Findley, U.S. Census Bureau
- Date/Time: September 26, 2002, Thursday; 12:30 PM - 2:00 PM
- Location: Bureau of Labor Statistics, Conference Center Rooms 7 and 8, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB. To gain entrance to BLS, please see "Notice about Seminars at the Bureau of Labor Statistics" at the beginning of this web page.
- Sponsor: Economics Section
Abstract:
Seasonal adjustment is routinely used to eliminate seasonal effects from monthly economic time series. However, these seasonal adjustments are influenced by many factors in the data. Outliers, both additive and level shift, can result in highly variable seasonal factors if outlier detection is used and outliers drop in and out of the calculation on a month-by-month basis. This is especially true when outliers appear towards the end of the series, as these have a greater effect on up-to-date estimates.
A new method of accounting for outliers is proposed involving the use of heavy-tailed distributions, namely t-distributions. Recent developments in state space modelling techniques (Durbin and Koopman, 2000) have facilitated the incorporation of heavy-tailed distributions into the state equations. This allows error distributions to be extended, and through importance sampling, estimates of parameters from these distributions to be found.
Assessment of these new models and techniques will be presented using both simulated and real data sets. It will be shown that use of the new models can allow for more robust seasonal adjustment than the traditional outlier detection methods.
The George Washington University
Department of Statistics
Title: Bayesian Group Testing
- Speaker: Dr. Curtis Tatsuoka, Department of Statistics, The George Washington University
- Date/Time: 11:00 a.m.-12:00 noon, October 4, 2002
- Location: Funger Hall 321, 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
Abstract:
A Bayesian formulation of group testing with testing error will be considered, where group testing is viewed as a sequential classification problem on lattices. Various response distribution formulations will be presented, including the case when testing error is a function of pool size. Results include describing experiment selection rules that attain optimal rates of convergence. Non-standard group testing problems also will be discussed.
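The basic posterior update behind such a formulation can be sketched as follows. This is an illustration under stated assumptions, not the speaker's method: the states are all subsets of positive items (the subset lattice), the prior treats items as independent with a common prevalence, and a pooled test errs with a fixed sensitivity and specificity. All names and numbers are hypothetical.

```python
from itertools import combinations


def prior_over_states(n_items, prevalence):
    """Independent prior over every subset of positive items."""
    states = {}
    for k in range(n_items + 1):
        for positives in combinations(range(n_items), k):
            states[frozenset(positives)] = (
                prevalence ** k * (1 - prevalence) ** (n_items - k)
            )
    return states


def update(posterior, pool, positive, sensitivity, specificity):
    """Bayes update after one pooled test that is subject to testing error."""
    unnormalized = {}
    for state, prob in posterior.items():
        pool_has_positive = bool(state & pool)
        # Probability that the test reads positive under this state.
        p_pos = sensitivity if pool_has_positive else 1 - specificity
        likelihood = p_pos if positive else 1 - p_pos
        unnormalized[state] = prob * likelihood
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


def marginal(posterior, item):
    """Posterior probability that a single item is positive."""
    return sum(p for state, p in posterior.items() if item in state)


# Hypothetical sequence: pool all three items, then test item 0 alone.
prior = prior_over_states(3, prevalence=0.1)
after_negative = update(prior, frozenset({0, 1, 2}), positive=False,
                        sensitivity=0.95, specificity=0.98)
after_positive = update(after_negative, frozenset({0}), positive=True,
                        sensitivity=0.95, specificity=0.98)
print(marginal(prior, 0), marginal(after_negative, 0), marginal(after_positive, 0))
```

Letting the sensitivity depend on the size of the pool, as the abstract mentions, would only require replacing the constant with a function of `len(pool)`.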
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Title: Synthetic Tabular Data To Limit Statistical Disclosure Of Sensitive Information
- Speaker: Ramesh A. Dandekar, Energy Information Administration
- Co-Author: Lawrence H. Cox, National Center for Health Statistics
- Chair: Phillip Steel, U.S. Census Bureau
- Discussant: Brian Greenberg, Social Security Administration
- Date/Time: Tuesday, October 15, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 1, 2 Massachusetts Ave., N.E., Washington DC. Please use the First Street entrance to the PSB. To gain entrance to BLS, please see Notice at the beginning of this announcement.
- Sponsor: WSS Methodology Section
Abstract:
In the scientific community a synthetic product is developed when the real product is either in short supply or exhibits some undesirable properties. The objective in the latter case is to remove undesirable properties from the synthetic product. Several examples of synthetic products include: rubber, wood, sugar, fiber, fuel, and hormones.
We apply this notion to the release of statistical data products in tabular form. Here, the undesirable property is that of revealing confidential information on entities covered by the data. We explore the possibility of generating synthetic tabular data that exhibit overall statistical characteristics similar to those of the real tabular data, yet offer protection from statistical disclosure. The method applies linear programming to synthesize tabular cells by controlled adjustments to original tabular cells. The controlled adjustments are made in such a way that the overall distortion of the original cell values is minimal, based on one of several standard criteria. The resultant synthetic table conveys approximately the same statistical information to the end users as the original table, but at reduced risk of disclosure.
Title: Afghan Refugee Camp Surveys: Pakistan, 2002
- Speakers: James Bell, Ruth Citrin, and David Nolle, U.S. Department of State, and Fritz Scheuren, NORC, University of Chicago
- Chair: Mary Batcher, Ernst & Young LLP
- Date/Time: Thursday, October 17, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 1, 2 Massachusetts Ave., N.E., Washington DC. Please use the First Street entrance to the PSB. To gain entrance to BLS, please see Notice at the beginning of this announcement.
- Sponsor: WSS Methodology Section and AAPOR-DC
Abstract:
The events of the past year and more have brought about many changes to our view of the world and our engagement in it, both as professionals and as citizens. This survey was one response to those changes. Its main goal was to measure the attitudes, on a variety of social, economic, and political issues, of the Afghan refugees who are now returning to their homeland from Pakistan. Particularly important was learning about their perceptions of current circumstances as well as their future expectations. Methodologically, in a setting of great danger, obtaining a good sample of adult males in the refugee camps posed many challenges, and most of the discussion will focus on these.
The George Washington University
Department of Statistics
Title: The Value of Standardization - Software and Current Best Methods
- Speaker: Dr. David Morganstein, WESTAT Corporation
- Date/Time: 11:00 a.m.-12:00 noon, October 18, 2002
- Location: Funger Hall 323, 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
Abstract:
In a private statistical organization, the amount of effort needed to plan and conduct a survey is a critical indicator of success in competing for government contracts. Westat, an employee-owned survey organization, must be concerned about the staff time needed to do its work. It must also be concerned about retaining high-quality staff, so job satisfaction is also a critical measure of success. The statistical group of 55 statisticians is involved in dozens of surveys every year. Often a staff member is working on three or more surveys simultaneously. To reduce the effort needed to support the variety of surveys and to increase interest in the work, our statistical group has standardized in two areas: software and current best methods. In this talk, we'll describe why we chose to do this, how we do it, and the benefits we have observed.
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Title: The 2002 Roger Herriot Award For Innovation in Federal Statistics
- Recipient: Daniel H. Weinberg, U.S. Census Bureau
- Speakers:
Katherine K. Wallman, Statistical Policy Office, Office of Management and Budget
William P. Butz, Rand Corporation
Paula J. Schneider, U.S. Census Bureau (retired)
Daniel H. Weinberg, U.S. Census Bureau
- Chair: Edward J. Spar, Council of Professional Associations on Federal Statistics
- Date/Time: Tuesday, November 12, 2002, 12:30 - 2:00 p.m. Reception to follow.
- Location: Bureau of Labor Statistics. Conference Rooms 7 and 8. To gain entrance to BLS, please see Notice at the beginning of this announcement.
- Video Conference to selected sites.
- Co-sponsors of the Herriot Award: Washington Statistical Society, American Statistical Association's Government Statistics Section and Social Statistics Section
Abstract:
On August 12, 2002, Dan Weinberg was awarded the Roger Herriot Award at the annual meeting of the American Statistical Association in New York. Dan is the Chief of the Housing and Household Economic Statistics Division of the U.S. Census Bureau. Dan has been immersed in all three sectors of f