#### 2009 Quality & Productivity Research Conference

IBM T. J. Watson Research Ctr., Yorktown Heights, NY

June 3-5, 2009

 Invited Paper Sessions (session dates and times can be found in the Program) Opening Plenary Session Session Organizer: Emmanuel Yashchin, IBM Research Session Chair: Emmanuel Yashchin, IBM Research 1. Conference kick-off: Brenda Dietrich, IBM Fellow and VP, Business Analytics and Mathematical Science 2. "Process Transformation and Quality," Kathleen Smith, IBM VP Process Transformation and Quality Abstract: Breakthroughs in Quality will require greater focus on integrated, end-to-end process-driven transformation. Leaders around the world recognize the imperative to transform their business to drive continuous process improvement, delivering productivity, process excellence, and quality standards. This presentation will look at the need for transformation of two development processes - Product & Services Offering Development and Field Solution Design Development. It will look at root causes and the predictability of Quality based on statistical measures of those processes. The actual process improvements and the challenges of Quality standards deployment will be covered. Along this transformation journey, key elements for success will be outlined, focusing on leadership commitment, skills development, key performance measures, and company culture. 3. "Data Analytics in IBM: Opportunities and Challenges", Chidanand Apte, Manager of the Data Analytics Department, IBM Research. 1. Designing High Reliability Products Session Organizer: Narendra Soman, General Electric Co. Session Chair: Reinaldo Gonzalez, General Electric Co. 1. "Applications of Reliability Demonstration Tests," Winson Taam, Boeing Co. Paper Abstract: Reliability demonstration tests (RDT) have been discussed in the literature for several decades, primarily in the automotive industry and for metal structures. The basic idea of RDT is to provide a strategy to verify a pre-specified reliability with confidence by adjusting the test conditions on a limited number of samples. 
Two applications will be discussed in this presentation: one involving one test parameter, and the other involving two test parameters. In both applications, the framework of RDT and its inferential assumptions are closely connected to statistical hypothesis testing and accelerated life testing in the reliability literature. 2. "Reliability of advanced CMOS devices and circuits," James Stathis, IBM Research Paper Abstract: As MOSFET devices have scaled to nanometer dimensions, dielectric reliability has remained as a central focus of fundamental research and reliability engineering studies. The recent introduction and rapid maturation of complex hafnium-based gate insulators with metal gates has further intensified the study of the two dominant gate oxide failure mechanisms -- dielectric breakdown and the so-called bias/temperature instability. Ultimately, the importance of these mechanisms depends on the degree to which they impact integrated circuit functionality and performance over the lifetime of a product. In this talk we will discuss new physical understanding of dielectric wearout mechanisms and statistics, and describe how these concepts may be used to estimate circuit failure rates under use conditions. 3. "Statistical Reliability Modeling of Field Failures Works!" David C. Trindade, Sun Microsystems Paper Abstract: Because of the increased complexity of modern server systems, identifying the cause of field reliability problems can be a very challenging task. In this paper we show how the application of statistical methods for the analysis and modeling of repairable systems provided valuable insight into the source of field failures, leading to eventual cause identification and remediation. 2. Statistics for Business Analytics Session Organizer: Ta-Hsin Li, IBM Research. Session Chair: Ta-Hsin Li, IBM Research. 1. 
"Statistical Applications in Business Analytics," Yasuo Amemiya, IBM Research Abstract: Business analytics systems address management, finance, marketing, sales, and other enterprise problems that involve complex business processes, periodically updated multivariate time-dependent data, and timely reporting of decision-support information. Development of such a system requires sophisticated data integration, clever implementation of statistical analytics, and effective reporting mechanism. This talk presents a number of business analytics applications where new statistical methods/approaches were developed as solutions to actual business problems. 2. "Forecasting Time Series of Inhomogeneous Poisson Processes With Application to Call Center Workforce Management," Haipeng Shen, U. North Carolina - Chapel Hill Paper Abstract: We consider forecasting the latent rate profiles of a time series of inhomogeneous Poisson processes. The work is motivated by operations management of queueing systems, in particular, telephone call centers, where accurate forecasting of call arrival rates is a crucial primitive for efficient staffing of such centers. Our forecasting approach utilizes dimension reduction through a factor analysis of Poisson variables, followed by time series modeling of factor score series. Time series forecasts of factor scores are combined with factor loadings to yield forecasts of future Poisson rate profiles. Penalized Poisson regressions on factor loadings guided by time series forecasts of factor scores are used to generate dynamic within-process rate updating. Methods are also developed to obtain distributional forecasts. Our methods are illustrated using simulation and real data. The empirical results demonstrate how forecasting and dynamic updating of call arrival rates can affect the accuracy of call center staffing. 3. 
"On the Existence of E-loyalty Networks in Ebay Auctions and Their Structure," Inbal Yahav and Wolfgang Jank, University of Maryland Paper Abstract: We study seller-bidder networks in online auctions. We investigate whether networks exist and whether and how they affect the outcome of an auction. In particular, we study loyalty networks. We define loyalty of a bidder as the distribution of sellers and their corresponding auctions in which she participates, with more loyal bidders having a smaller distribution. We define a loyalty network as the corresponding seller-bidder network. We further employ ideas from functional data analysis to derive, from this loyalty network, several distinct measures of a bidder’s loyalty to a particular seller. We use these measures to quantify the effect of loyalty on product end price. 3. Analytics and Risk Management Session Organizer: Martha Gardner, General Electric Co. Session Chair: Martha Gardner, General Electric Co. 1. "Post-Financial Meltdown; What Does the Financial Services Industry Need From Us?" Roger Hoerl, GE Global Research Abstract: In 2008 the global economy was rocked by a crisis that began on Wall Street, but quickly spread to Main Street USA, and then to side streets around the world. Statisticians working in the financial services sector were not immune, with many in fear of losing, or having already lost, their jobs. Given this dramatic course of events, how should statisticians respond? What, if anything, can we do to help our struggling financial organizations survive this recession, in order to prosper in the future? This talk will discuss some approaches that can help financial service organizations deal with the aftereffects of the financial meltdown. 
Based on a description of what appears to be this industry's current needs, we emphasize three approaches in particular: a greater emphasis on statistical engineering relative to statistical science, “embedding” statistical methods and principles into key financial processes, and the reinvigoration of Lean Six Sigma to drive immediate, tangible business results. 2. "Model Risks in Financial Services Risk Modeling," Tim Keyes, Sr. VP, Risk Management, GE Commercial Finance Abstract: Recent events in the financial services industry showcase misconceptions about risk models, as well as their assumptions, usefulness, and limitations. Statistical models for credit, market and operational risk management are becoming increasingly commonplace as risk tools in consumer and commercial lending, and are achieving a perhaps unwarranted level of influence without a commensurate level of oversight and suspicion. Never before has a reliance on models - for credit approval, risk ratings, capitalization, and the like - been so fundamental to the operating health of an industry. Our discussion will review the state of the art for use of statistical models in the financial services industry, explore the level of risk contribution being introduced to the industry by statistical models and modelers, and revisit some remedial measures that could be undertaken to address this issue. 3. "Elicitation of Causal Network Diagrams for Risk Assessment," Bonnie Ray, IBM Research Abstract: A systematic approach to risk management is essential for a well-managed organization, requiring an accurate representation and assessment of risks in order to be effective. However, given the complex nature of many enterprises, articulating the correct relationships between multiple risk factors and quantifying the corresponding risk parameters (e.g., potential costs, probability of event occurrence) is one of the most problematic steps in the risk management process. 
Lacking historical data from which to learn this information, experts are often called upon to provide it manually, e.g. through in-person collaborations involving group meetings and/or a series of in-person interviews, etc. In this talk, we present work on the development of computer-based methods for effectively eliciting a causal network model for risk assessment through a distributed, asynchronous and collaborative, web-based application. We discuss several issues, including ease-of-use, design of scalable questionnaires, aggregation of responses, resolution of conflicting responses and accuracy of the resulting diagram structure, and report results from a small experiment to determine the impact of different question ordering on the value of the input received. The work is presented in the context of a business application at IBM where the elicitation system has been used to collect risk information pertaining to the sales process. 4. Special Honoree Invited Session Session Organizer: William Q. Meeker, Iowa State University Session Chair: Paul Tobias (Sematech, Ret.) 1. "Proactive Product Servicing," Necip Doganaksoy, Gerald Hahn, General Electric Co. and William Q. Meeker, Iowa State University Paper Abstract: A prime goal of proactive product servicing is to avoid unscheduled shutdowns. It is usually a great deal less disruptive and costly to perform repairs during scheduled maintenance on, for example, automobiles, aircraft engines, locomotives, and medical scanners than it is to cope with unexpected failures in the field. The emergence of long term service agreements (LTSAs)-through which manufacturers sell customers not just products, but a guaranteed ongoing level of service-has, moreover, added momentum to proactive servicing, as well as to building high reliability into the design in the first place. 
Even when shutdown cannot be prevented, proactive servicing is still useful to ensure speedy, inexpensive repair, reducing the deleterious impact of such shutdowns. We describe three approaches for proactive product servicing: optimum product maintenance scheduling, proactive parts replacement and automated monitoring for impending failures. This paper elaborates on a discussion that appeared in the book "The Role of Statistics in Business and Industry" and a Statistics Roundtable article that appeared in the November 2008 issue of Quality Progress. 2. "Accelerated Destructive Degradation Test Planning," Luis A. Escobar, Louisiana State University, Ying Shi and William Q. Meeker, Iowa State University Paper Abstract: Accelerated Destructive Degradation Tests (ADDTs) are used to obtain reliability information quickly. For example, an ADDT can be used to estimate the time at which a given percentage of a product population will have a strength less than a specified critical degradation level. An ADDT plan specifies a set of factor level combinations of accelerating variables (e.g., temperature) and evaluation time and the test units' allocations to each of these combinations. This talk describes methods to find good ADDT plans for an important class of linear degradation models. First, different optimum plans are derived in the sense that they all provide the same minimum large sample approximate variance for the maximum likelihood (ML) estimate of a time quantile. The General Equivalence Theorem (GET) is used to verify the optimality of an optimum plan. Because an optimum plan is not robust to the model specification and the planning information used in deriving the plan, a more robust and useful compromise plan is proposed. Sensitivity analyses show the effects on the precision of the quantile estimate of changes in sample size, time duration of the experiment, and levels of the accelerating variable. 
Monte Carlo simulations are used to evaluate the statistical characteristics of the ADDT plans. The methods are illustrated with an application for an adhesive bond. 3. "A General Model and Data Analyses for Defect Initiation, Growth, and Failure," Wayne Nelson, Wayne Nelson Statistical Consulting Co. Paper Abstract: This paper describes a simple and versatile new model for defect initiation and growth leading to specimen failure. This model readily yields the distribution of time to failure (when a specified defect size is reached) and the distribution of time to defect initiation. It is shown how the model can be readily fitted to data where each specimen's defect size is observed only once, using commercial software for fitting regression models to censored data. The model and fitting methods are illustrated with applications to dendrite growth on circuit boards and the growth of blisters in automotive paint. 5. Aspects of Bias, Prediction Variance and Mean Square Error Session Organizer: Christine Anderson-Cook, Los Alamos National Laboratory Session Chair: Nalini Ravishanker, University of Connecticut 1. "Construction and Evaluation of Response Surface Designs Incorporating Bias from Model Misspecification," Connie Borror (Arizona State University West), Christine M. Anderson-Cook (Los Alamos National Laboratory) and Bradley Jones, JMP SAS Institute Paper Abstract: Construction and evaluation of response surface designs often involve metrics such as optimality criteria or minimizing prediction variance. These metrics are dependent upon the assumed model being correct. In this presentation, design evaluation and comparison will emphasize considering potential bias that may be present due to an underspecified model. Graphical and numerical summaries of potential bias, prediction variance and expected mean square error are used to assess response surface designs over several scenarios of cuboidal and spherical regions. 2. 
"Mean Squared Error in Model Selection," Adam Pintar (Iowa State University), Christine M. Anderson-Cook (Los Alamos National Laboratory) and Huaiqing Wu, Iowa State University Paper Abstract: In regression problems, several methods, such as AIC and BIC in maximum likelihood settings and Stochastic Search Variable Selection (SSVS) in Bayesian settings, are available for choosing a suitable subset of input variables to include in the model. A common thread for all of these methods is the use of observed data as the basis to choose a subset of regressors, even though the goal of the study is to predict the expected response at an unobserved covariate point. Another similarity between these methods is the use of a single number summary to characterize a subset of regressors. We present an approach to variable selection where the goal is to predict expected response well over a user defined region of the covariate space. Moreover, models are compared by graphically examining their distribution of mean squared error (MSE) of prediction over the entire region of interest. The use of MSE allows us to balance model fit with model complexity, as AIC and BIC do. The method is illustrated via an example, which is constructed to demonstrate the steps in the algorithm, and the importance of considering where the model will be used when selecting a model. A second example considers extrapolation. 3. Discussion: Geoff Vining, Virginia Tech Paper 6. Semi-supervised Learning in Quality Control Session Organizer: George Michailidis, University of Michigan Session Chair: George Michailidis, University of Michigan 1. "Semi-supervised Regression with the Joint Trained Lasso," Mark Culp, West Virginia University Abstract: The lasso (supervised lasso henceforth) is a popular and computationally efficient approach for performing the simultaneous problem of selecting variables and shrinking the coefficient vector in the linear regression setting. 
Semi-supervised regression uses data with missing response values (unlabeled) along with labeled data to construct predictors. In this talk, we propose the joint trained lasso, which elegantly incorporates the benefits of semi-supervised regression with the supervised lasso. From an optimization point of view, we illustrate that the joint trained lasso can be equivalently fit under an optimization framework similar to that of the semi-supervised SVM, a popular semi-supervised classification approach. Finally, we show that the proposed approach is advantageous in the n < p case, specifically when the size of the labeled response data is smaller than p but the size of the unlabeled response data is larger than p. 2. "High Dimensional Nonlinear Semi-supervised Learning," Tong Zhang, Rutgers University Abstract: We present a new method for learning nonlinear functions in high dimension using semi-supervised learning. Our method consists of two stages: in the first stage we learn a compact data representation using unlabeled data; in the second stage we use the representation from the first stage to learn the target function. It is shown that this method can consistently estimate high dimensional nonlinear functions, and is able to (partially) avoid the curse of dimensionality problem. 3. "Data-dependent Kernels for Semi-supervised Learning," Vikas Sindhwani, IBM Research Abstract: The classical framework of Regularization in Reproducing Kernel Hilbert Spaces forms the basis of many state-of-the-art supervised algorithms such as SVMs, Regularized Least Squares and Gaussian Processes. Given the high cost of acquiring labeled data in many domains, significant attention is now being devoted to learning from unlabeled examples by exploiting the presence of multiple-views, manifold or cluster structures. In this talk, we will discuss semi-supervised extensions of classical regularization techniques. 
We present specific data-dependent kernels that reduce multi-view learning and manifold regularization to standard supervised learning, and discuss how such reductions expand the algorithmic and theoretical scope of these frameworks. 7. Risk Analysis in Information Technology Industry Session Organizer: Bonnie Ray, IBM Research. Session Chair: Bonnie Ray, IBM Research. 1. "Human reliability analysis – challenges in modeling operational risk," Tim Bedford, University of Strathclyde Paper Abstract: Technological risk analysis has long recognized the role of human failure in the operation of technical systems, and has developed a number of tools with which to study and quantify the likelihood of such failures. Human reliability analysis (HRA) methods are usually highly task oriented: A variety of different taxonomies of failure or error types are used to classify the possible operator faults for each task; these classification schemes are then used in conjunction with other data, describing performance shaping factors, to assess a probability of human error for each specific task. Many HRA models have explicitly recognized a high degree of uncertainty in human failure probabilities, but have not recognized the dynamic nature of such probabilities through time as staff adapt to the organizational context. In this presentation we discuss how management initiatives may negatively influence human failure probability and look at some of the challenges involved in introducing human reliability analysis into the assessment of operational risk in service industries. 2. "Adversarial Risk Analysis," David Banks, Duke University Paper Abstract: Classical game theory focuses upon situations in which the payoffs are known and there is no subjective information about the strategy of one's opponents. 
Classical risk analysis focuses upon situations in which the opponent is inanimate, and does not attempt to exploit one's vulnerabilities through "mirroring" of the calculations for defensive investment (i.e., the opponent is nature, not a competitor). This talk describes a Bayesian version of game theory in which one can use both prior beliefs and mirroring to do game theoretic risk analysis. The ideas are illustrated through applications in gambling and auctions. 3. "Modeling and Measuring Operational Process Risk," Eric Cope, IBM Zurich Research Lab Abstract: A large class of operational risks can be thought of as threats to business process objectives. We discuss how formal metamodels can incorporate risk-relevant information directly into business process models, and how those models can be transformed into quantitative models for estimating risk to process objectives. Various quantitative methods, including Bayesian networks and discrete-event simulation, will be discussed in the context of HR processes and IT infrastructure risk. 8. Sequential Testing Session Organizer: Joseph Glaz, University of Connecticut Session Chair: Joseph Glaz, University of Connecticut 1. "Repeated significance tests with random stopping," Vladimir Pozdnyakov, University of Connecticut Abstract: Usually a sequential testing procedure is stopped when a process associated with the test crosses a certain boundary. But sometimes it is practical to introduce an additional stopping rule that could force us to make a decision before the crossing. Two examples are presented. The first example is a sequential RST with random target sample size. In this example the test is designed in such a way that the target sample size adapts itself to an unknown non-linearly increasing total variation of a process linked to the test. In the second example stopping is triggered by technological restrictions that are imposed on the monitoring. 
When the initial sample size is large, both problems can be addressed with the help of limit theorems. What to do when the sample size is small is an open question. 2. "An Introduction to a random sequential probability ratio test (RSPRT) with some data analysis," Nitis Mukhopadhyay, University of Connecticut Abstract: Wald's (1947, Sequential Analysis, New York: Wiley) sequential probability ratio test (SPRT) remains relevant in addressing a wide range of practical problems. Clinical trials owe a great deal of debt to this methodology. There has been a recent surge of applications in many areas including sonar detection, tracking of signals, detection of signal changes, computer simulations, agriculture, pest management, educational testing, economics, and finance. Obviously there are circumstances where sampling one observation at a time may not be practical. In the context of continuous monitoring of, for example, inventory, queues or quality assurance, the data may appear sequentially in groups where the group sizes may be ideally treated as random variables themselves. For example, one may sequentially record the number of stopped cars ($M_i$) and the number of cars ($\sum_{j=1}^{M_i} X_{ij}$) without "working brake lights" when a traffic signal changes from green to red, $i = 1, 2, \ldots$. This can be easily accomplished since every "working brake light" must glow bright red when a driver applies the brakes. One notes that (i) it may be reasonable to model the $M_i$'s as random, but (ii) it would appear impossible to record data sequentially one-by-one on the status of brake lights (that is, $X_{ij} = 0$ or $1$) individually for car #1, and then for car #2, and so on! 
In order to address these situations, we start with an analog of the concept of a "best fixed-sample-size" test based on data $\{M_i, X_{i1}, \ldots, X_{iM_i}, i = 1, \ldots, k\}$. Then, a random sequential probability ratio test (RSPRT) is developed for deciding between a simple null and a simple alternative hypothesis with preassigned Type I and II errors $\alpha, \beta$. The RSPRT and the "best fixed-sample-size" test with $k \equiv k_{\min}$ associated with the same errors $\alpha, \beta$ are compared. An illustration of RSPRT will include a binomial distribution with substantive computer simulations. A real application will be highlighted. This is joint work with Professor Basil M. de Silva from the RMIT University, Melbourne, Victoria, Australia. 3. "Scan statistics with applications to quality control and reliability theory," Joseph Glaz, University of Connecticut Abstract: In this talk we review one and two dimensional scan statistics that have been used in quality control and reliability theory. Both theoretical and computational issues related to implementation of these scan statistics will be discussed. Numerical results will be presented to evaluate the performance of these scan statistics. Recent developments as well as open problems in this area will be presented. 9. Lifetime Data Analysis Session Organizer: Scott Kowalski, Minitab. Session Chair: Scott Kowalski, Minitab. 1. "Issues in Designing and Analyzing Experiments for Life Time Data," Geoff Vining, Virginia Tech Paper Abstract: An important characteristic of many products is the length of time that they perform their intended function. Often, people call these times life data. This paper examines the issues in planning experiments for life data. Of special concern is the proper inclusion of the true experimental error into the model and its analysis. 2. "Multi-faceted Approach to Demonstrate Product Reliability," James Breneman, Pratt & Whitney Co. 
Abstract: Reliability demonstration has become a necessity for products today, commercial and military alike. The same tools and general approach should be followed to assure the reliability of the final production product: 1. Define the reliability goal (sounds trivial, doesn't it?) 2. Review lessons learned (but, what if I don't have any lessons learned?) 3. Need for an FMEA (&PFMEA) (how can this help?) 4. DFSS and Reliability (what's the connection here? why do I need to bother? what's the payoff?) Software plays a major role in demonstrating Reliability because it possesses tools that can be used in all four stages. Examples of commercial product reliability development will be used to illustrate the steps above. 3. "Control Charts for the Parameters of Life Time Data Distributions," Denisa Olteanu, Virginia Tech Paper Abstract: Companies routinely perform life tests for their products. Typically, these tests involve running a set of products until some (the censored case) or all (the uncensored case) fail. Reliability professionals use a number of non-normal distributions to model the resulting lifetime data with the Weibull distribution being the most frequently used. This talk proposes CUSUM charts for lifetime data, monitoring for shifts in the Weibull distribution shape and scale parameters. The CUSUM charts are constructed using a sequential probability ratio test approach for both uncensored and censored lifetime data. 10. Challenges in Web Search and Advertising Session Organizer: Kishore Papineni, Yahoo Research. Session Chair: Kishore Papineni, Yahoo Research. 1. "Contextual advertising: Challenges in matching ads to Web pages," Silviu-Petru Cucerzan, Microsoft Research Abstract: Internet advertising is the fastest growing type of marketing due to its cost, reach, targeting opportunities, and performance analysis means. 
The presentation will briefly review the basic concepts used in internet advertising, and then focus on the particular area of contextual advertising, with emphasis on semantic targeting, keyword extraction and disambiguation, and keyword expansion. A model for large-scale keyword extraction and semantic disambiguation based on Wikipedia data and search engine query logs will be discussed in detail. The disambiguation process is formulated as a problem of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a Web document, as well as the agreement among the topic tags associated with the candidate disambiguations. 2. "Keeping a Search Engine Index Fresh: Risk versus Optimality Tradeoffs in Estimating Frequency of Change in Web Pages," Carrie Grimes, Google Research Abstract: Search engines strive to maintain a "current" repository of all pages on the internet to index for user queries or to choose appropriate ad content to match the page. However, refreshing all web pages all the time is costly and inefficient: many small websites don't support that much load, and while some pages update content very rapidly, others don't change at all. As a result, estimated frequency of change is often used to decide how frequently a web page needs to be refreshed in an offline corpus. Here we consider a Poisson process model for the number of state changes of a page, where a crawler samples the page at some known (but variable) time interval and observes whether or not the page has changed during that interval. Under this model, we first estimate the rate of change for the observed intervals using a Maximum Likelihood estimator described in Cho and Garcia-Molina (2000), and test the model on a set of 100,000 web pages by examining the outcome of a refresh policy based on these estimates. 
Second, we consider a constantly evolving set of web pages, where new pages enter the set with no information and estimation must begin immediately, but where we control the ongoing sampling of the page. In this setting, the crawl efficiency gained by an accurate estimator trades off with the risk of a stored page not accurately reflecting fresh content on the web. We implement a computationally simple empirical Bayes estimate that improves initial estimation. We demonstrate that the initial page sampling strategy that minimizes the risk of a stale corpus directly precludes optimal strategies for acquiring information to improve the accuracy of estimated rate of change and consider alternate strategies for initial estimation. 3. "Challenges in allocating banner advertisements," Kishore Papineni, Yahoo Research Abstract: Banner advertisements are a significant fraction of online advertising and differ from contextual/search advertisements in their placement and payment mechanisms. They can be sold months in advance as guarantees on number of impressions or can be sold in an auction right after a browser visits a page. Advertisers can target audience by publishers, specific web-pages, browser IP address, time of day, demographic attributes of the visitor, or a combination of these. Even with detailed targeting, there is still some variability in the quality of advertising opportunities which raises the issue of fairness to advertisers in allocating the opportunities. Given a placement opportunity, publishers such as Yahoo face two choices: use the opportunity to satisfy one of the guaranteed contracts or put up the opportunity for auction on the spot. We discuss a mechanism where the publisher puts up every opportunity for online auction and bids on it on behalf of the guaranteed contracts to maximize fairness subject to supply, demand, and budget constraints. 11. 
Classification models with applications to Quality Session Organizer: Karel Kupka, TriloByte Statistical Software Session Chair: Karel Kupka, TriloByte Statistical Software 1. "ROC curves as an aspect of classification," Jan Kolacek, University Brno, Czech Republic Paper Abstract: Receiver Operating Characteristic (ROC) analysis has its origin in signal detection theory, but most of the current work occurs in the medical decision making community. ROC curves are now widely used for evaluating the accuracy and discriminating power of a diagnostic test or statistical model. The empirical ROC curve is the most commonly used non-parametric estimator for the ROC curve. To derive a smooth estimate for the ROC curve, we use a kernel smoothing method which has better statistical properties than empirical estimates. We estimate a distribution function by this process. It is well known that kernel distribution estimators are not consistent when estimating a distribution near the finite end points of the support of the distribution function to be estimated. This is due to boundary effects that occur in nonparametric curve estimation problems. To avoid these difficulties we use a technique that is a kind of generalized reflection method involving reflecting a transformation of the data. 2. "Automatic detection and classification of fabric defects," A. Linka, M. Tunak, P. Volf, Technical University Liberec, Czech Republic Paper Abstract: Fabric is, as a rule, composed of two sets of mutually perpendicular and interlaced yarns. The weave pattern or basic unit of the weave is periodically repeated throughout the whole fabric area with the exception of the edges. Due to the periodic nature of woven fabrics their images are homogeneously structured and can be considered as texture images.
Considering the periodic nature of fabric, it is possible to monitor the second-order grey level statistical features obtained from the grey level co-occurrence matrix or to monitor and describe the relationship between the regular structure of woven fabric in the spatial domain and its Fourier spectrum in the frequency domain. The presence of a defect in this periodic structure of woven fabric causes changes in periodicity and consequently changes in the statistical features. We especially focus on the recognition of common directional defects associated with a change of weaving density or defects that appear as a thick place distributed along the width or height of an image. The methods will be illustrated on examples of the analysis of real fabrics with changes and defects. The characteristics are computed sequentially in windows moving over the analyzed fabrics, and statistical comparison is used to recognize the parts with defects. Simultaneously, the type of defect is classified with the aid of the classification tree method. Further, unsupervised model-based classification is utilized for the detection of different structures in the fabric. 3. "Data transformation in multivariate quality control," Jiri Militky, Technical University Liberec, Czech Republic Paper Abstract: Industrial quality control is based on measured data from technology and measured quality parameters of products. Such data are inherently multivariate, therefore multivariate statistical methods are widely employed to analyze, model and improve quality. Most statistical methods rely on strict statistical assumptions which, however, are often not met. This may lead to many misinterpretations and loss of information present in the data, especially in high dimensional spaces. Commonly used techniques like Hotelling control charts, FA/PCA based methods, multivariate clustering and classification, distribution modeling, etc. may be heavily affected.
We show how non-normality and high dimensionality affect model building and estimation in clustering, classification and predictive modelling and suggest some transformation procedures based on power functions with optimal predictor distributional and information properties. 4. "Neural network time series classification of changes in nuclear power plant processes," Karel Kupka, TriloByte Statistical Software Paper Abstract: Time series are typical data output from technological processes. Diagnostics of process data such as model change detection, outlier detection, etc. are often of primary interest for quality management. For autocorrelated processes, many models and procedures have been suggested, many of them based on uni- and multivariate EWMA, AR, ARIMA, CUSUM. We compare two types of models for stationary univariate series: linear partial least squares autoregression (PLSAR) and nonlinear perceptron-type neural network autoregression (ANNAR) with multistep prediction. Lack of knowledge of the statistical properties of prediction is addressed by multiple overlay estimates, which make it possible to assess and predict heteroscedastic variance and construct statistical models and control charts of processes with confidence intervals. It is shown that orthogonal PLSAR models are more stable and that their parameters can be used to identify and classify different physical modes of processes. Extensions of the proposed models to multivariate and non-stationary data are also possible and allow more complex technological processes to be controlled and classified. 12. Advances in Statistical Process Control: Session in Honor of the late Professor Zachary Stoumbos Session Organizer: Emmanuel Yashchin, IBM Research Session Chair: Emmanuel Yashchin, IBM Research 1.
"The Combined Chart Controversy," Douglas Hawkins, University of Minnesota Paper Abstract: It has long been a dictum that cusum and EWMA charts for mean are good for detecting small shifts, and that the Shewhart chart is good for detecting large shifts. This led to the standard advice to combine the two, maintaining a cusum (or EWMA) location chart and a Shewhart Xbar chart side by side so that each could deliver on what it was good for. Zachary Stoumbos championed the idea that you could do just as well from a conventional cusum (or EWMA) location/scale pair; that the scale charts were as effective for detecting large shifts in mean as is the Shewhart. Yet another possibility is the "snub-nosed V-mask cusum" proposal in which two parallel cusums are run, one for large and one for small shifts. In the talk, the three possibilities will be sketched and their performance compared. 2. "The State of Statistical Process Control - An Update," William H. Woodall, Virginia Tech Paper Abstract: In this overview presentation, applications of profile monitoring and health-related surveillance will be described. These are two of the most exciting applications of quality control charts. Other areas of research on control charting will be briefly summarized with some key references provided. After a very brief overview of process monitoring in general, some profile monitoring applications will be given. In profile monitoring the quality of a process or product is best characterized by a function, i.e., the profile. Changes in the function over time are to be detected. A number of applications of healthcare quality monitoring will be presented. Some of the differences between the industrial process monitoring and health-related surveillance environments are to be discussed. Public health data often include spatial information as well as temporal information, so the public health applications can be more challenging in many respects than industrial applications. 3.
"The Use of Sequential Sampling in Process Monitoring," Marion Reynolds, Virginia Tech Paper Abstract: Control charts are used to monitor a process to detect changes in the process that may occur at unknown times, and have traditionally been based on taking samples of fixed size from the process with a fixed time interval between samples. Here we consider control charts based on sequential sampling, so that the sample size at each sampling point is not fixed but instead depends on the current and past data from the process. The basic idea is that sampling at a sampling point should continue until either it appears that there has been no change in the process, or there is enough evidence of a process change to generate a signal by the control chart. Control charts based on sequential sampling can be designed so that, when the process is in control, the average sample size at each sampling point is some specified value. In particular, the in-control average sample size can be the same as the fixed sample size used by a traditional control chart. Here we review some univariate control charts based on sequential sampling for detecting a one-sided change in one parameter. These univariate charts apply a sequential probability ratio test at each sampling point to either accept or reject the hypothesis that the process is in control. We also discuss extensions of the sequential sampling approach to multivariate monitoring. It will be shown that control charts based on sequential sampling can detect most process changes significantly faster than traditional control charts based on taking samples of fixed size. 13. Reliability Assessment and Verification Session Organizer: Gejza Dohnal, Czech Technical University, Czech Republic Session Chair: Gejza Dohnal, Czech Technical University, Czech Republic 1.
"Intensity models for randomly censored data," Ivanka Horova, Masaryk University Brno, Czech Republic Paper Abstract: Survival analysis belongs to the classical parts of mathematical statistics and occupies an important place in medical research. In summarizing survival data there are two functions of central interest - a survival function and an intensity, also called a hazard function. In the present paper we focus on estimating the hazard function, which represents the instantaneous death rate for an individual. The goal of the paper is to develop a procedure for an analytical form of the hazard function for cancer patients. Our model is based on the acceptable assumption that the hazard depends on the proliferation speed of the cancer cell population. Thus we propose a deterministic model which is defined as a solution of a dynamical model. In order to estimate the parameters of such a model one needs a suitable nonparametric method. Among these methods a kernel estimate is one of the most effective. We use this method, which provides a pointwise estimate of the hazard function and also makes it possible to estimate the parameters of the proposed model. As far as the biomedical application is concerned, attention is paid not only to the estimate of the hazard function but also to the detection of the points where the most rapid changes of the hazard function occur. The developed procedure is applied to breast cancer data and leukemia data. 2. "Successive events modeling," Gejza Dohnal, Czech Technical University, Czech Republic Paper Abstract: The notion of successive events involves situations in which strong dependence must be considered. In contrast to a sequence of traffic accidents, equipment failures or nonconforming product releases, which occur independently in time, events of this kind come collectively, due to some common cause.
As examples we can use disasters in oil production or transportation, attacks on IT systems, wildfire spreading, epidemic progression and so on. Considering the system of all objects which can be affected by the consequences of some initiating disastrous event as a whole, along with all its macro-states, we can use Markov chains to model the spreading of events. Then we are able to compute predictions such as the lifetime of the system, the time until a selected object is first affected, and others. This modeling can help us to make preventive decisions or to prepare disaster recovery plans. In the contribution, the model will be described and some computations will be outlined. 3. "Quality and Risk: Convergence and Perspectives," Ron S. Kenett (KPA Ltd Management Consulting) and Charles S. Tapiero, Polytechnic Institute of NYU, Brooklyn, New York Abstract: The definition and the management of quality have evolved and assumed a variety of approaches, responding to needs expounded by the parties involved. In industry, quality and its control have responded to the need to maintain an industrial process operating as "expected", reducing the process sensitivity to unexpected disturbances (robustness) etc. By the same token, in services, quality was defined by some as the means to meet customer wants. Interestingly, quality in name or in spirit has been used by many professions in manners that respond to their specific needs. Throughout these many approaches, quality, just as risk, is measured as a consequence resulting from factors and events that may be defined in terms of the statistical characteristics that underlie these events. Quality and Risk may therefore converge, both conceptually and technically, in dealing with the broader concerns that quality is currently confronted with: measurement, valuation, pricing and management. This presentation will present both applications and a prospective convergence between quality and risk and their management.
In particular we shall emphasize aspects of Industrial and Services Quality-Risk Management, Environmental Quality and Risk Management etc. Through such applications, we will demonstrate alternative approaches for Quality and Risk Management to merge in order to improve the management of the risks and opportunities that quality implies. 4. "Safety and Economic Reliability," Charles S. Tapiero, Polytechnic Institute of NYU, Brooklyn, New York Abstract: Managing risk and safety consists in defining, measuring, estimating, analyzing, valuing-pricing and integrating all facets of risk and their safety-consequential effects (real or not, external, internally induced or use dependent) into a whole system which can contribute to their design, economy, controls and management. The purpose of this paper is to provide an economic approach to reliability design and safety based on economic considerations regarding a system design and its safety consequences, which depend on both system reliability and the proficiency of the user, who may be at fault in operating the system unsafely. The paper provides some specific examples that highlight the approach proposed for reliability and safety design and contrasts a number of approaches, risk and finance based, which we consider. Asymmetric information between the system design and the user, conflicting objectives, and controls for safety are natural extensions of this paper, to be considered in further research. 14. Adaptive Designs in Pharmaceutical Industry Session Organizer: Mani Y. Lakshminarayanan, Merck Session Chair: Mani Y. Lakshminarayanan, Merck 1. "Implementation of Adaptive Designs at Merck," Keaven Anderson, Nichole Dossin, Jerry Schindler, Merck Abstract: Adaptive clinical trial designs provide opportunities to speed clinical development or to improve the probability of ultimate success.
Realization of these benefits requires a combination of 1) early strategic planning, 2) sophisticated clinical trial design expertise and tools as well as 3) extensive capabilities for implementation. Merck Research Laboratories is currently in the middle of a highly cross-functional effort to evaluate the needs in these 3 areas and to implement appropriate processes, tools and training to enable the efficient consideration, design and implementation of adaptive clinical trials. The challenges of an effort with this major scope as well as the initial successes will be discussed. We will also discuss the business case for adaptive designs at Merck, which is substantial. 2. "Sample Size Reestimation in Clinical Trials: A Review and Comparison," Yanping Wang, Eli Lilly and Co. Abstract: When it becomes evident that an ongoing clinical trial is underpowered for an alternative hypothesis, it's often desirable to modify the sample size to achieve desired power. With adaptive design methods, accumulated data can be used to reassess sample size during the course of a trial while preserving the false positive rate. In this presentation, we will review and compare the current methods proposed for data-dependent sample size re-estimation. 3. "Data combination in seamless Phase II/III designs," Chris Jennison and L. Hampson, University of Bath, UK Abstract: In seamless Phase II/III trials multiple treatments are compared against a control. One of k treatments is selected in Phase II for further study in Phase III. The final decision to declare the selected treatment superior to control must protect the family-wise type I error rate for k comparisons against the control. In adaptive seamless designs, combination rules are applied to P-values from data in the two Phases. 
This data combination does not necessarily lead to greater power than an analysis which simply ignores the Phase II data, raising the question of how one should combine information optimally to maximise power while protecting family-wise type I error. We shall present a formulation of the problem which is amenable to analysis by decision theory. Hence, we derive optimal data combination rules for particular objectives. The results of this exercise are somewhat surprising. While Simes' rule reacts positively to low P-values for the treatments eliminated in Phase II, under some assumptions, the optimal inference treats such low P-values as detracting from the evidence for efficacy in the treatment selected for Phase III. We shall endeavour to present a decision rule with robust efficiency across a variety of scenarios. We believe the results supporting this conclusion have broader implications for other multiple comparison problems and the methodology we have developed may be adapted for wider use. 15. Data Mining and Information Retrieval Session Organizer: Regina Liu, Rutgers University Session Chair: Regina Liu, Rutgers University 1. "High-Dimensional Classifiers: The Bayesian Connection," David Madigan, Columbia University Abstract: Supervised learning applications in text categorization, authorship attribution, hospital profiling, and many other areas frequently involve training data with more predictors than examples. Regularized logistic models often prove useful in such applications and I will present some experimental results. A Bayesian interpretation of regularization offers advantages. In applications with small numbers of training examples, incorporation of external knowledge via informative priors proves highly effective. Sequential learning algorithms also emerge naturally in the Bayesian approach. Finally I will discuss some recent ideas concerning structured supervised learning problems and connections with epidemiology.
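The Bayesian reading of regularization mentioned in the abstract above can be made concrete with a small sketch. Assuming independent Gaussian priors on the weights, the posterior mode of a logistic model minimizes the negative log-likelihood plus a quadratic penalty, which is ridge-penalized logistic regression when the prior mean is zero; a non-zero prior mean is one simple way to encode external knowledge. The function `fit_map_logistic`, its arguments, and the plain gradient-descent solver are illustrative assumptions, not the speaker's software or algorithm.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_map_logistic(X, y, prior_mean, prior_var, steps=2000):
    """Posterior mode of logistic regression weights under N(prior_mean, prior_var) priors.

    Minimizes
        -sum_i log p(y_i | x_i, w) + sum_j (w_j - m_j)^2 / (2 * prior_var)
    by gradient descent with a fixed step 1/L, where L is an upper bound on the
    curvature of the objective (so the iteration cannot diverge).
    """
    curvature_bound = 0.25 * sum(x * x for row in X for x in row) + 1.0 / prior_var
    step = 1.0 / curvature_bound
    w = list(prior_mean)
    for _ in range(steps):
        # Gradient of the Gaussian log-prior penalty.
        grad = [(wj - mj) / prior_var for wj, mj in zip(w, prior_mean)]
        # Gradient of the negative log-likelihood.
        for row, label in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, row)))
            for j, xj in enumerate(row):
                grad[j] += (p - label) * xj
        w = [wj - step * gj for wj, gj in zip(w, grad)]
    return w


# Two training examples, three predictors: more predictors than examples.
X = [[1.0, 0.5, -1.0], [-1.0, 1.0, 0.5]]
y = [1, 0]
weak_prior = fit_map_logistic(X, y, [0.0, 0.0, 0.0], prior_var=10.0)
strong_prior = fit_map_logistic(X, y, [0.0, 0.0, 0.0], prior_var=0.1)
informed = fit_map_logistic(X, y, [2.0, 0.0, 0.0], prior_var=1e-6)
```

Without the prior term these two examples are perfectly separable and the maximum likelihood weights would grow without bound; the prior keeps the estimate finite. A smaller `prior_var` pulls the weights harder toward `prior_mean`, so `informed` stays very close to its prior mean.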
2. "Multiplicity Issue in Confidentiality," Jiashun Jin, Carnegie Mellon University Abstract: The explosion of computerized databases offers enormous opportunities for statistical analysis. But at the same time, many such databases contain sensitive data (e.g. financial or health records) and are vulnerable to attack. This has heightened public attention and generated fears regarding the privacy of personal data. Many statistical methods have been introduced in the literature to address the privacy issue in data mining. These include but are not limited to: sampling, perturbation, collapsing, and synthetic data. However, an issue that has been largely overlooked is the multiplicity issue. For example, many of these approaches aim to control the probability that a specified individual record can be identified, but fail to control the probability that out of a large number of records, at least one of them can be successfully identified. The latter is one of the topics that has been studied extensively in the recent literature on multiple statistical testing. In this talk, we lay out the basics of the multiplicity issues in privacy control with many examples. 3. "The role of valid models in data mining," Simon Sheather, Texas A&M University Abstract: Texts on data mining typically offer a version of the following as the steps involved in data mining: Step 1 - Build model(s) using training data; Step 2 - Evaluate model(s) using validation data; Step 3 - Reevaluate model(s) using test data; Step 4 - Predict or classify using final data. Checking whether the current model is a valid model is not an explicit step in this process. In this talk we will illustrate the importance of checking whether the current model is a valid model as well as highlight the dangers of a black box approach to modeling. Techniques to be discussed include marginal model plots.
Real-life examples will be used to demonstrate that dramatic improvements to models can translate into benefits in terms of much more accurate predictions. Special Invited Session #1: Advances in Reliability Session Organizer: Emmanuel Yashchin, IBM Research Session Chair: Emmanuel Yashchin, IBM Research "Step - Stress models and associated inferential issues," N. Balakrishnan, McMaster University, Canada Abstract: In this talk, I will introduce first various models that are used in the context of step-stress testing. Then, I will describe the cumulative exposure model in detail and describe the model under the assumption of exponentiality. Next, I will discuss the derivation of the MLEs and their exact conditional distributions, and then present various methods of inference including exact, asymptotic and bootstrap methods and compare their relative performances. I will develop the results for different forms of censored data, and then present some illustrative examples. Finally, I will point out some other recent results concerning the Weibull model, order restricted inference, and optimal test design. In the course of the talk, some open problems will also be mentioned. Special Invited Session #2: Advances in Process Monitoring Session Organizer: Emmanuel Yashchin, IBM Research Session Chair: William H. Woodall, Virginia Tech "Process Monitoring with Supervised Learning and Artificial Contrasts," George Runger, Arizona State University and Eugene Tuv, Intel Paper Abstract: Normal operations typically can be represented as patterns and contrasted with artificial data so that multivariate statistical process control can be converted to a supervised learning task. This can reshape the control region and open the control problem to a rich collection of supervised learning tools. Recent results and applications are presented. 16. 
Special INFORMS Session: Data Mining in Manufacturing Process Control Session Organizer: Susan Albin, Rutgers University Session Chair: Susan Albin, Rutgers University 1. "Attribute Control Charts Using Generalized Zero-Inflated Poisson Distribution," Nan Chen and Shiyu Zhou (University of Wisconsin - Madison), Tsyy-Shun Chang and Howard Huang (OG Technologies Inc., Ann Arbor, MI) Abstract: This paper presents a control charting technique to monitor attribute data based on the Generalized Zero-Inflated Poisson (GZIP) distribution, which is an extension of the Zero-Inflated Poisson (ZIP) distribution. Poisson distributions are widely used to model count data, such as customer arrival counts, number of phone calls, and so forth. However, when the data contain a large number of zeros, a Poisson distribution cannot fit the data well. The Zero-Inflated Poisson distribution is often used to model the excessive number of zeros in the data. It assumes that the number of non-conformities on a product caused by random shocks follows the Poisson distribution with parameter µ, and that the shocks occur with probability p. The GZIP distribution is a generalization of the ZIP distribution, assuming that the number of non-conformities caused by each shock follows a specific Poisson distribution with parameter µi, and that the probability of occurrence of each shock is pi. There could be multiple types of shocks in the system. The GZIP distribution is very flexible in modeling complicated behaviors of the data. Both the technique of fitting the GZIP model and the technique of designing control charts to monitor the attribute data based on the estimated GZIP model are developed. Simulation studies and real industrial applications illustrate that the proposed GZIP control chart is very flexible and advantageous over many existing attribute control charts. 2. "Data Mining for Process Monitoring of Semiconductor Manufacturing with Spatial Wafer Map Data," Myong K. Jeong, Young S. Jeong and Seong J.
Kim, Rutgers University Abstract: Semiconductor manufacturing is a complex, costly and lengthy process involving hundreds of chemical steps. Several hundred integrated circuits (ICs) are simultaneously fabricated on a single wafer. After fabrication, each chip goes through a series of pass or fail functional tests and is classified as either functional or defective. A wafer map, which consists of binary codes, "1" or "0", displays the locations of defective or functional chips on the wafer. Various types of defect patterns shown on the wafer map contain important information that can assist process engineers in their understanding of the ongoing manufacturing processes. Because each defect pattern represents its unique manufacturing process condition, automatic classification of various types of defect patterns is significant for monitoring process control and improving yield rate. In this talk, we first introduce some traditional approaches for the analysis of wafer map data. We then present a novel spatial correlogram-based defect classification technique, which combines the spatial correlogram with dynamic time warping (DTW) for automatic defect classification on the wafer map. The spatial correlogram is used to detect the presence of spatial autocorrelations, and DTW is adopted for the automatic classification of spatial patterns based on the spatial correlogram. We also propose a new weighted dynamic time warping (WDTW), which gives more weight to nearer neighbors based on our proposed weighting scheme, because neighboring points could be more important than others, while the usual DTW treats all points equally. A case study demonstrates the effectiveness of our proposed algorithms. 3. "Selecting the best variables for classifying production batches into two quality levels," Michel Anzanello, Susan Albin and Wanpracha A.
Chaovalitwongse, Rutgers University Abstract: Datasets containing a large number of noisy and correlated process variables are commonly found in chemical and industrial processes. The goal here is to identify the most important process variables to correctly classify the outcome of each production batch into two quality classes, conforming and non-conforming, for example. To reduce the number of process variables needed for classification by eliminating noisy and irrelevant ones, we develop a framework that combines data mining classification tools with Partial Least Squares (PLS) regression. We propose new indices that measure variable importance based on PLS parameters. Then process variables are eliminated one-by-one in the order of importance given by the index. We then compute the classification accuracy after each variable deletion and the set of process variables with the maximum accuracy is selected. We evaluate the results on five industrial datasets. The proposed method reduces the number of variables for classification of production batches by an average of 83%, while yielding 13% more accurate classifications than using the entire datasets. It is interesting that the reduced data set can not only match the accuracy of the original but often exceeds that accuracy. 17. Nonparametric Statistical Process Control / Monitoring Session Organizer: Regina Liu, Rutgers University Session Chair: Regina Liu, Rutgers University 1. "Using the Empirical Probability Integral Transformation to Construct a Nonparametric CUSUM Algorithm," Daniel Jeske, University of California, Riverside Paper Abstract: Empirical distribution functions can be used to transform observations prior to incorporating them into a change-point cusum algorithm. The usefulness of this approach is examined and the question of how to determine an appropriate alarm threshold is discussed. The approach is illustrated with a real-life example where the raw data consists of non-stationary data streams. 
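One way to picture the idea in the abstract above is the following minimal sketch. It assumes a reference sample of in-control observations is available: each new observation is passed through the empirical distribution function of that sample, so in-control values are roughly uniform on [0, 1], and a one-sided CUSUM then accumulates the excess of the transformed value over 0.5. The function names, the `allowance` and `threshold` parameters, and this particular CUSUM recursion are assumptions made for illustration; the speaker's algorithm and his method for choosing the alarm threshold are not reproduced here.

```python
from bisect import bisect_right


def empirical_pit(sorted_reference, x):
    """Empirical distribution function of the reference sample, evaluated at x."""
    return bisect_right(sorted_reference, x) / len(sorted_reference)


def pit_cusum(reference_sample, stream, allowance=0.1, threshold=3.0):
    """One-sided upper CUSUM on empirically transformed observations.

    Returns the CUSUM path and the index of the first alarm (None if no alarm).
    """
    ref = sorted(reference_sample)
    s = 0.0
    path = []
    alarm_at = None
    for t, x in enumerate(stream):
        u = empirical_pit(ref, x)               # roughly Uniform(0, 1) when in control
        s = max(0.0, s + (u - 0.5) - allowance)  # drift is negative when in control
        path.append(s)
        if alarm_at is None and s > threshold:
            alarm_at = t
    return path, alarm_at


# Hypothetical data: reference values 0..99, then a stream that shifts upward.
reference = list(range(100))
in_control = [10, 90, 30, 70, 50] * 4
shifted = [200] * 20
path, alarm_at = pit_cusum(reference, in_control + shifted)
```

Because the statistic depends on the data only through ranks relative to the reference sample, no parametric form for the in-control distribution is needed. The threshold here is arbitrary; in practice it would be calibrated to a target false alarm rate, which is the question the abstract raises.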
2. "Optimal Sampling in State Space Models with Applications to Network Monitoring," George Michailidis, University of Michigan Paper Abstract: Advances in networking technology have enabled network engineers to use sampled data from routers to estimate network flow volumes and track them over time. However, low sampling rates result in large noise in traffic volume estimates. We propose to combine data on individual flows obtained from sampling with highly aggregated data for the tracking problem at hand. Specifically, we introduce a linearized state space model for the estimation of network traffic flow volumes from the combined data. Further, we formulate the problem of obtaining optimal sampling rates under router resource constraints as an experiment design problem. The usefulness of the approach in the context of network monitoring is illustrated on both emulated and real network data. 3. "Nonparametric Tolerance Regions Based on Multivariate Spacings," Jun Li, University of California, Riverside Paper Abstract: Tolerance intervals are widely used in many fields, such as reliability theory, medical statistics, chemistry, quality control, etc., to gauge whether or not certain prescribed specifications are met by the underlying distribution of the subject of interest. In many practical situations, the specifications are imposed upon multiple characteristics of the subject, and thus tolerance regions for multivariate measurements are needed. We present in this talk an approach to construct the multivariate tolerance region based on the multivariate spacings derived from data depth. The construction of tolerance regions is nonparametric and completely data driven, and the resulting tolerance region reflects the true geometry of the underlying distribution. This is different from most existing approaches which require that the shape of the tolerance region be specified in advance.
The proposed tolerance regions are shown to meet the prescribed specifications, in terms of beta-content and beta-expectation. They are also asymptotically minimal under elliptical distributions. Finally, a simulation and a comparison study on the proposed tolerance regions are presented. This is joint work with Regina Liu, Department of Statistics & Biostatistics, Rutgers University. 18. Statistical Leadership in Industry Session Organizer: Diane Michelson, Sematech Session Chair: Diane Michelson, Sematech 1. "Statisticians and Statistical Organizations - How to Be Successful in Today's World?" Ronald D. Snee, Tunnell Consulting Paper Abstract: The statistics profession is at a critical point in its history and has been for some time. The May 2008 Technometrics article, "Future of Industrial Statistics", summarized many of the major issues. Two key drivers are global competition and the rapid growth of information technology. The old model for the use of statistical thinking and methods in business and industry, which has been around for at least 50 years, does not work in today's business environment. This presentation begins with a brief summary of the current state of the profession and then moves quickly to a focus on what statistical organizations and statisticians as individuals need to do to effectively deal with the new environment. The focus is on strategies and approaches that have been found to work. Several case studies will be presented to illustrate the "new model" and the needed changes. 2. "Statistical Leadership at GE," J.D. Williams, GE Paper Abstract: All too often statisticians are thought of as being only "resources" for analytical expertise or data analysis. In these circumstances statisticians are left out of the critical decision-making processes and serve only a consultative role.
A more ideal scenario would be for statisticians to be revered as thought leaders, innovators, or business leaders who can produce bottom-line results for their companies. In this talk I will share some of the things that we do here at GE Global Research to shape the way GE business leaders and our scientific peers view statisticians and how we cultivate leadership among our statisticians. Examples will be drawn from past and current collaborations between GE Global Research statisticians and the GE businesses, as well as statistician-led research initiatives. 3. "Analytics with Commercial Impact: IBM Research Solutions for Microelectronic Manufacturing and Development," Robert Baseman, IBM Research Paper Abstract: IBM Research’s Business Analytics and Mathematical Sciences Department engages with many of IBM’s divisions and external customers to address a wide variety of tactical and strategic business issues. The Department employs a wide variety of analytical techniques to develop and deliver solutions and analyses that have significant commercial impact. Here, we describe our recent efforts with IBM’s 300mm semiconductor manufacturing operation. The unique complexity of this particular environment provides numerous opportunities for advanced analytic applications. In the course of this multi-disciplinary, multi-year effort, the IBM Research team has delivered solutions improving manufacturing yield, product quality, and overall factory productivity. We review our proactive multi-disciplinary approach, the suite of solutions that have been developed, and consider factors contributing to the overall success of the engagement. 19. New developments in Design of Experiments Session Organizer: Douglas C. Montgomery, Arizona State University Session Chair: Connie Borror, Arizona State University West 1. 
"Alternatives to Resolution IV Screening Designs in 16 Runs," Bradley Jones, JMP Paper Abstract: The resolution IV regular fractional factorial designs in 16 runs for six, seven, and eight factors are in standard use. They are economical and provide clear estimates of main effects when three-factor and higher-order interactions are negligible. However, because the two-factor interactions are completely confounded, experimenters are frequently required to augment the original fraction with new runs to resolve ambiguities in interpretation. We identify nonregular orthogonal fractions in 16 runs for these situations that have no complete confounding of two-factor interactions. These designs allow for the unambiguous estimation of models containing both main effects and a few two-factor interactions. We present the rationale behind the selection of these designs from the non-isomorphic 16-run fractions and illustrate how to use them with an example from the literature. 2. "Dual Response Surface Methods (RSM) to Make Processes More Robust," Mark J. Anderson and Patrick J. Whitcomb, Stat-Ease, Inc. Abstract: This talk illustrates the dual response surface method by example. Then, as a postscript, it takes into account the propagation of error (POE). Response surface methods [1] (RSM) provide statistically-validated predictive models, sometimes referred to as "transfer functions," that can then be manipulated for finding optimal process configurations. The dual response [2] approach to RSM captures both the average and standard deviation of the output(s) and simultaneously optimizes for the desired level at minimal variation, thus achieving an on-target, robust process. With inspiration provided by a case study on a single-wafer etching process [3], the implications of these methods for industry will be readily apparent, especially for those involved in design for six sigma (DFSS) quality programs. Robust design makes systems less sensitive to variation in the process variables. 
To accomplish this you should set controllable factors to levels that reduce variation in the response. Dual RSM provides only one aspect of this - response instability caused by variation in the uncontrollable factors, such as batch-to-batch changes. Another cause for non-robustness is error transmitted to the response from variation in the controllable factors. Response surface methods provide predictive models that make it relatively easy to measure transmitted variation via a technique called propagation of error (POE) [4,5]. Given statistics on standard deviation of the inputs, POE can be calculated and the 'flats' found - regions on the response surface where transmitted variation from inputs will be minimized. 3. "An Empirical Comparison of Computer Experiments," Rachel T. Johnson, Douglas C. Montgomery (Arizona State University), Bradley Jones, JMP Paper Abstract: Computer experiments are experimental designs intended for use on a deterministic computer simulation model. Computer experiments found in practice and in the literature include the Latin Hypercube, Sphere Packing, Uniform, and Gaussian Process Integrated Mean Squared Error (GP IMSE) designs. Often, computer simulation models are complex and have long execution times. Long computer execution time necessitates the use of a surrogate model. The Gaussian Process (GASP) model has been shown to provide fits to computer experiments that have low prediction variance at untried locations in the design space. Previous theoretical comparison studies based on prediction variance with respect to the Gaussian Process model demonstrated that factors such as design type, dimension of designs (i.e., number of input variables), and sample size all have an impact on the quality of predictions. This research provides an empirical comparison of these factors with respect to the root mean square error and average absolute percent difference between the predicted results (using a fitted GASP model) and actual results. 
The empirical study is based on a series of test functions with varying complexities. 20. Bayesian methods in Quality and Reliability Session Organizer: Will Guthrie, NIST Session Chair: Will Guthrie, NIST 1. "Bayesian SPC for Count Data," Panagiotis Tsiamyrtzis, Athens University of Economics and Business Paper Abstract: We consider a process producing count data from a Poisson distribution. Our interest is in detecting on-line whether the Poisson parameter (mean and variance) shifts either to a higher value (causing worse process performance) or to a smaller value (a good scenario). The necessity for drawing inference sequentially as the observations become available leads us to adopt a Bayesian sequentially updated scheme of mixture of Gamma distributions. Issues regarding inference and prediction will be covered. The developed methodology is very appealing in cases like short runs and/or Phase I count data. 2. "Integrating Computer and Physical Experiment Data," Brian Williams, Statistical Sciences Group, Los Alamos National Laboratory Paper Abstract: The investigation of complex physical systems utilizing sophisticated computer models has become commonplace with the advent of modern computational facilities. In many applications, experimental data on the physical systems of interest is extremely expensive to obtain and hence is available in limited quantities. The mathematical systems implemented by the computer models often include parameters having uncertain values. This article provides an overview of statistical methodology for calibrating uncertain parameters to experimental data. This approach assumes that prior knowledge about such parameters is represented as a probability distribution, and the experimental data is used to refine our knowledge about these parameters, expressed as a posterior distribution. Uncertainty quantification for computer model predictions of the physical system is based fundamentally on this posterior distribution. 
Computer models are generally not perfect representations of reality for a variety of reasons, such as inadequacies in the physical modeling of some processes in the dynamic system. The statistical model includes components that identify and adjust for such discrepancies. A standard approach to statistical modeling of computer model output for unsampled inputs is introduced for the common situation where limited computer model runs are available. Extensions of the statistical methods to functional outputs are available and discussed briefly. 3. "Bayesian Design of Reliability Experiments," Yao Zhang, Pfizer Paper Abstract: This talk describes Bayesian methods for planning reliability experiments. A Bayesian reliability experiment is a reliability experiment in which the experimenters intend to use prior information and for which Bayesian methods will be used in making inferences from the resulting data. Within the framework of a general model for reliability involving unknown parameters, we describe the basic ideas of Bayesian experimental design. We then describe a large-sample normal approximation that provides an easy to interpret yet useful simplification to the problem of Bayesian evaluation of reliability experiments and discuss aspects of finding optimum experiments within this framework. We point to several examples of the methodology that are available in the statistical literature. 21. Healthcare Applications Session Organizer: Art Chaovalitwongse, Rutgers University Session Chair: Art Chaovalitwongse, Rutgers University 1. "A simulation optimization approach to refine the US organ allocation policy," Yu Teng and Nan Kong, Weldon School of Biomedical Engineering, Purdue University Paper Abstract: Organ transplantation and allocation has been a contentious issue in the U.S. for decades. 
The scarcity of donors highlights the need for improving the efficiency of the current policy that applies a three-tier hierarchical allocation network (i.e., local, regional, and nationwide). While the system is too complex for analytical modeling, agent-based modeling and simulation (ABMS) provides a promising approach to evaluate the allocation policy alternatives, since ABMS can model interactions of autonomous individuals in a network. In this paper, we develop an ABMS model in which each organ procurement organization (OPO) is an agent. We model geographic preference of the OPOs in the organ matching process, i.e., a procured organ is assigned to recipients following a pre-specified priority list: local OPO, partnering OPOs, and other OPOs. We incorporate system heterogeneity in terms of geographic and biological disparities of donors and recipients. We consider a variety of medically acceptable outcomes to measure the system performance. We develop a simulation optimization framework to select the optimal policy. An intermediate task is to reconfigure the organ allocation regions. Our computational results show improvements in system performance with respect to various outcomes by refining the allocation policy. 2. "Statistical Inference for Ranks of Health Care Facilities in the Presence of Ties and Near Ties," Minge Xie, Rutgers University Abstract: Performance evaluation of institutions is common in many areas, and performance evaluations "inevitably lead to institutional ranking". Although there is a great need for such procedures in practice, confidence intervals and statistical inference for ranks are not well established. How do we assign confidence bounds for the ranks of health care facilities, schools and financial institutions based on data that do not clearly separate the performance of different entities? 
The commonly used bootstrap-based frequentist confidence intervals and Bayesian intervals for population ranks cannot achieve the intended coverage probability in the frequentist sense, especially in the presence of unknown ties or "near ties" among the populations to be ranked. Given random samples from k populations, we propose confidence bounds for population ranking parameters and develop rigorous theory and inference for population ranks which allow ties and near ties. In the process, a notion of modified population rank is introduced which appears quite suitable for dealing with the population ranking problem. The results are extended to general risk adjustment models. This project stems from collaborative research with medical researchers and administrators from the VA Medical Center East Orange, New Jersey. Simulations and a real dataset from a health research study involving 79 VHA facilities are used to illustrate the proposed methodology and theoretical results. 3. "Multivariate Time Series Classification for Brain Diagnosis," Art Chaovalitwongse, Ya-Ju Fan and Rajesh C. Sachdeo, Rutgers University Abstract: During the past few years, technological advances have made it possible to capture simultaneous responses from large enough medical data to empirically test the theories of human biological functions. To better understand human biology and diseases from this new flood of information, there has been an explosion of interest in a rapidly emerging interdisciplinary research area in both medical and engineering domains. The main objective of this talk is to present a new optimization framework for data mining in the development of a new multivariate time series classification framework. 
In this framework, Support Feature Machine (SFM) is proposed as a new optimization method for feature selection, whose objective is to find the optimal set of features with good class separability based on the concept of intra-class and inter-class distances and the nearest neighbor rule. The empirical study shows that SFM achieves very high classification accuracies on real medical data classification. SFM's performance is comparable to that of Support Vector Machines while using fewer features. An application of this framework is to develop a decision-aided tool to improve the current diagnosis of epilepsy, the second most common brain disorder. The diagnosis of epilepsy and brain disorders is a case in point in this study. We have developed several optimization approaches in an attempt to predict seizures and localize the abnormal brain area initiating the seizures as well as identify if the patients have epilepsy or other brain disorders. 4. "Total body irradiation treatment planning using IMRT optimization," Dionne Aleman, Velibor Misic (University of Toronto) and Michael Sharpe, Princess Margaret Hospital Abstract: It is estimated that over 4,500 new cases of leukemia and 7,900 new cases of lymphoma were diagnosed in Canada in 2008. In the US, those numbers rise to 44,000 new cases of leukemia and 74,000 new cases of lymphoma. A common treatment for these diseases and many others is bone marrow transplant. As part of the conditioning process to prepare the patient for the bone marrow transplant, the entire patient is treated with total body irradiation (TBI). The purpose of TBI is to eliminate the underlying bone marrow disease and to suppress the recipient's immune system, thus preventing rejection of new donor stem cells. However, since the diseased cells are confined to bone marrow only, it would be more accurate to irradiate only the bone marrow rather than irradiate the entire body. 
Total marrow irradiation (TMI) is not yet used in practice and has not been well explored in the literature. Designing a treatment plan for TMI poses unique challenges that are not present in other forms of site-specific radiation therapy, for example, head-and-neck and prostate treatments. Specifically, the large site to be treated results in clinical treatments where the patient must be positioned far from the isocenter, as well as often repositioned during treatment, thus increasing uncertainty in delivered dose. Designing TMI treatments with intensity modulated radiation therapy (IMRT) will provide more accurate treatments that can spare healthy tissues while simultaneously delivering the prescribed radiation dose to the bones. To bring the patient closer to the isocenter, beam orientation optimization (BOO) will be used to incorporate non-coplanar beams. This presentation will discuss in detail the difficulties presented by TBI treatment planning in conjunction with IMRT, as well as the difficulties posed by non-coplanar BOO. Treatment plan results obtained using a local search heuristic and a set cover formulation of the problem will be presented.