Data Analysis with Microsoft Excel

KhoriBona
Data Analysis with Microsoft Excel

Data Analysis with Microsoft Excel

613 Pages · 2010 · 10.55 MB · English

by Kenneth N. Berk

Editor's Picks  +  Technology Software



Data Analysis with Microsoft® Excel Updated for Offi ce 2007®

Kenneth N. Berk 
Illinois State University
 Patrick Carey
 Carey Associates, Inc


Data Analysis with Microsoft® Excel: Updated for Offi ce 2007®, Third Edition Berk, Carey

 Publisher: Richard Stratton
Senior Sponsoring Editor: Molly Taylor
 Associate Editor: Daniel Seibert
 Editorial Assistant: Shaylin Walsh
 Associate Media Editor: Catie Ronquillo
 Senior Marketing Manager: Greta Kleinert
 Marketing Coordinator: Erica O’Connell
Marketing Communications Manager: Mary Anne Payumo
 Content Project Manager: Jessica Rasile
Art Director: Linda Helcher
 Print Buyer: Linda Hsu
 Permissions Editor: Margaret Chamberlain-Gaston
 Production Service/Compositor: PrePress PMG
 Photo Manager: John Hill
 Cover Designer: Blue Bungalow Design
Cover Image: ©Fotolia
© 2010, 2004 Brooks/Cole, Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706
For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions. Further permissions questions can be emailed to permissionrequest@cengage.com.

Library of Congress Control Number: 2009928574
ISBN-13: 978-0-495-39178-4
ISBN-10: 0-495-39178-6

 Brooks/Cole
 20 Channel Center Street
 Boston, MA 02210
USA
 Cengage Learning products are represented in Canada by Nelson Education, Ltd.


For your course and learning solutions, visit www.cengage.com Purchase any of our products at your local college store or at our preferred online store www.ichapters.com 



Printed in the United States of America
 1 2 3 4 5 6 7 13 12 11 10 09


About the Authors 

Kenneth N. Berk 

Kenneth N. Berk (Ph.D., University of Minnesota) is an emeritus professor of mathematics at Illinois State University and a Fellow of the American Statistical Association. Berk was editor of Software Reviews for the American Statistician for six years. He served as chair of the Statistical Computing Section of the American Statistical Association. He has twice co-chaired the annual Symposium on the Interface between Computing Science and Statistics. 

Patrick Carey 

Patrick Carey received his M.S. in biostatistics from the University of Wisconsin where he worked as a researcher in the General Clinical Research Center designing and analyzing clinical studies. He coauthored his first textbook with Ken Berk on using Excel as a statistical tool. He and his wife Joan founded Carey Associates, Inc., a software textbook development company. He has since authored or coauthored over 20 academic and trade texts for the software industry. Besides books on data analysis, Carey has written on the Windows® operating system, Web page design, database management, the Internet, browsers, and presentation graphics software. Patrick, Joan, and their six children live in Wisconsin.

I thank my wife Laura for her advice, because here she is
 the one who knows about publishing books.
 —Kenneth N. Berk 
Thanks to my wife, Joan, and my children, John Paul, Thomas,
 Peter, Michael, Stephen, and Catherine, for their love and
 support. 
—Patrick M. Carey


Preface

Introduction

 Data Analysis with Microsoft® Excel: Updated for Offi ce 2007® harnesses the power of Excel and transforms it into a tool for learning basic statistical analysis. Students learn statistics in the context of analyzing data. We feel that it is important for students to work with real data, analyzing real-world problems, so that they understand the subtleties and complexities of analysis that make statistics such an integral part of understanding our world. The data set topics range from business examples to physiological studies on NASA astronauts. Because students work with real data, they can appreciate that in statistics no answers are completely fi nal and that intuition and creativity are as much a part of data analysis as is plugging numbers into a software package. This text can serve as the core text for an introductory statistics course or as a supplemental text. It also allows nontraditional students outside of the classroom setting to teach themselves how to use Excel to analyze sets of real data so they can make informed business forecasts and decisions. 
Users of this book need not have any experience with Excel, although previous experience would be helpful. The fi rst three chapters of the book cover basic concepts of mouse and Windows operation, data entry, formulas and functions, charts, and editing and saving workbooks. Chapters 4 through 12 emphasize teaching statistics with Excel as the instrument.

Using Excel in a Statistics Course

 Spreadsheets have become one of the most popular forms of computer software, second only to word processors. Spreadsheet software allows the user to combine data, mathematical formulas, text, and graphics together in a single report or workbook. For this reason, spreadsheets have become indispensable tools for business, as they have also become popular in scientifi c research. Excel in particular has won a great deal of acclaim for its ease of use and power.

As spreadsheets have expanded in power and ease of use, there has been increased interest in using them in the classroom. There are many advantages to using Excel in an introductory statistics course. An important advantage is that students, particularly business students, are more likely to be familiar with spreadsheets and are more comfortable working with data entered into a spreadsheet. Since spreadsheet software is very common at colleges and universities, a statistics instructor can teach a course without requiring students to purchase an additional software package. Having identifi ed the strengths of Excel for teaching basic statistics, it would be unfair not to include a few warnings. Spreadsheets are not statistics packages, and there are limits to what they can do in replacing a full-featured statistics package. This is why we have included our own downloadable add-in, StatPlus™. It expands some of Excel’s statistical capabilities. (We explain the use of StatPlus where appropriate throughout the text.) Using Excel for anything other than an introductory statistics course would probably not be appropriate due to its limitations. For example, Excel can easily perform balanced two-way analysis of variance but not unbalanced two-way analysis of variance. Spreadsheets are also limited in handling data with missing values. While we recommend Excel for a basic statistics course, we feel it is not appropriate for more advanced analysis.

System Information 

You will need the following hardware and software to use Data Analysis with Microsoft® Excel: Updated for Offi ce 2007®: 
  • A Windows-based PC. 
  • Windows XP or Windows Vista.
  •  Excel 2007. If you are using an earlier edition of Excel, you will have to use an earlier edition of Data Analysis with Microsoft® Excel. 
  • Internet access for downloading the software fi les accompanying the text. 

The Data Analysis with Microsoft® Excel package includes: 

  • The text, which includes 12 chapters, a reference section for Excel’s statistical functions, Analysis ToolPak commands, StatPlus Add-In commands, and a bibliography. 
  • The companion website at www.cengage.com/statistics/berk contains 92 different data sets from real-life situations plus a summary of what the data set fi les cover, ten interactive Concept Tutorials, and installation fi les for StatPlus—our statistical application. Chapter 1 of the text includes instructions for installing the fi les. 
  • An Instructor’s Manual with solutions to all the exercises in the text is available, password-protected on the companion website, to adopting instructors

Excel’s Statistical Tools 

Excel comes with 81 statistical functions and 59 mathematical functions. There are also functions devoted to business and engineering problems. The statistical functions that basic Excel provides include descriptive statistics such as means, standard deviations, and rank statistics. There are also cumulative distribution and probability density functions for a variety of distributions, both continuous and discrete. 
The Analysis ToolPak is an add-in that is included with Excel. If you have not loaded the Analysis ToolPak, you will have to install it from your original Excel installation. 
The Analysis ToolPak adds the following capabilities to Excel: 

  • Analysis of variance, including one-way, two-way without replication, and two-way balanced with replication
  • Correlation and covariance matrices 
  • Tables of descriptive statistics 
  • One-parameter exponential smoothing 
  • Histograms with user-defi ned bin values 
  • Moving averages 
  • Random number generation for a variety of distributions 
  • Rank and percentile scores 
  • Multiple linear regression 
  • Random sampling 
  • t tests, including paired and two sample, assuming equal and unequal variances 
  • z tests 
In this book we make extensive use of the Analysis ToolPak for multiple linear regression problems and analysis of variance.

StatPlus™ 

Since the Analysis ToolPak does not do everything that an introductory statistics course requires, this textbook comes with an additional add-in called the StatPlus™ Add-In that fi lls in some of the gaps left by basic Excel 2007 and the Analysis ToolPak. 
Additional commands provided by the StatPlus Add-In give users the ability to: 
  • Create random sets of data 
  • Manipulate data columns 
  • Create random samples from large data sets 
  • Generate tables of univariate statistics
  • Create statistical charts including boxplots, histograms, and normal probability plots 
  • Create quality control charts 
  • Perform one-sample and two-sample t tests and z tests 
  • Perform non-parametric analyses 
  • Perform time series analyses, including exponential and seasonal smoothing 
  • Manipulate charts by adding data labels and breaking charts down into categories 
  • Perform non parametric analyses 
  • Create and analyze tabular data 
A full description of these commands is included in the Appendix’s Reference section and through on-line help available with the application.

Concept Tutorials 

Included with the StatPlus add-in are ten interactive Excel tutorials that provide students a visual and hands-on approach to learning statistical concepts. 
These tutorials cover: 
  • Boxplots 
  • Probability 
  • Probability distributions 
  • Random samples 
  • Population statistics 
  • The Central Limit Theorem 
  • Confidence intervals 
  • Hypothesis tests 
  • Exponential smoothing 
  • Linear regression

Acknowledgments 

We thank Mac Mendelsohn, Managing Editor at Course Technology, for his support and enthusiasm for the First Edition of this book. For this edition, our thanks to Jessica Rasile, Content Project Manager, Blue Bungalow Design for the cover design, and Carol A. Loomis, Copyeditor, for their professional attention to all the details of production.

Special thanks go to our reviewers, who gave us valuable insights into improving the book in each edition: Aaron S. Liswood, Sierra Nevada College; Abbot L. Packard, State University of West Georgia; Andrew E. Coop, US Air Force Academy; Barry Bombay, J. Sargeant Reynolds Community College; Beth Eschenback, Humboldt State University; Bruce Trumbo, California State University – Hayward; Carl Grafton, Auburn University; Carl R. Williams, University of Memphis; Cheryl Dale, William Carey College; Dang Tran, California State University – Los Angeles; Bruce Marsh, Texas A & M University – Kingsvile; Edward J. Williams, University of Michigan – Dearborn; Eric Zivot, University of Washington; Farrokh Alemi, George Mason University; Faye Teer, James Madison University; Gordon Dahl, University of Rochester; Ian Hardie, University of Maryland; Jack Harris, Hobart and William Smith Colleges; Ames E. Pratt, Cornell University; James Zumbrunnen, Colorado State University; John A. Austin, Jr., Louisiana State University – Shreveport; Kelwyn A. D’Souza, Hampton University; Kevin Griffi n, Eastern Arizona College; Lea Cloninger, University of Illinois at Chicago; Lorrie Hoffman, University of Central Florida; Marion G. Sobol, Southern Methodist University, and Matthew C. Dixon, USAF Academy.
We thank Laura Berk, Peter Berk, Robert Beyer, David Booth, Orlyn Edge, Stephen Friedberg, Maria Gillett, Richard Goldstein, Glenn Hart, Lotus Hershberger, Les Montgomery, Joyce Nervades, Diane Warfi eld, and Kemp Wills for their assistance with the data sets in this book. We especially want to thank Dr. Jeff Steagall, who wrote some of the original material for Chapter 12, Quality Control. If we have missed anyone, please forgive the omission.

Kenneth N. Berk 
Patrick M. Carey

Contents 


  • Chapter 1 
  • GETTING STARTED WITH EXCEL 1 
  • Getting Started 2 
  • Special Files for This Book 2 
  • Installing the StatPlus Files 2 
  • Excel and Spreadsheets 4 
  • Launching Excel 5 
  • Viewing the Excel Window 6 
  • Running Excel Commands 7 
  • Excel Workbooks and Worksheets 10 
  • Opening a Workbook 10 
  • Scrolling through a Workbook 11 
  • Worksheet Cells 14 
  • Selecting a Cell 14 
  • Moving Cells 16 
  • Printing from Excel 18 
  • Previewing the Print Job 18 
  • Setting Up the Page 19 
  • Printing the Page 21 
  • Saving Your Work 22 
  • Excel Add-Ins 24 
  • Loading the StatPlus Add-In 24
  •  Loading the Data Analysis ToolPak 28 
  • Unloading an Add-In 30 
  • Features of StatPlus 30 
  • Using StatPlus Modules 30 
  • Hidden Data 31 
  • Linked Formulas 32 
  • Setup Options 32 
  • Exiting Excel 34 
  • Chapter 2 
  • WORKING WITH DATA 35 
  • Data Entry 36 
  • Entering Data from the Keyboard 36 
  • Entering Data with Autofi ll 37 
  • Inserting New Data 40 
  • Data Formats 41 
  • Formulas and Functions 45 
  • Inserting a Simple Formula 46 
  • Inserting an Excel Function 47 
  • Cell References 50 
  • Range Names 51 
  • Sorting Data 54 
  • Querying Data 55 
  • Using the AutoFilter 56 
  • Using the Advanced Filter 59
  •  Using Calculated Values 62 
  • Importing Data from Text Files 63 
  • Importing Data from Databases 68 
  • Using Excel’s Database Query Wizard 68 
  • Specifying Criteria and Sorting Data 71 
  • Exercises 75 
  • Chapter 3 
  • WORKING WITH CHARTS 81 
  • Introducing Excel Charts 82 
  • Introducing Scatter Plots 86 
  • Editing a Chart 91 
  • Resizing and Moving an Embedded Chart 91 
  • Moving a Chart to a Chart Sheet 93
  •  Working with Chart and Axis Titles 94 
  • Editing the Chart Axes 97 
  • Working with Gridlines and Legends 100 
  • Editing Plot Symbols 102 
  • Identifying Data Points 105 
  • Selecting a Data Row 106 
  • Labeling Data Points 107 
  • Formatting Labels 109 
  • Creating Bubble Plots 110 
  • Breaking a Scatter Plot into Categories 117 
  • Plotting Several Variables 120 
  • Exercises 123 
  • Chapter 4 
  • DESCRIBING YOUR DATA 128 
  • Variables and Descriptive Statistics 129 
  • Frequency Tables 131 
  • Creating a Frequency Table 132 
  • Using Bins in a Frequency Table 134 
  • Defi ning Your Own Bin Values 136 
  • Working with Histograms 138 
  • Creating a Histogram 138 
  • Shapes of Distributions 141 
  • Breaking a Histogram into Categories 143 
  • Working with Stem and Leaf Plots 146 
  • Distribution Statistics 151 
  • Percentiles and Quartiles 151 
  • Measures of the Center: Means, Medians, and the Mode 154 
  • Measures of Variability 159 
  • Measures of Shape: Skewness and Kurtosis 162 
  • Outliers 164 
  • Working with Boxplots 165 
  • Concept Tutorials: Boxplots 166 Exercises 175 
  • Chapter 5 
  • PROBABILITY DISTRIBUTIONS 182 
  • Probability 183 
  • Probability Distributions 184 
  • Discrete Probability Distributions 185 
  • Continuous Probability Distributions 186 
  • Concept Tutorials: PDFs 187 
  • Random Variables and Random Samples 189 
  • Concept Tutorials: Random Samples 190
  •  The Normal Distribution 193 
  • Concept Tutorials: The Normal Distribution 194 
  • Excel Worksheet Functions 196 
  • Using Excel to Generate Random Normal Data 197 
  • Charting Random Normal Data 199 
  • The Normal Probability Plot 201 
  • Parameters and Estimators 205 
  • The Sampling Distribution 206 
  • Concept Tutorials: Sampling Distributions 211 
  • The Standard Error 212 
  • The Central Limit Theorem 212 
  • Concept Tutorials: The Central Limit Theorem 213 
  • Exercises 218 
  • Chapter 6
  •  STATISTICAL INFERENCE 224 
  • Confi dence Intervals 225 
  • z Test Statistic and z Values 225 
  • Calculating the Confi dence Interval with Excel 228 
  • Interpreting the Confi dence Interval 229 
  • Concept Tutorials: The Confi dence Interval 229 
  • Hypothesis Testing 232 
  • Types of Error 233 
  • An Example of Hypothesis Testing 234 
  • Acceptance and Rejection Regions 234 
  • p Values 235 
  • Concept Tutorials: Hypothesis Testing 236 
  • Additional Thoughts about Hypothesis Testing 239 
  • The t Distribution 240 
  • Concept Tutorials: The t Distribution 241 
  • Working with the t Statistic 242 
  • Constructing a t Confi dence Interval 243 
  • The Robustness of t 243 
  • Applying the t Test to Paired Data 244 
  • Applying a Nonparametric Test to Paired Data 250 
  • The Wilcoxon Signed Rank Test 250 
  • The Sign Test 253 
  • The Two-Sample t Test 255 
  • Comparing the Pooled and Unpooled Test Statistics 256 
  • Working with the Two-Sample t Statistic 256 
  • Testing for Equality of Variance 258 
  • Applying the t Test to Two-Sample Data 259 
  • Applying a Nonparametric Test to Two-Sample Data 265 
  • Final Thoughts about Statistical Inference 267 
  • Exercises 268 
  • Chapter 7 
  • TABLES 275 
  • PivotTables 276 
  • Removing Categories from a PivotTable 280 
  • Changing the Values Displayed by the PivotTable 282 
  • Displaying Categorical Data in a Bar Chart 283 
  • Displaying Categorical Data in a Pie Chart 285 
  • Two-Way Tables 288 Computing Expected Counts 291 
  • The Pearson Chi-Square Statistic 293 
  • Concept Tutorials: The x2 Distribution 293
  •  Working with the x2 Distribution in Excel 296 
  • Breaking Down the Chi-Square Statistic 297 
  • Other Table Statistics 297 
  • Validity of the Chi-Square Test with Small Frequencies 299 
  • x Contents Tables with Ordinal Variables 302 
  • Testing for a Relationship between Two Ordinal Variables 303 
  • Custom Sort Order 307 
  • Exercises 309 
  • Chapter 8 REGRESSION AND CORRELATION 313 
  • Simple Linear Regression 314 
  • The Regression Equation 314 
  • Fitting the Regression Line 315 
  • Regression Functions in Excel 316 
  • Exploring Regression 317 
  • Performing a Regression Analysis 318 
  • Plotting Regression Data 320 
  • Calculating Regression Statistics 323 
  • Interpreting Regression Statistics 325 
  • Interpreting the Analysis of Variance Table 326 
  • Parameter Estimates and Statistics 327 
  • Residuals and Predicted Values 328 
  • Checking the Regression Model 329 
  • Testing the Straight-Line Assumption 329 
  • Testing for Normal Distribution of the Residuals 331 
  • Testing for Constant Variance in the Residuals 332 
  • Testing for the Independence of Residuals 332 
  • Correlation 335 
  • Correlation and Slope 336 
  • Correlation and Causality 336 
  • Spearman’s Rank Correlation Coeffi cient s 337 
  • Correlation Functions in Excel 337 
  • Creating a Correlation Matrix 338 
  • Correlation with a Two-Valued Variable 342 
  • Adjusting Multiple p Values with Bonferroni 342 
  • Creating a Scatter Plot Matrix 343 
  • Exercises 345 
  • Chapter 9 
  • MULTIPLE REGRESSION 352 
  • Regression Models with Multiple Parameters 353 
  • Concept Tutorials: The F Distribution 353 
  • Using Regression for Prediction 355 
  • Regression Example: Predicting Grades 356 
  • Interpreting the Regression Output 358
  •  Multiple Correlation 359 
  • Coeffi cients and the Prediction Equation 361 
  • t Tests for the Coeffi cients 362 
  • Testing Regression Assumptions 363 
  • Observed versus Predicted Values 363 
  • Plotting Residuals versus Predicted Values 366 
  • Plotting Residuals versus Predictor Variables 368 
  • Normal Errors and the Normal Plot 370 
  • Summary of Calc Analysis 371 
  • Regression Example: Sex Discrimination 371 
  • Regression on Male Faculty 372 
  • Using a SPLOM to See Relationships 373 
  • Correlation Matrix of Variables 374 
  • Multiple Regression 376 
  • Interpreting the Regression Output 377 
  • Residual Analysis of Discrimination Data 377 
  • Normal Plot of Residuals 378 
  • Are Female Faculty Underpaid? 380 
  • Drawing Conclusions 385 
  • Exercises 386 
  • Chapter 10 
  • ANALYSIS OF VARIANCE 392 
  • One-Way Analysis of Variance 393
  •  Analysis of Variance Example: Comparing Hotel Prices 393 
  • Graphing the Data to Verify ANOVA Assumptions 395
  •  Computing the Analysis of Variance 397 
  • Interpreting the Analysis of Variance Table 399 
  • Comparing Means 402 
  • Using the Bonferroni Correction Factor 403 
  • When to Use Bonferroni 404 
  • Comparing Means with a Boxplot 405 
  • Contents xi One-Way Analysis of Variance and Regression 406
  •  Indicator Variables 406 
  • Fitting the Effects Model 408 
  • Two-Way Analysis of Variance 410 
  • A Two-Factor Example 410 
  • Two-Way Analysis Example: Comparing Soft Drinks 413 
  • Graphing the Data to Verify Assumptions 414 
  • The Interaction Plot 417 
  • Using Excel to Perform a Two-Way Analysis of Variance 419 
  • Interpreting the Analysis of Variance Table 422 
  • Summary 424 
  • Exercises 424 
  • Chapter 11 
  • TIME SERIES 431 
  • Time Series Concepts 432 
  • Time Series Example: The Rise in Global Temperatures 432 
  • Plotting the Global Temperature Time Series 433 
  • Analyzing the Change in Global Temperature 436 
  • Looking at Lagged Values 438 
  • The Autocorrelation Function 440 
  • Applying the ACF to Annual Mean Temperature 441 
  • Other ACF Patterns 443 
  • Applying the ACF to the Change in Temperature 444 
  • Moving Averages 445 
  • Simple Exponential Smoothing 448 
  • Forecasting with Exponential Smoothing 450 
  • Assessing the Accuracy of the Forecast 450 
  • Concept Tutorials: One-Parameter Exponential Smoothing 451 
  • Choosing a Value for w 455 
  • Two-Parameter Exponential Smoothing 457 
  • Calculating the Smoothed Values 458 
  • Concept Tutorials: Two-Parameter Exponential Smoothing 459 
  • Seasonality 462 
  • Multiplicative Seasonality 462 
  • Additive Seasonality 464 
  • Seasonal Example: Liquor Sales 464 
  • Examining Seasonality with a Boxplot 467 
  • Examining Seasonality with a Line Plot 468 
  • Applying the ACF to Seasonal Data 470 
  • Adjusting for Seasonality 471 
  • Three-Parameter Exponential Smoothing 473 
  • Forecasting Liquor Sales 474 
  • Optimizing the Exponential Smoothing Constant (optional) 479 
  • Exercises 482 
  • Chapter 12 
  • QUALITY CONTROL 487 
  • Statistical Quality Control 488 
  • Controlled Variation 489 
  • Uncontrolled Variation 489 
  • Control Charts 490 
  • Control Charts and Hypothesis Testing 492 
  • Variable and Attribute Charts 493 
  • Using Subgroups 493 
  • The x Chart 493 
  • Calculating Control Limits When s Is Known 494 
  • x Chart Example: Teaching Scores 495 
  • Calculating Control Limits When s Is Unknown 498
  •  x Chart Example: A Coating Process 500 
  • The Range Chart 502 
  • The C Chart 504 
  • C Chart Example: Factory Accidents 504 
  • The P Chart 506 
  • P Chart Example: Steel Rod Defects 507 
  • Control Charts for Individual Observations 509 
  • The Pareto Chart 513 
  • Exercises 517 
  • APPENDIX 521 
  • Excel Reference 581
  •  Bibliography 587 
  • Index 589 

#buttons=(Ok, Go it!) #days=(60)

Our website uses cookies to enhance your experience. Check Now
Ok, Go it!
Do you have any doubts? chat with us on WhatsApp
Hello, How can I help you? ...
Click me to start the chat...