An introductory textbook on data analysis and statistics written especially for students in the social sciences and allied fields
Quantitative analysis is an increasingly essential skill for social science research, yet students in the social sciences and related areas typically receive little training in it--or if they do, they usually end up in statistics classes that offer few insights into their field. This textbook is a practical introduction to data analysis and statistics written especially for undergraduates and beginning graduate students in the social sciences and allied fields, such as economics, sociology, public policy, and data science. Quantitative Social Science engages directly with empirical analysis, showing students how to analyze data using the R programming language and to interpret the results--it encourages hands-on learning, not paper-and-pencil statistics. More than forty data sets taken directly from leading quantitative social science research illustrate how data analysis can be used to answer important questions about society and human behavior. Proven in the classroom, this one-of-a-kind textbook features numerous additional data analysis exercises and interactive R programming exercises, and also comes with supplementary teaching materials for instructors.
* Written especially for students in the social sciences and allied fields, including economics, sociology, public policy, and data science
* Provides hands-on instruction using R programming, not paper-and-pencil statistics
* Includes more than forty data sets from actual research for students to test their skills on
* Covers data analysis concepts such as causality, measurement, and prediction, as well as probability and statistical tools
* Features a wealth of supplementary exercises, including additional data analysis exercises and interactive programming exercises
* Offers a solid foundation for further study
* Comes with additional course materials online, including notes, sample code, exercises and problem sets with solutions, and lecture slides
List of Tables
List of Figures
Preface

1 Introduction
1.1 Overview of the Book
1.2 How to Use this Book
1.3 Introduction to R
1.3.1 Arithmetic Operations
1.3.2 Objects
1.3.3 Vectors
1.3.4 Functions
1.3.5 Data Files
1.3.6 Saving Objects
1.3.7 Packages
1.3.8 Programming and Learning Tips
1.4 Summary
1.5 Exercises
1.5.1 Bias in Self-Reported Turnout
1.5.2 Understanding World Population Dynamics

2 Causality
2.1 Racial Discrimination in the Labor Market
2.2 Subsetting the Data in R
2.2.1 Logical Values and Operators
2.2.2 Relational Operators
2.2.3 Subsetting
2.2.4 Simple Conditional Statements
2.2.5 Factor Variables
2.3 Causal Effects and the Counterfactual
2.4 Randomized Controlled Trials
2.4.1 The Role of Randomization
2.4.2 Social Pressure and Voter Turnout
2.5 Observational Studies
2.5.1 Minimum Wage and Unemployment
2.5.2 Confounding Bias
2.5.3 Before-and-After and Difference-in-Differences Designs
2.6 Descriptive Statistics for a Single Variable
2.6.1 Quantiles
2.6.2 Standard Deviation
2.7 Summary
2.8 Exercises
2.8.1 Efficacy of Small Class Size in Early Education
2.8.2 Changing Minds on Gay Marriage
2.8.3 Success of Leader Assassination as a Natural Experiment

3 Measurement
3.1 Measuring Civilian Victimization during Wartime
3.2 Handling Missing Data in R
3.3 Visualizing the Univariate Distribution
3.3.1 Bar Plot
3.3.2 Histogram
3.3.3 Box Plot
3.3.4 Printing and Saving Graphs
3.4 Survey Sampling
3.4.1 The Role of Randomization
3.4.2 Nonresponse and Other Sources of Bias
3.5 Measuring Political Polarization
3.6 Summarizing Bivariate Relationships
3.6.1 Scatter Plot
3.6.2 Correlation
3.6.3 Quantile-Quantile Plot
3.7 Clustering
3.7.1 Matrix in R
3.7.2 List in R
3.7.3 The k-Means Algorithm
3.8 Summary
3.9 Exercises
3.9.1 Changing Minds on Gay Marriage: Revisited
3.9.2 Political Efficacy in China and Mexico
3.9.3 Voting in the United Nations General Assembly

4 Prediction
4.1 Predicting Election Outcomes
4.1.1 Loops in R
4.1.2 General Conditional Statements in R
4.1.3 Poll Predictions
4.2 Linear Regression
4.2.1 Facial Appearance and Election Outcomes
4.2.2 Correlation and Scatter Plots
4.2.3 Least Squares
4.2.4 Regression towards the Mean
4.2.5 Merging Data Sets in R
4.2.6 Model Fit
4.3 Regression and Causation
4.3.1 Randomized Experiments
4.3.2 Regression with Multiple Predictors
4.3.3 Heterogeneous Treatment Effects
4.3.4 Regression Discontinuity Design
4.4 Summary
4.5 Exercises
4.5.1 Prediction Based on Betting Markets
4.5.2 Election and Conditional Cash Transfer Program in Mexico
4.5.3 Government Transfer and Poverty Reduction in Brazil

5 Discovery
5.1 Textual Data
5.1.1 The Disputed Authorship of The Federalist Papers
5.1.2 Document-Term Matrix
5.1.3 Topic Discovery
5.1.4 Authorship Prediction
5.1.5 Cross Validation
5.2 Network Data
5.2.1 Marriage Network in Renaissance Florence
5.2.2 Undirected Graph and Centrality Measures
5.2.3 Twitter-Following Network
5.2.4 Directed Graph and Centrality
5.3 Spatial Data
5.3.1 The 1854 Cholera Outbreak in London
5.3.2 Spatial Data in R
5.3.3 Colors in R
5.3.4 US Presidential Elections
5.3.5 Expansion of Walmart
5.3.6 Animation in R
5.4 Summary
5.5 Exercises
5.5.1 Analyzing the Preambles of Constitutions
5.5.2 International Trade Network
5.5.3 Mapping US Presidential Election Results over Time

6 Probability
6.1 Probability
6.1.1 Frequentist versus Bayesian
6.1.2 Definition and Axioms
6.1.3 Permutations
6.1.4 Sampling with and without Replacement
6.1.5 Combinations
6.2 Conditional Probability
6.2.1 Conditional, Marginal, and Joint Probabilities
6.2.2 Independence
6.2.3 Bayes' Rule
6.2.4 Predicting Race Using Surname and Residence Location
6.3 Random Variables and Probability Distributions
6.3.1 Random Variables
6.3.2 Bernoulli and Uniform Distributions
6.3.3 Binomial Distribution
6.3.4 Normal Distribution
6.3.5 Expectation and Variance
6.3.6 Predicting Election Outcomes with Uncertainty
6.4 Large Sample Theorems
6.4.1 The Law of Large Numbers
6.4.2 The Central Limit Theorem
6.5 Summary
6.6 Exercises
6.6.1 The Mathematics of Enigma
6.6.2 A Probability Model for Betting Market Election Prediction
6.6.3 Election Fraud in Russia

7 Uncertainty
7.1 Estimation
7.1.1 Unbiasedness and Consistency
7.1.2 Standard Error
7.1.3 Confidence Intervals
7.1.4 Margin of Error and Sample Size Calculation in Polls
7.1.5 Analysis of Randomized Controlled Trials
7.1.6 Analysis Based on Student's t-Distribution
7.2 Hypothesis Testing
7.2.1 Tea-Tasting Experiment
7.2.2 The General Framework
7.2.3 One-Sample Tests
7.2.4 Two-Sample Tests
7.2.5 Pitfalls of Hypothesis Testing
7.2.6 Power Analysis
7.3 Linear Regression Model with Uncertainty
7.3.1 Linear Regression as a Generative Model
7.3.2 Unbiasedness of Estimated Coefficients
7.3.3 Standard Errors of Estimated Coefficients
7.3.4 Inference about Coefficients
7.3.5 Inference about Predictions
7.4 Summary
7.5 Exercises
7.5.1 Sex Ratio and the Price of Agricultural Crops in China
7.5.2 File Drawer and Publication Bias in Academic Research
7.5.3 The 1932 German Election in the Weimar Republic

8 Next

General Index
R Index
"The author has masterfully balanced careful explanations of the quantitative theory with the practical computer implementation of the methods applied to real world data sets. . . . That Quantitative Social Science: An Introduction is carefully written, detailed, and interactive makes it useful either as a textbook for a lecture course or for self-study. . . . I highly recommend the book to anyone looking for an introduction to data science." --Jason M. Graham, Mathematical Association of America Reviews

"Kosuke Imai has produced a superb hands-on introduction to modern quantitative methods in the social sciences. Placing practical data analysis front and center, this book is bound to become a standard reference in the field of quantitative social science and an indispensable resource for students and practitioners alike." --Alberto Abadie, Massachusetts Institute of Technology

"The search for a good undergraduate social science textbook is eternal, but with Imai's book, the search may well be over. It covers a host of cutting-edge issues in quantitative analysis, from causality and inference to its use of R so that students can advance in both their research and work lives. Imai plots a new way for us to think about how to teach undergraduate methods." --Nathaniel Beck, New York University

"Kosuke Imai's book takes a very novel and interesting approach to a first quantitative methods course for the social sciences. Focusing on interesting questions from the beginning, he starts by introducing the potential outcome approach to causality, and proceeds to present the reader with a wide range of methods for an admirably broad range of settings, including textual, network, and spatial data. Integrated with the methodological discussions are examples with detailed R code. Readers who work through this book will be well equipped to use modern methods for data analysis in the social sciences. I highly recommend this book!" --Guido W. Imbens, coauthor of Causal Inference for Statistics, Social, and Biomedical Sciences

"This important new book seeks to democratize quantitative social science. In it, one of the world's foremost political methodologists shows how you can join the movement that has changed so much of the academic, commercial, government, and nonprofit worlds. It provides a seamless path from ignorance to insight in a few hundred clear and enlightening pages." --Gary King, Harvard University

"Imai's new textbook has the potential to totally transform how undergraduate statistics is taught. The focus is on data analysis first and statistics second. It is full of great and relevant empirical examples. Students will engage this book rather than dread it." --Christopher Winship, Harvard University

"This is the ideal book for a first class on data analysis. Not only does it provide students with a clear, accessible, and technically correct introduction to research design, computing with data, and statistical inference, but it does what truly great introductions to a topic all do--it generates excitement." --Kevin M. Quinn, University of California, Berkeley

"Finally, a statistics text has caught up with rapid developments in the social sciences in the last two decades, spanning everything from the rediscovery of design, randomization, and causality to Bayesian approaches. From the organization of the subject matter (e.g., causality, measurement, uncertainty) to the mode of presentation, Imai has produced a work that is both comprehensive and accessible, but reflects the vast breadth of topics and approaches today's social scientists are expected to know. The examples are extremely well chosen, a delight to read, and accompanied by R code. Social science finally has an introductory book that presents statistics as it is practiced at the research frontier today, not thirty years ago." --Simon Jackman, United States Studies Centre, University of Sydney

"Imai's new book on quantitative social science represents a groundbreaking and effective method for teaching statistics and quantitative methods to students in any number of fields--ranging from public health and medicine to education and political science. The motivating examples, clear and engaging exposition, and easy implementation for students will make it a resource they (and their instructors) turn to again and again." --Elizabeth Stuart, Johns Hopkins Bloomberg School of Public Health

"Imai's fantastic textbook provides a succinct but thorough introduction to quantitative methods and how they are applied to social science problems. The text is easy to read while also providing material that is generally pitched at a level appropriate for newcomers to the subject." --Justin Grimmer, Stanford University

"Imai's text is engaging and full of examples. It will be widely taught and will have a wide impact. Anyone who really masters the skills and concepts presented here will know statistics better than many professional political scientists." --Andrew Eggers, University of Oxford

Product details

ISBN: 9780691167039
Published: 2017
Publisher: Princeton University Press
Weight: 1049 g
Height: 254 mm
Width: 178 mm
Age level: U, P, 05, 06
Language: English
Format: Hardcover
Number of pages: 432

Author

Biographical note

Kosuke Imai is professor of politics and founding director of the Program in Statistics and Machine Learning at Princeton University.