The fast and easy way to learn Python programming and statistics
Python is a general-purpose programming language created in the late 1980s and named after Monty Python. Thousands of people use it for tasks ranging from testing microchips at Intel, to powering Instagram, to building video games with the PyGame library.
Python For Data Science For Dummies is written for people who are new to data analysis. It covers the basics of Python programming for data analysis along with the statistics that go with it. The book also introduces Google Colab, which makes it possible to write Python code in the cloud.
- Get started with data science and Python
- Visualize information
- Wrangle data
- Learn from data
The book provides the statistical background needed to get started in data science programming, including probability, random distributions, hypothesis testing, confidence intervals, and building regression models for prediction.
Introduction
Part 1: Getting Started with Data Science and Python
Chapter 1: Discovering the Match between Data Science and Python
Chapter 2: Introducing Python’s Capabilities and Wonders
Chapter 3: Setting Up Python for Data Science
Chapter 4: Working with Google Colab
Part 2: Getting Your Hands Dirty with Data
Chapter 5: Understanding the Tools
Chapter 6: Working with Real Data
Chapter 7: Conditioning Your Data
Chapter 8: Shaping Data
Chapter 9: Putting What You Know in Action
Part 3: Visualizing Information
Chapter 10: Getting a Crash Course in MatPlotLib
Chapter 11: Visualizing the Data
Part 4: Wrangling Data
Chapter 12: Stretching Python’s Capabilities
Chapter 13: Exploring Data Analysis
Chapter 14: Reducing Dimensionality
Chapter 15: Clustering
Chapter 16: Detecting Outliers in Data
Part 5: Learning from Data
Chapter 17: Exploring Four Simple and Effective Algorithms
Chapter 18: Performing Cross-Validation, Selection, and Optimization
Chapter 19: Increasing Complexity with Linear and Nonlinear Tricks
Chapter 20: Understanding the Power of the Many
Part 6: The Part of Tens
Chapter 21: Ten Essential Data Resources
Chapter 22: Ten Data Challenges You Should Take
Index
- Learn Python data analysis programming and statistics
- Write code in the cloud with Google Colab™
- Wrangle data and visualize information
Relax! Data science doesn't have to be scary
Curious about data science, but a bit intimidated? Don't be! This book shows you how to use Python to do all sorts of cool things with data science. You'll see how to install the Anaconda tool suite, so working with Python is a breeze. You'll discover Google Colab, which lets you write code in the cloud using your tablet. You'll find out how to perform all kinds of interesting calculations using the latest version of Python. And you'll learn to use the various libraries that enable scientific statistical analysis, plotting and graphing, and much more.
Inside...
- Python set-up for data science
- Working with Jupyter Notebook
- Conditioning and shaping data
- Graphing with MatPlotLib
- Ways to analyze your data
- Getting more from Python
- Useful data science algorithms
- Ten essential data resources
Biographical note
John Paul Mueller is a tech editor and the author of over 100 books on topics from networking and home security to database management and heads-down programming. Follow John's blog at http://blog.johnmuellerbooks.com/. Luca Massaron is a data scientist who specializes in organizing and interpreting big data and transforming it into smart data. He is a Google Developer Expert (GDE) in machine learning.