As an aspiring data scientist, you appreciate why organizations rely on data for important decisions, whether it's for companies designing websites, cities deciding how to improve services, or scientists discovering how to stop the spread of disease. And you want the skills required to distill a messy pile of data into actionable insights. We call this the data science lifecycle: the process of collecting, wrangling, analyzing, and drawing conclusions from data.

Learning Data Science is the first book to cover foundational skills in both programming and statistics that encompass this entire lifecycle. It's aimed at those who wish to become data scientists or who already work with data scientists, and at data analysts who wish to cross the "technical/nontechnical" divide. If you have a basic knowledge of Python programming, you'll learn how to work with data using industry-standard tools like pandas.

Refine a question of interest to one that can be studied with data
Pursue data collection that may involve text processing, web scraping, etc.
Glean valuable insights about data through data cleaning, exploration, and visualization
Learn how to use modeling to describe the data
Generalize findings beyond the data
Learning Data Science is the first book to cover foundational skills in both programming and statistics that encompass the entire data science lifecycle: the process of collecting, wrangling, analyzing, and drawing conclusions from data.

Product details

ISBN: 9781098113001
Published: 2023-09-30
Publisher: O'Reilly Media
Height: 233 mm
Width: 178 mm
Age level: G, 01
Language: English
Format: Paperback
Number of pages: 600

Authors

Biographical note

Sam Lau is a PhD candidate at UC San Diego. He designs novel interfaces for learning and teaching data science, and his research has been published in top-tier conferences in human-computer interaction and end-user programming. Sam instructed and helped design flagship data science courses at UC Berkeley. These courses have grown to serve thousands of students every year, and their curriculum is used by universities across the world.

Joseph (Joey) Gonzalez is an assistant professor in the EECS department at UC Berkeley and a founding member of the new UC Berkeley RISE Lab. His research interests are at the intersection of machine learning and data systems, including dynamic deep neural networks for transfer learning, accelerated deep learning for high-resolution computer vision, and software platforms for autonomous vehicles. Joey is also co-founder of Turi Inc. (formerly GraphLab), which was based on his work on the GraphLab and PowerGraph systems. Turi was recently acquired by Apple Inc.

Deborah (Deb) Nolan is Professor of Statistics and Associate Dean for Undergraduate Studies in the Division of Computing, Data Science, and Society at the University of California, Berkeley, where she holds the Zaffaroni Family Chair in Undergraduate Education. Her research has involved the empirical process, high-dimensional modeling, and, more recently, technology in education and reproducible research. Her pedagogical approach connects research, practice, and education, and she is co-author of four textbooks: Stat Labs, Teaching Statistics, Data Science in R, and Communicating with Data.