Resources Available: Data Science Textbooks
Data science has grown rapidly as enterprises use data to drive their decisions. This approach follows a systematic framework whose stages (data collection, cleaning, visualization, and mining) are both iterative and adaptive.
While this domain is still relatively new, many resources are available to practitioners and newcomers alike, such as video tutorials, blog posts, and online data science competitions. In this article, we go over some excellent textbooks that will help you get started or develop a new skill in data science.
Data Science Foundation
In this section, we introduce books that focus on a skill that is crucial in data science: solving problems by programming in R and Python. These books are a useful resource for practitioners who want to review coding material for data visualization and modeling. For those starting out in the field, they will build a strong foundation in key concepts and algorithms using the two most widely supported languages in data science.
1. Data Science Using Python and R
Authors: Chantal D. Larose & Daniel T. Larose
Publisher: Wiley, 1st Edition
Originally Published: April 2019
Number of Pages: 256
This book introduces data science through the Data Science Methodology, a scientific framework that takes an iterative and adaptive approach to the analysis of data. Each chapter breaks down data mining algorithms and pairs them with code snippets and exercises in both Python and R. For example, it introduces decision trees and demonstrates step by step how to fit one to a data set. An easy, digestible book for starting to code in both R and Python right away!
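To give a flavor of what modeling a decision tree involves, here is a minimal sketch in plain Python. It is not taken from the book: the toy data, the function names `gini` and `best_split`, and the choice of Gini impurity as the split criterion are all assumptions made for illustration. It finds the single best threshold for one numeric feature, which is the basic step a decision tree repeats at each node.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    Returns (None, parent_impurity) if no split improves on the unsplit data.
    """
    n = len(values)
    best = (None, gini(labels))
    # Every distinct value except the largest is a candidate threshold.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: hours studied and the exam outcome.
hours = [1, 2, 3, 4, 5, 6]
outcome = ["fail", "fail", "fail", "pass", "pass", "pass"]

threshold, impurity = best_split(hours, outcome)
print(threshold, impurity)
```

On this toy data the split `hours <= 3` separates the two classes completely, so the weighted impurity drops to zero. A full decision tree would apply the same search recursively to each resulting subset and across several features.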
2. Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python
Authors: Peter Bruce, Andrew Bruce & Peter Gedeck
Publisher: O’Reilly Media, 2nd Edition
Originally Published: June 2020
Number of Pages: 368
This book organizes the key statistical concepts that are relevant to data science. It explains practical statistics well and is easy to navigate as a reference, with example outputs and plots in both Python and R. A great refresher for anyone interested in statistics for data science!
3. R for Everyone: Advanced Analytics and Graphics
Authors: Jared P. Lander
Publisher: Addison-Wesley Professional, 2nd Edition
Originally Published: June 2017
Number of Pages: 560
This book focuses primarily on using R to help readers without a programming or statistics background build powerful statistical models for research projects. R for Everyone strikes the right balance between analytics, communication, and computer science. The reader will learn to use R packages such as Tidyverse and Shiny. Additionally, the author explains complex statistical concepts clearly and concisely.
Data Visualizations and Storytelling
In this section, we introduce books that focus on creating effective data visualizations and on different ways to tell a story with data. These books will help analysts build data visualizations and write insightful reports to present to business stakeholders.
4. Storytelling with Data: A Data Visualization Guide for Business Professionals
Authors: Cole Nussbaumer Knaflic
Publisher: Wiley, 1st Edition
Originally Published: November 2015
Number of Pages: 288
This book discusses guidelines for creating data visualizations and for telling a story with your data effectively. The reader will be able to apply this information to any data exploration task, supported by concepts from business marketing and information management. An excellent book for learning the core principles of designing data visualizations and the power of storytelling!
5. Storytelling with Data: Let’s Practice!
Authors: Cole Nussbaumer Knaflic
Publisher: Wiley, 1st Edition
Originally Published: October 2019
Number of Pages: 448
This textbook is an expanded follow-up to the previous book, with more exercises that give the reader an immersive experience of being a data storyteller. Its supplementary material helps the reader hone the skills needed to communicate insights to an audience. Additionally, the reader will be able to practice critical thinking and problem-solving.
6. Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals
Authors: Brent Dykes
Publisher: Wiley, 1st Edition
Originally Published: December 2019
Number of Pages: 336
This book is for data practitioners who are passionate about using data to tell a compelling story and drive better business decisions. It offers detailed guides on choosing the right data visualizations and communication methods to make an impact on the organization. Effective Data Storytelling teaches readers how to communicate data insights to business stakeholders.
Machine Learning
In this section, we introduce books that cover predictive data analytics, applied data mining, and machine learning. These books will help analysts with data preparation for modeling, different types of models, and model evaluation.
7. Fundamentals of Machine Learning for Predictive Data Analytics
Authors: John D. Kelleher, Brian Mac Namee & Aoife D’Arcy
Publisher: The MIT Press, 2nd Edition
Originally Published: October 2020
Number of Pages: 856
This book provides a comprehensive introduction to predictive data analytics, covering a broad range of machine learning approaches. The target audience is readers who are new to machine learning. It pairs theoretical concepts with practical applications of machine learning, such as price prediction, document classification, and customer segmentation. Each chapter explains the underlying function and behavior of the models in a business context.
8. Introduction to Data Mining
Authors: Pang-Ning Tan, Michael Steinbach, Anuj Karpatne & Vipin Kumar
Publisher: Pearson, 2nd Edition
Originally Published: January 2018
Number of Pages: 864
This book provides a thorough overview of data mining algorithms, outlining the fundamental concepts clearly and concisely. The reader will gain an understanding of the mathematical concepts behind each data mining algorithm and of its pseudocode implementation. Each algorithm is covered extensively with examples that emphasize the importance and functionality of the model.
9. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
Authors: Aurélien Géron
Publisher: O’Reilly Media, 2nd Edition
Originally Published: October 2019
Number of Pages: 856
This book covers concepts in machine learning, with Python and college-level mathematics as prerequisites. The reader will learn the concepts and tools needed to implement models ranging from linear regression to deep learning techniques. The book teaches production-ready Python frameworks: Scikit-Learn, TensorFlow, and Keras.
10. Building Machine Learning Powered Applications: Going from Idea to Product
Authors: Emmanuel Ameisen
Publisher: O’Reilly Media, 1st Edition
Originally Published: February 2020
Number of Pages: 260
This textbook teaches the reader about the deployment phase of a data science project. The reader will learn how to design, build, and deploy machine learning applications, going from idea to product. Building Machine Learning Powered Applications is best suited to data scientists, product designers, and software engineers. It walks through detailed steps for deploying data science applications, illustrated with practical industry examples.
Data Science Preparation
In this section, we cover books that show how to jumpstart your career as a data scientist. These books will guide you through the necessary steps and teach you the data science tools used in industry.
11. Build a Career in Data Science
Authors: Emily Robinson & Jacqueline Nolis
Publisher: Manning Publications, 1st Edition
Originally Published: March 2020
Number of Pages: 354
This textbook is excellent for readers who are interested in becoming data scientists. Readers will learn how to create a data science portfolio from scratch. Build a Career in Data Science is a well-written, comprehensive guide to crafting a robust resume and acing data science interviews. It also outlines the skills needed to become a data scientist.
12. Learning SQL: Generate, Manipulate, and Retrieve Data
Authors: Alan Beaulieu
Publisher: O’Reilly Media, 3rd Edition
Originally Published: April 2020
Number of Pages: 384
Learning SQL is an introductory textbook for beginners with no background knowledge or experience working with relational databases. It teaches SQL concepts using MySQL, balancing the basics with advanced features. It also provides thorough explanations of abstract SQL concepts and step-by-step syntax tutorials on querying, manipulating, and retrieving data in MySQL.
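As a rough illustration of generating, manipulating, and retrieving data with SQL, here is a minimal sketch. It is not from the book: it swaps in SQLite (through Python's built-in `sqlite3` module) for MySQL so that it runs without a database server, and the `employee` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for a MySQL server (an assumption
# made so the example is self-contained).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Generate: create a hypothetical table and insert a few rows.
cur.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
cur.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [
        ("Ana", "Analytics", 70000),
        ("Ben", "Analytics", 80000),
        ("Cy", "Engineering", 90000),
    ],
)

# Manipulate: give everyone in one department a raise.
cur.execute(
    "UPDATE employee SET salary = salary + 5000 WHERE dept = ?",
    ("Analytics",),
)

# Retrieve: head count and average salary per department.
cur.execute(
    "SELECT dept, COUNT(*), AVG(salary) "
    "FROM employee GROUP BY dept ORDER BY dept"
)
rows = cur.fetchall()
conn.close()

for dept, headcount, avg_salary in rows:
    print(dept, headcount, avg_salary)
```

The `CREATE TABLE`, `INSERT`, `UPDATE`, and `SELECT ... GROUP BY` statements used here are standard SQL, so they should carry over to MySQL with little or no change, although column types and client setup differ between the two systems.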
13. OpenIntro Statistics
Authors: David Diez, Mine Cetinkaya-Rundel & Christopher Barr
Publisher: OpenIntro, Inc., 4th Edition
Originally Published: May 2019
Number of Pages: 422
OpenIntro Statistics is a useful statistics textbook for working professionals and an excellent introduction for readers who are new to the subject. Statistics plays a crucial role in data science projects and serves as the backbone of every data analysis report. The reader will gain a strong foundation in statistical analysis and modeling, along with the statistical concepts needed to prepare for data science interviews.
Conclusion
In summary, the textbooks listed above fall into four themes: data science foundation, data visualization and effective storytelling, predictive analytics and machine learning, and data science career preparation. The data science foundation books highlight R and Python as the most widely supported programming languages in data science projects. Then, we introduced books that cover how to create effective data visualizations and tell stories with data. Next, we discussed textbooks that teach data mining algorithms and machine learning models, from scratch to deployment. Lastly, we shared a handful of textbooks that can help you jumpstart your data science career and prepare for data science interviews.
Thank you for reading this article. We hope you find this information useful in guiding your data science education and career.
If you would like to learn more about data science, the Master of Science in Applied Data Science (MS-ADS) program at the University of San Diego offers courses that will help you master these skills.