A curated selection of resources for data science. This collection includes books, tools, cheatsheets and guides I've found relevant, organized by topic for easy reference.
-
Python and Data Science:
- Python Data Science Handbook: A comprehensive guide to data science in Python, including machine learning.
- Data Science Topics: A detailed, code-focused book with examples and snippets.
-
Causal Inference:
- Causal Inference for The Brave and True: A hands-on guide to causal inference using Python.
- Causal Inference: What If (the book): Covers Stata, Python, and R (also SAS and Julia).
- Applied Causal Inference Powered by ML and AI: An introduction to the emerging fusion of machine learning and causal inference.
-
Machine Learning:
-
Big Data:
-
Data Visualization:
- Plotly
- Bokeh: Includes dashboard building.
- Altair
- From Data to Viz: A tool to help you choose the most appropriate graph for your data, with source code in Python.
- Python Graph Gallery: A gallery of Python-based visualization examples.
- Datawrapper: A drag-and-drop tool for building graphs with ease.
-
Geospatial Analysis:
-
Machine Learning:
- Darts: A comprehensive time series forecasting library.
-
Web App Deployment:
-
Big Data:
-
Python and other languages:
- W3Schools: A great source for simple, editable code examples with clear explanations.
- GeeksforGeeks: A comprehensive resource for learning computer science and programming concepts, with tutorials, problem sets, and quizzes.
- Awesome List: A comprehensive list of computer science resources.
- Project Based Learning: Tutorials for building applications from scratch in various languages.
- Open Source Society University: A self-taught path to a free education in Computer Science.
- Developer Roadmaps: Diagrams with selected resources for each topic (for instance, machine learning); they help organize the concepts and provide study materials.
-
Machine Learning:
- mlcourse.ai: An open machine learning course.
- Harvard CS181: Machine Learning: Resources from Harvard's machine learning course.
- Stanford CS229: Resources from Stanford's machine learning course.
- Stanford CS231n: Convolutional Neural Networks for Visual Recognition
- Machine Learning Mastery: An extensive resource covering both theory and practice.
- MLOps Principles: Key principles for machine learning operations.
-
Data Visualization:
- The Data Visualisation Catalogue: A non-code-based library of various visualization types with some links to code examples.
- FlowingData: Non-code-based tips and suggestions for working with and designing data visualizations.
- Visual Vocabulary: Helps you decide which data relationship is most important in your data story.
- WTF Visualizations: Examples of poorly designed visualizations that can lead to misinformation.
-
Big Data:
- Spark by Examples: A comprehensive guide to PySpark based on examples.
- APIs & Databases: AnyAPI
- World Data:
- Nutrition Data: Nutritionix: The world's largest verified nutrition database.