Data science has become a very popular term in recent years. At its core, it describes an interdisciplinary field that draws on the following disciplines:
- Mathematics
- Statistics
- Chemometrics
- Information Science
- Computer Science
- Probability Theory
- Machine Learning
- Statistical Learning
- Data Mining
- Database
- Data Engineering
- Pattern Recognition
- Visualization
- Predictive Analytics
- Uncertainty Modeling
- Data Warehousing
- Data Compression
- Computer Programming
- Artificial Intelligence
- High Performance Computing
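Data science covers a vast body of knowledge, including data mining techniques that depend on domain-specific expertise, which is why the field emerged at the intersection of many complex disciplines. Since this is a data science lab, we could hardly skip recommending data science books from abroad. Below is a collection of free data science books published overseas. Not all of them are new, but we hope they help readers understand the field in more depth. A few subjects are must-knows: R and Python, the two mainstream data science languages, along with theory and tool books on artificial intelligence, machine learning, data mining, and databases, are essential learning for anyone entering the field.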
About Data Science
- An Introduction to Data Science (2013)
- School of Data Handbook (2015)
- Data Jujitsu: The Art of Turning Data into Product (2012)
- Art of Data Science (2015)
About Big Data
- Disruptive Possibilities: How Big Data Changes Everything (2013)
- Real-Time Big Data Analytics: Emerging Architecture (2013)
- Big Data Now: 2012 Edition (2012)
About Data Analysis
- The Elements of Data Analytic Style (2015)
- The Data Science Handbook (2015)
- The Data Analytics Handbook (2015)
About Data Science Teams
- Data Driven: Creating a Data Culture (2015)
- Building Data Science Teams (2011)
- Understanding the Chief Data Officer (2015)
About Distributed Analytics
About Python
- Think Python: How to Think Like a Computer Scientist (2012)
- Python Programming (2015)
- Automate the Boring Stuff with Python: Practical Programming for Total Beginners (2015)
- Learn Python the Hard Way (2013)
- Dive Into Python 3 (2009)
About R
About SQL
About Data Mining and Machine Learning
- Introduction to Machine Learning (2008)
- Machine Learning (2009)
- Machine Learning – The Complete Guide
- Social Media Mining: An Introduction (2014)
- Data Mining: Practical Machine Learning Tools and Techniques (2005)
- Mining of Massive Datasets (2014)
- A Programmer’s Guide to Data Mining (2015)
- Data Mining with Rattle and R (2011)
- Data Mining and Analysis: Fundamental Concepts and Algorithms (2014)
- Probabilistic Programming & Bayesian Methods for Hackers (2015)
- Data Mining Techniques For Marketing, Sales, and Customer Relationship Management (2004)
- Inductive Logic Programming: Techniques and Applications (1994)
- Pattern Recognition and Machine Learning (2006)
- Machine Learning, Neural and Statistical Classification (1999)
- Information Theory, Inference, and Learning Algorithms (2005)
- Data Mining and Business Analytics with R (2013)
- Bayesian Reasoning and Machine Learning (2014)
- Gaussian Processes for Machine Learning (2006)
- Reinforcement Learning: An Introduction (2012)
- Algorithms for Reinforcement Learning (2009)
- Big Data, Data Mining, and Machine Learning (2014)
- Modeling With Data (2008)
- KB – Neural Data Mining with Python Sources (2013)
- Deep Learning (2015)
- Neural Networks and Deep Learning (2015)
- Data Mining Algorithms In R (2014)
- Theory and Applications for Advanced Text Mining (2012)
About Statistics and Statistical Learning
- Think Stats: Exploratory Data Analysis in Python (2014)
- Think Bayes: Bayesian Statistics Made Simple (2012)
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2008)
- An Introduction to Statistical Learning with Applications in R (2013)
- A First Course in Design and Analysis of Experiments (2010)
About Data Visualization
About Data Science Tools
- Natural Language Processing with Python (2009)
- Computer Vision (2010)
- Concise Computer Vision (2010)
- Artificial Intelligence: A Modern Approach, 1st Edition (1995)
References: