Data science from scratch : first principles with Python / Joel Grus.

Grus, Joel. (Author).

Record details

  • ISBN: 9781491901427
  • Physical Description: xvi, 311 páginas : ilustraciones ; 24 cm.
  • Publisher: Sebastopol, CA : O´Reilly, [2015].

Content descriptions

General Note:
Includes index.
Immediate Source of Acquisition Note:
IPICYT ; Purchase/R.3357 ; 2016.
Language Note:
In English.
Subject:
Python (Computer programming language).
Database management.
Data structures (Computer science).

Available copies

  • 1 of 1 copy available at IPICYT.

Holds

  • 0 current holds with 1 total copy.

Location | Call Number / Copy Notes | Barcode | Shelving Location | Status | Due Date
Biblioteca Ipicyt | QA76.73.P98G7 D3 2015 | LCI00970 | Coleccion General | Available | -

Preface
1. Introduction
The Ascendance of Data
What Is Data Science?
Motivating Hypothetical: DataSciencester
Finding Key Connectors
Data Scientists You May Know
Salaries and Experience
Paid Accounts
Topics of Interest
Onward
2. A Crash Course in Python
The Basics
Getting Python
The Zen of Python
Whitespace Formatting
Modules
Arithmetic
Functions
Strings
Exceptions
Lists
Tuples
Dictionaries
Sets
Control Flow
Truthiness
The Not-So-Basics
Sorting
List Comprehensions
Generators and Iterators
Randomness
Regular Expressions
Object-Oriented Programming
Functional Tools
enumerate
zip and Argument Unpacking
args and kwargs
Welcome to DataSciencester!
For Further Exploration
3. Visualizing Data
matplotlib
Line Charts
Bar Charts
Scatterplots
For Further Exploration
4. Linear Algebra
Vectors
Matrices
For Further Exploration
5. Statistics
Describing a Single Set of Data
Central Tendencies
Dispersion
Correlation
Simpson's Paradox
Some Other Correlational Caveats
Correlation and Causation
For Further Exploration
6. Probability
Dependence and Independence
Conditional Probability
Bayes's Theorem
Random Variables
Continuous Distributions
The Normal Distribution
The Central Limit Theorem
For Further Exploration
7. Hypothesis and Inference
Statistical Hypothesis Testing
Example: Flipping a Coin
Confidence Intervals
P-hacking
Example: Running an A/B Test
Bayesian Inference
For Further Exploration
8. Gradient Descent
The Idea Behind Gradient Descent
Estimating the Gradient
Using the Gradient
Choosing the Right Step Size
Putting It All Together
Stochastic Gradient Descent
For Further Exploration
9. Getting Data
stdin and stdout
Reading Files
The Basics of Text Files
Delimited Files
Scraping the Web
HTML and the Parsing Thereof
Example: O'Reilly Books About Data
Using APIs
JSON (and XML)
Using an Unauthenticated API
Finding APIs
Example: Using the Twitter APIs
Getting Credentials
For Further Exploration
10. Working with Data
Exploring Your Data
Exploring One-Dimensional Data
Two Dimensions
Many Dimensions
Cleaning and Munging
Manipulating Data
Rescaling
Dimensionality Reduction
For Further Exploration
11. Machine Learning
Modeling
What Is Machine Learning?
Overfitting and Underfitting
Correctness
The Bias-Variance Trade-off
Feature Extraction and Selection
For Further Exploration
12. k-Nearest Neighbors
The Model
Example: Favorite Languages
The Curse of Dimensionality
For Further Exploration
13. Naive Bayes
A Really Dumb Spam Filter
A More Sophisticated Spam Filter
Implementation
Testing Our Model
For Further Exploration
14. Simple Linear Regression
The Model
Using Gradient Descent
Maximum Likelihood Estimation
For Further Exploration
15. Multiple Regression
The Model
Further Assumptions of the Least Squares Model
Fitting the Model
Interpreting the Model
Goodness of Fit
Digression: The Bootstrap
Standard Errors of Regression Coefficients
Regularization
For Further Exploration
16. Logistic Regression
The Problem
The Logistic Function
Applying the Model
Goodness of Fit
Support Vector Machines
For Further Investigation
17. Decision Trees
What Is a Decision Tree?
Entropy
The Entropy of a Partition
Creating a Decision Tree
Putting It All Together
Random Forests
For Further Exploration
18. Neural Networks
Perceptrons
Feed-Forward Neural Networks
Backpropagation
Example: Defeating a CAPTCHA
For Further Exploration
19. Clustering
The Idea
The Model
Example: Meetups
Choosing k
Example: Clustering Colors
Bottom-up Hierarchical Clustering
For Further Exploration
20. Natural Language Processing
Word Clouds
n-gram Models
Grammars
An Aside: Gibbs Sampling
Topic Modeling
For Further Exploration
21. Network Analysis
Betweenness Centrality
Eigenvector Centrality
Matrix Multiplication
Directed Graphs and PageRank
For Further Exploration
22. Recommender Systems
Manual Curation
Recommending What's Popular
User-Based Collaborative Filtering
Item-Based Collaborative Filtering
For Further Exploration
23. Databases and SQL
CREATE TABLE and INSERT
UPDATE
DELETE
SELECT
GROUP BY
ORDER BY
JOIN
Subqueries
Indexes
Query Optimization
NoSQL
For Further Exploration
24. MapReduce
Example: Word Count
Why MapReduce?
MapReduce More Generally
Example: Analyzing Status Updates
Example: Matrix Multiplication
An Aside: Combiners
For Further Exploration
25. Go Forth and Do Data Science
IPython
Mathematics
Not from Scratch
NumPy
pandas
scikit-learn
Visualization
R
Find Data
Do Data Science
Hacker News
Fire Trucks
T-shirts
And You?
Index