Reading for Data Analytics

Section Title
3.2 Data Indexing and Selection
3.3 Operating on Data in Pandas
3.5 Hierarchical Indexing
3.6 Combining Datasets: Concat and Append
3.7 Combining Datasets: Merge and Join
3.8 Aggregation and Grouping
3.9 Pivot Tables
3.11 Working with Time Series
3.12 High-Performance Pandas: eval() and query()

Reading from the handbook

Chapter Title
25 MapReduce

Reading from the scratch book

Further reading

Research papers

MapReduce: simplified data processing on large clusters. J. Dean and S. Ghemawat. Communications of the ACM, January 2008.

A relational model of data for large shared data banks. E. F. Codd. Communications of the ACM, June 1970.

Finding the frequent items in streams of data. G. Cormode and M. Hadjieleftheriou. Communications of the ACM, October 2009.