1. Stay in Touch!
2. Preface
3. Getting Started with R
- 3.1 Installation
- 3.2 Getting started with the R interface
4. Managing Data Frames with the dplyr package
- 4.1 Data Frames
- 4.2 The dplyr Package
- 4.3 dplyr Grammar
- 4.4 Installing the dplyr package
- 4.5 select()
- 4.6 filter()
- 4.7 arrange()
- 4.8 rename()
- 4.9 mutate()
- 4.10 group_by()
- 4.11 %>%
- 4.12 Summary
5. Exploratory Data Analysis Checklist
- 5.1 Formulate your question
- 5.2 Read in your data
- 5.3 Check the packaging
- 5.4 Run str()
- 5.5 Look at the top and the bottom of your data
- 5.6 Check your “n”s
- 5.7 Validate with at least one external data source
- 5.8 Try the easy solution first
- 5.9 Challenge your solution
- 5.10 Follow up questions
6. Principles of Analytic Graphics
- 6.1 Show comparisons
- 6.2 Show causality, mechanism, explanation, systematic structure
- 6.3 Show multivariate data
- 6.4 Integrate evidence
- 6.5 Describe and document the evidence
- 6.6 Content, Content, Content
- 6.7 References
7. Exploratory Graphs
- 7.1 Characteristics of exploratory graphs
- 7.2 Air Pollution in the United States
- 7.3 Getting the Data
- 7.4 Simple Summaries: One Dimension
- 7.5 Five Number Summary
- 7.6 Boxplot
- 7.7 Histogram
- 7.8 Overlaying Features
- 7.9 Barplot
- 7.10 Simple Summaries: Two Dimensions and Beyond
- 7.11 Multiple Boxplots
- 7.12 Multiple Histograms
- 7.13 Scatterplots
- 7.14 Scatterplot - Using Color
- 7.15 Multiple Scatterplots
- 7.16 Summary
8. Plotting Systems
- 8.1 The Base Plotting System
- 8.2 The Lattice System
- 8.3 The ggplot2 System
- 8.4 References
9. Graphics Devices
- 9.1 The Process of Making a Plot
- 9.2 How Does a Plot Get Created?
- 9.3 Graphics File Devices
- 9.4 Multiple Open Graphics Devices
- 9.5 Copying Plots
- 9.6 Summary
10. The Base Plotting System
- 10.1 Base Graphics
- 10.2 Simple Base Graphics
- 10.3 Some Important Base Graphics Parameters
- 10.4 Base Plotting Functions
- 10.5 Base Plot with Regression Line
- 10.6 Multiple Base Plots
- 10.7 Summary
11. Plotting and Color in R
- 11.1 Colors 1, 2, and 3
- 11.2 Connecting colors with data
- 11.3 Color Utilities in R
- 11.4 colorRamp()
- 11.5 colorRampPalette()
- 11.6 RColorBrewer Package
- 11.7 Using the RColorBrewer palettes
- 11.8 The smoothScatter() function
- 11.9 Adding transparency
- 11.10 Summary
12. Hierarchical Clustering
- 12.1 Hierarchical clustering
- 12.2 How do we define close?
- 12.3 Example: Euclidean distance
- 12.4 Example: Manhattan distance
- 12.5 Example: Hierarchical clustering
- 12.6 Prettier dendrograms
- 12.7 Merging points: Complete
- 12.8 Merging points: Average
- 12.9 Using the heatmap() function
- 12.10 Notes and further resources
13. K-Means Clustering
- 13.1 Illustrating the K-means algorithm
- 13.2 Stopping the algorithm
- 13.3 Using the kmeans() function
- 13.4 Building heatmaps from K-means solutions
- 13.5 Notes and further resources
14. Dimension Reduction
- 14.1 Matrix data
- 14.2 Patterns in rows and columns
- 14.3 Related problem
- 14.4 SVD and PCA
- 14.5 Unpacking the SVD: u and v
- 14.6 SVD for data compression
- 14.7 Components of the SVD - Variance explained
- 14.8 Relationship to principal components
- 14.9 What if we add a second pattern?
- 14.10 Dealing with missing values
- 14.11 Example: Face data
- 14.12 Notes and further resources
15. The ggplot2 Plotting System: Part 1
- 15.1 The Basics: qplot()
- 15.2 Before You Start: Label Your Data
- 15.3 ggplot2 “Hello, world!”
- 15.4 Modifying aesthetics
- 15.5 Adding a geom
- 15.6 Histograms
- 15.7 Facets
- 15.8 Case Study: MAACS Cohort
- 15.9 Summary of qplot()
16. The ggplot2 Plotting System: Part 2
- 16.1 Basic Components of a ggplot2 Plot
- 16.2 Example: BMI, PM2.5, Asthma
- 16.3 Building Up in Layers
- 16.4 First Plot with Point Layer
- 16.5 Adding More Layers: Smooth
- 16.6 Adding More Layers: Facets
- 16.7 Modifying Geom Properties
- 16.8 Modifying Labels
- 16.9 Customizing the Smooth
- 16.10 Changing the Theme
- 16.11 More Complex Example
- 16.12 A Quick Aside about Axis Limits
- 16.13 Resources
17. Data Analysis Case Study: Changes in Fine Particle Air Pollution in the U.S.
- 17.1 Synopsis
- 17.2 Loading and Processing the Raw Data
- 17.3 Results
