Material for R Programming for Data Science

About the Author

Roger D. Peng

Roger D. Peng is a Professor of Statistics and Data Sciences at the University of Texas, Austin. Previously, he was Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health. His research focuses on the development of statistical methods for addressing environmental health problems and on developing tools for doing better data analysis. He is the author of the popular book R Programming for Data Science and 10 other books on data science and statistics. He is also the co-creator of the Johns Hopkins Data Science Specialization, the Simply Statistics blog where he writes about statistics for the public, the Not So Standard Deviations podcast with Hilary Parker, and The Effort Report podcast with Elizabeth Matsui. Roger is a Fellow of the American Statistical Association and is the recipient of the Mortimer Spiegelman Award from the American Public Health Association, which honors a statistician who has made outstanding contributions to public health. He can be found on Twitter and GitHub at @rdpeng.

Table of Contents

1: Introduction

  1. 1.1: Stay in Touch!
  2. 1.2: Preface

2: History and Overview of R

  1. 2.1: What is R?
  2. 2.2: What is S?
  3. 2.3: The S Philosophy
  4. 2.4: Back to R
  5. 2.5: Basic Features of R
  6. 2.6: Free Software
  7. Exercise 1
  8. 2.7: Design of the R System
  9. Exercise 2
  10. 2.8: Limitations of R
  11. Exercise 3
  12. 2.9: R Resources
  13. 2.9.1: Official Manuals
  14. 2.9.2: Useful Standard Texts on S and R
  15. 2.9.3: Other Resources
  16. Quiz 1

3: Getting Started with R

  1. 3.1: Installation
  2. 3.2: Getting started with the R interface

4: R Nuts and Bolts

  1. 4.1: Entering Input
  2. Exercise 4
  3. 4.2: Evaluation
  4. Exercise 5
  5. 4.3: R Objects
  6. Exercise 6
  7. 4.4: Numbers
  8. Exercise 7
  9. 4.5: Attributes
  10. Exercise 8
  11. 4.6: Creating Vectors
  12. Exercise 9
  13. 4.7: Mixing Objects
  14. Exercise 10
  15. 4.8: Explicit Coercion
  16. 4.9: Matrices
  17. Exercise 11
  18. 4.10: Lists
  19. Exercise 12
  20. 4.11: Factors
  21. Exercise 13
  22. 4.12: Missing Values
  23. Exercise 14
  24. 4.13: Data Frames
  25. Exercise 15
  26. 4.14: Names
  27. 4.15: Summary
  28. Exercise 16
  29. Quiz 2

5: Getting Data In and Out of R

  1. 5.1: Reading and Writing Data
  2. Exercise 17
  3. 5.2: Reading Data Files with read.table()
  4. Exercise 18
  5. 5.3: Reading in Larger Datasets with read.table
  6. Exercise 19
  7. 5.4: Calculating Memory Requirements for R Objects
  8. Exercise 20
  9. Quiz 3

6: Using the readr Package

  1. Exercise 21
  2. Quiz 4

7: Using Textual and Binary Formats for Storing Data

  1. Exercise 22
  2. 7.1: Using dput() and dump()
  3. Exercise 23
  4. 7.2: Binary Formats
  5. Exercise 24
  6. Quiz 5

8: Interfaces to the Outside World

  1. 8.1: File Connections
  2. Exercise 25
  3. 8.2: Reading Lines of a Text File
  4. Exercise 26
  5. 8.3: Reading From a URL Connection
  6. Exercise 27
  7. Quiz 6

9: Subsetting R Objects

  1. 9.1: Subsetting a Vector
  2. Exercise 28
  3. 9.2: Subsetting a Matrix
  4. 9.2.1: Dropping matrix dimensions
  5. Exercise 29
  6. 9.3: Subsetting Lists
  7. Exercise 30
  8. 9.4: Subsetting Nested Elements of a List
  9. Exercise 31
  10. 9.5: Extracting Multiple Elements of a List
  11. Exercise 32
  12. 9.6: Partial Matching
  13. Exercise 33
  14. 9.7: Removing NA Values
  15. Exercise 34
  16. Quiz 7

10: Vectorized Operations

  1. Exercise 35
  2. 10.1: Vectorized Matrix Operations
  3. Exercise 36
  4. Quiz 8

11: Dates and Times

  1. 11.1: Dates in R
  2. Exercise 37
  3. 11.2: Times in R
  4. Exercise 38
  5. 11.3: Operations on Dates and Times
  6. Exercise 39
  7. 11.4: Summary
  8. Exercise 40
  9. Quiz 9

12: Managing Data Frames with the dplyr package

  1. 12.1: Data Frames
  2. 12.2: The dplyr Package
  3. 12.3: dplyr Grammar
  4. 12.3.1: Common dplyr Function Properties
  5. Exercise 41
  6. 12.4: Installing the dplyr package
  7. Exercise 42
  8. 12.5: select()
  9. Exercise 43
  10. 12.6: filter()
  11. Exercise 44
  12. 12.7: arrange()
  13. Exercise 45
  14. 12.8: rename()
  15. Exercise 46
  16. 12.9: mutate()
  17. Exercise 47
  18. 12.10: group_by()
  19. Exercise 48
  20. 12.11: %>%
  21. Exercise 49
  22. 12.12: Summary
  23. Quiz 10

13: Control Structures

  1. 13.1: if-else
  2. Exercise 50
  3. 13.2: for Loops
  4. 13.3: Nested for loops
  5. Exercise 51
  6. 13.4: while Loops
  7. 13.5: repeat Loops
  8. Exercise 52
  9. 13.6: next, break
  10. Exercise 53
  11. 13.7: Summary
  12. Exercise 54
  13. Quiz 11

14: Functions

  1. 14.1: Functions in R
  2. Exercise 55
  3. 14.2: Your First Function
  4. Exercise 56
  5. 14.3: Argument Matching
  6. Exercise 57
  7. 14.4: Lazy Evaluation
  8. 14.5: The ... Argument
  9. Exercise 58
  10. 14.6: Arguments Coming After the ... Argument
  11. Exercise 59
  12. 14.7: Summary
  13. Exercise 60
  14. Quiz 12

15: Scoping Rules of R

  1. 15.1: A Diversion on Binding Values to Symbol
  2. Exercise 61
  3. 15.2: Scoping Rules
  4. Exercise 62
  5. 15.3: Lexical Scoping: Why Does It Matter?
  6. Exercise 63
  7. 15.4: Lexical vs. Dynamic Scoping
  8. Exercise 64
  9. 15.5: Application: Optimization
  10. Exercise 65
  11. 15.6: Plotting the Likelihood
  12. Exercise 66
  13. 15.7: Summary
  14. Quiz 13

16: Coding Standards for R

  1. Exercise 67
  2. Quiz 14

17: Loop Functions

  1. 17.1: Looping on the Command Line
  2. Exercise 68
  3. 17.2: lapply()
  4. Exercise 69
  5. 17.3: sapply()
  6. Exercise 70
  7. 17.4: split()
  8. Exercise 71
  9. 17.5: Splitting a Data Frame
  10. Exercise 72
  11. 17.6: tapply
  12. Exercise 73
  13. 17.7: apply()
  14. Exercise 74
  15. 17.8: Col/Row Sums and Means
  16. Exercise 75
  17. 17.9: Other Ways to Apply
  18. Exercise 76
  19. 17.10: mapply()
  20. Exercise 77
  21. 17.11: Vectorizing a Function
  22. Exercise 78
  23. 17.12: Summary
  24. Quiz 15

18: Regular Expressions

  1. 18.1: Before You Begin
  2. 18.2: Primary R Functions
  3. Exercise 79
  4. 18.3: grep()
  5. Exercise 80
  6. 18.4: grepl()
  7. Exercise 81
  8. 18.5: regexpr()
  9. Exercise 82
  10. 18.6: sub() and gsub()
  11. Exercise 83
  12. 18.7: regexec()
  13. Exercise 84
  14. 18.8: The stringr Package
  15. Exercise 85
  16. 18.9: Summary
  17. Exercise 86
  18. Quiz 16

19: Debugging

  1. 19.1: Something’s Wrong!
  2. Exercise 87
  3. 19.2: Figuring Out What’s Wrong
  4. Exercise 88
  5. 19.3: Debugging Tools in R
  6. Exercise 89
  7. 19.4: Using traceback()
  8. Exercise 90
  9. 19.5: Using debug()
  10. Exercise 91
  11. 19.6: Using recover()
  12. Exercise 92
  13. 19.7: Summary
  14. Exercise 93
  15. Quiz 17

20: Profiling R Code

  1. 20.1: Using system.time()
  2. Exercise 94
  3. 20.2: Timing Longer Expressions
  4. Exercise 95
  5. 20.3: The R Profiler
  6. Exercise 96
  7. 20.4: Using summaryRprof()
  8. Exercise 97
  9. 20.5: Summary
  10. Exercise 98
  11. Quiz 18

21: Simulation

  1. 21.1: Generating Random Numbers
  2. Exercise 99
  3. 21.2: Setting the random number seed
  4. 21.3: Simulating a Linear Model
  5. Exercise 100
  6. 21.4: Random Sampling
  7. Exercise 101
  8. 21.5: Summary
  9. Exercise 102
  10. Quiz 19

22: Data Analysis Case Study: Changes in Fine Particle Air Pollution in the U.S.

  1. 22.1: Synopsis
  2. 22.2: Loading and Processing the Raw Data
  3. 22.2.1: Reading in the 1999 data
  4. 22.2.2: Reading in the 2012 data
  5. Exercise 103
  6. 22.3: Results
  7. 22.3.1: Entire U.S. analysis
  8. 22.3.2: Changes in PM levels at an individual monitor
  9. 22.3.3: Changes in state-wide PM levels
  10. Exercise 104
  11. Quiz 20

23: Parallel Computation

  1. 23.1: Hidden Parallelism
  2. 23.1.1: Parallel BLAS
  3. Exercise 105
  4. 23.2: Embarrassing Parallelism
  5. Exercise 106
  6. 23.3: The Parallel Package
  7. 23.3.1: mclapply()
  8. 23.3.2: Error Handling
  9. Exercise 107
  10. 23.4: Example: Bootstrapping a Statistic
  11. 23.4.1: Generating Random Numbers
  12. 23.4.2: Using the boot package
  13. 23.5: Building a Socket Cluster
  14. 23.6: Summary
  15. Quiz 21

24: Why I Indent My Code 8 Spaces

25: About the Author
