What is this book

Note: If you would like to contribute to the instructor resources, please contribute here. If you have any questions or comments about this guide, you can contact us here.

This book provides additional resources for instructors who use the Cloud-Based Data Science MOOCs as part of their curriculum for teaching data science. Cloud-Based Data Science is the work of a team of data scientists and biostatisticians at Johns Hopkins University School of Public Health.

Cloud-Based Data Science (CBDS) is a free, massive open online educational program offered through Leanpub to help anyone who can read, write, and use a computer move into data science. CBDS lets students do data science using only a web browser and inexpensive computers like Chromebooks.

The CBDS program is entirely online, and students can take it for free. Students get a certificate from Leanpub for each CBDS online class. There are currently 12 courses offered in the Cloud-Based Data Science curriculum. Courses can be assigned as homework, so the teacher can flip the classroom and use class time to work on projects and answer questions. Alternatively, the instructor can have students take the courses during class time, depending on preference and school resources.

| Course | Course Description | Leanpub Link |
|---|---|---|
| Introduction to Cloud-Based Data Science | This is the first class in the Cloud-Based Data Science series. Data science is one of the most exciting and fastest growing careers in the world. The goal of this series is to help people with no background and limited resources transition into data science. The only prerequisites are a computer with a web browser and the ability to type and follow instructions. We guide you through the rest. | Course 0 |
| How to Use A Chromebook | This course will introduce you to using a Chromebook. The Introduction and Setup course might sound simple, but it will set up the infrastructure for success with the later, more challenging courses. | Course 1 |
| Google and the Cloud | The Google and the Cloud course introduces Google’s built-in apps, which form the fundamental backbone of a Chromebook. We’ll go step by step through the process of integrating these apps to form your productivity workflow. | Course 2 |
| Organizing Data Science Projects | Projects are central to the role of any data scientist. These lessons will discuss how to organize projects and the files that are part of each project and will introduce you to Markdown, a simple way to compile text documents to a standard format. | Course 3 |
| Version Control | GitHub is the world’s most popular version control website. Together, GitHub and Markdown provide a powerful way for you to get your code out to the world. In this course, we will tour GitHub, discussing the basic features of the website, what a repository is, and how to work with repositories on GitHub. | Course 4 |
| Introduction to R | R is a simple-to-learn programming language that is powerful for data analysis. The R Basics course will teach you how to get started from ground zero. We will discuss what objects and packages are, introduce some basic R commands, and discuss RMarkdown, which you will use to write all your reports and to develop a personal website. | Course 5 |
| Data Tidying | This course will focus on how to organize and tidy data sets in R. This is the first step most data scientists take before analyzing data! | Course 6 |
| Data Visualization | This course will cover the types of visualization most commonly used by data scientists, as well as how to make these plots in R. We will cover how to make basic tables and figures as well as how to make interactive graphics. | Course 7 |
| Getting Data | Data is often misunderstood in both subject and application. The Data course will focus on understanding what data is, what the data you’ll encounter will look like, and how to analyze and use data. Additionally, we’ll start to discuss important ethical and legal considerations when working in data science, where to find data, and how to work with these data in RStudio. | Course 8 |
| Data Analysis | This course will discuss the various types of data analysis, what to consider when carrying out an analysis, and how to approach a data analysis project. | Course 9 |
| Written and Oral Communication in Data Science | This course will discuss better practices for oral and written communication in data science. | Course 10 |
| Getting a Job in Data Science | After you learn all of these skills, it is still crucial that you learn the best ways to network and get a job in data science. This course will focus on so-called soft skills: how to give presentations, how to present yourself in the online community, how to network, and how to do data science interviews. | Course 11 |

Note: If the courses are used as part of a K-12 curriculum, instructors can skip the last two courses, which cover getting a job in data science and the soft skills such jobs require.