ANU UG/Degree 2nd Sem Introduction to Data Science and R Programming Material

ANU UG/Degree 2nd Sem Introduction to Data Science and R Programming Material for the B.Sc. Data Science course is now available.

R for Data Science: A Robust Alternative

While Python gets much of the attention in data science, R is a strong alternative in its own right. Here is why R is a good choice for data science:

Strengths of R for Data Science:

  • Statistical Analysis Powerhouse: R boasts a rich heritage in statistics, offering a vast collection of robust statistical methods and tests readily available within the core language. This makes it ideal for hypothesis testing, regression analysis, time series analysis, and more.
  • Tidyverse Ecosystem: Similar to Python's data science libraries, R's "tidyverse" collection of packages provides a cohesive and intuitive framework for data manipulation, wrangling, and visualization. This streamlines the data analysis process and promotes reproducible workflows.
  • Interactive Environment: RStudio, the most popular IDE for R, offers an interactive environment where you can code, explore data, and visualize results on the fly. This fosters rapid prototyping and iterative analysis, allowing you to quickly adjust your approach based on insights.
  • Outstanding Graphics: R is well known for its graphics capabilities through packages like ggplot2. This package lets you create high-quality, publication-ready visualizations with fine control over layout and appearance.
  • Strong Community and Resources: R enjoys a dedicated and active community of data scientists and statisticians who contribute extensive documentation, tutorials, and online resources. This ensures readily available support and guidance as you navigate your data science journey.

However, it's important to consider some aspects when choosing R:

  • Learning Curve: Compared to Python, R's syntax might initially appear less intuitive, especially for newcomers to programming. While the tidyverse helps ease the transition, dedication and consistent practice are essential.
  • General-Purpose Utility: Python's wider applicability beyond data science might be advantageous for projects involving tasks like web development or automation.

Ultimately, the choice between R and Python depends on individual preferences and project requirements. If you value:

  • Statistical depth and robust analysis tools
  • An interactive and visualization-rich environment
  • A tightly integrated data manipulation framework

then R could be a good fit for your data science work.
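
As a small illustration of the built-in statistical tools mentioned above, here is a minimal sketch that uses only base R and the `mtcars` dataset that ships with it. The choice of variables (`mpg`, `wt`, `am`) is ours, made purely for illustration; it is not taken from the course material.

```r
# mtcars is a built-in dataset: fuel economy and specifications for 32 cars
data(mtcars)

# Linear regression: miles per gallon as a function of car weight
fit <- lm(mpg ~ wt, data = mtcars)
slope <- coef(fit)[["wt"]]

# Welch two-sample t-test: mpg of automatic (am = 0) vs manual (am = 1) cars
tt <- t.test(mpg ~ am, data = mtcars)

# Print the fitted model summary and the test result
print(summary(fit))
print(tt)
```

Both `lm()` and `t.test()` come from the `stats` package, which is loaded by default, so no extra installation is needed to try this.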


  1. Defining Data Science and Big data
  2. Benefits and Uses
  3. Facets of Data
  4. Data Science Process
  5. History and Overview of R
  6. Getting Started with R
  7. R Nuts and Bolts

UNIT II: The Data Science Process

  1. Overview of the Data Science Process
  2. Setting the research goal
  3. Retrieving Data
  4. Data Preparation
  5. Exploration
  6. Modeling
  7. Data Presentation and Automation
  8. Getting Data in and out of R
  9. Using the readr package
  10. Interfaces to the outside world
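
For the "Getting Data in and out of R" topic, a minimal sketch with base R might look like the following. The data frame `scores` and its columns are hypothetical example data, and a temporary file is used so that nothing is left on disk.

```r
# Hypothetical example data (not from the course material)
scores <- data.frame(
  student = c("A", "B", "C"),
  marks   = c(72, 85, 90)
)

# Write the data frame to a temporary CSV file, without row names
path <- tempfile(fileext = ".csv")
write.csv(scores, path, row.names = FALSE)

# Read it back in; keep text columns as character vectors
loaded <- read.csv(path, stringsAsFactors = FALSE)
str(loaded)

# Remove the temporary file
unlink(path)
```

The readr package offers `read_csv()` and `write_csv()` as tidyverse counterparts to these base functions.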

UNIT III: Machine Learning

  1. Understanding why data scientists use machine learning
  2. What is machine learning and why should we care about it
  3. Applications of machine learning in data science
  4. Where it is used in data science
  5. The modeling process
  6. Types of Machine Learning: Supervised and Unsupervised

UNIT IV: Handling Large Data on a Single Computer

  1. The problems we face when handling large data
  2. General Techniques for handling large volumes of data
  3. General programming tips for dealing with large datasets


  1. Subsetting R objects
  2. Vectorised Operations
  3. Managing Data Frames with the dplyr package
  4. Control structures
  5. Functions
  6. Scoping rules of R
  7. Coding Standards in R
  8. Loop Functions
  9. Debugging
  10. Simulation
  11. Case studies on preliminary data analysis
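
Several of the topics in this list (subsetting, vectorised operations, functions, loop functions) can be previewed in a few lines of base R. This is a minimal sketch: the vector `x`, the helper `square_plus`, and the data frame `df` are hypothetical names chosen for illustration.

```r
x <- c(4, 9, 16, 25)

# Subsetting: keep only the elements that satisfy a condition
big <- x[x > 5]

# Vectorised operation: sqrt() is applied to every element at once
roots <- sqrt(x)

# A user-defined function with a default argument
square_plus <- function(v, k = 1) {
  v^2 + k
}

# Loop function: sapply() calls square_plus on each of 1, 2, 3
res <- sapply(1:3, square_plus)

# Subsetting rows of a data frame with a logical condition
df <- data.frame(id = 1:4, value = x)
df_sub <- df[df$value > 5, ]

print(big)
print(roots)
print(res)
print(df_sub)
```

The same row filter could be written with dplyr as `filter(df, value > 5)`, which is the style covered under "Managing Data Frames".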


1. Davy Cielen, Arno D. B. Meysman, Mohamed Ali, “Introducing Data Science”, Manning Publications.

2. Roger D. Peng, “R Programming for Data Science”, Lean Publishing, 2015.


1. Nina Zumel, John Mount, “Practical Data Science with R”, Manning Publications, 2014.

2. Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort, Abhijit Dasgupta, “Practical Data Science Cookbook”, Packt Publishing Ltd., 2014.

