Posts

Assignment #10: Building Your Own R Package

# Proposal for My R Package: Friedman

For this project, I am proposing an R package called **Friedman**. The purpose of this package is to provide beginner-friendly tools for simple data analysis and visualization in R. Many students and new R users struggle with repetitive tasks such as cleaning datasets, calculating summary statistics, and creating clear visualizations. This package will help make those tasks easier by grouping useful functions into one simple package.

The main audience for this package is students, beginner analysts, and anyone who wants a more straightforward way to explore data in R. Instead of writing long code repeatedly, users will be able to call functions from Friedman to quickly summarize and visualize their data. This makes the package especially useful for class assignments, small research projects, and practice with R programming.

Some of the key functions I plan to implement include:

- `clean_data()` – removes missing values or standardizes column n...

Assignment #9: Visualization in R – Base Graphics, Lattice, and ggplot2

For this assignment, I used the airquality dataset from the datasets package in R. I chose this dataset because it includes several quantitative variables such as Ozone, Temperature, and Wind, along with a grouping variable, Month, which made it a good fit for comparing different graphing systems in R. After loading the dataset, I removed missing values and converted the Month variable into a factor so it would work better in grouped and faceted plots. Using base R, I created a scatter plot of Temperature versus Ozone and a histogram of Wind. Base R was straightforward and easy to use for simple plots, but it required more manual setup for labels and appearance. With lattice, I created a conditioned scatter plot of Ozone versus Temperature by Month and a boxplot of Wind by Month. Lattice was useful for grouped displays and small multiples, and it made conditioning by category very simple. Finally, with ggplot2, I created a scatter plot with a regression line and a facete...
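The data preparation and the two base R plots described above can be sketched roughly as follows. This is a minimal sketch and not the code from the post: the object name `aq`, the axis labels, and the null PDF device are my own choices, and note that the temperature column in `airquality` is actually named `Temp`.

```r
# Load the built-in dataset and drop every row that has a missing value
data("airquality")
aq <- na.omit(airquality)

# Treat Month as a grouping factor (5 = May through 9 = September)
aq$Month <- factor(aq$Month)

# Draw to a null PDF device so the sketch also runs non-interactively
# (assumption: in an interactive session this line and dev.off() can be dropped)
pdf(NULL)

# Base R scatter plot: the column is called Temp, not Temperature
plot(aq$Temp, aq$Ozone,
     xlab = "Temperature (degrees F)", ylab = "Ozone (ppb)",
     main = "Ozone vs. Temperature")

# Base R histogram of wind speed
hist(aq$Wind, xlab = "Wind (mph)", main = "Distribution of Wind")

dev.off()
```

Converting `Month` to a factor before plotting is what lets lattice and ggplot2 treat it as a set of discrete panels instead of a continuous number.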

Module # 8 Input/Output, string manipulation and plyr package

In this assignment, I worked with a student dataset in R and practiced importing data, calculating summary statistics, filtering observations, and exporting results to files. These steps helped reinforce basic data manipulation techniques that are commonly used in data analysis. First, I imported the dataset into R using the read.table() function and loaded several required packages, including plyr. After the dataset was loaded, I used the ddply() function from the plyr package to group the data by Sex and calculate the mean grade for each category. This allowed me to quickly compare the average grades between the groups. After calculating these results, I wrote the output to a file so it could be saved and used outside of R. Next, I created a filtered version of the dataset that only included students whose names contained the letter "i". To do this, I used the subset() function combined with grepl() to search for the letter within the Name column. This created a smalle...
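The group-then-filter-then-export steps described above might look something like the sketch below. It is only an illustration under stated assumptions: the `students` data frame and its values are invented, the `Grade` column name is a guess, and base R `aggregate()` stands in for `ddply()` so the example runs without plyr installed.

```r
# Hypothetical stand-in for the student dataset (all values are invented)
students <- data.frame(
  Name  = c("Raul", "Booker", "Lauri", "Leonie", "Sherlyn"),
  Sex   = c("Male", "Male", "Female", "Female", "Female"),
  Grade = c(80, 65, 90, 75, 85)
)

# Mean grade per Sex, using base R aggregate() as a stand-in for ddply().
# With plyr loaded, a comparable call would be:
#   plyr::ddply(students, "Sex", plyr::summarise, Grade.Average = mean(Grade))
grade_by_sex <- aggregate(Grade ~ Sex, data = students, FUN = mean)

# Keep only students whose Name contains the letter "i" (any case)
i_students <- subset(students, grepl("i", Name, ignore.case = TRUE))

# Export the filtered rows; a temp file keeps the sketch side-effect free
out_file <- tempfile(fileext = ".csv")
write.csv(i_students, out_file, row.names = FALSE)
```

`grepl()` returns one TRUE/FALSE per row, which is exactly the shape `subset()` expects for its filter condition.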

Module # 7 R Object: S3 vs. S4 assignment

data("mtcars")
head(mtcars, 6)
str(mtcars)
class(mtcars)
typeof(mtcars)

What I found:
class(mtcars) → "data.frame"
typeof(mtcars) → "list" (because data frames are lists under the hood)
str(mtcars) shows it's a list of columns, and each column is numeric.

Yes. Generic functions work great with mtcars. A generic function is a function that chooses which version of the function to run based on the class of the object you pass in. Example: print() is a generic.

print(mtcars)    # uses print.data.frame behind the scenes
summary(mtcars)  # uses summary.data.frame

To check that mtcars is not an S4 object and to list the methods behind each generic:

isS4(mtcars)        # FALSE
methods("print")    # shows lots of print methods
methods("summary")

If a generic function didn't work, it would usually be because the object's class has no method implemented for that generic (so it falls back to a default method or errors).

Step 3: Can S3 and S4 be assigned to this dataset? ✅...
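To make the dispatch idea above concrete, here is a small sketch of a home-made S3 generic. The generic `describe()` and its methods are hypothetical names invented for this illustration; they are not part of base R or of the assignment.

```r
data("mtcars")

# A hypothetical S3 generic: UseMethod() picks a method from class(x)
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) {
  "no specific method"
}

# Method chosen for objects whose class includes "data.frame"
describe.data.frame <- function(x, ...) {
  sprintf("data frame with %d rows and %d columns", nrow(x), ncol(x))
}

msg_df  <- describe(mtcars)  # dispatches to describe.data.frame
msg_int <- describe(1:3)     # no describe.integer, so falls back to default
```

This mirrors what happens with `print(mtcars)`: R looks at the class, finds `print.data.frame`, and would fall back to `print.default` if no matching method existed.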

Matrix Operations and Diagonal Construction in R

# Question 1
# Define matrices A and B
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)

# a) A + B
A_plus_B <- A + B

# b) A - B
A_minus_B <- A - B

A
B
A_plus_B
A_minus_B

# Question 2
# Create diagonal matrix with values 4,1,2,3
D <- diag(c(4, 1, 2, 3))
D

# Question 3
# Generate the required 5x5 matrix
M <- diag(3, 5)
M[2:5, 1] <- 2
M[1, 2:5] <- 1
M

In this assignment, I practiced basic matrix operations in R, including matrix addition, subtraction, and constructing matrices using the diag() function. First, I defined matrices A and B using the matrix() function. I then computed A + B and A − B to demonstrate how R performs element-wise arithmetic operations on matrices of the same dimensions. This reinforces how R handles structured data efficiently using vectorized operations. Next, I used the diag() function to construct a 4×4 diagonal matrix with the values 4, ...
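As a quick check on the element-wise behaviour described above, the sketch below compares A + B and A − B against values worked out by hand, and shows what happens when the dimensions do not match. The matrix `C_bad` and the variable names for the checks are my own additions for illustration.

```r
# Same matrices as in the post; matrix() fills column by column
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)

# Hand-computed results, also written column by column
expected_sum  <- matrix(c(7, 2, 5, 2), ncol = 2)
expected_diff <- matrix(c(-3, -2, -3, 4), ncol = 2)

sum_ok  <- all(A + B == expected_sum)
diff_ok <- all(A - B == expected_diff)

# Element-wise arithmetic needs matching dimensions: a 2x2 plus a 2x3 fails
C_bad <- matrix(1:6, ncol = 3)
mismatch_msg <- tryCatch(A + C_bad, error = function(e) conditionMessage(e))
```

Because `matrix()` fills by column, `c(2, 0, 1, 3)` produces the rows (2, 1) and (0, 3), which is easy to misread when checking results by hand.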

Module # 5 Doing Math

Matrix Operations in R: Determinant and Inverse

The goal of this assignment is to learn how to work with matrices in R, specifically how to compute the determinant and inverse of a matrix using built-in R functions.

Step 1: Creating the Matrices

First, I created the two matrices provided in the assignment using the matrix() function.

A <- matrix(1:100, nrow = 10)
B <- matrix(1:1000, nrow = 10)

To better understand these matrices, I checked their dimensions:

dim(A)
dim(B)

Matrix A has dimensions 10 × 10, so it is a square matrix. Matrix B has dimensions 10 × 100, so it is not square.

Step 2: Determinant of Matrix A

Since matrix A is square, its determinant can be calculated using the det() function.

det(A)

Result: The determinant of matrix A is: 0

A determinant of 0 means that matrix A is singular, which has important consequences for finding its inverse.

Step 3: Inverse of Matrix A

The inverse of a matri...
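One way to handle the singular and non-square cases discussed above is to wrap solve() in a small guard. The helper `safe_inverse()` and the matrix `C_ok` are hypothetical names introduced for this sketch, and returning NULL on failure is just one possible design.

```r
A <- matrix(1:100, nrow = 10)    # square but singular
B <- matrix(1:1000, nrow = 10)   # 10 x 100, not square

# Hypothetical helper: returns the inverse, or NULL when none exists
safe_inverse <- function(m) {
  # Only square matrices can have an inverse
  if (nrow(m) != ncol(m)) {
    return(NULL)
  }
  # solve() raises an error for singular matrices; turn that into NULL
  tryCatch(solve(m), error = function(e) NULL)
}

inv_A <- safe_inverse(A)  # NULL: A is singular
inv_B <- safe_inverse(B)  # NULL: B is not square

# A small invertible matrix for contrast (determinant is 2*3 - 1*1 = 5)
C_ok  <- matrix(c(2, 1, 1, 3), nrow = 2)
inv_C <- safe_inverse(C_ok)

# The rank shows why A fails: far fewer independent columns than 10
rank_A <- qr(A)$rank
```

Multiplying `C_ok %*% inv_C` should give the 2 × 2 identity matrix up to floating-point error, which is a handy sanity check for any inverse.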

Module # 4 Programming structure assignment

# ------------------------------------------------------------
# hospital_analysis.R
# Hospital patient intake dataset: plots + brief summaries
# ------------------------------------------------------------

# ---- 1) Create the dataset ----
Freq <- c(0.6, 0.3, 0.4, 0.4, 0.2, 0.6, 0.3, 0.4, 0.9, 0.2)
BP   <- c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176)

# first: bad = 1, good = 0, NA = missing
First <- c(1, 1, 1, 1, 0, 0, 0, 0, NA, 1)

# second: low = 0, high = 1
Second <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)

# finaldecision: low = 0, high = 1
FinalDecision <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 1)

hospital_data <- data.frame(Freq, BP, First, Second, FinalDecision)

# Print the dataset
print("Hospital dataset:")
print(hospital_data)

# ---- 2) Boxplot: Blood Pressure by Final Decision ----
# Side-by-side comparison of BP for Low vs High final decision
boxplot(
  BP ~ FinalDecision,
  data = hospital_data,
  names = c("Low Priority (0)", "High Priority (...