Posts

Showing posts from March, 2026

Assignment #10: Building Your Own R Package

# Proposal for My R Package: Friedman

For this project, I am proposing an R package called **Friedman**. The purpose of this package is to provide beginner-friendly tools for simple data analysis and visualization in R. Many students and new R users struggle with repetitive tasks such as cleaning datasets, calculating summary statistics, and creating clear visualizations. This package will help make those tasks easier by grouping useful functions into one simple package.

The main audience for this package is students, beginner analysts, and anyone who wants a more straightforward way to explore data in R. Instead of writing long code repeatedly, users will be able to call functions from Friedman to quickly summarize and visualize their data. This makes the package especially useful for class assignments, small research projects, and practice with R programming.

Some of the key functions I plan to implement include:

- `clean_data()` – removes missing values or standardizes column n...
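As a rough illustration of the missing-value part of `clean_data()`, here is a minimal sketch. Friedman is only a proposal at this point, so the function body, its single `df` argument, and the `raw` example data are all assumptions rather than the package's actual code.

```r
# Hypothetical sketch: Friedman is only proposed, so this body is an assumption.
clean_data <- function(df) {
  stopifnot(is.data.frame(df))
  # Keep only the rows that have no missing value in any column.
  cleaned <- df[stats::complete.cases(df), , drop = FALSE]
  # Renumber the rows so the result does not carry gaps from dropped rows.
  rownames(cleaned) <- NULL
  cleaned
}

# Made-up example data with one missing score.
raw <- data.frame(id = 1:4, score = c(10, NA, 30, 40))
cleaned <- clean_data(raw)
print(cleaned)
```

Returning a new data frame instead of modifying the input keeps the original data available for comparison, which may suit beginners who want to see what was dropped.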

Assignment #9: Visualization in R – Base Graphics, Lattice, and ggplot2

For this assignment, I used the airquality dataset from the datasets package in R. I chose this dataset because it includes several quantitative variables such as Ozone, Temperature, and Wind, along with a grouping variable, Month, which made it a good fit for comparing different graphing systems in R. After loading the dataset, I removed missing values and converted the Month variable into a factor so it would work better in grouped and faceted plots.

Using base R, I created a scatter plot of Temperature versus Ozone and a histogram of Wind. Base R was straightforward and easy to use for simple plots, but it required more manual setup for labels and appearance. With lattice, I created a conditioned scatter plot of Ozone versus Temperature by Month and a boxplot of Wind by Month. Lattice was useful for grouped displays and small multiples, and it made conditioning by category very simple. Finally, with ggplot2, I created a scatter plot with a regression line and a facete...
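The workflow described above might look roughly like the following sketch. It is not the post's original code: the axis labels, the month labels, and the choice to facet the ggplot2 plot by Month are assumptions. Note that the temperature column in airquality is named `Temp`.

```r
# Sketch only: labels and plot settings are assumptions, not the post's code.
if (!interactive()) pdf(NULL)  # avoid writing Rplots.pdf when run as a script

data("airquality")

# Drop rows with any missing value, then treat Month as a categorical variable.
aq <- na.omit(airquality)
aq$Month <- factor(aq$Month, levels = 5:9, labels = month.abb[5:9])

# Base graphics: axis labels and titles are set by hand.
plot(aq$Temp, aq$Ozone,
     xlab = "Temperature (F)", ylab = "Ozone (ppb)",
     main = "Ozone vs. temperature")
hist(aq$Wind, xlab = "Wind (mph)", main = "Distribution of wind speed")

# lattice: the `| Month` term in the formula conditions the plot on Month.
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::xyplot(Ozone ~ Temp | Month, data = aq))
  print(lattice::bwplot(Wind ~ Month, data = aq))
}

# ggplot2: layers add a fitted line, and facet_wrap() draws one panel per month.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(aq, aes(x = Temp, y = Ozone)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    facet_wrap(~ Month)
  print(p)
}
```

The `requireNamespace()` guards let the script run even when lattice or ggplot2 is not installed; the base graphics section needs no extra packages.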

Module #8: Input/Output, String Manipulation, and the plyr Package

In this assignment, I worked with a student dataset in R and practiced importing data, calculating summary statistics, filtering observations, and exporting results to files. These steps helped reinforce basic data manipulation techniques that are commonly used in data analysis.

First, I imported the dataset into R using the read.table() function and loaded several required packages, including plyr. After the dataset was loaded, I used the ddply() function from the plyr package to group the data by Sex and calculate the mean grade for each category. This allowed me to quickly compare the average grades between the groups. After calculating these results, I wrote the output to a file so it could be saved and used outside of R.

Next, I created a filtered version of the dataset that only included students whose names contained the letter “i.” To do this, I used the subset() function combined with grepl() to search for the letter within the Name column. This created a smalle...
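The import, group, filter, and export steps could be sketched as follows. The post's data file is not shown, so the `students` rows, the `Grade` column name, and the file names are made up for illustration; only `Name` and `Sex` come from the description above. The base R fallback is there in case plyr is not installed.

```r
# Hypothetical data: the real file is not shown, so these rows are made up.
students <- data.frame(
  Name  = c("Alice", "Brian", "Carl", "Dana", "Erin"),
  Sex   = c("Female", "Male", "Male", "Female", "Female"),
  Grade = c(90, 80, 70, 85, 95)
)

# Round-trip through a temporary file to mimic importing with read.table().
in_file <- file.path(tempdir(), "students.csv")
write.table(students, in_file, sep = ",", row.names = FALSE)
students <- read.table(in_file, header = TRUE, sep = ",",
                       stringsAsFactors = FALSE)

# Mean grade per Sex with plyr::ddply(), or with base R if plyr is missing.
if (requireNamespace("plyr", quietly = TRUE)) {
  grade_by_sex <- plyr::ddply(students, "Sex", plyr::summarise,
                              MeanGrade = mean(Grade))
} else {
  grade_by_sex <- aggregate(Grade ~ Sex, data = students, FUN = mean)
  names(grade_by_sex)[2] <- "MeanGrade"
}

# Keep students whose Name contains a lowercase "i".
# grepl() is case-sensitive by default; add ignore.case = TRUE to match "I" too.
i_students <- subset(students, grepl("i", Name))

# Export the summary as a tab-separated text file.
out_file <- file.path(tempdir(), "grade_by_sex.txt")
write.table(grade_by_sex, out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)
```

Writing to `tempdir()` keeps the sketch from cluttering the working directory; a real assignment would use its own output path.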

Module #7 R Object: S3 vs. S4 Assignment

    data("mtcars")
    head(mtcars, 6)
    str(mtcars)
    class(mtcars)
    typeof(mtcars)

What I found:

- class(mtcars) → "data.frame"
- typeof(mtcars) → "list" (because data frames are lists under the hood)
- str(mtcars) shows it’s a list of columns, and each column is numeric.

Yes. Generic functions work well with mtcars. A generic function is a function that chooses which method to run based on the class of the object you pass in. Example: print() is a generic.

    print(mtcars)    # uses print.data.frame behind the scenes
    summary(mtcars)  # uses summary.data.frame

To check this, confirm that mtcars is not an S4 object and list the methods registered for each generic:

    isS4(mtcars)        # FALSE, so mtcars is not an S4 object
    methods("print")    # shows lots of print methods
    methods("summary")

If a generic function didn’t work, it would usually be because the object’s class has no method implemented for that generic (so it falls back to a default method or errors).

Step 3: Can S3 and S4 be assigned to this dataset? ✅...
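To see S3 dispatch and the default-method fallback in action, here is a small sketch. The `describe()` generic and its two methods are hypothetical names invented for this example; only `mtcars`, `UseMethod()`, and `getS3method()` are part of R itself.

```r
data("mtcars")

# A hypothetical S3 generic: UseMethod() picks a method based on class(x).
describe <- function(x, ...) UseMethod("describe")

# Method used for any object whose class includes "data.frame".
describe.data.frame <- function(x, ...) {
  sprintf("data frame with %d rows and %d columns", nrow(x), ncol(x))
}

# Fallback used when no class-specific method exists.
describe.default <- function(x, ...) {
  sprintf("object of class '%s'", class(x)[1])
}

df_result  <- describe(mtcars)  # dispatches to describe.data.frame
int_result <- describe(1:3)     # no integer method, so describe.default runs
print(df_result)
print(int_result)

# The built-in print generic has a data.frame method in the same way.
has_print_method <- is.function(getS3method("print", "data.frame"))
print(has_print_method)
```

Removing `describe.default` would make `describe(1:3)` fail with a "no applicable method" error, which is the error case mentioned above.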