Title: | Tidying and Analyzing 'Coursera' Research Export Data |
---|---|
Description: | Tidies and performs preliminary analysis of 'Coursera' research export data. These export data can be downloaded by anyone who has classes on Coursera and wants to analyze the data. Coursera is one of the leading providers of MOOCs and was launched in January 2012. With over 25 million learners, Coursera is the most popular provider in the world, followed by edX, the MOOC provider that resulted from a collaboration between Harvard University and MIT, with over 10 million users. Coursera has over 150 university partners from 29 countries and offers a total of 2000+ courses, from computer science to philosophy. Coursera also offers 180+ specializations, Coursera's credential system, and four fully online Masters degrees. For more information about Coursera, see Coursera's About page at <https://blog.coursera.org/about/>. |
Authors: | Aboozar Hadavand [aut, cre], Jeff Leek [aut], John Muschelli [aut] |
Maintainer: | Aboozar Hadavand <[email protected]> |
License: | GPL-2 |
Version: | 0.2.3 |
Built: | 2024-11-08 04:28:00 UTC |
Source: | https://github.com/jhudsl/crsra |
Relies on the function digest from the package digest. This function still preserves the relationships between tables: a given id is replaced by the same masked id in every table.
crsra_anonymize( all_tables, col_to_mask = attributes(all_tables)$partner_user_id, algorithm = "crc32" )
all_tables |
A list from |
col_to_mask |
The name of id column to mask. |
algorithm |
The algorithms to be used for anonymization;
for currently available choices, see |
A list that contains all the tables within each course.
res = crsra_anonymize(example_course_import, col_to_mask = "jhu_user_id", algorithm = "crc32")
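The idea of masking an id consistently across tables can be sketched as follows. This is only an illustration, not the package's implementation: the toy tables, the column name user_id, and the helper mask_column are hypothetical, and it assumes the digest package is installed.

```r
library(digest)

# Hypothetical toy tables sharing an id column; not the real Coursera schema.
tables <- list(
  users  = data.frame(user_id = c("u1", "u2", "u3"), stringsAsFactors = FALSE),
  grades = data.frame(user_id = c("u3", "u1"), grade = c(0.9, 0.7),
                      stringsAsFactors = FALSE)
)

# mask_column() is an illustrative helper: it hashes every value of `col`
# in each table that has that column, one id at a time.
mask_column <- function(tables, col, algo = "crc32") {
  lapply(tables, function(tbl) {
    if (col %in% names(tbl)) {
      tbl[[col]] <- vapply(
        as.character(tbl[[col]]),
        function(id) digest(id, algo = algo, serialize = FALSE),
        character(1),
        USE.NAMES = FALSE
      )
    }
    tbl
  })
}

masked <- mask_column(tables, "user_id")
```

Because the hash depends only on the id value, "u1" receives the same masked value in both tables, so joins between tables still work after masking.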
Frequencies of skipping a peer-assessed submission
crsra_assessmentskips(all_tables, bygender = FALSE, wordcount = TRUE, n = 20)
all_tables |
A list from |
bygender |
A logical value indicating whether results should be broken down by gender |
wordcount |
A logical value indicating whether word count should be shown in the results; default is true |
n |
An integer indicating the number of rows for the word count |
The outputs are frequency tables (tibbles), shown separately for each course.
crsra_assessmentskips(example_course_import)
crsra_assessmentskips(example_course_import, bygender = TRUE, n = 10)
Deletes a specific user (or set of users) from all tables in the data, in case Coursera data privacy laws require you to remove them.
crsra_delete_user(all_tables, users)
all_tables |
A list from |
users |
A vector of user ids to delete |
A list that contains all the tables within each course.
del_user = example_course_import$users$jhu_user_id[1]
del_user %in% example_course_import$users$jhu_user_id
res = crsra_delete_user(example_course_import, users = del_user)
del_user %in% res$users$jhu_user_id
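Removing a user from every table amounts to filtering each table that carries the id column. The sketch below shows that pattern in base R; the toy tables, the column name user_id, and the helper drop_users are hypothetical and do not reflect how the package does it internally.

```r
# Hypothetical toy tables; only some of them carry the id column.
tables <- list(
  users  = data.frame(user_id = c("a", "b", "c")),
  grades = data.frame(user_id = c("a", "a", "c"), grade = c(1, 2, 3)),
  items  = data.frame(item_id = 1:2)
)

# drop_users() is an illustrative helper: tables without `col` pass through
# unchanged, all others lose the rows belonging to `users`.
drop_users <- function(tables, col, users) {
  lapply(tables, function(tbl) {
    if (!col %in% names(tbl)) {
      return(tbl)
    }
    tbl[!tbl[[col]] %in% users, , drop = FALSE]
  })
}

cleaned <- drop_users(tables, "user_id", users = "a")
```

Using drop = FALSE keeps single-column tables as data frames after filtering.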
The average course grade across different groups
crsra_gradesummary( all_tables, groupby = c("total", "country", "language", "gender", "empstatus", "education", "stustatus") )
all_tables |
A list from |
groupby |
A character string indicating how to break down
grades. The default is set to |
A table which indicates the average grade across specified groups for each course
crsra_gradesummary(example_course_import)
crsra_gradesummary(example_course_import, groupby = "education")
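As a rough illustration of what an average grade per group means, the following base R sketch computes group means on a toy table. The data and column names are hypothetical, and the package may compute its summaries differently.

```r
# Hypothetical toy data; column names are illustrative, not the Coursera schema.
grades <- data.frame(
  user_id   = c("a", "b", "c", "d"),
  education = c("BA", "BA", "MA", "MA"),
  grade     = c(0.8, 0.6, 0.9, 0.7)
)

# Mean grade within each education group.
by_education <- aggregate(grade ~ education, data = grades, FUN = mean)

# Overall mean across all learners, analogous to an ungrouped summary.
overall <- mean(grades$grade)
```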
Imports all the .csv files into one list consisting of all the courses and all the tables within each course.
crsra_import(workdir = ".", ...)
workdir |
A character string vector indicating the directory where all the unzipped course directories are stored. |
... |
Additional arguments to pass to
|
zip_file = system.file("extdata", "fake_course_7051862327916.zip", package = "crsra")
bn = basename(zip_file)
bn = sub("[.]zip$", "", bn)
res = unzip(zip_file, exdir = tempdir(), overwrite = TRUE)
example_import = crsra_import(workdir = tempdir(), check_problems = FALSE)
Convert a Coursera Course to Coursera Import
crsra_import_as_course(x)
x |
object of class |
object of class coursera_import
Imports all the .csv files into one list consisting of all the tables within the course.
crsra_import_course(workdir = ".", add_course_name = FALSE, change_pid_column = FALSE, check_problems = TRUE, include = NULL)
crsra_table_names(workdir = ".")
workdir |
A character string vector indicating the directory where the unzipped course is stored. |
add_course_name |
Should a column of the course name
be added to all the |
change_pid_column |
Should the |
check_problems |
Should problems with reading in the data be checked? |
include |
Vector of tables to import; these are the lowercase
names of the files without the '.csv' extension. See |
zip_file = system.file("extdata", "fake_course_7051862327916.zip", package = "crsra")
bn = basename(zip_file)
bn = sub("[.]zip$", "", bn)
res = unzip(zip_file, exdir = tempdir(), overwrite = TRUE)
workdir = file.path(tempdir(), bn)
course_tables = crsra_import_course(workdir, check_problems = FALSE)
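The general pattern of reading every .csv file in a directory into a named list, with lowercase table names and no '.csv' suffix, can be sketched in base R as follows. The directory and file names are hypothetical, and the package's own importer may use a different reader and perform extra checks.

```r
# Write two small CSV files into a scratch directory (hypothetical file names).
course_dir <- file.path(tempdir(), "toy_course")
dir.create(course_dir, showWarnings = FALSE)
write.csv(data.frame(user_id = 1:2), file.path(course_dir, "users.csv"),
          row.names = FALSE)
write.csv(data.frame(item_id = 1:3), file.path(course_dir, "items.csv"),
          row.names = FALSE)

# Read every CSV in the directory into one list.
csv_files <- list.files(course_dir, pattern = "[.]csv$", full.names = TRUE)
tables <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)

# Name each table after its file: lowercase, without the '.csv' extension.
names(tables) <- tolower(sub("[.]csv$", "", basename(csv_files)))
```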
Ordered list of course items and the number and share of learners who have completed the item
crsra_progress(all_tables)
all_tables |
A list from |
A table that lists all the items within a course, with the total number and the share of learners who have completed each item.
crsra_progress(example_course_import)
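A count and share of learners per item can be computed along the following lines. This is a base R sketch on a hypothetical completion log; the column names are illustrative, and sorting by completion count is an assumption made here for display, not necessarily the ordering the package uses.

```r
# Hypothetical completion log: one row per (learner, completed item).
completions <- data.frame(
  user_id = c("a", "b", "c", "a", "b", "a"),
  item    = c("intro", "intro", "intro", "quiz1", "quiz1", "final")
)

# Guard against a learner being logged twice for the same item.
completions <- unique(completions)
n_learners <- length(unique(completions$user_id))

# Number of learners per item and their share of all learners.
counts <- table(completions$item)
progress <- data.frame(
  item  = names(counts),
  n     = as.integer(counts),
  share = as.numeric(counts) / n_learners
)

# Most-completed items first (an assumption of this sketch).
progress <- progress[order(-progress$n), ]
```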
Returns description for a table
crsra_tabledesc(x)
x |
Name of the table to get the description |
The description for a table based on the description provided by Coursera in the data exports
crsra_tabledesc("assessments")
Time (in days) that each learner took to finish a course
crsra_timetofinish(all_tables)
all_tables |
A list from |
A table containing hashed_user_ids with a column indicating the time (in days) that each user took to complete a course. The time is calculated as the difference between the last and first activity in a course.
crsra_timetofinish(example_course_import)
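The calculation described above, last activity minus first activity per learner, can be sketched in base R like this. The activity log and its column names are hypothetical; the package derives the timestamps from the actual export tables.

```r
# Hypothetical activity log with one timestamp per learner action.
activity <- data.frame(
  user_id = c("a", "a", "b", "b", "b"),
  ts = as.POSIXct(
    c("2020-01-01 00:00:00", "2020-01-04 00:00:00",
      "2020-02-01 00:00:00", "2020-02-02 12:00:00", "2020-02-01 06:00:00"),
    tz = "UTC"
  )
)

# Span between each learner's first and last activity, in days.
by_user <- split(activity$ts, activity$user_id)
days <- vapply(
  by_user,
  function(ts) as.numeric(difftime(max(ts), min(ts), units = "days")),
  numeric(1)
)

time_to_finish <- data.frame(user_id = names(days), days = unname(days))
```

Passing units = "days" to difftime fixes the unit explicitly; otherwise R picks a unit based on the size of the difference.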
Returns a list of tables a variable appears in
crsra_whichtable(all_tables, col_name)
all_tables |
A list from |
col_name |
The name of the column/variable to look for |
A list of tables that a specific variable appears in
crsra_whichtable(example_course_import, "assessment_id")
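Finding the tables that contain a given column is a simple scan over the list of tables. The base R sketch below uses hypothetical toy tables and an illustrative helper, which_tables; it shows the idea only and is not the package's code.

```r
# Hypothetical toy tables.
tables <- list(
  users       = data.frame(user_id = 1:2),
  assessments = data.frame(assessment_id = 1:2, user_id = 1:2),
  items       = data.frame(item_id = 1:3)
)

# which_tables() is an illustrative helper: it returns the names of the
# tables whose columns include `col_name`.
which_tables <- function(tables, col_name) {
  has_col <- vapply(tables, function(tbl) col_name %in% names(tbl), logical(1))
  names(tables)[has_col]
}

user_id_tables <- which_tables(tables, "user_id")
</gr_insert_placeholder>
```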
Example Import of a Coursera Course
example_course_import
A list with 100 elements, which are data.frames imported from a fake Coursera class:
Table Descriptions
tabdesc
A vector of table descriptions, where the name of each element is the name of the corresponding table in an import.