Type: | Package |
Title: | Extract Data from NCAA Women's and Men's Volleyball Website |
Version: | 0.4.3 |
Maintainer: | Jeffrey R. Stevens <jeffrey.r.stevens@protonmail.com> |
Description: | Extracts team records/schedules and player statistics for the 2020-2024 National Collegiate Athletic Association (NCAA) women's and men's divisions I, II, and III volleyball teams from https://stats.ncaa.org. Functions can aggregate statistics for teams, conferences, divisions, or custom groups of teams. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
Depends: | R (≥ 4.2) |
RoxygenNote: | 7.3.2 |
Imports: | cli, curl, dplyr, httr2, lifecycle, purrr, rlang, rvest, stringr, tibble, tidyr, xml2 |
Suggests: | chromote, knitr, rmarkdown, testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
URL: | https://github.com/JeffreyRStevens/ncaavolleyballr, https://jeffreyrstevens.github.io/ncaavolleyballr/ |
BugReports: | https://github.com/JeffreyRStevens/ncaavolleyballr/issues |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-07-22 22:22:34 UTC; jstevens |
Author: | Jeffrey R. Stevens |
Repository: | CRAN |
Date/Publication: | 2025-07-22 22:40:02 UTC |
ncaavolleyballr: Extract Data from NCAA Women's and Men's Volleyball Website
Description
Extracts team records/schedules and player statistics for the 2020-2024 National Collegiate Athletic Association (NCAA) women's and men's divisions I, II, and III volleyball teams from https://stats.ncaa.org. Functions can aggregate statistics for teams, conferences, divisions, or custom groups of teams.
Author(s)
Maintainer: Jeffrey R. Stevens jeffrey.r.stevens@protonmail.com (ORCID) [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/JeffreyRStevens/ncaavolleyballr/issues
Checks if division or conference is valid
Description
Checks if division or conference is valid
Usage
check_confdiv(group = NULL, value = NULL, teams = NULL)
Arguments
group |
Character string for group ("div" or "conf"). |
value |
Character string for group's value (e.g., 1 or "Big Ten") |
Checks if contest ID is valid
Description
Checks if contest ID is valid
Usage
check_contest(contest = NULL)
Arguments
contest |
Contest ID |
Checks if a logical input is valid
Description
Checks if a logical input is valid
Usage
check_logical(name = NULL, value = NULL)
Arguments
name |
Argument name. |
value |
Argument value. |
Checks if value is matched in vector
Description
Checks if value is matched in vector
Usage
check_match(name = NULL, value = NULL, vec = NULL)
Arguments
name |
Argument name. |
value |
Value. |
vec |
Vector. |
Checks if sport is valid
Description
Checks if sport is valid
Usage
check_sport(sport, vb_only = TRUE)
Arguments
sport |
Sport code. |
vb_only |
Logical indicating whether to check only for volleyball sports (TRUE) or all sports (FALSE). |
Checks if team ID is valid
Description
Checks if team ID is valid
Usage
check_team_id(team_id = NULL)
Arguments
team_id |
Team ID |
Checks if team name is valid
Description
Checks if team name is valid
Usage
check_team_name(team = NULL, teams = NULL)
Arguments
team |
Team name |
teams |
Data frame of team names |
Checks if year is valid
Description
Checks if year is valid
Usage
check_year(year = NULL, single = FALSE)
Arguments
year |
Year. |
single |
Logical for whether year should be a single element or can be a vector of multiple years. |
Aggregate player statistics for an NCAA conference and seasons
Description
This is a wrapper around group_stats()
that extracts season, match, or pbp
data from players in all teams in the chosen conference. For season stats,
it aggregates all player data and team data into separate data frames and
combines them into a list. For match and pbp stats, it aggregates into a
data frame.
Conference names can be found in ncaa_conferences.
Usage
conference_stats(
year = NULL,
conf = NULL,
level = NULL,
sport = "WVB",
save = FALSE,
path = "."
)
Arguments
year |
Numeric vector of years for fall of desired seasons. |
conf |
NCAA conference name. |
level |
Character string defining whether to aggregate "season", "match", or play-by-play ("pbp") data. |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
save |
Logical for whether to save the statistics locally as CSVs (default FALSE). |
path |
Character string of path to save statistics files. |
Value
For season level, returns list with data frames of player statistics and team statistics. For match and pbp levels, returns data frame of player statistics and play-by-play information respectively.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other functions that aggregate statistics:
division_stats(), group_stats()
Examples
conference_stats(year = 2024, conf = "Peach Belt", level = "season")
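Because `conf` must be a conference name listed in `ncaa_conferences`, it can help to search that vector before calling `conference_stats()`. The following is a minimal sketch of that check, not part of the package's documented workflow. The search pattern "Peach" is only an illustration, and the code is guarded so that it does nothing when the package is not installed.

```r
# Sketch: search the bundled vector of conference names for a pattern
# before passing a name to conference_stats().
conf_matches <- character(0)

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  # ncaa_conferences is the character vector of conference names
  # shipped with the package.
  conf_matches <- grep(
    "Peach",
    ncaavolleyballr::ncaa_conferences,
    value = TRUE,
    ignore.case = TRUE
  )
  print(conf_matches)
}
```

Any name returned this way can be passed as the `conf` argument.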
Aggregate player statistics for an NCAA division and seasons
Description
This is a wrapper around group_stats()
that extracts season, match, or pbp
data from players in all teams in the chosen division. For season stats,
it aggregates all player data and team data into separate data frames and
combines them into a list. For match and pbp stats, it aggregates into a
data frame.
Usage
division_stats(
year = NULL,
division = 1,
level = NULL,
sport = "WVB",
save = FALSE,
path = "."
)
Arguments
year |
Numeric vector of years for fall of desired seasons. |
division |
NCAA division (must be 1, 2, or 3). |
level |
Character string defining whether to aggregate "season", "match", or play-by-play ("pbp") data. |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
save |
Logical for whether to save the statistics locally as CSVs (default FALSE). |
path |
Character string of path to save statistics files. |
Value
For season level, returns list with data frames of player statistics and team statistics. For match and pbp levels, returns data frame of player statistics and play-by-play information respectively.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other functions that aggregate statistics:
conference_stats(), group_stats()
Extract date, opponent, and contest ID for team and season
Description
NCAA datasets use a unique ID for each sport, team, season, and match. This function returns a data frame of dates, opponent team names, and contest IDs for each NCAA contest (volleyball match) for each team and season.
Usage
find_team_contests(team_id = NULL)
Arguments
team_id |
Team ID determined by NCAA for season. To find ID, use find_team_id(). |
Value
Returns a data frame that includes date, team, opponent, and contest ID for each season's contest.
Note
This function requires internet connectivity as it checks the NCAA website for information.
Examples
find_team_contests(team_id = "585290")
Find team ID for season
Description
NCAA datasets use a unique ID for each team and season. To access a team's
data, we must know the volleyball team ID. This function looks up the team ID
from wvb_teams or mvb_teams using the team name.
Team names can be found in ncaa_teams or searched with find_team_name().
Usage
find_team_id(team = NULL, year = NULL, sport = "WVB")
Arguments
team |
Name of school. Must match name used by NCAA. Find exact team name with find_team_name(). |
year |
Numeric vector of years for fall of desired seasons. |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
Value
Returns a character string of team ID.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other search functions:
find_team_name()
Examples
find_team_id(team = "Nebraska", year = 2024)
find_team_id(team = "UCLA", year = 2023, sport = "MVB")
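The ID returned here is what the team-level functions expect as `team_id`. As a rough sketch of how the lookups chain together, the code below finds a team ID and then passes it to `find_team_contests()`. The helper `try_null()` is a hypothetical convenience defined only for this example (it is not part of the package), and the column names of the returned data frame are not assumed.

```r
# Hypothetical helper (not part of the package): return NULL instead of
# stopping when a call fails, e.g. without an internet connection.
try_null <- function(expr) tryCatch(expr, error = function(e) NULL)

contests <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  library(ncaavolleyballr)

  # Step 1: look up the season-specific team ID from the team name.
  id <- try_null(find_team_id(team = "Nebraska", year = 2024))

  # Step 2: use that ID to list the season's contests.
  if (!is.null(id)) {
    contests <- try_null(find_team_contests(team_id = id))
  }

  # Step 3: inspect the first rows if the request succeeded.
  if (is.data.frame(contests)) print(head(contests))
}
```

The contest IDs in the result can then be passed to functions that take a `contest` argument, such as `match_pbp()` or `player_match_stats()`.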
Match pattern to find team names
Description
This is a convenience function to find NCAA team names in
ncaa_teams. Once the proper team name is found, it can be
passed to find_team_id() or group_stats().
Usage
find_team_name(pattern = NULL)
Arguments
pattern |
Character string of pattern you want to find in the vector of team names. |
Value
Returns a character vector of team names that include the submitted pattern.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other search functions:
find_team_id()
Examples
find_team_name(pattern = "Neb")
Fix teams that change their names
Description
Fix teams that change their names
Usage
fix_teams(x)
Gets year, team, and conference from team ID
Description
Gets year, team, and conference from team ID
Usage
get_team_info(team_id = NULL)
Arguments
team_id |
Team ID |
Extract data frame of team names, IDs, conference, division, and season
Description
NCAA datasets use a unique ID for each sport, team, and season. This function extracts team names, IDs, and conferences for each NCAA team in a division. You should not need this function for volleyball data from 2020-2024, because it has already been used to generate wvb_teams and mvb_teams. It is, however, available for other sports, using the appropriate three letter sport code drawn from ncaa_sports (e.g., men's baseball is "MBA").
Usage
get_teams(year = NULL, division = 1, sport = "WVB")
Arguments
year |
Single numeric year for fall of desired season. |
division |
NCAA division (must be 1, 2, or 3). |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
Value
Returns a data frame of all teams, their team ID, division, conference, and season.
Note
This function requires internet connectivity as it checks the NCAA website for information.
This function is a modification of the ncaa_teams() function from the {baseballr} package.
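As an illustration of using `get_teams()` for a sport other than volleyball, the sketch below requests Division 1 teams with the "MBA" code mentioned in the description. Whether a given year is available for a given sport is an assumption here, so the call is wrapped in a hypothetical `try_null()` helper (defined for this example only) that returns NULL on failure.

```r
# Hypothetical helper (not part of the package): return NULL on error.
try_null <- function(expr) tryCatch(expr, error = function(e) NULL)

baseball <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  # "MBA" is the sport code given in the description for men's baseball.
  # Assumption: 2024 is an available season for this sport.
  baseball <- try_null(
    ncaavolleyballr::get_teams(year = 2024, division = 1, sport = "MBA")
  )
  if (is.data.frame(baseball)) print(head(baseball))
}
```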
Aggregate player statistics and play-by-play information
Description
This function aggregates player statistics and play-by-play information
within a season by applying player_season_stats(), player_match_stats(),
or match_pbp() across groups of teams (for player_season_stats()) or
across contests within a season (for player_match_stats() and
match_pbp()). For season stats, it aggregates all player data and team
data into separate data frames and combines them into a list.
For instance, if you want to extract the data from the teams in the women's
2024 Final Four, pass a vector of
c("Louisville", "Nebraska", "Penn St.", "Pittsburgh")
to the function. For match or play-by-play data for a team, pass a single
team name and year. Team names can be found in ncaa_teams or by
using find_team_name().
Usage
group_stats(
teams = NULL,
year = NULL,
level = "season",
unique = TRUE,
sport = "WVB"
)
Arguments
teams |
Character vector of team names to aggregate. |
year |
Numeric vector of years for fall of desired seasons. |
level |
Character string defining whether to aggregate "season", "match", or play-by-play ("pbp") data. |
unique |
Logical indicating whether to only process unique contests (TRUE) or whether to process duplicated contests (FALSE). Default is TRUE. |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
Value
For season level, returns list with data frames of player statistics and team statistics. For match and pbp levels, returns data frame of player statistics and play-by-play information respectively.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other functions that aggregate statistics:
conference_stats(), division_stats()
Examples
group_stats(teams = c("Louisville", "Nebraska", "Penn St.", "Pittsburgh"),
year = 2024, level = "season")
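At the season level the result is a list of data frames, so a natural next step is to inspect its structure before pulling out a component. The sketch below assumes nothing about the element names (they are not listed in this manual) and simply prints them. `try_null()` is a hypothetical helper defined for this example only.

```r
# Hypothetical helper (not part of the package): return NULL on error.
try_null <- function(expr) tryCatch(expr, error = function(e) NULL)

final_four <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  final_four <- try_null(
    ncaavolleyballr::group_stats(
      teams = c("Louisville", "Nebraska", "Penn St.", "Pittsburgh"),
      year = 2024,
      level = "season"
    )
  )

  if (is.list(final_four)) {
    # Show which components the list holds and the size of each.
    print(names(final_four))
    str(final_four, max.level = 1)
  }
}
```

Once the names are known, individual data frames can be extracted with `$` or `[[`.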
Creates table of raw HTML
Description
Copied and modified from {rvest}
https://github.com/tidyverse/rvest/blob/main/R/table.R
Usage
html_table_raw(
x,
header = NA,
trim = TRUE,
dec = ".",
na.strings = "NA",
convert = TRUE
)
Extract play-by-play information for a particular match
Description
The NCAA's page for a match/contest includes a tab called "Play By Play". This function extracts the tables of play-by-play information for each set.
Usage
match_pbp(contest = NULL)
Arguments
contest |
Contest ID determined by NCAA for match. To find ID, use find_team_contests(). |
Value
Returns a data frame of set number, teams, score, event, and player responsible for the event.
Note
This function requires internet connectivity as it checks the NCAA website for information.
Examples
match_pbp(contest = "6080706")
Assigns most recent season
Description
Assigns most recent season
Usage
most_recent_season()
NCAA Men's Volleyball Teams 2020-2024
Description
This data frame includes all men's NCAA Division 1 and 3 teams from 2020-2024.
Usage
mvb_teams
Format
A data frame with 873 rows and 6 columns:
- team_id
Team ID for season/year
- team_name
Team name
- conference_id
Conference ID
- conference
Conference name
- div
NCAA division number (1 or 3)
- yr
Year for fall of season
Source
See Also
Other data sets:
ncaa_conferences, ncaa_sports, ncaa_teams, wvb_teams
Examples
head(mvb_teams)
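Since each row is one team in one season, a quick cross-tabulation shows how many teams are recorded per year and division. This is a small sketch using only the documented `yr` and `div` columns. It runs only when the package is installed and makes no claim about the resulting counts.

```r
# Sketch: count men's teams per season (rows) and division (columns).
mvb_counts <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  mvb_counts <- with(ncaavolleyballr::mvb_teams, table(yr, div))
  print(mvb_counts)
}
```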
NCAA Conference Names
Description
This vector includes names for all NCAA volleyball conferences.
Usage
ncaa_conferences
Format
A character vector with 111 conference names.
Source
See Also
Other data sets:
mvb_teams, ncaa_sports, ncaa_teams, wvb_teams
Examples
head(ncaa_conferences)
NCAA Sports and Sport Codes
Description
This data frame includes all NCAA women's and men's sports and the codes used to refer to the sports.
Usage
ncaa_sports
Format
A data frame with 100 rows and 2 columns:
- code
Sport code
- sport
Sport name
Source
https://ncaaorg.s3.amazonaws.com/championships/resources/common/NCAA_SportCodes.pdf
See Also
Other data sets:
mvb_teams, ncaa_conferences, ncaa_teams, wvb_teams
Examples
head(ncaa_sports)
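To find the code for a particular sport, the `sport` column can be searched by name. The sketch below filters for rows whose sport name contains "volleyball". The exact spelling of sport names in the data is an assumption, so the match is case-insensitive.

```r
# Sketch: look up sport codes by searching the sport names.
vb_codes <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  sports <- ncaavolleyballr::ncaa_sports
  # Keep rows whose sport name mentions volleyball (case-insensitive).
  vb_codes <- sports[grepl("volleyball", sports$sport, ignore.case = TRUE), ]
  print(vb_codes)
}
```

The value in the `code` column is what the `sport` argument of the other functions expects.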
NCAA Team Names
Description
This vector includes names for all NCAA volleyball teams.
Usage
ncaa_teams
Format
A character vector with 1,089 team names.
Source
See Also
Other data sets:
mvb_teams, ncaa_conferences, ncaa_sports, wvb_teams
Examples
head(ncaa_teams)
Extract player statistics for a particular match
Description
The NCAA's page for a match/contest includes a tab called "Individual Statistics". This function extracts the tables of player match statistics for both home and away teams, as well as team statistics (though these can be omitted). If a particular team is specified, only that team's statistics will be returned.
Usage
player_match_stats(
contest = NULL,
team = NULL,
team_stats = TRUE,
sport = "WVB"
)
Arguments
contest |
Contest ID determined by NCAA for match. To find ID, use find_team_contests(). |
team |
Name of school. Must match name used by NCAA. Find exact team name with find_team_name(). |
team_stats |
Logical indicating whether to include (TRUE) or exclude (FALSE) team statistics. Default includes team statistics with player statistics. |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
Value
By default, returns data frame that includes both home and away team match statistics. If team is specified, only that team's data are returned.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other functions that extract player statistics:
player_season_stats()
Examples
player_match_stats(contest = "6080706")
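To keep only the player rows, `team_stats` can be set to FALSE. The following sketch reuses the contest ID from the example above. `try_null()` is a hypothetical helper defined for this example only, so that a failed request yields NULL rather than an error.

```r
# Hypothetical helper (not part of the package): return NULL on error.
try_null <- function(expr) tryCatch(expr, error = function(e) NULL)

players_only <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  # team_stats = FALSE omits the team summary statistics, leaving only
  # player statistics for both teams in the match.
  players_only <- try_null(
    ncaavolleyballr::player_match_stats(
      contest = "6080706",
      team_stats = FALSE
    )
  )
  if (is.data.frame(players_only)) print(head(players_only))
}
```

Supplying the `team` argument as well would restrict the result to one team's players.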
Extract player statistics from a particular team and season
Description
The NCAA's main page for a team includes a tab called "Team Statistics". This function extracts the table of player statistics for the season, as well as team and opponent statistics (though these can be omitted).
Usage
player_season_stats(team_id, team_stats = TRUE)
Arguments
team_id |
Team ID determined by NCAA for season. To find ID, use find_team_id(). |
team_stats |
Logical indicating whether to include (TRUE) or exclude (FALSE) team statistics. Default includes team statistics with player statistics. |
Value
Returns a data frame of player statistics. Note that hometown and high school were added in 2024.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other functions that extract player statistics:
player_match_stats()
Examples
player_season_stats(team_id = "585290")
Submit URL request via live browser
Description
Submit URL request via live browser
Usage
request_live_url(url)
Arguments
url |
URL for request. |
Note
This function requires internet connectivity as it checks the NCAA website for information.
Submit URL request, check, and return response
Description
Submit URL request, check, and return response
Usage
request_url(url)
Arguments
url |
URL for request. |
Note
This function requires internet connectivity as it checks the NCAA website for information.
Save data frames
Description
Save data frames
Usage
save_df(x, label, group, year, division, conf, sport, path)
Extract team summary statistics for all matches in a particular season
Description
The NCAA's main page for a team includes a tab called "Game By Game" and a section called "Game by Game Stats". This function extracts the team's summary statistics for each match of the season.
Usage
team_match_stats(team_id = NULL, sport = "WVB")
Arguments
team_id |
Team ID determined by NCAA for season. To find ID, use find_team_id(). |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
Value
Returns a data frame of summary team statistics for each match of the season.
Note
This function requires internet connectivity as it checks the NCAA website for information. It also uses the {chromote} package and requires Google Chrome to be installed.
See Also
Other functions that extract team statistics:
team_season_info(), team_season_stats()
Examples
team_match_stats(team_id = "585290")
Extract arena, coach, record, and schedule information for a particular team and season
Description
The NCAA's main page for a team includes a tab called "Schedule/Results".
This function extracts information about the team's venue, coach, and
records, as well as the table of the schedule and results. This returns a
list, so you can subset specific components with $ (e.g., for coach
information from an object called output, use output$coach).
Usage
team_season_info(team_id = NULL)
Arguments
team_id |
Team ID determined by NCAA for season. To find ID, use find_team_id(). |
Value
Returns a list that includes arena, coach, schedule, and record information.
Note
This function requires internet connectivity as it checks the NCAA website for information.
See Also
Other functions that extract team statistics:
team_match_stats(), team_season_stats()
Examples
team_season_info(team_id = "585290")
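The subsetting described above can be sketched as follows. Only the `coach` component is named in this manual, so the other component names are printed rather than assumed. `try_null()` is a hypothetical helper defined for this example only.

```r
# Hypothetical helper (not part of the package): return NULL on error.
try_null <- function(expr) tryCatch(expr, error = function(e) NULL)

output <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  output <- try_null(ncaavolleyballr::team_season_info(team_id = "585290"))

  if (is.list(output)) {
    # List the available components, then extract the coach information.
    print(names(output))
    print(output$coach)
  }
}
```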
Extract team statistics for each season from 2020-2024
Description
The NCAA's main page for a team includes a tab called "Game By Game" and a section called "Career Totals". This function extracts season summary stats.
Usage
team_season_stats(team = NULL, opponent = FALSE, sport = "WVB")
Arguments
team |
Name of school. Must match name used by NCAA. Find exact team name with find_team_name(). |
opponent |
Logical indicating whether to include team's stats (FALSE) or opponent's stats (TRUE). Default is set to FALSE, returning team stats. |
sport |
Three letter abbreviation for NCAA sport (must be upper case; for example "WVB" for women's volleyball and "MVB" for men's volleyball). |
Value
Returns a data frame of summary team statistics for each season.
Note
This function requires internet connectivity as it checks the NCAA website for information.
Due to changes in the NCAA website, statistics from before 2020 are no longer available.
See Also
Other functions that extract team statistics:
team_match_stats(), team_season_info()
Examples
team_season_stats(team = "Nebraska")
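The `opponent` argument switches between a team's own season totals and those of its opponents. A minimal sketch that requests both for the same team is shown below. `try_null()` is a hypothetical helper defined for this example only, and no particular column layout is assumed.

```r
# Hypothetical helper (not part of the package): return NULL on error.
try_null <- function(expr) tryCatch(expr, error = function(e) NULL)

own_stats <- NULL
opp_stats <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  library(ncaavolleyballr)

  # Default (opponent = FALSE): the team's own season summaries.
  own_stats <- try_null(team_season_stats(team = "Nebraska"))

  # opponent = TRUE: the season summaries of the team's opponents.
  opp_stats <- try_null(team_season_stats(team = "Nebraska", opponent = TRUE))

  if (is.data.frame(own_stats)) print(head(own_stats))
  if (is.data.frame(opp_stats)) print(head(opp_stats))
}
```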
NCAA Women's Volleyball Teams 2020-2024
Description
This data frame includes all women's NCAA Division 1, 2, and 3 teams from 2020-2024.
Usage
wvb_teams
Format
A data frame with 5,289 rows and 6 columns:
- team_id
Team ID for season/year
- team_name
Team name
- conference_id
Conference ID
- conference
Conference name
- div
NCAA division number (1, 2, or 3)
- yr
Year for fall of season
Source
See Also
Other data sets:
mvb_teams, ncaa_conferences, ncaa_sports, ncaa_teams
Examples
head(wvb_teams)
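Because the data frame holds one row per team and season, it can be filtered directly to obtain team IDs without a lookup call. The sketch below selects Division 1 teams for the 2024 season using only the documented columns. It runs only when the package is installed.

```r
# Sketch: filter the bundled women's team table by division and season.
d1_2024 <- NULL

if (requireNamespace("ncaavolleyballr", quietly = TRUE)) {
  d1_2024 <- subset(
    ncaavolleyballr::wvb_teams,
    div == 1 & yr == 2024,
    select = c(team_id, team_name, conference)
  )
  print(head(d1_2024))
}
```

The `team_id` values in the result can be passed to functions such as `player_season_stats()` or `team_season_info()`.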