Hitting percentage: the end all be all?

NCAA
offense
college volleyball
Author

Derek Harrington

Published

December 9, 2024

Code
#load libraries
library(tidyverse)
library(gt)
library(ggrepel)

It’s an exciting time for volleyball in the United States. The sport has been improving rapidly for the last 10-15 years, especially on the offensive side of the equation. Players are more athletic, more dynamic and better trained coming into college. As a result, teams can push the boundaries of creativity in offensive schemes. But which teams are improving their offenses? Are there decent, but not great, offensive teams that still win a lot of games? Let’s explore.

First of all, I make no secret that I’m a Nebraska and Big Ten fan. So, to begin this exploration I’ll be looking at Nebraska and the Big Ten conference. However, my analysis will expand to the rest of the country (with comparisons to Nebraska made along the way).

Who’s the offensive juggernaut in the Big Ten?

To begin, let’s look at how Big Ten offenses fared between 2022 and 2023. This first chart displays the change in season hitting percentage for each Big Ten member between the 2022 and 2023 seasons. And if you’re not quite sure what a hitting percentage in volleyball is, it’s an efficiency measure calculated as (kills - errors) / total attempts.
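As a quick worked example of that formula, here is a minimal sketch in base R. The helper name `hitting_pct` and the stat line are made up for illustration; neither comes from the data set used in this post.

```r
# Hypothetical helper: hitting percentage = (kills - errors) / total attempts
hitting_pct <- function(kills, errors, attempts) {
  (kills - errors) / attempts
}

# Made-up stat line: 50 kills and 15 errors on 120 swings
example_pct <- hitting_pct(kills = 50, errors = 15, attempts = 120)

# (50 - 15) / 120, shown to three decimals the way volleyball stats usually are
round(example_pct, 3)
```

Note that errors count against you, so a hitter who scores often but also swings into the net or the block a lot can end up with a mediocre number.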

Code
#load in the data
offense <- read_csv("data/derek_offense.csv")

#make a slope chart for 2022 and 2023 offenses
#first make some vectors to be used later on
big_ten <- c("Nebraska","Wisconsin","Penn St.", "Michigan", "Michigan St.", "Iowa", "Purdue", "Illinois", "Minnesota", "Ohio St.", "Maryland", "Rutgers", "Northwestern", "Indiana")

big_ten_offense <- offense |>
  group_by(team_name, season)|>
  filter(team_name %in% big_ten)|>
  summarise(
    totalkills=sum(kills),
    totalerrors=sum(attack_errors),
    totalattempts=sum(attacks)
  )|>
  mutate(
    hit_pct=(totalkills-totalerrors)/totalattempts
  )|>
  select(
    team_name, hit_pct, season
  )
 
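#pull out the teams to highlight on the chart
nu <- big_ten_offense |> filter(team_name=="Nebraska")
wis <- big_ten_offense |> filter(team_name=="Wisconsin")
ind <- big_ten_offense |> filter(team_name=="Indiana")
mich <- big_ten_offense |> filter(team_name=="Michigan")

#grey lines for the whole conference, team colors for the highlighted teams
ggplot() + 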
  geom_line(
    data = big_ten_offense, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="grey")+
  geom_point(
    data = big_ten_offense, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="grey"
  )+
    geom_line(
    data = nu, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="black")+
  geom_point(
    data = nu, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="black"
  )+
    geom_line(
    data = wis, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="#C5050C")+
  geom_point(
    data = wis, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="#C5050C"
  )+
    geom_line(
    data = ind, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="#990000")+
  geom_point(
    data = ind, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="#990000"
  )+
    geom_line(
    data = mich, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="#00274c")+
  geom_point(
    data = mich, 
    aes(
      x = season, 
      y = hit_pct, 
      group = team_name
    ), 
    color="#00274c"
  )+
  scale_x_continuous(
    breaks=c(2022, 2023), 
    limits=c(2021.75, 2023.25)
  ) +
    scale_y_continuous(
      breaks=c(.175, .200, .225, .250, .275, .300, .325)
    )+
  geom_text(
    data = nu |> filter(season == max(season)),
    aes(
      x = season + .1, 
      y = hit_pct , 
      group = team_name, 
      label = team_name
    ))+
  geom_text(
    data = nu |> filter(season == min(season)),
    aes(
      x = season - .1, 
      y = hit_pct, 
      group = team_name, 
      label = team_name
    ))+
  geom_text(
    data = wis |> filter(season == max(season)),
    aes(
      x = season + .1, 
      y = hit_pct , 
      group = team_name, 
      label = team_name
    ))+
  geom_text(
    data = wis |> filter(season == min(season)),
    aes(
      x = season - .1, 
      y = hit_pct, 
      group = team_name, 
      label = team_name
    ))+
  geom_text(
    data = ind |> filter(season == max(season)),
    aes(
      x = season + .1, 
      y = hit_pct , 
      group = team_name, 
      label = team_name
    ))+
  geom_text(
    data = ind |> filter(season == min(season)),
    aes(
      x = season - .1, 
      y = hit_pct, 
      group = team_name, 
      label = team_name
    ))+
  geom_text(
    data = mich |> filter(season == max(season)),
    aes(
      x = season + .1, 
      y = hit_pct , 
      group = team_name, 
      label = team_name
    ))+
  geom_text(
    data = mich |> filter(season == min(season)),
    aes(
      x = season - .1, 
      y = hit_pct, 
      group = team_name, 
      label = team_name
    ))+
    theme_minimal()+
  labs(
    x= "Season",
    y = "Hitting Percentage", 
    title = "Wisconsin continues to dominate offensively in Big Ten",
    subtitle = "Michigan had a bad year one after coaching change",
    caption="Data courtesy of Evollve.net | By Derek Harrington"
  ) +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 12),
    axis.title = element_text(size = 10),
    panel.grid.minor = element_blank(),
    plot.title.position = "plot"
  )

So, what to make of this? Well, first of all, Wisconsin was the national champion in 2021. They lost many key pieces from that team including Dana Rettke and Sydney Hilley. And yet, they were still atop the Big Ten in 2022 in terms of hitting percentage, and to no one’s surprise, they won the conference title. The Huskers switched to a 6-2 offense in 2022 and the offensive numbers were up from 2021. If not for a crippling injury in the last week of the regular season that forced some lineup changes, Nebraska may have been able to win the conference title. But I digress. The interesting story comes in 2023 in my opinion.

As the chart shows, Wisconsin improved its offense from 2022 to 2023, hitting over .300 as a team! The Badgers were no doubt among the elite offensive teams across the country. We can also see that Nebraska made a sizable jump, thanks to newcomers that included Merritt Beason, Harper Murray, Andi Jackson and Bergen Reilly (among many other important pieces). This chart would suggest that Wisconsin still ran away with the conference title as there’s a sizable gap between the Huskers and Badgers offensively.

But... Nebraska actually won the conference title. Nebraska was the nation’s best defensive team last year, and that paired with the offensive improvement led to the Huskers claiming the Big Ten title outright. It’s also only fair to point out that Wisconsin had a couple of injuries toward the end of the year that helped Purdue and Penn State knock off the Badgers, opening the door for the Huskers to claim the title. Nevertheless, it’s a prime example of offense not always being the main story. So, if Nebraska won the conference in 2023 and had a chance to win it in 2022 without coming close to Wisconsin in team hitting percentage, there’s something else there. But what?

Also note that Indiana made a huge improvement with its offense, which put the Hoosiers in contention for an at-large bid to the NCAA tournament. Unfortunately, the Hoosiers dropped some games early in the non-conference and late in the season that kept them out. Michigan had a rough year with a coaching change and roster turnover. However, if you were to compare the Wolverines’ 2023 numbers to those of 2024, you’d see a big improvement.

Which teams actually earn their points?

So, let’s take this examination up a level and look at the 2022 and 2023 seasons in a different light. Let’s examine the top 10 (using the end-of-year top 10 rankings) for each year and see how they fared with points earned and points given up.

In the simplest terms, the points per set a team earns is the sum of its kills per set, aces per set and blocks per set. Typically, you’d want to be in the range of 17 to 18 earned points per set to be considered elite. Points given up is the sum of all of your errors (hitting, serving, ball handling and block errors). Obviously, with points given up you want to be as low as possible. So, let’s take a look at both 2022 and 2023.
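To make the arithmetic concrete, here is a small sketch in base R using made-up season totals for a single hypothetical team. It follows the same convention as the analysis code in this post, where team blocks are counted as block solos plus half of the block assists.

```r
# Made-up season totals for one hypothetical team (not real data)
team <- data.frame(
  sets = 100,
  kills = 1400, aces = 150, block_solos = 40, block_assists = 400,
  attack_errors = 450, service_errors = 220,
  block_errors = 30, ball_handling_errors = 20
)

# Team blocks: solos plus half of the assists (the convention assumed here)
team$blocks <- team$block_solos + team$block_assists / 2

# Earned points per set: kills + aces + blocks, divided by sets played
team$earned_pts_set <- (team$kills + team$aces + team$blocks) / team$sets

# Points given up per set: every error type added together, divided by sets played
team$given_pts_set <- (team$attack_errors + team$service_errors +
  team$block_errors + team$ball_handling_errors) / team$sets

team[, c("earned_pts_set", "given_pts_set")]
```

With these invented totals the team earns 17.9 points per set and gives away 7.2, which would put it inside the 17-to-18 range described above.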

Code
#| message: false

#make some more vectors with teams from the power 5 conferences 

sec <- c("Florida", "Kentucky", "Missouri", "Alabama", "Arkansas", "Georgia", "Ole Miss", "Miss St.", "Tennessee", "South Carolina", "LSU", "Texas A&M", "Auburn")
pac_12 <- c("Stanford", "Oregon", "UCLA", "USC", "Arizona", "Arizona St.", "Colorado", "Utah", "Oregon St.", "Washington", "Washington St.", "California")
big_12_22 <- c("Texas", "Iowa St.", "Kansas", "Kansas St.", "Baylor", "Oklahoma", "TCU", "Texas Tech", "West Virginia")
big_12_23 <- c("Texas", "Iowa St.", "Kansas", "Kansas St.", "Baylor", "Oklahoma", "TCU", "Texas Tech", "West Virginia", "BYU", "UCF", "Houston", "Cincinnati")
acc <- c("Pittsburgh", "Florida St.", "Miami", "Clemson", "NC State", "Boston College", "Louisville", "Notre Dame", "Syracuse", "Duke", "North Carolina", "Wake Forest", "Georgia Tech", "Virginia", "Virginia Tech" )

power5_22 <- c(big_ten, sec, pac_12,big_12_22, acc)
power5_23 <- c(big_ten, sec, pac_12,big_12_23, acc)

top_ten22 <- c("Texas", "Louisville", "San Diego", "Pittsburgh", "Wisconsin", "Stanford", "Oregon", "Ohio St.", "Nebraska", "Minnesota")

top_ten23 <- c("Texas", "Nebraska", "Wisconsin", "Pittsburgh", "Stanford", "Louisville", "Oregon", "Arkansas", "Tennessee", "Kentucky")

offense_earned_given_22 <- offense |>
  
   select(
    team_name,
    season,
    aces,
    kills,
    sets_won,
    sets_lost,
    block_solos,
    block_assists,
    block_errors,
    service_errors,
    attack_errors,
    ball_handling_errors
  ) |>
  filter(season=="2022")|>
  group_by(team_name) |>
  drop_na()|>
  summarize(
    total_sets = sum(sets_won+ sets_lost),
    earned_pts = sum(aces + kills +(block_solos + (block_assists/2))),
    pts_given = sum(service_errors + attack_errors +block_errors+ball_handling_errors)
  ) |>
  mutate(
    earned_pts_set = earned_pts/total_sets,
    given_pts_set= pts_given/total_sets 

  ) |>
  rename(Team = team_name) |>
  select(Team, earned_pts_set, given_pts_set) |>
  filter(Team %in% top_ten22)|>
  arrange(desc(earned_pts_set))

offense_earned_given_23 <- offense |>
  
   select(
    team_name,
    season,
    aces,
    kills,
    sets_won,
    sets_lost,
    block_solos,
    block_assists,
    block_errors,
    service_errors,
    attack_errors,
    ball_handling_errors
  ) |>
  filter(season=="2023")|>
  group_by(team_name) |>
  drop_na()|>
  summarize(
    total_sets = sum(sets_won+ sets_lost),
    earned_pts = sum(aces + kills +(block_solos + (block_assists/2))),
    pts_given = sum(service_errors + attack_errors +block_errors+ball_handling_errors)
  ) |>
  mutate(
    earned_pts_set = earned_pts/total_sets,
    given_pts_set= pts_given/total_sets 

  ) |>
  rename(Team = team_name) |>
  select(Team, earned_pts_set, given_pts_set) |>
  filter(Team %in% top_ten23)|>
  arrange(desc(earned_pts_set))



#make a table for 22 pts/earned per set

 offense_earned_given_22 |>
  gt()|>
  cols_label(
    earned_pts_set="Earned Pts/set",
    given_pts_set="Errors/set"
  )|>
  tab_header(
    title = "Stanford had no problems scoring against teams in 2022",
    subtitle = "Texas didn't gift many points"
  )|>
   tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  )|>
  tab_source_note(
    source_note = md("By **Derek Harrington**  <br>Data courtesy of **Evollve.net**"))|>
   tab_style(
    locations = cells_column_labels(columns = everything()),
    style = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold", size = 12)))|>
    opt_row_striping() |>
  opt_table_lines("none") |>
  fmt_number(
    columns = c(earned_pts_set, given_pts_set),
    decimals = 2)|>
  tab_style(
    style = list(
      cell_fill(color = "white"),
      cell_text(color = "darkgreen", weight="bold")
    ),
    locations = cells_body(
      rows = Team == "Texas"
    ))|>
      tab_style(
    style = list(
      cell_fill(color = "white"),
      cell_text(color = "red", weight="bold")
    ),
    locations = cells_body(
      rows = Team == "Nebraska"
    
  ))
Stanford had no problems scoring against teams in 2022
Texas didn't gift many points
Team        Earned Pts/set  Errors/set
Stanford             18.84        7.79
Texas                18.68        6.50
Wisconsin            18.00        7.16
San Diego            17.87        6.79
Pittsburgh           17.72        7.06
Louisville           17.66        7.20
Ohio St.             17.46        7.64
Minnesota            17.40        6.77
Oregon               17.30        7.29
Nebraska             16.63        7.76
By Derek Harrington
Data courtesy of Evollve.net

What’s the story here? First of all, the thing jumping off the page to me is Stanford was earning nearly 19 points a set which is truly outstanding. However, they also led the top 10 in points given away per set. Perhaps that’s one of the reasons why San Diego was able to upset the Cardinal in the regional final at Stanford. Both teams were tremendous in 2022, but on average, San Diego didn’t give away as many points as Stanford.

Texas ranks second for total points earned per set and that’s not a surprise. That Texas team had Logan Eggleston, Madi Skinner and Asjia O’Neal as its go-to hitters. They were truly a great offensive team AND they also didn’t gift you many free points. If you lined up against the Longhorns, you were going to have to earn every point.

Nebraska ranks last for earned points per set, which is a bit surprising considering the switch to a 6-2 had improved the offense over 2021. The final numbers may have been impacted somewhat by the changes in personnel due to the injury, but I’m not sure it would make that big of a difference. The 2022 Huskers were also giving away far too many points to other teams, ranking second just behind Stanford.

So, recalling Nebraska’s offensive jump from 2022 to 2023, did the Huskers make a significant jump in the 2023 standings?

Code
 offense_earned_given_23 |>
  gt()|>
   cols_label(
    earned_pts_set="Earned Pts/set",
    given_pts_set="Errors/set"
  )|>
     tab_header(
    title = "Stanford continued its dominance over the field in 2023",
    subtitle = "Nebraska earned more points, but Texas still won the title"
  )|>
   tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  )|>
  tab_source_note(
    source_note = md("By **Derek Harrington**  <br>Data courtesy of **Evollve.net**"))|>
   tab_style(
    locations = cells_column_labels(columns = everything()),
    style = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold", size = 12)))|>
    opt_row_striping() |>
  opt_table_lines("none") |>
  fmt_number(
    columns = c(earned_pts_set, given_pts_set),
    decimals = 2)|>
  tab_style(
    style = list(
      cell_fill(color = "white"),
      cell_text(color = "darkgreen", weight="bold")
    ),
    locations = cells_body(
      rows = Team == "Texas"
    )
  )|>
   tab_style(
    style = list(
      cell_fill(color = "white"),
      cell_text(color = "red", weight="bold")
    ),
    locations = cells_body(
      rows = Team == "Nebraska"
    ))
Stanford continued its dominance over the field in 2023
Nebraska earned more points, but Texas still won the title
Team        Earned Pts/set  Errors/set
Stanford             19.02        7.88
Tennessee            18.71        7.19
Wisconsin            18.52        6.42
Oregon               18.29        7.58
Pittsburgh           18.19        6.95
Texas                17.92        6.84
Kentucky             17.80        7.79
Arkansas             17.74        7.28
Nebraska             17.73        7.95
Louisville           17.64        7.44
By Derek Harrington
Data courtesy of Evollve.net

Nope. Sure, Nebraska did improve slightly in total earned points per set, getting into the upper 17s, but there were clearly teams better at earning points than the Huskers. Stanford did eclipse the 19 earned points per set mark in 2023. However, the Cardinal lost yet again in the regional final, at home, this time to the Texas Longhorns. Stanford losing at home in the regional final in back-to-back years, especially with how dominant they were, is quite thought-provoking.

Nebraska only slightly improved its earned points per set, and it led the top 10 teams in errors per set. And yet, the Huskers made it to the national championship and only lost two games. Texas, who knocked off Stanford, Wisconsin and Nebraska en route to the title, put together a remarkable post-season run. It’s more impressive when you consider the fact Texas faced match point in the fourth set against Tennessee in the regional semifinal. But the Longhorns prevailed over the Volunteers and weren’t seriously challenged the rest of the tournament.

So, there’s an element that’s still missing here. Why was Nebraska the team to beat in 2023 even though the above numbers suggest they were only slightly better than in 2022? Well... if you followed social media whenever Nebraska played in 2023, there was a common theme. It would be pointed out that teams seemingly couldn’t keep a serve in play to save their lives when they faced Nebraska. But is that really what happened?

Code
#make a table showing opponents errors for 2023
#make new dataframe for Nebraska's gifts
Nebraska_gifted23 <- offense |>
  filter(season=="2023")|>
  
   select(
    team_name,
    aces,
    kills,
    sets_won,
    sets_lost,
    block_solos,
    block_assists,
    block_errors,
    service_errors,
    attack_errors,
    ball_handling_errors,
    opp_service_errors,
    opp_attack_errors,
    opp_ball_handling_errors
  ) |>
  group_by(team_name) |>
  drop_na()|>
  summarize(
    total_sets = sum(sets_won+ sets_lost),
    earned_pts = sum(aces + kills +(block_solos + (block_assists/2))),
    pts_given = sum(service_errors + attack_errors +block_errors+ball_handling_errors),
    gift_points= sum(opp_service_errors + opp_ball_handling_errors + opp_attack_errors),
    se_gifts=sum(opp_service_errors)
  ) |>
  mutate(
    earned_pts_set = earned_pts/total_sets,
    given_pts_set= pts_given/total_sets,
    unearned_pts_set=gift_points/total_sets,
    SE=se_gifts

  ) |>
  rename(Team = team_name) |>
  select(Team, unearned_pts_set,SE) |>
  filter(Team %in% top_ten23)|>
  arrange(desc(SE))


#table with service errors from opponents

Nebraska_gifted23 |>
  gt(
  )|>
  cols_label(
    SE="Opp service errors",
    unearned_pts_set="Opp errors/set"
  )|>
     tab_header(
    title = "Nebraska's opponents were error-prone",
    subtitle = "Huskers' opponents served too aggressively"
  )|>
   tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  )|>
  tab_source_note(
    source_note = md("By **Derek Harrington**  <br>Data courtesy of **Evollve.net**"))|>
   tab_style(
    locations = cells_column_labels(columns = everything()),
    style = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold", size = 12)))|>
    opt_row_striping() |>
  opt_table_lines("none") |>
  fmt_number(
    columns = SE,
    decimals = 0)|>
  tab_style(
    style = list(
      cell_fill(color = "white"),
      cell_text(color = "darkgreen", weight="bold")
    ),
    locations = cells_body(
      rows = Team == "Nebraska"
    )
  )
Nebraska's opponents were error-prone
Huskers' opponents served too aggressively
Team        Opp errors/set  Opp service errors
Nebraska          8.735537                 325
Texas             8.551402                 277
Stanford          7.300000                 269
Arkansas          7.401575                 265
Louisville        7.948718                 260
Oregon            7.928571                 260
Pittsburgh        8.387931                 260
Kentucky          6.916667                 230
Wisconsin         8.260504                 227
Tennessee         7.226415                 222
By Derek Harrington
Data courtesy of Evollve.net

Yes! Whoever played Nebraska seemed to have issues with keeping the ball in play. Not only did the Huskers lead the top 10 in points gifted by the other team, but they also led in opponent service errors. The question on your mind is probably why? A logical guess would be that because Nebraska was such a good passing team in 2023 AND the Huskers had some formidable attackers, teams had to crank up the service pressure to get them out-of-system.

The downside to that is the number of service errors. This makes Texas’ win in the national championship even more impressive because the Longhorns served Nebraska off the court, and perhaps Nebraska didn’t know what to do when it wasn’t getting the free points it was accustomed to. By all accounts, Nebraska is an even better passing team in 2024, so it would be interesting to revisit these numbers to see if there’s a similar story this season.
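One caveat when revisiting these numbers: raw service-error totals grow with the number of sets a team plays, so a per-set rate may be a fairer comparison. Here is a minimal sketch in base R with two hypothetical teams and made-up totals. The column name `opp_se_per_set` is my own and is not part of the data set.

```r
# Made-up totals for two hypothetical teams (not real data)
opp_errors <- data.frame(
  team = c("Team A", "Team B"),
  total_sets = c(120, 100),
  opp_service_errors = c(300, 270)
)

# Opponent service errors per set played
opp_errors$opp_se_per_set <- opp_errors$opp_service_errors / opp_errors$total_sets

# Sort from most to fewest free points per set
ranked <- opp_errors[order(-opp_errors$opp_se_per_set), ]
ranked
```

In this invented case Team A has more total opponent service errors, but Team B collects more of them per set (2.7 versus 2.5), so the ranking flips once sets played are taken into account.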

Takeaways

We know hitting efficiency is a stat that everyone wants to look at, especially during a game, to see how much success a team is having. While it can give an instant snapshot of a game, season, etc., it’s not the entire story. Earned points and points given away per set can show which teams are great at scoring the ball in a variety of ways, and who struggles with giving away free points.

Perhaps the best metric for analyzing and predicting a team’s success is hitting differential: a team’s own hitting percentage minus the hitting percentage it allows its opponents. While it’s not 100% foolproof (hello, Texas’ run through the 2023 tournament!), it does a decent job of predicting who the elite teams are.
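For anyone who wants to try it, here is a minimal sketch of hitting differential in base R, taking it to mean a team’s hitting percentage minus its opponents’ hitting percentage. The helper name and all of the totals are made up for illustration, and I’m assuming opponent kills and attempts would be available alongside opponent attack errors.

```r
# Hypothetical helper: hitting percentage = (kills - errors) / total attempts
hitting_pct <- function(kills, errors, attempts) {
  (kills - errors) / attempts
}

# Made-up season totals for one team and for its opponents (not real data)
own_pct <- hitting_pct(kills = 1500, errors = 450, attempts = 3500)
opp_pct <- hitting_pct(kills = 1200, errors = 600, attempts = 3600)

# Hitting differential: how much more efficiently a team attacks than its opponents
hit_diff <- own_pct - opp_pct
round(c(own = own_pct, opp = opp_pct, diff = hit_diff), 3)
```

A positive differential means a team attacks more efficiently than the teams it faces, which captures offense and defense in a single number.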