By way of introduction, I think, quite possibly, that I've always been destined to do a column on basketball statistics.
I've always been a very good math student, earning A's in my classes despite being far advanced for my age and competing in Math Team during both elementary school and high school. Since second grade, I've been a huge basketball fan. It was only natural to put the two together.
At the age of nine, my favorite basketball cards were Skybox 1990-1991, because they had per 48 minute numbers on the back of the card. When I was a kid, I thought per 48 minute numbers were just the greatest thing ever. "After all," I thought then, "how fair is it to compare the numbers of a guy who plays 30 minutes a night to those of a guy who plays 10?"
I'd pull out the cards, and faithfully transcribe per 48 values in several categories into one of several notebooks I filled up with extracurricular work. I'd do this for all the starters at a given position, rank them, and add the ranks to determine who the best all-around player was.
Hopefully, the methods I use now won't seem similarly primitive in ten more years.
By 1996, when my family had bought a computer (as well as Seattle SuperSonics season tickets), I would enter the stats into Works from the paper so that I could calculate statistics like assists/turnover, steals/turnover, and true shooting percentage (which I got from the Rick Barry Pro Basketball Bible).
However, in ensuing years, I began to turn away from watching basketball, and likewise looking at NBA statistics. Like so many before me, I discovered how much easier baseball was to quantify.
When I first got the internet in the year 2000, three sites in particular became favorites of mine: BskBALL.com, baseballprospectus.com, and Rob Neyer's column at espn.com. BskBALL's importance we'll get to; bp.com and Neyer are significant because both are devoted to the statistical analysis of baseball, which I gained a new appreciation for. I can only wonder how I could have been so ignorant of the value of the walk.
One day, when I had some free time, I decided to see if I could find any similar work with regard to basketball. I found a pair of promising sites (that I recall now; I'm sure there were others which have slipped to the recesses of my mind). Dean Oliver's JoBS (Journal of Basketball Statistics) was an excellent resource, though unfortunately it hadn't been updated in quite a while. NBA Stat Site: The Hidden Game had only a handful of articles, but was an excellent resource for secondary statistics on every player going back several years.
I was left somewhat unfulfilled by the paucity of NBA statistical analysis, but filed it away at the time.
Last spring, a pair of seemingly unrelated events piqued my interest in NBA statistics again. First off, I took a statistics class at the University of Washington during fall quarter which I enjoyed immensely. While calculus, the other college-level math I had taken, was largely theoretical, statistics were applicable and real to me. That application was, again, the NBA. The previous fall, I had returned to being a basketball fanatic (as opposed to a fan) when Nate McMillan was hired as coach of my beloved Sonics. This basketball renaissance led to the second event. BskBALL had an opening to write about the Sonics, and I was chosen to serve as the columnist.
My selection as a columnist did two things on a statistical level. First, I vowed to take a look at some relatively uncomplicated secondary stats for the column, which I did, giving me more experience with these types of metrics. Second, it directed me to BskBALL's message board, where I eventually found time to join in discussions with a few other statistically-minded posters.
From discussions with these others, along with my own experience, I began to feel that there was an interest in NBA statistical analysis which was simply going unmet. Interested as I was in humbly trying to fill this void to some extent or another, I saw no place to do so.
That is, I didn't see one until I heard from BskBALL about his newest pet project, ProSportsWriters.net. I was interested in the site for another, more practical reason. As noted before, I'm in college and thinking about my future. I discovered when I started writing for BskBALL.com about the Sonics that I rather enjoyed the work, and have since stepped up my amateur basketball journalism with my own website, SonicsCentral.com. I'm very interested in pursuing sports journalism as a career, and this site seems to me a great opportunity to get my work into a public forum with a wide (and influential, hopefully) audience. That I'd get to write about statistics is an excellent bonus.
So, that's some background about me. What will this column be like? If you hadn't guessed by now, I hope for a significant statistical undertone to this column. That said, nobody's interested in just reading a list of statistics I've prepared. What makes statistics relevant and useful is their analysis and application. To do this, my plan is to try, in each column, to answer an interesting question about the NBA with a look at various related statistics.
This could be anything from "Who deserves the NBA's MVP?" to "Are today's players really less ready to contribute their rookie season?" to "How much more valuable is one draft pick than another?" to "Does a team need a superstar to win a championship?" This is where (hopefully) you, the readers, come in. I have lots of questions I'd like to answer, those above being just a few, but I think people would be far more interested in reading the questions they want answers to. So please send me anything you'd like to see in the column via the link at the top of the page. If you don't have a web-based e-mail, my address is email@example.com.
To conclude, I'd like to briefly discuss a related set of statistics of my own invention which I will be using in this column. In my discussions at BskBALL.com, I decided to come up with my own all-encompassing single stat. It's hardly unique; every statistical NBA analyst has their own version, and they're all predicated on the same principle -- analyze how effectively a player has made use of offensive possessions, and translate their production in other areas (rebounding, defense, passing) into fractions of a possession.
The general principle is to add the good stuff (points, blocks, steals, assists, rebounds) and subtract the bad (turnovers, field goal misses, free throw misses, sometimes fouls and ejections). I decided to divide instead. I don't know why; I just like dividing more than subtracting. I also think the values I've used for each non-scoring good thing are better. For example, I don't think I've seen a formula that separates offensive and defensive rebounds. To me, an offensive rebound is far more valuable. Why? Well, in theory, if the player who we're analyzing wasn't on the court, a defensive rebound is far more likely to result from a missed shot than an offensive rebound. On average, the ratio of offensive to defensive boards is about 3:7. If we assume that everyone contributes equally, the player we're analyzing contributes .6 offensive boards, so the rest of his team would get 2.4 offensive boards per 10 missed shots (close enough to 1/4 to me). Thus, an offensive board means .75 more possessions than expected, a defensive board .25 more. Besides the bad stuff (I threw out fouls and ejections; they're really not that meaningful to me), I added a measure of the non-point good stuff to the denominator to balance the effect of these things. Finally, to keep someone who doesn't shoot and does non-scoring things from becoming way too important, I added in minutes, which also helps make the formula more accurate for players who don't play much.
The final formula, then:
(points + (.75 * offensive rebounds) + (.25 * defensive rebounds) + (.5 * assists) + steals + (.33 * blocks))/(1.5 * (field goals attempted + (.48 * free throws attempted) + turnovers + (.375 * offensive rebounds)) + (.125 * defensive rebounds) + (.25 * assists) + (.5 * steals) + ((1/6) * blocks) + (.25 * minutes))
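For readers who would rather see the formula as code, here is a minimal Python sketch. It follows the parentheses exactly as printed above, so the 1.5 multiplier covers field goal attempts, the free throw term, turnovers, and the .375 offensive rebound term. The names StatLine and efficiency, and the sample stat line, are made up for illustration and are not real player data.

```python
from dataclasses import dataclass


@dataclass
class StatLine:
    """Box-score totals for one player (hypothetical container, not from the column)."""
    points: float = 0.0
    offensive_rebounds: float = 0.0
    defensive_rebounds: float = 0.0
    assists: float = 0.0
    steals: float = 0.0
    blocks: float = 0.0
    field_goals_attempted: float = 0.0
    free_throws_attempted: float = 0.0
    turnovers: float = 0.0
    minutes: float = 0.0


def efficiency(s: StatLine) -> float:
    """Efficiency ratio, transcribed term by term from the formula as printed."""
    # Numerator: points plus the weighted non-scoring "good stuff".
    production = (
        s.points
        + 0.75 * s.offensive_rebounds
        + 0.25 * s.defensive_rebounds
        + 0.5 * s.assists
        + s.steals
        + 0.33 * s.blocks
    )
    # Denominator: possessions used (scaled by 1.5), the balancing terms
    # for the non-scoring stats, and a quarter of the minutes played.
    usage = (
        1.5 * (
            s.field_goals_attempted
            + 0.48 * s.free_throws_attempted
            + s.turnovers
            + 0.375 * s.offensive_rebounds
        )
        + 0.125 * s.defensive_rebounds
        + 0.25 * s.assists
        + 0.5 * s.steals
        + (1 / 6) * s.blocks
        + 0.25 * s.minutes
    )
    if usage == 0:
        raise ValueError("empty stat line: the denominator is zero")
    return production / usage


# Made-up per-game line, purely for illustration.
example_line = StatLine(
    points=20, offensive_rebounds=2, defensive_rebounds=6, assists=4,
    steals=1, blocks=1, field_goals_attempted=16, free_throws_attempted=5,
    turnovers=3, minutes=36,
)
print(f"{efficiency(example_line):.3f}")  # roughly 0.59
```

Because the formula is a ratio, it gives the same answer whether you feed it per-game averages or season totals, as long as every field uses the same basis.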
I will try to be up-front in this column. I'm not interested in tricking the reader so I look good or achieve the desired result; instead, I'm only interested in the search for truth in the game of basketball. So, here's my caveat about this formula. I have found, conclusively, that it consistently underrates the perceived value of man defensive specialists. Guys like Bruce Bowen, Randy Brown, Emanual Davis, et al., who are in the NBA primarily because of their ability to lock up a player on D (but usually not collect steals), end up looking horrible. I'm afraid I really don't have an idea how to fix this. I think it's just a limitation of the stats the NBA provides. Or, on the other hand, maybe these players are simply overrated in real life. I can't answer that question, though I tend to lean towards them being better than the stats say.
The nice thing about this ratio is that it fairly closely approximates field goal percentages, with normal ratings going from about .4 to .6, and a good player being at .5 and above. In this sense, it's sort of like the baseball stat Equivalent Average, which puts secondary stats into the familiar range of batting average.
Moving on, this formula judges the efficiency of players on essentially a basis independent of minutes played. This can be nice for finding out some things, like what little-used players (read: Todd MacCulloch) will break out if they get more minutes. However, if we're really judging the value of a player to his team, minutes are obviously an extremely significant part. My answer? Multiply the efficiency by minutes played to get the player's value to the team.
Dividing this by games played gives us a third statistic, value per game, which also nicely approximates an existing stat, in this case points per game. Great players still get around 20 vpg, although weaker players have higher vpg values than ppg. Teams are generally at about 120 vpg, as opposed to 100 ppg.
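As a quick sketch of the arithmetic, value is efficiency times minutes, and value per game is that total divided by games played. The function names and the sample numbers below are hypothetical, chosen only to show the scale the stat lands on.

```python
def value(efficiency: float, minutes: float) -> float:
    """Value to the team: efficiency multiplied by total minutes played."""
    return efficiency * minutes


def value_per_game(efficiency: float, minutes: float, games: int) -> float:
    """Value divided by games played (vpg)."""
    if games <= 0:
        raise ValueError("games must be positive")
    return value(efficiency, minutes) / games


# Made-up season: .55 efficiency over 2,880 minutes in 80 games.
season_value = value(0.55, 2880)
season_vpg = value_per_game(0.55, 2880, 80)
print(f"value: {season_value:.0f}, vpg: {season_vpg:.1f}")  # value: 1584, vpg: 19.8
```

A .55 efficiency at 36 minutes a night works out to about 20 vpg, which is the range described above for great players.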
Finally, I realized when contemplating the use of the value statistic to compare players over the course of careers that it would unjustly aid players who hung around far too long in their careers, because no matter how bad they were, they'd always be picking up value. So, I stole another baseball concept, VORP -- value over replacement player. To find this, I simply subtracted .44 from the efficiency (this is about the level of a player who could be signed as an in-season free agent, or who is collecting dust at the end of the bench) and multiplied again by the minutes. After using it, I've concluded that VORP gives a much better picture of a player's true worth than the raw value statistic does. Star players usually have over 300 VORP over a season, while bad players can easily rack up negative scores, even large ones.
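In code, the VORP calculation is a one-liner. This is a minimal sketch using the .44 replacement level given above; the function name, the constant name, and the two sample players are invented for illustration.

```python
REPLACEMENT_LEVEL = 0.44  # efficiency of a freely available player, per the column


def vorp(efficiency: float, minutes: float,
         replacement_level: float = REPLACEMENT_LEVEL) -> float:
    """Value over replacement player: (efficiency - replacement level) * minutes."""
    return (efficiency - replacement_level) * minutes


# Made-up examples: a star playing heavy minutes and a below-replacement reserve.
star = vorp(0.55, 2880)     # 0.11 above replacement over 2,880 minutes
reserve = vorp(0.40, 1000)  # 0.04 below replacement over 1,000 minutes
print(f"star: {star:.1f}, reserve: {reserve:.1f}")  # star: 316.8, reserve: -40.0
```

The sign flip is the point: a player below .44 loses ground with every extra minute, so simply hanging around no longer pads a career total.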
Alright, I think that's everything for the introduction. I'd like to encourage any readers out there to e-mail me, as mentioned, but also to discuss my future conclusions. These stats and conclusions are just a starting point, and often the real point is to get a discussion started, so I'll be very glad to hear from readers their opinions and debate things with them.
My plan is to regularly update this column each Friday. If I don't hear anything before next Friday, I'm going to discuss a favored topic amongst stat-types, defending NBA MVP Allen Iverson.