Updated BPM All Time - 2015
Daniel Myers, the creator of Box Plus/Minus (BPM), recently released an updated version of BPM. This, along with the completion of the 2015 playoffs, led me to update the BPM All Time list. The new version includes big improvements to defense and underrates perimeter defenders far less. I kept the methodology far simpler this time and used hall rating, a concept borrowed from baseball. Basically, it takes into account longevity (wins over replacement) and peak (wins over average). I computed this for the regular season and the playoffs for each player and added the two to get a player's total hall rating. One could give playoff hall rating a higher weight, but it is unclear exactly what that weight should be, so I didn't attempt that for now.
For seasons in which BPM was not available (before 1978), I used an approximation of BPM based on a formula provided by Neil Paine of Fivethirtyeight.com: 0.15 * (PER - 15) + 30.5 * (WS/48 - 0.1). Also, Lebron has made the jump to 2nd all time and is now the greatest playoff performer ever. It will be interesting to see whether he can catch Jordan; that will largely depend on how much longer he plays and how many elite years he has left.
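The approximation formula quoted above is easy to express as a small function. This is only a sketch of that one formula; the function name `approx_bpm` and the sample inputs are made up for illustration and are not taken from the original calculations.

```python
def approx_bpm(per: float, ws_per_48: float) -> float:
    """Approximate BPM from PER and WS/48 using the formula quoted in the post."""
    return 0.15 * (per - 15.0) + 30.5 * (ws_per_48 - 0.1)


# A league-average line (PER 15, WS/48 of 0.100) maps to an approximate BPM of 0.
print(approx_bpm(15.0, 0.100))

# Hypothetical elite season, for illustration only.
print(round(approx_bpm(30.0, 0.300), 2))
```

Both PER and WS/48 are centered on their league-average values (15 and 0.100), so the approximation is centered on zero in the same way BPM is.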
Here are the Top 10 players in NBA history according to BPM:
All-Time List based on PER (Player Efficiency Rating)
As part of my ongoing series on constructing all-time lists from various advanced metrics, I bring to you the all-time list for the highly popular statistic, PER. PER, which was created by John Hollinger, is perhaps the best-known advanced statistic today. In order to convert PER (a rate stat) to actual production, we can rely on Hollinger's estimated wins added (EWA), which computes the number of wins a player added to a team based on their PER, position, and minutes played.
The methodology for computing a player's all time score from their EWA was simple. First I computed their regular season EWA. Next, I computed their EWA in each round of the playoffs and gave higher weight to successive rounds (e.g. conference finals matter more than the opening round). Finally, I simply summed the player's regular season EWA and weighted playoff EWA.
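The summation described above can be sketched in a few lines. The post does not state the actual round weights, so the values in `ROUND_WEIGHTS` are hypothetical placeholders, as are the round names and the function name `ewa_all_time_score`.

```python
# Hypothetical weights: the post only says that later rounds count for more.
ROUND_WEIGHTS = {
    "first_round": 1.0,
    "conf_semis": 1.25,
    "conf_finals": 1.5,
    "finals": 2.0,
}


def ewa_all_time_score(regular_season_ewa, playoff_ewa_by_round, weights=ROUND_WEIGHTS):
    """Regular-season EWA plus round-weighted playoff EWA."""
    weighted_playoffs = sum(
        weights[round_name] * ewa for round_name, ewa in playoff_ewa_by_round.items()
    )
    return regular_season_ewa + weighted_playoffs


# Made-up career totals, for illustration only.
print(ewa_all_time_score(100.0, {"first_round": 4.0, "finals": 2.0}))
```

Keeping the weights in one dictionary makes it easy to see how sensitive the final ranking is to the choice of weights.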
Here are the Top 10 players in NBA history according to PER:
The Top 2 players of all time (MJ, Kareem) are consistent so far across all of the metrics that have been looked at (BPM, SPM, PER). Lebron just passed Wilt for 4th all time on this list during the current playoff run. He has a very good shot at passing Shaq and moving to 3rd by the time his career is over. Kobe and Duncan are, not surprisingly, very close to each other and are ranked 6th and 7th; both have a shot at passing Wilt in the next season or two (although it's not very likely). Please share your thoughts and feedback in the comments below.
All-Time List based on Statistical Plus/Minus (SPM)
Recently, I released an all-time list based on Box Plus/Minus (BPM). I then looked at another very good statistic for player evaluation known as Statistical Plus/Minus (SPM). SPM, whose formula can be found here, belongs to the same class of stats as BPM: those that combine box score stats with plus-minus. SPM empirically performs very well and achieves levels of accuracy similar to BPM. While BPM performs slightly better, SPM can give us a new look at certain players; therefore, in the context of an all-time list, it's best to look at the results of both.
SPM is a very interesting statistic in that it looks at box score statistics adjusted for pace and minutes played, which makes player comparisons across eras possible. It even has non-linear terms such as versatility (the cube root of the product of minute- and pace-adjusted points, rebounds, and assists), and it also includes a usage term, which is key. In order to convert SPM to production, you can calculate a player's VORP (Value over Replacement Player) using the formula [SPM - (-2.0)] * (minutes/total team minutes) * (games played/total games).
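As a quick illustration, the VORP formula above translates directly into code. This is a minimal sketch: the function and parameter names are my own, and exactly how "total team minutes" is counted is an assumption left to the caller.

```python
REPLACEMENT_LEVEL = -2.0  # replacement-level SPM used in the formula above


def vorp_from_spm(spm, minutes, team_minutes, games_played, team_games):
    """[SPM - (-2.0)] * (minutes / total team minutes) * (games played / total games)."""
    minute_share = minutes / team_minutes
    game_share = games_played / team_games
    return (spm - REPLACEMENT_LEVEL) * minute_share * game_share


# Made-up season: +6.0 SPM, half of the available minutes, every game played.
print(vorp_from_spm(6.0, 2000, 4000, 82, 82))
```

A player at exactly replacement level (SPM of -2.0) gets a VORP of zero no matter how much he plays, which is what makes the stat a measure of value over replacement.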
To obtain an all time score for each player, I first calculated their regular season VORP. Since I had to manually code SPM (it's not available on basketball-reference), it was very difficult to compute series-specific SPMs. Therefore, to properly attempt to factor in the playoffs, I instead gave higher weight to deeper playoff runs. This is indirectly rewarding good performance in the conference finals, finals, etc. Next, I calculated the weighted sum of each player's playoff VORPs. Finally, I added the player's regular season VORP and total playoff VORP.
Here are the Top 10 players in NBA history:
According to SPM, Jordan, Kareem, Lebron, and Kobe are the clear Top 4 players in NBA history. It's interesting that SPM captured the exact same Top 10 as BPM did, which is strong evidence in support of these players (BPM and SPM are probably the 2 best advanced stats we have right now). Lebron will pass Kareem very quickly and has a good shot at supplanting Jordan in the top spot. Furthermore, if Kobe can be productive next year, he may have a shot at passing Kareem as well (although it will require near elite level production). Feel free to leave your thoughts and feedback below in the comments.
All Time List based on Box Plus/Minus (BPM)
I looked at several factors in order to determine a player's "all time score" so a list could be created. First, I calculated a player's career total VORP in the regular season. Next, I looked at the playoffs, which are where a player truly shows what he is made of and where the best of the best have shined the brightest. Just like Neil Paine, I made adjustments for each player for the level of competition faced, home-court advantage, and the leverage of each game, and recalculated their VORP for each round of the playoffs. I weighted production in each successive round of the playoffs higher (e.g. conference finals matter more than the 1st round) and then computed the weighted sum to determine a player's career playoff production. Playoff production will obviously be quite a bit lower than regular season production due to the fewer minutes and games; this undervalues some of the greatest players who have had several championship runs, for example. Therefore, I multiplied each player's playoff production by 6. After this adjustment, playoff production is in general slightly more valuable than regular season production. I personally find this to be fair since the playoffs are where teams' offensive and defensive effort is at its highest. Furthermore, the sample size of playoff games is usually pretty large for most of the greatest players in NBA history, so this is generally not a concern. Finally, I summed a player's regular season production and playoff production and put the result on a scale of 1000, which gives their all time score. A score of 1000 can be thought of as the best possible career. Since BPM is not available for players before the 1973-1974 season due to certain box score stats not being recorded, it was calculated using estimates of these stats (this is for players such as Wilt, Jerry West, Oscar Robertson, etc.).
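The last two steps (the multiplier of 6 and the 1000-point scale) can be sketched as follows. The post does not say how the scaling was done, so dividing by a `reference_max` that stands for the "best possible career" is an assumption, and the function names and numbers are illustrative only.

```python
PLAYOFF_MULTIPLIER = 6  # the playoff multiplier stated in the post


def raw_career_value(regular_season_vorp, weighted_playoff_vorp):
    """Regular-season production plus boosted, round-weighted playoff production."""
    return regular_season_vorp + PLAYOFF_MULTIPLIER * weighted_playoff_vorp


def scale_to_1000(raw_value, reference_max):
    """Assumed scaling: express a raw value as a share of a reference 'best possible' value."""
    return 1000.0 * raw_value / reference_max


# Made-up inputs, for illustration only.
raw = raw_career_value(100.0, 10.0)
print(raw)
print(scale_to_1000(raw, reference_max=200.0))
```

With these made-up numbers the playoff term (60) is smaller than the regular-season term (100); the post's claim that playoff production ends up slightly more valuable depends on the real weighted playoff totals, which are not reproduced here.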
Here are the Top 20 players in NBA history:
Michael Jordan is by far the best player of all time -- who knew!? This list will remain constant in the near future with the one exception of Lebron James. Lebron is currently 3rd all time and will likely end up as the 2nd best player ever at his current pace, passing Kareem in the next few years. It's safe to say that Jordan is untouchable. Other active players that rank highly all time include Jason Kidd (26th), Chris Paul (28th), Paul Pierce (31st), Pau Gasol (41st), and Kevin Durant (47th).
Overall, in my opinion, this all time list is as close to the "truth" as you can get since it ranks players solely based on what they have done on the basketball court, using the best metric we have to measure player performance today. Feel free to leave your thoughts, and any feedback on improving the methodology, in the comments below.
Removing Outliers - A New Look at Kobe Bryant and Advanced Stats
Recently, I went back and looked at game logs from Kobe's prime years, namely 2000-2003 and 2005-2010. It was amazing how consistently good Kobe was on a night-to-night basis. He is often criticized for things such as never shooting above 47% from the field in a season or being inefficient. This was hard to believe after eyeballing his game logs, and then I came across the culprit: the outliers. I calculated that Kobe was prone to roughly 10 "bad" games per season in which he shot the ball very poorly, swaying his season averages quite a bit. This should be no surprise, as 82 games (fewer in his case, as he missed a few games here and there) is a small sample size and outliers do make a significant difference. Now, it's no secret that Kobe is more prone to these types of games than, say, Lebron because of his style of play. However, these outliers have perhaps painted Kobe's image in a negative light, which is rather unfair to him.
Therefore, I went through each of his prime seasons and removed 10 outliers from every NBA player's game log to recalculate their season averages. Over 85% of the data is still retained in these situations (I didn't just go and remove all of the games in which Kobe shot less than 50%). I hypothesized that this is likely a better method of finding out just how good someone is, and that it shows a new perspective on Kobe in his specific case. For the outlier removal, I used a One-Class Support Vector Machine (SVM) (reference: http://scikit-learn.org/stable/modules/outlier_detection.html). The SVM was trained on every player's game logs containing box score data. Since this is an automated machine learning method, the software picked up on different "types" of outliers for each player. For example, an outlier for Kobe may be a game with low FG%, while an outlier for a big man may be a game with few rebounds. Here are Kobe's updated season averages from 2000-2010 (with 2004 left out, as he missed too many games due to his rape case and other absences):
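For readers who want to try something similar, here is a minimal sketch of dropping a fixed number of outlier games with scikit-learn's `OneClassSVM`. It is not the original code: the feature columns, the `nu` and kernel settings, fitting one model per player, and the synthetic game log are all assumptions made for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def keep_after_outlier_removal(game_logs, n_outliers=10):
    """Return sorted row indices of the games kept after dropping the most anomalous ones."""
    # Standardize so that no single box score column dominates the kernel distance.
    features = StandardScaler().fit_transform(game_logs)
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(features)
    # Lower decision values mean "more anomalous".
    scores = model.decision_function(features).ravel()
    most_anomalous_first = np.argsort(scores)
    return np.sort(most_anomalous_first[n_outliers:])


# Synthetic 82-game log (made-up data): points, FG%, rebounds.
rng = np.random.default_rng(0)
logs = np.column_stack([
    rng.normal(30.0, 6.0, 82),
    rng.normal(0.46, 0.08, 82),
    rng.normal(5.5, 2.0, 82),
])

kept = keep_after_outlier_removal(logs)
print(len(kept))               # 72 games remain
print(logs[kept].mean(axis=0))  # recalculated averages over the kept games
```

Ranking games by decision value and dropping a fixed count guarantees that exactly 10 games are removed, which relying on the model's own inlier/outlier labels would not.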
As you can see, it has a dramatic impact on Kobe's statistics. His averages with 10 outliers removed are absolutely remarkable. Bryant's FG% on average is around 49% during this time period while maintaining a beyond elite TS% in every season as well. One season that stands out to me is his 2007 season in which he averaged 33.31 PPG on a 61.0% TS%; even after outlier removal the only player in NBA history to score at this volume and efficiency was Michael Jordan. Furthermore, after league-wide outlier removal, Kobe led the league in both PER and WS/48 in 2006 and 2007 (a feat accomplished by only a handful of players in NBA history). These results carry over to the playoffs as well. For example, Kobe's approximate PER of 30.88 in the 2009 playoffs, after outlier removal (6 games removed for each player), is the 5th highest ever in a championship run (at least 16 games).
Next, I wanted to see if my hypothesis was correct and if these updated advanced metrics were actually improved compared to before (i.e. correlated better with team win %). The method is to compute the minute-weighted average of WS/48 and PER for every team and regress this on winning %. This is described by Neil Paine here. He found that on a 1-year basis WS/48's correlation to wins was 0.694 and PER's correlation to wins was 0.638. I ran the correlation test again with the outlier-removed WS/48 and PER and obtained correlations of 0.726 and 0.654, respectively. This was very exciting because both statistics improved, which supports my hypothesis. It means that the outlier removal is not simply artificially inflating stats to make players like Kobe look better, but is doing a better job of explaining wins (and therefore player value).
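The team-level check described above can be sketched with nothing but the standard library. The rosters and win percentages below are made up, and the helper names are my own; the point is only to show the minute-weighted average and the correlation step.

```python
from math import sqrt


def minute_weighted_average(roster):
    """roster: list of (minutes, metric) pairs for one team."""
    total_minutes = sum(minutes for minutes, _ in roster)
    return sum(minutes * metric for minutes, metric in roster) / total_minutes


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up teams: (roster of (minutes, WS/48) pairs, win %).
teams = [
    ([(3000, 0.200), (2500, 0.120), (2000, 0.080)], 0.70),
    ([(2800, 0.150), (2600, 0.100), (2200, 0.090)], 0.55),
    ([(2900, 0.110), (2400, 0.090), (2100, 0.060)], 0.40),
    ([(3100, 0.090), (2300, 0.070), (1900, 0.050)], 0.30),
]

team_metric = [minute_weighted_average(roster) for roster, _ in teams]
win_pct = [pct for _, pct in teams]
print(round(pearson_r(team_metric, win_pct), 3))
```

Running the same two steps on the original and the outlier-removed versions of a metric gives the pair of correlations that the comparison relies on.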
To conclude, advanced stats do not "hate" Kobe Bryant. When popular advanced metrics such as WS/48 and PER are improved with outlier removal, they view him in a very favorable light. In fact, after this adjustment, he is one of the very few players to lead the league in both WS/48 and PER (Kareem, Moses Malone, Larry Bird, Jordan, Shaq, T-Mac, KG, Dirk, Lebron, and KD are the only others) -- and he did it twice (joining only Kareem, Bird, Jordan, Shaq, and Lebron in doing it multiple times). Further experiments would include seeing how the correlation of WS/48 and PER with wins changes with the number of outliers removed.
Feel free to leave your thoughts and suggestions in the comments below.
Comparing Kobe, Lebron, and Duncan in the Playoffs Based On Impact
Kobe Bryant, Lebron James, and Tim Duncan are the three best players of this generation (we can count Shaq as part of the previous generation). All three have accomplished enough in their illustrious careers to be considered Top 10 players of all time. Fans constantly debate which of the three is the best. I decided to compare them based on how they've performed in the playoffs, against the best competition, by looking at their on/off court plus-minus impact stats. The advantage plus-minus stats have over box score based metrics (e.g. PER, Win Shares) is that they technically account for everything a player did on the court.
In order to come up with an offensive on/off impact score I combined the following:
1) Team's change in eFG%
2) Team's change in ORTG (offensive rating)
3) Team's change in TOV%
Furthermore, in order to come up with a defensive on/off impact score I combined the following:
1) Opponent's change in eFG%
2) Opponent's change in ORTG (offensive rating)
3) Opponent's change in TOV%
4) Team's change in STL%
5) Team's change in BLK%
6) Opponent's change in TRB%
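One simple way to combine the components listed above is to take each on/off difference, flip the sign where lower is better, and add them up. The post does not say how the components were actually combined or weighted, so the unweighted sum, the sign conventions, the key names, and the sample numbers here are all assumptions for illustration.

```python
# +1 if a higher value with the player ON the court is good, -1 if it is bad.
OFFENSE_SIGNS = {"team_efg": +1, "team_ortg": +1, "team_tov": -1}
DEFENSE_SIGNS = {
    "opp_efg": -1,
    "opp_ortg": -1,
    "opp_tov": +1,
    "team_stl": +1,
    "team_blk": +1,
    "opp_trb": -1,
}


def impact_score(on_court, off_court, signs):
    """Sum of sign-adjusted on/off differences (positive means the player helps)."""
    return sum(sign * (on_court[key] - off_court[key]) for key, sign in signs.items())


# Made-up offensive on/off numbers, for illustration only.
on = {"team_efg": 52.0, "team_ortg": 110.0, "team_tov": 12.0}
off = {"team_efg": 50.0, "team_ortg": 105.0, "team_tov": 13.0}
print(impact_score(on, off, OFFENSE_SIGNS))
```

Because the components are on different scales (percentage points versus points per 100 possessions), a real implementation would probably standardize each component before summing.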
Here are the results ranked by combined net impact:
The table shows Kobe is the best overall player by this criterion. Each has played thousands of minutes in the playoffs, so sample size is not necessarily a concern. Kobe has the largest offensive impact and Duncan, not surprisingly, has the biggest defensive impact.