Sports analytics
Sports analytics are a collection of relevant historical statistics that can provide a competitive advantage to a team or individual. Through the collection and analysis of these data, sports analytics inform players, coaches and other staff in order to facilitate decision making both during and prior to sporting events. The term "sports analytics" was popularized in mainstream sports culture following the release of the 2011 film Moneyball, in which Oakland Athletics General Manager Billy Beane (played by Brad Pitt) relies heavily on analytics to build a competitive team on a minimal budget.
There are two key aspects of sports analytics: on-field and off-field analytics. On-field analytics deals with improving the on-field performance of teams and players, and addresses questions such as "which player on the Red Sox contributed most to the team's offense?" or "who is the best wing player in the NBA?" Off-field analytics deals with the business side of sports. It focuses on helping a sports organization or governing body surface patterns and insights in its data that can help increase ticket and merchandise sales, improve fan engagement, and so on. In essence, off-field analytics uses data to help rightsholders make decisions that lead to higher growth and increased profitability.[1]
As technology has advanced in recent years, data collection has become more in-depth and can be conducted with relative ease. These advancements have allowed sports analytics to grow as well, leading to the development of advanced statistics and machine learning,[2] as well as sport-specific technologies that allow teams to run game simulations prior to play, improve fan acquisition and marketing strategies, and even understand the impact of sponsorship on each team and its fans.[3]
Another significant impact of sports analytics on professional sports relates to sports gambling. In-depth sports analytics have taken sports gambling to new levels: whether in fantasy sports leagues or nightly wagers, bettors now have more information at their disposal to aid decision making. A number of companies and webpages have been developed to provide fans with up-to-the-minute information for their betting needs.[3]
Sport-specific analytic tools and measurements
Major League Baseball (MLB)
Early history
Baseball was one of the first sports to embrace sports analytics with Earnshaw Cook publishing Percentage Baseball in 1964. This was the first publication citing sports analytics to garner national media attention.[4] In 1981, Bill James helped bring SABR (Society for American Baseball Research),[5] one of the leading sports analytical organizations for baseball, into national prominence when Sports Illustrated featured James in the article He Does It By The Numbers by Daniel Okrent (1981).[6]
In 1984, New York Mets manager Davey Johnson became the first known member of a sports organization to advocate for the use of sports analytics. During his time with the Baltimore Orioles, Johnson had tried to convince the organization to use his FORTRAN baseball computer simulation to determine the team's optimal starting lineup. As manager of the Mets, Johnson tasked a team employee with writing a dBASE II application to run sophisticated statistical models in order to better understand the capabilities and tendencies of the team's opponents.[7] By the close of the twentieth century, sports analytics had gained significant acceptance among the management of many Major League Baseball clubs, notably the Oakland A's, Boston Red Sox and Cleveland Indians.
At the same time, baseball fans and sports media had begun to adopt sports analytics as a way to understand and report the game. In 1996, Baseball Prospectus[8] sought to build upon Bill James' work when it launched the Baseball Prospectus website to present sabermetric research and related findings and to publish advanced metrics such as EqA, the Davenport Translations (DTs), and VORP. Baseball Prospectus has grown into a multi-channel sports media organization employing a team of statisticians and writers who publish New York Times best-selling books and host weekly radio shows and podcasts.
Recent developments
MLB has set the benchmark in sports analytics for a number of years, and some of the game's brightest minds have never set foot on the field in a major or minor league game. Theo Epstein of the Chicago Cubs is one of those minds: he has never suited up in a professional baseball game, and instead relies on his Yale University education and the numbers behind the game to make many of his decisions.[9] Epstein is known for his role in ending two of baseball's most historic droughts: the Boston Red Sox's Curse of the Bambino in 2004 and, in the 2016 World Series, the Chicago Cubs' 108-year wait between World Series wins. He is a member of a growing community in Major League Baseball that does not rely on years of major league playing experience. This community has been able to grow thanks to the in-depth collection of statistics that has existed in baseball for decades. With analytics now relatively common in MLB, a broad range of statistics has become vital to the analysis of the game, including:
- Batting average is one of the most commonly discussed statistics in baseball. A player's batting average is determined by dividing the number of hits by the number of at bats for that player. Statistics can also show players which pitches they struggle with at the plate, revealing their tendencies and which pitch usually strikes them out.[10]
- On-base percentage measures how often a player reaches base by a hit, a walk, or being hit by a pitch. This is a significant offensive statistic because it looks beyond hits and, more importantly, illustrates how often a batter avoids being put out at the plate. It is a more in-depth offensive statistic than batting average because it takes into account walks and hit-by-pitches, both of which indicate how a player handles an at bat. Sabermetrics can help a player change their approach in order to raise their on-base percentage, increasing their productivity and ultimately their overall worth as a player.[11]
- Slugging average measures the number of bases a player earns on hits per at bat. To determine this statistic, the total number of bases earned on hits is divided by the number of at bats. It is a good measure of a batter's power: the higher the slugging average, the more likely the batter is to hit for extra bases (i.e. a double, triple or home run). For sluggers, analytics can help improve decision making at the plate and help them look for their pitch. Hitters can now study the tendencies of the pitchers they are going to face, familiarizing themselves before they come up to bat.[12]
- WHIP stands for walks plus hits per inning pitched. It tends to be viewed as a strong measure of a pitcher's success because it shows how many baserunners the pitcher allows on hits and walks. It is also a proven method for assessing a pitcher's efficiency. Pitchers can now study the upcoming lineup they are going to face and focus on the batters' tendencies, such as where they stand at the plate, what pitches they tend to chase, and what part of the field they like to hit to.[13][14]
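The four statistics listed above reduce to simple ratios. The following is a minimal Python sketch of those calculations. The function names and the sample season line are hypothetical, and the on-base percentage denominator includes sacrifice flies, which follows the official scoring convention but is an assumption relative to the description above.

```python
def batting_average(hits, at_bats):
    """Hits divided by at bats."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies=0):
    """Times on base (hit, walk, hit by pitch) per qualifying plate appearance.

    Assumption: sacrifice flies count in the denominator, per the official
    scoring convention; the article text does not spell out the denominator.
    """
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    return reached / chances


def slugging_average(singles, doubles, triples, home_runs, at_bats):
    """Total bases earned on hits divided by at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def whip(walks_allowed, hits_allowed, innings_pitched):
    """Walks plus hits allowed per inning pitched."""
    return (walks_allowed + hits_allowed) / innings_pitched


# Hypothetical season line for one batter and one pitcher (illustrative only).
avg = batting_average(hits=150, at_bats=500)
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sacrifice_flies=5)
slg = slugging_average(singles=100, doubles=30, triples=5,
                       home_runs=15, at_bats=500)
pitcher_whip = whip(walks_allowed=50, hits_allowed=180, innings_pitched=200)

print(f"AVG {avg:.3f}  OBP {obp:.3f}  SLG {slg:.3f}  WHIP {pitcher_whip:.2f}")
```

In this made-up line the batter's on-base percentage is well above the batting average because walks and hit-by-pitches count toward the former but not the latter.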
National Basketball Association (NBA)
Houston Rockets' Daryl Morey was the first NBA general manager to implement advanced metrics as a key aspect of player evaluation.[15] In the years that followed Morey's hiring, the NBA moved quickly to adopt advanced metrics-based player evaluation practices. In 2012, John Hollinger left ESPN to become VP of Basketball Operations for the Memphis Grizzlies.
Beyond professional basketball front offices, major sports media websites such as Basketball Reference are dedicated to the collection, synthesis, and dissemination of advanced metrics to pro and college basketball organizations, sports media members, and fans.
NCAA college basketball
North Carolina, under coach Frank McGuire, was the first known basketball organization to utilize advanced possession metrics to gain a competitive advantage. Since then, sports analytics enthusiasts in basketball have created weighted statistics that measure each player and each team's on-court efficiency. Most basketball-specific advanced metrics feature a per-minute measurement to ensure that a player's incremental team contributions are measured irrespective of usage volume.
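A per-minute measurement of the kind described above can be sketched as a simple rate. The example below is illustrative only: the function name, the 36-minute base (a common convention, not stated in the text) and the two players' totals are all assumptions.

```python
def per_minute_rate(stat_total, minutes_played, per_minutes=36):
    """Scale a counting statistic to a fixed number of minutes.

    Assumption: 36 minutes is used as the base; any fixed base works.
    """
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return stat_total / minutes_played * per_minutes


# Hypothetical season totals: the starter scores more points overall,
# but the reserve scores more per minute on the court.
starter_rate = per_minute_rate(stat_total=900, minutes_played=2400)
reserve_rate = per_minute_rate(stat_total=600, minutes_played=1200)

print(f"starter: {starter_rate:.1f} points per 36 minutes")
print(f"reserve: {reserve_rate:.1f} points per 36 minutes")
```

Normalizing by minutes lets a low-usage reserve be compared with a high-usage starter on the same scale, which is the purpose the paragraph above attributes to per-minute metrics.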
National Football League (NFL)
In 2003, the sports analytics-focused website Football Outsiders pioneered football's first comprehensive advanced metric, DVOA (defense-adjusted value over average),[16] which compares a player's success on each play to the league average based on a number of variables including down, distance, location on field, current score gap, quarter, and strength of opponent. Football Outsiders' work has since been widely cited by analytically minded members of the sports media establishment. A few years later, Pro Football Focus launched a comprehensive statistical database, which soon featured a sophisticated player grading system.[17] Advanced Football Analytics (originally Advanced NFL Stats) publishes EPA (expected points added) and WPA (win probability added) for NFL players.
Grantland lead football writer Bill Barnwell created the first metric focused on predicting the future performance of an individual player, the Speed Score, which he referenced in a piece written for Pro Football Prospectus. After analyzing data pertaining to running back success, Barnwell found that the most successful running backs at the NFL level were both fast and heavy. Speed Score therefore weights 40-yard dash times by assigning a premium to bigger, often stronger, running backs.[18]
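The idea of weighting a 40-yard dash time by body weight can be illustrated with a short sketch. The formula used here, weight times 200 divided by the fourth power of the 40-yard time, is the commonly cited form of Speed Score; it is not given in the text above and should be treated as an assumption, as should the function name and the sample prospects.

```python
def speed_score(weight_lb, forty_time_s):
    """Weight-adjusted 40-yard dash score.

    Assumption: commonly cited form, (weight * 200) / (40-yard time ** 4).
    """
    if forty_time_s <= 0:
        raise ValueError("forty_time_s must be positive")
    return (weight_lb * 200) / forty_time_s ** 4


# Two hypothetical running backs with the same 40-yard time:
# the heavier one receives the higher score.
heavy_back = speed_score(weight_lb=220, forty_time_s=4.50)
light_back = speed_score(weight_lb=200, forty_time_s=4.50)

print(f"220 lb at 4.50 s: {heavy_back:.1f}")
print(f"200 lb at 4.50 s: {light_back:.1f}")
```

Under this form, a faster time raises the score sharply because of the fourth power, while extra weight raises it linearly.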
One of the driving forces for the use of sports analytics in the NFL has been the growth of fantasy football. Fantasy sports writer C. D. Carter and peers at XN Sports, NumberFire, and the long-form fantasy football analysis site Rotoviz.com have established an informal subculture of fantasy football sports writers who refer to themselves as "degens". The degen movement is responsible for the creation of numerous American football efficiency metrics that better explain past football performances and attempt to predict future player production. Height-adjusted Speed Score,[19] College Dominator Rating,[20] Target Premium,[21] Catch Radius,[22] Net Expected Points (NEP),[23] and Production Premium[24] were created and disseminated by degen writers and mathematicians. Building on the work of these writers, sites such as PlayerProfiler.com distill a wide variety of established advanced metrics into a single player snapshot designed to be palatable to the casual sports fan.[24]
National Hockey League (NHL)
The NHL has kept statistics since its inception, yet it is a relatively new adopter of analytics-based decision making. The Toronto Maple Leafs were the first team in the NHL to hire a member of management with a largely analytical background when they hired assistant general manager Kyle Dubas in 2014. Dubas, similar to Theo Epstein in the MLB, has never suited up in a professional game and relies on the numbers generated by players on a nightly basis both now and in the past to make decisions.[25]
- The Corsi statistic is an advanced statistic that has been widely adopted throughout the NHL, as teams, fans and media alike rely on the Corsi statistic to track shot attempt differential.[26] Corsi has been recognized as the most informative single statistic in the game of hockey as it can provide insight into both the offensive and defensive play of a team as well as the amount of time a team has possession of the puck.[27]
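Shot attempt differential can be computed directly from shot counts. The sketch below assumes the usual definition of a shot attempt (shots on goal plus missed shots plus shots blocked by the opponent), which the text above does not spell out; the function names and game totals are hypothetical.

```python
def shot_attempts(shots_on_goal, missed_shots, blocked_shots):
    """All shot attempts: on goal, missed the net, or blocked by the opponent."""
    return shots_on_goal + missed_shots + blocked_shots


def corsi(attempts_for, attempts_against):
    """Return the shot attempt differential and the share of attempts taken."""
    differential = attempts_for - attempts_against
    share = attempts_for / (attempts_for + attempts_against)
    return differential, share


# Hypothetical single-game totals for a team and its opponent.
attempts_for = shot_attempts(shots_on_goal=30, missed_shots=12, blocked_shots=14)
attempts_against = shot_attempts(shots_on_goal=25, missed_shots=10, blocked_shots=9)

differential, share = corsi(attempts_for, attempts_against)
print(f"Corsi differential: {differential:+d}, share of attempts: {share:.1%}")
```

A share above 50% means the team directed more attempts at the opposing net than it allowed, which is why the statistic is often read as a proxy for puck possession.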
Professional Golfers' Association (PGA) Tour
The PGA Tour collects vast amounts of data throughout the season. These statistics track each shot a player takes in tournament play, collecting information on how far the ball travels and exactly where each shot is played from and where it finishes. These data have been used for a number of years by players and their coaches during practice sessions as well as during tournament preparation, highlighting the areas in which that player needs to improve before teeing it up in tournament play.
- ShotLink has revolutionized the way that data is collected in the game of golf. Introduced on a full-time basis in 2003, ShotLink relies on a number of strategically placed on-course laser rangefinders and cameras to collect precise data from every shot struck on the PGA Tour.[28] With these data, players can see which areas of their game need improving and, on a year-to-year basis, can review course statistics from previous years to prepare for a tournament. In addition to the year-to-year statistics, players and fans can access these statistics on an up-to-the-minute basis, giving these data an extremely high velocity. ShotLink has also made its mark on golf course design: designers have constant access to up-to-the-minute statistics on professional golfers, allowing them to create courses that challenge the world's best players.[28]
Soccer
Soccer uses tracking data, such as the positional data of the players and ball, to give teams information about players' conditioning.[29] These data have also been used to evaluate attacking performance and estimate goals scored using artificial intelligence.[30] Other approaches have evaluated actions such as dribbling and passing.[31] Research is also under way at Nagoya University into the potential of defender-oriented metrics based on ball recovery and being attacked, which have been applied successfully to data from the Japanese J1 League to predict the strategies used by teams.[32]
History
Many statisticians attribute the popularization of sports analytics to Billy Beane, then general manager of the Oakland Athletics. Working with a minimal budget, Beane relied on sabermetrics, a form of sports analytics, to evaluate players and make personnel decisions.
Understanding the importance of getting runners on base, Beane focused on acquiring players with a high on-base percentage, on the logic that teams with a higher on-base percentage are more likely to score runs. He was also able to achieve success on a shoestring budget by acquiring overlooked starting pitchers, often for a fraction of the price that a big-name pitcher would command. When Beane's Athletics began to achieve success, other major league teams took notice. The second team to adopt a similar approach was the Boston Red Sox, who in 2003 made Theo Epstein the interim general manager. Epstein, then the youngest general manager ever hired in MLB, came into the position with no professional playing experience, which was highly irregular at the time. Using an approach similar to Beane's, Epstein built a Boston Red Sox team that in 2004 won the organization's first World Series in 86 years, breaking the alleged Curse of the Bambino. Many experts attribute some of Epstein's success to Boston Red Sox owner John W. Henry, who had achieved significant success in the investment industry through data-based decision making. As owner, Henry gave Epstein significant leeway in data-based decision making and the use of sabermetrics, knowing the impact that such tools can have in both sports and business. After his success in Boston, Epstein moved on to Chicago, where in 2016 he led the Chicago Cubs to their first World Series title in 108 years.
More recently, teams like the Houston Rockets of the NBA have put a heavy focus on analytics to guide front office and on-court decisions. Daryl Morey, the general manager of the Rockets, decided to emphasize three-point shots and used analytics to support his argument.[33] As a result, the Rockets began shooting many more three-point shots and even traded their budding big man, Clint Capela.[34]
The success of analytics-based strategies and decision making in baseball was noted by executives in other professional sports leagues. Today, it is hard to find a professional organization that does not have at least one analytics expert on staff, and many have an entire department dedicated to analytics.[35] Some of the teams that have achieved great success with a largely analytical approach are:
Notable applications
Houston Astros (MLB)
The Astros rely heavily on analytics when making decisions. The team has employees with titles like director of decision sciences, medical risk manager and mathematical modeler.[36] Unlike other professional teams, which typically use analytics solely for player transactions and signings, the Astros have begun to use analytics to decide how they will play on the field, "applying the defensive shift more than any other team in the MLB last season."[36] Using this approach, the Houston Astros captured the first World Series victory in franchise history in 2017.[37]
San Antonio Spurs (NBA)
One of the early adopters of SportVU, the San Antonio Spurs have used analytics to gain a competitive advantage over opponents for a number of years. As a team, the Spurs have homed in on the importance of the three-pointer and as a result consistently rank among the league leaders in three-point attempts. The team's understanding of the importance of the "three" extends beyond the offensive side of the court, as they are relentless in defending the three-pointer at the defensive end.[36]
Chicago Blackhawks (NHL)
In 2009, the Chicago Blackhawks turned to an outside company to produce analytical assessments for them.[36] Subsequently, the Blackhawks achieved unparalleled success in the NHL, winning three Stanley Cups in six seasons. With this success came a number of difficult decisions for Blackhawks management, as they were often able to hold on to only a core group of players following each Cup run, while other key players received offers that the Blackhawks simply could not match under the NHL's salary cap. However, by using this analytics-based system, the team has continually been able to fill these gaps by finding players who are undervalued by other teams but fit well with the Blackhawks' style of play. A team put together this way will often seem underwhelming but exceed expectations. This strategy could be adopted by teams with limited financial freedom to put together a competitive team.[38] The Blackhawks have refined this process and provide yet another example of the longevity that can be associated with analytics-based decision making.[39]
Gambling
Sports analytics have had a significant impact on the field of play, but they have also contributed to the growing industry of sports gambling, which accounts for approximately 13% of the global gambling industry.[40] Valued at somewhere between $700 billion and $1,000 billion, sports gambling is extremely popular among groups of all kinds, from avid sports fans to recreational gamblers, and it is hard to find a professional sporting event with nothing riding on the results. Many gamblers are attracted to sports gambling because of the plethora of information and analytics at their disposal when making decisions. One gambler, Bob Stoll, has been ahead of the analytics curve for a number of years, successfully betting against the line 56% of the time (575–453) in college football, a significant rate given that a winning percentage above 52.4% is considered profitable. With so many statistics openly available to fans, Stoll combines a number of them, such as home and away records, records against divisional and non-divisional teams, and yards per rush, to make educated picks that have paid off more than half of the time.[41]
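The 52.4% profitability threshold can be reproduced with a short calculation. The sketch below assumes standard point-spread pricing in which a bettor risks 110 units to win 100; the text does not state the odds, so that pricing, the flat staking plan and the function names are assumptions. Pushes (ties against the line) are ignored.

```python
def break_even_win_rate(risk, win):
    """Win rate at which flat bets neither make nor lose money."""
    return risk / (risk + win)


def record_win_rate(wins, losses):
    """Share of decided bets that were won."""
    return wins / (wins + losses)


def flat_bet_profit(wins, losses, risk, win):
    """Net units won when every bet risks `risk` units to win `win` units."""
    return wins * win - losses * risk


# Assumed pricing: risk 110 units to win 100 on each bet.
threshold = break_even_win_rate(risk=110, win=100)

# Record quoted in the article: 575 wins and 453 losses.
rate = record_win_rate(wins=575, losses=453)
profit = flat_bet_profit(wins=575, losses=453, risk=110, win=100)

print(f"break-even win rate: {threshold:.1%}")
print(f"record win rate:     {rate:.1%}")
print(f"net units at flat stakes: {profit}")
```

Under these assumptions the quoted record sits a few percentage points above break-even, which is enough to produce a positive net result over roughly a thousand bets.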
Results from academic research show evidence that Twitter contains enough information to be useful for predicting outcomes in football games.[42]
With the popularity of sports gambling came the development of a number of sports betting services. "Sports betting services are provided by companies such as William Hill, Ladbrokes, bet365, bwin, Paddy Power, betfair, Unibet and many more through their websites and in many cases betting shops. In 2012, William Hill generated around 2 billion U.S. dollars in revenue with about 30 billion U.S. dollars in total being staked / wagered with the company."[40]
References
1. Ray, Sugato (June 22, 2017). "The Evolution and Future of Analytics in Sport". Proem Sports. Retrieved August 5, 2018.
2. Soto Valero, C. (1 December 2016). "Predicting Win–loss outcomes in MLB regular season games – A comparative study using data mining methods". International Journal of Computer Science in Sport. 15 (2): 91–112. doi:10.1515/ijcss-2016-0007.
3. "How Data Analytics Helps Coaches in Planning". WorkInSports. August 21, 2017. Retrieved August 5, 2018.
4. Albert, James; Jay M. Bennett (2001). Curve Ball: Baseball, Statistics, and the Role of Chance in the Game. Springer. pp. 170–171. ISBN 0-387-98816-5.
5. "About SABR".
6. Okrent, Daniel. "He Does It By The Numbers..." Sports Illustrated.
7. Porter, Martin (1984-05-29). "The PC Goes to Bat". PC Magazine. p. 209. Retrieved 24 October 2013.
8. Fraser, James (2000). "Baseball Prospectus – Escaping Bill James' Shadow" (PDF). SABR Statistical Analysis Committee. pp. 4–5.
9. Schwarz, Alan (2004). The Numbers Game. New York: St. Martin's Press.
10. Goldstein, Phil (2017-07-10). "Baseball Is Bringing Sports Analytics to the Forefront". BizTech. Retrieved 2018-04-20.
11. Steinberg, Leigh. "CHANGING THE GAME: The Rise of Sports Analytics". Forbes. Retrieved 2018-04-22.
12. Greenberg, Neil (2017-06-01). "Analysis | The statistical revelation that has MLB hitters bombing more home runs than the steroid era". Washington Post. ISSN 0190-8286. Retrieved 2018-04-22.
13. "Better Than WHIP?". Beyond the Box Score. Retrieved 2018-04-22.
14. "WHIP | FanGraphs Sabermetrics Library". www.fangraphs.com. Retrieved 2018-04-22.
15. Friedman, Jason (2007). "Rocket Science". Houston Press.
16. "General Football Terms Glossary". Football Outsiders.
17. "History of ProFootballFocus". Archived from the original on 2014-07-01. Retrieved 2014-08-05.
18. Barnwell, Bill (2008). "Pro Football Prospectus". Football Outsiders.
19. Siegele, Shawn (2012). "Dominator Rating, Height-adjusted Speed Score, and WR Draft Rankings". Money In The Banana Stand.
20. DuPont, Frank (2012). "Game Plan". Archived from the original on 2014-07-18. Retrieved 2014-08-06.
21. Hribar, Rich (2013). "Fantasy Football 2013 WR Review". XN Sports.
22. Smith, Scott (2014). "The Catch Radius Project: In Search of Better TD Production". RotoViz.
23. "Glossary". NumberFire.
24. "Terms Glossary". PlayerProfiler. 2014.
25. "Appreciating The Importance of Sports Analytics in Hockey". www.workinsports.com. Retrieved 2018-04-23.
26. Wilson, Kent. "Wilson: Don't know Corsi? Here's a handy-dandy primer to NHL advanced stats". www.calgaryherald.com. Retrieved 2016-10-23.
27. "The Future of Hockey Analytics". The Hockey Writers. 2017-09-25. Retrieved 2018-04-23.
28. Burke, Monte. "ShotLink Is Making Golf Easier For Hacks And Harder For Pros". Forbes. Retrieved 2016-10-24.
29. Andrzejewski M, Chmura J, Pluta B, Konarski JM. Sprinting activities and distance covered by top level Europa league soccer players. International Journal of Sports Science & Coaching. 2015;10(1):39–50.
30. Van Roy M, Robberechts P, Decroos T, Davis J. Valuing on-the-ball actions in soccer: a critical comparison of XT and VAEP. In: Proceedings of the AAAI-20 Workshop on Artificial Intelligence in Team Sports. AI in Team Sports Organising Committee; 2020.
31. Decroos T, Bransen L, Van Haaren J, Davis J. Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2019. p. 1851–1861.
32. Kosuke Toda, Masakiyo Teranishi, Keisuke Kushiro, Keisuke Fujii. Evaluation of soccer team defense based on prediction models of ball recovery and being attacked: A pilot study. PLOS ONE. Published January 27, 2022. https://doi.org/10.1371/journal.pone.0263051
33. "Moreyball: The Houston Rockets and Analytics". Digital Innovation and Transformation. Retrieved 2020-09-25.
34. "Rockets trade Clint Capela for Robert Covington, signaling all-in shift to small ball barring other moves". CBSSports.com. Retrieved 2020-09-25.
35. "Sports Analytics Have Changed the Game For Good". www.workinsports.com. Retrieved 2018-04-23.
36. "The Great Analytics Ranking".
37. "Astros' World Series win may be remembered as the moment analytics conquered MLB for good". The Washington Post.
38. Plummer, Michael. "Council Post: 'Moneyball': Using Sports Analytics Theories To Identify Inefficiencies In Your Business". Forbes. Retrieved 2020-09-28.
39. "Bowman: Analytics give Hawks an advantage". ESPN.com. Retrieved 2018-04-23.
40. "Sports Betting - Statistics & Facts". Statista. Retrieved March 1, 2018.
41. "How Dr. Bob Uses Football Analytics for Profitable Gambling".
42. Schumaker, Robert P. "Predicting wins and spread in the Premier League using a sentiment analysis of twitter" (PDF).