Introduction
Ball-by-Ball data in cricket, records the information for every ball bowled in the game, like the batsman on strike, the bowler who is bowling, runs scored on that ball, if a wicket is taken on that ball etc. By using this data, we can compute metrics like the Strike Rate of a Batsman, Runs per over, Economy of a bowler and all the other standard metrics used in cricket.
To know more about Ball-by-Ball data, the sources for this type of data and the standard metrics which can be calculated using this data, visit this article.
In this article, we will be learning how to handle Ball-by-Ball data with Python using the basic libraries Python provides.
The dataset used for this article can be directly downloaded from here.
Tutorial
Importing the Libraries and Packages
Let’s start this tutorial by importing all the libraries and packages required.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
- The numpy library is used for dealing with arrays and making numerical operations.
- The pandas library is used for handling and analyzing data.
- The matplotlib library is a data visualization library used to draw necessary graphs and plots.
- The MinMaxScaler package is a package from SciKit-Learn library used for normalizing the data.
Loading the data
The datasets provided are in a CSV (Comma Separated Values) format. We load the datasets using the read_csv
function available in pandas. This function reads the CSV and loads it into a dataframe.
dfDeliveries = pd.read_csv('D:\Analytics\Cricket\Data\IPL_Ball_by_Ball_2022.csv')
dfMatches = pd.read_csv('D:\Analytics\Cricket\Data\IPL_Matches_2022.csv')
After loading the dataframes into variables, we can explore the dataframes by using the head()
function which returns the first 5 rows of the dataframe.
pd.set_option('max_columns',100)
dfDeliveries.head()
ID | innings | overs | ballnumber | batter | bowler | non-striker | extra_type | batsman_run | extras_run | total_run | non_boundary | isWicketDelivery | player_out | kind | fielders_involved | BattingTeam | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1312200 | 1 | 0 | 1 | YBK Jaiswal | Mohammed Shami | JC Buttler | NaN | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals |
1 | 1312200 | 1 | 0 | 2 | YBK Jaiswal | Mohammed Shami | JC Buttler | legbyes | 0 | 1 | 1 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals |
2 | 1312200 | 1 | 0 | 3 | JC Buttler | Mohammed Shami | YBK Jaiswal | NaN | 1 | 0 | 1 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals |
3 | 1312200 | 1 | 0 | 4 | YBK Jaiswal | Mohammed Shami | JC Buttler | NaN | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals |
4 | 1312200 | 1 | 0 | 5 | YBK Jaiswal | Mohammed Shami | JC Buttler | NaN | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals |
dfMatches.head()
ID | City | Date | Season | MatchNumber | Team1 | Team2 | Venue | TossWinner | TossDecision | SuperOver | WinningTeam | WonBy | Margin | method | Player_of_Match | Team1Players | Team2Players | Umpire1 | Umpire2 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1312200 | Ahmedabad | 2022-05-29 | 2022 | Final | Rajasthan Royals | Gujarat Titans | Narendra Modi Stadium, Ahmedabad | Rajasthan Royals | bat | N | Gujarat Titans | Wickets | 7 | NaN | HH Pandya | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | [‘WP Saha’, ‘Shubman Gill’, ‘MS Wade’, ‘HH Pan… | CB Gaffaney | Nitin Menon |
1 | 1312199 | Ahmedabad | 2022-05-27 | 2022 | Qualifier 2 | Royal Challengers Bangalore | Rajasthan Royals | Narendra Modi Stadium, Ahmedabad | Rajasthan Royals | field | N | Rajasthan Royals | Wickets | 7 | NaN | JC Buttler | [‘V Kohli’, ‘F du Plessis’, ‘RM Patidar’, ‘GJ … | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | CB Gaffaney | Nitin Menon |
2 | 1312198 | Kolkata | 2022-05-25 | 2022 | Eliminator | Royal Challengers Bangalore | Lucknow Super Giants | Eden Gardens, Kolkata | Lucknow Super Giants | field | N | Royal Challengers Bangalore | Runs | 14 | NaN | RM Patidar | [‘V Kohli’, ‘F du Plessis’, ‘RM Patidar’, ‘GJ … | [‘Q de Kock’, ‘KL Rahul’, ‘M Vohra’, ‘DJ Hooda… | J Madanagopal | MA Gough |
3 | 1312197 | Kolkata | 2022-05-24 | 2022 | Qualifier 1 | Rajasthan Royals | Gujarat Titans | Eden Gardens, Kolkata | Gujarat Titans | field | N | Gujarat Titans | Wickets | 7 | NaN | DA Miller | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | [‘WP Saha’, ‘Shubman Gill’, ‘MS Wade’, ‘HH Pan… | BNJ Oxenford | VK Sharma |
4 | 1304116 | Mumbai | 2022-05-22 | 2022 | 70 | Sunrisers Hyderabad | Punjab Kings | Wankhede Stadium, Mumbai | Sunrisers Hyderabad | bat | N | Punjab Kings | Wickets | 5 | NaN | Harpreet Brar | [‘PK Garg’, ‘Abhishek Sharma’, ‘RA Tripathi’, … | [‘JM Bairstow’, ‘S Dhawan’, ‘M Shahrukh Khan’,… | AK Chaudhary | NA Patwardhan |
Data PreProcessing
As we can see from the above Deliveries DataFrame, we do not have the Name of the Bowling team present in it.
To get the Name of the Bowling team for each delivery, we’ll be using the Matches DataFrame by merging/joining it with the Deliveries DataFrame and using the where function provided by numpy to write a conditional clause.
dfDeliveries = dfDeliveries.merge(dfMatches, on='ID')
dfDeliveries['BowlingTeam'] = np.where(dfDeliveries['BattingTeam']==dfDeliveries['Team1'],
dfDeliveries['Team2'],
dfDeliveries['Team1'])
dfDeliveries.head()
ID | innings | overs | ballnumber | batter | bowler | non-striker | extra_type | batsman_run | extras_run | total_run | non_boundary | isWicketDelivery | player_out | kind | fielders_involved | BattingTeam | City | Date | Season | MatchNumber | Team1 | Team2 | Venue | TossWinner | TossDecision | SuperOver | WinningTeam | WonBy | Margin | method | Player_of_Match | Team1Players | Team2Players | Umpire1 | Umpire2 | BowlingTeam | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1312200 | 1 | 0 | 1 | YBK Jaiswal | Mohammed Shami | JC Buttler | NaN | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals | Ahmedabad | 2022-05-29 | 2022 | Final | Rajasthan Royals | Gujarat Titans | Narendra Modi Stadium, Ahmedabad | Rajasthan Royals | bat | N | Gujarat Titans | Wickets | 7 | NaN | HH Pandya | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | [‘WP Saha’, ‘Shubman Gill’, ‘MS Wade’, ‘HH Pan… | CB Gaffaney | Nitin Menon | Gujarat Titans |
1 | 1312200 | 1 | 0 | 2 | YBK Jaiswal | Mohammed Shami | JC Buttler | legbyes | 0 | 1 | 1 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals | Ahmedabad | 2022-05-29 | 2022 | Final | Rajasthan Royals | Gujarat Titans | Narendra Modi Stadium, Ahmedabad | Rajasthan Royals | bat | N | Gujarat Titans | Wickets | 7 | NaN | HH Pandya | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | [‘WP Saha’, ‘Shubman Gill’, ‘MS Wade’, ‘HH Pan… | CB Gaffaney | Nitin Menon | Gujarat Titans |
2 | 1312200 | 1 | 0 | 3 | JC Buttler | Mohammed Shami | YBK Jaiswal | NaN | 1 | 0 | 1 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals | Ahmedabad | 2022-05-29 | 2022 | Final | Rajasthan Royals | Gujarat Titans | Narendra Modi Stadium, Ahmedabad | Rajasthan Royals | bat | N | Gujarat Titans | Wickets | 7 | NaN | HH Pandya | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | [‘WP Saha’, ‘Shubman Gill’, ‘MS Wade’, ‘HH Pan… | CB Gaffaney | Nitin Menon | Gujarat Titans |
3 | 1312200 | 1 | 0 | 4 | YBK Jaiswal | Mohammed Shami | JC Buttler | NaN | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals | Ahmedabad | 2022-05-29 | 2022 | Final | Rajasthan Royals | Gujarat Titans | Narendra Modi Stadium, Ahmedabad | Rajasthan Royals | bat | N | Gujarat Titans | Wickets | 7 | NaN | HH Pandya | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | [‘WP Saha’, ‘Shubman Gill’, ‘MS Wade’, ‘HH Pan… | CB Gaffaney | Nitin Menon | Gujarat Titans |
4 | 1312200 | 1 | 0 | 5 | YBK Jaiswal | Mohammed Shami | JC Buttler | NaN | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | Rajasthan Royals | Ahmedabad | 2022-05-29 | 2022 | Final | Rajasthan Royals | Gujarat Titans | Narendra Modi Stadium, Ahmedabad | Rajasthan Royals | bat | N | Gujarat Titans | Wickets | 7 | NaN | HH Pandya | [‘YBK Jaiswal’, ‘JC Buttler’, ‘SV Samson’, ‘D … | [‘WP Saha’, ‘Shubman Gill’, ‘MS Wade’, ‘HH Pan… | CB Gaffaney | Nitin Menon | Gujarat Titans |
PlayerWise Aggregation – Batting Metrics
In this section, we will be calculating the standard batting metrics for each and every batsmen in IPL 2022. The standard metrics we will be calculating are,
- Number of Innings Played as a Batsman (Innings)
- Number of Runs Scored (Runs)
- Number of Outs/Dismissals (Dismissals)
- Number of Balls Faced (Balls Faced)
- Number of 4s scored (4s)
- Number of 6s scored (6s)
- Highest Score (High Score)
- Number of Half Centuries scored (50)
- Number of Centuries scored (100)
- Batting Strike Rate (Strike Rate)
- Batting Average (Batting Average)
For calculating these metrics, we will be using basic conditional dataframe slicing and the groupby
, agg
, merge
, reset_index
functions provided by pandas along with some basic math functions like sum, max, count and nunique.
Innings, Runs and Dismissals
To calculate the number of innings played, runs scored and dismissals for each batsmen, we will use groupby function to group over the deliveries of each batsmen and use the agg function to get the number of innings by getting the unique Match IDs with nunique, runs scored by summing all the batsmen runs and dismissals by summing all the wicket deliveries.
We can convert this aggregated dataframe into a proper dataframe by resetting the index using the reset_index function.
dfBatsmen = dfDeliveries.groupby('batter')\
.agg({'ID':'nunique','batsman_run':'sum','isWicketDelivery':'sum'})\
.reset_index()
dfBatsmen.head()
batter | ID | batsman_run | isWicketDelivery | |
---|---|---|---|---|
0 | A Badoni | 11 | 161 | 9 |
1 | A Manohar | 7 | 108 | 6 |
2 | A Nortje | 1 | 1 | 0 |
3 | A Tomar | 1 | 4 | 1 |
4 | AD Russell | 12 | 335 | 9 |
Now, we rename the columns to their correct descriptions and display it with the head function to see our resultant dataframe.
dfBatsmen.columns = ['Batter','Innings','Runs','Dismissals']
dfBatsmen.head()
Batter | Innings | Runs | Dismissals | |
---|---|---|---|---|
0 | A Badoni | 11 | 161 | 9 |
1 | A Manohar | 7 | 108 | 6 |
2 | A Nortje | 1 | 1 | 0 |
3 | A Tomar | 1 | 4 | 1 |
4 | AD Russell | 12 | 335 | 9 |
We use the same aggregation methods with a combination of conditional slicing by excluding all the wide balls for calculating the number of balls faced, by including all the runs which are 4s and boundaries for calculating the number of 4s scored and similarly for 6s.
Balls Faced
dfBallsFaced = dfDeliveries[(dfDeliveries['extra_type']!='wides')].groupby('batter')['ID'].count().reset_index()
dfBallsFaced.columns = ['Batter','Balls Faced']
dfBallsFaced.head()
Batter | Balls Faced | |
---|---|---|
0 | A Badoni | 130 |
1 | A Manohar | 75 |
2 | A Nortje | 6 |
3 | A Tomar | 8 |
4 | AD Russell | 192 |
Boundaries – 4s and 6s
dfBatter4s = dfDeliveries[(dfDeliveries['batsman_run']==4)&(dfDeliveries['non_boundary']==0)]\
.groupby('batter')['ID']\
.count()\
.reset_index()\
.sort_values('ID')
dfBatter4s.columns = ['Batter','4s']
dfBatter4s.head()
Batter | 4s | |
---|---|---|
50 | KK Nair | 1 |
25 | DJ Mitchell | 1 |
60 | M Ashwin | 1 |
28 | DR Sams | 1 |
62 | M Vohra | 1 |
dfBatter6s = dfDeliveries[(dfDeliveries['batsman_run']==6)&(dfDeliveries['non_boundary']==0)]\
.groupby('batter')['ID']\
.count()\
.reset_index()\
.sort_values('ID')
dfBatter6s.columns = ['Batter','6s']
dfBatter6s.head()
Batter | 6s | |
---|---|---|
55 | Mohsin Khan | 1 |
90 | SE Rutherford | 1 |
17 | DJ Bravo | 1 |
82 | RV Patel | 1 |
94 | SP Jackson | 1 |
To calculate the High Scores, Number of Half Centuries and Centuries, we need to group over the batsmen and the Match ID and perform a sum function over all the batsmen runs to get the runs scored for each batsmen in each match.
dfMatchWise = dfDeliveries.groupby(['batter','ID'])['batsman_run'].sum().reset_index()
dfMatchWise.head()
batter | ID | batsman_run | |
---|---|---|---|
0 | A Badoni | 1304050 | 54 |
1 | A Badoni | 1304053 | 19 |
2 | A Badoni | 1304058 | 19 |
3 | A Badoni | 1304061 | 10 |
4 | A Badoni | 1304066 | 5 |
We now use this match-wise aggregated dataframe and again group over the batsmen and perform the max function to get the highest score scored by every batsmen.
High Score
dfHighScore = dfMatchWise.groupby(['batter'])['batsman_run'].max().reset_index()
dfHighScore.columns = ['Batter','High Score']
dfHighScore.head()
Batter | High Score | |
---|---|---|
0 | A Badoni | 54 |
1 | A Manohar | 43 |
2 | A Nortje | 1 |
3 | A Tomar | 4 |
4 | AD Russell | 70 |
Similarly, by using the match-wise aggregated dataframe along with conditional slicing of the dataframe by getting total runs scored in each match greater than or equal to 50 and less than 100 and counting it we calculate the number of half centuries scored and, getting total runs scored in each match greater than or equal to 100 and counting it we calculate the number of centuries scored.
50s and 100s
df50s = dfMatchWise[(dfMatchWise['batsman_run']>=50)&(dfMatchWise['batsman_run']<100)]\
.groupby('batter')['batsman_run']\
.count()\
.reset_index()\
.sort_values('batsman_run',ascending=False)
df50s.columns = ['Batter','50']
df50s.head()
Batter | 50 | |
---|---|---|
10 | DA Warner | 5 |
46 | Shubman Gill | 4 |
18 | JC Buttler | 4 |
16 | HH Pandya | 4 |
23 | LS Livingstone | 4 |
df100s = dfMatchWise[(dfMatchWise['batsman_run']>=100)]\
.groupby('batter')['batsman_run']\
.count()\
.reset_index()\
.sort_values('batsman_run',ascending=False)
df100s.columns = ['Batter','100']
df100s.head()
Batter | 100 | |
---|---|---|
0 | JC Buttler | 4 |
1 | KL Rahul | 2 |
2 | Q de Kock | 1 |
3 | RM Patidar | 1 |
Finally, we join all the dataframes we have created so far for each statistic using the merge function on the Batter column.
(Note: We need to join these dataframes in outer join by specifying the ‘how’ argument in the merge function as ‘outer’.)
Merging all the DataFrames
dfBatsmen = dfBatsmen.merge(dfBallsFaced, on='Batter',how='outer').merge(dfBatter4s, on='Batter',how='outer')\
.merge(dfBatter6s, on='Batter',how='outer').merge(dfHighScore, on='Batter',how='outer')\
.merge(df50s, on='Batter',how='outer').merge(df100s, on='Batter',how='outer')
dfBatsmen.head()
Batter | Innings | Runs | Dismissals | Balls Faced | 4s | 6s | High Score | 50 | 100 | |
---|---|---|---|---|---|---|---|---|---|---|
0 | A Badoni | 11 | 161 | 9 | 130 | 11.0 | 7.0 | 54 | 1.0 | NaN |
1 | A Manohar | 7 | 108 | 6 | 75 | 14.0 | 3.0 | 43 | NaN | NaN |
2 | A Nortje | 1 | 1 | 0 | 6 | NaN | NaN | 1 | NaN | NaN |
3 | A Tomar | 1 | 4 | 1 | 8 | 1.0 | NaN | 4 | NaN | NaN |
4 | AD Russell | 12 | 335 | 9 | 192 | 18.0 | 32.0 | 70 | 1.0 | NaN |
Using this newly joined dataframe we can derive metrics like Batting Strike Rate, by dividing the runs scored with the balls faced as a multiple of 100 and Batting Average, by dividing the runs scored with the number of dismissals.
dfBatsmen['Strike Rate'] = (dfBatsmen['Runs']/dfBatsmen['Balls Faced'])*100
dfBatsmen['Batting Average'] = dfBatsmen['Runs']/dfBatsmen['Dismissals']
Let’s fill the NaN values with 0 using the fillna function and sort the rows in the dataframe by using sort_values function with runs scored as the key to be sorted on and display the head of this dataframe.
dfBatsmen.fillna(0, inplace=True)
dfBatsmen.sort_values(by='Runs', ascending=False).head()
Batter | Innings | Runs | Dismissals | Balls Faced | 4s | 6s | High Score | 50 | 100 | Strike Rate | Batting Average | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
49 | JC Buttler | 17 | 863 | 15 | 579 | 83.0 | 45.0 | 116 | 4.0 | 4.0 | 149.050086 | 57.533333 |
65 | KL Rahul | 14 | 616 | 11 | 455 | 45.0 | 30.0 | 103 | 4.0 | 2.0 | 135.384615 | 56.000000 |
112 | Q de Kock | 15 | 508 | 15 | 341 | 47.0 | 23.0 | 140 | 3.0 | 1.0 | 148.973607 | 33.866667 |
42 | HH Pandya | 15 | 487 | 10 | 371 | 49.0 | 12.0 | 87 | 4.0 | 0.0 | 131.266846 | 48.700000 |
154 | Shubman Gill | 16 | 483 | 15 | 365 | 51.0 | 11.0 | 96 | 4.0 | 0.0 | 132.328767 | 32.200000 |
These aggregated standard metrics can be directly found on IPL’s website.
Why go through all this trouble to calculate something that’s readily available on the internet?
This is because the aggregation methods used to calculate all the standard metrics are used as the foundation to make much more in-depth analysis.
TeamWise Aggregation – Bowling Metrics in Death Overs
A T20 game can be divided into three phases. They are,
- Power Play (1-6 overs)
- Middle Overs (7-15 overs)
- Death Overs (16-20 overs)
Let’s create a small function that returns the phase of the game based on the over.
If we look at the unique values in the Overs column of the dataframe, the overs start from 0th over and end at the end of the 19th over.
This is why we will shift all the starting and ending overs of each phase by -1.
def overClassifier(val):
if val>=0 and val<6:
return 'Power Play'
elif val>=6 and val<15:
return 'Middle Overs'
else:
return 'Death Overs'
Let’s map this function to the overs column and get the phase of the play of the game at every delivery.
dfDeliveries['PhaseOfPlay'] = dfDeliveries['overs'].map(overClassifier)
dfDeliveries[['overs','PhaseOfPlay']].sort_values('overs').drop_duplicates().reset_index(drop=True)
overs | PhaseOfPlay | |
---|---|---|
0 | 0 | Power Play |
1 | 1 | Power Play |
2 | 2 | Power Play |
3 | 3 | Power Play |
4 | 4 | Power Play |
5 | 5 | Power Play |
6 | 6 | Middle Overs |
7 | 7 | Middle Overs |
8 | 8 | Middle Overs |
9 | 9 | Middle Overs |
10 | 10 | Middle Overs |
11 | 11 | Middle Overs |
12 | 12 | Middle Overs |
13 | 13 | Middle Overs |
14 | 14 | Middle Overs |
15 | 15 | Death Overs |
16 | 16 | Death Overs |
17 | 17 | Death Overs |
18 | 18 | Death Overs |
19 | 19 | Death Overs |
Using the conditional slicing, let’s take all the deliveries which occurred in the Death Overs of the game.
Using the same aggregation methods we performed above, we will be calculating the Runs Conceded, Wickets Taken, Balls Bowled by grouping over each Bowling Team. We merge all the dataframes of bowling metrics we have created and display the head of this dataframe.
dfDeathOvers = dfDeliveries[dfDeliveries['PhaseOfPlay']=='Death Overs']
Innings, Runs and Dismissals
dfDeathBowling = dfDeathOvers.groupby('BowlingTeam').agg({'total_run':'sum','isWicketDelivery':'sum'}).reset_index()
dfDeathBowling.columns = ['Team','Runs Conceded','Wickets']
Balls Bowled
dfDOBallsBowled = dfDeliveries[~((dfDeliveries['extra_type']=='wides'))&(dfDeliveries['PhaseOfPlay']=='Death Overs')]\
.groupby('BowlingTeam')['ID']\
.count().reset_index()
dfDOBallsBowled.columns = ['Team','Balls Bowled']
dfDeathBowling = dfDeathBowling.merge(dfDOBallsBowled, on='Team')
Overs, Bowling Average, Bowling Strike Rate and Economy Rate
dfDeathBowling['Overs'] = dfDeathBowling['Balls Bowled']/6
dfDeathBowling['Bowling Average'] = dfDeathBowling['Runs Conceded']/dfDeathBowling['Wickets']
dfDeathBowling['Bowling Strike Rate'] = dfDeathBowling['Balls Bowled']/dfDeathBowling['Wickets']
dfDeathBowling['Economy Rate'] = dfDeathBowling['Runs Conceded']/dfDeathBowling['Overs']
Let’s look at the teams with the best Economy Rate in the Death Overs.
To do that, we will sort the dataframe using the sort_values function in pandas and sort it on Economy Rate in ascending order since a lower Economy Rate is better.
dfDeathBowling = dfDeathBowling.sort_values('Economy Rate',ascending=False).reset_index(drop=True)
dfDeathBowling
Team | Runs Conceded | Wickets | Balls Bowled | Overs | Bowling Average | Bowling Strike Rate | Economy Rate | |
---|---|---|---|---|---|---|---|---|
0 | Lucknow Super Giants | 776 | 35 | 418 | 69.666667 | 22.171429 | 11.942857 | 11.138756 |
1 | Sunrisers Hyderabad | 669 | 33 | 369 | 61.500000 | 20.272727 | 11.181818 | 10.878049 |
2 | Mumbai Indians | 639 | 33 | 354 | 59.000000 | 19.363636 | 10.727273 | 10.830508 |
3 | Royal Challengers Bangalore | 744 | 31 | 422 | 70.333333 | 24.000000 | 13.612903 | 10.578199 |
4 | Rajasthan Royals | 813 | 39 | 469 | 78.166667 | 20.846154 | 12.025641 | 10.400853 |
5 | Chennai Super Kings | 599 | 27 | 346 | 57.666667 | 22.185185 | 12.814815 | 10.387283 |
6 | Delhi Capitals | 711 | 35 | 414 | 69.000000 | 20.314286 | 11.828571 | 10.304348 |
7 | Kolkata Knight Riders | 640 | 30 | 376 | 62.666667 | 21.333333 | 12.533333 | 10.212766 |
8 | Punjab Kings | 561 | 29 | 342 | 57.000000 | 19.344828 | 11.793103 | 9.842105 |
9 | Gujarat Titans | 645 | 38 | 415 | 69.166667 | 16.973684 | 10.921053 | 9.325301 |
Finally, let’s make a simple visualization for the teams with the best Economy Rate in the form of a horizontal bar graph.
fig,ax = plt.subplots(figsize=(12,10))
# Setting background colors
fig.set_facecolor('black')
ax.patch.set_facecolor('black')
# Setting the grid for axis
ax.set_axisbelow(True)
ax.grid(color='grey',which='major',linestyle='--',alpha=0.75)
# Embellishing the spines
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['left'].set_color('white')
ax.spines['bottom'].set_color('white')
ax.spines['left'].set_linewidth(3)
ax.spines['bottom'].set_linewidth(3)
# Normalizing the Economy Rate values
scaler = MinMaxScaler(feature_range=(0.2,1))
# Using the normalized Economy Rate values as alpha values
alphaList = scaler.fit_transform(dfDeathBowling['Economy Rate'].values.reshape(-1,1)).flatten().tolist()
alphaList.reverse()
# Plotting the horizontal bar graph
for index,row in dfDeathBowling.iterrows():
ax.barh(dfDeathBowling['Team'][index],dfDeathBowling['Economy Rate'][index],
color='#0217fa',alpha=alphaList[index])
ax.barh(dfDeathBowling['Team'][index],dfDeathBowling['Economy Rate'][index],
color='None',edgecolor='white',lw=2)
# Setting the fontsize, color and font-style of the tick labels
plt.yticks(fontsize=16,color='white',fontname='Arial Rounded MT Bold')
plt.xticks(fontsize=13,color='white',fontname='Arial Rounded MT Bold')
# Setting the labels and the title
plt.xlabel('Economy Rate',fontsize=18,color='white',fontname='Arial Rounded MT Bold')
plt.title(x=0.35,y=1.05,label='Economy Rate in Death Overs\nIndian Premier League - 2022',
fontsize=28,color='white',fontname='Arial Rounded MT Bold')
plt.show()