Football Data (CSV, JSON) - UPDATED 16/08/17

welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

Hi All,

UPDATED 16/08/17
ADDED CSV VERSIONS OF ALL LEAGUES

I've decided to start collecting football data, mainly because I have an interest in the sport and also because of a certain jonnyg throwing stats around in a very hard-to-read format.

The data is quite simple and was just scraped from HKJC.

The data is in JSON format, but you should be able to convert it to CSV and import it into Excel. I chose JSON as it reduces data duplication, is structured, and can be read reasonably well by the human eye. It also works really well with Python (which I'm using to scrape and also analyse the data).
The data contains the following info.

Code: Select all

League
Season
Game Date
Home Team
Away Team
And also Goals and Red cards (Player, Time and Team)
-----
Soon:
Total FT/HT Goals and Red Cards for HT and AT (Included in CSV's)
I've managed to scrape ALL leagues and seasons from HKJC (where specific goal time data is provided).
I may decide to scrape more data, but for most leagues there are 10+ years' worth of games, with goal and red card time data.

The data is a little too big to post here, and I didn't fancy splitting everything into multiple zip files, so I've uploaded it and shared it via my Dropbox.
Not sure if the mods/admins could increase the file size limit so I can add it directly to the thread.
The link below goes directly to the folder containing all the data I've scraped.
Each league is in a separate text file, and I've also included a .zip of ALL the files.

https://www.dropbox.com/sh/7sif9n90ehys ... bd6ba?dl=0

Contents of the above files

Code: Select all

Number of Countries: 20
Country: Argentine - Leagues: 1
League Name: Argentine Division 1 - Seasons: 11
(2009-10(Au), 2010-11 (Sp), 2010-11(Au), 2011-12(Au), 2012-13 (In), 2012-13(Fi), 2013-14 (In), 2013-14(Fi), 2014-15 (In), 2015, 2016)
Country: Australian - Leagues: 2
League Name: Australian Division 1 Playoffs - Seasons: 7
(2008-09, 2009-10, 2010-11, 2011-12, 2012-13, 2013-14, 2014-15)
League Name: Australian Division 1 - Seasons: 9
(2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: Belgian - Leagues: 3
League Name: Belgian Division 1 Championship Playoffs - Seasons: 1
(2016)
League Name: Belgian Division 1 UE Cup Playoffs - Seasons: 1
(2016)
League Name: Belgian Division 1 - Seasons: 2
(2015-2016, 2016-2017)
Country: Brazilian - Leagues: 3
League Name: Brazilian Division 1 - Seasons: 9
(2005, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016)
League Name: Brazilian Paulista League Knockout stage - Seasons: 1
(2014)
League Name: Brazilian Paulista League - Seasons: 1
(2014)
Country: Chilean - Leagues: 1
League Name: Chilean Division 1 - Seasons: 6
(2014-15(AP), 2014-15(CL), 2015-16(AP), 2015-16(CL), 2016-17(AP), 2016-17(CL))
Country: Dutch - Leagues: 3
League Name: Dutch Cup - Seasons: 7
(2005-06, 2007-08, 2008-09, 2009-10, 2010-11, 2011-12, 2012-13)
League Name: Dutch Division 1 - Seasons: 10
(2005-2006, 2006-2007, 2008-2009, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
League Name: Dutch Division 2 - Seasons: 6
(2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: Eng - Leagues: 4
League Name: Eng Championship - Seasons: 13
(2004-2005, 2005-2006, 2006-2007, 2007-2008, 2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
League Name: Eng League 1 - Seasons: 8
(2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
League Name: Eng League Cup - Seasons: 6
(2006-07, 2007-08, 2009-10, 2010-11, 2012-13, 2013-14)
League Name: Eng Premier - Seasons: 14
(2003-2004, 2004-2005, 2005-2006, 2006-2007, 2007-2008, 2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: French - Leagues: 4
League Name: French Division 1 - Seasons: 12
(2005-2006, 2006-2007, 2007-2008, 2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
League Name: French Division 2 - Seasons: 6
(2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
League Name: French FA Cup - Seasons: 1
(2005-06)
League Name: French League Cup - Seasons: 1
(2008-09)
Country: German - Leagues: 3
League Name: German Cup - Seasons: 5
(2006-07, 2007-08, 2008-09, 2012-13, 2013-14)
League Name: German Division 1 - Seasons: 14
(2003-2004, 2004-2005, 2005-2006, 2006-2007, 2007-2008, 2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
League Name: German Division 2 - Seasons: 8
(2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: Italian - Leagues: 2
League Name: Italian Cup - Seasons: 2
(2005-06, 2007-08)
League Name: Italian Division 1 - Seasons: 14
(2003-2004, 2004-2005, 2005-2006, 2006-2007, 2007-2008, 2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: Japanese - Leagues: 3
League Name: Japanese Division 1 - Seasons: 14
(2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015-Stage 1, 2015-Stage 2, 2016-Stage 1, 2016-Stage 2)
League Name: Japanese Division 2 - Seasons: 9
(2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016)
League Name: Japanese League Cup - Seasons: 9
(2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014)
Country: Korean - Leagues: 1
League Name: Korean Division 1 - Seasons: 1
(2016)
Country: Mexican - Leagues: 1
League Name: Mexican Premier - Seasons: 6
(2014-15 (AP), 2014-15(CL), 2015-16(AP), 2015-16(CL), 2016-17(AP), 2016-17(CL))
Country: Norwegian - Leagues: 1
League Name: Norwegian Division 1 - Seasons: 12
(2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016)
Country: Portuguese - Leagues: 1
League Name: Portuguese Premier - Seasons: 9
(2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: Russian - Leagues: 1
League Name: Russian Premier - Seasons: 3
(2014-2015, 2015-2016, 2016-2017)
Country: Scottish - Leagues: 2
League Name: Scottish League Cup - Seasons: 5
(2005-06, 2007-08, 2008-09, 2009-10, 2012-13)
League Name: Scottish Premier - Seasons: 12
(2005-2006, 2006-2007, 2007-2008, 2008-2009, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: Spanish - Leagues: 1
League Name: Spanish Division 1 - Seasons: 13
(2003-2004, 2004-2005, 2005-2006, 2006-2007, 2007-2008, 2009-2010, 2010-2011, 2011-2012, 2012-2013, 2013-2014, 2014-2015, 2015-2016, 2016-2017)
Country: Swedish - Leagues: 1
League Name: Swedish Division 1 - Seasons: 12
(2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016)
Country: US - Leagues: 2
League Name: US Football League Playoffs - Seasons: 8
(2007, 2009, 2010, 2011, 2012, 2013, 2014, 2015)
League Name: US Football League - Seasons: 10
(2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016)
Okay, so there are now two sets of CSV files:
one with reds and goals separated, and one where they are together as "Events". The columns for each are as follows:


Goals and Reds Separated

Code: Select all

SEASON, DATE, HOME, AWAY, HHT_GOALS, AHT_GOALS, HFT_GOALS, AFT_GOALS, TOTAL_GOALS, HOME_REDS, AWAY_REDS, TOTAL_REDS, RED_1_TIME, RED_1_TEAM, RED_2_TIME, RED_2_TEAM, RED_3_TIME, RED_3_TEAM, RED_4_TIME, RED_4_TEAM, RED_5_TIME, RED_5_TEAM, RED_6_TIME, RED_6_TEAM, RED_7_TIME, RED_7_TEAM, RED_8_TIME, RED_8_TEAM, RED_9_TIME, RED_9_TEAM, RED_10_TIME, RED_10_TEAM, RED_11_TIME, RED_11_TEAM, RED_12_TIME, RED_12_TEAM, RED_13_TIME, RED_13_TEAM, RED_14_TIME, RED_14_TEAM, RED_15_TIME, RED_15_TEAM, GOAL_1_TIME, GOAL_1_TEAM, GOAL_2_TIME, GOAL_2_TEAM, GOAL_3_TIME, GOAL_3_TEAM, GOAL_4_TIME, GOAL_4_TEAM, GOAL_5_TIME, GOAL_5_TEAM, GOAL_6_TIME, GOAL_6_TEAM, GOAL_7_TIME, GOAL_7_TEAM, GOAL_8_TIME, GOAL_8_TEAM, GOAL_9_TIME, GOAL_9_TEAM, GOAL_10_TIME, GOAL_10_TEAM, GOAL_11_TIME, GOAL_11_TEAM, GOAL_12_TIME, GOAL_12_TEAM, GOAL_13_TIME, GOAL_13_TEAM, GOAL_14_TIME, GOAL_14_TEAM, GOAL_15_TIME, GOAL_15_TEAM

Goals and Reds Together as Events

Code: Select all

SEASON, DATE, HOME, AWAY, HHT_GOALS, AHT_GOALS, HFT_GOALS, AFT_GOALS, TOTAL_GOALS, HOME_REDS, AWAY_REDS, TOTAL_REDS, EVENT_1_TIME, EVENT_1_TEAM, EVENT_1_TYPE, EVENT_2_TIME, EVENT_2_TEAM, EVENT_2_TYPE, EVENT_3_TIME, EVENT_3_TEAM, EVENT_3_TYPE, EVENT_4_TIME, EVENT_4_TEAM, EVENT_4_TYPE, EVENT_5_TIME, EVENT_5_TEAM, EVENT_5_TYPE, EVENT_6_TIME, EVENT_6_TEAM, EVENT_6_TYPE, EVENT_7_TIME, EVENT_7_TEAM, EVENT_7_TYPE, EVENT_8_TIME, EVENT_8_TEAM, EVENT_8_TYPE, EVENT_9_TIME, EVENT_9_TEAM, EVENT_9_TYPE, EVENT_10_TIME, EVENT_10_TEAM, EVENT_10_TYPE, EVENT_11_TIME, EVENT_11_TEAM, EVENT_11_TYPE, EVENT_12_TIME, EVENT_12_TEAM, EVENT_12_TYPE, EVENT_13_TIME, EVENT_13_TEAM, EVENT_13_TYPE, EVENT_14_TIME, EVENT_14_TEAM, EVENT_14_TYPE, EVENT_15_TIME, EVENT_15_TEAM, EVENT_15_TYPE, EVENT_16_TIME, EVENT_16_TEAM, EVENT_16_TYPE, EVENT_17_TIME, EVENT_17_TEAM, EVENT_17_TYPE, EVENT_18_TIME, EVENT_18_TEAM, EVENT_18_TYPE, EVENT_19_TIME, EVENT_19_TEAM, EVENT_19_TYPE, EVENT_20_TIME, EVENT_20_TEAM, EVENT_20_TYPE, EVENT_21_TIME, EVENT_21_TEAM, EVENT_21_TYPE, EVENT_22_TIME, EVENT_22_TEAM, EVENT_22_TYPE, EVENT_23_TIME, EVENT_23_TEAM, EVENT_23_TYPE, EVENT_24_TIME, EVENT_24_TEAM, EVENT_24_TYPE, EVENT_25_TIME, EVENT_25_TEAM, EVENT_25_TYPE, EVENT_26_TIME, EVENT_26_TEAM, EVENT_26_TYPE, EVENT_27_TIME, EVENT_27_TEAM, EVENT_27_TYPE, EVENT_28_TIME, EVENT_28_TEAM, EVENT_28_TYPE, EVENT_29_TIME, EVENT_29_TEAM, EVENT_29_TYPE, EVENT_30_TIME, EVENT_30_TEAM, EVENT_30_TYPE
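For anyone who wants to build or validate these headers in code rather than copy them by hand, here is a minimal sketch. The column names are taken from the two listings above; the helper names (`numbered`, `separated_header`, `events_header`) are made up for illustration, and whether the real files put a space after each comma is not something the thread confirms.

```python
# Columns shared by both CSV layouts (names copied from the listings above).
BASE = [
    "SEASON", "DATE", "HOME", "AWAY",
    "HHT_GOALS", "AHT_GOALS", "HFT_GOALS", "AFT_GOALS", "TOTAL_GOALS",
    "HOME_REDS", "AWAY_REDS", "TOTAL_REDS",
]


def numbered(prefix, count, fields):
    """Return PREFIX_1_FIELD ... PREFIX_<count>_FIELD, grouped by slot number."""
    return [
        f"{prefix}_{i}_{field}"
        for i in range(1, count + 1)
        for field in fields
    ]


def separated_header():
    # In the "separated" layout the 15 red-card slots come before the 15 goal slots.
    return (
        BASE
        + numbered("RED", 15, ["TIME", "TEAM"])
        + numbered("GOAL", 15, ["TIME", "TEAM"])
    )


def events_header():
    # In the "events" layout there are 30 slots, each with TIME, TEAM and TYPE.
    return BASE + numbered("EVENT", 30, ["TIME", "TEAM", "TYPE"])


print(len(separated_header()), len(events_header()))
```

Comparing the generated list against the first row of a downloaded file is a quick way to catch a truncated or shifted header.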
Please let me know if I should make any changes to the above columns, or to the CSVs at all.
I've tried, best I can, to go through the data and make sure all games are there. I've not spotted any leagues with missing games as of yet. BUT PLEASE LET ME KNOW IF YOU SPOT ANY ISSUES.

Just to note: The data is all scraped from HKJC, so if there are any errors it would be down to the data they provided. (The dates I've been told are in HK time)

Cheers,
Adam
Last edited by welshboy06 on Wed Aug 16, 2017 12:28 pm, edited 8 times in total.
dm1900
Posts: 71
Joined: Sun Jan 15, 2017 10:02 pm

Nice, is this kind of data available on other leagues?
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

Hi doovd.

Yeah, the HKJC has data on a load of leagues (http://football.hkjc.com/football/stati ... x?ci=en-us), so I will be working my way through these when I have time. I'll start with the main leagues (Germany 1, France 1, Spain 1, Italy 1), then I'll do div2/championship etc.

Currently though I only have the above data (BPL)

Cheers,
Adam
Dallas
Posts: 22671
Joined: Sun Aug 09, 2015 10:57 pm
Location: Working From Home

Good Stuff Adam
Euler
Posts: 24700
Joined: Wed Nov 10, 2010 1:39 pm
Location: Bet Angel HQ

Adam is immediately elevated to god status!
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

Dallas wrote:
Sun Aug 13, 2017 10:08 am
Good Stuff Adam
Euler wrote:
Sun Aug 13, 2017 10:17 am
Adam is immediately elevated to god status!
Haha, thanks both. I'll get more leagues sorted over the next few days, and maybe get something in place to start scraping the new seasons as the info goes up on HKJC.
Tenable
Posts: 20
Joined: Sat Jul 16, 2016 4:04 pm

+1

Thanks for sharing, bud.
Euler
Posts: 24700
Joined: Wed Nov 10, 2010 1:39 pm
Location: Bet Angel HQ

If you put up some data I'll be happy to analyse and share some findings. Though I tend to squirt most of my stuff out into csv files rather than json
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

Hi Peter,

It's easy enough for me to export as CSV, but I couldn't decide on the layout/format of the data.
My initial idea was to have columns like below:

LEAGUE, SEASON, GAME_DATE, HOME, AWAY, GOAL_1_TIME, GOAL_1_TEAM, GOAL_1_PLAYER, (Repeat goal columns 15? times), RED_1_TIME, RED_1_TEAM, RED_1_PLAYER, (Repeat red columns 5? times)

If the above is something you can work with, then I'll start posting .json and .csv files in the above format.
Or if you have any other suggestions for the layout, I'd be happy to look into it :)

Cheers,
Adam
dm1900
Posts: 71
Joined: Sun Jan 15, 2017 10:02 pm

I think the nature of the data lends itself more to json as there are often one to many relationships (e.g. game has many goals). Thanks for this!
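To illustrate the one-to-many point, here is a rough sketch of what flattening a nested match into a wide CSV row involves. The field names in `match` and the team names are invented for the example; the real JSON schema is not shown in the thread.

```python
import csv
import io

# Hypothetical nested record: one match owning a list of goals.
match = {
    "season": "2016-2017",
    "date": "2017-05-21",
    "home": "Home FC",
    "away": "Away FC",
    "goals": [
        {"time": 8, "team": "H"},
        {"time": 55, "team": "A"},
    ],
}

MAX_GOALS = 15  # a wide CSV needs a fixed cap on slots; nested JSON does not


def flatten(match, max_goals=MAX_GOALS):
    """Turn one nested match into a flat row, padding unused goal slots."""
    row = [match["season"], match["date"], match["home"], match["away"]]
    goals = match["goals"][:max_goals]
    for goal in goals:
        row += [goal["time"], goal["team"]]
    # Every row must have the same width, so empty slots become blank cells.
    row += [""] * (2 * (max_goals - len(goals)))
    return row


buffer = io.StringIO()
csv.writer(buffer).writerow(flatten(match))
print(buffer.getvalue())
```

The padding is the cost of the flat layout: a two-goal game still carries 26 empty cells, and a game with more goals than the cap would lose data.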
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

doovd wrote:
Sun Aug 13, 2017 12:28 pm
I think the nature of the data lends itself more to json as there are often one to many relationships (e.g. game has many goals). Thanks for this!
Yes, which is exactly why I chose JSON. Plus there's a neat library for Python called jsonpickle, which lets me read and write my Python objects directly to a JSON file. Much smaller and less overhead than a database or even a CSV.

I believe the above JSON file has a slight error on some of the game dates (it points to a Python object instead of showing the actual date). I've corrected this and will be uploading the fixed version soon, as well as a CSV copy of the BPL league.
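As a rough illustration of how a date can end up as an object description rather than a readable value, here is a sketch using the standard-library `json` module. This is not jsonpickle and not necessarily what went wrong in the actual scraper; it just shows one plausible way to get that symptom and one way to avoid it. The record and the `encode_date` helper are made up for the example.

```python
import json
from datetime import date

# Hypothetical match record holding a real date object.
game = {"home": "Home FC", "away": "Away FC", "date": date(2017, 8, 13)}

# Falling back to repr() writes a description of the Python object,
# which is not a usable date for anyone reading the file.
broken = json.dumps(game, default=repr)


def encode_date(value):
    """Serialise dates as ISO 8601 strings; reject anything else unknown."""
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Cannot serialise {type(value).__name__}")


fixed = json.dumps(game, default=encode_date)

print(json.loads(broken)["date"])
print(json.loads(fixed)["date"])
```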

Then I'll move on to scraping the other leagues.

Cheers,
Adam
spreadbetting
Posts: 3140
Joined: Sun Jan 31, 2010 8:06 pm

Does it convert well to a MySQL database?
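A minimal sketch of how the match/event structure could map onto relational tables. It uses Python's built-in `sqlite3` as a stand-in rather than MySQL, and the table names, column names and sample rows are all assumptions made for the example; a MySQL version would need its own column types and driver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per match, and one row per goal or red card pointing back to it.
conn.executescript("""
CREATE TABLE matches (
    id INTEGER PRIMARY KEY,
    league TEXT, season TEXT, game_date TEXT, home TEXT, away TEXT
);
CREATE TABLE events (
    id INTEGER PRIMARY KEY,
    match_id INTEGER NOT NULL REFERENCES matches(id),
    minute INTEGER, team TEXT, type TEXT
);
""")

cur = conn.execute(
    "INSERT INTO matches (league, season, game_date, home, away) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Eng Premier", "2016-2017", "2017-05-21", "Home FC", "Away FC"),
)
match_id = cur.lastrowid

conn.executemany(
    "INSERT INTO events (match_id, minute, team, type) VALUES (?, ?, ?, ?)",
    [(match_id, 8, "H", "GOAL"), (match_id, 55, "A", "GOAL"), (match_id, 70, "A", "RED")],
)

# Count goals per match by joining events back to their match.
goals_per_match = conn.execute(
    "SELECT m.home, m.away, COUNT(*) FROM matches m "
    "JOIN events e ON e.match_id = m.id "
    "WHERE e.type = 'GOAL' GROUP BY m.id"
).fetchall()
conn.close()

print(goals_per_match)
```

With events in their own table there is no need for a fixed number of GOAL_n or EVENT_n columns.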
Tenable
Posts: 20
Joined: Sat Jul 16, 2016 4:04 pm

welshboy06 wrote:
Sun Aug 13, 2017 9:45 am
Hi All,

I've decided to start collecting football data. Mainly because I have an interest in the sport and also because of a certain jonnyg throwing stats around in a very hard to read format.

The data is quite simple and was just scraped from HKJC.

The data is in json format, but you should be able to convert it to csv and import it to Excel. I chose json as it reduces data duplication, is structured and can be read reasonably well by the human eye. It also works really well with Python (Which I'm using to scrape and also analyse the data)
What I've noticed from his posts is that jonnyg is years behind the curve when it comes to data and analysis compared to most people, and especially the users of this forum. Something that has taken him 5 years and 10 hrs a day of intensive manual typing, you have just done in a day or two, and in far greater detail. I'm guessing most of it was even automated while you were having your Sunday lunch :lol:

With the amount of readily available data for every sport that can be downloaded, scraped from a number of sources or even collected in real time, I'm lost as to why anyone would still be sitting doing this manually and collecting so little in the scheme of things.
jonnyg
Posts: 691
Joined: Wed Jan 18, 2017 8:11 pm

welshboy06 wrote:
Sun Aug 13, 2017 9:45 am
Hi All,

I've decided to start collecting football data. Mainly because I have an interest in the sport and also because of a certain jonnyg throwing stats around in a very hard to read format.

The data is quite simple and was just scraped from HKJC.

The data is in json format, but you should be able to convert it to csv and import it to Excel. I chose json as it reduces data duplication, is structured and can be read reasonably well by the human eye. It also works really well with Python (Which I'm using to scrape and also analyse the data)
The data contains the following info.
League
Season
Game Date
Home Team
Away Team
And also Goals and Red cards (Player, Time and Team)


At the moment I've only gotten around to scraping the BPL seasons that have all goal data on HKJC (04-05 to 16-17). I may also add in corners; however, that would be the total for the game, not timings.

More leagues and seasons will be coming soon, but I'm pretty busy with work etc. atm.

The file is a .txt file stored in the zip below (couldn't upload .txt directly) and is less than 2 MB! So it should be easy to read and manipulate on anyone's setup.

Just to note: The data is all scraped from HKJC, so if there are any errors it would be down to the data they provided.

BPL.zip

Now for one of the main reasons I did this, @jonnyg Asked me a question in another thread...
when you say easy ?

how easy ?





Well, the answer to that question is 3.5 goals (Total Goals / Total Games).

Since the 2011-2012 season, for games where the home team scored first and the first goal was scored on or before the 8th minute:
Total Games: 84 Total Goals: 294 Average Goals: 3.5
Min Goals: 1
Max Goals: 9

Cheers,
Adam

the question was rather different


"Can you tell me, for example, what is the average goal production since 2011-2012 in games where the home team in the PL opened the scoring on 8 minutes?"
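The difference between the two readings of the question ("on 8 minutes" versus "on or before the 8th minute") comes down to one comparison operator. A small sketch with made-up matches; the record fields are hypothetical and not the dataset's actual schema:

```python
# Made-up matches: minute of the first goal, which side scored it, total goals.
matches = [
    {"first_goal_minute": 8, "first_goal_side": "H", "total_goals": 4},
    {"first_goal_minute": 3, "first_goal_side": "H", "total_goals": 2},
    {"first_goal_minute": 8, "first_goal_side": "A", "total_goals": 1},
    {"first_goal_minute": 20, "first_goal_side": "H", "total_goals": 3},
]


def average_goals(games):
    """Total goals divided by number of games, or None for an empty selection."""
    if not games:
        return None
    return sum(g["total_goals"] for g in games) / len(games)


home_first = [m for m in matches if m["first_goal_side"] == "H"]

# "On or before the 8th minute" versus "on 8 minutes exactly".
on_or_before_8 = [m for m in home_first if m["first_goal_minute"] <= 8]
exactly_8 = [m for m in home_first if m["first_goal_minute"] == 8]

print(average_goals(on_or_before_8), average_goals(exactly_8))

# The figure quoted above is the same calculation on the real totals.
print(294 / 84)
```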
jonnyg
Posts: 691
Joined: Wed Jan 18, 2017 8:11 pm

PL 2017-2018, games where the home team opened the scoring on or before 8 minutes (exactly 8 minutes will be in bold)

4-3 3-3

2016-2017

3-1 2-1 4-0 5-0 1-0 4-2 4-0 2-0 3-1 4-2 1-2 3-1 3-4 1-1 6-3 1-0 1-1 1-0 2-2 1-4 1-3 2-2 4-2 1-0 4-0 2-1 3-2 3-1 4-0 6-1 2-0 3-0 1-2 4-2 2-4 3-1 2-0 2-1 1-1

28-5-6 < average goal production = 3.85

2015-2016

4-0 2-2 3-1 1-1 2-1 2-2 4-1 3-1 4-0 3-0 2-0 2-0 1-0 5-1 2-1 2-2 3-0 2-1 3-0 3-1 2-1 2-0 3-1 1-3 1-5 5-1 3-0 3-2 2-0

2014-2015

2013-2014

2012-2013

2011-2012


will double check the average goal production data at the end
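The 2016-2017 line can be double-checked with a few lines of Python. This is just a sketch that parses the scores exactly as posted above (home goals first); the `summarise` helper is made up for the example.

```python
# Scores copied from the 2016-2017 list above, home goals first.
scores_2016_17 = (
    "3-1 2-1 4-0 5-0 1-0 4-2 4-0 2-0 3-1 4-2 1-2 3-1 3-4 1-1 6-3 1-0 1-1 1-0 "
    "2-2 1-4 1-3 2-2 4-2 1-0 4-0 2-1 3-2 3-1 4-0 6-1 2-0 3-0 1-2 4-2 2-4 3-1 "
    "2-0 2-1 1-1"
)


def summarise(scores):
    """Return (home wins, draws, away wins, total goals, average goals)."""
    games = [tuple(map(int, s.split("-"))) for s in scores.split()]
    wins = sum(h > a for h, a in games)
    draws = sum(h == a for h, a in games)
    losses = sum(h < a for h, a in games)
    total = sum(h + a for h, a in games)
    return wins, draws, losses, total, total / len(games)


wins, draws, losses, total, average = summarise(scores_2016_17)
print(f"{wins}-{draws}-{losses}, {total} goals, average {average:.2f}")
```

On the posted scores this gives 28-5-6 and 150 goals from 39 games, which rounds to the 3.85 quoted.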
Last edited by jonnyg on Sun Aug 13, 2017 3:29 pm, edited 7 times in total.