
Regression Analysis of golf data with Machine Learning models
£41/hr (approx. $55/hr)
- Posted:
- Proposals: 36
- Remote
- #4328722
- Awarded
Description
Experience Level: Expert
I am looking for someone to help with regression analysis of a large dataset of golf-related stats. I would like the model output to be usable in Excel. I am happy to pay by the hour or as a set fee for each project.
I have experimented with Azure (including ensemble models) and Google's AutoML, but the results haven't been good enough; I suspect because of overfitting. I have been making predictions through the endpoints provided within these services.
Therefore, I am looking for someone to:
• Analyse the large dataset to create a better dataset by removing columns, adding extra columns, explaining links between columns etc. Many of the columns are highly correlated with each other, so this may be part of the reason for the overfitting.
• Run an improved dataset through several appropriate machine learning models. The model should be as complex as needed but no more complex than that.
• Output a model that I can then easily use in Excel to make future predictions. Ideally, I would also like to integrate this fully into my SQL database.
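To illustrate the first bullet, here is a minimal sketch of one common way to prune highly correlated columns before modelling. It is only an illustration, not a required method: the column names, the sample values and the 0.95 threshold are all invented assumptions, and it uses only the Python standard library.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant column as uncorrelated
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.95):
    """Return column names, keeping a column only if its absolute
    correlation with every already-kept column is below the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns, invented purely for illustration.
features = {
    "avg_adj_last_10": [70.1, 69.4, 71.2, 68.8, 70.5],
    "avg_adj_last_12": [70.2, 69.5, 71.1, 68.9, 70.6],
    "days_since_last_round": [7, 21, 14, 7, 35],
}

kept_columns = drop_correlated(features)
print(kept_columns)
```

In this toy data the two rolling averages are almost identical, so only the first is kept. Which column of a correlated pair to keep is a judgment call that depends on the real data.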
I would like this performed on my smaller women's golf dataset and, if successful, repeated on my larger men's golf dataset. In addition, in the future I will be creating more models for data subsets where even more data is available.
The main part of the database is the process which converts raw scores to adjusted scores. The raw scores are generally between 63 and 80, and you can see them here (R1-R4 columns):
https://www.pgatour.com/tournaments/2025/att-pebble-beach-pro-am/R2025005/leaderboard
I take these raw scores and convert them to adjusted scores in my SQL database. The conversion mainly involves calculating the field strength and then offsetting the raw scores accordingly. Field strength is a measure of the average ability of the players within a tournament. By offsetting raw scores to adjusted scores, I can compare two golfers on opposite sides of the world, even though they may never have played in the same tournament.
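As a rough sketch of what such an adjustment could look like (this is an assumed formula for illustration, not the actual process in the SQL database): take each player's prior rating as an expected score, define field strength as the mean prior rating of the field, and shift every raw score by the gap between the field's average raw score and that field strength. The function name, player names and numbers are hypothetical.

```python
from statistics import mean


def adjust_round(raw_scores, prior_ratings):
    """Offset raw scores so the field average matches the field strength.

    raw_scores: dict of player -> raw score for one round.
    prior_ratings: dict of player -> prior expected score (assumed known).
    """
    # Field strength: average prior ability of the players in this round.
    field_strength = mean(prior_ratings[p] for p in raw_scores)
    # Positive offset: the round played harder than the field's ability implies.
    offset = mean(raw_scores.values()) - field_strength
    return {player: score - offset for player, score in raw_scores.items()}


# Invented example: the field averaged 72 but its prior ability averages 70,
# so every raw score is lowered by 2 shots.
raw = {"Player A": 70, "Player B": 72, "Player C": 74}
ratings = {"Player A": 69.0, "Player B": 70.0, "Player C": 71.0}

adjusted = adjust_round(raw, ratings)
print(adjusted)
```

A real implementation would also need starting ratings for new players and probably an iterative update of ratings across tournaments; neither is covered here.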
The adjusted scores are then converted to hundreds of different categories, over different time periods, along with a few other columns such as the date.
I have been performing simple regression on this data to make future predictions of a golfer’s ability but I would like to experiment with a more complex model in the hope of creating more accurate predictions.
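For context, a simple regression baseline of this kind can be carried into Excel as a plain cell formula. The sketch below assumes a single predictor (for example a recent average adjusted score) and made-up numbers; the helper names and the cell reference are hypothetical.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def excel_formula(slope, intercept, cell="A2"):
    """Render the fitted line as an Excel formula that reads the predictor from `cell`."""
    return f"={intercept:.4f}+{slope:.4f}*{cell}"


# Invented data: recent average adjusted score vs. next adjusted score.
recent_avg = [68.0, 69.0, 70.0, 71.0, 72.0]
next_score = [69.0, 69.5, 70.0, 70.5, 71.0]

slope, intercept = fit_simple_regression(recent_avg, next_score)
formula = excel_formula(slope, intercept)
print(formula)
```

Linear models (including regularised ones such as ridge or lasso) export to Excel this way because a prediction is just a weighted sum. Tree ensembles and neural networks do not reduce to a short formula, which is one reason a lightweight external integration may be preferable for more complex models.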
Many thanks for taking the time to read through this. Please let me know if you would like me to send you a more detailed spec as well as some sample data. After receiving them, please explain which ML model(s) you intend to use and why you think they are a good fit for my project. Please also explain how I will then use the model for future predictions.

Steven H.
Rating: 100% (71)
Projects completed: 17
Freelancers worked with: 16
Projects awarded: 84%
Last project: 9 May 2025
United Kingdom
Clarification Board
Would you like the final model to work entirely within Excel, or would you prefer a lightweight external integration for better performance?
Steven H. (04 Feb 2025): Lightweight integration.
Mahmood U. (04 Feb 2025): Okay, I understand your work. Can you inbox me for further discussion?