
Exploratory data analysis & regression model in python by 20/08
- or -
Post a project like this134
€230(approx. $274)
- Posted:
- Proposals: 13
- Remote
- #4417297
- OPPORTUNITY
- Completed
WordPress Developer | Custom Themes, Plugins & E-commerce Solutions,web scraping,Data Entry,Artificial intelligence
Full Stack Engineer| Web & Mobile | MLOps | Artificial Intelligence | NLP | Data Science
Delivering High-Quality IT Services at Competitive Prices |Experienced Full Stack Web and App Developer |Android and IOS App Development|


43491531227545511865354100775624017362534829791102712226718584120309110680858274899784
Description
Experience Level: Entry
Part1 : Provided a dataset of volume sales of products from 2019 to 2022 run an extensive exploratory data analysis including the following: .
1. Data Quality & Structure Checks
Missing values, duplicates, negative sales, outliers
Date consistency (no gaps, proper frequency, handling holidays/weekends)
2. Descriptive Statistics
Overall distribution of daily sales (mean, median, std, skewness, kurtosis)
By dimension: product, customer
Identify top products per customer by volume
3. Time Series Exploration
Trend: long-term upward or downward movement
Seasonality: daily/weekly patterns (weekdays vs weekends), monthly, quarterly, yearly cycles
Rolling averages (7-day, 30-day) to smooth patterns
4. Visualization Layer
Time series plots: raw daily sales, moving averages
Boxplots: distribution of sales by weekday or month
Histograms/density plots: sales distribution
5. Anomaly & Outlier Detection
Unusual spikes/drops
Use Z-scores or interquartile ranges to flag anomalies
6. Correlation & Drivers of Sales
Correlation if needed
7. Performance Metrics (Baseline)
Set benchmarks to prepare for forecasting models:
Average daily sales per SKU/store
Volatility (Coefficient of Variation)
Baseline forecast error (e.g., naïve forecast MAPE)
EDA Deliverables :
By the end of an extensive EDA, I should have:
Clear understanding of demand patterns, seasonality, and anomalies
Insights into drivers of sales (internal like price/promo, external like weather/events)
Segmentation of products into high/medium/low performers
A baseline performance snapshot to compare forecasting models against.
Part 2 :
After cleaning the data based on the above analysis, run a linear regression-based model to prepare a sales volume forecast at product & customer level for 2022 in python or/and pyspark. Measure the accuracy by introducing quality measures and explain why have you introduced these measures.
1. Data Quality & Structure Checks
Missing values, duplicates, negative sales, outliers
Date consistency (no gaps, proper frequency, handling holidays/weekends)
2. Descriptive Statistics
Overall distribution of daily sales (mean, median, std, skewness, kurtosis)
By dimension: product, customer
Identify top products per customer by volume
3. Time Series Exploration
Trend: long-term upward or downward movement
Seasonality: daily/weekly patterns (weekdays vs weekends), monthly, quarterly, yearly cycles
Rolling averages (7-day, 30-day) to smooth patterns
4. Visualization Layer
Time series plots: raw daily sales, moving averages
Boxplots: distribution of sales by weekday or month
Histograms/density plots: sales distribution
5. Anomaly & Outlier Detection
Unusual spikes/drops
Use Z-scores or interquartile ranges to flag anomalies
6. Correlation & Drivers of Sales
Correlation if needed
7. Performance Metrics (Baseline)
Set benchmarks to prepare for forecasting models:
Average daily sales per SKU/store
Volatility (Coefficient of Variation)
Baseline forecast error (e.g., naïve forecast MAPE)
EDA Deliverables :
By the end of an extensive EDA, I should have:
Clear understanding of demand patterns, seasonality, and anomalies
Insights into drivers of sales (internal like price/promo, external like weather/events)
Segmentation of products into high/medium/low performers
A baseline performance snapshot to compare forecasting models against.
Part 2 :
After cleaning the data based on the above analysis, run a linear regression-based model to prepare a sales volume forecast at product & customer level for 2022 in python or/and pyspark. Measure the accuracy by introducing quality measures and explain why have you introduced these measures.
Mata L.
100% (2)Projects Completed
2
Freelancers worked with
2
Projects awarded
67%
Last project
19 Aug 2025
Greece
New Proposal
Login to your account and send a proposal now to get this project.
Log inClarification Board Ask a Question
-

Do you see and read my proposals Data Exploratory Regression Model Python with start project familiar please ? Thanks
1137351
We collect cookies to enable the proper functioning and security of our website, and to enhance your experience. By clicking on 'Accept All Cookies', you consent to the use of these cookies. You can change your 'Cookies Settings' at any time. For more information, please read ourCookie Policy
Cookie Settings
Accept All Cookies