
Data Science & Analysis Projects
Looking for freelance Data Science & Analysis jobs and project work? PeoplePerHour has you covered.
Sold UK Property Data
I am looking for an experienced UK property data researcher / scraper to compile a high-quality dataset of sold residential properties in England & Wales (C3 and C4 use classes) over the past 4 years, where the buyer is a limited company. The output must be delivered in Excel or CSV, cleanly formatted and deduplicated.
For each transaction, the data we require:
- Company name of purchaser
- Registered address of company
- Address of sold property
- Sale price
- Completion / sold date
Please explain:
- How you would identify the data
- Timeline
- Examples of similar UK property data projects
This may lead to ongoing repeat work for the right freelancer. Please start your proposal with: “UK property data understood”
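As a rough illustration of the "cleanly formatted and deduplicated" requirement, the sketch below removes repeated transactions and writes the five requested fields as CSV. The column names and the choice of deduplication key are assumptions made for this example; the listing does not specify either.

```python
import csv
import io

# Hypothetical column names for the five fields requested in the brief.
FIELDS = [
    "company_name",
    "company_address",
    "property_address",
    "sale_price",
    "completion_date",
]


def normalise(value: str) -> str:
    """Upper-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(value.upper().split())


def deduplicate(rows):
    """Keep the first row for each (property, price, date, buyer) combination."""
    seen = set()
    unique = []
    for row in rows:
        key = (
            normalise(row["property_address"]),
            row["sale_price"].strip(),
            row["completion_date"].strip(),
            normalise(row["company_name"]),
        )
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def to_csv(rows) -> str:
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Matching on normalised text only catches exact repeats; address variants such as "Rd" versus "Road" would need a stronger matching rule.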
4 days ago | 15 proposals | Remote
DATA ANALYTIC STRUCTURE
Seeking a skilled data analytics structure specialist to collaborate with a UX designer. Project involves organizing, modeling, and documenting data flows, creating clean, scalable schemas, and defining metrics to inform user experience decisions. Deliverables: annotated data architecture diagrams, data dictionary, ETL blueprint, and recommendations for instrumentation and dashboards. Require clear, concise documentation and pragmatic solutions to support iterative UX research and product design.
4 days ago | 10 proposals | Remote
LinkedIn Profile Data Extraction
I need comprehensive data extracted from approximately 375 LinkedIn profiles and delivered in a structured JSON format. You will use your own tools and infrastructure, ensuring each profile's roles are individually detailed.
Scope of work
- Extract info from 375 LinkedIn profiles, including full name, headline, location, LinkedIn URL, email, phone, about text, connection count, follower count, profile photo URL, premium status, and verified status.
- For each profile, list every individual role separately with detailed info, including company name, job title, start date, end date, duration, location, employment type, description text, company industry, company size, company website, and LinkedIn URL.
- Capture education details like school, degree, field of study, dates, and activities or societies.
- Extract certifications with name, issuing organization, issue date, and credential ID.
- Compile skills with endorsement counts, languages and proficiency, volunteer work, honors/awards, recommendations, featured section content, groups, and interests/follows.
- Include recent posts/activity with full text, post date, likes, and comments with names.
Additional information
You'll start with a test batch of 5 profiles and proceed with the full list upon approval. Please go through the attached PDF.
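The requirement that every role be listed separately suggests a nested JSON shape, with one profile object holding a list of role objects. The sketch below shows one possible layout using only a subset of the requested fields. All class and key names are assumptions for illustration; the attached PDF mentioned in the brief may define a different schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Role:
    # One entry per individual role, even for several roles at one company.
    company_name: str
    job_title: str
    start_date: Optional[str] = None
    end_date: Optional[str] = None  # None when the role is current or unknown
    location: Optional[str] = None
    employment_type: Optional[str] = None
    description: Optional[str] = None


@dataclass
class Profile:
    full_name: str
    linkedin_url: str
    headline: Optional[str] = None
    location: Optional[str] = None
    roles: List[Role] = field(default_factory=list)


def profiles_to_json(profiles: List[Profile]) -> str:
    """Serialise profiles to a JSON array; missing values become null."""
    return json.dumps([asdict(p) for p in profiles], indent=2, ensure_ascii=False)
```

Emitting missing values as `null` rather than dropping the key keeps every record the same shape, which makes the 5-profile test batch easier to check.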
5 days ago | 13 proposals | Remote
Looking for a data administrator
We are looking for a data administrator. The requirements for this role are:
- Must be based in Eastern Europe, the United States or Canada
- This is a part-time position, not full time; you will need to work a few hours per week
- Must speak English
If this suits you, please share your CV.
Best regards
5 days ago | 11 proposals | Remote
opportunity
AI & Data Automation Specialist for OpenClaw project
Objective
To implement an AI-driven system that converts operational retail data into clear insights and actionable tasks, improving decision-making and execution across the business.
Scope
The project will integrate existing data sources:
- Retail performance systems (sales, margin, transactions)
- Staffing and hours data
- ClickUp (task management)
Using AI tools (ChatGPT / Claude and OpenClaw), the system will:
- Analyse structured operational data
- Generate concise, action-focused reports
- Translate insights into prioritised tasks
- Assign and track execution via ClickUp
Key Outputs
- Weekly performance reports (AI-generated)
- Exception / “red flag” reports
- Trend analysis (e.g. margin, volume, staffing efficiency)
- Structured task lists linked to operational issues
Success Criteria
- Clear link between data → insight → action → outcome
- Reduction in manual reporting and decision effort
- Improved operational performance (e.g. margin, labour efficiency)
- Consistent execution through task tracking
Approach
- Use existing reports and datasets (no rebuild of core systems)
- Layer AI analysis and workflow automation on top
- Start with manual processes → move to automation over time
Outcome
A scalable “operating system” for the business that:
- Identifies issues early
- Drives consistent action
- Enhances performance without increasing management overhead
I am open to engaging over a longer period of time to help monitor and improve the system. Please outline in your response what experience you have completing a similar project.
11 days ago | 41 proposals | Remote
I need a Magento e-commerce site scraped for products
Our supplier uses Magento and we wish to scrape all the product data, including images, to upload onto our own site. Due to the sheer volume of products, this is the easiest way to achieve this. The products then need uploading to our own WooCommerce site, with the image titles changed to remove the supplier's name and made SEO friendly.
16 days ago | 36 proposals | Remote
Trainer Requirement for SAP SFP
Hi Everyone! We are seeking an experienced trainer with expertise in SAP SFP and K2 Workflow Integration to deliver an offline training program. The training will take place in Bangalore. This opportunity is immediate, and the duration will be determined based on the trainer’s recommendation. Candidates with strong hands-on experience who are available to start soon are encouraged to connect. Please let me know if you are interested. Feel free to DM and share this within your network or tag someone who would be a great fit.
25 days ago | 1 proposal | Remote
Data engineer
We are seeking an experienced Data Engineer to help organize, clean, and structure complex real estate and regulatory compliance data across multiple sources. This role focuses on transforming inconsistent datasets related to leases, occupancy, tenants, and rent information into a reliable and scalable data foundation. The ideal candidate will review existing data, identify quality issues such as duplication and missing fields, and design standardized schemas and relationships. You will build transformation workflows to clean and normalize data from spreadsheets, databases, and system exports. In this role, you will create master datasets for properties, units, households, leases, and compliance tracking while implementing validation rules and exception reporting. You will also document data definitions, mapping logic, and business rules to support transparency and long-term maintainability, while collaborating with stakeholders to translate operational requirements into structured data models. Strong proficiency in SQL and Python is required, along with hands-on experience in ETL/ELT workflows and relational data modeling. Experience working with messy, Excel-heavy datasets and building data quality checks is essential, and familiarity with tools like dbt, Airflow, or cloud platforms such as Snowflake or BigQuery is highly preferred. Success in this role means delivering a clear, consistent source of truth for lease and occupancy data, reducing inconsistencies, and preparing the data environment for reporting, automation, and future product development.
a month ago | 15 proposals | Remote
Past Projects
Secure Multi-Tenant Print System Setup (CUPS/IPPS/Linux)
We are looking for an experienced Linux engineer to design and implement a secure, isolated print environment using CUPS. The goal is to build a lean, reliable system that delivers the required functionality without unnecessary complexity or ongoing maintenance overhead.
Project Objectives
The system should:
- Securely handle print jobs
- Support automatic printing for standard queues
- Allow manual release for sensitive jobs (e.g. finance)
- Enable safe re-routing of jobs if a printer is unavailable
- Ensure jobs are not retained longer than necessary
- Operate independently without affecting other services
Key Requirement: Isolation
A critical requirement is strict separation between environments:
- No visibility of other users’ printers or queues
- No access to other users’ print jobs
- No risk of jobs being sent to the wrong printer
- Strong access control and secure handling
Technical Scope
You will be responsible for:
- Setting up an isolated environment (VM preferred)
- Configuring CUPS with IPPS (TLS encryption)
- Creating hybrid queue logic (auto-print + hold/release)
- Implementing lightweight printer health checks
- Enabling manual fallback routing
- Applying security hardening and access controls
- Configuring automatic job cleanup policies
Devices
Printers include models similar to the following:
- Toshiba e-STUDIO
- Katun Arivia
(All expected to support secure IPP/IPPS where possible.)
Important Considerations
We want to avoid:
- Overengineered solutions
- Complex custom scripting where not required
- High-maintenance or costly components
We prefer:
- Native CUPS functionality
- Simple, predictable behaviour
- Low ongoing maintenance
What We Expect
Before full implementation, you should:
- Review the scope
- Recommend the most cost-effective and simplest architecture
- Identify anything that may increase time or complexity
- Suggest improvements where needed
Deliverables
- Clear architecture outline
- Implementation of the agreed setup
- Documentation of configuration and approach
Ideal Candidate
- Strong experience with Linux systems (AlmaLinux/RHEL-based)
- Hands-on experience with CUPS and IPP/IPPS
- Knowledge of secure system design and isolation
- Ability to deliver simple, practical solutions
Experienced Linux engineer required to design and implement a secure, isolated print environment using CUPS/IPPS. Deliver a lean, reliable solution that securely handles print jobs, supports automatic printing for standard queues, provides hold-and-release for sensitive jobs, enables safe rerouting when printers are unavailable, and enforces timely job cleanup. Implement VM-based isolation, TLS-encrypted IPPS, hybrid queue logic, lightweight health checks, manual fallback routing, hardened access controls, and concise documentation. Recommend simplest cost-effective architecture and identify complexity risks. The ideal candidate has AlmaLinux/RHEL experience, strong CUPS and IPP/IPPS knowledge, and a focus on minimal maintenance and native functionality.
Simple Excel tracker and KPI dashboard
I’m looking for someone to help improve an existing Excel spreadsheet. The spreadsheet needs to:
- Track order quantity, order type, and order value
- Include a daily tracking view
- Include a monthly summary tracker
- Automatically calculate totals and basic KPIs
- Be clean, simple, and easy to update
This is a straightforward task. I already have a file started; it just needs improving and structuring properly. The sheet doesn't have much data yet, but it will grow month by month. Ideal if you have experience creating Excel trackers or KPI dashboards. I would rather use a person than AI, but I need to change the budget to a fixed price, as I don't see this as a complex project for an Excel professional.
Data Mining in Brent area
Data mining required. We want to compile a list of the following businesses operating within 5 miles of Wembley Stadium (HA0):
1) All estate agents: company name, physical address / office, email, telephone number, LinkedIn and other social media handles, and the name of the office manager if possible
2) All lawyers within 5 miles of Wembley Stadium, with the same information as above
3) All accountants within 5 miles of Wembley Stadium
4) List of all architects
5) List of all mortgage brokers
6) List of all solicitors / lawyers
7)
I Need Help with Microsoft Excel Conditional Formatting Logic
Hello, I need support from a Microsoft Excel expert to make rows bold at a set interval (such as every 2 days, 5 days or 50 days). This will help our editors know when to publish an article for a category. Please see the video here to understand more:
Video - https://drive.google.com/file/d/1fQRWpzcOaUz7_8mvlTOj2oR6mrZo2-NQ/view?usp=sharing
I need you to provide the logic for these conditions so that Excel makes rows bold every 2 days, 5 days, or 50 days. To explain properly: for a category that gets an article at a 7-day interval, it works this way.
For a 7-day cycle:
Day 0 (start date) - text is bold on the start day (2026,4,8)
Day 1 - text is normal
Day 2 - text is normal
Day 3 - text is normal
Day 4 - text is normal
Day 5 - text is normal
Day 6 - text is normal
Day 0 (7th day) - text is bold again (on the 7th day the remainder is 0 again)
For a 3-day cycle:
Day 0 (start date) - text is bold on the start day (2026,4,8)
Day 1 - text is normal
Day 2 - text is normal
Day 0 (3rd day) - text is bold again (on the 3rd day the remainder is 0 again)
This loop continues. Let me know if you get this.
-Ahmed
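The cycle described above comes down to a remainder test: a row is bold when the number of days since the start date, divided by the cycle length, leaves a remainder of 0. The sketch below expresses that test in Python so the rule can be checked against the worked examples. The function name is hypothetical. In Excel, the same idea would typically be a conditional-formatting formula built on `MOD`, for example `=MOD(TODAY()-$A2, 7)=0`, assuming the start date is in column A; the actual sheet layout is not given in the brief.

```python
from datetime import date, timedelta


def is_bold_day(start: date, current: date, cycle_days: int) -> bool:
    """Return True when `current` falls on a publishing day of the cycle."""
    if cycle_days <= 0:
        raise ValueError("cycle_days must be positive")
    elapsed = (current - start).days
    # Dates before the start date are never bold; afterwards the row is
    # bold whenever the elapsed days divide evenly by the cycle length.
    return elapsed >= 0 and elapsed % cycle_days == 0


start = date(2026, 4, 8)
# Offsets (in days from the start date) that would be bold in a 7-day cycle.
bold_offsets = [
    offset
    for offset in range(15)
    if is_bold_day(start, start + timedelta(days=offset), 7)
]
```

For the 7-day example in the brief, the bold offsets in the first two weeks are days 0, 7 and 14.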
opportunity | pre-funded
25,000-Review Data Scraper Needed – Trustpilot to Feefo Import
We’re seeking a detail-oriented data scraper to help us migrate 25,000 reviews from a Trustpilot page. These reviews cover both service and product categories. The final deliverable will be in CSV format, aligning with Feefo’s import structure (examples available for both product and service).
The key responsibilities will include:
- Reviewing the required data, which can be seen here (sample CSVs can be downloaded for both service and product): https://help.feefo.com/knowledge-base/importing-reviews
- Scraping all reviews accurately, ensuring data quality.
- Structuring the output in CSV format, ready for Feefo import.
- Handling both service and product reviews according to Feefo’s guidelines.
We need this completed as soon as possible, ideally within the week. Timing is crucial. The ideal candidate should be fluent in English, able to communicate with our broader team, and willing to confirm any manual steps needed before starting. If you can deliver a reliable, fast turnaround, we’d love to hear from you!
4 STL files to run analysis on Ansys Fluent for lift and drag
I would like someone to fix the issues in 4 STL files (created in SOLIDWORKS), which are parts of a Formula car, so that they can be used in Ansys Fluent Student to run lift and drag analysis. There are meshing problems in Ansys that I can't seem to fix, and they prevent me from running the lift and drag analysis on the student version, so the meshing needs to be fixed in Ansys. It would help if you can show that the lift and drag analysis works on the files. This needs to be done in Ansys Fluent Student. Files can be sent once the project is agreed.
Pwa offline
send me invoice for first milestone.
Fix Formula Errors + Broken Links in Excel Budget Dashboard
I already have a complete Annual & Monthly Budget Dashboard in Excel. I do NOT need a new dashboard. I only need a freelancer to:
- Fix formula errors
- Repair broken links between sheets
- Make sure calculations update correctly
- Ensure all dashboard sections pull the right data
The file includes sheets such as:
• Setup
• Transactions
• Monthly Dashboard
• Annual Dashboard
• Comparative Dashboard
Everything is already built; it just needs fixing and cleanup.
Requirements:
• Strong Excel formula and debugging skills
• Experience fixing broken dashboards
• Ability to deliver clean and error-free results
• Good communication
opportunity
Building database of owners by web scraping
I have been working with DeepSeek to extract data from the website tuscasasrurales.com. The data I need is shown in the attached file Granada.png. Some of this data requires a link to be clicked, as shown in the uploaded file Tuscasasrurales.png. Email addresses have to be obtained by visiting the owner's website (if there is one). Where no data is available, leave the field blank. I have been trying to extract the data one Spanish province at a time. To get a list by province, enter the site and type the name of the province in the search box, e.g. Granada, which returns 304 entries. Although DeepSeek was unable to get all the info I wanted, it has given me a Python script which will do the job. I have uploaded this. It does not include the fields Bedrooms and Bathrooms, which I would also like included. Can you do this work, and how much would you charge? There are initially 10 provinces with an average of +/- 200 entries in each. Thanks - Allan
opportunity
Urgent OCR
In this folder are three folders called Bishopsgate archives, Newham archives and special collections and RIBA collections. https://www.dropbox.com/scl/fo/ixwakdbgdhodiq9sqlbk6/AAuJLgrYLb2WB1a--E_b76A?rlkey=y0f9poa3t1rpalmemh2qljs2n&st=dfvttlf8&dl=0. Ignore the folder called riba objects called scanning. Each of those three folders has many subfolders. For each of those subfolders perform these instructions.
OCR INSTRUCTIONS
1. One file per folder
For each archive folder (e.g. TBUK1, NEWHAM2), create ONE plain text file (.txt). Put all documents from that folder into that one file.
2. Clear document separation and stable IDs
Every time a new document starts, write:
==============================
ARCHIVE_FOLDER: TBUK1
DOCUMENT_ID: TBUK1_01
DATE: (write date exactly as shown, or Unknown)
PLACE: (write place exactly as shown, or Not stated)
------------------------------
Then paste the full OCR text of that document.
For the next document:
==============================
ARCHIVE_FOLDER: TBUK1
DOCUMENT_ID: TBUK1_02
DATE:
PLACE:
------------------------------
Continue sequentially: TBUK1_03, TBUK1_04, etc.
For a different folder (e.g. NEWHAM2):
ARCHIVE_FOLDER: NEWHAM2
DOCUMENT_ID: NEWHAM2_01
Do not restart numbering without the folder prefix.
If date or place is not visible:
DATE: Unknown
PLACE: Not stated
Do NOT guess.
3. OCR rules
* Copy text exactly as written.
* Do NOT correct spelling or grammar.
* Do NOT rewrite sentences.
* Do NOT summarise.
* Keep paragraph breaks.
* Remove page numbers.
* If a word cannot be read, write: [illegible]
* Do not insert commentary.
4. Hand-drawn diagrams
Do NOT attempt full OCR of technical drawings. Instead include:
==============================
ARCHIVE_FOLDER: TBUK1
DOCUMENT_ID: TBUK1_XX
DATE:
PLACE:
------------------------------
HAND-DRAWN ENGINEERING DRAWING
Title: (if visible)
Location: (if visible)
Company: (if visible)
If no readable text at all:
HAND-DRAWN ENGINEERING DRAWING (no readable text)
Do NOT copy measurements or technical numbers from diagrams.
5. Save format
* Save as .txt
* Use UTF-8 encoding
6. For each subfolder also export one PDF for me to refer to easily. The PDF must be under 30MB. You can use these lower-res versions of the subfolders for the PDFs. https://www.dropbox.com/scl/fo/hjed8pk08njhzfns6hhvb/ALP8N6HCGRHAuYwiJR_tEg4?rlkey=t4p5y3q3954monibbm2tktz90&st=lb5eo4lp&dl=0
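The header template and sequential IDs in the instructions above lend themselves to a small helper that assembles one text file per archive folder. The sketch below is a minimal illustration only: the function names, the dictionary keys and the separator lengths are assumptions (the separators should be copied from the brief), and the OCR step itself is out of scope.

```python
SEPARATOR = "=" * 30  # assumed length; copy the exact line from the brief
RULE = "-" * 30       # assumed length; copy the exact line from the brief


def format_document(folder, index, text, date="Unknown", place="Not stated"):
    """Build one document block: header, rule, then the untouched OCR text."""
    header = [
        SEPARATOR,
        f"ARCHIVE_FOLDER: {folder}",
        f"DOCUMENT_ID: {folder}_{index:02d}",  # e.g. TBUK1_01, TBUK1_02
        f"DATE: {date}",
        f"PLACE: {place}",
        RULE,
    ]
    # Only surrounding blank space is trimmed; the OCR text itself is not
    # corrected, rewritten or summarised.
    return "\n".join(header) + "\n" + text.strip() + "\n"


def build_archive_text(folder, documents):
    """Join all documents of one folder, numbered sequentially from 01."""
    blocks = [
        format_document(
            folder,
            number,
            doc["text"],
            doc.get("date", "Unknown"),
            doc.get("place", "Not stated"),
        )
        for number, doc in enumerate(documents, start=1)
    ]
    return "\n".join(blocks)


# Usage (hypothetical data): save the result as UTF-8, one file per folder.
# from pathlib import Path
# text = build_archive_text("TBUK1", [{"text": "First letter ..."}])
# Path("TBUK1.txt").write_text(text, encoding="utf-8")
```

Defaulting the date and place to "Unknown" and "Not stated" follows the "Do NOT guess" rule: a value appears only when the caller supplies one read from the document.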
Advanced Gantt Chart for Project Management Dashboard
I am looking for an experienced Power BI developer to build an advanced Gantt Chart visualization for a project management dashboard. The dashboard is used to track multiple projects and detailed task timelines, and the Gantt chart should provide a clear and interactive way to monitor project progress.
Requirements
The Gantt chart should support detailed project management information including:
- Multiple projects
- Project phases and sub-tasks
- Start date / End date
- Task duration
- Progress percentage
- Milestones
- Dependencies between tasks
- Responsible team or department
- Ability to filter by project, phase, and timeline
- A clear timeline view similar to MS Project
Technical Requirements
- Must be built in Power BI
- Should work with large datasets
- Can use custom visuals or advanced Power BI modeling
- Must integrate into an existing Power BI dashboard