
Data Cleansing Projects
Looking for freelance data cleansing jobs and project work? Browse active opportunities on PeoplePerHour, or hire data miners through Toptal’s rigorously vetted talent network.
Data analysis
Seeking an experienced data analyst to perform comprehensive data analysis, including cleaning, transformation, exploratory analysis, statistical testing, and insightful visualization. Deliverables include documented code, clear summary of findings, actionable recommendations, and reproducible reports. Candidate should demonstrate strong proficiency in Python or R, SQL, and data visualization libraries, with attention to data integrity and effective communication of results. Timely delivery and iterative collaboration expected.
4 hours ago · 15 proposals · Remote
Data scraping
I require an experienced data researcher / scraping specialist to extract high-quality, niche-specific data from multiple UK sources and deliver it in a clean, structured format. This is not a bulk scraping job: accuracy and relevance are critical.

Scope of Work:

1. Target Data (Initial Focus)
I am looking to build lists for the following niches:
• Property investors (Buy-to-Let, HMO, developers)
• Company directors in property-related businesses
• Business owners suitable for commercial finance

2. Data Sources
You may use a combination of:
• Companies House (via SIC codes & director data)
• Property-related platforms (where legally accessible)
• Public directories
• LinkedIn (Sales Navigator filtering; manual research, not scraping if restricted)
• Other compliant UK data sources

3. Required Data Fields
Each record should include (where available):
• Full Name
• Company Name
• Role (Director / Owner)
• Email Address (verified where possible)
• Phone Number (if available)
• Location (UK-based)
• Industry / SIC Code
• Notes (if relevant, e.g. property investor / developer)

4. Data Quality Requirements
• UK-based contacts only
• No duplicate entries
• No generic or irrelevant businesses
• Data must be accurate and up-to-date
• Avoid scraped junk or low-quality lists

5. Compliance (VERY IMPORTANT)
• All data must be sourced from publicly available or compliant sources
• Must comply with UK GDPR guidelines
• No illegal scraping or data extraction methods
• No use of prohibited LinkedIn automation tools

6. Output Format
• Excel / Google Sheets
• Clearly structured columns
• Clean, ready for outreach use

7. Initial Volume
• Phase 1: 4000-5000 high-quality leads
• Potential for ongoing weekly/monthly work

Ideal Candidate:
• Experience with UK data sourcing (Companies House, etc.)
• Strong data cleaning and validation skills
• Familiar with B2B lead generation
• Understanding of property / finance sector is a plus
• Able to suggest better data sources (not just follow instructions)

Budget & Timeline:
• Open to proposals based on quality
• Looking to start immediately
• Ongoing work available for the right person

To apply, please include:
• Examples of similar data projects
• Tools you use for sourcing and verification
• Your approach to ensuring data accuracy

Start your proposal with: “Quality over quantity — understood”

Additional Notes:
This project is focused on building a long-term data pipeline for targeted outreach campaigns. I am looking for someone reliable who can consistently deliver high-quality results.
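As a rough illustration of the data quality requirements above (UK-only, no duplicates, valid emails), a cleaning pass might look like the following minimal sketch. The field names (`full_name`, `company`, `email`, `country`) and the `clean_leads` helper are assumptions made for this example; they are not part of the brief, and the email pattern only checks basic shape, not deliverability.

```python
import re

# Basic shape check only; it does not verify that the mailbox exists.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalise(record):
    """Trim whitespace and lower-case the email so duplicates compare equal."""
    cleaned = {key: (value or "").strip() for key, value in record.items()}
    cleaned["email"] = cleaned.get("email", "").lower()
    return cleaned


def clean_leads(records):
    """Return (kept, rejected); rejected holds (record, reason) pairs."""
    kept, rejected, seen = [], [], set()
    for raw in records:
        rec = normalise(raw)
        email = rec["email"]
        if rec.get("country", "").upper() != "UK":
            rejected.append((rec, "not UK-based"))
        elif not rec.get("full_name") or not rec.get("company"):
            rejected.append((rec, "missing name or company"))
        elif email and not EMAIL_RE.match(email):
            rejected.append((rec, "malformed email"))
        else:
            # Deduplicate on email when present, otherwise on name + company.
            key = email or (rec["full_name"].lower(), rec["company"].lower())
            if key in seen:
                rejected.append((rec, "duplicate"))
            else:
                seen.add(key)
                kept.append(rec)
    return kept, rejected
```

Keeping the rejected rows with a reason, rather than silently dropping them, makes it easier to show the client how accuracy was enforced.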
2 days ago · 41 proposals · Remote
Apollo.io - Data Extraction
I'm looking to work with someone who can extract the following data from Apollo.io:
Industry: Hospital & Health Care
Location: UK
Email Status: Verified
Job Titles: Purchasing, Procurement, Care Manager, Founder
This should show 4,074 contacts. I would like someone to send the following details via Excel:
Company
Contact Name
Job Title
Email Address
Phone Number
I'm looking to give the go-ahead ASAP. Please provide your best price and timeframe for completing this.
10 hours ago · 46 proposals · Remote
opportunity
AI database management
Seeking an experienced developer to build a bespoke AI-driven lead generation platform inspired by valuationleads.com. The system must aggregate and cleanse diverse data sources, deploy machine learning models to score and prioritize prospects, integrate with CRM systems, and present an intuitive dashboard for campaign management and analytics. Emphasis on robust data security, scalability, and automated outreach workflows. Deliverables include comprehensive documentation, testing, and deployment.
25 days ago · 41 proposals · Remote
Data engineer
We are seeking an experienced Data Engineer to help organize, clean, and structure complex real estate and regulatory compliance data across multiple sources. This role focuses on transforming inconsistent datasets related to leases, occupancy, tenants, and rent information into a reliable and scalable data foundation. The ideal candidate will review existing data, identify quality issues such as duplication and missing fields, and design standardized schemas and relationships. You will build transformation workflows to clean and normalize data from spreadsheets, databases, and system exports. In this role, you will create master datasets for properties, units, households, leases, and compliance tracking while implementing validation rules and exception reporting. You will also document data definitions, mapping logic, and business rules to support transparency and long-term maintainability, while collaborating with stakeholders to translate operational requirements into structured data models. Strong proficiency in SQL and Python is required, along with hands-on experience in ETL/ELT workflows and relational data modeling. Experience working with messy, Excel-heavy datasets and building data quality checks is essential, and familiarity with tools like dbt, Airflow, or cloud platforms such as Snowflake or BigQuery is highly preferred. Success in this role means delivering a clear, consistent source of truth for lease and occupancy data, reducing inconsistencies, and preparing the data environment for reporting, automation, and future product development.
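The validation rules and exception reporting mentioned above could be sketched roughly as follows. This is only an illustration: the lease fields (`lease_id`, `unit_id`, `tenant_name`, `start_date`, `end_date`, `monthly_rent`) and the three rules are assumptions chosen for the example, since the listing does not define a schema.

```python
from datetime import date

REQUIRED_FIELDS = ("lease_id", "unit_id", "tenant_name", "start_date", "monthly_rent")


def validate_lease(lease):
    """Return a list of human-readable rule violations for one lease record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if lease.get(field) in (None, ""):
            problems.append(f"missing {field}")
    start, end = lease.get("start_date"), lease.get("end_date")
    if start and end and end < start:
        problems.append("end_date before start_date")
    rent = lease.get("monthly_rent")
    if isinstance(rent, (int, float)) and rent <= 0:
        problems.append("monthly_rent must be positive")
    return problems


def exception_report(leases):
    """Map lease_id (or row index) to its violations, skipping clean rows."""
    report = {}
    for index, lease in enumerate(leases):
        problems = validate_lease(lease)
        if problems:
            report[lease.get("lease_id") or f"row {index}"] = problems
    return report


# Hypothetical sample rows, for illustration only.
sample = [
    {"lease_id": "L-1", "unit_id": "U-1", "tenant_name": "A. Tenant",
     "start_date": date(2024, 1, 1), "end_date": date(2024, 12, 31), "monthly_rent": 950},
    {"lease_id": "L-2", "unit_id": "U-2", "tenant_name": "",
     "start_date": date(2024, 6, 1), "end_date": date(2024, 5, 1), "monthly_rent": -10},
]
for lease_id, problems in exception_report(sample).items():
    print(lease_id, "->", "; ".join(problems))
```

In a real pipeline the same rules would more likely live in SQL or dbt tests; plain Python is used here only to keep the sketch self-contained.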
13 days ago · 15 proposals · Remote
Data scraping for influencers (automation)
Hello, I’m looking for data on a specific set of influencers, specifically influencers with 75K+ followers on Instagram or 50k subs on YouTube, with an audience aged over 40. The influencer will be speaking on topics like financial planning and many wellbeing categories. Please tell me how you would do this and how much you’ll charge for the data. Also, can we verify the engagement rate? Say per 1,000 emails, and verify the emails?

A Python script using the YouTube Data API v3 (free, official) and Phantombuster (Instagram) to automatically search, filter, and export influencer data to a Google Sheet or CSV.

YouTube (API, free and clean)
Use: YouTube Data API v3
Script logic:
1. Loop through keyword list
2. Search type=channel for each keyword
3. Pull channel details: name, URL, subscriber count, description, email (regex parse from description)
4. Filter: subscribers between 50,000 and 500,000
5. Export all results to CSV
Keywords to loop: retirement planning lifestyle retirement vlog over 60 grandparenting tips grandma empty nest life over 50 pickleball over 50 senior fitness over 60 menopause health women over 50 midlife women wellness caregiver aging parents RV retirement full time gardening over 50 vegetable men's health over 50 medicare senior health Christian women over 50 faith retirement lifestyle active aging senior
Output columns: Channel Name | URL | Subscribers | Email | Description snippet | Keyword searched

Instagram (Phantombuster)
Use: Phantombuster Instagram Hashtag Scraper + Profile Scraper (client will provide API key and session cookie)
Script/phantom logic:
1. Run Instagram Hashtag Post Scraper on each hashtag below
2. Pull 500 posts per hashtag → extract unique profile URLs
3. Feed profile URLs into Instagram Profile Scraper
4. Extract: username, follower count, bio text, email from bio, website URL
5. Filter: followers 50,000–500,000
6. Export to CSV
Hashtags to scrape: #retirementplanning #retirementlife #retireearly #grandparents #grandmalife #nanalife #grandpalife #emptynesters #lifeafterkids #over50life #pickleballlife #activeaging #seniorfitness #womenover50 #menopause #midlifewomen #caregiverlife #agingparents #seniorcare #rvlife #fulltimerv #retirementtravel #vegetablegarden #gardeningover50 #growyourown #menshealth #healthyaging #over50fitness #medicare #seniorliving #agingwell #christianwomen #faithoverfear #christianliving

Output Format
Single CSV, one row per creator: Platform | Handle/Channel | URL | Followers | Email | Bio/Description | Category | Source Keyword

Filtering Rules (build into script)
∙ Followers: 50,000–500,000 only
∙ Skip accounts with 0 posts or last post over 60 days
∙ Skip accounts where bio contains: “under 18”, “teen”, “student”, “college”
∙ Flag rows with no email (for Apollo enrichment pass)
∙ Deduplicate on URL before export

Deliverable
5 days ago · 18 proposals · Remote
Python script run and scrape data
You need to build the script and run it to produce the data. I don’t mind the process; the outcome is the data. Most likely using these methods: develop a Python automation to harvest influencer data from YouTube (Data API v3) and Instagram (Phantombuster) and export a deduplicated CSV. For YouTube: iterate provided keywords, search channels, capture channel name, URL, subscriber count, description snippet, parse emails, filter 50k–500k subscribers, and log keyword source. For Instagram: run hashtag scrapers, collect up to 500 posts per tag, extract unique profiles, scrape username, followers, bio, email, website, apply follower and activity/bio filters, flag missing emails, and produce a unified CSV with specified columns and filtering rules.
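The filtering stage described above (follower range, activity and bio filters, email parsing, missing-email flag, deduplication on URL) can be sketched independently of how the profiles are fetched. The sketch below makes no API calls; it assumes profiles have already been collected into dictionaries with hypothetical keys `url`, `followers`, `post_count`, `last_post` and `bio`, none of which come from the YouTube or Phantombuster responses themselves.

```python
import re
from datetime import date, timedelta

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Plain substring match, so "teen" would also hit words such as "thirteen".
BLOCKED_BIO_TERMS = ("under 18", "teen", "student", "college")


def extract_email(text):
    """Return the first email-like string in a bio or description, else ''."""
    match = EMAIL_RE.search(text or "")
    return match.group(0) if match else ""


def passes_filters(profile, today):
    """Apply the follower, activity and bio rules."""
    if not 50_000 <= profile["followers"] <= 500_000:
        return False
    if profile["post_count"] == 0:
        return False
    if today - profile["last_post"] > timedelta(days=60):
        return False
    bio = profile["bio"].lower()
    return not any(term in bio for term in BLOCKED_BIO_TERMS)


def build_rows(profiles, today):
    """Filter, deduplicate on URL, and flag rows that still need an email."""
    rows, seen_urls = [], set()
    for profile in profiles:
        # Only the trailing slash is normalised; channel IDs can be case-sensitive.
        url = profile["url"].rstrip("/")
        if url in seen_urls or not passes_filters(profile, today):
            continue
        seen_urls.add(url)
        email = extract_email(profile["bio"])
        rows.append({**profile, "email": email, "needs_enrichment": not email})
    return rows
```

The resulting rows could then be written out with `csv.DictWriter`; passing `today` in explicitly keeps the 60-day rule reproducible when the script is rerun.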
5 days ago · 24 proposals · Remote
Cybersecurity and Data Protection Website
Seeking a polished, professional website for a cybersecurity and data protection consultancy. Project requires clear information architecture, persuasive copy emphasizing services (risk assessments, compliance, incident response, encryption, training), modern responsive design, intuitive navigation, and strong trust signals (case studies, certifications, testimonials). Deliverables: homepage, services pages, contact form, blog template, SEO-friendly content, and basic security hardening. Experience in cybersecurity preferred, attention to confidentiality and accessibility mandatory.
7 days ago · 32 proposals · Remote
Android Users for Field Data Collection!
Seeking Android users across the UK to collect field data for retail connectivity and Wi‑Fi mapping. Tasks: capture 2–3 photos, record a short video, and run a 30‑second Wi‑Fi scan via the app. Flexible schedule, per‑task payment, and opportunity to complete multiple assignments. Requirements: Android smartphone, active mobile data, and ability to visit local shopping areas. Clear instructions and operation manual provided upon assignment. Reliable collectors who follow guidelines and deliver timely, accurate submissions are encouraged to bid.
3 days ago · 8 proposals · Remote
Data Engineer (Salesforce CRM)
We are looking for an experienced Data Engineer to help clean, standardize, and structure large volumes of real estate and regulatory compliance data. This role focuses on transforming messy, inconsistent datasets into a reliable and scalable data foundation that supports reporting, compliance tracking, and future automation. You will work closely with our business and technical teams to understand data logic, resolve inconsistencies, and design a clean, structured data environment.
Responsibilities
- Review and analyze existing datasets (leases, units, tenants, occupancy, rent, compliance)
- Identify inconsistencies, duplicates, and missing or conflicting records
- Standardize naming conventions, unique IDs, and relationships
- Clean and transform data from spreadsheets, CSVs, and system exports
- Build master datasets for properties, units, tenants, and leases
- Develop repeatable data-cleaning workflows or pipelines
- Implement data validation and quality checks
- Create documentation (data dictionary, mappings, business rules)
- Generate exception reports for data issues requiring review
Required Skills
- Strong experience with SQL and Python
- Experience with ETL / data transformation workflows
- Solid understanding of data modeling and relational databases
- Experience cleaning messy, multi-source datasets
- Strong knowledge of data validation and data quality practices
- Ability to work independently and communicate clearly
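To make "standardize naming conventions, unique IDs" and "identify duplicates" concrete, here is a minimal sketch. It assumes spreadsheet rows with hypothetical `property` and `unit` columns, and the normalisation rule (strip prefixes such as "Apt" or "Unit", drop punctuation, upper-case) is an invented example rather than the client's actual convention.

```python
import re
from collections import defaultdict

# Hypothetical prefixes seen in free-text unit labels.
UNIT_PREFIX_RE = re.compile(r"^(APT|APARTMENT|UNIT|FLAT|#)\s*\.?\s*")


def standard_key(property_code, unit_label):
    """Build a canonical 'PROPERTY:UNIT' key from free-text spreadsheet values."""
    prop = re.sub(r"[^A-Z0-9]", "", property_code.upper())
    unit = UNIT_PREFIX_RE.sub("", unit_label.strip().upper())
    unit = re.sub(r"[^A-Z0-9]", "", unit)
    return f"{prop}:{unit}"


def find_duplicates(rows):
    """Return {canonical key: [1-based row numbers]} for keys seen more than once."""
    groups = defaultdict(list)
    for row_number, row in enumerate(rows, start=1):
        groups[standard_key(row["property"], row["unit"])].append(row_number)
    return {key: numbers for key, numbers in groups.items() if len(numbers) > 1}
```

Reporting row numbers instead of deleting rows leaves the merge decision to someone who knows the business logic, which matches the listing's emphasis on exception reports requiring review.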
8 days ago · 22 proposals · Remote
Egocentric Content Specialist (POV Video & Interaction Data)
We are building next-gen AI datasets at Clairva and are looking for freelance Egocentric (POV) Data Capture Specialists to capture first-person (POV) video data of real-world tasks.
What You’ll Do
- Record POV videos using phone / GoPro / smart glasses
- Capture simple actions like:
+ Product usage (apply, wear, unbox)
+ Daily activities (pick, place, interact)
- Follow clear instructions for task-based recording
- Upload videos with basic tags (action, object, environment)
Use Cases
- Cooking from the cook’s POV
- Applying makeup from user POV
- Picking products in a warehouse
- Walking through a store aisle
- Product usage (fashion try-ons, beauty routines)
- Assembly / manufacturing tasks
Requirements
- Access to a smartphone / camera (hands-free setup preferred)
- Ability to follow structured instructions
- Good lighting + stable video capture
- Attention to detail
Nice to Have
- Experience with content creation (TikTok, Reels, YouTube)
- Familiarity with POV-style videos
Payment
- Paid per video / task (assessed for quality & consistency)
Send a quick intro + sample video (if available)
4 days ago · 15 proposals · Remote
Customer service analyst
I'm looking for an experienced Customer Service Analyst to analyse service data and help improve overall customer service quality.
What you'll do:
• Collect and interpret customer service data to identify recurring issues, preferences, and behavioural patterns
• Develop detailed reports and visual dashboards to track key performance indicators
• Translate complex data findings into clear, actionable insights for non-technical stakeholders
• Conduct root cause analysis to identify service gaps and recommend strategic improvements
• Collaborate with cross-functional teams to design and execute long-term service strategies
• Gather qualitative feedback to advocate for product enhancements or internal process updates
• Use data insights to identify training needs and assist in developing best practice guides
• Monitor and maintain internal systems to ensure data integrity and smooth workflows
If you're organised, detail-oriented, and experienced in customer service analysis, send me a proposal!
a day ago · 15 proposals · Remote
Marketplace Data Collection
We are looking for a freelancer who can quickly collect listings from several online marketplaces.
Task:
- Find 3,000 listings on the marketplaces listed below that meet our criteria.
- Add all listings to an Excel spreadsheet.
- You must be registered on the platforms to open contact details and include them in the Excel file.
Marketplaces:
- Osta.ee
- Skelbiu.lt
- eMAG
- Bazar
- Allegro
- OLX
Requirements:
- Attention to detail
- Ability to work quickly and accurately
- Registered accounts on the platforms to access contact details
Deadline: The work must be completed within 3 days.
Payment: Payment is negotiable.
Important: We need someone who can start working immediately.
21 days ago · 25 proposals · Remote
opportunity · urgent
Reformatting and cleaning data from an old CRM
I have several Excel spreadsheets containing excessive data that need cleaning, updating, and putting into a workable format.
17 days ago · 86 proposals · Remote · Expires in 12
Email addresses for UK Companies in specific industries
Good afternoon, I am looking for a freelancer who can help with sourcing data files of email addresses for actively trading companies in England that are involved in the wholesale meat supply trade. I'm not looking to buy a random data list; these companies must be actively trading and fall under one of the following SIC codes: 46320, 10110, 10120, 10130, 47220. If you don't know what SIC codes are, this job is not for you. I am a freelance professional myself, but this falls outside my area of expertise. If this is a job you think you can reliably deliver, there is great potential for a long-term relationship. Along with your proposal, please outline how you plan to deliver this project: are you running a script, scraping web pages, or buying a list? Also, please provide 10-20 sample records so I know that you understand what I am looking for. I haven't set a budget for this, but I am willing to negotiate with a reliable professional for clean data. Budget is a placeholder only.
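For a script-based approach, the SIC-code and status criteria above reduce to a simple filter once company records are in hand. The sketch below assumes records shaped as dictionaries with hypothetical keys `name`, `status`, `country` and `sic_codes`; real register exports will use different column names, and an "active" register status is only a proxy for "actively trading".

```python
TARGET_SIC_CODES = {"46320", "10110", "10120", "10130", "47220"}


def matches_brief(company):
    """True for active companies in England with at least one target SIC code."""
    if company["status"].strip().lower() != "active":
        return False
    if company["country"].strip().lower() != "england":
        return False
    # Codes may arrive bare ("46320") or with a trailing description,
    # so only the leading five characters are compared.
    codes = {code.strip()[:5] for code in company["sic_codes"]}
    return not codes.isdisjoint(TARGET_SIC_CODES)


# Invented sample records, for illustration only.
companies = [
    {"name": "Example Meats Ltd", "status": "Active", "country": "England",
     "sic_codes": ["46320 - Wholesale of meat and meat products"]},
    {"name": "Example Butchers Ltd", "status": "Dissolved", "country": "England",
     "sic_codes": ["47220"]},
    {"name": "Example Bakery Ltd", "status": "Active", "country": "England",
     "sic_codes": ["10710"]},
    {"name": "Example Highland Meats Ltd", "status": "Active", "country": "Scotland",
     "sic_codes": ["10110"]},
]
shortlist = [company["name"] for company in companies if matches_brief(company)]
print(shortlist)
```

Email addresses are not part of such a filter; they would have to be sourced and verified separately for the shortlisted companies.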
a day ago · 34 proposals · Remote
AI Telemetry Integration for Mesh Operations Dashboard
Hi, We’re working on a system that runs AI workloads across a distributed network of machines. Right now, we’re building an internal dashboard to monitor what’s going on in that network.
The dashboard shows things like:
- node status (online / degraded / offline)
- latency and throughput
- tasks processed per minute
- telemetry data from nodes
For this test, the goal is simple: we want to see how you handle integrating telemetry data into a working dashboard and making sure everything displays correctly. It’s less about perfect UI and more about how you structure things, handle data, and think through the flow.
7 days ago · 35 proposals · Remote
opportunity
UK Crypto Tax reconciliation & data analysis
Description: We are a UK-based accountancy firm specialising in Crypto tax, and we are looking for an experienced Crypto Tax Data Analyst to support ongoing client work. This role is focused heavily on data analysis rather than traditional accounting.
Scope of work includes:
- Reviewing wallet and exchange data (CSV/API exports)
- Line-by-line transaction analysis
- Identifying and categorising taxable events under UK (HMRC) rules
- Reconciling discrepancies across wallets, exchanges, and DeFi activity
- Cleaning and structuring datasets for tax reporting
- Supporting preparation of outputs for final tax review
Typical clients include:
- High-volume traders
- DeFi users (staking, liquidity pools, bridging, etc.)
- NFT traders
- Individuals with complex multi-wallet activity
Requirements:
- Strong understanding of UK Crypto tax treatment (HMRC guidance essential)
- Proven experience using tools such as Koinly, Recap, CoinTracking or similar
- Ability to handle large datasets accurately and efficiently
- Strong analytical mindset and attention to detail
- Experience identifying errors, duplicates, missing cost basis, and incorrect classifications
Nice to have:
- Experience working with UK accountancy firms
- Familiarity with DeFi protocols and on-chain activity
- Basic Excel / data manipulation skills
Engagement:
- Ongoing work available for the right candidate
- Initially project-based, with potential for long-term collaboration
Trial Task (Important):
Shortlisted candidates will be asked to complete a paid trial task. This will involve reviewing a sample dataset and:
- Identifying key issues (e.g. missing cost basis, incorrect classifications, duplicates)
- Providing a brief explanation of how you would resolve them
- Demonstrating your approach to structuring clean, usable data
This is a critical part of our selection process to ensure candidates can handle real-world Crypto data complexity.
To apply, please include:
- Examples of similar Crypto tax work you’ve completed
- Which software/tools you’ve used
- Your approach to handling messy or incomplete datasets
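Two of the issues named in the trial task, duplicates and missing cost basis, lend themselves to simple first-pass checks. The sketch below is illustrative only and does not implement any HMRC matching or pooling rule: it assumes a hypothetical transaction shape (`timestamp` as an ISO string, `wallet`, `asset`, `kind` of "buy" or "sell", `amount` as a `Decimal`) and treats a disposal as lacking cost basis when it exceeds the running balance of earlier acquisitions of that asset, pooled across wallets.

```python
from collections import Counter, defaultdict
from decimal import Decimal


def _identity(tx):
    """Fields that, taken together, should identify one real transaction."""
    return (tx["timestamp"], tx["wallet"], tx["asset"], tx["kind"], tx["amount"])


def find_duplicate_transactions(transactions):
    """Return every transaction whose identifying fields occur more than once."""
    counts = Counter(_identity(tx) for tx in transactions)
    return [tx for tx in transactions if counts[_identity(tx)] > 1]


def find_missing_cost_basis(transactions):
    """Flag disposals larger than the balance acquired earlier for that asset."""
    balance = defaultdict(Decimal)  # Decimal() is zero
    flagged = []
    # ISO 8601 timestamps of the same format sort chronologically as strings.
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        asset, amount = tx["asset"], tx["amount"]
        if tx["kind"] == "buy":
            balance[asset] += amount
        elif tx["kind"] == "sell":
            if amount > balance[asset]:
                flagged.append(tx)
            balance[asset] = max(balance[asset] - amount, Decimal(0))
    return flagged
```

Because duplicated buys inflate the running balance, the duplicate check would normally be run and resolved first. Transfers between wallets, fees, and DeFi events would each need their own handling and are deliberately left out.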
13 days ago · 6 proposals · Remote
HubSpot website template tidy-up and publish
Need someone to finalise an existing HubSpot website template and sort out links, data capture forms, etc. for a simple 2-page website.
3 days ago · 30 proposals · Remote
UK Business DATA Supplier -
I am looking for a business data supplier. Data will be for independent businesses: owner's name, business name, address, email, WhatsApp, post code. Please quote a price per 1,000, 10,000 & 100,000, plus turnaround time. If you can scrape any other information for direct marketing, please let us know, including LinkedIn & plastic card companies. Regards, Proactiv
23 days ago · 27 proposals · Remote
Translations, admin assistance
Spanish speaker required. The role will include responding to telephone calls in Spanish, admin assistance, data input, and more. Speaking Spanish is essential. Work hours will vary each week. Contact for more details.
3 days ago · 4 proposals · Remote