
Scraping Projects
Looking for freelance Scraping jobs and project work? PeoplePerHour has you covered.
Data scraping for influencers (automation)
Hello, I’m looking for data for a specific set of influencers. Specifically: influencers with 75K+ followers on Instagram or 50K subscribers on YouTube, with an audience over 40. The influencers will be speaking on topics like financial planning and many wellbeing categories. Please tell me how you would do this and how much you’ll charge for the data, say per 1,000 emails. Also, can we verify the engagement rate and verify the emails?

Proposed approach: a Python script using the YouTube Data API v3 (free, official) and Phantombuster (Instagram) to automatically search, filter, and export influencer data to a Google Sheet or CSV.

YouTube (API — Free & Clean)
Use: YouTube Data API v3. Script logic:
1. Loop through the keyword list
2. Search type=channel for each keyword
3. Pull channel details: name, URL, subscriber count, description, email (regex-parsed from the description)
4. Filter: subscribers between 50,000 and 500,000
5. Export all results to CSV

Keywords to loop: retirement planning lifestyle, retirement vlog over 60, grandparenting tips, grandma, empty nest life over 50, pickleball over 50, senior fitness over 60, menopause health, women over 50, midlife women wellness, caregiver aging parents, RV retirement full time, gardening over 50 vegetable, men's health over 50, medicare senior health, Christian women over 50, faith retirement lifestyle, active aging senior

Output columns: Channel Name | URL | Subscribers | Email | Description snippet | Keyword searched

Instagram (Phantombuster)
Use: Phantombuster Instagram Hashtag Scraper + Profile Scraper (client will provide API key and session cookie). Script/phantom logic:
1. Run the Instagram Hashtag Post Scraper on each hashtag below
2. Pull 500 posts per hashtag → extract unique profile URLs
3. Feed profile URLs into the Instagram Profile Scraper
4. Extract: username, follower count, bio text, email from bio, website URL
5. Filter: followers 50,000–500,000
6. Export to CSV

Hashtags to scrape: #retirementplanning #retirementlife #retireearly #grandparents #grandmalife #nanalife #grandpalife #emptynesters #lifeafterkids #over50life #pickleballlife #activeaging #seniorfitness #womenover50 #menopause #midlifewomen #caregiverlife #agingparents #seniorcare #rvlife #fulltimerv #retirementtravel #vegetablegarden #gardeningover50 #growyourown #menshealth #healthyaging #over50fitness #medicare #seniorliving #agingwell #christianwomen #faithoverfear #christianliving

Output format: a single CSV, one row per creator:
Platform | Handle/Channel | URL | Followers | Email | Bio/Description | Category | Source Keyword

Filtering rules (build into the script):
∙ Followers: 50,000–500,000 only
∙ Skip accounts with 0 posts or whose last post is over 60 days old
∙ Skip accounts where the bio contains: “under 18”, “teen”, “student”, “college”
∙ Flag rows with no email (for an Apollo enrichment pass)
∙ Deduplicate on URL before export

Deliverable
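A bidder's minimal sketch of steps 3–5 of the YouTube flow described above: the regex email parse, the 50,000–500,000 subscriber filter, and the deduplicated CSV export. The dict field names (`name`, `url`, `subscribers`, `description`) are assumptions standing in for whatever the YouTube Data API v3 `channels().list` response is mapped to; the actual API request is left out.

```python
import csv
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def extract_email(description: str) -> str:
    """Regex-parse the first email found in a channel description, or ''."""
    m = EMAIL_RE.search(description or "")
    return m.group(0) if m else ""

def passes_filter(subscribers: int, lo: int = 50_000, hi: int = 500_000) -> bool:
    """Keep only channels inside the requested subscriber band."""
    return lo <= subscribers <= hi

def export_rows(channels, keyword, path):
    """Dedupe on URL and write the requested output columns to CSV.

    `channels` is a list of dicts assumed to hold the fields pulled from
    the YouTube Data API v3 (field names are illustrative, not the raw
    API response shape)."""
    seen, rows = set(), []
    for ch in channels:
        if ch["url"] in seen or not passes_filter(ch["subscribers"]):
            continue
        seen.add(ch["url"])
        rows.append([ch["name"], ch["url"], ch["subscribers"],
                     extract_email(ch["description"]),
                     (ch["description"] or "")[:120], keyword])
    with open(path, "w", newline="", encoding="utf-8") as f:
        w = csv.writer(f)
        w.writerow(["Channel Name", "URL", "Subscribers", "Email",
                    "Description snippet", "Keyword searched"])
        w.writerows(rows)
    return rows
```

The same filter/dedupe pass would apply unchanged to the Phantombuster Instagram export, since both feeds end up as rows with a URL, a follower count, and a free-text bio/description.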
3 days ago · 17 proposals · Remote

Python script run and scrape data
You need to build the script and run the data. I don't mind the process, but the outcome is the data. Most likely using these methods: develop a Python automation to harvest influencer data from YouTube (Data API v3) and Instagram (Phantombuster) and export a deduplicated CSV. For YouTube: iterate the provided keywords, search channels, capture channel name, URL, subscriber count, and description snippet, parse emails, filter to 50k–500k subscribers, and log the keyword source. For Instagram: run hashtag scrapers, collect up to 500 posts per tag, extract unique profiles, scrape username, followers, bio, email, and website, apply the follower and activity/bio filters, flag missing emails, and produce a unified CSV with the specified columns and filtering rules.
3 days ago · 22 proposals · Remote

AI agent build
I need help building an AI agent that helps me with this:
- Scrape dental clinics (Google Maps, directories)
- Qualify leads (bad marketing, no video, weak socials)
- Personalise outreach (email + IG DMs at scale)
- Follow up automatically (so you don't lose leads)
- Track replies + book calls (calendar integration)
Thanks
a day ago · 33 proposals · Remote

UK Business Data Supplier
I am looking for a business data supplier. The data will be independent businesses: owner's name, business name, address, email, WhatsApp, postcode. Please quote your price per 1,000, 10,000 and 100,000 records, plus turnaround time. If you can scrape any other information for direct marketing, please let us know, including LinkedIn and plastic card companies. Regards, Proactiv
22 days ago · 27 proposals · Remote

Prospect Intelligence Analyst | Research Assistant
ROLE OVERVIEW
Our firm helps small service businesses in the US and UK identify and fix operational revenue leaks — the gaps that cause them to lose enquiries and bookings without realizing it. You sit at the front of our Prospect Machine. Each week you research small businesses, identify their primary revenue leak, score them, enrich decision-maker contacts, and populate our structured prospecting tracker. Your output feeds directly to our Cold Caller and Business Systems Consultants. This is not a data entry role. It requires pattern recognition, fast decision-making from limited information, and the discipline to work at a consistent pace to a fixed weekly deadline.

CORE RESPONSIBILITIES
• Source 150–200 raw businesses per week using scraping tools provided by the company
• Filter to 120–150 qualified SMBs
• Review each business's online digital presence — website, booking system, social pages, and online reputation — and conduct test calls outside office hours to assess missed-call risk and after-hours responsiveness. Identify the primary operational revenue leak based on what the evidence shows
• Find and verify the decision maker via enrichment tools provided by the company
• Score each lead, flag Priority Leads, and escalate them immediately
• Populate the B2B Prospecting Tracker
• Submit all deliverables via the agreed platform, on time

Onboarding Ramp
Week 1 — 50–60 leads, research and tracker only. Full SOP and training provided. Quality standards apply from day one.
Week 2 onwards — cold-call script prep for the top 20 Priority Leads added.
Week 3 onwards — industry community and directory identification added.
REQUIREMENTS
• Experience in B2B lead research, business intelligence, or structured data research
• Able to make fast, confident decisions from publicly available data
• Strong attention to detail — accurate entries matter more than perfect ones
• Comfortable following a structured SOP independently, without frequent check-ins
• Clear written and spoken English, with prompt communication

COMPENSATION & STRUCTURE
• $17.00/hr · 15 hrs/week · ~$255/week
• Performance review at 45 days — rate increase available for strong performers

HOW TO APPLY
Begin your application with the word SIGNAL — applications that don't will not be reviewed. Then answer these two questions:
1. Describe a research or data project where you worked to a consistent weekly output target. What tools did you use, and what was your weekly volume?
2. You're researching a dental practice. Their website has no online booking system — new patients are instructed to call during office hours only. There is no contact form and no alternative way to enquire outside of calling. In two to three sentences: identify the primary revenue leak, explain why it matters commercially, and give this lead a score out of 10 with a one-sentence justification.
Note: Question 2 has a clear correct answer. We are looking for specific, evidenced reasoning — not a general description of the problem.
11 days ago · 12 proposals · Remote
Past "Scraping" Projects
Web scraping
I need this source to be scraped: https://light-building.messefrankfurt.com/frankfurt/en/exhibitor-search.html?page=1&pagesize=90 I have attached a sample of what needs to be scraped. I need the output in Excel format. The budget is fixed.
Building database of owners by web scraping
I have been working with DeepSeek to extract data from the website tuscasasrurales.com. The data I need is shown in the attached file Granada.png. Some of this data requires a link to be clicked, as shown in the uploaded file Tuscasasrurales.png. Email addresses have to be obtained by visiting the property's website (if there is one). Where no data is available, leave the field blank. I have been trying to extract the data province by Spanish province. To get a list by province, enter the site and type the name of the province in the search box, e.g. Granada, which returns 304 entries. Although DeepSeek was unable to get all the info I wanted, it has given me a Python script which will do the job; I have uploaded this. It does not include the fields Bedrooms and Bathrooms, which I would also like included. Can you do this work, and how much would you charge? There are initially 10 provinces with an average of +/- 200 entries in each. Thanks - Allan
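A sketch of the per-listing parsing step a bidder might add to the client's existing script, covering the three gaps named in the brief: the email (left blank when none is published), Bedrooms, and Bathrooms. The HTML structure and Spanish field labels ("habitaciones", "baños") are assumptions about how tuscasasrurales.com listing pages look, not verified selectors; fetching each page is left to the existing script.

```python
import re
from html.parser import HTMLParser

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

class ListingParser(HTMLParser):
    """Collects visible text and mailto: links from one property page."""
    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.emails = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith("mailto:"):
                self.emails.append(href[len("mailto:"):])

    def handle_data(self, data):
        self.text_parts.append(data)

def extract_contact(html: str) -> dict:
    """Return email / bedrooms / bathrooms for one listing; '' when absent."""
    p = ListingParser()
    p.feed(html)
    text = " ".join(p.text_parts)
    email = p.emails[0] if p.emails else ""
    if not email:
        m = EMAIL_RE.search(text)
        email = m.group(0) if m else ""  # blank when no email is published
    beds = re.search(r"(\d+)\s*(?:habitaciones|dormitorios|bedrooms)", text, re.I)
    baths = re.search(r"(\d+)\s*(?:baños|bathrooms)", text, re.I)
    return {"email": email,
            "bedrooms": beds.group(1) if beds else "",
            "bathrooms": baths.group(1) if baths else ""}
```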
Scraping of Website database information
Require a straightforward web-scraping project to extract publicly available database records from a website. Gather complete contact details including names, email addresses, phone numbers and physical addresses. Deliver clean, deduplicated structured data (CSV or JSON) with clear field mapping. Ensure respectful, lawful scraping practices, include brief documentation of methods, sample code and any dependencies. Bid with estimated timeline and fixed price.
Data Scraping
I am seeking an adept data scraper to extract comprehensive information from a specified Shopify blog and compile it into a structured spreadsheet for seamless importation into a WordPress platform. The required data includes the original blog URL, publication date, URL link to the article's image, article title, and the full content of each article. The task involves approximately 174 articles, necessitating meticulous attention to detail and accuracy. Your expertise in data extraction and formatting will be invaluable for this project. The URL is: https://kingsnqueens.com/blogs/news. Thank you for your interest!
PDF Scraping
Hi, I need to scrape Item# & HT# from the first table, and "Item and Piece# together" and HT# from the 2nd table. I also need the Spool# given in the bottom right corner. A sample PDF is attached. There are around 600 pages. This is not a manual data entry job; I need coders to extract the data programmatically within a few hours.
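Once the page text is extracted programmatically (e.g. with a PDF library such as pdfplumber or PyMuPDF), the parsing step for this brief can be a small regex pass. The label spellings ("Item#", "HT#", "Spool#") and layout are assumptions taken from the brief; the patterns would be adjusted after inspecting the sample PDF's actual extracted text.

```python
import re

def parse_page(text: str) -> dict:
    """Pull Item#/HT# pairs and the Spool# from one page of extracted text.

    Assumes labels appear as 'Item# <value>', 'HT# <value>', 'Spool# <value>';
    tweak the patterns once the real extracted layout is known."""
    items = re.findall(r"Item#?\s*[:\-]?\s*(\S+)", text)
    hts = re.findall(r"HT#?\s*[:\-]?\s*(\S+)", text)
    spool = re.search(r"Spool#?\s*[:\-]?\s*(\S+)", text)
    return {"pairs": list(zip(items, hts)),
            "spool": spool.group(1) if spool else ""}
```

Run over all ~600 pages, this produces one record set per page, ready to dump to CSV.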
Weekly Scrape to CSV / Google Doc
We need a scraper built. The site to be scraped is https://www.jesuk.com/. We have a login that enables prices to be seen; this will be shared once the project is awarded.

The scraper needs to run daily and populate a database with a front end where we can see:
- SKU
- Product Title
- Price (highlighting price changes)
- New products added
- Products that no longer exist

It should show the last 7 days of prices for each product on the front end, but store data for the last 30 days. That full list should be downloadable to Excel/CSV. We want a daily email alert showing the number of price changes / new products / removed products. I have a CSV with the current SKUs and current prices, which we can use as a starting data point. Once up and running, this scraper will need to sit on one of our servers. Please ask any other questions you may have.
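The daily email alert described above boils down to diffing two snapshots of the catalogue. A minimal sketch, assuming each day's scrape is reduced to a `{SKU: price}` mapping (the storage layer and email sending are out of scope here):

```python
def diff_snapshots(yesterday: dict, today: dict) -> dict:
    """Compare two {SKU: price} snapshots and report exactly what the
    daily alert needs: price changes, new products, removed products."""
    changes = {sku: (yesterday[sku], price)
               for sku, price in today.items()
               if sku in yesterday and yesterday[sku] != price}
    new = sorted(set(today) - set(yesterday))
    removed = sorted(set(yesterday) - set(today))
    return {"changes": changes, "new": new, "removed": removed}
```

The client's starting CSV of current SKUs and prices supplies the first `yesterday` snapshot, so change detection works from day one.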
Data scraping
Data scraping for trades businesses (plumbers, electricians, etc.) in Preston, UK, within a 10-mile radius.
Python Web Scraping Expert Needed for UK Directory (42k Rows)
Description: We need a Python scraping expert to extract data from the public UK government directory below (Architects Registration Board): https://architects-register.org.uk/

The data: we need the full list of approximately 42,000 active architects. Fields required: Registration Number, Name, Firm Name, Website, Address, Phone, Email, and Profile URL.

The technical challenge: the site uses AJAX dynamic loading and session cookies. Standard visual scrapers (like ParseHub or WebHarvy) fail because the URL does not change on pagination. You will need a custom Python script (Selenium, Playwright, or Scrapy) to handle the AJAX pagination and extract the deep-click contact details.

Requirements: must be able to handle AJAX pagination, and must deliver the final, clean Excel file within 48 hours of being hired. If you can deliver this quickly and cleanly, we have ongoing high-volume scraping projects for our enterprise clients.
UK Business Data Scraping – Companies with 5–50 Employees
Description: I am looking for an experienced data scraping / lead generation specialist to build a high-quality list of UK businesses suitable for IT support services outreach. This data will be used for cold calling and email outreach by my internal sales team, so accuracy and relevance are extremely important.

Target businesses:
- Location: United Kingdom
- Company size: 5–50 employees
- Industries: professional services, construction, property, finance, recruitment, logistics, healthcare clinics, legal firms, SMEs, etc.
- Exclude: IT companies, MSPs, software companies, telecom providers, public sector.

Data fields required for each company:
- Company Name
- Contact Name (Owner / Director / IT Manager where possible)
- Job Title
- Phone Number (MAIN PRIORITY)
- Email Address (if available)
- Website
- Company Address / City
- Estimated Employee Size
- Industry / Sector
- LinkedIn Profile (company or contact, if available)

Volume required:
- Initial test: 500 leads
- Ongoing potential: 2,000–5,000 leads per month if quality is good

Data sources (examples — open to suggestions): LinkedIn, Google Maps, Companies House, business directories, industry directories, public databases.

Important requirements:
- Data must be GDPR compliant and sourced from publicly available information
- Phone numbers must be verified where possible
- No duplicate or low-quality records
- Prefer decision-maker contacts where available

Deliverable format: Excel or Google Sheets with clearly labelled columns.

Ideal freelancer:
- Proven experience with UK B2B lead generation
- Experience scraping LinkedIn and business directories
- Able to provide sample data before the full project begins
- Good communication and reliability for ongoing work

Please include in your proposal:
- Examples of similar projects you've completed
- Your scraping tools / methods (VERY IMPORTANT)
- Cost per lead or cost per 500 leads
- Timeframe for delivery
- A sample (10–20 leads) if possible

This may become a long-term ongoing project for the right person.
Must be willing to negotiate on price; I've just set this price to speak to top certs, but am expecting a better price. Thank you.
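Since phone numbers are the stated main priority, a cheap first-pass format check helps before any paid verification. A sketch of normalising scraped UK numbers to a single `+44` form; this is a plausibility check only, and the exact length rules are a simplification (real verification would need an HLR lookup or a test dial, as the brief implies):

```python
import re

def normalise_uk_phone(raw: str) -> str:
    """Normalise a scraped UK number to +44 form.

    Returns '' when the input doesn't look like a plausible UK number
    (simplified rule: leading 0 plus 9-10 further digits)."""
    digits = re.sub(r"[^\d+]", "", raw)      # strip spaces, dashes, brackets
    if digits.startswith("+44"):
        digits = "0" + digits[3:]
    elif digits.startswith("44") and len(digits) > 10:
        digits = "0" + digits[2:]
    if re.fullmatch(r"0\d{9,10}", digits):
        return "+44" + digits[1:]
    return ""
```

Normalising before deduplication also catches the same company listed with "020 7946 0958" in one directory and "+44 20 7946 0958" in another.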
Scrape competitor's website
I have a competitor, "enchantedfairyportraits", whose website I would like to scrape and utilise on my website, "fairyportriatstudio". They have automation with emails etc.
1,000,000 UK business records scraped within 2 weeks 2
Hi there,

We need a database scraped of UK limited company names, their UK mobile number beginning 07, their email address, their website address, the Facebook business page URL, and a link to where the details were obtained. The email addresses need to be generic, such as info@businessname or hello@businessname. We also need the category of industry each record relates to, such as accountant, estate agent, or clothes shop.

We require 1,000,000 records within 2 weeks. If you are unable to provide 1,000,000 in 2 weeks then please do not bid for the project. Please also do not bid on the project unless you have read the description and understand it completely. We have been contacted by numerous freelancers who state they can do it and then provide data which is nowhere near what we asked for, so please, no time wasters.

Thanks
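The "generic email only" rule above is easy to enforce in code. A sketch of a filter a bidder might build in; the brief names only info@ and hello@, so the wider list of local parts here is an assumption and easy to trim:

```python
import re

# info@ and hello@ are from the brief; the rest are assumed extras.
GENERIC_LOCALS = {"info", "hello", "contact", "enquiries", "admin",
                  "sales", "office"}

def is_generic_email(email: str) -> bool:
    """True only for well-formed addresses with a generic local part,
    e.g. info@businessname or hello@businessname."""
    m = re.fullmatch(r"([\w.+-]+)@[\w-]+(\.[\w-]+)+", email.strip().lower())
    return bool(m) and m.group(1) in GENERIC_LOCALS
```

Applied as a row filter before export, this drops both malformed addresses and personal ones like john.smith@businessname in a single pass.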
AI-Powered Price Scraper & Monitoring System (Multi-Website)
We are looking for an experienced developer to build a scalable AI-powered price scraping and monitoring system. The system should automatically extract product pricing data from multiple e-commerce websites and store it in a structured database for monitoring and analysis. The system must support multi-tenant architecture, role-based permissions, subscription tiers, and Stripe payment integration. The goal is to allow different companies to monitor product prices across multiple websites, with usage limits based on subscription plans.

Project Scope

1. Target Websites
• Scrape product prices from 7–10 e-commerce websites
• Support dynamic content (JavaScript-rendered pages)
• Proxy rotation & anti-bot handling
• Scheduled scraping
• Historical price tracking
• Price change alerts (email or webhook)
• Handle pagination and product variations

2. Data Extraction
• Product name
• Current price
• Original price (if available)
• SKU / Product ID
• Availability status
• Timestamp

3. Multi-Tenant Architecture
3.1 Super Admin role
• Manage all companies
• Manage subscription plans
• View system-wide usage
• Suspend / activate companies
3.2 Company Admin role
• Manage company users
• Set scraping targets (websites & products)
• View company usage stats
3.3 Company Users
• View price tracking dashboard
• Access only assigned websites/products
3.4 Subscription & Usage Limits
The system must support different plan levels. Each plan should control:
• Maximum number of websites
• Maximum number of products
• Scraping frequency (e.g., 1h / 3h / 6h / 24h)
• Maximum concurrent scraping jobs
• Historical data retention length
Stripe integration:
• Stripe subscription integration
• Monthly / yearly billing (7-day free trial)
• Webhook handling for subscription status updates
• Automatic feature unlock based on plan
• Auto-suspend account if payment fails
• Admin ability to manually upgrade/downgrade plans

4. AI-Assisted Selector Detection
• Use AI or intelligent selector logic to detect price elements
• System should adapt if minor DOM changes occur
• Minimise manual reconfiguration

5. Infrastructure
• Proxy rotation support
• Anti-bot handling
• Headless browser support (e.g., Puppeteer / Playwright)
• Scalable deployment (Docker preferred)

6. Database & Storage
• Store data in MySQL
• Historical price tracking
• Ability to compare price changes

7. Monitoring & Automation
• Scheduled scraping (e.g., every 1–6 hours)
• Email or webhook alerts when prices change
• Logging and error reporting

8. Dashboard
• Admin and user dashboards
• Search by product
• View historical price charts

Technical Requirements (preferred stack):
• Laravel
• Playwright / Puppeteer / Scrapy
• REST API architecture
• Docker deployment

Deliverables
• Fully working scraping system
• Deployment guide
• Source code
• Documentation
• 2 weeks of post-delivery support

Bonus: experience with anti-bot bypass, rotating residential proxies, and large-scale scraping is highly preferred. If interested, please include your portfolio and examples of similar scraping projects.
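The "AI-assisted selector detection" requirement can start with a heuristic before any LLM is involved: scan the rendered page for every element whose text looks like a price, score candidates by attribute hints, and pick the best. A minimal stdlib sketch of that idea; the class-name hints ("price", "old", "was") and the scoring weights are assumptions, and a production version would work on the headless-browser DOM rather than raw HTML:

```python
import re
from html.parser import HTMLParser

PRICE_RE = re.compile(r"[$£€]\s?\d[\d,]*(?:\.\d{2})?")

class PriceFinder(HTMLParser):
    """Collects every element whose text looks like a price, with a crude
    score, so the scraper can survive minor DOM changes instead of
    depending on a single hard-coded selector."""
    def __init__(self):
        super().__init__()
        self.stack = []        # (tag, class) of open elements
        self.candidates = []   # (score, tag, class, price_text)

    def handle_starttag(self, tag, attrs):
        self.stack.append((tag, dict(attrs).get("class") or ""))

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        m = PRICE_RE.search(data)
        if not m or not self.stack:
            return
        tag, cls = self.stack[-1]
        score = 1
        if "price" in cls.lower():
            score += 2   # class name mentions "price": strong hint
        if "old" in cls.lower() or "was" in cls.lower():
            score -= 1   # likely the struck-through original price
        self.candidates.append((score, tag, cls, m.group(0)))

def detect_price(html: str) -> str:
    """Return the best-scoring price string found in the page, or ''."""
    f = PriceFinder()
    f.feed(html)
    return max(f.candidates)[3] if f.candidates else ""
```

When the heuristic's confidence is low (no candidate, or a tie), that is the point where an AI call could be invoked to re-identify the selector, then the result cached to minimise reconfiguration.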
1,000,000 UK business records scraped within 2 weeks
Hi there,

We need a database scraped of UK limited company names, their UK mobile number beginning 07, their email address, their website address, the Facebook business page URL, and a link to where the details were obtained. The email addresses need to be generic, such as info@businessname or hello@businessname. We also need the category of industry each record relates to, such as accountant, estate agent, or clothes shop.

We require 1,000,000 records within 2 weeks. If you are unable to provide 1,000,000 in 2 weeks then please do not bid for the project.

Thanks
Python Data Pipeline — Web Scraping, Multi-Platform, languages
I'm building a children's activity discovery platform for Switzerland (think "Google for kids' activities"). I need a recurring data pipeline that scrapes class schedules from ~500 providers across multiple booking platforms in Geneva, expanding to all of Switzerland within 18 months.

What needs to be scraped:
- Cogito-Sport (Angular/JavaScript portal) — 8-12 swimming clubs
- iClassPro (JavaScript portal) — 3-5 providers
- loisirsjeunes.ch (static HTML, paginated, ~200 activities via sequential IDs)
- Ville de Genève sports index (static HTML)
- PDF timetables (22 community centres)
- Individual club websites (mixed HTML)

What to extract from each source: activity name, provider, day of week, time, age range, price, address, registration URL.

Deliverables:
- Working scrapers for all source types
- Config file to add new providers without code changes
- Normalisation layer mapping all sources to a unified schema
- Change detection and a summary email
- Deployment on Railway with a cron schedule
- Clean CSV output + optional Airtable API push
- README written for non-developers

In your proposal, briefly describe how you have handled JavaScript-rendered Angular or React pages in a previous project: what tools did you use, and how did you handle DOM waiting?
I have a full technical spec document available on request. P.S. Please note: due to payment processing limitations I am unable to work with freelancers based in Russia or Belarus.
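The "config file to add new providers without code changes" plus the normalisation layer asked for above can be sketched as a per-provider field map feeding one unified schema. Everything here is illustrative: the French source field names ("titre", "prix", "jour") and the provider entry are assumptions, and in practice the `PROVIDERS` dict would be loaded from a YAML/JSON config file rather than hard-coded:

```python
# One entry per provider; adding a source means adding config, not code.
PROVIDERS = {
    "loisirsjeunes": {
        "field_map": {"titre": "activity_name", "prix": "price",
                      "jour": "day_of_week", "ages": "age_range"},
        "defaults": {"provider": "loisirsjeunes.ch"},
    },
}

# The unified schema every source is normalised into.
UNIFIED_FIELDS = ["activity_name", "provider", "day_of_week", "time",
                  "age_range", "price", "address", "registration_url"]

def normalise(record: dict, provider_key: str) -> dict:
    """Map one raw scraped record onto the unified schema, filling
    blanks for fields the source doesn't supply."""
    cfg = PROVIDERS[provider_key]
    out = {f: "" for f in UNIFIED_FIELDS}
    out.update(cfg.get("defaults", {}))
    for src, dst in cfg["field_map"].items():
        if src in record:
            out[dst] = record[src]
    return out
```

Because every scraper's output passes through `normalise`, the change-detection and CSV/Airtable steps only ever see one schema, regardless of whether a record came from an Angular portal, static HTML, or a PDF timetable.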
Web Scraping Required
I require a structured data extraction project from the following directory: https://www.buildington.co.uk/companies

The objective is to extract and structure company data into a clean Excel spreadsheet. Required fields:
• Company Name
• Contact Name (if available)
• Telephone Number
• Email Address
• Website URL

Important:
• Some data is available directly on the directory page.
• In certain cases, the freelancer may need to visit the company’s website to retrieve missing contact details.
• Data must be structured, cleaned and deduplicated.
• Output format: Excel (.xlsx) with clearly labelled columns.
• Please confirm your approach and tools before starting.

This is not a one-off copy and paste task. I am looking for someone who can create a reliable and efficient extraction method.