
Scrape webpages, no crawling required
£86 (approx. $116)
- Posted:
- Proposals: 9
- Remote
- #3158616
- Awarded
Description
Experience Level: Entry
I need to scrape 3324 webpages. I have a list of the URLs to scrape, so no crawling is required. Every webpage has the same format, so only one scraper is needed. The URLs are provided in a CSV file.
URLs to scrape: https://drive.google.com/file/d/1narrrwVo5GFCyG9wsKfkjPj7AcyGe6MX/view?usp=sharing
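As a rough illustration of the input step, the URL list could be loaded like this. This is only a sketch: the layout of the CSV is not described here, so it assumes the URL sits in the first column, and the function name `load_urls` is made up for the example.

```python
import csv
from typing import Iterable, List


def load_urls(lines: Iterable[str]) -> List[str]:
    """Collect URLs from CSV lines.

    Assumption: the URL is in the first column. Rows whose first cell
    does not look like a URL (for example a header row) are skipped.
    """
    urls = []
    for row in csv.reader(lines):
        if not row:
            continue  # blank line
        cell = row[0].strip()
        if cell.startswith(("http://", "https://")):
            urls.append(cell)
    return urls
```

An open file object can be passed directly, for example `load_urls(open("urls.csv", newline="", encoding="utf-8"))`, where `urls.csv` is a placeholder file name.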
For each URL I require this data in JSON format (example data provided):
url: https://www.xero.com/uk/advisors/accountant/armstrong-watson-156b74095297/
name: Armstrong Watson
headoffice: "Victoria Place, Fairview House, Carlisle, England"
website: https://www.armstrongwatson.co.uk/xero-cloud-accounting
*rootdomain: armstrongwatson.co.uk
*homepage: https://www.armstrongwatson.co.uk
partnerstatus: Platinum champion partner
partnersince: 2013
facebook: https://en-gb.facebook.com/armstrongwatson/
twitter: https://www.twitter.com/armstrongwatson
linkedin: https://www.linkedin.com/company/armstrong-watson/
offices: [{"name": "carlisle", "address":"Victoria Place, Fairview House, Carlisle, CA1 1EX, England", "phone":"+44 01228 690100"},{...}]
officecount: 10
**logo: "armstrongwatson.png"
* Field calculated from the website URL.
** Download each logo and put all logos in one folder. To name the logo file, take the name field, lowercase it, remove all non-alphanumeric characters, and add the .png extension.
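The calculated fields could be derived along these lines. This is a sketch under assumptions, not a required implementation: the function names are invented for the example, the root domain is obtained by simply stripping a leading `www.` (a full solution for arbitrary subdomains would need a public suffix list), and "non-alphanumeric" is read as anything outside `a-z` and `0-9`.

```python
import re
from urllib.parse import urlsplit


def derive_homepage(website: str) -> str:
    """Scheme plus host of the website URL, without any path."""
    parts = urlsplit(website)
    return f"{parts.scheme}://{parts.netloc}"


def derive_rootdomain(website: str) -> str:
    """Host of the website URL with a leading 'www.' removed.

    Assumption: this simple rule is enough for the listed websites.
    Other subdomains (e.g. 'blog.') are left in place.
    """
    host = urlsplit(website).hostname or ""
    return host[4:] if host.startswith("www.") else host


def logo_filename(name: str) -> str:
    """Lowercase the name, drop everything except a-z and 0-9, add .png."""
    return re.sub(r"[^a-z0-9]", "", name.lower()) + ".png"
```

With the example data, the website `https://www.armstrongwatson.co.uk/xero-cloud-accounting` gives the homepage `https://www.armstrongwatson.co.uk` and the root domain `armstrongwatson.co.uk`, and the name `Armstrong Watson` gives `armstrongwatson.png`.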
Save the data in JSON format. Please validate the JSON before completing the job. Logo files should be saved in one folder and provided in a zip archive file.
Please do not create empty fields. If a field does not exist, omit it from the JSON.
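One possible way to meet the "no empty fields" and "validate the JSON" requirements is sketched below. The helper names `drop_empty` and `to_valid_json` are hypothetical, and "empty" is assumed to mean `None`, an empty string, an empty list, or an empty object; numeric zero is kept.

```python
import json

EMPTY_VALUES = (None, "", [], {})


def drop_empty(record: dict) -> dict:
    """Return a copy of the record without empty fields (recursively)."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, dict):
            value = drop_empty(value)
        elif isinstance(value, list):
            # Clean nested objects (e.g. offices), then drop empty items.
            value = [drop_empty(v) if isinstance(v, dict) else v for v in value]
            value = [v for v in value if v not in EMPTY_VALUES]
        if value in EMPTY_VALUES:
            continue  # omit the field entirely
        cleaned[key] = value
    return cleaned


def to_valid_json(records: list) -> str:
    """Serialise the cleaned records and check the text parses back."""
    text = json.dumps([drop_empty(r) for r in records], indent=2, ensure_ascii=False)
    json.loads(text)  # raises ValueError if the output is not valid JSON
    return text
```

For instance, a record with `"twitter": ""` and an office whose `"phone"` is empty would be written without the `twitter` key and without that `phone` key.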
Stuart K.
- 100% (6)
- Projects Completed: 8
- Freelancers worked with: 5
- Projects awarded: 88%
- Last project: 23 Aug 2022
- United Kingdom