I need a Gumtree scraper which scans and stores related info of new posts
- Budget: £60 (approx. $76)
- Posted:
- Proposals: 6
- Remote
- #1008661
- Expired
Description
Experience Level: Intermediate
General information for the website: I run a direct marketing tool for property managers and agents
Num. of web pages/modules: 1
Description of requirements/features: I run a business app which collects property ads posted on public websites and feeds them into a database, where they can later be presented to users of the system to aid direct marketing to them. My system is built in PHP and uses MySQL to store data.
I need a single PHP script which, whenever called, connects via cURL to gumtree.com and requests the listings of private ads for property to rent within a given search location (provided as a GET argument), i.e.:
https://www.gumtree.com/search?search_category=flats-and-houses-for-rent&q=&search_location=Walsall
The GET parameters above can be added as variables at the top of the script. I can later modify this to accept input from other scripts or add additional GET arguments. So to illustrate, something such as this is acceptable:
$website_url = 'https://www.gumtree.com/search?';
$category = 'flats-and-houses-for-rent';
$search_loc = 'Walsall';
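As a rough illustration of the setup described above, the sketch below builds the search URL from the three variables and fetches a page with cURL. The function names `build_search_url` and `fetch_html`, the timeout, and the user-agent string are assumptions made for this example; they are not part of the brief.

```php
<?php
// Search parameters kept at the top of the script so they are easy to change.
$website_url = 'https://www.gumtree.com/search?';
$category    = 'flats-and-houses-for-rent';
$search_loc  = 'Walsall';

// Build the full search URL. http_build_query() URL-encodes each value
// and keeps the empty "q" parameter as "q=".
function build_search_url(string $base, string $category, string $location): string
{
    return $base . http_build_query([
        'search_category' => $category,
        'q'               => '',
        'search_location' => $location,
    ]);
}

// Fetch a URL with cURL. Returns the response body, or null on a
// transport error or any HTTP status other than 200.
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 30,    // assumed timeout in seconds
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleScraper/1.0)', // assumed value
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return ($body !== false && $status === 200) ? $body : null;
}

$search_url = build_search_url($website_url, $category, $search_loc);
```

Calling `fetch_html($search_url)` would then return the listings page HTML, or null if the request failed. Whether the site serves the same markup to a script as to a browser is not something this sketch can guarantee.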
This initial cURL request will return a listings page. The script should then iterate through every ad link in the returned HTML (and on the following two pages, in order of pagination) and collect metadata from each ad, including:
The unique ID of this post if any
The poster's name
The poster's telephone if provided, if not the email address
The title of the ad
The rental price for the property
The seller type
The date the ad was posted
The property type
The number of bedrooms
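The step of walking the listings pages and gathering ad links could be sketched as follows, using PHP's bundled DOM extension so that no external library is needed. Two things here are assumptions rather than facts from the brief: that ad links have an `href` beginning with `/p/`, and that pagination is driven by a `page` GET parameter. Both would need checking against the live markup. The function names are hypothetical.

```php
<?php
// Collect the absolute ad URLs found on one listings page.
// ASSUMPTION: ad links are <a> elements whose href starts with "/p/".
function extract_ad_links(string $html, string $origin = 'https://www.gumtree.com'): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[starts-with(@href, "/p/")]') as $a) {
        // Using the URL as the array key removes duplicate links.
        $links[$origin . $a->getAttribute('href')] = true;
    }

    return array_keys($links);
}

// URLs for the first listings page plus the following pages.
// ASSUMPTION: pagination uses a "page" GET parameter starting at 2.
function listing_page_urls(string $search_url, int $extra_pages = 2): array
{
    $urls = [$search_url];
    for ($page = 2; $page <= 1 + $extra_pages; $page++) {
        $urls[] = $search_url . '&page=' . $page;
    }

    return $urls;
}
```

Each URL returned by `listing_page_urls()` would be fetched in turn, and each link returned by `extract_ad_links()` fetched to obtain the individual ad page.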
The script has to be able to parse the HTML that gets returned from each ad page and extract the above information as an array of PHP variables.
I will process this array separately for input into the database. You will not need to do any cleansing or validation on the values, just extract the TEXT NODES and strip any and all HTML from the above metadata before assigning them to variables. If blank text nodes are encountered, that is also fine. I will process these later.
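A minimal sketch of the text-node extraction described above, assuming the fields can each be located with an XPath query. The function names `first_text` and `parse_ad` are hypothetical, and the queries are passed in by the caller because the real selectors depend on Gumtree's current markup, which is not given in the brief. `textContent` returns only the text inside a node, so no HTML survives, and a missing node simply yields an empty string.

```php
<?php
// Text of the first node matching $query, with all markup removed.
// Returns '' when nothing matches, so blank values pass through untouched.
function first_text(DOMXPath $xpath, string $query): string
{
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }

    return $nodes->item(0)->textContent;
}

// Parse one ad page into an associative array of field => text.
// $queries maps a field name to an XPath expression (caller-supplied).
function parse_ad(string $html, array $queries): array
{
    $listing = array_fill_keys(array_keys($queries), '');
    if ($html === '') {
        return $listing; // nothing to parse; every field stays blank
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common way to make libxml read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($queries as $field => $query) {
        $listing[$field] = first_text($xpath, $query);
    }

    return $listing;
}
```

For example, `parse_ad($ad_html, ['title' => '//h1'])` would return an array with a single `title` key; the `//h1` query is only a placeholder for whatever element actually holds the title.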
There is no front-facing interface, no output that needs to be presented to the screen and ideally no libraries should be linked to if the entire codebase can be added to a single PHP file.
Extra notes: The project is for a PHP HTML scraper and requires metadata to be pulled from each returned page, forming a large array composed of individual arrays, each holding one ad's metadata, i.e.:
LISTINGS ARRAY = [
LISTING = [ uid = ... , poster_name = ... , poster_tel = ... , title = ... , rental_price = ... , seller_type = ... , date_posted = ... , etc ]
LISTING = [ uid = ... , poster_name = ... , poster_tel = ... , title = ... , rental_price = ... , seller_type = ... , date_posted = ... , etc ]
LISTING = [ uid = ... , poster_name = ... , poster_tel = ... , title = ... , rental_price = ... , seller_type = ... , date_posted = ... , etc ]
]
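In PHP terms, the outline above might translate to a numerically indexed array of associative arrays. The sketch below is one possible shape, not a required one: the helper `make_listing` is hypothetical, and the keys `date_posted`, `property_type` and `bedrooms` are assumed names for the posting date and for the fields the outline leaves as "etc". The sample values are placeholders.

```php
<?php
// Keys for one listing. The last three names are assumptions.
const LISTING_FIELDS = [
    'uid', 'poster_name', 'poster_tel', 'title', 'rental_price',
    'seller_type', 'date_posted', 'property_type', 'bedrooms',
];

// Build one listing with every expected key present.
// Missing values stay as empty strings; unknown keys are dropped.
function make_listing(array $values): array
{
    $blank = array_fill_keys(LISTING_FIELDS, '');

    return array_merge($blank, array_intersect_key($values, $blank));
}

// The outer array simply collects one inner array per ad.
$listings   = [];
$listings[] = make_listing(['uid' => '123', 'title' => 'Example flat']);
$listings[] = make_listing(['uid' => '456', 'poster_tel' => '']);
```

Giving every listing the same set of keys means the later database step can rely on each key existing, even when the scraped value is blank.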
Extra notes: I take great pride in the software I have created and have been involved in all aspects of its development until now; however, I no longer have the time to sit and code everything myself.
This project best serves a developer who has created a PHP scraper in the past and I do not mind if code is reused extensively or integrated from sources where the developer has the license or right to do so.
Joseph R.
0% (0)
- Projects completed: -
- Freelancers worked with: -
- Projects awarded: 0%
- Last project: 15 Dec 2024
- United Kingdom