Python application to scrape supplier websites and process XLS data
- Posted:
- Proposals: 9
- Remote
- #1977626
- Awarded
Python | Web Scraping | Software development | API Development | Data Mining | Automation
Rajkot
Description
Experience Level: Entry
Estimated project duration: less than 1 week
Python application to scrape product data from supplier websites and process it into XLS files (and, optionally, upload it to the MySQL database of my Prestashop store).
I need an application written in Python to extract data from my suppliers' websites.
In principle, all the data would be extracted into a single Excel file (.xls) with a column layout that I define.
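To illustrate what a fixed column layout could look like in code, here is a minimal sketch. The column keys, the sample product, and the helper names are assumptions made for illustration; the real layout is the one shown in DATA.jpg. The standard-library csv module is used only as a stand-in, since writing a real .xls file would need a third-party library.

```python
import csv
import io

# Hypothetical column keys, in the order given in the product list of this
# description. The definitive layout is in DATA.jpg.
COLUMNS = [
    "category", "subcategory", "product_name", "reference",
    "part_number_or_ean", "stock", "description",
    "technical_characteristics", "image_links",
]


def rows_in_layout(products):
    """Yield the header row, then one row per product in the fixed column order."""
    yield COLUMNS
    for product in products:
        row = []
        for column in COLUMNS:
            value = product.get(column, "")  # missing fields become empty cells
            if isinstance(value, (list, tuple)):
                value = " | ".join(value)  # e.g. several image links in one cell
            row.append(value)
        yield row


def write_table(products, stream):
    """Write the laid-out rows to a text stream as CSV (stand-in for .xls)."""
    csv.writer(stream).writerows(rows_in_layout(products))


# Made-up sample product, purely for demonstration.
sample = [{
    "category": "Printers",
    "subcategory": "Laser",
    "product_name": "Example printer",
    "reference": "REF-001",
    "part_number_or_ean": "0000000000000",
    "stock": 5,
    "image_links": ["http://example.com/a.jpg", "http://example.com/b.jpg"],
}]

buffer = io.StringIO()
write_table(sample, buffer)
```

Keeping the column order in a single list means the layout can be changed in one place without touching the scraping code.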
It is important to know that some of my suppliers have Flash-based websites. I have several suppliers, but I would prefer to start by seeing results for the one I describe here, whose site is built in Flash.
If you confirm that you can extract data from this type of website, I will give you temporary access credentials (valid for 15 days) so that you can send me proof.
For each product, I need the following:
1. The .xls file must contain: category, subcategory (all levels in each case), product name, reference, part number or EAN, stock, description, technical characteristics, and links to all the images (the largest versions), as shown in the file DATA.jpg and in that same order.
2. In addition to appearing in column H (see DATA.jpg), the Description / Characteristics content must be saved to a separate HTML file per product (keeping the same format as the extracted data), named after the product's reference. The description from column H must be inserted in the body, that is, between the opening and closing TBODY tags, of the template shown in the attached file CLEAN.jpg, and the result saved under the corresponding product reference.
3. When the information comes from several suppliers, the program should process and merge it as if it came from a single supplier, adding a marker to each product (a separate column, an acronym in parentheses at the end of the PN, or similar) to identify which supplier the product belongs to. Provide at least two comparison methods so that, once the products are grouped (for example by EAN or by PN), I can compare prices and keep, for example, only the cheapest ones.
4. (Optional) Once the information has been processed, let me know whether you can connect to the SQL database of my store so that the products are updated periodically. My store runs on Prestashop v.1.6.
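As a rough illustration of point 3, the sketch below tags each offer with a supplier code, merges the lists, and keeps the cheapest offer per EAN or per part number. The field names (`ean`, `part_number`, `price`, `supplier`), the supplier codes, and the sample data are all hypothetical; this is one possible approach under those assumptions, not a required design.

```python
from decimal import Decimal


def tag_supplier(offers, supplier_code):
    """Return copies of the offers with a 'supplier' field added."""
    return [dict(offer, supplier=supplier_code) for offer in offers]


def cheapest_by(offers, key):
    """Group offers on `key` and keep the lowest-priced offer in each group."""
    best = {}
    for offer in offers:
        group = offer.get(key)
        if not group:
            continue  # offers without the grouping value cannot be compared
        current = best.get(group)
        if current is None or offer["price"] < current["price"]:
            best[group] = offer
    return list(best.values())


# Made-up offers from two hypothetical suppliers.
supplier_a = tag_supplier(
    [
        {"ean": "111", "part_number": "PN-1", "price": Decimal("10.50")},
        {"ean": "222", "part_number": "PN-2", "price": Decimal("7.00")},
    ],
    "SUPA",
)
supplier_b = tag_supplier(
    [{"ean": "111", "part_number": "PN-1", "price": Decimal("9.90")}],
    "SUPB",
)

merged = supplier_a + supplier_b
by_ean = cheapest_by(merged, "ean")          # comparison method 1
by_pn = cheapest_by(merged, "part_number")   # comparison method 2
```

Using the same function with a different key gives the two comparison methods; offers that lack the chosen key are skipped here, although they could equally be kept in a separate list for manual review.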
Above all, please tell me whether you can extract data from the Flash-based website, which is the part that worries me most. If you can, give me a fixed price for building the application for this supplier, and tell me what it would cost, once I have seen it working with this supplier, to adapt it to extract from at least 5 suppliers, which I would specify in due course.
Thanks and regards,
SANTIAGO G.
80% (1)
Projects Completed: -
Freelancers worked with: -
Projects awarded: 100%
Last project: 26 Apr 2024
Spain