API crawling engine in multi-threaded Python or similar
£500 (approx. $622)
- Proposals: 1
- Remote
- #86982
- Expired
Description
Experience Level: Expert
I need a solution that will pull data from 2 APIs and store it in a local MySQL database.
The really important factor is efficiency: I have a lot of accounts to retrieve data from, some of the data may be required per minute or per hour, and some of it may be very large.
The solution will essentially run continuously through a list of accounts and data requests which I have provided, at weekly, daily, hourly or per-minute intervals (depending on the request).
I will provide:
A list of queries to run through, held in MySQL table(s). Your script should be constantly working through this list. Each query in the list will be specified to run either per week, day, hour or minute.
The format of the MySQL tables that you return the data to.
Key features:
It should be modular. Initially it will use Google Analytics and Omniture APIs but should be expandable to accommodate other APIs.
You don't need to worry about the format of the API requests to send, as I already have them working in PHP on a single-request basis. You can use this code or start from scratch. What I need from you is a system that will automate the retrieval of data, maximise how much can be obtained (through multi-threaded calls) and store it in MySQL.
It should be capable of asynchronous calls. For example, Google Analytics allows 4 concurrent calls and they can take ages (>60 seconds), so it's important I can use all 4 at once. Omniture does not have this limit.
It should be fault-tolerant: if any of the APIs is down, it should keep retrying every x seconds and then obtain all missing table rows once the API is back up.
I'd like an uber-simple admin page which shows a log of the last 500 or so calls, their success/failure, and the time taken to retrieve each.
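Three of the features above (the 4-call concurrency cap for Google Analytics, retry-until-up, and a rolling log of the last 500 calls) could be sketched together like this. fetch() is a stand-in for the real API call, and the names and default values are assumptions, not part of the brief:

```python
import threading
import time
from collections import deque

# Assumed limits: GA permits 4 concurrent requests; retry delay is a guess
# standing in for the "every x seconds" in the brief.
GA_SEMAPHORE = threading.Semaphore(4)
CALL_LOG = deque(maxlen=500)   # (request, succeeded, seconds taken)

def call_api(fetch, request, semaphore=GA_SEMAPHORE,
             max_attempts=5, retry_delay=30):
    """Run fetch(request) under the concurrency cap, retrying on failure
    and recording each attempt's outcome in the rolling log."""
    with semaphore:
        for attempt in range(1, max_attempts + 1):
            start = time.time()
            try:
                result = fetch(request)
                CALL_LOG.append((request, True, time.time() - start))
                return result
            except Exception:
                CALL_LOG.append((request, False, time.time() - start))
                if attempt == max_attempts:
                    raise
                time.sleep(retry_delay)
```

An Omniture module would pass a much larger semaphore (or none at all), and the admin page would simply render CALL_LOG, or a MySQL-backed equivalent of it, most recent first.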
When applying, please briefly suggest how you would tackle this problem. It doesn't have to be Python if you have a better idea. The only limitation is that it has to run on a LAMP server.
I have lots of similar projects so there will be ongoing work for the right person.
I'd prefer UK based but not essential.
Feel free to ask me any questions
Daniel H.
Projects completed: 0% (0)
Freelancers worked with: -
Projects awarded: 0%
Last project: 25 Apr 2024
United Kingdom
Clarification Board
There are no clarification messages.