Python Script for Data Validation
- Proposals: 3
- Remote
- #1704385
- Awarded
Description
Experience Level: Entry
Estimated project duration: 1 day or less
General information for the business: Data Validation
Description of requirements/functionality: I need a Python script, which must run on OS X, to run the following workflow:
1. Accept CSV with columns 'Email' and 'Company'. The CSV may contain other columns in any order.
2. Compare the email address domain with the company name. If they match sufficiently (fuzzy match, e.g. Levenshtein distance), accept ALL similar rows in the file. Do not ask about that company name again in that session.
3. If they don't match sufficiently, show the row on screen and prompt the user to check it.
4. If rejected, add ALL rows and columns for that rejection to a CSV called 'rejected'
5. On completion, create a file called 'Processed', which contains only valid data
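The file handling in steps 1, 4 and 5 might be sketched as below. This is only an illustration under stated assumptions, not the awarded solution: the function name `split_rows` and the `is_valid` callback are hypothetical, and the demo rule inside the `__main__` block is a placeholder for the real matching logic.

```python
import csv
import io

REQUIRED_COLUMNS = ("Email", "Company")  # column names taken from the brief


def split_rows(source, processed_out, rejected_out, is_valid):
    """Copy each row of `source` to one of two CSV outputs.

    `is_valid` is a hypothetical callback taking (email, company) and
    returning True to keep the row. All original columns are written
    out in their original order.
    """
    reader = csv.DictReader(source)
    fields = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fields]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    processed = csv.DictWriter(processed_out, fieldnames=fields)
    rejected = csv.DictWriter(rejected_out, fieldnames=fields)
    processed.writeheader()
    rejected.writeheader()

    for row in reader:
        keep = is_valid(row["Email"], row["Company"])
        (processed if keep else rejected).writerow(row)


if __name__ == "__main__":
    sample = io.StringIO(
        "Name,Company,Email\n"
        "Ann,Acme,ann@acme.com\n"
        "Bob,Acme,bob@other.org\n"
    )
    good, bad = io.StringIO(), io.StringIO()
    # Placeholder rule for the demo only: company name appears in the domain.
    split_rows(sample, good, bad,
               lambda email, company: company.lower() in email.split("@")[-1])
    print(good.getvalue())
    print(bad.getvalue())
```

With real files, the three streams would be opened with `newline=""` as the `csv` module documentation recommends, for example `open("rejected.csv", "w", newline="")`; the exact output file names are an assumption here.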
Please document your code clearly and, if using Levenshtein/fuzzy matching, provide a configurable score variable I can tweak later.
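One way the configurable score could work is sketched below, using a plain-Python Levenshtein distance so that no third-party package is needed. The names `MATCH_THRESHOLD`, `normalise`, `domain_label` and `similarity`, the default of 0.8, and the choice to compare only the first label of the domain are all assumptions made for illustration; the brief does not specify them.

```python
MATCH_THRESHOLD = 0.8  # assumed default; raise for stricter matching


def levenshtein(a, b):
    """Dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def normalise(text):
    """Lower-case the text and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def domain_label(email):
    """Return the first label of the domain: 'acme' for 'ann@acme.com'."""
    domain = email.rsplit("@", 1)[-1]
    return domain.split(".")[0]


def similarity(email, company):
    """Score in [0, 1]; 1.0 means identical after normalisation."""
    a, b = normalise(domain_label(email)), normalise(company)
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - levenshtein(a, b) / longest


def is_match(email, company, threshold=MATCH_THRESHOLD):
    """True when the similarity reaches the configurable threshold."""
    return similarity(email, company) >= threshold
```

Taking only the first domain label is a simplification: an address such as `ann@mail.acme.co.uk` would be compared as `mail`, so a real script would probably need a smarter rule for subdomains and multi-part suffixes.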
Specific technologies required: Python
OS requirements: Mac OS
Extra notes: See example file attached.
Ask User for Match = Domain and Company Name differ noticeably but are still similar
Accept This = Domain and Company Name are practically the same
Reject This = Very different company name and domain, reject without user input
Reject This = Completely different, reject without user input
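The three outcomes listed above suggest two cut-off scores rather than one. The sketch below shows one possible reading, together with a per-session memory so the same pair is not asked about twice. The values of `ACCEPT_SCORE` and `REJECT_SCORE`, the class `SessionDecisions`, and the choice to key the memory on the (domain, company) pair are assumptions, not requirements stated in the brief.

```python
ACCEPT_SCORE = 0.85  # assumed: at or above this, accept without asking
REJECT_SCORE = 0.40  # assumed: below this, reject without asking


def classify(score, accept=ACCEPT_SCORE, reject=REJECT_SCORE):
    """Map a similarity score in [0, 1] to 'accept', 'ask' or 'reject'."""
    if score >= accept:
        return "accept"
    if score < reject:
        return "reject"
    return "ask"


class SessionDecisions:
    """Remember the outcome for each (domain, company) pair in one run."""

    def __init__(self, ask_user):
        self._ask_user = ask_user  # hypothetical callback returning True/False
        self._cache = {}

    def decide(self, domain, company, score):
        """Return True to accept the row, False to reject it."""
        key = (domain.lower(), company.strip().lower())
        if key not in self._cache:
            band = classify(score)
            if band == "ask":
                self._cache[key] = bool(self._ask_user(domain, company))
            else:
                self._cache[key] = band == "accept"
        return self._cache[key]


def prompt_on_terminal(domain, company):
    """Ask the operator to confirm a borderline pair on the console."""
    answer = input(f"Does '{domain}' belong to '{company}'? [y/n] ")
    return answer.strip().lower().startswith("y")
```

In use, something like `SessionDecisions(prompt_on_terminal).decide("acme.com", "Acme Corp", 0.5)` would prompt once and then reuse the answer for every later row with the same pair.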
Peter W.
100% (3)
Projects Completed: 5
Freelancers worked with: 3
Projects awarded: 71%
Last project: 19 Sep 2017
United Kingdom