Data Cleansing Project + Identify Duplicate and Potentially Duplicate Records
$$$
- Proposals: 6
- Remote
- #1020422
- Archived
Description
Experience Level: Expert
General information for the business: Used Equipment Reseller
Database management system (DBMS): MySQL
Description of requirements/functionality: We have an export of data from our ERP system (Microsoft Excel spreadsheet). The database contains 56,061 records in 28 fields. I need someone to cleanse the data as follows prior to our importing this data into our new CRM system:
1. Standardize capitalization in every field except the ID field: capitalize the first word and every other word except articles, prepositions, and conjunctions. Do not standardize lookup fields such as "Most Recent Sales Type", "Initial Sales Type", etc., as these flow from the ERP in a standardized format that should remain the same when imported into the CRM.
2. Standardize e-mail addresses to lowercase.
3. Standardize states to their two-letter codes.
4. Standardize phone numbers to the format 555-555-5555 for all records with U.S. addresses, and to 55 555-555-5555 for all international phone numbers.
5. Remove any leading or trailing spaces
6. Standardize address components such as "street", "avenue", "ave", "st", "suite", "ste.", "drive", "dr.", "boulevard", "blvd", etc. using the U.S. Postal Service standardized abbreviations (see http://pe.usps.gov/text/pub28/28apc_002.htm).
7. Using an "approximate match" or "fuzzy lookup" algorithm, identify the records that are probable matches/overlaps. Using a pivot table or other mechanism, create a view so that these records are isolated for easy review.
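As an illustration only, steps 1 through 6 could be sketched in Python along the following lines. Python is not among the technologies named in this posting, and every function name, the list of minor words, and the short lookup tables are assumptions made for the example rather than part of the requirements. The lookup tables are excerpts, and the international phone format assumes a two-digit country code followed by ten digits, which will not hold for every country.

```python
import re

# Assumed, partial list of articles, prepositions, and conjunctions (step 1).
MINOR_WORDS = {"a", "an", "the", "and", "but", "or", "nor", "for",
               "of", "in", "on", "at", "to", "by", "with", "from"}

# Excerpts only; real tables would cover every state and USPS suffix.
STATE_CODES = {"california": "CA", "new york": "NY", "texas": "TX"}
ADDRESS_ABBREVIATIONS = {"street": "St", "st": "St", "avenue": "Ave",
                         "ave": "Ave", "drive": "Dr", "dr": "Dr",
                         "boulevard": "Blvd", "blvd": "Blvd",
                         "suite": "Ste", "ste": "Ste"}


def title_case(value: str) -> str:
    """Steps 1 and 5: trim, then capitalize all but minor words."""
    words = value.lower().split()
    out = []
    for i, word in enumerate(words):
        if i > 0 and word in MINOR_WORDS:
            out.append(word)  # minor words stay lowercase unless first
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)


def normalize_email(value: str) -> str:
    """Steps 2 and 5: trim and lowercase."""
    return value.strip().lower()


def state_code(value: str) -> str:
    """Step 3: map a state name to its code; unknown values pass through."""
    trimmed = value.strip()
    if len(trimmed) == 2:
        return trimmed.upper()
    return STATE_CODES.get(trimmed.lower(), trimmed)


def format_phone(value: str, is_us: bool) -> str:
    """Step 4: reformat digits; unexpected lengths are returned unchanged."""
    digits = re.sub(r"\D", "", value)
    if is_us:
        if len(digits) == 11 and digits.startswith("1"):
            digits = digits[1:]  # drop a leading U.S. country code
        if len(digits) != 10:
            return value.strip()  # leave for manual review
        return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    if len(digits) != 12:
        return value.strip()  # leave for manual review
    return f"{digits[:2]} {digits[2:5]}-{digits[5:8]}-{digits[8:]}"


def abbreviate_address(value: str) -> str:
    """Step 6: replace known address words with their abbreviations."""
    words = value.strip().split()
    return " ".join(
        ADDRESS_ABBREVIATIONS.get(w.lower().rstrip("."), w) for w in words
    )
```

Values that do not fit an expected pattern are returned unchanged so that they can be flagged for manual review instead of being silently altered.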
Also, we are interested in checking e-mail addresses for bounces, addresses for deliverability (CASS/NCOA databases), and phone numbers for validity. These requests are optional and not part of the core project, but it would be nice to be able to do this.
File output (finished product) must be in an Excel worksheet. Detailed notes including methodologies used must be provided. If MySQL or other DB tools are used, we must receive any queries, VB Scripting, etc.
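For the approximate-match requirement in item 7, one possible approach is sketched below in Python using the standard library's `difflib.SequenceMatcher`. This is an assumption about how the matching might be done, not a prescribed method. The function name, the `key` and `block` parameters, the sample field names, and the 0.85 threshold are all hypothetical. Comparing every pair among 56,061 records would mean roughly 1.6 billion comparisons, so the sketch only compares records that share a blocking value (a ZIP code in the usage example).

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def probable_duplicates(records, key, block, threshold=0.85):
    """Return (index_a, index_b, score) for likely duplicate records.

    key   -- function returning the string to compare for a record
    block -- function returning a grouping value; only records in the
             same group are compared, which keeps the pair count manageable
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[block(record)].append(index)

    matches = []
    for members in groups.values():
        for i, j in combinations(members, 2):
            score = SequenceMatcher(
                None, key(records[i]), key(records[j])
            ).ratio()
            if score >= threshold:
                matches.append((i, j, round(score, 3)))
    # Highest-scoring pairs first, for easier review.
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical sample rows; the field names are illustrative only.
sample = [
    {"name": "Acme Equipment Inc", "zip": "10001"},
    {"name": "Acme Equipment Inc.", "zip": "10001"},
    {"name": "Zenith Tools", "zip": "10001"},
    {"name": "Acme Equipment Inc", "zip": "90210"},
]
pairs = probable_duplicates(
    sample,
    key=lambda r: r["name"].lower(),
    block=lambda r: r["zip"],
)
```

The resulting pairs could be written to a separate worksheet so that reviewers see each candidate match side by side. Blocking trades recall for speed: duplicates whose blocking value differs, such as the last sample row, are never compared, so the choice of blocking field would need to be documented in the methodology notes.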
Turnaround time is 72 hours from posting of project.
Specific technologies required: Microsoft Excel and/or MySQL
Extra notes:
Matt W.
- 100% (6)
- Projects Completed: 7
- Freelancers worked with: 8
- Projects awarded: 21%
- Last project: 20 Dec 2017
- United States