Merge data in Excel and manually dedupe the spreadsheet
£300 (approx. $377)
- Posted:
- Proposals: 60
- Remote
- #2596396
- OPPORTUNITY
- Awarded
Description
Experience Level: Expert
We want to build an Excel database of clinics using different spreadsheet sources of data we have collected.
We want you to merge the sources of data and dedupe the data in Excel so that we have a unique set of clean records, containing the IDs of the various data sources.
It is a database of clinics (facilities) and the people that work in them.
So there are two different tables in the directory, Facilities and People.
We need to maintain the relationships between the people and the facilities where they work using relational IDs.
Within their own “ecosystem”, all records have their own relational IDs, so Sugar People are mapped to Sugar Facilities, Connect People are mapped to Connect Facilities, and so on (more details to be provided once selected).
Please merge all people records together, inserting the IDs in the relevant columns, then dedupe the records.
When you dedupe the records, ensure that we end up with one clean set of data: add any conflicting contact fields to the additional fields columns AND include all relevant ID fields, so that each person has just one row containing the full set of data.
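As a rough illustration of the merge described above, here is a minimal Python sketch. It is not part of the brief, which asks for the work to be done in Excel. The field names (`source`, `id`, `name`, `email`, `phone`), the `sugar_id` / `connect_id` / `additional_phones` columns, and the choice of a normalised email address as the matching key are all assumptions made for the example; the real column layout is to be provided once a freelancer is selected.

```python
def merge_people(records):
    """Collapse records that share a normalised email into a single row.

    Assumption: email is a good enough matching key for this example.
    Real data would also need a human check, as the brief says.
    """
    merged = {}  # dicts keep insertion order in Python 3.7+
    for rec in records:
        key = rec["email"].strip().lower()
        if key not in merged:
            merged[key] = {
                "name": rec["name"],
                "email": key,
                "phone": rec["phone"],
                "additional_phones": [],  # hypothetical "additional fields" column
                "sugar_id": None,         # hypothetical ID column per source
                "connect_id": None,
            }
        row = merged[key]
        # Record the ID under the column for the source it came from.
        row[rec["source"] + "_id"] = rec["id"]
        # Keep conflicting contact data instead of discarding it.
        phone = rec["phone"]
        if phone and phone != row["phone"] and phone not in row["additional_phones"]:
            row["additional_phones"].append(phone)
    return list(merged.values())


# Made-up sample rows, for illustration only.
people = [
    {"source": "sugar", "id": "S-1", "name": "Ann Lee",
     "email": "Ann.Lee@example.com", "phone": "555-0100"},
    {"source": "connect", "id": "C-9", "name": "Ann Lee",
     "email": "ann.lee@example.com ", "phone": "555-0199"},
    {"source": "connect", "id": "C-10", "name": "Bob Ray",
     "email": "bob@example.com", "phone": "555-0111"},
]
rows = merge_people(people)
for row in rows:
    print(row)
```

In this sketch the two "Ann Lee" rows end up as one row carrying both source IDs, with the second phone number kept in the additional column rather than overwritten.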
Please do the same for the facilities, merging the different facility sources and deduping the data. There will be a lot of unclean duplicates in this data. Most duplicates will require a human to spot, because the clinic names and address formats are not exact matches and the names are never consistent. By using sorts and filters, however, it is possible to spot similarly named facilities with similar addresses, often sharing the same zip code or email address, and then with common sense you can recognise that it is a duplicate record. When you spot a duplicate, make sure you add all the correct IDs when you merge the row, and add any conflicting additional data from the other sheets to the additional details columns, so there is just one row that contains the full set of clean data.
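The "sort, filter, then use common sense" approach above could be supported by a small script that only shortlists likely duplicates for a human to review. The sketch below is one possible way to do that, not the client's method: it groups facilities by zip code and flags pairs whose names look similar or whose email addresses match. The field names, the `normalise` helper, the sample rows, and the 0.75 similarity threshold are all assumptions chosen for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalise(name):
    """Lower-case a name and drop simple punctuation and extra spaces."""
    return " ".join(name.lower().replace(".", "").replace(",", "").split())


def candidate_duplicates(facilities, threshold=0.75):
    """Return (id_a, id_b, score) for pairs a human should look at.

    Only facilities in the same zip code are compared (an assumption
    that keeps the number of comparisons small).
    """
    by_zip = defaultdict(list)
    for fac in facilities:
        by_zip[fac["zip"].strip()].append(fac)

    pairs = []
    for group in by_zip.values():
        for a, b in combinations(group, 2):
            same_email = bool(a["email"]) and a["email"].lower() == b["email"].lower()
            score = SequenceMatcher(
                None, normalise(a["name"]), normalise(b["name"])
            ).ratio()
            if same_email or score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


# Made-up sample rows, for illustration only.
facilities = [
    {"id": "S-10", "name": "Riverside Dental Clinic", "zip": "33601",
     "email": "info@riverside.example"},
    {"id": "C-22", "name": "Riverside Dental Clinic Ltd.", "zip": "33601",
     "email": ""},
    {"id": "C-23", "name": "Harbour Eye Centre", "zip": "33601", "email": ""},
    {"id": "S-11", "name": "Riverside Dental Clinic", "zip": "90210",
     "email": ""},
]
pairs = candidate_duplicates(facilities)
for pair in pairs:
    print(pair)
```

Note that the script only proposes pairs; it does not merge anything. A record with a mistyped or missing zip code would never be compared with its twin here, which is one reason the final decision still needs a person looking at the sorted sheet.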
Ben R.
100% (20)
- Projects Completed: 11
- Freelancers worked with: 11
- Projects awarded: 85%
- Last project: 27 Jan 2020
- United Kingdom
Clarification Board
- How many data sources do you have? Can you send a copy of the Excel files?
- Hi Ben,
  How big is the file? How many rows?
  Cheers,
  Agnes
- Hi Ben,
  Like the others, I would need to know the number of rows of data, please.
  Susan
- Hello Ben,
  How many records are there in each table?
  Thanks.
  Best Regards,
  Melih Erbas
- Hi Ben,
  How many records are there in total?
  Many thanks,
  Joe