Excel routine to compare address data sets and score for similarity

  • Posted
  • Proposals 1
  • Remote
  • #24526
  • Archived

Description

Experience Level: Intermediate
This job is to create a Microsoft Excel workbook (compatible with Excel 2004 for Mac) - or a web-based database application - that will compare two sets of data and score each record in one set for matches against records in the second set.
The first data set (Master file) is a file containing about 8,000 names and addresses with the following seven fields: Name, four Address fields, County, Full UK Postcode.
The second data set (Reference file) is a file of about 4,000 records with two fields: Name and Part UK Postcode (the first half of the postcode, typically three or four characters).
Rather than manually compare the two lists, I would like a smart Excel solution to find and score matches between the two data sets.
I don't know if this is even possible, but the ideal solution will produce a data set that includes the seven Master data fields plus an additional field giving a match score, possibly as a percentage.
As a starting point, the two postcode fields should match (the Master postcode must begin with the part postcode given in the Reference file). If the postcodes match, the name fields should be compared for similarity and a match score calculated (100% for a direct hit).
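To make the postcode-then-name rule concrete, here is a minimal sketch in Python rather than Excel. It is only an illustration under stated assumptions: the function names are hypothetical, the postcode test compares the whole outward code (everything before the final three characters of a full UK postcode) so that "B1" does not match "B12 9XX", and the similarity measure is the standard library's `difflib.SequenceMatcher`, which is one of several reasonable choices and not something the brief specifies.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences do not lower the score."""
    return " ".join(text.lower().split())


def outward_code(full_postcode):
    """Return the first half of a full UK postcode.

    Assumption: the inward part of a full UK postcode is always the
    last three characters, so the rest is the outward code.
    """
    compact = full_postcode.upper().replace(" ", "")
    return compact[:-3]


def name_score(master_name, reference_name):
    """Similarity of two names as a whole-number percentage (100 for a direct hit)."""
    ratio = SequenceMatcher(None, normalise(master_name), normalise(reference_name)).ratio()
    return round(100 * ratio)


def best_match(master_name, master_postcode, reference_rows):
    """Best name score among Reference rows whose part postcode matches.

    reference_rows is an iterable of (name, part_postcode) pairs.
    Returns 0 when no Reference row shares the postcode.
    """
    best = 0
    for ref_name, ref_part in reference_rows:
        if outward_code(master_postcode) == ref_part.strip().upper():
            best = max(best, name_score(master_name, ref_name))
    return best
```

Calling `best_match` once per Master record gives the value for the extra score column. With 8,000 Master and 4,000 Reference records it may be worth grouping the Reference rows by part postcode first, so each Master record is only compared against the handful of names in its own postcode area.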
I will attach sample Master and Reference files (tab delimited format) that may help explain the task at hand.
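For anyone prototyping outside Excel, tab-delimited files like these can be read and written with Python's standard `csv` module. The sketch below is an assumption-laden illustration: the column names are hypothetical labels for the seven Master fields and two Reference fields described above, and it assumes the files have no header row, which the sample files may or may not confirm.

```python
import csv
import io

# Hypothetical column labels; the real sample files may name or order them differently.
MASTER_FIELDS = ["name", "address1", "address2", "address3", "address4", "county", "postcode"]
REFERENCE_FIELDS = ["name", "part_postcode"]


def read_tab_delimited(handle, fieldnames):
    """Read tab-delimited rows (assumed to have no header line) into dictionaries."""
    reader = csv.reader(handle, delimiter="\t")
    return [dict(zip(fieldnames, row)) for row in reader if row]


def write_scored(handle, rows, fieldnames):
    """Write each row's original fields followed by its match score as a percentage."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([row[name] for name in fieldnames] + [f"{row['score']}%"])


# Usage with an in-memory stand-in for a Master file containing one made-up record.
sample = io.StringIO("Acme Ltd\t1 High St\t\t\tAnytown\tAnyshire\tB1 2AB\n")
master_rows = read_tab_delimited(sample, MASTER_FIELDS)
```

Each dictionary in `master_rows` can be given a `score` entry and then passed to `write_scored` to produce the eight-column output file.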
