
Scrape or Curl data from website and add to Google Sheet
- or -
Post a project like this1819
£150(approx. $199)
- Posted:
- Proposals: 7
- Remote
- #2790523
- OPPORTUNITY
- Completed
Description
Experience Level: Intermediate
Please provide your FINAL and fixed price for this job. If you have any questions, please post on this site, any direct contacts will be rejected. Once the person is chosen, i'll accept their quote, so it will need to be your full & final cost.
I'm looking for a Google Sheet script to be written that does the following:
The following function should be accessed by a custom menu option in GoogleSheets called "Retrieve PSC"
Take a number (e.g. 06663152) from a cell in Googlesheets and place this into a URL (e.g.https://beta.companieshouse.gov.uk/company/06663152/persons-with-significant-control) to extract the data from that page.
Once on that page e.g. https://beta.companieshouse.gov.uk/company/06663152/persons-with-significant-control if the value is a company e.g. "GBST UK Holdings Ltd" then store the number under "Registration number" and repeat process until names of people are found, see next section
If a name of a person or persons is found, e.g. https://beta.companieshouse.gov.uk/company/10720030/persons-with-significant-control
Then store the Name, e.g. "Mr Alex Burrows" and Correspondence Address .e.g. "10-12, Mulberry Green, Harlow, England, CM17 0ET" and Date of Birther "February 1988" in cells in a row, along with Company Number as a number, e.g. 10720030 and with this number linked to the original page e.g. https://beta.companieshouse.gov.uk/company/10720030/persons-with-significant-control and company name, e.g. DigiDropLtd.
Then in the next row store the data from the next person, with all the same information as above. Repeat this for all the persons listed.
The next step should be access under the same menu titled "Retrieve Officers"
This step should use the same company number as before, e.g. 06663152 and add use this in the url "https://beta.companieshouse.gov.uk/company/10720030/officers" as above, replacing the number "10720030" with the value from the cell e.g. 06663152. Now it should retrieve the name of each person and their address and place this into 2 cells, e.g. "Alex Burrows" and "2 - 3 Stable Court, Herriard Park, Herriard, Basingstoke, England, RG25 2PL"
The 3rd step "Match address"
Retrieve the "Registered Office Address" from this page https://beta.companieshouse.gov.uk/company/10720030, again replacing the company number as before.
So now for a company there will be a Registered Office Address, and an address for each PSC and a name & address for each Officer. This final step should compare the address for the PSC with that in Registered office address, if different, keep the PSC address, if the same, look to see if there is an address for that person under Officer and if that address differs from Registered Office Address use that address.
This should result in a name & address for each PSC and where possible using a different address than the registered office address.
Step 4 "Mail Merge"
The mail merge should use a temmplate word document and replace various fields in the document with values from the Google Sheet, Company Name, Sector and name & address of a PSC. The mail merge needs to create one page/letter for each PSC but ONLY if a value in the column "INCLUDE" is YES.
I'm looking for a Google Sheet script to be written that does the following:
The following function should be accessed by a custom menu option in GoogleSheets called "Retrieve PSC"
Take a number (e.g. 06663152) from a cell in Googlesheets and place this into a URL (e.g.https://beta.companieshouse.gov.uk/company/06663152/persons-with-significant-control) to extract the data from that page.
Once on that page e.g. https://beta.companieshouse.gov.uk/company/06663152/persons-with-significant-control if the value is a company e.g. "GBST UK Holdings Ltd" then store the number under "Registration number" and repeat process until names of people are found, see next section
If a name of a person or persons is found, e.g. https://beta.companieshouse.gov.uk/company/10720030/persons-with-significant-control
Then store the Name, e.g. "Mr Alex Burrows" and Correspondence Address .e.g. "10-12, Mulberry Green, Harlow, England, CM17 0ET" and Date of Birther "February 1988" in cells in a row, along with Company Number as a number, e.g. 10720030 and with this number linked to the original page e.g. https://beta.companieshouse.gov.uk/company/10720030/persons-with-significant-control and company name, e.g. DigiDropLtd.
Then in the next row store the data from the next person, with all the same information as above. Repeat this for all the persons listed.
The next step should be access under the same menu titled "Retrieve Officers"
This step should use the same company number as before, e.g. 06663152 and add use this in the url "https://beta.companieshouse.gov.uk/company/10720030/officers" as above, replacing the number "10720030" with the value from the cell e.g. 06663152. Now it should retrieve the name of each person and their address and place this into 2 cells, e.g. "Alex Burrows" and "2 - 3 Stable Court, Herriard Park, Herriard, Basingstoke, England, RG25 2PL"
The 3rd step "Match address"
Retrieve the "Registered Office Address" from this page https://beta.companieshouse.gov.uk/company/10720030, again replacing the company number as before.
So now for a company there will be a Registered Office Address, and an address for each PSC and a name & address for each Officer. This final step should compare the address for the PSC with that in Registered office address, if different, keep the PSC address, if the same, look to see if there is an address for that person under Officer and if that address differs from Registered Office Address use that address.
This should result in a name & address for each PSC and where possible using a different address than the registered office address.
Step 4 "Mail Merge"
The mail merge should use a temmplate word document and replace various fields in the document with values from the Google Sheet, Company Name, Sector and name & address of a PSC. The mail merge needs to create one page/letter for each PSC but ONLY if a value in the column "INCLUDE" is YES.

Giles B.
100% (10)Projects Completed
18
Freelancers worked with
10
Projects awarded
43%
Last project
24 Apr 2020
United Kingdom
New Proposal
Login to your account and send a proposal now to get this project.
Log inClarification Board Ask a Question
-
Can You Please Send Me The Remaining Company Numbers Please.
-
What Are The Other company number???
-
Can You Please Share More Details About Links.
Giles B.10 Apr 2020What more details of the links do you need, if you click on the links you can see the data and the brief describes what information is to be extracted from each page?
Charan T.10 Apr 2020You Mentioned Example Links In The Job Description.
-
Can i do this manually mean without script ???
Giles B.10 Apr 2020No as this will be used over and over for thousands of records
910097910086910077910010
We collect cookies to enable the proper functioning and security of our website, and to enhance your experience. By clicking on 'Accept All Cookies', you consent to the use of these cookies. You can change your 'Cookies Settings' at any time. For more information, please read ourCookie Policy
Cookie Settings
Accept All Cookies