Urgent Project: Full Database Extraction via Scrape

Fixed Price

£100 (approx. $134)

Posted: 19 hours ago
Proposals: 35
Remote
#4496809
Open for Proposals

Description

Experience Level: Entry

Overview

We need a developer to build and run a scraping script that extracts all current records from the UK Government's Individual Insolvency Register (IIR).

Register URL: https://www.insolvencydirect.bis.gov.uk/eiir/search

This is a public government database published under the Open Government Licence v3.0, which permits reuse including for commercial purposes. We have had this data scraped successfully before, so we know it is achievable.

What the Register Contains

The register lists individuals currently subject to an Individual Voluntary Arrangement (IVA), Debt Relief Order (DRO), or Bankruptcy in England and Wales. Each record contains personal details and case information on a dedicated case detail page.

Data Fields Required

We need the following fields extracted for every record on the register and delivered in a single CSV/Excel file:

Field | Source Location | Notes
Title | Case detail page - "Individual details" | e.g. Mr, Mrs, Ms
First Name | Case detail page - "Forename(s)" |
Last Name | Case detail page - "Surname" |
Address Line 1 | Case detail page - "Last known address" | Parsed/split from full address
Address Line 2 | Case detail page - "Last known address" | Parsed/split from full address
City | Case detail page - "Last known address" | Parsed/split from full address
Postcode | Case detail page - "Last known address" | Parsed/split from full address
Date of Birth | Case detail page - "Date of birth" | Format: DD/MM/YYYY
Type | Case detail page - "Type" under Insolvency case details | e.g. Individual Voluntary Arrangement, Debt Relief Order, Bankruptcy
Arrangement Date | Case detail page - "Arrangement date" (or equivalent date field) | Format: DD/MM/YYYY
Firm | Case detail page - Insolvency practitioner "Firm" | The IP firm managing the case
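
As a rough illustration of the "Parsed/split from full address" note, the sketch below splits one address string into the four address columns. It assumes the address arrives as comma-separated parts ending in a UK postcode. The function name `split_address`, the loose postcode regex and the rule that the last remaining part is the city are all illustrative assumptions, not something the register guarantees.

```python
import re

# Loose UK postcode pattern (assumption: adequate for splitting, not for validation).
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE)


def split_address(full_address: str) -> dict:
    """Split a comma-separated UK address into line 1, line 2, city and postcode."""
    text = " ".join(full_address.split())  # collapse newlines and repeated spaces

    postcode = ""
    matches = list(POSTCODE_RE.finditer(text))
    if matches:
        last = matches[-1]  # the postcode is assumed to be the last match
        postcode = f"{last.group(1)} {last.group(2)}".upper()
        text = text[:last.start()] + text[last.end():]

    parts = [p.strip() for p in text.split(",") if p.strip()]
    # Assumption: with two or more parts, the final part is the town or city.
    city = parts.pop() if len(parts) > 1 else ""
    line1 = parts[0] if parts else ""
    line2 = ", ".join(parts[1:])

    return {
        "Address Line 1": line1,
        "Address Line 2": line2,
        "City": city,
        "Postcode": postcode,
    }
```

Real addresses will not always fit this shape (missing postcodes, county names after the town), so any such splitter would need checking against a sample of actual records.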

How the Register Works (Technical Context)

No API exists - the data must be scraped from the web interface.

The search page accepts a name query and returns paginated results.

Search results URLs follow the pattern: /eiir/search-results/{base64_encoded_search_term}/{page_number}
Each result links to a case detail page at /eiir/case-details/... containing all fields above in structured HTML.
The site is server-rendered HTML (no JavaScript framework rendering), making parsing straightforward.

The search term in the URL is standard base64-encoded text.
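
The URL pattern above can be illustrated with a short sketch. It assumes the search term is UTF-8 text, that page numbering starts at 1, and that the base64 token is placed in the path unmodified; the helper name `search_results_url` is made up for this example.

```python
import base64

# Host taken from the register URL given in this brief.
BASE_URL = "https://www.insolvencydirect.bis.gov.uk"


def search_results_url(term: str, page: int = 1) -> str:
    """Build a search-results URL for a search term and page number."""
    # Standard base64 of the UTF-8 bytes of the term (encoding is an assumption).
    token = base64.b64encode(term.encode("utf-8")).decode("ascii")
    return f"{BASE_URL}/eiir/search-results/{token}/{page}"


print(search_results_url("smith", 2))
```

Standard base64 output can contain `+`, `/` and `=`. How the site expects those characters inside a path is not stated here, so that is worth checking against the live site.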

Suggested Approach
The typical method (which has worked before) is:

1. Systematic search iteration - cycle through common surnames, alphabetical letter combinations, or other strategies to ensure full coverage of all records on the register.

2. Collect all case detail URLs from paginated search results.

3. Deduplicate case URLs (the same record may appear under multiple search terms).

4. Visit each case detail page and parse the HTML to extract all required fields.

5. Address parsing - split the "Last known address" field into Address Line 1, Address Line 2, City, and Postcode.

6. Compile into a single CSV or Excel file with one row per record.

Polite rate limiting (delays between requests) should be used to avoid overloading the server or triggering blocks.
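
The search iteration, URL collection and deduplication steps above, together with the polite delay, might be sketched as follows. This is a minimal outline, not a working scraper. `fetch_page` is a hypothetical callable supplied by the caller that downloads one results page and returns the case detail URLs found on it (an empty list once the pages run out). The two-letter search terms, the one-second delay and the `max_pages` safety limit are assumptions.

```python
import itertools
import string
import time
from typing import Callable, Iterable, Iterator, List


def letter_terms(length: int = 2) -> Iterator[str]:
    """Yield every lowercase letter combination of the given length (aa, ab, ..., zz)."""
    for combo in itertools.product(string.ascii_lowercase, repeat=length):
        yield "".join(combo)


def collect_case_urls(
    terms: Iterable[str],
    fetch_page: Callable[[str, int], List[str]],
    delay_seconds: float = 1.0,
    max_pages: int = 1000,
) -> List[str]:
    """Walk the result pages of every term and return unique case URLs in first-seen order."""
    seen = set()
    ordered = []
    for term in terms:
        for page in range(1, max_pages + 1):
            urls = fetch_page(term, page)  # hypothetical fetch-and-parse step
            time.sleep(delay_seconds)      # polite pause after every request
            if not urls:
                break                      # no results: assume the term is exhausted
            for url in urls:
                if url not in seen:        # the same case can appear under many terms
                    seen.add(url)
                    ordered.append(url)
    return ordered
```

Whether the site caps the number of results per search is not stated here. If it does, a short term that hits the cap would need splitting into longer terms to keep coverage complete.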

Deliverables

The complete dataset as a CSV and/or Excel file, with all fields above, one row per record.

The scraping script (Python preferred but open to other languages), so we can re-run this periodically.

Brief documentation on how to re-run the script (dependencies, any configuration needed).
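
For the CSV deliverable, a minimal sketch of the output step could look like this. The column names follow the field list in this brief. The helper names `format_date` and `write_records`, and the choice to leave missing fields blank, are assumptions made for illustration.

```python
import csv
from datetime import date
from typing import Iterable, Mapping, TextIO

# Column order follows the "Data Fields Required" list.
COLUMNS = [
    "Title", "First Name", "Last Name",
    "Address Line 1", "Address Line 2", "City", "Postcode",
    "Date of Birth", "Type", "Arrangement Date", "Firm",
]


def format_date(value: date) -> str:
    """Render a date as DD/MM/YYYY, the format requested for both date fields."""
    return value.strftime("%d/%m/%Y")


def write_records(records: Iterable[Mapping[str, str]], out: TextIO) -> int:
    """Write one CSV row per record and return the number of rows written."""
    writer = csv.DictWriter(
        out,
        fieldnames=COLUMNS,
        restval="",             # missing fields become empty cells
        extrasaction="ignore",  # keys outside COLUMNS are dropped
    )
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)
        count += 1
    return count
```

When writing to a real file, the `csv` module documentation recommends opening it with `newline=""`. An encoding such as `utf-8-sig` may help Excel detect UTF-8 correctly.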

Important Notes

We need all records currently on the register, not just a sample. Please confirm your approach to achieving full coverage.

The data is publicly available under the Open Government Licence v3.0.

We have done this before and know it is achievable — please only apply if you are confident you can deliver.
This is urgent - we need this turned around quickly. Please state your estimated delivery time when quoting.

Michelle L.

Projects awarded

49%

Last project

18 Mar 2026

United Kingdom

Clarification Board

Yesterday

Hi Michelle,

I have placed a bid with sample work. Have you had a chance to review it?

Thanks
Sumit
SaS Technologies

Yesterday

A few quick questions:
1. Do you want only current active records or historical data as well if accessible?
2. Would you like the final output in both CSV and Excel formats?
3. Should the script support future incremental updates or only full fresh extraction each time?