Web-based PDF data extraction to text file
$100
- Posted:
- Proposals: 2
- Remote
- #413598
- Awarded
Description
Experience Level: Intermediate
General information for the website: Data extraction from voter list files in PDF format.
Description of requirements/features: Write a web-based script to extract certain numerical data from a voter list PDF file. I have attached 2 sample PDF files, one in Hindi and the other in Tamil.
We just want to extract the serial no, voter ID no and age into a text file. The serial no is the first item in the top rectangular box, followed by the voter ID number; the last line is the age. Each page will typically list 20 voters, with 2 names in each row.
The script (ideally browser-based) should ask for a local input PDF file name and write an output .txt file with the same name (stored locally) containing the serial no, voter ID no and age for each voter. If the voter's sex can be determined for the Hindi file, that would be welcome. If not, that's OK; just the numeric data will do.
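To make the box layout described above concrete, here is a minimal Python sketch of the parsing step only. It assumes the text of one voter box has already been pulled out of the PDF by some other tool, and that the voter ID follows the slash-separated shape seen in the two sample IDs in this posting. The function name `parse_voter_box` and the pattern are illustrative assumptions, not part of the brief, and a browser-based version would need the same logic in JavaScript.

```python
import re

# Assumed shape, inferred only from the two sample IDs in this posting:
# two letters, then groups of 2, 3 and 7 digits separated by slashes.
VOTER_ID = re.compile(r"[A-Z]{2}/\d{2}/\d{3}/\d{7}")


def parse_voter_box(box_text):
    """Return (serial_no, voter_id, age) for one voter box, or None.

    Follows the layout in the description: the serial no comes first,
    the voter ID follows it, and the age is on the last line.
    """
    lines = [ln.strip() for ln in box_text.splitlines() if ln.strip()]
    if not lines:
        return None

    id_match = VOTER_ID.search(box_text)
    if id_match is None:
        return None

    # Serial no: first run of digits that appears before the voter ID.
    serial_match = re.search(r"\d+", box_text[:id_match.start()])
    # Age: last run of digits on the last non-empty line.
    age_digits = re.findall(r"\d+", lines[-1])
    if serial_match is None or not age_digits:
        return None

    return int(serial_match.group()), id_match.group(), int(age_digits[-1])
```

For example, `parse_voter_box("1 TN/28/161/0000103\nName ...\nAge 70")` would yield the serial no, voter ID and age as a tuple; boxes without a recognisable voter ID return `None` so they can be reviewed by hand.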
CMS and Admin requirements: note on sample data:
1 (serial no), TN/28/161/0000103 (voter id), 70 (age) from file 2014 kunnam......
1 (serial no), UP/30/139/0108002 (voter id), 54 (age) from file barabanki....
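The sample rows above suggest one line per voter in the output file. The following Python sketch shows one way to name the .txt file after the input PDF and write such lines. The comma-separated layout and the helper names (`output_path_for`, `format_record`, `write_records`) are assumptions for illustration; the posting does not specify an exact output format.

```python
from pathlib import Path


def output_path_for(pdf_path):
    """Same location and name as the input PDF, with a .txt extension."""
    return Path(pdf_path).with_suffix(".txt")


def format_record(serial_no, voter_id, age):
    """One output line per voter (assumed layout: serial, voter id, age)."""
    return f"{serial_no}, {voter_id}, {age}"


def write_records(pdf_path, records):
    """Write (serial_no, voter_id, age) tuples next to the input PDF."""
    out_path = output_path_for(pdf_path)
    text = "\n".join(format_record(*rec) for rec in records) + "\n"
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

Keeping the naming and formatting in small separate functions makes it easy to change the delimiter or add a sex column later without touching the extraction code.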
Specific technologies required: Open to any technology. The ability to extract and translate the non-numeric data would be an added bonus, and we can discuss the additional cost separately. Initially, the priority is to extract only the numeric data.
Extra notes: This is a test program to assess the viability of hosting a large database. Those who deliver results quickly will be considered for continued engagement.
Srini V.
- 100% (3)
- Projects completed: 3
- Freelancers worked with: 3
- Projects awarded: 25%
- Last project: 14 May 2014
- India