I would like a program built which can draw out key data points from a batch of PDF t

Ends in (days)

2325

Per Hour

£15/hr (approx. $19/hr)

Posted: 6 years ago
Proposals: 9
Remote
#1801286
Awarded

Description

Experience Level: Intermediate

General information for the business: Research
Description of requirements/functionality: I have a whole collection of movie scripts, all in the same format, and I would like someone to build a program which analyses each document and pulls out key information.

Movie scripts follow a very rigid formatting standard, and most are written in a handful of writing programs, which means we can be sure that all my files will follow the same consistent formatting. To give you an idea, check out this image https://www.nyfa.edu/student-resources/wp-content/uploads/2014/06/final-draft-screen-shot.png

Most of the work in identifying elements will be achieved via the indentation, capitalization and certain key phrases (such as EXT or INT for the start of a scene).

I can help give details about what each of these are.
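
As a rough illustration only, the indentation, capitalization and key-phrase rules could be sketched in Python as follows. The sketch assumes the PDF text has already been extracted with its layout preserved (for example with a tool such as `pdftotext -layout`), so that leading spaces reflect the on-page indentation. The function name `classify_line` and the indent thresholds are hypothetical and would need tuning against the real files.

```python
import re

# Scene headings start with INT or EXT, followed by ".", "/" or whitespace.
SCENE_RE = re.compile(r"^(INT|EXT)[./\s]")

def classify_line(line, character_indent=20, dialogue_indent=10):
    """Label one line of layout-preserved script text.

    The two indent thresholds are illustrative guesses, not measured values.
    """
    stripped = line.strip()
    if not stripped:
        return "blank"
    # Number of leading spaces, used as a stand-in for on-page indentation.
    indent = len(line) - len(line.lstrip(" "))
    if SCENE_RE.match(stripped):
        return "scene_heading"
    # Character cues are deeply indented and fully capitalized.
    if indent >= character_indent and stripped.isupper():
        return "character"
    # Anything else that is indented is treated as dialogue.
    if indent >= dialogue_indent:
        return "dialogue"
    return "action"
```

For example, `classify_line("INT. KITCHEN - DAY")` returns `"scene_heading"`. This sketch does not distinguish transitions such as `CUT TO:` or parentheticals, which would need extra rules.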

From each PDF document the program looks at, I would like two types of data:

1. Basic data – such as number of pages, the data on the cover page (e.g. https://www.writersstore.com/system/imagemanager/screenplay-title-page-example.gif) etc.
2. More complex data – Such as the number of times a character speaks, the description used to define a character, etc. These are all identifiable by strict formatting rules and I can explain these. The person writing the tool will need to work out how to code the program to look for these particular patterns.
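
To give a sense of scale, both types of data could be sketched in a few lines of Python. This is a sketch under assumptions, not a finished design: it assumes the text has already been extracted with layout preserved and with pages separated by form-feed characters (as `pdftotext -layout` typically produces), and that a character cue is a deeply indented, fully capitalized line. The names `count_pages` and `count_speakers` and the indent threshold are hypothetical.

```python
import re
from collections import Counter

def count_pages(text):
    """Count pages, assuming a form feed separates (or ends) each page."""
    pages = text.split("\f")
    # Drop an empty trailing chunk left by a final form feed.
    if pages and not pages[-1].strip():
        pages.pop()
    return len(pages)

def count_speakers(text, character_indent=20):
    """Count how often each character cue appears.

    character_indent is an illustrative threshold, not a measured value.
    """
    counts = Counter()
    for line in text.replace("\f", "\n").split("\n"):
        stripped = line.strip()
        indent = len(line) - len(line.lstrip(" "))
        if stripped and indent >= character_indent and stripped.isupper():
            # Strip extensions such as (V.O.) or (CONT'D) from the cue.
            name = re.sub(r"\s*\(.*?\)\s*", "", stripped)
            if name:
                counts[name] += 1
    return counts
```

Note that a cue marked `(CONT'D)` is counted here as a separate speech. Whether that is the right rule is a decision for the research, not something the code can settle.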

Right now I intend this program just to be used by me privately in my research. If it proves useful then I may take it further but that doesn’t seem very likely now.

I assume that the best way of dealing with this is to build a very basic version (grabbing the data listed above) and then for us to add functionality as needed.

To apply, please let me know the following things:

• How you would tackle this (i.e. the coding language you’d use, etc)
• How you would bill your time (i.e. flat fee, by the hour etc) and at what rates.
• An estimate of the time / cost involved. I appreciate that this may change when we figure out the detail but it would be useful to get a sense of scale from your point of view.
• Examples of your past work

Because I won’t know how to pick between different applicants, I will use the information above to generate a shortlist of a few people. Then we can chat more about the requirements, meaning you can generate a more detailed plan and quote.

You can ignore the amount this job is listed with. It's a dummy amount just so I can post the job. Write your quote in the application text and we will talk before I accept anything.

Any applications which only say “We can do this!” won’t be considered. I do need a bit of context on you and how you’d tackle it in order to be able to pick between applicants.

Thank you.

Clarification Board

25 Nov 2017

Hello there.

I would like to clarify one thing first:
Do you need the program to use image recognition algorithms to work out indentation, words, etc., or could the entire logic behind your request be handled just by parsing the data in your PDFs as text?

PPH User P.