
I am looking for Python code to extract specific field information from a parsed file
€100 (approx. $115)
- Posted:
- Proposals: 5
- Remote
- #1631093
- Awarded
Description
Experience Level: Entry
General information for the business: The parsed file type is CoNLL-X. It is the output of the TweeboParser application (URL: http://www.cs.cmu.edu/~ark/TweetNLP/).
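For orientation, a CoNLL-X file is plain text with one token per line, tab-separated columns, and a blank line between sentences (here, tweets). The sketch below reads such a file into one token list per tweet. It assumes the standard CoNLL-X column order (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL) and that the TweeboParser output follows it; the names `Token` and `read_conll` are illustrative and not part of any library.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Token(NamedTuple):
    idx: int       # 1-based token position within the tweet (column 1)
    form: str      # the word itself (column 2)
    cpostag: str   # coarse POS tag (column 4)
    postag: str    # fine POS tag (column 5)
    deprel: str    # dependency relation (column 8), "_" if the column is missing


def read_conll(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield one list of tokens per tweet; blank lines separate tweets."""
    tweet: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current tweet, if there is one.
            if tweet:
                yield tweet
                tweet = []
            continue
        cols = line.split("\t")
        if len(cols) < 5:
            continue  # assumption: silently skip lines with too few columns
        deprel = cols[7] if len(cols) > 7 else "_"
        tweet.append(Token(int(cols[0]), cols[1], cols[3], cols[4], deprel))
    if tweet:
        # The last tweet may not be followed by a blank line.
        yield tweet
```

With a real file this could be called as `with open("TestTweets.txt.predict", encoding="utf-8") as fh: tweets = list(read_conll(fh))`; the encoding is an assumption about how the parser output was saved.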
Description of requirements/functionality: This should be a short Python program that reads a CoNLL-X parsed file (see attached TestTweets.txt.predict file). For each tweet, the program should identify any entities in the tweet and any neighboring descriptor words.
As shown in the attached requirements document, the displayed tweet contains two entities, ‘State Farm’ and ‘JD Power’. Each has a POSTAG value of ‘^’, and both are multi-word expressions (denoted by MWE). I would like to identify the entities in the text and any neighboring descriptor words (within 2 words of the entity): verbs (V), adjectives (A) and adverbs (R). The entities plus any neighboring descriptor words should be written to an output file (.csv or .xls), with each tweet identified by id (e.g. tweet #1 has id 001).
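One possible reading of this requirement is sketched below. It treats every run of consecutive tokens tagged ‘^’ as a single entity, which is an assumption standing in for the MWE marking (two distinct entities that sit directly next to each other would be merged). It then collects V, A and R words up to 2 tokens before or after the entity. The function names, the CSV column names and the sample tweet are hypothetical and are not taken from the attached files.

```python
import csv
import io
from typing import List, Sequence, Tuple

ENTITY_TAG = "^"
DESCRIPTOR_TAGS = {"V", "A", "R"}  # verbs, adjectives, adverbs
WINDOW = 2                          # "within 2 words of entity"

TaggedWord = Tuple[str, str]        # (FORM, POSTAG)


def find_entities(tweet: Sequence[TaggedWord]) -> List[Tuple[str, List[str]]]:
    """Return (entity text, neighboring descriptor words) for each entity."""
    results = []
    i = 0
    while i < len(tweet):
        if tweet[i][1] != ENTITY_TAG:
            i += 1
            continue
        start = i
        # Assumption: consecutive '^' tokens form one multi-word entity.
        while i < len(tweet) and tweet[i][1] == ENTITY_TAG:
            i += 1
        end = i  # exclusive
        before = list(tweet[max(0, start - WINDOW):start])
        after = list(tweet[end:end + WINDOW])
        descriptors = [w for w, tag in before + after if tag in DESCRIPTOR_TAGS]
        entity = " ".join(w for w, _ in tweet[start:end])
        results.append((entity, descriptors))
    return results


def write_rows(tweets: Sequence[Sequence[TaggedWord]], out) -> None:
    """Write one CSV row per entity, with a zero-padded tweet id (001, 002, ...)."""
    writer = csv.writer(out)
    writer.writerow(["tweet_id", "entity", "descriptors"])
    for n, tweet in enumerate(tweets, start=1):
        for entity, descriptors in find_entities(tweet):
            writer.writerow([f"{n:03d}", entity, " ".join(descriptors)])


# Hypothetical tweet, for illustration only.
sample = [("State", "^"), ("Farm", "^"), ("is", "V"), ("rated", "V"),
          ("highly", "R"), ("by", "P"), ("JD", "^"), ("Power", "^")]
buffer = io.StringIO()
write_rows([sample], buffer)
print(buffer.getvalue())
```

To write a real file, `write_rows` can be given a handle opened with `open("entities.csv", "w", newline="", encoding="utf-8")`; the `newline=""` argument matters on Windows because it stops the csv module's line endings from being doubled.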
OS requirements: Windows

Eddy S.
93% (18)
Projects Completed: 18
Freelancers worked with: 14
Projects awarded: 91%
Last project: 3 Feb 2025
United States
Clarification Board
Hi Eddy,
An entity should have ^ in both the CPOSTAG and POSTAG columns, right? Is that why "Busibank" should not be considered an entity?
Lucy
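The posting does not include an answer to this question, so both readings remain open. A minimal sketch of the two interpretations follows, with a hypothetical `is_entity` helper and a `strict` flag that are illustrative only.

```python
def is_entity(cpostag: str, postag: str, strict: bool = True) -> bool:
    """Decide whether a token counts as an entity.

    strict=True  -> both CPOSTAG and POSTAG must be '^' (the reading in the question)
    strict=False -> only POSTAG must be '^' (the reading in the project description)
    """
    if strict:
        return cpostag == "^" and postag == "^"
    return postag == "^"
```

Keeping the rule in one small function means that whichever interpretation the client confirms can be switched without touching the rest of the program.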