Natural language processing of scientific articles
$35/hr
- Proposals: 9
- Remote
- #3232668
- Awarded
Description
Experience Level: Expert
Estimated project duration: More than 6 months
I have a fairly complex project that involves natural language processing of scientific articles in PDF form.
In general, these are the main steps:
1. Search and retrieve scientific articles from Scopus/Web of Science, specific journals (i.e., the few exceptions not indexed in Scopus/WoS), Google Scholar (to capture the rest, if any), and targeted collections (e.g., digitized graduate student theses).
2. Use NLP and supervised ML to identify articles that contain biological information (e.g., presence of particular species). Apply relevant labels to these documents and upload them to tagtog.net using its API.
3. Work with a training dataset of expert-curated keyword lists and identified relationships (specifically the occurrence of organisms at locations, annotated by biologists using tagtog.net) to categorize the articles by biology sub-discipline.
4. From there, we will iteratively improve the annotations and incorporate more articles (either using tagtog or your own ML algorithms) to obtain a comprehensive database of biological species and where they have been reported to occur.
5. Apply geographic information retrieval techniques to assign probability or certainty scores to locations of reported occurrence. You will be advised by people with GIR expertise, but you will be responsible for the coding, normalization, and basic QA.
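As a rough illustration of the supervised classification in step 2, the sketch below trains a tiny multinomial Naive Bayes model in plain Python to separate articles with biological content from everything else. The training sentences, the "bio"/"other" labels, and the NaiveBayes class are all hypothetical placeholders; the real project would train on text extracted from the PDFs and may well use a different model or library.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


class NaiveBayes:
    """Minimal multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        # Tokens never seen in training carry no evidence, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data; not taken from the project.
train_texts = [
    "We report the occurrence of the moss Bryum argenteum at coastal sites",
    "New records of springtail species collected from soil samples",
    "Lichen species were observed on rock outcrops near the station",
    "Ice sheet thickness was measured with airborne radar",
    "We model katabatic wind speed over the glacier surface",
    "Sediment cores reveal the thermal history of the basin",
]
train_labels = ["bio", "bio", "bio", "other", "other", "other"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("Two moss species were collected at the site"))
print(model.predict("Radar measured ice thickness across the glacier"))
```

The predicted label could then be attached to each document as a tag before upload; the upload call itself is left out here because it depends on the tagtog API details.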
See my tagtog project here: https://www.tagtog.net/charlesklee/RSRTB-01/pool
If interested, you will first do a paid trial covering steps 1 and 2, using Scopus and 100 downloaded PDFs (I will download them using your script, so don't worry about paywalls). If you are a good fit for the project, we will discuss your further involvement (including steps not listed above where you have relevant expertise).
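For the Scopus part of the trial, a retrieval script would likely start by building a search query and a paged request URL. The sketch below only constructs the strings and sends nothing. It assumes the Elsevier Scopus Search API endpoint and its query, start, and count parameters; the search terms and the helper names build_scopus_query and build_search_url are hypothetical. An API key (usually sent in the X-ELS-APIKey header) would be needed for a real request, and the details should be checked against Elsevier's current documentation.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Scopus Search API; verify before use.
SCOPUS_SEARCH_URL = "https://api.elsevier.com/content/search/scopus"


def build_scopus_query(terms, year_from=None):
    """Combine search terms into a TITLE-ABS-KEY query, optionally bounded by year."""
    quoted = " OR ".join(f'"{term}"' for term in terms)
    query = f"TITLE-ABS-KEY({quoted})"
    if year_from is not None:
        # PUBYEAR > N matches publications from year N + 1 onward.
        query += f" AND PUBYEAR > {year_from - 1}"
    return query


def build_search_url(query, start=0, count=25):
    """Return the request URL for one page of results (no request is sent)."""
    params = {"query": query, "start": start, "count": count}
    return SCOPUS_SEARCH_URL + "?" + urlencode(params)


# Hypothetical search terms, for illustration only.
query = build_scopus_query(["moss", "lichen"], year_from=2010)
url = build_search_url(query)
print(query)
print(url)
```

Paging through the results would repeat the call with start advanced by count until the reported total is reached.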
Charles L.
100% (152)
- Projects completed: 2
- Freelancers worked with: 2
- Projects awarded: 100%
- Last project: 18 May 2023
New Zealand