
Convert a text file to a comma-delimited file ready for importing into Excel
£75(approx. $100)
- Proposals: 7
- Remote
- #901503
- Awarded
Description
Experience Level: Intermediate
Estimated project duration: 1 day or less
We had several PDFs that follow a similar pattern (heading, subtitle(s), text, picture and figcap), which we have now extracted to Word and then to a plain text file. We need a JavaScript, PHP or Python script that reads each line or chunk (LF to LF) and lets us select whether it is an H1 (heading), an H2 (subtitle), body text or a figcap. There is not much to say: headings, subtitles and figcaps are usually in capital letters with spaces, hyphens and/or (). Your script will allow us to set up a pattern that is then used to parse the rest of the plain text document, and the result will be saved (downloaded) as CSV (comma or tab) for easy import into Excel.
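As a rough illustration of the chunking and the "capital letters" check described above, here is a minimal Python sketch. The function names `split_chunks` and `looks_like_caps` are hypothetical, and the regular expression is only an assumption about what "cap letters with space or hyphen and or ()" means in practice; a subtitle such as "ABCDE (continued)" has lowercase letters and would need separate handling.

```python
import re

# Assumed shape of a heading/subtitle/figcap line: capital letters,
# spaces, hyphens and parentheses only (taken from the description).
CAPS_LINE = re.compile(r"[A-Z ()\-]+")


def split_chunks(text):
    """Split plain text into chunks separated by one or more blank lines."""
    normalized = text.replace("\r\n", "\n")
    parts = re.split(r"\n\s*\n", normalized)
    return [part.strip() for part in parts if part.strip()]


def looks_like_caps(chunk):
    """Return True if the chunk is a single line made only of caps, spaces, hyphens or ()."""
    has_letter = any(ch.isalpha() for ch in chunk)
    return has_letter and CAPS_LINE.fullmatch(chunk) is not None


if __name__ == "__main__":
    sample = "INTRO\n\nSome body text.\n\nFIG-ONE (A)"
    for chunk in split_chunks(sample):
        print(looks_like_caps(chunk), repr(chunk))
```

The check only separates caps-style chunks from body text. Telling a heading from a subtitle or a figcap still needs the manual selection step.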
Summary and suggested idea, but feel free to suggest your own. Remember that the headings, subtitles and figcaps are not named as you see in the examples; they are unique and different, apart from the (continued) subtitle.
The CSV format
Headings are only grabbed once per section, with no duplicates. Subtitles are unique unless they contain (continued); in that case, don't capture them in the CSV.
Heading, subtitle, text, figcaps.
ABCDEG, ABCDE, BLAH, FIGONE,FIGTWO
, FGHIJK, BLAH, FIGONE
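One possible reading of the CSV rules above, sketched in Python. It assumes the chunks have already been labelled as `heading`, `subtitle`, `text` or `figcap` (hypothetical label names), that a "(continued)" subtitle simply keeps adding to the current row, and that extra figcaps spill into extra columns as in the example rows. `build_rows` and `to_csv` are illustrative names, not part of any library.

```python
import csv
import io


def build_rows(items):
    """Turn (label, text) pairs into rows of heading, subtitle, text, figcaps."""
    rows = []
    current = None          # row currently being filled
    pending_heading = ""    # heading waiting to be written on the next new row
    seen_headings = set()
    for label, text in items:
        if label == "heading":
            # Write each heading once; repeated headings leave the column blank.
            pending_heading = "" if text in seen_headings else text
            seen_headings.add(text)
            current = None
        elif label == "subtitle":
            if "(continued)" in text.lower():
                continue  # assumption: do not capture, keep filling the current row
            current = {"heading": pending_heading, "subtitle": text, "text": [], "figcaps": []}
            pending_heading = ""
            rows.append(current)
        else:
            if current is None:
                # Text or figcap with no subtitle yet: start a row with a blank subtitle.
                current = {"heading": pending_heading, "subtitle": "", "text": [], "figcaps": []}
                pending_heading = ""
                rows.append(current)
            current["text" if label == "text" else "figcaps"].append(text)
    return [[r["heading"], r["subtitle"], " ".join(r["text"]), *r["figcaps"]] for r in rows]


def to_csv(rows, delimiter=","):
    """Serialize rows to CSV text; pass delimiter='\t' for a tab-separated file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["Heading", "Subtitle", "Text", "Figcaps"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    example = [
        ("heading", "ABCDEG"), ("subtitle", "ABCDE"), ("text", "BLAH"),
        ("figcap", "FIGONE"), ("figcap", "FIGTWO"),
        ("subtitle", "FGHIJK"), ("text", "BLAH"), ("figcap", "FIGONE"),
    ]
    print(to_csv(build_rows(example)), end="")
```

The `csv` module takes care of quoting, so body text that itself contains commas still imports cleanly into Excel.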
Read the text file into memory.
Read a line/chunk into a text box and show checkboxes (Chapter, Section, Subtitle, Text and FigCap).
Read the next line/chunk into another text box and show checkboxes (Chapter, Section, Subtitle, Text and FigCap).
Read the next line/chunk into another text box and show checkboxes (Chapter, Section, Subtitle, Text and FigCap).
Read the next line/chunk into another text box and show checkboxes (Chapter, Section, Subtitle, Text and FigCap).
Present a Parse button; once submitted, use the above pattern to filter the rest of the document into a CSV file, which is downloaded.
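The steps above leave open how the selected labels become a "pattern". One simple possibility, sketched here in Python without the text-box interface, is to remember which label followed which for caps and non-caps chunks, then replay that on the rest of the document. The names `learn_pattern` and `apply_pattern`, the label strings and the fallback rule are all assumptions made for illustration.

```python
import re

# Assumed shape of a caps-style chunk: capitals, spaces, hyphens, parentheses.
CAPS = re.compile(r"[A-Z ()\-]+")


def is_caps(chunk):
    return CAPS.fullmatch(chunk) is not None


def learn_pattern(labelled):
    """Map (is_caps, previous_label) to a label, using the hand-labelled chunks."""
    pattern = {}
    previous = None
    for chunk, label in labelled:
        # Keep the first label seen for each situation.
        pattern.setdefault((is_caps(chunk), previous), label)
        previous = label
    return pattern


def apply_pattern(pattern, chunks, previous=None):
    """Label the remaining chunks by replaying the learned transitions."""
    labelled = []
    for chunk in chunks:
        caps = is_caps(chunk)
        # Fallback (an assumption): unseen caps chunks become subtitles, the rest text.
        label = pattern.get((caps, previous), "subtitle" if caps else "text")
        labelled.append((chunk, label))
        previous = label
    return labelled


if __name__ == "__main__":
    selected = [
        ("CHAPTER ONE", "heading"),
        ("FIRST PART", "subtitle"),
        ("Body text.", "text"),
        ("FIG A", "figcap"),
    ]
    pattern = learn_pattern(selected)
    rest = ["SECOND PART", "More body.", "FIG B"]
    for chunk, label in apply_pattern(pattern, rest, previous="figcap"):
        print(label, "|", chunk)
```

With only four hand-labelled chunks the fallback does much of the work, so labelling a few more chunks, or letting the user correct the result before download, would likely make the parse more reliable.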
REGARDS
Eric B.
99% (32)
Projects completed: 48
Freelancers worked with: 44
Projects awarded: 25%
Last project: 23 Jun 2022
United Kingdom