Write a perl script to match traffic data and output
£220 (approx. $276)
- Proposals: 4
- Remote
- #1519728
- Completed
Description
Experience Level: Intermediate
General information for the business: Write a perl script to match traffic data and output
Description of requirements/functionality: The objective is to analyse the file of vehicles entering and leaving multiple sites, match the IN time with the OUT time, and write these to a '_good' file.
Some records will not be matchable, and these need to be written to a '_bad' file for further processing outside of this coding project.
This will be run on a macOS 10.12.4 (Sierra) machine running perl 5.18.
If necessary, I can install perl 5.24 and run it from a user directory.
I may also run this on an Ubuntu Linux box, but I am happy to assume portability.
I understand CPAN and can install perl packages, and I am fairly competent on the command line and can do most of what's needed. However, I'd prefer a clean, well-commented, simple perl job that a novice like me can understand.
The script should take a file name as input and write two new files, named after the input file with '_good' and '_bad' appended at the end. [The script should not append to any existing files of those names but replace them]
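A minimal sketch of the file naming and the replace-not-append behaviour described above. The sub names `output_names` and `open_for_replace` are hypothetical, and placing the suffix before a trailing ".csv" (rather than after the whole name) is an assumption made so that Excel still recognises the files.

```perl
use strict;
use warnings;

# Build the '_good' and '_bad' file names from the input file name.
# Assumption: the suffix goes before a trailing ".csv", so that
# "traffic.csv" becomes "traffic_good.csv"; for any other name the
# suffix is simply appended at the end.
sub output_names {
    my ($input) = @_;
    my ($base, $ext) = $input =~ /^(.*?)(\.csv)?$/i;
    $ext = '' unless defined $ext;
    return ("${base}_good$ext", "${base}_bad$ext");
}

# Open a file for writing. The '>' mode truncates an existing file,
# so earlier results are replaced rather than appended to.
sub open_for_replace {
    my ($name) = @_;
    open(my $fh, '>', $name) or die "Cannot write '$name': $!\n";
    return $fh;
}
```

With these, `my ($good_name, $bad_name) = output_names($ARGV[0]);` followed by two calls to `open_for_replace` would give the two output handles.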
I have provided an .xls file of examples, and .csv files for use.
The .csv files have had dos2unix run on them to ensure they are Unix-compliant files.
The input file is a CSV with the headings:
date,plate,site,direction,time,class,logo
Each field in an entry either has data or is blank, thus: [any blank lines should be ignored]
date,plate,site,direction,time,class,logo
30.04.2012,BF55WWE,Site 1,OUT,08:00:00,L,MGM
30.04.2012,FM02ZKE,Site 1,OUT,12:28:00,,
THIS DATA FORMAT CAN BE ADJUSTED FOR SEQUENCING IF NECESSARY.
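One possible way to read a line in this format is sketched below, using only core perl. The sub name `parse_line` is hypothetical. It assumes that no field contains a comma or quotes (as in the samples) and that a line holding only commas counts as blank; if fields might be quoted, a CPAN module such as Text::CSV would be the safer choice.

```perl
use strict;
use warnings;

my @FIELDS = qw(date plate site direction time class logo);

# Turn one line of input into a hash reference. Returns undef for
# blank lines and for the heading line.
# Assumptions: no field contains a comma or quotes, and a line made
# up only of commas and spaces is treated as blank.
sub parse_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                # drop the line ending
    return if $line =~ /^[\s,]*$/;       # blank line: ignore
    my @values = split /,/, $line, -1;   # -1 keeps trailing empty fields
    return if $values[0] eq 'date';      # heading line: ignore
    my %record;
    @record{@FIELDS} = map { defined $_ ? $_ : '' } @values[0 .. $#FIELDS];
    return \%record;
}
```

The negative limit passed to `split` matters here: without it, the empty class and logo fields at the end of a line would be silently dropped.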
All IN records that match an OUT record on date and plate and site should have parts of the IN and OUT sets of data written to a new '_good' file.
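The matching rule could be sketched as follows. This is one interpretation, not a settled design: records are grouped on date, plate and site, sorted by time within each group, and each IN is paired with the next OUT that follows it. Anything left over is treated as unmatched. The sub name `match_records` and the hash-per-record layout are assumptions for illustration.

```perl
use strict;
use warnings;

# Pair IN and OUT records that share date, plate and site.
# Assumptions: within a group, records are paired in time order; an IN
# is matched to the next OUT after it; leftovers are unmatched.
# Times are HH:MM:SS, so plain string comparison sorts them correctly.
sub match_records {
    my (@records) = @_;
    my (%groups, @good, @bad);
    for my $rec (@records) {
        my $key = join "\t", @{$rec}{qw(date plate site)};
        push @{ $groups{$key} }, $rec;
    }
    for my $key (sort keys %groups) {
        my $pending_in;    # most recent IN still waiting for an OUT
        for my $rec (sort { $a->{time} cmp $b->{time} } @{ $groups{$key} }) {
            if ($rec->{direction} eq 'IN') {
                push @bad, $pending_in if $pending_in;   # IN followed by IN
                $pending_in = $rec;
            }
            elsif ($rec->{direction} eq 'OUT' && $pending_in) {
                push @good, [ $pending_in, $rec ];       # IN then OUT
                undef $pending_in;
            }
            else {
                push @bad, $rec;                         # OUT with no IN
            }
        }
        push @bad, $pending_in if $pending_in;           # IN never closed
    }
    return (\@good, \@bad);
}
```

The sub returns a list of [IN, OUT] pairs and a list of unmatched records. A vehicle that visits the same site twice in a day yields two pairs under this rule, which may or may not be what is wanted.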
This new file will have the original data plus some additional data: one time calculation and two logical fields.
date,plate,site,direction_in,time_in,class_in,logo_in,direction_out,time_out,class_out,logo_out,time_diff,class,logo
date, the date shared by IN and OUT, only one need be captured
plate, the plate shared by IN and OUT, only one need be captured
site, the site shared by IN and OUT, only one need be captured
direction_in, the original IN direction [this may seem redundant but the field has other uses for us as well]
time_in, the original IN time
class_in, the original IN class [this may seem redundant but the field has other uses for us as well]
logo_in, the original IN logo [this may seem redundant but the field has other uses for us as well]
direction_out, the original OUT direction [this may seem redundant but the field has other uses for us as well]
time_out, the original OUT time
class_out, the original OUT class [this may seem redundant but the field has other uses for us as well]
logo_out, the original OUT logo [this may seem redundant but the field has other uses for us as well]
time_diff, the difference in time between time_in and time_out
class, this is logical, if class_in has value, use class_in, otherwise use class_out, even if also null
logo, this is logical, if logo_in has value, use logo_in, otherwise use logo_out, even if also null
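The three derived fields could be computed roughly as below. The sub names `to_seconds`, `time_diff` and `in_or_out` are hypothetical. The sketch assumes times are always HH:MM:SS, that time_out is never earlier than time_in, and that the logo field follows the same rule as class but using logo_in and logo_out.

```perl
use strict;
use warnings;

# Convert HH:MM:SS to seconds since midnight.
sub to_seconds {
    my ($time) = @_;
    my ($h, $m, $s) = split /:/, $time;
    return $h * 3600 + $m * 60 + $s;
}

# Difference between two times on the same date, formatted HH:MM:SS.
# Assumption: time_out is never earlier than time_in.
sub time_diff {
    my ($time_in, $time_out) = @_;
    my $diff = to_seconds($time_out) - to_seconds($time_in);
    return sprintf '%02d:%02d:%02d',
        int($diff / 3600), int($diff % 3600 / 60), $diff % 60;
}

# Use the IN value when it has one, otherwise the OUT value
# (which may itself be empty).
sub in_or_out {
    my ($in, $out) = @_;
    return (defined $in && $in ne '') ? $in : $out;
}
```

For example, `time_diff('08:13:00', '17:45:00')` gives `09:32:00`, and `in_or_out('', '2')` gives `2`.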
This is an example of the '_good' file:
date,plate,site,direction_in,time_in,class_in,logo_in,direction_out,time_out,class_out,logo_out,time_diff,class,logo
30.04.2012,AD07VRU,Site 1,IN,08:13:00,,,OUT,17:45:00,,,09:32:00,,
30.04.2012,AE53VPZ,Site 1,IN,13:15:00,,,OUT,18:40:00,,,05:25:00,,
30.04.2012,BL53JVU,Site 1,IN,09:28:00,2,HAMSONS CAR HIRE,OUT,09:37:00,2,HAMSONS CAR HIRE,00:09:00,2,HAMSONS CAR HIRE
30.04.2012,BL53JVU,Site 1,OUT,09:37:00,2,HAMSONS CAR HIRE,OUT,09:37:00,2,HAMSONS CAR HIRE,00:00:00,2,HAMSONS CAR HIRE
If a record in the input file does not match then it should be written to the '_bad' file in the same format as the original input.
We shall then manually edit it until it either does match, or we discard it, but that is not part of this coding.
The files will then be imported into Excel, so the resulting files must be comma-delimited files that import and look like the examples in this bundle.
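A small sketch of writing one comma-delimited line in a form Excel should import cleanly. The sub name `csv_line` is hypothetical, and quoting a field only when it contains a comma, a double quote or a line break is an assumption based on the usual CSV convention, not something the examples show.

```perl
use strict;
use warnings;

# Join fields into one comma-delimited line ending in a newline.
# Assumption: a field is quoted only when it contains a comma, a
# double quote or a line break; embedded quotes are doubled.
sub csv_line {
    my (@fields) = @_;
    for my $field (@fields) {
        $field = '' unless defined $field;
        if ($field =~ /[",\r\n]/) {
            $field =~ s/"/""/g;
            $field = qq{"$field"};
        }
    }
    return join(',', @fields) . "\n";
}
```

Each output row would then be written with something like `print {$fh} csv_line(@values);`, where `$fh` is a handle opened for writing.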
The files may contain up to 7000 lines of data at a time, but not more.
It is a requirement that any problems in the logic above are pointed out and better solutions suggested. The purpose is to process the data correctly and output the desired CSV files; I am not worried about being told I have made a mistake in my spec!
Tom
Specific technologies required: perl5
OS requirements: Mac OS, Linux
Extra notes:
Thomas Z.
100% (3)
Projects completed: 3
Freelancers worked with: 2
Projects awarded: 100%
Last project: 20 May 2017
United Kingdom
Clarification Board
Hi Thomas,
Thank you for the invitation to this project. Is it OK if I have a look at it tomorrow, as I'm finishing an urgent project right now?
Thank you,
Lucy