
AI Vision System - YOLO Dataset Preparation
- Proposals: 14
- Remote
- #4487291
- Open for Proposals

Description:
I am building an AI-based visual inspection system using real-world highway images and need support preparing a high-quality dataset for YOLO training.
This is a structured, multi-phase project. Accuracy, consistency, and attention to detail are critical.
⚠️ Important: This job will begin with Phase 1 only (image scrubbing and classification).
Further phases (annotation and YOLO dataset preparation) will follow based on performance.
---
Scope of Work:
Phase 1 – Image Scrubbing & Pre-Classification (Current Phase)
This is the most critical step of the project.
* Review large volumes of real-world highway images (hundreds of thousands available)
* Identify and filter out images that are not useful (no defects, irrelevant content, low-quality data)
* Sort and group images into the correct defect classifications based on provided examples
* Ensure consistency when assigning images to classes (similar defects must always be grouped the same way)
Note:
These are real-world inspection images. Multiple defect types may appear across different stretches of highway, and some images may contain no relevant defects at all. Strong judgment is required. Please message me if you have any questions. I need accuracy and strong attention to detail.
---
Phase 2 – Image Annotation (Future Phase)
* Use LabelMe to annotate images
* Draw bounding boxes or polygons depending on object type
* Label objects according to a predefined class list (34–36 classes)
* Follow strict naming and labeling conventions
---
Phase 3 – Dataset Preparation for YOLO (Future Phase)
* Convert LabelMe annotations into YOLO format
* Ensure correct class IDs and structure
* Organize dataset into train/ and val/ folders
* Verify all images have matching label files
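The conversion step in Phase 3 could look roughly like the sketch below. It assumes the standard LabelMe JSON layout (`imageWidth`, `imageHeight`, and a `shapes` list with `label` and `points`) and the YOLO detection format (`class_id x_center y_center width height`, all normalized to 0..1). Polygons are reduced to their enclosing box here, which is an assumption; the project may instead want polygon output for some classes. The class names are placeholders, not the actual 34–36 class list.

```python
import json
from pathlib import Path

# Placeholder class list (assumption); the real list has 34-36 classes.
# The index of each name in this list becomes its YOLO class ID.
CLASS_NAMES = ["bache", "grieta_longitudinal", "agrietamiento_por_fatiga"]


def labelme_to_yolo_lines(doc, class_names):
    """Convert one parsed LabelMe JSON document into YOLO detection lines."""
    width = doc["imageWidth"]
    height = doc["imageHeight"]
    lines = []
    for shape in doc["shapes"]:
        # Raises ValueError for a label that is not in the class list,
        # which surfaces naming mistakes instead of hiding them.
        class_id = class_names.index(shape["label"])
        xs = [point[0] for point in shape["points"]]
        ys = [point[1] for point in shape["points"]]
        # Enclosing box of the shape, clamped to the image bounds.
        x_min, x_max = max(0.0, min(xs)), min(float(width), max(xs))
        y_min, y_max = max(0.0, min(ys)), min(float(height), max(ys))
        x_center = (x_min + x_max) / 2 / width
        y_center = (y_min + y_max) / 2 / height
        box_w = (x_max - x_min) / width
        box_h = (y_max - y_min) / height
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"
        )
    return lines


def convert_file(json_path, out_dir, class_names):
    """Write <stem>.txt in out_dir for one LabelMe <stem>.json file."""
    json_path = Path(json_path)
    doc = json.loads(json_path.read_text(encoding="utf-8"))
    out_path = Path(out_dir) / (json_path.stem + ".txt")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    lines = labelme_to_yolo_lines(doc, class_names)
    out_path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return out_path
```

Calling `convert_file` once per annotation file, with the output directory set to the matching train or val label folder, would produce one label file per image with the same file stem.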
---
Phase 4 – Quality Control (Ongoing)
* Ensure labels are accurate and consistent
* Avoid missing or incorrect annotations
* Perform validation before delivery
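A pre-delivery validation pass might be automated along these lines. This is a minimal sketch, assuming one image folder and one label folder per split, label files named after the image stem, and YOLO detection lines with five fields. The function names, the folder arguments, and the list of image extensions are illustrative assumptions, not part of the project SOPs.

```python
from pathlib import Path

# Image extensions to look for (assumption; adjust to the camera output).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_label_line(line, num_classes):
    """Return an error message for a bad YOLO detection line, else None."""
    parts = line.split()
    if len(parts) != 5:
        return "expected 5 fields"
    try:
        class_id = int(parts[0])
        coords = [float(value) for value in parts[1:]]
    except ValueError:
        return "non-numeric field"
    if not 0 <= class_id < num_classes:
        return "class id out of range"
    if any(not 0.0 <= value <= 1.0 for value in coords):
        return "coordinate outside [0, 1]"
    return None


def check_split(images_dir, labels_dir, num_classes):
    """List problems found in one split (for example train or val)."""
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    images = {
        p.stem for p in images_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    }
    labels = {p.stem for p in labels_dir.glob("*.txt")}
    problems = []
    # Images without a label file, and label files without an image.
    problems += [f"missing label: {stem}" for stem in sorted(images - labels)]
    problems += [f"orphan label: {stem}" for stem in sorted(labels - images)]
    # Line-level checks for every matched pair.
    for stem in sorted(images & labels):
        text = (labels_dir / f"{stem}.txt").read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if not line.strip():
                continue  # ignore blank lines
            error = check_label_line(line, num_classes)
            if error:
                problems.append(f"{stem}.txt:{number}: {error}")
    return problems
```

An empty list from `check_split` for both the train and val folders would indicate that every image has a matching label file and every label line is well formed. It does not confirm that the boxes are drawn on the right objects, so visual spot checks are still needed.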
---
Class System:
* 34–36 defect classes
* Each class will be provided with example images
* All classes are important — the goal is to reflect real-world conditions, not prioritize a subset
* Consistency across similar defect types is critical
---
Requirements:
* Experience reviewing or organizing large image datasets
* Strong attention to detail and consistency
* Ability to follow structured instructions and class definitions
* Familiarity with LabelMe or similar tools is a plus
* Basic understanding of YOLO format is a plus (required for later phases)
---
Deliverables (Phase 1):
* Scrubbed and filtered image sets
* Images grouped into correct classifications
* Clean and organized folder structure
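One possible way to sanity-check a Phase 1 delivery is sketched below. It assumes a layout with one subfolder per defect class under a single root, which is an assumption rather than a stated requirement. The script counts images per class and lists byte-identical files that ended up in more than one class folder. Such files are only flagged for review, since an image showing several defect types may legitimately belong to more than one class.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def summarize_class_folders(root):
    """Return (counts per class folder, class groups sharing identical files)."""
    counts = {}
    seen = defaultdict(set)  # content hash -> class folders containing that file
    class_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        files = [f for f in class_dir.iterdir() if f.is_file()]
        counts[class_dir.name] = len(files)
        for file_path in files:
            digest = hashlib.sha256(file_path.read_bytes()).hexdigest()
            seen[digest].add(class_dir.name)
    # Keep only files whose content appears under more than one class.
    conflicts = sorted(
        sorted(classes) for classes in seen.values() if len(classes) > 1
    )
    return counts, conflicts
```

The counts show which classes still lack enough examples. The hash comparison only catches exact copies; re-encoded or resized duplicates would need a perceptual comparison instead.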
---
Volume:
* Very large dataset (hundreds of thousands of images available)
* Initial batches will be provided for Phase 1
* Only a subset of images will move forward to annotation
* Potential for ongoing work across multiple phases
---
To Apply:
Please include:
* Confirmation that you understand Phase 1 is focused on image scrubbing and classification
* Your approach to reviewing and filtering large image datasets
* Your expected turnaround time for an initial batch
* Any relevant experience with image datasets or annotation work
---
Test Task:
A small test batch will be provided.
You will be asked to scrub and classify images based on provided examples.
---
Notes:
* Accuracy and consistency are more important than speed
* This is part of a larger AI system — data quality is critical
* Strong performance in Phase 1 may lead to continued work in annotation and dataset preparation phases
Marco T.
Clarification Board
-

- How clearly defined are your defect classes at this stage—do you already have strict classification guidelines, or would you like help refining edge cases where defects may overlap or appear ambiguous?
- What criteria should be used to reject images? Beyond obvious issues like low quality or irrelevance, are there specific thresholds (e.g., defect visibility, size, or clarity) that determine whether an image is usable?
Marco T. (Fri 11:56am): I have strict guidelines per the project SOPs, and I have samples of each classification.
-

Could you please specify the approximate number of images included in the Phase 1 batch?
Marco T. (Fri 11:59am): I have hundreds of thousands of images. We are using real-world images from the cameras installed in our vehicles. They are all shot at the same size, and there are no blurry images. The easiest part is removing the ones with no deterioration, which eliminates 70 to 80% of the images. From there, it is a matter of finding enough images of each classification.
Vijeet D. (Fri 2:40pm): OK, what I meant to ask is: your budget is $300 for how many images? It is unrealistic for hundreds of thousands of images.
-

Hi Marco,
Can you please share the test batch so we may perform scrubbing & classification?
Looking forward to your reply.
Best Regards,
VConn Pvt Ltd
-

Where are the attached files listed below?
batch data freelancer.zip
sample 1.mp4
asf_agrietamiento_fatiga.pdf
corriemento o ondulaciones.png
bache.png
agrietamiento por fatiga.png
grieta longitudinal.png