
Data Science Projects
Looking for freelance data science jobs and project work? Browse active opportunities on PeoplePerHour, or hire data scientists through Toptal’s rigorously vetted talent network.
featuredopportunity
Interactive AI Experience – 3D Guide & Custom Image Gen
I am an artist developing a browser-based interactive ritual experience where a 3D speaking character guides participants through a reflective AI-driven dialogue about the future. At the end of the interaction, the system produces: • A symbolic, poetic spoken response • One AI-generated image based on the participant’s clarified vision, rendered in a custom visual style trained on my artwork This is a poetic, immersive digital art experience, not a generic chatbot or commercial tool. Deliverable: A mini website / web module that can be integrated into an existing website (for example, as a subpage or subdirectory). Scope Clarification The generated images will later be shown in a separate digital “wall” project built by another team. This job does NOT include building that wall interface. Your responsibility is to: ✔ Generate the images ✔ Store them with structured metadata ✔ Make them exportable for future integration Technical Constraints (Non-Negotiable) - • Open-source / open-weight AI models only (LLM, image generation, TTS, STT) • Self-hosted deployment on my infrastructure (Hetzner servers) • No proprietary AI APIs Core User Experience Flow - - Short conceptual intro animation - 3D character appears and speaks, introducing the ritual - User selects one of five thematic prompts - User shares a vision (text input; voice input optional bonus) - AI-guided dialogue (2–4 turns) to clarify the scenario - Final symbolic spoken response from the character - One AI-generated image created from the clarified vision - Session data saved for archive and future visual display Technical Requirements - Frontend (Mini Website) • Immersive but lightweight interface • Smooth transitions between stages • Audio playback (music + character voice) • Responsive design (desktop + mobile) • Built using React / Next.js or similar 3D Speaking Character - • WebGL / Three.js / A-Frame (or similar) • Rigged character model (provided) • Idle animation • Speaking animation synced to audio (lip sync preferred, amplitude-based acceptable for MVP) AI Dialogue System (Open-Source LLM) - • Self-hosted open-weight model • Multi-turn conversation handling • Structured prompting system • Outputs: – follow-up prompts – final poetic response – structured summary for image generation Voice System (Open-Source TTS) - • Open-source text-to-speech hosted on server • Audio drives speaking animation Custom Style Image Generation - The generated image must consistently match a custom artistic visual language based on my artwork. Prompting alone is not enough. You must implement: Preferred: LoRA training using my artwork dataset Alternative: Style adapter / reference conditioning Requirements: • One image per session • Seed reproducibility • Style strength control • Save prompt + generation parameters Backend & Storage Store for each session: • Selected prompt theme • Dialogue transcript • Final spoken response • Scenario summary • Image prompt + parameters • Generated image file • Timestamp Admin Panel Simple password-protected page to: • View sessions • Download text and images Deployment Requirements • Linux deployment on Hetzner • Docker / Docker Compose preferred • Documentation for: – setup – model downloads – environment variables – running services – updating style model Project Timeline Total duration: 2 months Skills Required • Web 3D (Three.js / A-Frame / WebGL) • Experience integrating animated 3D characters in the browser • Experience serving open-source LLMs • Diffusion model LoRA or adapter training • Backend/API development • Docker + Linux deployment How to Apply Please include: 2–3 relevant projects (AI apps, WebGL/WebXR, or interactive experiences) Proposed tech stack (frontend, backend, model serving) Which open models you would use (LLM, diffusion, TTS) and why Recommended server setup (GPU/VRAM) for acceptable performance Screening Questions How would you sync speech audio to a 3D character animation in the browser? Which open-weight LLM would you deploy and how would you serve it? How would you train and deploy a custom style LoRA for image generation? What server setup would you recommend and why?
Science Thesis Proofreading
I’m preparing my sciences-based thesis for submission and want a fresh set of expert eyes on the full manuscript. Your task is to proofread the document in English—grammar, punctuation, flow, and consistency—while respecting the technical terminology of my field. I write and speak both English and Arabic, so if you’re comfortable handling short bilingual sections such as the abstract or key terms that appear in both languages, that will be a plus, though the core text is entirely English. Please work directly in Word using Track Changes, then return two files: one showing every edit and one clean, ready-to-submit version. A seven-day turnaround from the moment we start keeps me on schedule. Acceptance criteria • All edits visible in Track Changes, no unresolved comments • Readability improved without altering scientific meaning • Final file free of spelling or grammatical errors (target: near-zero Grammarly flags) • Bilingual abstract, where touched, reads naturally in both languages Let me know about your experience with academic theses or dissertations in the sciences and your typical word-count capacity per day. I’m eager to work with someone who can polish this document to publication quality.
9 days ago23 proposalsRemoteopportunity
Data Analyst
Position Overview: We are seeking a Data Analyst to transform raw data into actionable insights. In this role, you will analyze complex datasets, create reports, and develop visualizations to support business decisions and optimize processes. Key Responsibilities: Collect, clean, and organize data from multiple sources. Analyze datasets to identify trends and insights. Create reports and dashboards using BI tools (Tableau, Power BI, etc.). Design visualizations to communicate findings clearly. Collaborate with teams to provide data-driven recommendations. Ensure data integrity through audits and validation checks. Required Skills and Experience: Education: Bachelor’s degree in Data Science, Statistics, or related field. Experience: 1–3 years in data analysis or a related field. Technical Skills: Proficiency in SQL and data analysis tools (Excel, Python, R, etc.). BI Tools: Experience with data visualization tools (Tableau, Power BI). Analytical Skills: Strong ability to interpret data and provide actionable insights. Communication: Ability to present complex data simply and effectively. Nice to Have: Experience with machine learning or predictive analytics. Familiarity with data warehousing and ETL processes.
5 days ago24 proposalsRemoteData classification
I will share a purely categorical dataset and need it turned into a clear, well-documented end-to-end classification workflow that I can study for academic purposes. Using Python with Pandas, NumPy, scikit-learn, and visualisations in Matplotlib or Seaborn, start with an exploratory review, handle all cleaning and preprocessing (encoding, missing values, feature selection), then build and compare suitable classification models. Sound evaluation—accuracy, precision, recall, F1 or any metric you judge relevant—must accompany the models, followed by a concise discussion of the results and why a particular approach performs best. Please highlight your experience with similar projects when you respond; I value demonstrated know-how over long proposals. Deliverables I expect: • A well-commented Jupyter notebook covering EDA, preprocessing, model training, and evaluation • The cleaned dataset (or the code that generates it) • A brief markdown or slide deck that walks through the methodology, findings, and recommended next steps Clarity of explanation is just as important as model accuracy, as the primary goal is learning from your workflow.
5 days ago11 proposalsRemoteData Scraping
We are looking for a freelancer with experience in data extraction and web automation to collect a list of registered businesses from a Laravel-based platform that requires login. I have valid login credentials (my own account). The task includes: Logging in using provided credentials Accessing the authenticated business listing Handling pagination to retrieve all entries Exporting the data to CSV or Excel
9 days ago32 proposalsRemoteSenior Data Engineer
We are seeking a Senior Data Engineer to design, implement, and optimize data pipelines utilizing Scala, Spark, and Java. The ideal candidate will develop and maintain real-time data processing systems essential for business operations. Collaboration with data scientists and analysts is crucial to understand data requirements and deliver high-quality solutions. Responsibilities include ensuring data quality through robust testing, monitoring workflows, and troubleshooting pipelines. Candidates should possess a degree in Computer Science or Engineering, with proven experience in data engineering, real-time processing, and SQL proficiency. Familiarity with cloud platforms and data governance is preferred. We offer a competitive salary, benefits, and opportunities for professional growth in a collaborative environment. Key Responsibilities: - Design, implement, and optimize data pipelines using Scala, Spark, and Java. - Develop and maintain real-time data processing systems to support business-critical operations. - Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver high-quality solutions. - Ensure data quality and reliability through robust testing and validation procedures. - Monitor and troubleshoot data pipelines and workflows to ensure high availability and performance. - Stay current with emerging technologies and industry best practices to continuously improve our data infrastructure. Qualifications: -Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field. - Proven experience with Scala, Spark, and Java in a data engineering or similar role. - Strong understanding of real-time data processing and streaming technologies. - Experience with big data platforms and tools such as Hadoop, Kafka, and Flink is a plus. - Proficiency in SQL and experience with relational databases. - Excellent problem-solving skills and attention to detail. - Strong communication and collaboration skills to work effectively with cross-functional teams. Preferred Skills: - Experience with cloud platforms (AWS, Azure, Google Cloud) and their data services. - Knowledge of data warehousing solutions and ETL processes. - Familiarity with data governance and security best practices.
18 days ago18 proposalsRemoteEgocentric Video Data Collection
Project Overview: We are collecting egocentric (first-person POV) video data of everyday household tasks recorded in real residential environments. These videos will be used for human-to-robot / humanoid training datasets. Recording Requirements (Mandatory): First-person (POV) recording using a head-mounted smartphone or equivalent device Recording must reflect natural human viewpoint Handheld, tripod, or surface-mounted recordings are NOT allowed Video Specifications: Resolution: Minimum 1920×1080 (1080p) Orientation: Landscape only Frame Rate: 30 FPS minimum (60 FPS preferred) No slow motion No vertical videos Visibility & Framing Rules: Hands and objects must be clearly visible at all times Camera angle slightly downward No blurred or out-of-focus footage No faces visible (including mirrors, photos, reflections) No IDs, documents, personal data Content Rules: Minimum video duration: 4–5 minutes per video No idle time longer than 3 seconds Only meaningful household tasks (kitchen, living room, bedroom, bathroom.) No eating, drinking, washing hands, sitting idle, or unrelated activities Editing Rules: Raw video files only accepted No trimming No filters No overlays, effects, or text, audio Rejection Criteria - Videos will be rejected if they include: Vertical videos Slow-motion recording Phone held by hand or placed on a surface Camera zoomed in too much Blurry, unfocused, or unstable footage Idle time longer than 3 seconds Edited or trimmed videos Eligibility: Open to international freelancers Must have access to a residential home environment Ability to follow strict technical guidelines is required Before Applying - Applicants must confirm: They have a head-mounted recording setup Their device supports 1080p @ 30 FPS and above. They agree to follow all recording rules strictly Deliverables: Raw video files only Uploaded to assigned cloud folder, sorted by environment Payment Details: Payment only for approved hours and rejected video hours are not paid. Job Types: Part-time, Fresher, Contractual / Temporary, Freelance, Volunteer Contract length: 6 weeks Benefits: Flexible schedule Work from home Work Location: Remote
5 hours ago4 proposalsRemoteData input to a website
I'm a real estate agent. I have about 100 properties that I needed loaded to my website. This will include 100's of photos, property descriptions, property details, etc. I need a reliable freelancer able to do this professionally and accurately. Must be completed quickly
8 days ago67 proposalsRemoteI need somone to clean my list of data
I have an email list of approximately 35,000 email addresses and I need someone to clean the data. This can be done either manually or using an automated process — I’m flexible on the method. The requirements are: 1. Remove all duplicate email addresses 2. Remove any invalid email addresses or addresses that are likely to bounce I’m not concerned about how this is carried out, as long as the final list of email addresses is accurate and fully cleaned WITH NO INVALD OR OR EMAILS THAT BOUNCE.
4 days ago67 proposalsRemoteMarine Event Prediction System (Environmental Data + ML)
I am building a system to log marine-related events at specific locations and times, automatically enrich those events with historical environmental data, and use the growing dataset to predict the likelihood of similar events occurring under comparable future conditions. This project is focused on environmental pattern detection, particularly phase transitions and converging conditions, not static averages. The system will be developed in clear milestones and must be modular and expandable.
a day ago16 proposalsRemoteopportunity
HEADTEACHERS Primary school email addresses
Please provide email contact data for the following roles within London and Surrey Primary and Secondary Schools: Headteachers Deputy Headteachers Business Managers PE Teachers School Offices Before proceeding, please confirm the total data count and the date of the last data cleanse. It is essential that the data is restricted to London and Surrey only, with no records from other regions. Please also confirm data turnaround time or if data is readily available
a day ago20 proposalsRemoteMultilingual audio data collection project
We are conducting a multilingual audio data collection project and are seeking native speakers from specific regions to participate. The project involves recording natural, high-quality voice samples in your native language and regional accent. We are currently looking for native speakers of English (Ireland, New Zealand, Scotland, South Africa, Wales, Singapore), German (Switzerland), Chinese (Hong Kong), and Cantonese (China, Hong Kong). Participants must be born and raised in the respective region to ensure authentic pronunciation and accent accuracy. This is a remote, freelance opportunity suitable for individuals with clear speech and access to a quiet recording environment.
6 days ago5 proposalsRemoteExperience-Focused CV Writing
A fresh Sport Science graduate known for discipline, accountability, and strong interpersonal skills gained through years of athletic training and community-facing activities. I need a professional to draft a polished resume or CV that brings my experience and internship history to the forefront. My goal is to position myself competitively for roles that require public service, teamwork, and the ability to perform under pressure. The document should clearly convey: • the practical skills I honed while studying and coaching, • any internships or part-time roles that prove I can handle real-world responsibilities, • the communication and leadership qualities sport has instilled in me. Please shape the content so recruiters instantly grasp the value I offer, structure it for easy scanning, and deliver the final file in an editable format (DOCX or Google Docs) plus a clean PDF ready to send.
17 hours ago15 proposalsRemoteData scraping needed
I have several directories I need to obtain contact information from... Can anyone help me? I will have all the sites for the person who will help!
20 days ago67 proposalsRemoteDeveloper UK/US Financial News Screener (Python / APIs / Data)
PROJECT OVERVIEW I’m building a financial stock news screener focused initially on UK (AIM & Main Market) and then later US small-cap stocks. The system is designed to filter genuinely market-moving news from daily noise. This is an ongoing build, not a one-off task. I’m looking for one capable individual developer (not an agency) who can own the technical implementation and iterate with me as the logic evolves. Where necessary, I want the option to work side-by-side in person for short periods, so UK or Europe location matters. WHAT THE SYSTEM DOES (HIGH LEVEL) - Ingests real-time and scheduled stock news from multiple APIs (UK & US) - Parses and classifies news using firstly rules, secondly AI - Scores news for market impact (positive / negative / neutral) - Flags dilution, governance and risk signals - Outputs structured alerts and internal dashboards This is a logic-heavy, data-driven project, not a UI-first build. REQUIRED SKILLS (MUST-HAVE) Please do not apply unless you are comfortable with most of the following: - Python (primary language) - Working with REST APIs (news, market data, etc.) - Data parsing, filtering and scoring logic - SQL or NoSQL databases (Postgres, MongoDB or similar) - Clean, readable, maintainable code - Git / version control - Comfortable discussing system design and trade-offs Experience with financial data, trading tools, or market news is a major plus. NICE TO HAVE (NOT ESSENTIAL) - Experience with financial markets, trading, RNS or SEC filings - AI / LLM-based text classification - Elasticsearch or similar search tools - AWS or cloud deployment - Previous SaaS or data-platform builds LOCATION REQUIREMENT (IMPORTANT) You must be based in or near one of the following: - UK - Spain - Western or Southern Europe (easy travel to UK or Spain) Due to the likely evolving nature of the project I want the option to meet in person and potentially work together for short, focused periods if required. Please clearly state your location when applying. ENGAGEMENT TYPE - Individual freelancer only (no agencies) - Long-term potential if the fit is right - Paid hourly or milestone-based (open to discussion) HOW TO APPLY (VERY IMPORTANT) To avoid generic applications, please include: 1. A short description of a similar data-driven or API-heavy project you’ve worked on 2. Your primary tech stack 3. Your current location 4. Your hourly rate 5. Confirmation that you are open to occasional in-person collaboration Applications that ignore this will not be considered. WORKING STYLE I’m an equities/indices trader with 30 years experience, reasonably technical but not a developer. I value clear thinking, no nonsense honest communication, and someone who challenges bad ideas rather than blindly coding them.
a day ago13 proposalsRemoteAdministrative Support @Data Management Microsoft Word & Excel
I am looking for professional support in customer service and administrative tasks using Microsoft Office tools. The project includes handling customer inquiries, organizing data, and preparing well-formatted documents to support daily business operations. The required tasks include: • Responding to customer messages professionally • Data entry and information organization • Creating and formatting documents using Microsoft Word • Managing simple Excel sheets for tracking data or tasks • Providing general administrative support Accuracy, clear communication, and timely delivery are essential for this project. The goal is to ensure organized data, professional customer interaction, and smooth workflow using reliable office tools.
5 days ago20 proposalsRemoteopportunity
We are building a Data Center Construction Intelligence Platform designed to support large-scale commercial and data center construction projects across the U.S. This is not a simple app or MVP. We are looking for a high-level, long-term technical partner to help architect and build a continuously updating construction platform that integrates: Construction data Cost models Vendor and material intelligence AI-assisted workflows Ongoing updates as standards, pricing, and specs change Think Procore + Togal + industry-specific AI, purpose-built for data center construction.What the App Will Do (Core Functions)
20 hours ago0 proposalsRemoteopportunity
We are preparing to build a Data Center Construction Intelligence Platform. Before development begins, we need an initial, authoritative dataset covering previously built and currently active data center construction projects in the U.S. This dataset will be the foundation of a platform that is designed to update monthly over time. This role is research and data structuring only — no application development, no scraping tools, and no automation required. Data Sources (Representative, Not Exhaustive) Research should be based on reputable industry sources, such as: BuilderConnected Dodge Construction Network CRANE Construction Intelligence Public owner / developer disclosures Industry reports and trade publications Publicly available permitting and project announcements ⚠️ Important: You are not required to have paid access to all platforms. We are looking for structured summaries and metadata, not proprietary dumps. Scope of Work 1️⃣ Data Center Project Inventory Compile structured information on: Previously built data centers Currently active / under-construction data centers Where available, capture: Location (city, state) Project status (completed / active) Facility type (enterprise, hyperscale, colocation, edge, etc.) Approximate scale (high-level) General delivery model (design-build, CM, etc.) 2️⃣ Construction Systems & Assemblies (Project-Informed) Using real-world project data, organize: Structural systems Electrical & power infrastructure Mechanical / cooling strategies Fire protection approaches Low-voltage & networking considerations Focus on patterns and common approaches, not engineering drawings. 3️⃣ Tier-Level & Redundancy Context (High Level) Where identifiable, note: Tier I–IV alignment (if stated or implied) Redundancy concepts (N, N+1, 2N, etc.) How redundancy impacts construction scope This should remain descriptive, not technical design. 4️⃣ Cost & Schedule Drivers (Qualitative) Based on project patterns, identify: Primary cost drivers Schedule risk factors Regional labor/material sensitivity Supply chain and lead-time influences No exact pricing required. 5️⃣ Structured, App-Ready Data Format (Critical) All information must be delivered in structured spreadsheet format, with clearly defined columns such as: project_name location project_status facility_type system_category assembly_or_component tier_or_redundancy_level primary_cost_drivers schedule_risk_factors data_source notes This dataset will later feed an application and AI system. Deliverables You must provide: Google Sheets or Excel file(s) Clean, consistent structure and naming CSV export Short summary document explaining: Data sources used Assumptions and limitations Recommendations for monthly updates No PDFs. No slide decks. No scraped raw dumps. Ongoing Monthly Updates (Future Work) After the initial dataset is delivered, we plan to: Update the dataset monthly Add newly announced or completed projects Refine patterns as the dataset grows Please indicate in your proposal: Whether you are open to a monthly update engagement Your estimated monthly fee range for ongoing updates Ideal Candidate Experience with construction, infrastructure, or industrial research Familiarity with construction intelligence platforms or trade data Strong data organization skills Methodical, detail-oriented approach Comfortable citing and tracking data sources Screening Question (Required) How would you gather and structure authoritative construction project data so it can be updated monthly and later used in an application or AI system? Generic responses will not be considered.Selection Criteria We will prioritize candidates who: Demonstrate structured thinking Understand construction project data Respect data-source boundaries and licensing Clearly explain how they will organize and maintain the dataset
20 hours ago0 proposalsRemoteopportunity
Data Validation & Research Admin (Virtual Assistant)
We are running a 3 to 6 month data quality control project for a new database, which can be extended based on performance. We need support from a qualified research and data quality control virtual assistant. Check and validate structured data before it’s published Spot and fix discrepancies across different sources Track team deliverables and make sure deadlines are hit Flag any quality issues or gaps early Keep simple documentation/audit trails of changes Put together a weekly data quality & research progress report Suggest small process improvements to make the workflow smoother Required Experience: Experience with data quality, research support, or admin work Super detail-oriented and organised Able to manage deadlines across multiple people/projects Strong Excel skills, including formulas and basic data analysis High numeracy and attention to detail Basic understanding of finance concepts Methodical, process-driven, likes keeping things neat Bonus: experience handling structured datasets Success Metrics: Team research deadlines are consistently met Clear improvement in data consistency and accuracy Weekly reports are trusted and relied on Fewer ad hoc data issues or escalations Team works with clearer expectations and accountability Payment & Commitment: Initial rate: £120 per 10 hours (DOE), can be increased after initial successful performance review Minimum commitment: 10 hours per month Performance bonuses available
11 days ago39 proposalsRemoteopportunity
.step model from scan date - racing yacht
I have detailed scan data from a RTC360 of a racing yacht and require a high precision .step model.
3 days ago10 proposalsRemoteExpert Data Researcher Needed – UK SPV Landlord Database
I’m looking for a highly competent, expert-level data researcher to carry out an ad-hoc project collating UK landlords operating through SPV (Special Purpose Vehicle) limited companies. The task is to build a clean, well-structured spreadsheet (any format that can be imported into Excel/Google Sheets) containing verified data sourced primarily from Companies House and/or other reliable UK property/ownership databases. IS THERE A WAY TO INCLUDE CONTACT DETAILS - EMAIL / /ADDRESS / TELEPHONE? Key Requirements: Proven experience using Companies House and corporate ownership databases Strong understanding of UK property SPV structures Ability to identify and filter landlord companies accurately Excellent data hygiene and validation skills Able to work independently and deliver quickly (ASAP) Deliverable: A spreadsheet containing SPV landlord company data Clearly labelled columns (e.g. company name, number, directors, registered address, SIC codes, property links if available, etc.) Scalable structure so the dataset can be expanded later Project Type: Fixed price (please quote based on your expected hours) Ad-hoc with potential for repeat work Ideal Candidate: Advanced researcher or analyst Prior experience in property, financial due diligence, or corporate intelligence Comfortable working with large datasets and complex filtering logic Please include: Relevant experience with Companies House or similar databases Example of similar work (if available) Your proposed fixed price and estimated turnaround time
7 days ago20 proposalsRemote