
Software Engineer — AI Benchmark & Test Developer (Remote, EST)
$150
- Posted:
- Proposals: 26
- Remote
- #4496497
- Expired
Description
Experience Level: Entry
Experienced Software Engineer needed to develop AI coding benchmarks: analyze real GitHub PRs, implement robust fixes, author rigorous tests, and craft clear LLM evaluation prompts within reproducible Docker environments.

You will follow a disciplined 7-step workflow that includes reviewing PRs, running and validating repos in Docker, implementing golden solutions on a dedicated branch, writing fail-to-pass tests, composing metadata and prompt variants, and opening a three-commit PR.

You must be available during EST business hours, be proficient in multiple languages (e.g., Python, JavaScript/TypeScript, Go, Ruby, Java), and be skilled with Git, testing frameworks, Docker, and CI.

Ideal candidates practice meticulous TDD, contribute to open source, and communicate proactively, producing high-quality, reproducible deliverables validated by automated checks.
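
For applicants unfamiliar with the term, a fail-to-pass test is usually understood as one that fails on the original code and passes once the golden solution is applied. The sketch below is only an illustration of that idea, not the client's actual tooling: the function name `classify_tests`, the report keys, and the sample test names are all hypothetical, and it assumes you have already collected per-test pass/fail results from two runs (before and after the fix).

```python
from typing import Dict, List


def classify_tests(before: Dict[str, bool], after: Dict[str, bool]) -> Dict[str, List[str]]:
    """Compare per-test results (True = passed) from runs before and after a fix.

    Hypothetical helper for illustration; a test missing from either run
    is treated as failing in that run.
    """
    report: Dict[str, List[str]] = {
        "fail_to_pass": [],   # failed before the fix, pass after it
        "pass_to_pass": [],   # passed in both runs (no regression)
        "regressions": [],    # passed before, fail after
        "still_failing": [],  # failed in both runs
    }
    for name in sorted(set(before) | set(after)):
        passed_before = before.get(name, False)
        passed_after = after.get(name, False)
        if passed_after and not passed_before:
            report["fail_to_pass"].append(name)
        elif passed_after and passed_before:
            report["pass_to_pass"].append(name)
        elif passed_before and not passed_after:
            report["regressions"].append(name)
        else:
            report["still_failing"].append(name)
    return report


# Example with made-up test names: one new test exercises the bug,
# one existing test guards unrelated behaviour.
before_fix = {"test_parse_empty_input": False, "test_parse_basic": True}
after_fix = {"test_parse_empty_input": True, "test_parse_basic": True}
report = classify_tests(before_fix, after_fix)
```

In this made-up example, `report["fail_to_pass"]` holds the new test and `report["regressions"]` is empty, which is roughly the shape an automated check would be expected to look for.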
William D.
0% (0) Projects Completed
-
Freelancers worked with
-
Projects awarded
0%
Last project
5 Jul 2026
United States
Clarification Board

A few quick questions:
1. Which programming languages/repositories will be the primary focus initially?
2. Do you already have benchmark formatting guidelines/templates prepared?
3. Will the work mainly involve backend repos, full-stack apps, or mixed repositories?