
Software Engineer — AI Benchmark & Test Developer (Remote, EST)
$150
- Posted:
- Proposals: 18
- Remote
- #4496497
- Open for Proposals
Description
Experience Level: Entry
Experienced Software Engineer needed to develop AI coding benchmarks: analyze real GitHub PRs, implement robust fixes, author rigorous tests, and craft clear LLM evaluation prompts within reproducible Docker environments. Follow a disciplined 7-step workflow—review PRs, run and validate repos in Docker, implement golden solutions on a dedicated branch, write fail-to-pass tests, compose metadata and prompt variants, and open a three-commit PR. Must be available during EST business hours, proficient in multiple languages (e.g., Python, JavaScript/TypeScript, Go, Ruby, Java), skilled with Git, testing frameworks, Docker, and CI. Ideal candidates practice meticulous TDD, contribute to open source, and communicate proactively, producing high-quality, reproducible deliverables validated by automated checks.
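For applicants unfamiliar with the term, a "fail-to-pass" test is one that fails on the repository before the fix and passes once the golden solution is applied. The sketch below shows one way such a check could be expressed. It assumes test outcomes are already collected as name-to-pass/fail mappings; the function names and sample test names are hypothetical and are not taken from the client's tooling.

```python
from typing import Dict, List


def fail_to_pass(before: Dict[str, bool], after: Dict[str, bool]) -> List[str]:
    """Tests that failed on the base commit and pass with the fix applied."""
    return sorted(
        name for name, passed in after.items()
        if passed and before.get(name) is False
    )


def pass_to_pass(before: Dict[str, bool], after: Dict[str, bool]) -> List[str]:
    """Tests that pass both before and after the fix (no regressions)."""
    return sorted(
        name for name, passed in after.items()
        if passed and before.get(name) is True
    )


# Hypothetical results: True means the test passed, False means it failed.
base_results = {
    "test_parse_empty": False,   # new test, exposes the bug
    "test_parse_basic": True,    # existing behaviour
    "test_flaky_io": False,      # still failing, unrelated to the fix
}
fixed_results = {
    "test_parse_empty": True,
    "test_parse_basic": True,
    "test_flaky_io": False,
}

f2p = fail_to_pass(base_results, fixed_results)
p2p = pass_to_pass(base_results, fixed_results)
print("fail-to-pass:", f2p)
print("pass-to-pass:", p2p)
```

In practice the two mappings would come from running the test suite inside the Docker environment on the base commit and again on the solution branch.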
William D.
0% (0)
- Projects Completed: -
- Freelancers worked with: -
- Projects awarded: 0%
- Last project: 19 May 2026
- United States
Clarification Board

A few quick questions:
1. Which programming languages/repositories will be the primary focus initially?
2. Do you already have benchmark formatting guidelines/templates prepared?
3. Will the work mainly involve backend repos, full-stack apps, or mixed repositories?