Senior Software Engineer – AI Data Infrastructure
Location: United States (Remote or Hybrid)
Domain: Artificial Intelligence, Data Platforms
Help Build the Future of AI
Join a fast-growing company at the forefront of AI development. This organization provides foundational tools and services for cutting-edge AI research and enterprise applications. Since its inception, it has pioneered data-centric approaches essential for training the next generation of AI systems.
About the Company
The company offers three integrated solutions designed to support frontier AI development:
- Enterprise Platform & Tools: Scalable annotation, workflow automation, and quality control systems.
- Specialized Data Labeling Services: High-accuracy labeling solutions utilizing domain experts.
- Expert Marketplace: On-demand access to a global network of skilled annotators and subject matter experts.
Why You’ll Want to Join
- Impact-Oriented Culture: The company operates like a startup: lean, agile, and driven by results. Your growth aligns directly with your contributions.
- Technical Excellence: Be part of a high-caliber team working on infrastructure that supports cutting-edge AI models.
- High Velocity: Encourages autonomy, rapid iteration, and a strong sense of ownership.
- Continuous Learning: Engage in meaningful, complex problems that require constant learning and innovation.
- Clear Accountability: Roles are well-defined, and success metrics are transparent.
Role Overview
As a Senior Software Engineer – AI Data Infrastructure, you’ll lead the development of core data infrastructure components that manage the storage, processing, and movement of large-scale datasets for training AI models. This role focuses on building high-performance, scalable systems integrated with modern database and streaming technologies. You’ll collaborate cross-functionally from ideation to production, enabling customers to efficiently manage their AI data pipelines.
Key Responsibilities
- Design and implement scalable data infrastructure, including distributed systems and high-performance databases (relational, NoSQL, and cloud-native).
- Optimize data systems for performance, indexing, and querying in support of AI model workflows.
- Build and maintain high-throughput pipelines using distributed messaging systems and job orchestration tools.
- Collaborate with stakeholders to align infrastructure capabilities with product and customer needs.
- Contribute to Agile processes such as sprint planning and stand-ups.
- Mentor junior engineers in data engineering best practices.
- Resolve customer-facing infrastructure issues in collaboration with support teams.
- Stay current with innovations in data infrastructure and integrate relevant technologies.
- Contribute to technical content including documentation, blogs, and conference talks.
Ideal Candidate Will Have
- Bachelor’s degree (or higher) in Computer Science, Data Engineering, or a related field.
- 5+ years of experience in backend or data infrastructure engineering.
- Proficiency with:
  - Relational databases (e.g., PostgreSQL, MySQL)
  - NoSQL systems (e.g., MongoDB, Cassandra)
  - Cloud-native data stores (e.g., DynamoDB, Google Spanner)
- Experience building large-scale data pipelines and distributed systems using tools like Kafka, RabbitMQ, or similar.
- Proficiency in backend languages such as Python, Java, or TypeScript.
- Strong system design skills, especially for high-volume, performance-critical data systems.
- Familiarity with search engines (e.g., Elasticsearch).
- High agency and comfort with ambiguity in a fast-paced setting.
- Ability to break down complex tasks into actionable work.
- Experience using AI developer tools like GitHub Copilot or Cursor.
Preferred Qualifications
- Experience with data warehouses like Snowflake or BigQuery
- Familiarity with Kubernetes or other orchestration platforms
- Experience with GCP (preferred), AWS, or Azure
- Understanding of memory optimization for data-intensive systems
- Exposure to AI-driven feature development using tools from OpenAI, Anthropic, or similar
- Knowledge of DevOps tools such as Argo CD or Datadog
Technology Stack
- Frontend: React.js, Redux, TypeScript
- Backend: Node.js, Python, Java, TypeScript
- API Layer: GraphQL
- Infrastructure: Google Cloud Platform, Kubernetes
- Databases: MySQL, PostgreSQL, Spanner
- Streaming: Kafka, Pub/Sub
Compensation
- Base Salary Range (US-based): $180,000 – $260,000 USD
- Equity and benefits are offered in addition to the base range. Final compensation is based on skills, experience, and location.