About the Role:
The GenAI Platform team builds the core building blocks powering CrowdStrike’s next generation of AI products—model inference, knowledge bases, agents/tools, guardrails, cost management, and the SDLC around AI artifacts (evaluation, benchmarking, versioning, deploy). This platform is central to CrowdStrike’s AI strategy and supports our path to $10B ARR.
You’ll partner closely with the Charlotte AI team, data scientists, and engineering/product groups across CrowdStrike, and engage with ProdSec, InfoSec, Legal, and Privacy to deliver secure, compliant, enterprise‑grade AI capabilities.
What You’ll Do:
Design and build automation for services, data pipelines, and LLM workflows using Go/Python—covering unit, integration, contract, end‑to‑end, and performance tests.
Create LLM evaluation pipelines: curate datasets, define ground truth, automate scoring, and set release gates for accuracy, safety, latency, and cost; support human‑in‑the‑loop review.
Define test strategy and coverage for functional and non‑functional needs (performance, scalability, resilience, stability), including Kafka‑based async flows and OpenSearch/SQL/Redis data layers.
Harden safety and guardrails: test for prompt injection, jailbreaks, PII leakage, toxicity, hallucinations, and adherence to prompt/tool contracts.
Shift left in CI/CD: integrate fast/slow suites, flaky‑test management, coverage and quality gates in Jenkins/Git workflows; deliver build health dashboards.
Own observability for quality: instrument tests and services (metrics, logs, traces) and track SLIs/SLOs such as p95 latency, error rate, and cost per request/token.
Validate integrations from internal customers with the GenAI Platform SDKs/APIs; provide reference fakes/mocks and test harnesses.
Participate in design and code reviews with a strong focus on testability, reliability, and security; mentor engineers on quality best practices.
Collaborate cross‑functionally (ProdSec/InfoSec/Legal/Privacy) to meet compliance, data residency, auditability, and RBAC requirements.
Continuously improve tools, frameworks, and processes; stay current with testing techniques and GenAI evaluation methods.
What You’ll Bring (Minimum Qualifications):
5+ years building test automation as an SDET or software engineer or equivalent practical experience.
Proficiency in Go or Python (ideally both), writing production‑quality code and tests.
Solid CS fundamentals (data structures, algorithms, OS, networking, distributed systems).
Hands-on experience with automated testing frameworks (e.g., pytest, Go test/Ginkgo/Gomega, JUnit/TestNG); API (REST/gRPC) and message‑driven testing.
CI/CD expertise (Jenkins, Git/Bitbucket), artifact/version management, and branch strategies.
AWS experience (S3, EC2, IAM; familiarity with VPC networking and security basics).
Experience testing AI/ML or LLM‑integrated systems (evaluation datasets, prompt/version management, safety/guardrails).
Strong collaboration and clear written/verbal communication; comfortable operating with ambiguity and iterating quickly.
Nice to Have:
GenAI in production: LLM serving, RAG, agents/tools, guardrails/safety, evaluation frameworks; tuning for cost/latency; vector search (OpenSearch k‑NN/FAISS).
Familiarity with frameworks like LangChain, LlamaIndex, CrewAI, n8n.
Kubernetes/containers and IaC (Docker, Terraform/CloudFormation).
Observability (OpenTelemetry), UI automation (Playwright), contract testing (e.g., Pact).
Cybersecurity domain knowledge and secure SDLC; experience with privacy/compliance in enterprise environments.