About the Role:
The data engineering team is on a mission to create a hyperscale data lake that helps find bad actors and stop breaches. The team builds and operates systems to centralize all of the data the Falcon platform collects, making it easy for internal and external customers to transform and access that data for analytics, machine learning, and threat hunting. As a senior engineer on the team, you will contribute across the full spectrum of our systems, from foundational processing and data storage, through scalable pipelines, to the frameworks, tools, and applications that make that data available to other teams and systems.
Your primary toolset will be Java microservices, Spark/Scala data processing (with some Flink), Kubernetes, and AWS-native tooling.
Your primary focus will be owning our new graph database, which you will have a significant hand in building.
What You'll Do:
Write highly fault-tolerant Java code on Apache Spark to build platform products that customers use to query our event ingestion pipelines for insight into active threat trends and related analytics
Design, develop, and maintain ultra-high-scale data platforms that process petabytes of data
Participate in technical reviews of our products and help us develop new features and enhance stability
Continually improve the efficiency and reduce the latency of our high-performance services to delight our customers
Research and implement new ways for both internal stakeholders and customers to query their data efficiently and extract results in the format they desire
What You'll Need:
10+ years of combined experience in backend/cloud development and data platform engineering roles
5+ years of experience building data platform products or features with at least one of Apache Spark, Flink, or Iceberg, or with comparable tools in GCP
5+ years of experience programming in Java, Scala, or Kotlin
Proven experience owning robust feature/product design end to end, especially when starting from vaguely defined problem statements or loose specs
Proven expertise with algorithms, distributed systems design and the software development lifecycle
Experience building large-scale data/event pipelines
Expertise designing solutions with relational SQL and NoSQL databases, such as Postgres/MySQL, Cassandra, and DynamoDB
Good test-driven development discipline
Reasonable proficiency with Linux administration tools
Proven ability to work effectively with remote teams
Experience with the Following is Desirable:
Go
Pinot or another time-series/OLAP-style database
Iceberg
Kubernetes
Jenkins
Parquet
Protocol Buffers/GRPC