About the role:
Samsara is seeking a Senior Data Engineer to join our Data Tools team, comprising both Software Engineers and Data Engineers.
Our Data Tools team is at the forefront of enhancing data analysis efficiency and effectiveness. Our software engineers develop and maintain applications and libraries aimed at streamlining data analysis workflows within Samsara. Meanwhile, our data engineers focus on managing core data sets integral to our analytics data model.
As a Senior Data Engineer, your primary responsibility will be designing and maintaining data pipelines within our central data lake, primarily using SparkSQL and PySpark. These pipelines are crucial for ingesting and transforming source data from our IoT devices and software products into our core data model, which supports statistical analysis, model training, and dashboard creation.
This role is open to candidates residing in the US except the San Francisco Bay Area (125 mi. radius from 1 De Haro St, San Francisco) and NYC Metro Area (50 mi. radius from 131 W 55th St, New York).
You should apply if:
- You want to impact the industries that run our world: Your efforts will result in real-world impact, helping to keep the lights on, get food into grocery stores, reduce emissions, and most importantly, ensure workers return home safely.
- You are the architect of your own career: If you put in the work, this role won’t be your last at Samsara. We set up our employees for success and have built a culture that encourages rapid career development and countless opportunities to experiment and master your craft in a hyper-growth environment.
- You’re energized by our opportunity: The vision we have to digitize large sectors of the global economy requires your full focus and best efforts to bring forth creative, ambitious ideas for our customers.
- You want to be with the best: At Samsara, we win together, celebrate together, and support each other. You will be surrounded by a high-caliber team that will encourage you to do your best.
In this role, you will:
- Build and maintain highly reliable computed tables, incorporating data from various sources, including unstructured data like video and audio, Samsara sensor & product data, and customer metadata.
- Access, manipulate, and integrate external datasets with internal data.
- Deliver high-quality data with strong uptime and reliability requirements, including customer-facing data sets.
- Collaborate closely with cross-functional teams such as Data Science & Analytics, AI/ML, and other Data Engineers to ensure high-quality data for diverse purposes, from causal inference to model training and dashboarding.
- Champion, role model, and embed Samsara’s cultural principles (Focus on Customer Success, Build for the Long Term, Adopt a Growth Mindset, Be Inclusive, Win as a Team) as we scale globally and across new offices
Minimum requirements for the role:
- BA / MS degree in Computer Science, Statistics, or a related discipline
- 4+ years of experience in a data engineering-focused role
- Demonstrated experience in designing data models at scale
- Proficiency in building ETL pipelines to handle large volumes of data
- Experience with Spark-based data platforms
- Strong command of at least one data orchestration tool (e.g., Airflow, Dagster, or Prefect)
- Expertise in SQL, Python, and working with REST APIs.
- Familiarity with software engineering fundamentals and reading backend development code
- Experience with version control systems such as Git/GitHub
An ideal candidate also has:
- Familiarity with time series data and late-arriving data
- Knowledge of Databricks, Delta Lake, and Dagster
- Previous experience working in a public cloud (e.g., AWS, GCP, Azure)
- Exposure to working on a data model for a product’s first-party data
- Exposure to complex data, including ML outputs and/or client-side signals