Datafold automates data pipeline testing for data engineers. With Datafold, data engineers can catch data quality issues in the pull request by seeing how a change to source code impacts the data produced throughout the entire data pipeline/DAG. Datafold is used by data teams at Patreon, Thumbtack, Substack, and AngelList, among others, and has raised $22M from YC, NEA & Amplify Partners. Our founding story and Launch HN: [https://news.ycombinator.com/item?id=24071955]
Roles:
* Frontend Engineer: $180K - $250K + equity [https://bit.ly/3tW4zk7]
* Backend Engineer: $180K - $250K + equity [https://bit.ly/3J2mmdz]
* Data Solutions Engineer: $130K - $200K + equity [https://bit.ly/3iPsNpY]
Salary ranges are for the SF Bay Area (adjusted by a location factor for other locations) and cover Intermediate to Staff levels.
Location: REMOTE (PST ±5) | US visa sponsorship: yes
Stack: Python, Rust, FastAPI, PostgreSQL, Neo4j, ClickHouse | TypeScript, React, Redux
Here are some projects you might work on:
* Static code analysis to compute a column-level data lineage graph
* ML-based anomaly detection in multidimensional time series
* A data diff tool that finds discrepancies between 1B+ row datasets across databases
Contact: [email protected]