Thursday, November 14 • 5:00pm - 5:30pm
Dagster: a Framework for Data Processing Applications

We introduce Dagster, an open source Python library for building ETL processes, ML pipelines, and similar software systems, all of which we call data applications.

Data applications are graphs of functional computations that consume and produce data assets. Dagster models the semantics of these applications through a unified type system, a data dependency graph, a configuration system, and a structured API for emitting events such as data quality tests and materializations, along with high-quality developer tools built on those abstractions. The computations themselves can be written in whatever tools their builders already use (Spark jobs for data engineers, SQL statements for analysts, Python for data scientists) and can be deployed to arbitrary orchestration engines, such as Airflow, Dask, or Kubernetes-based execution.
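To make the idea of a typed data dependency graph concrete, here is a minimal sketch in plain Python. It does not use Dagster's API: the `Graph` class, its `step` decorator, and the `("materialized", name)` event tuples are hypothetical names invented for illustration, assuming only the standard library (`graphlib`, Python 3.9+). It shows steps declared as functions, run in dependency order with shared configuration, with each output checked against a declared type and an event recorded.

```python
from graphlib import TopologicalSorter


class Graph:
    """Hypothetical toy graph of computations; NOT Dagster's API."""

    def __init__(self):
        self.steps = {}   # step name -> (function, dependency names, output type)
        self.events = []  # structured events recorded during a run

    def step(self, deps=(), output_type=object):
        """Register a function as a step that depends on the named steps."""
        def register(fn):
            self.steps[fn.__name__] = (fn, tuple(deps), output_type)
            return fn
        return register

    def run(self, config):
        """Execute every step in dependency order and return all outputs."""
        predecessors = {name: set(deps) for name, (_, deps, _) in self.steps.items()}
        results = {}
        for name in TopologicalSorter(predecessors).static_order():
            fn, deps, output_type = self.steps[name]
            value = fn(config, *(results[d] for d in deps))
            # Type check on the step's output, standing in for a type system.
            if not isinstance(value, output_type):
                raise TypeError(f"{name} returned {type(value).__name__}")
            self.events.append(("materialized", name))
            results[name] = value
        return results


graph = Graph()


@graph.step(output_type=list)
def load_numbers(config):
    return list(range(config["n"]))


@graph.step(deps=["load_numbers"], output_type=list)
def keep_even(config, numbers):
    return [x for x in numbers if x % 2 == 0]


@graph.step(deps=["keep_even"], output_type=int)
def total(config, evens):
    return sum(evens)


results = graph.run({"n": 10})
print(results["total"])
```

Because each step only declares its inputs and output type, the same functions can be tested in isolation or rerun with a different configuration, which is the kind of testability the abstract describes.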

The result is more reliable, testable, and understandable data systems that leverage the existing tools that work and that can be deployed to your infrastructure.



Speakers
Nick Schrock

Founder, Elementl
Nick is the founder/CEO of Elementl and the creator of Dagster (http://dagster.io), the data orchestrator for machine learning, analytics, and ETL. Prior to founding Elementl, Nick was a principal engineer and director at Facebook and created GraphQL.


Thursday November 14, 2019 5:00pm - 5:30pm PST
reactive