Foundry Pipeline Builder

Learn to build production-ready pipelines according to data engineering best practices, without needing to know how to code.

In this course, I teach you how to develop, deploy, and maintain production-grade data pipelines with Foundry’s Pipeline Builder application.

Pipeline Builder opens the world of data engineering to people who don’t know how to code. Nonetheless, it still takes skill to build a good data pipeline. This means learning core data engineering concepts. It also means learning to use Pipeline Builder not alone, but in concert with Foundry’s other data engineering applications: Data Lineage, Data Health, and Scheduler, among others.

If you’ve never done data engineering before, this course will teach you the concepts and skills required to use Pipeline Builder correctly — so you can build pipelines your data engineering team would be proud to call their own.

If you’re a data engineer, you’ll learn how to use Pipeline Builder as a complement to the code-based tools you’re already familiar with. It will save you time on day-to-day workflows and make it easier to collaborate with subject matter experts who need to sign off on pipeline logic.

What you’ll learn

This course will teach you how to create production-ready data pipelines in Foundry using the Pipeline Builder application. I’ll cover everything you need to know, and our journey will take us well beyond Pipeline Builder.

The course also includes a supplementary library of how-to guides for advanced data transformations in Pipeline Builder. These guides will save you hours of figuring out how to build more complicated pipelines, including:

  • Array manipulation
  • Advanced aggregations, including windowing
  • String manipulation using regular expressions
  • Handling JSON
  • And many more…

Curriculum

Foundry overview

  • Breaking down data pipelines
  • Pipeline Builder for different roles
  • Supporting applications from across Foundry

Building a simple pipeline

  • Adding, transforming, joining, and unioning data
  • Outputting data, including multi-output builds
  • Scheduling data pipelines
  • Updating and redeploying pipelines

Handling failures

  • Health checks with Data Health
  • Pipeline notifications
  • Monitoring pipelines
  • Data Expectations vs. Data Health

Advanced Pipeline Builder usage

  • Parameters and custom functions
  • Advanced transformations
  • Pipeline documentation and organization
  • Re-using pipeline logic

Pipeline optimization

  • Spark 101
  • Improving our pipeline’s performance
  • Rules of thumb for Spark optimization

Collaboration

  • Branching
  • Proposals
  • Reverting changes
  • Validating changes

Powering the Ontology

  • Backing object types with data pipelines
  • Ontology Manager 101

Pipeline organization

  • Data Lineage application
  • Pipeline organization at scale
  • Stages: raw, clean, ontology
  • Scheduling pitfalls and solutions

Incremental pipelines

  • Use cases for incremental pipelines
  • Building an incremental pipeline
  • Common issues

Streaming pipelines

  • Building a streaming pipeline
  • Outputting time series objects to the Ontology
  • Monitoring streaming pipelines

Geospatial data

  • Building a pipeline for geospatial data
  • Foundry Map for validation

Pipeline security

  • Security best practices
  • Project-based security boundaries
  • Security markings
  • Implementing real-world examples

Maintaining data pipelines

  • Advanced scheduling
  • Issues application

Debugging pipelines

  • Job Tracker
  • Spark logs
  • Common failure modes

Exporting pipelines to code

  • Java transforms 101
  • Limitations of code export

When not to use Pipeline Builder

  • Situations where Pipeline Builder is not the best fit
  • What to use instead

Frequently asked questions

What is Pipeline Builder?

Pipeline Builder is the most recent addition to Foundry’s data integration suite. It lets anyone create production-grade data pipelines without needing to know how to code.

With Pipeline Builder, you can develop and deploy batch, incremental, and even streaming pipelines in a point-and-click interface. You benefit from the same version control of logic and data that Foundry’s code-based tools pioneered, and Pipeline Builder is fully integrated into Foundry’s pipeline orchestration, monitoring and alerting, and security capabilities. Pipelines made in Pipeline Builder can even be exported to code if needed.

And if you want to learn to code data pipelines, Pipeline Builder is a good way to start: it relies on the same Git-based version control workflow as Foundry’s Code Authoring application; it will improve your understanding of Spark best practices; and it will give you experience in structuring and organizing pipeline logic. Imagine trying to learn all that while simultaneously learning your first programming language.

Ready to enroll?
16 modules
Approx. 480 minutes
Bonus: Advanced Transforms Reference
Launching Fall 2023

Meet your teacher

Taylor helps Palantir customers & partners get more out of Foundry

Previously, Taylor worked at Palantir deploying Foundry to commercial and government customers around the world. His experience includes working with customers in healthcare, transportation, media, finance, and manufacturing.