Foundry Pipeline Builder

Learn to build production-ready pipelines according to data engineering best practices, without needing to know how to code.

Get early access

In this course, I teach you how to develop, deploy, and maintain production-grade data pipelines with Foundry’s Pipeline Builder application.

Pipeline Builder opens the world of data engineering to people who don’t know how to code. Nonetheless, it still takes skill to build a good data pipeline. This means learning core data engineering concepts. It also means learning to use Pipeline Builder not alone, but in concert with Foundry’s other data engineering applications: Data Lineage, Data Health, and Scheduler, among others.

If you’ve never done data engineering before, this course will teach you the concepts and skills required to use Pipeline Builder correctly — so you can build pipelines your data engineering team would be proud to call their own.

If you’re a data engineer, you’ll learn how to use Pipeline Builder as a complement to the code-based tools you’re already familiar with. It will save you time on day-to-day workflows and make it easier to collaborate with subject matter experts who need to sign off on pipeline logic.

What you’ll learn

This course will teach you how to create production-ready data pipelines in Foundry using the Pipeline Builder application. Our journey will take us well beyond Pipeline Builder itself, into the supporting Foundry applications a production pipeline depends on, such as Data Lineage, Data Health, and Scheduler.

What’s more, the course includes a supplementary library of how-to guides for advanced data transformations in Pipeline Builder. These guides will save you hours of working out how to build more complicated pipeline logic, including:

Array manipulation
Advanced aggregations, including windowing
String manipulation using regular expressions
Handling JSON
And many more…

Curriculum

Foundry overview

Breaking down data pipelines
Pipeline Builder for different roles
Supporting applications from across Foundry

Building a simple pipeline

Adding, transforming, joining, and unioning data
Outputting data, including multi-output builds
Scheduling data pipelines
Updating and redeploying pipelines

Handling failures

Health checks with Data Health
Pipeline notifications
Monitoring pipelines
Data Expectations vs. Data Health

Advanced Pipeline Builder usage

Parameters and custom functions
Advanced transformations
Pipeline documentation and organization
Re-using pipeline logic

Pipeline optimization

Spark 101
Improving our pipeline’s performance
Rules of thumb for Spark optimization

Collaboration

Branching
Proposals
Reverting changes
Validating changes

Powering the Ontology

Backing object types with data pipelines
Ontology Manager 101

Pipeline organization

Data Lineage application
Pipeline organization at scale
Stages: raw, clean, ontology
Scheduling pitfalls and solutions

Incremental pipelines

Use cases for incremental pipelines
Building an incremental pipeline
Common issues

Streaming pipelines

Building a streaming pipeline
Outputting time series objects to the Ontology
Monitoring streaming pipelines

Geospatial data

Building a pipeline for geospatial data
Foundry Map for validation

Pipeline security

Security best practices
Project-based security boundaries
Security markings
Implementing real-world examples

Maintaining data pipelines

Advanced scheduling
Issues application

Debugging pipelines

Job Tracker
Spark logs
Common failure modes

Exporting pipelines to code

Java transforms 101
Limitations of code export

When not to use Pipeline Builder

Situations where Pipeline Builder is not the best fit
What to use instead

Frequently asked questions

What is Pipeline Builder?

Pipeline Builder is the most recent addition to Foundry’s data integration suite. It lets anyone create production-grade data pipelines without needing to know how to code.

With Pipeline Builder, you can develop and deploy batch, incremental, and even streaming pipelines in a point-and-click interface. You benefit from the same version control of logic and data that Foundry’s code-based tools pioneered, and Pipeline Builder is fully integrated into Foundry’s pipeline orchestration, monitoring and alerting, and security capabilities. Pipelines made in Pipeline Builder can even be exported to code if needed.

And if you want to learn to code data pipelines, Pipeline Builder is a good way to start. It relies on the same Git-based version control workflow as Foundry’s Code Authoring application, it helps you improve your understanding of Spark best practices, and it gives you experience in structuring and organizing the logic of data pipelines. Imagine trying to learn all that while simultaneously learning your first programming language.

Who is Pipeline Builder for?

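Pipeline Builder is for anyone who needs to build data pipelines in Foundry, whether or not they know how to code. If you’ve never done data engineering before, it lets you build production-grade pipelines without writing code. If you’re a data engineer, it complements the code-based tools you already use and makes it easier to collaborate with subject matter experts who need to sign off on pipeline logic.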

Ready to enroll?

16 modules

Approx. 480 minutes

Bonus: Advanced Transforms Reference

Launching Fall 2023

Meet your teacher

Taylor helps Palantir customers & partners get more out of Foundry

Previously, Taylor worked at Palantir deploying Foundry to commercial and government customers around the world. His experience includes working with customers in healthcare, transportation, media, finance, and manufacturing.

More about Taylor