Foundry Pipeline Builder
Learn to build production-ready pipelines according to data engineering best practices, without needing to know how to code.
In this course, I teach you how to develop, deploy, and maintain production-grade data pipelines with Foundry’s Pipeline Builder application.
Pipeline Builder opens the world of data engineering to people who don’t know how to code. Nonetheless, it still takes skill to build a good data pipeline. This means learning core data engineering concepts. It also means learning to use Pipeline Builder not alone, but in concert with Foundry’s other data engineering applications: Data Lineage, Data Health, and Scheduler, among others.
If you’ve never done data engineering before, this course will teach you the concepts and skills required to use Pipeline Builder correctly — so you can build pipelines your data engineering team would be proud to call their own.
If you’re a data engineer, you’ll learn how to use Pipeline Builder as a complement to the code-based tools you’re already familiar with. It will save you time on day-to-day workflows and make it easier to collaborate with subject matter experts who need to sign off on pipeline logic.
What you’ll learn
This course will teach you how to create production-ready data pipelines in Foundry using the Pipeline Builder application. I’ll cover everything you need to know, and our journey will take us well beyond Pipeline Builder into the Foundry applications that support it, such as Data Lineage, Data Health, and Scheduler.
What’s more, the course includes a supplementary library of how-to guides for advanced data transformations in Pipeline Builder. These guides will save you hours when you need to build more complicated pipelines, including:
- Array manipulation
- Advanced aggregations, including windowing
- String manipulation using regular expressions
- Handling JSON
- And many more…
Curriculum
Foundry overview
- Breaking down data pipelines
- Pipeline Builder for different roles
- Supporting applications from across Foundry
Building a simple pipeline
- Adding, transforming, joining, and unioning data
- Outputting data, including multi-output builds
- Scheduling data pipelines
- Updating and redeploying pipelines
Handling failures
- Health checks with Data Health
- Pipeline notifications
- Monitoring pipelines
- Data Expectations vs. Data Health
Advanced Pipeline Builder usage
- Parameters and custom functions
- Advanced transformations
- Pipeline documentation and organization
- Re-using pipeline logic
Pipeline optimization
- Spark 101
- Improving our pipeline’s performance
- Rules of thumb for Spark optimization
Collaboration
- Branching
- Proposals
- Reverting changes
- Validating changes
Powering the Ontology
- Backing object types with data pipelines
- Ontology Manager 101
Pipeline organization
- Data Lineage application
- Pipeline organization at scale
- Stages: raw, clean, ontology
- Scheduling pitfalls and solutions
Incremental pipelines
- Use cases for incremental pipelines
- Building an incremental pipeline
- Common issues
Streaming pipelines
- Building a streaming pipeline
- Outputting timeseries objects to the ontology
- Monitoring streaming pipelines
Geospatial data
- Building a pipeline for geospatial data
- Foundry Map for validation
Pipeline security
- Security best practices
- Project-based security boundaries
- Security markings
- Implementing real-world examples
Maintaining data pipelines
- Advanced scheduling
- Issues application
Debugging pipelines
- Job Tracker
- Spark logs
- Common failure modes
Exporting pipelines to code
- Java transforms 101
- Limitations of code export
When not to use Pipeline Builder
- Situations where Pipeline Builder is not the best fit
- What to use instead
Frequently asked questions
Pipeline Builder is the most recent addition to Foundry’s data integration suite. It lets anyone create production-grade data pipelines without needing to know how to code.
With Pipeline Builder, you can develop and deploy batch, incremental, and even streaming pipelines in a point-and-click interface. You benefit from the same version control of logic and data that Foundry’s code-based tools pioneered, and Pipeline Builder is fully integrated into Foundry’s pipeline orchestration, monitoring and alerting, and security capabilities. Pipelines made in Pipeline Builder can even be exported to code if needed.
And if you want to learn to code data pipelines, Pipeline Builder is a good way to start: it relies on the same git-based version control workflow as Foundry’s Code Authoring application; it will improve your understanding of Spark best practices; and it will give you experience in structuring and organizing the logic of data pipelines. Imagine trying to learn all that while simultaneously learning your first programming language.
Meet your teacher
Taylor helps Palantir customers & partners get more out of Foundry
Previously, Taylor worked at Palantir deploying Foundry to commercial and government customers around the world. His experience includes working with customers in healthcare, transportation, media, finance, and manufacturing.