Welcome to Foundry Foundations. In this series of short videos and supplementary content,
we would like to familiarize you with the basics of the Palantir Foundry Platform. By the end
of this course, you should be able to answer questions such as: What is Foundry? What is the Foundry Ontology? Who is the ontology for, and how can you make use of it?
Note that throughout our training, we'll use the word Foundry to refer to the software you're
learning. Some organizations may have built their own platform on top of Foundry or might
refer to it by another name, but Palantir Foundry is the name of the underlying software
platform.
This course is organized into four sections. First, we'll discuss what Foundry is and highlight what the platform was built for. Then, we'll help you understand how Foundry can be used to
deliver value for our customers in various industries.
We'll do that by discussing the ontology first on a conceptual level. To make things more
tangible, we'll then provide you with some deep dives and mini demos into the different
constituent elements of the ontology and how it can be used.
Finally, we'll give you a tour of the rest of the Foundry training material to help you continue
your Foundry learning journey. Let's dive in. So let's start with the first question. What is
Foundry?
Foundry is an operational decision-making platform. Foundry empowers every type of user, from technical back-end engineer to shop floor worker, data analyst, and executive, to drive
impact using their data and models.
Foundry has been continuously designed in response to pressing needs seen in mission-critical contexts around the world, from defense and intelligence to vaccine distribution to manufacturing optimization.
Today, Foundry's footprint spans over 50 industries in the commercial sector, as well as
government agencies. The way we achieve this is in close partnership with our customers,
building software that's needed on the front lines.
Examples of some of the industries that use Foundry to power their mission-critical work include healthcare, automotive, retail, mining, insurance, utilities, supply chain, and the list goes on. From such engagements, we built Foundry for security-conscious customers who need the capability to handle financial data, personally identifiable information, protected health information, controlled unclassified information, and even classified government data in a secure and compliant manner.
Foundry's strong security posture supports compliance with regulatory requirements across industries and continents by aligning with frameworks like HIPAA, GDPR, and ITAR. One thing that we've learned from
many years of forward deployed engineering is that there is often complexity in unlocking the
true value of digital transformation, and we think there are a couple of reasons for that.
First of all, digital transformation requires a common language and living representation of
the organization that is shared by data, analytics, and operational teams. Further, this
representation needs to be used to coordinate individual teams and connect them in real
time.
Finally, the representation needs to be flexible and adapt to changing conditions over time,
thereby unlocking true organizational learning. To tackle this, it's not good enough just to
have something that serves highly technical data users in isolation.
True digital transformation can only be achieved together, bringing together teams across
operational functions and technical skill sets. Yet, many organizations make large
investments in data and perpetually struggle to connect the individual elements of their tech
stack.
The same holds true for an organization's investments into analytics. Analytics investments
are great, but they are not sufficient. Analytics teams can build great dashboards and
models, and they may have even deployed models as endpoints or containers, but then
what?
More often than not, it's unclear how these investments connect back to the organization,
close the feedback loop, and move the needle in operational settings. So organizations often find themselves in a situation we metaphorically describe as the data assembly line.
Oftentimes, you'll want to explore different scenarios, branch them and see what effects your
scenarios would have on downstream operations. And when you're ready to make a
decision, you need to push updates to these scenarios to cement that seamless connectivity
between strategy and operations.
Okay, so now that we've understood the layers of the ontology, let's take a look at the
functional components that the different user groups can employ in order to hydrate, activate
and wield the ontology.
Lots of our customers start with a single workflow and grow the ontology organically over
time. It starts with data integration. Pipeline applications in Foundry make it quick and easy to hydrate the ontology with different data sources, including tabular data, sensor data, transactional systems, geospatial sources, third-party data, and so on.
Once the data sources hit the ontology, they are transformed into concepts that are intuitive and relate to real-world concepts like customers, products, prices, and warehouses. In order to get started, we need to think about all the different data sources coming together, structured and unstructured, to feed this common, organization-centric version of the world.
Many modern organizations' operations are built on a foundation not only of data, but of
models operating on that data. The models an organization has already built might come from many different sources: model-building tools, stored procedures, and a variety of other systems. Data scientists make use of these tools to answer pressing questions for their organizations.
But a common problem is that all of these models are built on historical snippets of data and are not fully integrated into the broad spectrum of applications in which they might be needed, whether analytical or operational.
Data science teams often go from data to model quite quickly, but going from model to
impact is a bit more difficult. Often teams are siloed and models go underutilized, so we can
only leverage the value of models if we unite them in a shared system, the ontology.
And so again we need this shared representation to help power all sorts of analytical
capabilities. Let's say we want to change the discount price of an item based on a model that helps identify the optimal discount each week.
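To make that concrete, here is a minimal sketch of such a weekly discount decision. The prices, the demand curve, and all names are invented for illustration; a real Foundry workflow would call a model deployed against the ontology rather than this toy demand function.

```python
# Illustrative sketch only: all prices, the demand curve, and the
# candidate discounts are invented assumptions for this example.

BASE_PRICE = 20.0   # full price of the item
BASE_DEMAND = 100   # units sold per week at full price

def expected_demand(discount: float) -> float:
    """Toy demand model: each 1% of discount lifts demand by 1.5%."""
    return BASE_DEMAND * (1 + 1.5 * discount)

def expected_revenue(discount: float) -> float:
    return BASE_PRICE * (1 - discount) * expected_demand(discount)

def optimal_discount(candidates: list[float]) -> float:
    """Pick the candidate discount with the highest expected revenue."""
    return max(candidates, key=expected_revenue)

candidates = [0.0, 0.05, 0.10, 0.15, 0.20, 0.25]
best = optimal_discount(candidates)  # 0.15 under this toy demand model
```

In practice the demand model would be the trained model's output, and the chosen discount would be written back to the item's ontology object so that downstream operations see the update.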
There are many types of data sources the system can connect to natively, whether it's
structured data, unstructured data, tabular data, sensor data, IoT data, geospatial data, or
commercial data sources like SAP, Salesforce, Oracle, and so on.
We've built intelligent connectors to help you build your ontology foundation quickly. But what does that mean in practice? As one example, let's take a look at how the Palantir HyperAuto software suite can be used to automate ingesting, cleaning, and transforming data from, in this case, SAP.
Before synchronizing data from the source system, we are able to explore and preview
information in it. We can view the table metadata and relationships between tables in the
source system, apply some point-and-click configurations to automatically generate a
transformation pipeline, and launch the generation of data extracts for chosen modules or
predefined workflows.
HyperAuto automatically indicates and groups the tables needed for the workflow together,
and, when ingested, applies the necessary downstream logic to get data ready to
operationalize the use case. Building on top of HyperAuto is much faster than building an entire pipeline from scratch, and hence enables users to increase their speed to value. However, there is more to data integration than just connecting to individual systems.
Oftentimes, we need to reconcile data from many different data sources and get it into a
format that actually matches the ontology representation of an organization, such as the
shop locations, the customers, the warehouses, etc.
This is where data engineers make use of low-code and pro-code applications with which they can build ontology pipelines. To show you an example of what such a pipeline looks like, let's dive into our data lineage application, which allows data engineers to trace every modification made to a dataset, allowing for maximum transparency into where, for instance, a specific KPI came from and how it was calculated.
On the lineage graph, you can easily trace back the data that is being used in your analysis
or operational application to its source and examine the individual transformations that have
been made along the way.
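Conceptually, tracing a dataset back to its sources is a walk over the pipeline's dependency graph. The sketch below illustrates the idea in plain Python with invented dataset names; it is not the data lineage application's API.

```python
# Illustrative only: dataset names and edges are invented, and this is
# plain Python, not the Foundry data lineage API.
from collections import deque

# For each dataset, the list of upstream (input) datasets.
UPSTREAM = {
    "kpi_report": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_raw"],
    "customers_clean": ["customers_raw"],
    "orders_raw": [],
    "customers_raw": [],
}

def trace_sources(dataset: str) -> set[str]:
    """Walk upstream edges breadth-first and return the root sources."""
    sources: set[str] = set()
    seen: set[str] = set()
    queue = deque([dataset])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        parents = UPSTREAM.get(node, [])
        if not parents:
            sources.add(node)  # no inputs: this is a source system extract
        queue.extend(parents)
    return sources
```

Here `trace_sources("kpi_report")` walks from the KPI dataset back through the cleaning steps to the raw extracts, which is the same question the lineage graph answers visually.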
Other common questions that pop up here are, is the dataset up to date? Is my dataset
healthy? The data lineage application provides tooling to answer these questions simply. In
this example, all of the red boxes represent datasets in my pipeline that are out of date,
whereas the blue boxes represent datasets that are up to date.
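The freshness check behind that red/blue view can be thought of as comparing each dataset's last successful build against its allowed age. A small illustrative sketch, with assumed dataset names and timestamps:

```python
# Illustrative sketch: dataset names, build times, and the max-age rule
# are assumptions for the example, not Foundry's health-check API.
from datetime import datetime, timedelta

def stale_datasets(builds: dict[str, datetime],
                   max_age: timedelta,
                   now: datetime) -> list[str]:
    """Return datasets whose last successful build is older than max_age."""
    return sorted(name for name, built in builds.items()
                  if now - built > max_age)

builds = {
    "orders_clean": datetime(2024, 1, 10, 6, 0),
    "customers_clean": datetime(2024, 1, 8, 6, 0),
}
now = datetime(2024, 1, 10, 12, 0)
stale = stale_datasets(builds, timedelta(hours=24), now)
```

Under these assumed timestamps, `customers_clean` would be flagged as out of date (a "red box"), while `orders_clean` built six hours ago would not.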
Similarly, I might also want to know who has access to this data. Again, the data lineage
application lets us check this on the pipeline level. In this example, I can see that my
colleague is able to view the datasets shown in blue on the data lineage graph, but does not
have access to those shown in red.
These are just a few examples of questions data engineers regularly need to answer.
Building robust data pipelines requires tools that allow you to apply rigor to your
development, and the data lineage tool we just showed you is one of them.
You can learn more about how Foundry is great at handling sensitive information by
exploring our operational security tools and our compliance and accreditation.
Ontology Deep Dive – Models
Okay, now that we have seen some of the components that help data engineers feed data
into the Foundry ontology, let's dive deeper into the model side of the ontology. What's especially important here is that models don't need to live only on Foundry. Instead, they can be built elsewhere. What this means for data scientists working with models is that they need to be able to plug models they may have developed elsewhere into the ontology, both feeding the ontology and making use of the accurate representation of the organization's digital twin to run models in a production environment.
Similarly to how its data connectors work, Foundry natively integrates with a number of
common model building tools, including AWS SageMaker, DataRobot, IBM Watson, GCP
Vertex AI, and more. Whether you connect to a model developed outside of Foundry or
develop it within Foundry using the model workbench, you can use Foundry to develop, train,
test, and deploy any model you have incorporated into your model library.
Foundry serves the important need of data scientists to manage their models appropriately.
Which data set exactly have I trained my model on? How can I use production data safely to
make sure my model continues to learn and responds to the reality of my operation?
How can I manage upgrades to my model and compare results before and after? Foundry takes a structured approach to moving models from the R&D bench into real production workflows that end users can rely on.
For that purpose, we have built a framework called Modeling Objectives that handles the entire model lifecycle, from submission and evaluation through release and deployment into production to constant monitoring.
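One way to picture that lifecycle is as a state machine in which only certain transitions are legal. The states and transition rules below are illustrative assumptions made for this sketch, not Palantir's implementation of Modeling Objectives.

```python
# Conceptual sketch of a model-lifecycle state machine. The state names
# and the allowed transitions are assumptions for illustration.

ALLOWED = {
    "submitted": {"evaluating"},
    "evaluating": {"released", "submitted"},  # may be sent back for rework
    "released": {"deployed"},
    "deployed": {"monitoring"},
    "monitoring": {"evaluating"},             # retrain / re-evaluate loop
}

class ModelLifecycle:
    def __init__(self) -> None:
        self.state = "submitted"

    def advance(self, new_state: str) -> None:
        """Move to new_state, refusing transitions the lifecycle forbids."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
```

The point of such a structure is the one made above: a model cannot reach production without passing through evaluation and release, and monitoring feeds back into re-evaluation.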
First of all, data scientists need appropriate tools to create and curate their training and
testing data sets. So let's take a look at an example. Here we have built a customized image
annotation tool using the building blocks of an application builder called Workshop, which we'll cover in more detail later.
In this case, we need medical professionals to annotate the images, which is why it is important to have an easy-to-use system that tracks every change to the underlying training data annotation. And in our case, every change is fed back into the ontology.
Once the data is annotated, it's time to build and train the model. This is Code Workbook, one of the Foundry applications aimed more at data scientists. Each of these nodes represents either data or a model, and you can click into one to see the underlying code.
You can track each bit of data, what it takes as an input, and where the output goes. Code Workbook supports R, Python, and SQL, allows for customized environments, and includes core functionality that facilitates the use of open-source libraries such as TensorFlow, Keras, or scikit-learn.
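As a minimal stand-in for that build-and-train step, the sketch below fits a simple least-squares line using only the Python standard library. A real Code Workbook node would typically use a library such as scikit-learn, and the data here is invented for the example.

```python
# Minimal training sketch: ordinary least squares for y = a*x + b.
# The data points are invented; this stands in for a Code Workbook node.

def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Fit slope a and intercept b by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Toy training data (invented for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.1, 7.9]
a, b = fit_line(xs, ys)
```

In a real pipeline, the fitted model would be saved as a model node whose inputs and outputs are tracked, exactly like the data nodes described above.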
In addition to building, training, and managing models, we want to be able to get them linked
to the ontology to update dynamic values on entities that we refer to in operational workflows.
An end user might need a model embedded in scenarios and simulations to help inform the
next best action to take.
If we go back to our medical case, the annotation tool that we used can also be used for production workflows. So instead of just annotating training and testing data, a physician can use this tool to annotate an image and run this annotation through a pre-trained model, which, in this case, estimates the two-year survival likelihood of a particular patient.
Depending on this estimate, the radiologist might need to alert the treating clinician or request an additional scan for the patient. The applications in the ontology system make this
possible and create a direct collaboration link between model builders, model consumers,
and the affected downstream operations.
The ontology is configured and managed in a central location, including permissions and
access controls, definition of access types, for instance, defining what may be written back
into a source system, and health status.
This allows users who wield the ontology to focus on the analytics and operational decision
making content rather than setting and resetting important boundaries for their explorations
each time.
Ontology Deep Dive – Integrations
Finally, let's briefly take a look at what an operational user experience can look like, either in
Foundry or in third-party systems connected to Foundry. We know that many of our customers have reasons not to use Foundry as their front end for some user-facing workflows, so Palantir has built extensive APIs, including our OPIs, or ontology programming interfaces, to connect the ontology directly to third-party systems.
Let's go through a brief example here. The first view is what we have built using Workshop, the application-building tool in Foundry. A user in the logistics center of an enterprise
attends to an alert inbox that surfaces resource and capacity shortages affecting fulfillment.
When alerts occur, the user can attempt to resolve the issue by partially fulfilling it or by
requesting more capacity from external vendors directly from the platform via email.
Whatever the user chooses to do, the action will be written back into the source system and
available regardless of whether one looks at Foundry or the third-party system.
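The resolve-or-escalate logic described above can be sketched roughly as follows. The alert fields and the shape of the resulting action are assumptions made for the example, not Foundry's actual write-back format.

```python
# Illustrative sketch: the alert fields and the action record are
# invented assumptions, not Foundry's write-back schema.

def resolve_alert(alert: dict, available_units: int) -> dict:
    """Fulfil as much of the shortage as possible; escalate the rest."""
    fulfilled = min(alert["shortage_units"], available_units)
    remaining = alert["shortage_units"] - fulfilled
    return {
        "alert_id": alert["id"],
        "fulfilled_units": fulfilled,
        # If the shortage cannot be covered, request external capacity.
        "request_external_capacity": remaining > 0,
        "requested_units": remaining,
    }

alert = {"id": "ALERT-7", "shortage_units": 120}
action = resolve_alert(alert, 80)  # the record that would be written back
```

Whatever action record results, the key point from the text above is that it is written back to the source system, so both Foundry and the third-party view stay consistent.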
However, the user might also be required to take this action directly in the third-party system, which Foundry also enables. The latest information from Foundry is simply written back into the third-party system, and any action taken in that system is synced back into Foundry.
This ensures that there is one consistent source of truth that enables effective collaboration, even cross-platform.