How to build a search feature for Foundry’s ontology

Note: Since this tutorial was published, the Ontology SDK has been released. It is now the recommended way of building apps that leverage the Ontology.

This article demonstrates how to build a search feature for a web app that uses Foundry’s ontology.

You can find the example code on GitHub. While this project uses TypeScript and React, most of the points covered here are relevant to other tech stacks even if the code is not.

First we build a simple search feature that lets users search one property of one object type. Then we modify that feature to enable searching multiple properties of that object type. Finally, we create a live search experience where results appear as the user is typing. In a future article, we will cover searching across multiple object types simultaneously.

However, before we start coding we need to clarify just what we mean by “search”.

Searching vs. Filtering

Both searching and filtering are ubiquitous, but from a user’s perspective they are not the same. Some distinguishing characteristics are:

Starting point

Search experiences usually start from a blank slate and add results.
Filtering experiences usually start from a set of results and winnow them down.

Relevance

Search has some notion of relevance.
Filtering usually allows users to sort results, but there’s rarely any intrinsic sense of relevancy to the sort order.

Knowledge of the data model

Search doesn’t require users to be aware of the data model of the things they’re looking for.
Filtering does require users to have a sense of the dimensions along which the results can be described (e.g. in the context of the pants section of an online clothing store, a user should understand what the filter for inseam length refers to).

Despite these differences, search and filtering often appear side by side. For example:

Google Image Search returns images ranked by relevance and then lets you filter by aspects such as color and size
McMaster-Carr lets you search for a part or product and then offers detailed filtering options depending on what the search results are
The US Library of Congress search applies a filter by default, “Available Online”, and offers additional filter options on the results page

Sometimes, search appears alone. This is common in documentation sites, such as the TailwindCSS docs, the Apache Spark docs, or the Foundry docs. This is what we will be building.

Understanding how search differs from filtering helps us deconstruct our desired user experience into specific requirements. We want to let users input a query term and receive results sorted by relevance; users should not need to understand the data model of that object type beforehand.

Foundry’s Ontology APIs

How many of those requirements can we achieve using Foundry’s ontology APIs?

The relevant endpoint is <code>/search</code> (docs). To use it, you specify the object type in the URL and include a query in the request body. Optionally, the request body can also include the page size and instructions for how to sort the results.

Queries

Queries have three parts: type, field, and value. Type refers to which operator is used, such as “exactly equal to” or “contains all terms”. Field is which property on the object type is being checked. Value is the query term the user types into our search bar.

{
  type: "allTerms",
  field: "properties.component",
  value: "seatbelt",
}

Queries can be nested.

{
  type: "or",
  value: [
    { type: "anyTerm", field: "properties.subject", value: "recall repair" },
    { type: "allTerms", field: "properties.component", value: "seatbelt" }
  ]
}

Looking at the available query types, we can see that our search feature can only be as sophisticated as combinations of these operators will allow.

While we can do things like matching phrases or checking that at least one word of the query term is present, we can’t do things like accommodating typos or searching for synonyms. We will need to keep this in mind when building our search feature.

Ordering

By including an ordering field in the request body, we can sort the results. We can order by the values of existing properties. We don’t have the ability to order by values computed on the fly. The docs include more details on ordering under the query parameters section of the list objects endpoint.

In this example project, we will not use ordering because ordering by existing properties will not help us rank search results by relevance. Ordering is typically more useful in the context of filter features.
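Putting the pieces of this section together, a request to the endpoint might be assembled as in the sketch below. The URL pattern and the body fields (query, pageSize, pageToken) are the ones this article’s code uses later on; the helper name buildSearchRequest, the SearchRequest type, and the trimmed-down Query type are our own illustrations and not part of Foundry’s API.

```typescript
// Illustrative subset of the query shapes discussed above (not Foundry's full list).
type Query =
  | { type: "allTerms" | "anyTerm" | "eq"; field: string; value: string }
  | { type: "and" | "or"; value: Query[] };

// Hypothetical container for everything fetch() needs.
interface SearchRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options for one search call.
function buildSearchRequest(
  hostname: string,
  ontologyRid: string,
  objectType: string,
  token: string,
  query: Query,
  pageSize = 10,
  pageToken?: string
): SearchRequest {
  // The object type goes in the URL; the query goes in the body.
  const url = `https://${hostname}/api/v1/ontologies/${ontologyRid}/objects/${encodeURIComponent(objectType)}/search/`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      // JSON.stringify omits pageToken when it is undefined (first page).
      body: JSON.stringify({ query, pageSize, pageToken }),
    },
  };
}
```

The result can be passed straight to fetch, as in fetch(request.url, request.init). Keeping the construction separate from the network call makes it easy to inspect the request while experimenting with different queries.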

Our plan

Using the <code>/search</code> endpoint of Foundry’s ontology APIs, we will be able to build a search feature that lets users search for objects without knowing their data model ahead of time. We won’t be able to rank search results by relevancy.

We will be able to customize the precision and recall of our search feature within the bounds of the available query types — we could be very precise by using “exactly equals” or we could have high recall by using “contains at least one of the terms”. We won’t be able to handle typos.
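As a rough illustration of that trade-off, the sketch below maps a match mode onto one of the query types mentioned in this article. The MatchMode names and the queryForMode helper are hypothetical; only the operator names eq, allTerms, and anyTerm come from the queries shown in this article.

```typescript
// Hypothetical modes, ordered from most precise to highest recall.
type MatchMode = "exact" | "allWords" | "anyWord";

interface TermQuery {
  type: "eq" | "allTerms" | "anyTerm";
  field: string;
  value: string;
}

// Picks the operator for a given mode and builds a single-property query.
function queryForMode(mode: MatchMode, field: string, value: string): TermQuery {
  const typeByMode: Record<MatchMode, TermQuery["type"]> = {
    exact: "eq", // value must match exactly: high precision
    allWords: "allTerms", // every word must be present
    anyWord: "anyTerm", // at least one word must be present: high recall
  };
  return { type: typeByMode[mode], field, value };
}
```

For instance, queryForMode("anyWord", "properties.subject", "fuel line") would produce an anyTerm query, which should match more objects than the allTerms version of the same query.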

We will start with a simple search experience where the user inputs a query term and clicks a button to search.

Setup

Our goal is to let users search US vehicle recalls. The source of this data is the US National Highway Traffic Safety Administration. We have created a Vehicle Recall object type in our ontology.

In this first attempt, we will be searching a single property, the Complaint subject. Examples of subjects are “Fuel line leakage”, “Steering gears”, and “Converter neutral wire”.

Project Setup

Clone the search-demo repository and create a <code>.env.local</code> file, which will be excluded from version control. In that file, add the following environment variables. You’ll need to get a developer token from Foundry’s settings page.


# /.env.local

TOKEN=<your-foundry-token>
NEXT_PUBLIC_HOSTNAME=<your-foundry-hostname>
ONTOLOGYRID=<your-ontology-rid>
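A missing variable in this file tends to surface later as a confusing request failure. As an optional safeguard, a small check like the sketch below could run when the server starts. The variable names are the three listed above; the missingEnvVars helper is our own illustration and is not part of the example repo.

```typescript
// The three variables this tutorial expects in .env.local.
const REQUIRED_ENV_VARS = ["TOKEN", "NEXT_PUBLIC_HOSTNAME", "ONTOLOGYRID"] as const;

// Hypothetical helper: returns the names of variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Calling missingEnvVars(process.env) on the server and logging any names it returns is one way to catch a typo in the file before the first search request is made.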

Install the dependencies. This demo uses pnpm for package management, but if you prefer npm or yarn those should work fine too.

Vehicle Recall Object Type

If you haven’t already created a Vehicle Recall object type in your ontology, do that now. If your Vehicle Recall object type has a different API name than VehicleRecall, you will need to change all instances of VehicleRecall in the code to whatever your API name is for the Vehicle Recall object type.

You can download the NHTSA Vehicle Recall data from the repo as either CSV or Excel files. You can upload either of them to Foundry and use that dataset to back your Vehicle Recall object type.

In the <code>/src/types.ts</code> file we have already included an interface for the Vehicle Recall object type. If you omit any properties from the version in your ontology or if you change any property names, you’ll need to update this interface definition.

You can, of course, use an existing object type provided that you update the code accordingly. In particular you’ll need to:

Add an interface definition for your object type in <code>/src/types.ts</code>
Replace instances of VehicleRecall with your object type’s API name

Verify the Setup

By this point you have:

Cloned the repo and installed the dependencies
Added a <code>.env.local</code> file with all required environment variables
Created a Vehicle Recall object type in your ontology and updated the code if you used a different API name than VehicleRecall

Next, start the server with pnpm dev and verify that you see the Search Demo landing page at http://localhost:3000. You should be able to search the Vehicle Recall objects at http://localhost:3000/simple.

Part 1: Simple Search

Check out the tutorial branch to get started. You can refer to the main branch for the final code whenever you want.

Note: This project uses Next.js, which organizes routes using a file structure. You can learn more in the Next.js docs.

Searching

If you start the server and navigate to /simple you’ll see that we have the page title and some text, but nothing else. First, let’s add a search bar using the <code><SearchBar /></code> component in <code>/src/components</code>.


// /src/app/simple/page.tsx

<SearchBar
  queryTerm={queryTerm}
  handleSearch={handleSearch}
  setQueryTerm={setQueryTermTerm}
  isSearching={isSearching}
/>

We can see that we’ll also need to track the text the user has typed into the search bar. We’ll use React’s useState hook for that.

const [queryTerm, setQueryTermTerm] = useState("");

Additionally, we need a function, handleSearch, to run when the user clicks the “Search” button. If the input is empty nothing should happen, but if there’s a value then we want to run the search. We’re using a second function, fetchResults, to make the API call because we also want that function available when users click a button to load more results.


function handleSearch() {
  if (queryTerm !== "") {
    fetchResults();
  }
}

The fetchResults function needs to call our app’s API that handles communicating with Foundry’s ontology APIs. We will build our app’s API in a moment. For now, we need to construct a search query, post that to our app’s API, save the search results, and update any other state variables that need to change after the search is complete.

The Simple Search page of our app will only search a single property, so our query can be quite simple. We’ve chosen to use the allTerms operator on the subject property of the VehicleRecall object type. You can experiment with a different operator or property to search over.


const query: Query = {
  type: "allTerms",
  field: "properties.subject",
  value: queryTerm,
};

So far, our <code>fetchResults()</code> function looks like this:

async function fetchResults() {
  const query: Query = {
    type: "allTerms",
    field: "properties.subject",
    value: queryTerm,
  };
}

It doesn’t yet do anything, so let’s fix that by calling our soon-to-be-created <code>/search</code> endpoint.

async function fetchResults() {
  const query: Query = {
    type: "allTerms",
    field: "properties.subject",
    value: queryTerm,
  };

  fetch(`/api/search?objectType=${objectType}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, pageSize, pageToken: nextPageToken }),
  })
    .then((resp) => resp.json())
    .then((respJson) => {
      setResults([...results, ...respJson["data"]]);
      setNextPageToken(respJson["nextPageToken"]);
      setIsSearching(false);
    });
}

The function passes an object type in the URL’s query parameters. That’s not strictly necessary for this demo since we’re only searching a single object type, but in a future article we will show how to search over multiple object types.

Several variables required by this function are missing, so let’s add them and review why we need them.

First, although a page size value is optional, it’s good practice to add it if you have that option. Foundry’s <code>/search</code> endpoint does let us specify a page size, and I’ve set it at 10.

const pageSize = 10;

Next, we need a way to store the search results. We can do that with a list of VehicleRecall objects. In our fetchResults function, we set the results variable to a new array of all the existing results and the newly returned results.

const [results, setResults] = useState<VehicleRecall[]>([]);

Finally, we need to store the nextPageToken value. Foundry endpoints paginate results and use the common method of returning a token to the client to use on its next request.

const [nextPageToken, setNextPageToken] = useState<string>();
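To illustrate how token-based pagination works independently of React, here is a minimal sketch that keeps requesting pages until no token comes back. The data and nextPageToken field names match the response fields used in fetchResults; the SearchPage and FetchPage types, the fetchAllPages helper, and the maxPages safety cap are assumptions made for this example.

```typescript
// Shape of one page of results, based on the fields read in fetchResults.
interface SearchPage<T> {
  data: T[];
  nextPageToken?: string;
}

// Hypothetical callback that performs one request for the given token.
type FetchPage<T> = (pageToken?: string) => Promise<SearchPage<T>>;

// Follows nextPageToken until it is absent or maxPages pages have been read.
async function fetchAllPages<T>(fetchPage: FetchPage<T>, maxPages = 5): Promise<T[]> {
  const all: T[] = [];
  let pageToken: string | undefined = undefined;
  for (let i = 0; i < maxPages; i++) {
    const page: SearchPage<T> = await fetchPage(pageToken);
    all.push(...page.data);
    pageToken = page.nextPageToken;
    if (!pageToken) break; // no token means this was the last page
  }
  return all;
}
```

Our app does not load every page up front. It requests one page per click of the Load More button, but the token handling is the same: send no token for the first page and the most recently returned token for each page after that.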

Now, our function should be free of errors. However, since we’re working with React we need to refactor our function slightly so that it’s aware of the latest values of our stateful variables such as queryTerm. To do that, we use the useCallback hook, which you can read more about in the React docs.


const fetchResults = useCallback(async () => {
  const query: Query = {
    type: "allTerms",
    field: "properties.subject",
    value: queryTerm,
  };

  fetch(`/api/search?objectType=${objectType}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, pageSize, pageToken: nextPageToken }),
  })
    .then((resp) => resp.json())
    .then((respJson) => {
      setResults([...results, ...respJson["data"]]);
      setNextPageToken(respJson["nextPageToken"]);
      setIsSearching(false);
    });
}, [queryTerm, nextPageToken, results]);

With our fetchResults function done, we need to return to our handleSearch function. It does not behave correctly when the user has already searched for some other term. That’s because of how the fetchResults function appends new results to the current ones and passes in the nextPageToken. If there are already results from a previous search, those results will remain. If there’s a page token from a previous search, that will cause an error. A solution is simply to reset the search results and page token.


function handleSearch() {
  if (queryTerm !== "") {
    setIsSearching(true);
    setResults([]);
    setNextPageToken(undefined);
    fetchResults();
  }
}
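One thing to keep in mind with this approach: React state setters do not change the values already captured by the current render, so a fetchResults call made in the same handler can still see the previous results and page token. One way to sidestep that, sketched here as an alternative rather than as what the example repo does, is to describe each state transition in a reducer so that it is always computed from the latest state. The SearchState and SearchAction types and the action names are hypothetical.

```typescript
// All search-related state in one object.
interface SearchState<T> {
  results: T[];
  nextPageToken?: string;
  isSearching: boolean;
}

// Hypothetical actions covering the transitions described above.
type SearchAction<T> =
  | { type: "searchStarted" }
  | { type: "pageLoaded"; data: T[]; nextPageToken?: string }
  | { type: "cleared" };

function searchReducer<T>(state: SearchState<T>, action: SearchAction<T>): SearchState<T> {
  switch (action.type) {
    case "searchStarted":
      // A new search discards old results and the old page token.
      return { results: [], nextPageToken: undefined, isSearching: true };
    case "pageLoaded":
      // Append the new page to whatever the latest state holds.
      return {
        results: [...state.results, ...action.data],
        nextPageToken: action.nextPageToken,
        isSearching: false,
      };
    case "cleared":
      return { results: [], nextPageToken: undefined, isSearching: false };
  }
}
```

With useReducer(searchReducer, initialState), handleSearch would dispatch searchStarted and send its request without a page token, while the Load More button would send the stored token. Both would dispatch pageLoaded when the response arrives.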

And, for a better user experience, let’s clear all results if the user clears the text in the search bar.


useEffect(() => {
  if (queryTerm === "") {
    setResults([]);
    setNextPageToken(undefined);
  }
}, [queryTerm]);

Our app’s /search endpoint

Before displaying the list of search results, we will create our app’s <code>/search</code> endpoint. It will accept a POST request from our front-end and make a POST request to Foundry’s API to execute the search. Note that in a real app you would want to include proper error handling.


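import { NextResponse } from "next/server";

// Forwards the search request from our front-end to Foundry's search endpoint.
export async function POST(request: Request) {
  const body = await request.json();
  const objectType = new URL(request.url).searchParams.get("objectType");
  if (!objectType) {
    // Without an object type we cannot build the Foundry URL.
    return NextResponse.json(
      { error: "Missing objectType query parameter" },
      { status: 400 }
    );
  }
  const url = `https://${process.env.NEXT_PUBLIC_HOSTNAME}/api/v1/ontologies/${process.env.ONTOLOGYRID}/objects/${objectType}/search/`;
  // The token is only read on the server, so it is never sent to the browser.
  const resp = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.TOKEN}`,
    },
    body: JSON.stringify(body),
  });
  return NextResponse.json(await resp.json());
}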

At this point, the search bar should work. However, no results appear on the page.

Adding a Results List

Returning to our /src/app/simple/page.tsx file, we want to use the <ResultsList /> component to display the search results.


// /src/app/simple/page.tsx

export default function SimpleSearch() {
  ...
  return (
    ...
    <ResultsList ... />
    ...
  );
}

Now, you should be able to successfully search for Vehicle Recalls and see the results appear. To wrap up this simple example, we just need to add the ability to load more results. The code already has a “Load More” button that displays a loading spinner when the isSearching variable equals true. Functionally, it currently only serves to give feedback to the user that a search is running. It should also display when there are more pages of results to fetch even if the user is not waiting for the search to execute.

Modifying the logic to check the truthiness of either nextPageToken or isSearching will achieve that. And, since our fetchResults function already uses the nextPageToken, adding a call to fetchResults in the Load More button’s onClick prop will add the next page of results to our list.



  {(nextPageToken || isSearching) && (
    <button onClick={() => fetchResults()}>...</button>
  )}

Part 2: Multi-property Search

We just saw how to build a search bar that lets users search across a single property of an object type. Now we’ll make our query more sophisticated to allow searching across multiple properties. Searching multiple properties requires changing just the query our app sends to Foundry. Otherwise the code is the same as in Part 1: Simple Search.

Copy the code from /src/app/simple/page.tsx into a new page.tsx in /src/app/multi-prop. The only part we will change is the query variable in the fetchResults function. For the sake of illustrating more types of operators and how to combine sub-queries, this query is somewhat contrived. However, in real applications queries may need to be rather complicated in order to enable the desired search functionality.


const query: Query = {
  type: "or",
  value: [
    { type: "allTerms", field: "properties.subject", value: queryTerm },
    { type: "anyTerm", field: "properties.component", value: queryTerm },
    {
      type: "allTerms",
      field: "properties.consequenceSummary",
      value: queryTerm,
    },
    {
      type: "allTerms",
      field: "properties.correctiveAction",
      value: queryTerm,
    },
    {
      type: "eq",
      field: "properties.recallType",
      value: queryTerm
        .split(" ")
        .map(
          (s) =>
            s.substring(0, 1).toUpperCase() + s.substring(1).toLowerCase()
        )
        .join(" "),
    },
    {
      type: "eq",
      field: "properties.nhtsaId",
      value: queryTerm,
    },
    {
      type: "and",
      value: [
        {
          type: "anyTerm",
          field: "properties.recallDescription",
          value: queryTerm,
        },
        {
          type: "anyTerm",
          field: "properties.manufacturer",
          value: queryTerm,
        },
      ],
    },
  ],
};

Notes on the query

There are a few points to highlight about this query. First, the “and” and “or” operators require lists of queries. Along with “not”, these allow very sophisticated queries to be constructed.

Second, though we have written out the queries manually for the sake of demonstration, it’s often helpful to rely on functions that dynamically construct queries. In fact, this is essential if you want to build advanced filtering features, such as what travel sites like Expedia or Booking.com offer their users. Refer to the Query interface definition in /src/types.ts for one way to type inputs to such a builder function.
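As a minimal sketch of such a builder, the function below turns a list of per-property rules into an “or” query like the one above. The FieldRule type and the buildSearchQuery name are hypothetical, and the Query type here is a simplified stand-in for the one in /src/types.ts, which may differ.

```typescript
type TermOperator = "allTerms" | "anyTerm" | "eq";

// Simplified stand-in for the Query type; an assumption for this sketch.
type Query =
  | { type: TermOperator; field: string; value: string }
  | { type: "and" | "or"; value: Query[] };

// Hypothetical description of how one property should be searched.
interface FieldRule {
  property: string;
  operator: TermOperator;
  transform?: (term: string) => string; // optional per-property preprocessing
}

// Builds one sub-query per rule and combines them with "or".
function buildSearchQuery(term: string, rules: FieldRule[]): Query {
  if (rules.length === 0) {
    throw new Error("At least one field rule is required");
  }
  const subQueries: Query[] = rules.map((rule) => ({
    type: rule.operator,
    field: `properties.${rule.property}`,
    value: rule.transform ? rule.transform(term) : term,
  }));
  return { type: "or", value: subQueries };
}
```

With a helper along these lines, adding another searchable property becomes a one-line change to the rules list instead of another hand-written sub-query.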

Third, it’s vital to check which query types are case sensitive. For example, eq is case sensitive, which presents a challenge for searching the RecallType property. All RecallType words begin with an uppercase letter. One option is to modify the data in the pipeline backing our Vehicle Recall object type, but many times that’s not an option. Other users may depend on the particular formatting of this property for unrelated workflows. Instead, we process each part of the user’s query term prior to running the search in Foundry. More complicated transformations might deserve stand-alone functions.
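For example, the capitalization step used for the recallType sub-query could be pulled out into its own function. The sketch below does the same per-word transformation as the inline code, with the addition of trimming and collapsing extra whitespace. The name toTitleCase is our own, and the sample inputs in the comments are made up.

```typescript
// Uppercases the first letter of each word and lowercases the rest,
// e.g. a made-up input like "vEHICLE  recall" becomes "Vehicle Recall".
function toTitleCase(term: string): string {
  return term
    .trim()
    .split(/\s+/) // split on any run of whitespace
    .filter((word) => word.length > 0) // an empty input yields no words
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1).toLowerCase())
    .join(" ");
}
```

The eq sub-query could then use value: toTitleCase(queryTerm), which keeps the query definition easier to read.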

Part 3: Live Search

Letting a user click a button to run a search is perfectly functional, but sometimes we want to provide a different user experience. Type-ahead or live search is one such option — results appear as the user types.

When it comes to interacting with Foundry, live search is conceptually very similar to our simple and multi-property search features. Therefore, we won’t be reviewing most of the code changes as they are React-specific. One point worth noting is that if you build a live search feature, it’s important to debounce your requests so that your app doesn’t send an unnecessary request on every keystroke.
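For readers unfamiliar with debouncing, here is a framework-agnostic sketch: each call restarts a timer, and the wrapped function only runs once the calls have stopped for the given delay. This is a generic illustration and not the code used in the example repo; the debounce helper and its call/cancel shape are assumptions.

```typescript
// Returns a debounced wrapper around fn plus a way to cancel a pending call.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Drops the pending call, if any.
  const cancel = (): void => {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
  };

  // Restarts the timer; only the last call within waitMs reaches fn.
  const call = (...args: A): void => {
    cancel();
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };

  return { call, cancel };
}
```

In a React component, the same effect is commonly achieved with a setTimeout inside a useEffect hook whose cleanup function clears the timer. A delay of a few hundred milliseconds is a typical starting point, though the right value depends on your users and your data.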

Refer to the <code>/src/app/live/page.tsx</code> file on the main branch for the code. The main changes are wrapping the search function in a useEffect hook, memoizing the query, and adding a named handleLoadMore function.

Where to go from here

Our search feature works, but it does have one notable shortcoming. The results are not sorted by their relevance to the user’s query.

Our suggestion is to embrace what Foundry does offer, sophisticated filtering capabilities in this case, and try to design user experiences that maximize those capabilities. For example, it might be possible to design a UI that encourages quick and aggressive pruning of search results via filters. Combined with large page sizes, that could leave a result set small enough for your app to rank by relevance itself.
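To give a sense of what ranking in the app could look like, the sketch below scores each result by how many times the query words occur in some text derived from it, then sorts by that score. This is a deliberately naive relevance measure chosen for illustration, not something Foundry provides or the example repo implements; the function names are hypothetical.

```typescript
// Counts non-overlapping, case-insensitive occurrences of each query word in the text.
function scoreByTermFrequency(text: string, queryTerm: string): number {
  const haystack = text.toLowerCase();
  const terms = queryTerm.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  let score = 0;
  for (const term of terms) {
    let from = haystack.indexOf(term);
    while (from !== -1) {
      score += 1;
      from = haystack.indexOf(term, from + term.length);
    }
  }
  return score;
}

// Returns a new array sorted by descending score; ties keep their original order.
function rankByRelevance<T>(items: T[], queryTerm: string, toText: (item: T) => string): T[] {
  return items
    .map((item, index) => ({ item, index, score: scoreByTermFrequency(toText(item), queryTerm) }))
    .sort((a, b) => b.score - a.score || a.index - b.index)
    .map((entry) => entry.item);
}
```

A call such as rankByRelevance(results, queryTerm, (r) => r.subject) would reorder only the pages already loaded, which is why this idea depends on filters and large page sizes to keep the relevant objects within reach.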

In a future post, we will show how to simultaneously search multiple object types.

In the meantime, there are several ways you can extend this search feature to meet the needs of your organization, including:

Adding the ability to filter results, which would require dynamically constructing the query in response to user selections of which filters to apply and what values to provide those filters.
Adding error handling and better handling of bad states (e.g. a better no results message)
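As a starting point for the filtering idea, the sketch below wraps an existing search query in an “and” query together with one eq sub-query per active filter. The FilterSelections shape and the withFilters helper are hypothetical, the Query type is a simplified stand-in for the one in /src/types.ts, and using eq for every filter is an assumption; range or multi-select filters would need other operators.

```typescript
// Simplified stand-in for the Query type; an assumption for this sketch.
type Query =
  | { type: "allTerms" | "anyTerm" | "eq"; field: string; value: string }
  | { type: "and" | "or"; value: Query[] };

// Hypothetical map from property name to the value the user selected, if any.
type FilterSelections = Record<string, string | undefined>;

// Combines the search query with one "eq" sub-query per active filter.
function withFilters(searchQuery: Query, filters: FilterSelections): Query {
  const filterQueries: Query[] = [];
  for (const [property, value] of Object.entries(filters)) {
    // Skip filters the user has not set.
    if (value !== undefined && value !== "") {
      filterQueries.push({ type: "eq", field: `properties.${property}`, value });
    }
  }
  return filterQueries.length === 0
    ? searchQuery
    : { type: "and", value: [searchQuery, ...filterQueries] };
}
```

Because the result is itself a Query, it can be sent to the same endpoint without changing the rest of the app.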