How to fuzzy search with Xata

Explore the world of free text search possibilities with Xata and take a deep dive into fuzzy search.

Written by

Benedicte Raae

Published on

February 13, 2023

One of my favorite features of Xata is the built-in "fuzzy search" functionality. Most database solutions let you search for an exact match. Users these days, though, often expect a more forgiving search, one that will match "olso" to "oslo" and "alez" to "alex".

Fuzzy search to the rescue! 💪

The difference in results with fuzzy search enabled and disabled
Fuzzy search enabled and disabled

FYI: The example code uses a Xata Worker using the Xata SDK + a React component using ReactQuery. However, you may use any type of server-side + client-side setup you are comfortable with.

#

Fuzzy search is enabled by default

Xata search functionality comes with the fuzziness param set to 1 by default, letting the user make one typo, such as one wrong character ("alez") or one set of swapped characters ("olso") etc. It's a great default, and the one we use for pruneyourfollows.com.

Nonetheless I do like to explicitly state the fuzziness level in the Code, together with the configuration for partially matching words.

import React, { useState } from 'react';
import { useDebounce } from 'usehooks-ts';
import { useQuery } from '@tanstack/react-query';
import { xataWorker } from './xata';

// Code executed on the server as a Cloudflare Worker
const searchAccount = xataWorker('searchAccount', async ({ xata }, { term }) => {
  const { records } = await xata.search.all(term, {
    tables: [
      {
        table: 'accounts',
        target: ['name', 'username']
      }
    ],
    // Fuzziness level
    fuzziness: 1,
    // Partially matching words
    prefix: 'phrase'
  });

  return { records };
});

// Code executed in the browser
export default function App() {
  const [term, setTerm] = useState('');
  const debouncedTerm = useDebounce(term, 300);

  const { data: results } = useQuery({
    queryKey: ['search', debouncedTerm],
    queryFn: () => {
      return searchAccount({ term: debouncedTerm });
    },
    enabled: Boolean(debouncedTerm),
    placeholderData: []
  });

  return (
    <main>
      <form onSubmit={(event) => event.preventDefault()}>
        <label htmlFor="search-field">Search accounts: </label>
        <input
          id="search-field"
          type="search"
          value={term}
          placeholder="Search accounts: name or username"
          onChange={(event) => {
            setTerm(event.target.value);
          }}
        />
      </form>

      <ul>
        {results.map(({ record }) => {
          return (
            <li key={record.username}>
              <a href={`http://twitter.com/${record.username}`}>
                <img src={record.meta.profile_image_url} aria-hidden={true} />
                <span
                  dangerouslySetInnerHTML={{
                    __html: record.name
                  }}
                />
                <br />
                @
                <span
                  dangerouslySetInnerHTML={{
                    __html: record.username
                  }}
                />
              </a>
            </li>
          );
        })}
      </ul>
    </main>
  );
}

Other possible levels of fuzziness are 0, to disable it altogether, and 2 to extend the allowed typo tolerance. Higher than 2 is not supported.

In addition to expecting a more forgiving search, users expect to know precisely why a result matches their search term. With a fuzzy search, that is even more crucial, as it can lead to some unexpected matches.

When your mind thinks "oslo" but you wrote "olso", the "Espen Olson" results might feel strange when not highlighted.

The difference between results with and without highlighting
With and without highlighting

Luckily, Xata generates ready-to-use HTML, wrapping the relevant matching text in <em> for us. However, you need to use getMetadata on each record to get access to it.

import React, { useState } from 'react';
import { useDebounce } from 'usehooks-ts';
import { useQuery } from '@tanstack/react-query';
import { xataWorker } from './xata';

// Code executed on the server as a Cloudflare Worker
const searchAccount = xataWorker('searchAccount', async ({ xata }, { term }) => {
  const { records } = await xata.search.all(term, {
    tables: [
      {
        table: 'accounts',
        target: ['name', 'username']
      }
    ],
    fuzziness: 1,
    prefix: 'phrase'
  });

  // Getting the highlights (and more)
  const enrichedResults = records.map((result) => {
    return {
      ...result,
      ...result.record.getMetadata()
    };
  });

  return enrichedResults;
});

//Code executed in the browser
export default function App() {
  const [term, setTerm] = useState('');
  const debouncedTerm = useDebounce(term, 300);

  const { data: results } = useQuery({
    queryKey: ['search', debouncedTerm],
    queryFn: () => {
      return searchAccount({ term: debouncedTerm });
    },
    enabled: Boolean(debouncedTerm),
    placeholderData: []
  });

  return (
    <main>
      <form onSubmit={(event) => event.preventDefault()}>
        <label htmlFor="search-field">Search accounts: </label>
        <input
          id="search-field"
          type="search"
          value={term}
          placeholder="Search accounts: name or username"
          onChange={(event) => {
            setTerm(event.target.value);
          }}
        />
      </form>

      <ul>
        {results.map(({ record, highlight }) => {
          return (
            <li key={record.username}>
              <a href={`http://twitter.com/${record.username}`}>
                <img src={record.meta.profile_image_url} aria-hidden={true} />
                <span
                  dangerouslySetInnerHTML={{
                    __html: highlight.name || record.name
                  }}
                />
                <br />
                @
                <span
                  dangerouslySetInnerHTML={{
                    __html: highlight.username || record.username
                  }}
                />
              </a>
            </li>
          );
        })}
      </ul>
    </main>
  );
}

Xata even allows you to play around with search—no code needed—using the Search Engine Playground.

Pro tip: Use the "Get Code Snippet" feature to get a head start while coding.

Xata Playground showing similar code as this article
Xata Playground showing similar code as this article

I hope you are excited to dig into fuzzy search with Xata:

If you need any help implementing fuzzy search in your project, join the Xata Discord.

Start free,
pay as you grow

Xata provides the best free plan in the industry. It is production ready by default and doesn't pause or cool-down. Take your time to build your business and upgrade when you're ready to scale.

Free plan includes
  • Single team member
  • 10 database branches
  • High availability
  • 15 GB data storage
  • 15 GB search engine storage
  • 2 GB file attachments
  • 250 AI queries per month
Start freeExplore all plans
Free plan includes
  • Single team member
  • 10 database branches
  • High availability
  • 15 GB data storage
  • 15 GB search engine storage
  • 2 GB file attachments
  • 250 AI queries per month

Sign up to our newsletter

By subscribing, you agree with Xata’s Terms of Service and Privacy Policy.

Copyright © 2024 Xatabase Inc.
All rights reserved.

Product

RoadmapFeature requestsPricingStatusAI solutionsFile attachments