What is Xata?
Ask AI

Asking questions of your data with AI

Edit on GitHub

The ask endpoint helps you create interactive and engaging conversational interfaces, Q&A assistants, and chatbots. It provides users with swift and direct access to the information they need, eliminating the need to navigate through lengthy documentation or articles.

The endpoint leverages up-to-date information from your Xata database, such as documentation, knowledge bases, or any source of data you've configured. The ask endpoint uses Xata's search capabilities to find relevant information from your database, and uses this with OpenAI's ChatGPT API to interpret your question and generate natural language responses.

Ask AI runs in the optional Search store which is eventually consistent. The Search store is enabled by default and can be disabled in the Database Settings from the Web UI. The ask API and features described in this page require the Search store to be enabled.

#

How does it work?

At a high level, the ask endpoint works like this:

  1. Run a text search against the database to find the records that are most relevant to the question asked by the user.
  2. Take the most relevant results, and create a custom prompt for OpenAI.
  3. Send the prompt to the ChatGPT API and let the model complete the answer.

The ask endpoint can use either keyword or vector search to find records that are most relevant to the user's query. To learn more about keyword or vector searches, see our blog the compares the two approaches.

When you use the ask endpoint, you can choose to continue any conversation you've started. Xata will remember the rules and context you've set, and allow you to continue to ask follow-up questions. Check the examples below to see how to use it.

The ask endpoint has the following general form:

const result = await xata.db.Tutorial.ask('<question>', {
  rules: [
    // ...array of strings with the rules for the model...,
  ],
  searchType: 'keyword|vector',
  search: {
    fuzziness: 0 | 1 | 2,
    prefix: 'phrase|disabled',
    target: {
      // ...search target options...
    },
    filter: {
      // ...search filter options...
    },
    boosters: [
      // ...search boosters options...
    ]
  },
  vectorSearch: {
    column: '<embedding column>',
    contentColumn: '<content column>',
    filter: {
      // ...search filter options...
    }
  }
});

The ask endpoint works at the table level. In the above snippet, the name of the table is called Tutorial, as a placeholder. Replace Tutorial with the name of your table name.

The only mandatory parameter is the question, which is the question that the ask endpoint will attempt to answer. The other options are:

  • rules: an array of strings with the rules for the model. The rules are passed as part of the prompt and control the behavior of the model. For example, you can use the rules to tell the model to answer the question in a specific way, for example, more or less concise. See the examples on this page for some ideas.
  • searchType: the type of search to use to find the relevant records. Can be either "keyword", for using the keyword free-text-search, or "vector", for using semantic/similarity search. The default is "keyword".
  • search: options for the keyword search. See the Keyword search options section for the available options.
  • vectorSearch: options for the vector search. See the Vector search options section for the available options.

Note that after performing the search step, the ask endpoint will use the top 3 results. In case of keyword search, all string or text columns that have more than 150 characters in the top 3 results will be added to the prompt. In case of vector search, the specified contentColumn from the top 3 results will be added to the prompt.

OpenAI has limits on how much information can be sent or received. If your data is too large for OpenAI to process, Xata will take care of it for you. The endpoint will attempt to dynamically select the portions of the text that are most relevant . In chat sessions that span many messages, Xata will always ensure your context and rules are maintained, and pop messages in FIFO order; the first messages will be trimmed first.

The response contains the generated answer and the record IDs that were used as context:

{
  "answer": "< answer >",
  "sessionId": "cg52bk1eqh5rd5hndhq95jercs",
  "records": [
    "b70d541d114ff54ad15915636450663f",
    "8ae4837002e21f013aa85c30a126ea1c",
    "4b137344a3c53d5152c45ed514188cd2"
  ]
}

You can lookup the records by IDs in the database to get their contents.

The options that you can pass under the search key mirror a subset of the options from the search endpoint. See that section of the guide for more details.

  • target to select the columns to search in and to configure column weights.
  • prefix to configure the prefix matching behavior.
  • fuzziness to configure the fuzzy matching behavior.
  • filter to filter the records that are returned by the search.
  • boosters to tune the relevancy controls.

The options that you can pass under the search key mirror a subset of the options from the vector search endpoint. See that section of the guide for more details.

  • column to select the column that contains the vector embeddings.
  • contentColumn to select the column that contains the text content that will be used as context.
  • filter to filter the records that are returned by the search.

Because the ask endpoint can take several seconds to complete, it supports streaming the response using server-sent events.

To enable this mode, pass the Accept: text/event-stream header in the request. The response will be a stream of JSON objects, prefixed by the keywords data. The last object in the stream contains the "done": true:

data: {"text":""}

data: {"text":"To"}

data: {"text":" perform"}

data: {"text":" an"}

...

data: {"done":true, "sessionId": "cg52bk1eqh5rd5hndhq95jercs", "records":["0415d654cc8dd113e9b9423f425697ae","91301cdc08524e6f6eef5d0b9d66e345","ae0c68966fc0f00cb913b3c209a7c5a1"]}

You can find a full Next.js application using the ask endpoint with server-sent events in our examples repository.

Xata supports the functionality of asking follow-up questions (as in a conversational model), which allows you to create ChatGPT-style interfaces for your data. To achieve this, Xata introduces the concept of a session.

A session represents a single conversation between a user and the data in one of your Xata tables. Each session contains relevant data from your tables, known as context. The session preserves the rules you specified at its start, along with all the questions sent and answers received during the interaction.

When communicating with OpenAI, the ask endpoint ensures that your rules, your context, and as many questions as possible are transmitted each time. If the total amount of messages becomes too long for OpenAI to process, Xata will trim messages from earlier in the conversation until the message size is manageable for OpenAI to handle effectively.

Each session maintains its own cached set of your context, so changes to the underlying data in your database will not affect ongoing conversations. All cached information: your context, rules, and all conversations are automatically removed after 7 days.

const result = await xata.db.Tutorial.ask('<your follow-up message>', { sessionId: '<session id>' });

To run these examples you need to have a table that contains enough text to make the context useful for a question. String/text columns that are less than 150 characters are ignored by the model.

To index your custom data in Xata, you can use, for example, this web crawler.

The following is the simplest example of using the ask endpoint:

const result = await xata.db.Tutorial.ask('What is this tutorial about?');

Because the search step is not specified, keyword search is being used with the default configuration.

The following example tunes the rules and the search step, while still using keyword search:

const result = await xata.db.Tutorial.ask('How do I retrieve a single record?', {
  rules: [
    'Do not answer questions about pricing or the free tier. Respond that Xata has several options available, please check https://xata.io/pricing for more information.',
    'When you give an example, this example must exist exactly in the context given.',
    'Only answer questions that are relating to the defined context or are general technical questions. If asked about a question outside of the context, you can respond with "It doesn\'t look like I have enough information to answer that. Check the documentation or contact support."',
    'Your name is DanGPT'
  ],
  searchType: 'keyword',
  search: {
    fuzziness: 2,
    prefix: 'phrase',
    target: ['slug', { column: 'title', weight: 4 }, 'content', 'section', { column: 'keywords', weight: 4 }],
    boosters: [
      {
        valueBooster: {
          column: 'section',
          value: 'guide',
          factor: 18
        }
      }
    ]
  }
});

The following example uses vector search:

 const result = await xata.db.Tutorial.ask("How do a retrieve a single record?", {
  rules: [
    "Do not answer questions about pricing or the free tier. Respond that Xata has several options available, please check https://xata.io/pricing for more information.",
    "If the user asks a how-to question, provide a code snippet in the language they asked for with TypeScript as the default.",
    "Only answer questions that are relating to the defined context or are general technical questions. If asked about a question outside of the context, you can respond with \"It doesn't look like I have enough information to answer that. Check the documentation or contact support.\"",
    "Results should be relevant to the context provided and match what is expected for a cloud database.",
    "If the question doesn't appear to be answerable from the context provided, but seems to be a question about TypeScript, Javascript, or REST APIs, you may answer from outside of the provided context.",
    "Your name is DanGPT"
  ],
  searchType: "vector",
  vectorSearch: {
    column: "embeddings",
    contentColumn: "content",
    filter: {
      section: "guide"
    }
  }
 })

The following example shows a conversation.

const result = await xata.db.Tutorial.ask("What's the difference between summarize and aggregate?", {
  rules: [
    'Do not answer questions about pricing or the free tier. Respond that Xata has several options available, please check https://xata.io/pricing for more information.',
    'When you give an example, this example must exist exactly in the context given.',
    'Only answer questions that relate to the defined context or are general technical questions. If asked about a question outside of the context, you can respond with "It doesn\'t look like I have enough information to answer that. Check the documentation or contact support."',
    'Your name is DanGPT'
  ],
  searchType: 'keyword',
  search: {
    fuzziness: 2,
    prefix: 'phrase',
    target: ['slug', { column: 'title', weight: 4 }, 'content', 'section', { column: 'keywords', weight: 4 }],
    boosters: [
      {
        valueBooster: {
          column: 'section',
          value: 'guide',
          factor: 18
        }
      }
    ]
  }
});

From here, you might receive the following response:

{
  "answer": "Summarize and aggregate both return similar results, but they differ in terms of the underlying store the data is served from. When using summarize, data is retrieved from PostgreSQL, which ensures consistent results. However, PostgreSQL storage is not optimized for these types of queries. On the other hand, the aggregate endpoint retrieves data from the column-store, which is optimized for larger tables and can handle more cardinality. This makes the aggregate endpoint faster and more suitable for workloads with eventual consistency. Please refer to the documentation for more information on the differences between summarize and aggregate.",
  "sessionId": "cg52bk1eqh5rd5hndhq95jercs",
  "records": ["rec_a", "rec_b", "rec_c"]
}

Now, using the session_id key returned in the response, we're able to continue the dialogue.

const result = await xata.db.TutorialaskContinue('cg52bk1eqh5rd5hndhq95jercs', 'Can you show me an example of both?');

You will receive a response similar to the following:

{
  "answer": "Sure, ..."
}

If you see the following error:

no records found, it seems that I don't have enough data to answer this question.

It means that the search step did not return any results or that the returned results don't have enough text to generate a prompt. Note that string and text columns that have less than 150 characters are ignored.

To solve this issue, you typically need to add more text to the content column to ensure it has more than 150 characters for most records.

If you never use the ask endpoint, your data will never be sent to OpenAI.

When you use the endpoint, Xata finds the most relevant fields matching your search query and sends them in fragments or in their entirety to OpenAI. This process is carried on a per-request basis. Data is not synced to any schedule. Only the data in the table you're searching for, and matching the parameters you provide, will be sent.

On this page

How does it work?UsageKeyword search optionsVector search optionsStreaming the responseAsking follow-up questionsExamplesTroubleshootingNo records foundFAQsWhat data is given to OpenAI?