I Built a Bot To Help You Write Production Code From API Docs in Minutes, Not Days.

We created an API testing plugin named LiveAPI, which lets you try out APIs directly on API documentation sites. It can also generate code in many languages and widely used libraries. It's powered by Lama2, a markdown-like language for APIs.

We are trying to add more features to LiveAPI to make it more useful to our customers. One idea was to create a bot that answers questions based on the content of an API documentation page.

The bot should be able to:

  1. Find relevant information within the docs page.
  2. Answer questions raised by the users.

We set it up to answer questions from a collection of markdown documents, returning appropriate answers after some prompting. We will be using the Gemini API for this.

Here is an example question and the answer the bot generated:

Question: How to create a get request?

Answer: I can tell you that! Based on the information provided, a GET request can be made using the following format: GET https://httpbin.org/get. For further information, you can refer to the source file available on Github.

What Is a Document Search QA System?

Document search is a technique for finding the most relevant document in a collection. We store data as embeddings in a vector DB to make this process easy. Embeddings are numeric representations of text. If two embeddings have near-similar values, the texts they represent are semantically similar. Since embeddings are vectors, we can measure similarity using cosine similarity (the dot product).

The question embedding will be a vector (a list of float values), which will be compared against the vectors of the documents using the dot product. The vectors returned from the API are already normalized. The dot product represents the similarity in direction between two vectors.

The values of the dot product can range between -1 and 1, inclusive. If the dot product between two vectors is 1, then the vectors are in the same direction. If the dot product value is 0, then these vectors are orthogonal, or unrelated, to each other. Lastly, if the dot product is -1, then the vectors point in the opposite direction and are not similar to each other.
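As a minimal sketch of this behavior (with hand-picked toy vectors, not real embeddings), here is how the dot product of normalized vectors plays out:

import numpy as np

# Toy 2-D unit vectors standing in for embeddings.
a = np.array([1.0, 0.0])
b = np.array([1.0, 0.0])   # same direction as a
c = np.array([0.0, 1.0])   # orthogonal to a
d = np.array([-1.0, 0.0])  # opposite direction to a

print(np.dot(a, b))  # 1.0  -> same direction
print(np.dot(a, c))  # 0.0  -> orthogonal / unrelated
print(np.dot(a, d))  # -1.0 -> opposite direction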

Using The Gemini AI Model

This demo shows how to use the Gemini API to create embeddings so that you can perform a document search. You will use the Python client library to build embeddings that let you compare search strings, or questions, to document contents.

I used Colab to set this up. You can follow along with the steps.

In this tutorial, we use embeddings to perform a document search over a set of documents and ask questions related to the Lama2 documentation.

Set Up the Colab Environment

First, download and install the Gemini API Python library.

pip install -U -q google-generativeai

Import the following libraries.

import textwrap
import numpy as np
import pandas as pd
import google.generativeai as genai
from google.colab import userdata
from IPython.display import Markdown

Grab an API Key

Before you can use the Gemini API, you must first obtain an API key. If you don't already have one, create a key with one click in Google AI Studio.

In Colab, add the key to the secrets manager under the "🔑" in the left panel. Give it the name API_KEY.

Pass the key to genai.configure(api_key=API_KEY)

API_KEY=userdata.get('API_KEY')
genai.configure(api_key=API_KEY)

Next, it's important to choose a specific model and stick with it. The outputs of different models are not compatible with each other.

for m in genai.list_models():
    if 'embedContent' in m.supported_generation_methods:
        print(m.name)

Results in:

models/embedding-001

We will be using the embedding-001 model for now.
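Pin the model name in a variable, since the embedding calls later on will reference it:

model = 'models/embedding-001'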

Building an Embedding Database

Here are three sample texts to use to build the embedding database. You will use the Gemini API to create embeddings of each document and turn them into a dataframe for better visualization.

DOCUMENT1 = {
    "title": "Installation/Update",
    "content": "To install/update Lama2 and its dependencies automatically, run the following: curl -s https://hexmos.com/lama2/install.sh | bash -s"}
DOCUMENT2 = {
    "title": "How to use",
    "content": """Type l2 into the terminal. You should get something like:Usage:
  l2 [OPTIONS] [LamaAPIFile]

Application Options:
  -o, --output=      Path to output JSON file to store logs, headers and result
  -v, --verbose      Show verbose debug information
  -n, --nocolor      Disable color in httpie output
  -e, --env=         Get a JSON of environment variables relevant to input arg
  -h, --help         Usage help for Lama2
      --version      Print Lama2 binary version

Help Options:
  -h, --help         Show this help message"""}
DOCUMENT3 = {
    "title": "From VS Code",
    "content": """
    Find Lama2 for VSCode at the VSCode Marketplace. The extension requires the l2 command available (usually at /usr/local/bin/l2 for Linux/MacOS and C:\ProgramData\chocolatey\bin for Windows).
Once the extension is installed, open the command palette (ctrl + shift + p) and search for Execute the current file to execute the file
    """}

documents = [DOCUMENT1, DOCUMENT2, DOCUMENT3]

Here is a script that I used to convert the docs into a list of dictionaries:

parse_markdown splits the .md files into a list of dictionaries with a title and content. Each heading is taken as a title. If there is no heading at the beginning of the file, the filename is used as the title.

import os
import re

def parse_markdown(filename, markdown_content):
    sections = []
    current_title = filename  # fall back to the filename until a heading is seen
    current_content = []
    lines = markdown_content.splitlines()
    heading_pattern = re.compile(r'^(#{1,6})\s*(.*)')
    for line in lines:
        match = heading_pattern.match(line)
        if match:
            # A new heading closes off the previous section.
            if current_content:
                sections.append({
                    'title': current_title,
                    'content': '\n'.join(current_content).strip()
                })
                current_content = []

            current_title = match.group(2)
        else:
            current_content.append(line)

    # Flush whatever remains after the last heading.
    if current_content:
        sections.append({
            'title': current_title,
            'content': '\n'.join(current_content).strip()
        })

    return sections
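As a quick sanity check, here is parse_markdown applied to a small hypothetical snippet:

sample = "# Install\nRun the script.\n## Use\nType l2."
print(parse_markdown("sample.md", sample))
# [{'title': 'Install', 'content': 'Run the script.'},
#  {'title': 'Use', 'content': 'Type l2.'}]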

A helper function handles the file read operations.


def process_markdown_file(file_path):
    filename = os.path.basename(file_path)

    with open(file_path, 'r', encoding='utf-8') as file:
        markdown_content = file.read()

    sections = parse_markdown(filename, markdown_content)

    return sections

process_all_markdown_files takes all the files ending with .md and passes them to the process_markdown_file function, storing the results in a list.

def process_all_markdown_files(root_folder):
    sections = []
    for root, dirs, files in os.walk(root_folder):
        for file in files:
            if file.endswith('.md'):
                file_path = os.path.join(root, file)
                newsections = process_markdown_file(file_path)
                filtered_sections = [section for section in newsections if section['content']]

                sections.extend(filtered_sections)
    return sections

Provide the root folder and run the pipeline to get the structured data.


# Example usage:
root_folder = "demo"  # Replace with your root directory containing markdown files
results = process_all_markdown_files(root_folder)
print(results)

documents = results

Organize the contents of the dictionary into a dataframe for better visualization.

df = pd.DataFrame(documents)
df.columns = ['Title', 'Text']
print(df)

For the embeddings model, embedding-001, there is a task type parameter and an optional title (for task_type=RETRIEVAL_DOCUMENT). Specifying a title for RETRIEVAL_DOCUMENT provides better-quality embeddings for retrieval.

Get the embeddings for each of these bodies of text. Add this information to the dataframe.

def embed_fn(title, text):
  return genai.embed_content(model=model,
                             content=text,
                             task_type="retrieval_document",
                             title=title)["embedding"]

df['Embeddings'] = df.apply(lambda row: embed_fn(row['Title'], row['Text']), axis=1)
print(df)

Document Search With a Question-Answering System

Now that the embeddings are generated, we need a Q&A system to search these documents. You will ask a question about Lama2, create an embedding of the question, and compare it against the collection of embeddings in the dataframe.

With the embedding model (embedding-001), specify the task type as RETRIEVAL_QUERY for a user query and RETRIEVAL_DOCUMENT when embedding document text.

Task Type             Description
RETRIEVAL_QUERY       Specifies the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT    Specifies the given text is a document in a search/retrieval setting.

query = "How to create a get request?"
model = 'models/embedding-001'

request = genai.embed_content(model=model,
                              content=query,
                              task_type="retrieval_query")

Use the find_best_passage function to calculate the dot products, then take the index of the largest dot product value to retrieve the most relevant passage from the database.

def find_best_passage(query, dataframe):
  """
  Compute the distances between the query and each document in the dataframe
  using the dot product.
  """
  query_embedding = genai.embed_content(model=model,
                                        content=query,
                                        task_type="retrieval_query")
  dot_products = np.dot(np.stack(dataframe['Embeddings']), query_embedding["embedding"])
  idx = np.argmax(dot_products)
  return dataframe.iloc[idx]['Text']

View the most relevant document from the database:

passage = find_best_passage(query, df)
print(passage)

Result:
GET https://httpbin.org/get Get [Source File](https://github.com/HexmosTech/Lama2/tree/main/examples/0000_sample_get.l2)

Question-Answering Application

Let's try to use the text generation API to create a Q&A system. Input your custom data below to create a simple question-and-answer example. You will still use the dot product as a metric of similarity.

def make_prompt(query, relevant_passage):
  escaped = relevant_passage.replace("'", "").replace('"', "").replace("\n", " ")
  prompt = textwrap.dedent("""You are a helpful and informative bot that answers questions using text from the reference passage included below. \
  Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. \
  However, you are talking to a non-technical audience, so be sure to break down complicated concepts and \
  strike a friendly and conversational tone. \
  If the passage is irrelevant to the answer, you may ignore it.
  QUESTION: '{query}'
  PASSAGE: '{relevant_passage}'

    ANSWER:
  """).format(query=query, relevant_passage=escaped)

  return prompt

Create the prompt that will produce a friendly response tailored for the user.

prompt = make_prompt(query, passage)
print(prompt)

Results:

You are a helpful and informative bot that answers questions using text from the reference passage included below.   Be sure to respond in a complete sentence, being comprehensive, including all relevant background information.   However, you are talking to a non-technical audience, so be sure to break down complicated concepts and strike a friendly and conversational tone.   If the passage is irrelevant to the answer, you may ignore it.
  QUESTION: 'How to create a get request?'
  PASSAGE: '``` GET https://httpbin.org/get ```  Get [Source File](https://github.com/HexmosTech/Lama2/tree/main/examples/0000_sample_get.l2)'

    ANSWER:

Choose one of the Gemini content generation models to find the answer to your query.

for m in genai.list_models():  
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)

Results:

models/gemini-pro
models/gemini-ultra

We choose the Pro model, assigning it to a separate gen_model variable so it does not overwrite the embedding model name used earlier.

gen_model = genai.GenerativeModel('gemini-1.5-pro-latest')
answer = gen_model.generate_content(prompt)

Render the answer as Markdown and print the result.

Markdown(answer.text)

Output:

I can tell you that! Based on the information provided, a GET request can be made using the following format: GET https://httpbin.org/get. For further information, you can refer to the source file available on Github.
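Putting it all together, here is a minimal sketch of an answer_question helper (a hypothetical name wrapping the pieces above) that goes from a user question to a generated answer:

def answer_question(query, dataframe):
  # Retrieve the best passage, build the prompt, and generate an answer.
  passage = find_best_passage(query, dataframe)
  prompt = make_prompt(query, passage)
  gen_model = genai.GenerativeModel('gemini-1.5-pro-latest')
  return gen_model.generate_content(prompt).text

print(answer_question("How to create a get request?", df))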

Conclusion and Next Steps

Through this demo, we found it quite easy to set up a document-search QA bot. The difficult part is taking this to production. We will write about how to take it to production with a simple backend in the next article. Thanks for reading, and stay tuned for the next part.
