I spent the last 6 months building LiveAPI Proxy: Here are 10 HARD-EARNED Engineering Lessons you can use now

I spent quite some time trying to solve an issue with LiveAPI that prevented it from executing API requests. Here is my journey, and the lessons I learned while building a proxy server of my own to solve it.

How LiveAPI Taught me some important Lessons in engineering

I have been working on a product named LiveAPI. Let me just give an idea of what this product does.

The above API doc is a static one; users can't execute or change anything themselves.

Static API docs like these often lose customer attention before the developers even try the APIs.

The above API doc uses LiveAPI. Here developers can execute the APIs instantly, right within their browser, so that developer attention can be captured within the first 30 seconds of their visit.

LiveAPI uses a WASM binary and a language core for executing the APIs. Both were already built, so we started testing LiveAPI against a few httpbin URLs, and everything seemed fine.

When we tried doing a GET request to www.google.com, it failed.
We investigated further and found that the browser was reporting a CORS error.

Browsers block a page on one origin from reading responses from another origin unless the target server explicitly allows it through CORS headers.
That is a serious problem for us, because we always send requests from one site (the API docs) to another (the target API URL).
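To make "one site to another site" concrete, here is a minimal Python sketch of the comparison a browser makes. It is an illustration only: the function names and the docs.example.com URL are made up for this example, and real browsers apply more rules than this.

```python
from urllib.parse import urlsplit

# Default ports, used when a URL does not spell one out
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def is_cross_origin(page_url: str, api_url: str) -> bool:
    """True when a request from page_url to api_url crosses origins."""
    return origin(page_url) != origin(api_url)


# Hypothetical docs page calling the URL from our failed test
print(is_cross_origin("https://docs.example.com/api", "https://www.google.com/"))
```

Any request for which this returns True is subject to CORS checks in the browser, which is exactly the situation an API docs page is always in.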

We thought about this issue for a while, and an idea popped up: how about we use proxy servers? This is a potential solution to the problem and would get us back up and running. Let's see how proxy servers can be a useful approach.

Learning about Proxies: Engineering a Solution for CORS-Free browser requests

What is a proxy server?

Consider this example.
Here you can see two people, Alice and Bob, with a proxy in the middle.

Alice asks the proxy to forward a message to Bob, and Bob does the same in return.
The proxy acts as the middleman, passing information between the two.

This is how proxy servers work.
A proxy server acts as a middleman between a client and a server. There are three parts: the client request, the proxy server, and the response.

Client Request: When you send a request to a website, the proxy server receives it first instead of the website.

Proxy Server: The proxy server then forwards your request to the actual website. It's like a middleman that handles the communication.

Response: The website responds to the proxy server, which then forwards the response back to you.

How Proxies aid with solving the CORS problem

The proxy server makes the request to the target API on behalf of our LiveAPI tool. CORS is enforced only by browsers, so this server-to-server request is not restricted. The browser talks only to the proxy, and as long as the proxy's response carries the right CORS headers, the browser accepts it.
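The idea can be sketched in a few lines of Python. This is a toy model, not our actual proxy: the response shape, the fake_target_api function, and the api.example.com URL are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Tuple

# A response is modelled as (status, headers, body); an assumption for this sketch
Response = Tuple[int, Dict[str, str], str]


def fake_target_api(url: str) -> Response:
    """Hypothetical target API that sends no CORS headers at all."""
    return (200, {"Content-Type": "application/json"}, '{"ok": true}')


def proxy(url: str, fetch: Callable[[str], Response]) -> Response:
    """Fetch url server-side, then let any origin read the relayed response."""
    status, headers, body = fetch(url)  # server-to-server call, no CORS check here
    relayed = dict(headers)             # copy so the upstream headers stay untouched
    relayed["Access-Control-Allow-Origin"] = "*"
    return (status, relayed, body)


status, headers, body = proxy("https://api.example.com/users", fake_target_api)
print(status, headers["Access-Control-Allow-Origin"])
```

The target never has to know about CORS; the header the browser needs is added on the way back.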

Figuring out how to build a proxy server: The approach I took

Now that we had an idea of what the solution looked like, we had to decide which technologies to use.

We already had an apache2 server up and running, and since Apache is a widely used server with a lot of module support, we felt it was the better choice for building our proxy server.

Putting the Solution into Action: Building an Apache2 Proxy and Getting LiveAPI working

Setting things up

To set up the proxy server, we first created a forward_proxy-le-ssl.conf file inside /etc/apache2/sites-available:

<IfModule mod_ssl.c>
 <VirtualHost *:443>
 ProxyPreserveHost On

 # Server Name
 ServerName example.com

 # Proxy and authorization
 <Proxy "*">
 AuthType Basic
 AuthName "Restricted Access"
 AuthUserFile /etc/apache2/sites-available/proxy.htpasswd
 Require valid-user
 </Proxy>

 # Error Log
 ErrorLog ${APACHE_LOG_DIR}/error_forward_proxy.log

 # SSL Certificates
 SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem
 SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
 Include /etc/letsencrypt/options-ssl-apache.conf


 </VirtualHost>
</IfModule>

Here I have a basic configuration set up, with the server name pointing to example.com, errors being logged to a file, and SSL certificates in place.

Let's go over the configuration briefly.

<IfModule mod_ssl.c>:

  • This condition ensures that the configuration inside the block is only applied if the mod_ssl module is loaded. This module enables SSL/TLS for the Apache server.
  • SSL/TLS provides encryption for secure communication over the internet and protects sensitive data.

<VirtualHost *:443>:

  • This block defines a virtual host that listens on port 443, which is the standard port for HTTPS connections.

ServerName example.com:

  • This directive specifies the domain name for this virtual host.

<Proxy "*">:

  • This line specifies that the following directives apply to all proxy requests. The asterisk (*) is a wildcard that means any URL or request being proxied by the server.

AuthType Basic:

  • This sets the authentication type to "Basic". Basic authentication is a simple method where the client (like a web browser) sends a username and password encoded in base64.

AuthName "Restricted Access":

  • This specifies the name of the authentication realm. When users try to access the proxy, they will see a pop-up asking for a username and password, and it will display "Restricted Access" as the prompt message.

AuthUserFile /etc/apache2/sites-available/proxy.htpasswd:

  • This points to the location of the password file. The file /etc/apache2/sites-available/proxy.htpasswd lists the usernames, together with their hashed passwords, that are allowed to access the proxy.
  • To create proxy.htpasswd, you can use the htpasswd command-line tool provided by Apache, which lets you add or modify usernames and passwords in the file.

Require valid-user:

  • This specifies that any valid user (anyone whose credentials are in the proxy.htpasswd file) is allowed to access the proxy.

ErrorLog ${APACHE_LOG_DIR}/error_forward_proxy.log:

  • This directive specifies the location of the error log file for this virtual host. Errors related to this virtual host will be logged to the specified file.
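To make the Basic scheme from the walkthrough concrete, here is a small Python sketch of how a client builds the credential it sends. The username and password are made-up values, and the helper names are mine.

```python
import base64


def basic_auth_value(username: str, password: str) -> str:
    """Build the value a client sends in a Basic authentication header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(value: str) -> tuple:
    """Reverse the encoding, showing that base64 hides nothing."""
    scheme, _, encoded = value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return (username, password)


# Hypothetical credentials, for illustration only
print(basic_auth_value("alice", "secret"))
```

Because the value can be decoded by anyone who sees it, Basic authentication only makes sense over HTTPS, which is why the virtual host listens on port 443.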

After making the modifications, this is what I get on accessing example.com

Forwarding the request

My next goal was to use this for executing web requests. For example, when I open https://example.com/https://www.google.com, it should lead to the www.google.com website.

I added a Proxy directive and a bunch of rewrite rules that forward the request properly.

<Proxy "*">
Require valid-user
Require all granted
</Proxy>

# Each forwarding rule below applies only to non-OPTIONS requests for example.com.
# The [P] flag hands the rewritten URL to mod_proxy.

# /https:/http:/host/path -> https://host/path
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteCond %{REQUEST_METHOD} !OPTIONS
RewriteRule ^/https:/http:/([^/]+)(/.*)?$ https://$1$2 [P,R=301]

# /http:/host/path -> https://host/path
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteCond %{REQUEST_METHOD} !OPTIONS
RewriteRule ^/http:/([^/]+)(/.*)?$ https://$1$2 [P,R=301]

# /https:/host/path -> https://host/path
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteCond %{REQUEST_METHOD} !OPTIONS
RewriteRule ^/https:/([^/]+)(.*)?$ https://$1$2 [P,R=301]

# Answer CORS preflight (OPTIONS) requests directly with a 200 status
RewriteCond %{REQUEST_METHOD} OPTIONS
RewriteRule ^(.*)$ $1 [R=200,L]
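To see what the rewrite patterns do to a request path, here is a rough Python equivalent that can be tried outside Apache. It is a sketch under assumptions: it folds the http and https rules into one pattern, skips the doubled-prefix rule, and the function name is mine. The single slash after the scheme is presumably there because Apache collapses consecutive slashes in the request path by default.

```python
import re
from typing import Optional

# Scheme, one slash, host, then an optional path, as in the RewriteRule patterns
PROXIED_PATH = re.compile(r"^/https?:/([^/]+)(/.*)?$")


def target_url(request_path: str) -> Optional[str]:
    """Map a proxy request path to the URL it should be forwarded to."""
    match = PROXIED_PATH.match(request_path)
    if match is None:
        return None  # not a proxied path, leave it alone
    host = match.group(1)
    path = match.group(2) or ""  # the path group is optional
    # Like the rules above, always forward over https
    return f"https://{host}{path}"


print(target_url("/https:/www.google.com"))
print(target_url("/https:/httpbin.org/ip"))
```

Testing the regex in isolation like this is much faster than restarting Apache after every change.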

After making the changes, I came across this problem

Something didn't feel right. I googled the issue, and it turned out I hadn't enabled SSLProxyEngine, which lets the proxy make SSL/TLS connections to the target server.

I added the following line to the configuration

SSLProxyEngine on

After restarting the server once more, now when I access https://example.com/https://www.google.com , I get this

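I was seeing Google, but something was off: it returned a 404 even though the URL was right.
I experimented with the configuration and tried turning off directives that might interfere.

When I removed the line ProxyPreserveHost On, the issue was solved. With that directive on, Apache passes the incoming Host header (example.com) through to the target instead of the target's own hostname, so Google was being asked for a site it does not serve.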

And just like that, our proxy started working!
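The effect of ProxyPreserveHost can be modelled in a few lines of Python. This is only a sketch of the decision involved, with a function name of my own choosing, not how Apache implements it.

```python
from urllib.parse import urlsplit


def upstream_host_header(target_url: str, client_host: str, preserve_host: bool) -> str:
    """Pick the Host header value that the proxy sends to the target."""
    if preserve_host:
        # ProxyPreserveHost On: reuse whatever Host the client sent to the proxy
        return client_host
    # ProxyPreserveHost Off: use the host taken from the target URL
    return urlsplit(target_url).netloc


print(upstream_host_header("https://www.google.com/", "example.com", True))
print(upstream_host_header("https://www.google.com/", "example.com", False))
```

With the flag on, the target receives a Host it does not recognise, which matches the 404 we saw.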

But as you can see there is an issue with the images not loading up. We set that aside since that's not the priority for the moment.

Testing the proxy with examples

1. Simple GET Request example

Our next goal was to test this on more examples and ensure it doesn't break.
We have a page at https://hexmos.com/lama2/tutorials/examples.html where we are going to implement LiveAPI, so we wanted to ensure all the examples there work properly.

We hooked our proxy up with the LiveAPI widget and started testing it.
We started with a simple GET request.

GET
https://www.httpbin.org/ip

On executing, we found another problem.

The CORS error popped up once more. On inspecting the headers, I found duplicate Access-Control-Allow-Origin headers in the response.

To fix this, I applied the following in the apache2 config file.

Header onsuccess unset Access-Control-Allow-Origin
Header always set Access-Control-Allow-Origin "*"

This removes the existing Access-Control-Allow-Origin header and sets a single new one, which solves the CORS issue.
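The same "unset, then set once" logic looks like this in Python. It is a sketch that works on a plain list of header pairs; the sample headers are invented, and the function name is mine.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def normalize_cors(headers: List[Header]) -> List[Header]:
    """Drop every Access-Control-Allow-Origin header, then add exactly one."""
    kept = [
        (name, value)
        for name, value in headers
        if name.lower() != "access-control-allow-origin"
    ]
    kept.append(("Access-Control-Allow-Origin", "*"))
    return kept


# Hypothetical response that arrived with the header duplicated
upstream = [
    ("Content-Type", "application/json"),
    ("Access-Control-Allow-Origin", "*"),
    ("Access-Control-Allow-Origin", "https://example.com"),
]
print(normalize_cors(upstream))
```

Browsers reject a response that carries more than one value for this header, so the count matters as much as the value.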

2. Authorization header example

We continued testing again and came across another roadblock.
This code is meant to pass the authorization headers.

url = "http://google.com"
REMOTE_COORD = "https://httpbin.org"
---
# stage 1

POST
${REMOTE_COORD}/anything

{
    "username": "admin",
    "password": "Password@123",
    "from": "${LOCAL_COORD}/anything",
    "url": "${url}",
    "Token": "MySuperSecretToken"
}

---

// filtering, store in var
console.log("@@Result", result)
TOKEN = result["json"]["Token"]
console.log(TOKEN)

---

# stage 2
GET
${REMOTE_COORD}/bearer

Authorization: 'Bearer ${TOKEN}'

The problem here is that our proxy server already requires an Authorization header, and the request now carries a second one for the target API. The two collide, so the request fails.

We tried various apache2-based solutions but couldn't get any of them to work.
After some more thinking, we got the idea to move the proxy's own credentials into a custom header.

I used a custom header called proxyauth.
Here is how the configuration was modified:
Here is how the configuration is modified

<Proxy "*">
Require valid-user
Require all granted

<If "%{HTTP:proxyauth} != 'Basic cHJveFlTXXJ2GXI5cHJvedkyGjUyMzQ0NnNlcnZlcg=='">
 ErrorDocument 401 "Unauthorized Access"
 Require valid-user
</If>

</Proxy>

I changed the contents of the Proxy directive. I removed the existing authentication and replaced it with a simple check: does the proxyauth header equal "Basic " followed by the base64 encoding of the username and password combination?

If it doesn't match, the proxy responds with an "Unauthorized Access" error.
This approach solved the problem.
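Here is the same idea as a Python sketch: the proxy reads its own credential from proxyauth and leaves Authorization untouched for the target API. The credentials are placeholders, the header dictionary is assumed to use the lowercase key "proxyauth", and hmac.compare_digest is a constant-time comparison I swapped in; the Apache config above does a plain string comparison.

```python
import base64
import hmac
from typing import Mapping


def expected_proxyauth(username: str, password: str) -> str:
    """Value the proxy expects to find in the custom proxyauth header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def is_authorized(headers: Mapping[str, str], expected: str) -> bool:
    """Check only the proxyauth header; Authorization belongs to the target API."""
    supplied = headers.get("proxyauth", "")
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


# Hypothetical proxy credentials, for illustration only
expected = expected_proxyauth("proxyuser", "proxypass")
request_headers = {
    "proxyauth": expected,
    "Authorization": "Bearer MySuperSecretToken",
}
print(is_authorized(request_headers, expected))
```

Both credentials now travel in the same request without stepping on each other.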

We resumed our testing and came across one last issue in the examples:

POST
https://httpbin.org/post
# HEADERS
Cookie:"sessionid=foo;another-cookie=bar"
# DATA
hello=world

Here the cookie is not being passed along as a header. Browsers treat Cookie as a forbidden header name, so scripts running in a page cannot set it on a request themselves.
To verify this issue, we checked other API executors online; some of them handled it, but some didn't.

Since some of them handled it, there had to be a way.
We tried various solutions and ended up with an approach similar to the one we used for the authorization issue.

We used a CustomCookie header.
apache2 reads this CustomCookie header and converts it into a Cookie header.

I added the following line to make it work

 RequestHeader set Cookie "expr=%{HTTP:CustomCookie}"
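In Python terms, the translation looks roughly like this. It is a sketch with a function name of my own; unlike the Apache line above, it also drops CustomCookie from the forwarded headers and leaves Cookie alone when CustomCookie is absent, which are choices I made for the example.

```python
from typing import Dict


def promote_custom_cookie(headers: Dict[str, str]) -> Dict[str, str]:
    """Copy the CustomCookie value into a real Cookie header before forwarding."""
    forwarded = {}
    custom_value = None
    for name, value in headers.items():
        if name.lower() == "customcookie":
            custom_value = value  # remember it, but do not forward it as-is
        else:
            forwarded[name] = value
    if custom_value is not None:
        forwarded["Cookie"] = custom_value
    return forwarded


incoming = {"CustomCookie": "sessionid=foo;another-cookie=bar", "Accept": "*/*"}
print(promote_custom_cookie(incoming))
```

The widget sets CustomCookie, which the browser allows, and the target API still receives an ordinary Cookie header.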

Through these fixes, our LiveAPI widget is ready for integration into other platforms.

Getting LiveAPI used in the Real World: Lessons learned from dealing with Cloudflare

We started trying to integrate it with some popular sites that don't have such API widgets.

We started with cal.com. Their API docs are open source, so I could just pull the repo and try to integrate our widget.

After quite a bit of trying, I got the widget to load.
When I tried to execute one of their APIs, I encountered another problem.

Cloudflare was blocking our proxy.

We again searched for solutions in apache2 but found none.
We checked how other API executors handle this. Instead of calling the proxy server directly, they use an API to perform the request.

So we tried the same approach. I set up a Flask server with all the functionality, and this was the result:

from flask import Flask, request, Response
import requests
from urllib.parse import urlparse, urlencode, urlunparse

app = Flask(__name__)

@app.route('/proxy', methods=['GET', 'POST', 'PUT', 'DELETE', 'PATCH', 'OPTIONS'])
def proxy_request():
    # Log the incoming headers for debugging
    print("API")
    for header, value in request.headers.items():
        print(f"{header}: {value}")

    # Copy every incoming header except Host
    headers = {key: value for key, value in request.headers if key.lower() != 'host'}

    # The target URL is passed in the custom hex-url header
    url = request.headers.get('hex-url')

    if url is None:
        return Response('hex-url header is missing', status=400)

    # Parse the target URL so that its query string can be rebuilt
    parsed_url = urlparse(url)
    query_params = parsed_url.query

    # For GET requests, use the query parameters sent to /proxy as the target's query string
    if request.method == 'GET':
        print("Get Request")
        query_params = urlencode(request.args)
        new_url = urlunparse(parsed_url._replace(query=query_params))
        data = None  # No data for GET requests

Here is the full python script
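The URL handling is the part of this script that is easiest to get subtly wrong, so here is a standalone sketch of it using only the standard library. Note that this is a variation and not what the script above does: it merges the extra parameters with any query string already present in the hex-url value instead of replacing it. The function name is mine.

```python
from typing import Mapping
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_target_url(hex_url: str, extra_args: Mapping[str, str]) -> str:
    """Rebuild the target URL, keeping its own query and appending extra_args."""
    parts = urlsplit(hex_url)
    # Start from the parameters already present in the target URL
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Then append the parameters that arrived on the /proxy request
    pairs.extend(extra_args.items())
    return urlunsplit(parts._replace(query=urlencode(pairs)))


print(build_target_url("https://httpbin.org/get?a=1", {"b": "2"}))
```

Keeping this logic in a small pure function makes it easy to test without starting Flask at all.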

In the proxy server, I added the following lines to pass /proxy requests on to the Flask server:

ProxyPass /proxy http://127.0.0.1:5000/proxy
ProxyPassReverse /proxy http://127.0.0.1:5000/proxy

This fixed the API issue and I got the cal.com API working.

Using an API may be a better approach, but it requires a Flask server to be running at all times. If the Flask server fails, the whole proxy fails with it and becomes unusable.

We have kept this as a temporary solution. If we can find a suitable solution in apache2, we will use that instead.

10 Important Lessons I got from this long journey: How you can use it

1. Research the technologies that can solve the problem

The lesson I got from the CORS problem is to properly research the technologies that can solve the problem. In some cases, we may not be aware of the technology that could be the very cure. In this case, I had to study what proxy servers are and how to set them up.

2. Think of the problem as a puzzle

If a solution is not readily available on the internet, sometimes you have to think of the problem as a puzzle. The answer is out there, waiting to be discovered, and you can find it without relying on external sources.

That's how we figured out the solution behind the cookie headers.

3. Simpler solutions do exist

We always need to remember that simpler solutions do exist. When I tried to solve the authentication problem, there were loads of threads and loads of new information to process, but in the end none of it was useful. Then one night an idea popped into my head, an organic one without any reference to the internet. At first I thought it might not work, but when I implemented it, it turned out to be the solution to my problem.

4. It doesn't hurt to toggle things around a bit

When something seems to be broken, it doesn't hurt to toggle things around a bit. Sometimes a piece of configuration gets added without any particular reason. That's what I faced with ProxyPreserveHost: the line was just sitting there, and when I disabled it, the problem was solved.

5. Get your code exposed to outside conditions

However perfect our code and functionality may seem in our testing environment, it's important to expose the code to outside conditions. Doing so reveals problems that you didn't even know existed, and it saves future frustration when the code gets used by others.

6. Expect Roadblocks

Cloudflare gave me an unpleasant lesson. However far you think you have progressed on a problem, roadblocks can still appear that force you to switch to temporary solutions for the time being.

7. Stick with software that's stood the test of time

We always tend to jump towards new technologies. At the same time, we need to remember to put our trust in software that has been around for a long time, because most of the features we wish for have probably already been implemented in it.

8. Watch out for drawbacks, even if it's a good solution

From Flask I learned that however good a solution may be, we should be on the lookout for its drawbacks. The Flask solution seemed quick and easy, but the possibility of failure is higher because we now depend on an additional server.

9. Logging levels save time

Whatever the platform, use logging and its various logging levels effectively. It will speed up your debugging process. Here is an article I have written on Logging Levels and How to use them effectively.
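As a small illustration of what levels buy you, here is a Python sketch using the standard logging module. The logger name, the messages, and the list-based handler are made up for the example; the handler just collects messages so the filtering is easy to see.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages in a list instead of printing them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def make_logger(level: int):
    """Return a logger that keeps only messages at or above the given level."""
    logger = logging.getLogger("proxy-sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(level)
    handler = ListHandler()
    logger.addHandler(handler)
    return logger, handler


logger, handler = make_logger(logging.INFO)
logger.debug("incoming headers dumped")  # filtered out at INFO level
logger.info("forwarding request")
logger.error("upstream refused the connection")
print(handler.messages)
```

Switching the level to logging.DEBUG brings the detailed messages back when you are hunting a problem, without touching the code that emits them.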

10. Avoid experimenting for long periods without understanding

We often overlook this, but it's important: we should know exactly what we are trying to do. A lot of time went by because we lacked proper knowledge of the regex involved in the rewrite rules. Once we got better clarity, we were able to solve the problem quickly.

Conclusion

The journey we have shown you highlights how solving engineering problems is rarely straightforward. There will be various twists, turns, and roadblocks. What matters is a sound problem-solving approach that carries you through to the other end.

Even though we were successful with one solution, in this case apache2, a single problem towards the end caused us to bring in Flask. We will be investigating further to see whether the Flask solution is viable. Here is the link to the repo where both the Flask and proxy configurations are stored.

We will be focusing more on widget development for now, as the proxy and the fundamental problems have been solved. You can follow our progress through my Twitter, where I will keep you updated on the status of LiveAPI and the new problems we have been solving lately.

Here is our latest UI Demo for LiveAPI.
