I spent the last 6 months building LiveAPI Proxy: Here are 10 HARD-EARNED Engineering Lessons you can use now
I spent quite some time trying to solve an issue with LiveAPI that prevented it from executing API requests. Here is my journey, and the lessons I learned while building a proxy server of my own to solve it.
How LiveAPI Taught me some important Lessons in engineering
I have been working on a product named LiveAPI. Let me just give an idea of what this product does.
The above API doc is a static one; users can't execute or change anything themselves.
Static API docs like these often lose customer attention before the developers even try the APIs.
The above API doc uses LiveAPI. Here developers can execute the APIs instantly, right within their browser, so their attention is captured within the first 30 seconds of their visit.
LiveAPI uses a WASM binary and a language core for executing the APIs. With these already built, we started testing on some httpbin URLs, and everything seemed fine.
When we tried doing a GET request to www.google.com, it failed.
We investigated further and found out that there was a CORS error going on.
A CORS error means the browser has blocked a request from one site to another because the target did not explicitly allow it.
But cross-site requests are vital for us, because we are always requesting from one site (the API docs) to another site (the target API URL).
We thought about this issue for a while, and an idea popped up: how about we use proxy servers? This looked like a potential solution that would get us back up and running. Let's see how proxy servers can be a useful approach.
Learning about Proxies: Engineering a Solution for CORS-Free browser requests
What is a proxy server?
Consider this example.
Here you can see two people, Alice and Bob. In the middle there is a proxy.
Alice asks the proxy to forward a message to Bob, and Bob does the same in return.
The proxy acts as the middleman here passing information between these two people.
This is how proxy servers work.
A proxy server acts as a middleman between a client and a server. There are three parts: the client request, the proxy server and the response.
Client Request: When you send a request to a website, the proxy server receives it first instead of the website.
Proxy Server: The proxy server then forwards your request to the actual website. It’s like a middleman that handles the communication.
Response: The website responds to the proxy server, which then forwards the response back to you.
How Proxies aid with solving the CORS problem
The proxy server makes the request to the target API on behalf of our LiveAPI tool. CORS is enforced only by browsers, so this server-to-server request is not restricted. The browser only talks to the proxy, and because the proxy's response carries headers that allow our site, the browser accepts it.
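A minimal Python sketch of this middleman flow, not LiveAPI's actual proxy. The `fetch_upstream` callable and `fake_upstream` are hypothetical stand-ins for the real server-to-server HTTP call, so the example runs offline.

```python
from typing import Callable, Dict, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Dict[str, str], bytes]


def proxy(target_url: str, fetch_upstream: Callable[[str], Response]) -> Response:
    """Fetch target_url server-side and make the reply readable by any origin."""
    # Server-to-server call: no browser is involved, so CORS does not apply.
    status, upstream_headers, body = fetch_upstream(target_url)
    relayed = dict(upstream_headers)
    # The browser checks this header on the proxy's response.
    relayed["Access-Control-Allow-Origin"] = "*"
    return status, relayed, body


def fake_upstream(url: str) -> Response:
    # Stand-in for a real HTTP request to the target site.
    return 200, {"Content-Type": "text/html"}, b"<html>hello</html>"


status, headers, body = proxy("https://www.google.com", fake_upstream)
print(status, headers["Access-Control-Allow-Origin"])
```

The browser never contacts the target directly; it only sees the proxy's response, which is why the permissive header has to be set there.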
Figuring out how to build a proxy server: The approach I took
Now that we had an idea of what the solution looked like, we started thinking about which technologies to use.
For our case, we already had an apache2 server up and running, and since Apache is a widely used server with a lot of module support, we felt it was the better choice for building our proxy server.
Putting the Solution into Action: Building an Apache2 Proxy and Getting LiveAPI working
Setting things up
To set up the proxy server, we first created a forward_proxy-le-ssl.conf file inside /etc/apache2/sites-available:
<IfModule mod_ssl.c>
    <VirtualHost *:443>
        ProxyPreserveHost On
        # Server Name
        ServerName example.com
        # Proxy and authorization
        <Proxy "*">
            AuthType Basic
            AuthName "Restricted Access"
            AuthUserFile /etc/apache2/sites-available/proxy.htpasswd
            Require valid-user
        </Proxy>
        # Error Log
        ErrorLog ${APACHE_LOG_DIR}/error_forward_proxy.log
        # SSL Certificates
        SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem
        SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
        Include /etc/letsencrypt/options-ssl-apache.conf
    </VirtualHost>
</IfModule>
Here I have a basic configuration set up, with the server name pointing to example.com, errors being logged to a file, and SSL certificates in place.
Let's go over the code briefly.
<IfModule mod_ssl.c>:
- This condition ensures that the configuration inside the block is only applied if the mod_ssl module is loaded. This module enables SSL/TLS for the Apache server.
- SSL/TLS provides encryption for secure communication over the internet and protects sensitive data.
<VirtualHost *:443>:
- This block defines a virtual host that listens on port 443, which is the standard port for HTTPS connections.
ServerName example.com:
- This directive specifies the domain name for this virtual host.
<Proxy "*">:
- This line specifies that the following directives apply to all proxy requests. The asterisk (*) is a wildcard that means any URL or request being proxied by the server.
AuthType Basic:
- This sets the authentication type to "Basic". Basic authentication is a simple method where the client (like a web browser) sends a username and password encoded in base64.
AuthName "Restricted Access":
- This specifies the name of the authentication realm. When users try to access the proxy, they will see a pop-up asking for a username and password, and it will display "Restricted Access" as the prompt message.
AuthUserFile /etc/apache2/sites-available/proxy.htpasswd:
- This points to the location of the password file. The file /etc/apache2/sites-available/proxy.htpasswd contains a list of usernames and passwords that are allowed to access the proxy.
- To create proxy.htpasswd, you can use the htpasswd command-line tool provided by Apache, which allows you to add or modify usernames and passwords in the file securely.
Require valid-user:
- This specifies that any valid user (anyone whose credentials are in the proxy.htpasswd file) is allowed to access the proxy.
ErrorLog ${APACHE_LOG_DIR}/error_forward_proxy.log:
- This directive specifies the location of the error log file for this virtual host. Errors related to this virtual host will be logged to the specified file.
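To make the "Basic" scheme described above concrete, here is a small Python sketch of how a client builds the header value. The username and password are made-up examples, not credentials from this setup.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    # Basic auth joins the credentials with a colon and base64-encodes them.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


# "alice" and "secret" are placeholder credentials for illustration.
header = basic_auth_header("alice", "secret")
print(header)  # Basic YWxpY2U6c2VjcmV0
```

Base64 is an encoding, not encryption, so anyone who sees the header can decode it. That is one reason the virtual host above is served over HTTPS.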
After making the modifications, this is what I get on accessing example.com
Forwarding the request
Now my next goal is to use this for executing web requests. For example, when I open https://example.com/https://www.google.com, it should lead to the www.google.com website.
I added a Proxy directive and a bunch of Rewrite rules which will help us forward the request properly.
<Proxy "*">
    Require valid-user
    Require all granted
</Proxy>
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteCond %{REQUEST_METHOD} !OPTIONS
RewriteRule ^/https:/http:/([^/]+)(/.*)?$ https://$1$2 [P,R=301]
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteCond %{REQUEST_METHOD} !OPTIONS
RewriteRule ^/http:/([^/]+)(/.*)?$ https://$1$2 [P,R=301]
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteCond %{REQUEST_METHOD} !OPTIONS
RewriteRule ^/https:/([^/]+)(.*)?$ https://$1$2 [P,R=301]
RewriteCond %{REQUEST_METHOD} OPTIONS
RewriteRule ^(.*)$ $1 [R=200,L]
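To see what the rewrite patterns do, here is the third RewriteRule's regex tried out in Python. This is only an illustration of the pattern matching, not of Apache itself. It assumes Apache has already merged the double slash in the request path, which would explain why the pattern matches `https:/` with a single slash. The `rewrite` helper is a hypothetical name.

```python
import re
from typing import Optional

# Pattern copied from the third RewriteRule above.
RULE = re.compile(r"^/https:/([^/]+)(.*)?$")


def rewrite(path: str) -> Optional[str]:
    """Return the target URL for a proxied path, or None if the rule does not match."""
    match = RULE.match(path)
    if match is None:
        return None
    # $1 is the target host, $2 is the rest of the path (possibly empty).
    host, rest = match.group(1), match.group(2) or ""
    return f"https://{host}{rest}"


print(rewrite("/https:/www.google.com"))   # https://www.google.com
print(rewrite("/https:/httpbin.org/get"))  # https://httpbin.org/get
print(rewrite("/index.html"))              # None
```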
After making the changes, I came across this problem
Something doesn't feel right. I googled the issue a bit, and it turns out I hadn't enabled SSLProxyEngine, which lets the proxy make SSL/TLS connections to remote servers.
I added the following line to the configuration
SSLProxyEngine on
After restarting the server once more, now when I access https://example.com/https://www.google.com , I get this
I'm seeing Google, but there's something off. It gives a 404 even though the URL was right.
I messed around with the configuration and tried turning off things that might interfere.
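When I removed the line ProxyPreserveHost On, the issue was solved. With that directive enabled, Apache passes the original Host header (example.com) on to the target, so Google was being asked for a site it doesn't serve.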
And just like that, our proxy started working!
But as you can see there is an issue with the images not loading up. We set that aside since that's not the priority for the moment.
Testing the proxy with examples
1. Simple GET Request example
Our next goal is to test this on more examples and ensure it doesn't break.
We have a page at https://hexmos.com/lama2/tutorials/examples.html where we are going to implement LiveAPI, so it made sense to ensure all the examples there work properly.
We hooked our proxy up with the LiveAPI widget and started testing it.
We started with a simple GET request.
GET
https://www.httpbin.org/ip
On executing, we found another problem.
The CORS error popped up once more. On inspecting the headers, I found duplicate Access-Control-Allow-Origin headers.
To fix this, I applied the following in the apache2 config file.
Header onsuccess unset Access-Control-Allow-Origin
Header always set Access-Control-Allow-Origin "*"
This removes the Access-Control-Allow-Origin header sent by the target and sets a single one of our own, which solves the CORS issue.
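Browsers reject a response that carries more than one Access-Control-Allow-Origin value, which is why the duplicate matters. The effect of the two Header directives can be sketched in Python as follows. The header list is a hypothetical example and `fix_cors_headers` is an illustrative name, not part of Apache.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def fix_cors_headers(upstream_headers: List[Header]) -> List[Header]:
    """Mimic the two Header directives: drop upstream values, then set one."""
    name = "access-control-allow-origin"
    # Equivalent of "unset": remove whatever value(s) arrived with the response.
    kept = [(k, v) for k, v in upstream_headers if k.lower() != name]
    # Equivalent of "set": add exactly one value of our own.
    kept.append(("Access-Control-Allow-Origin", "*"))
    return kept


# Hypothetical response headers in which the header appears twice.
upstream = [
    ("Content-Type", "application/json"),
    ("Access-Control-Allow-Origin", "*"),
    ("Access-Control-Allow-Origin", "https://docs.example.com"),
]
fixed = fix_cors_headers(upstream)
print(fixed)
```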
2. Authorization header example
We continued testing again and came across another roadblock.
This code is meant to pass the authorization headers.
url = "http://google.com"
REMOTE_COORD = "https://httpbin.org"
---
# stage 1
POST
${REMOTE_COORD}/anything
{
"username": "admin",
"password": "Password@123",
"from": "${LOCAL_COORD}/anything",
"url": "${url}",
"Token": "MySuperSecretToken"
}
---
// filtering, store in var
console.log("@@Result", result)
TOKEN = result["json"]["Token"]
console.log(TOKEN)
---
# stage 2
GET
${REMOTE_COORD}/bearer
Authorization: 'Bearer ${TOKEN}'
The problem here is that our proxy server already requires an Authorization header for itself, and now the target API needs a second one. The two auth headers clash, and the request fails.
As a solution to this, we tried various apache2 related solutions but couldn't get any results.
After some more thinking, we got the idea to use a custom header and use that header as authorization for the proxy.
I used the custom header called proxyauth
Here is how the configuration is modified
<Proxy "*">
    Require valid-user
    Require all granted
    <If "%{HTTP:proxyauth} != 'Basic cHJveFlTXXJ2GXI5cHJvedkyGjUyMzQ0NnNlcnZlcg=='">
        ErrorDocument 401 "Unauthorized Access"
        Require valid-user
    </If>
</Proxy>
I changed the contents of the Proxy directive. I removed the existing authentication and replaced it with a simple check. This checks whether the proxyauth header equals the base64 of the username and password combination.
If it doesn't match, then it will display an error.
Through this approach we got the problem solved as well.
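From the client's side, the change means the two credentials travel in separate headers. A rough Python sketch, with placeholder values and a hypothetical helper name:

```python
from typing import Dict


def build_headers(proxy_credentials: str, api_token: str) -> Dict[str, str]:
    """Keep the proxy credentials and the target API's token in separate headers."""
    return {
        # Checked by the proxy; "proxyauth" is the custom header from the config above.
        "proxyauth": f"Basic {proxy_credentials}",
        # Left free for the target API's own authorization scheme.
        "Authorization": f"Bearer {api_token}",
    }


# Both values are placeholders, not real credentials.
headers = build_headers("BASE64_OF_USER_AND_PASSWORD", "MySuperSecretToken")
print(headers)
```

Because the proxy no longer reads the Authorization header, the target API's token no longer collides with the proxy's own check.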
3. Cookie header example
We resumed our testing and came across one last issue in the examples:
POST
https://httpbin.org/post
# HEADERS
Cookie:"sessionid=foo;another-cookie=bar"
# DATA
hello=world
Here the Cookie header is not being passed along as a header. The most likely reason is that browsers do not let scripts set the Cookie header on outgoing requests.
To verify this issue, we checked other API executors online. Some of them handled it, but some didn't.
Since some tools already handled it, there had to be a way.
We tried various solutions and ended up using a similar approach to the authorization issue we had earlier.
We used a CustomCookie header.
This CustomCookie header is read by apache2 and converted to a Cookie header accordingly.
I added the following line to make it work
RequestHeader set Cookie "expr=%{HTTP:CustomCookie}"
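The translation that line performs can be pictured with a short Python sketch. This is a simplification: it only copies the value when CustomCookie is present, and `translate_cookie_header` is a hypothetical name rather than anything Apache provides.

```python
from typing import Dict


def translate_cookie_header(headers: Dict[str, str]) -> Dict[str, str]:
    """Copy the CustomCookie value into a Cookie header for the outgoing request."""
    outgoing = dict(headers)
    custom = outgoing.get("CustomCookie")
    if custom is not None:
        # The target API only understands the standard Cookie header.
        outgoing["Cookie"] = custom
    return outgoing


incoming = {"CustomCookie": "sessionid=foo;another-cookie=bar", "Accept": "*/*"}
outgoing = translate_cookie_header(incoming)
print(outgoing["Cookie"])  # sessionid=foo;another-cookie=bar
```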
Through these fixes, our LiveAPI widget is ready for integration into other platforms.
Getting LiveAPI used in the Real World: Lessons learned from dealing with Cloudflare
We started trying to integrate it with some popular sites that don't have such API widgets.
We started with cal.com. Their API docs are open source, so I could just pull the repo and try to integrate our widget.
With quite a bit of trying, I got the widget to load up.
When I tried to execute one of their APIs, I encountered another problem.
Cloudflare is blocking our proxy.
We again searched for solutions in apache2 but found none.
We checked how other API executors handle this. Instead of calling the proxy server directly, they use an API to perform the request.
So we tried the same approach. I set up a Flask server with all the functionality, and this was the result:
from flask import Flask, request, Response
import requests
from urllib.parse import urlparse, urlencode, urlunparse

app = Flask(__name__)

@app.route('/proxy', methods=['GET', 'POST', 'PUT', 'DELETE', 'PATCH', 'OPTIONS'])
def proxy_request():
    # Print the incoming headers for debugging
    print("API")
    for header, value in request.headers.items():
        print(f"{header}: {value}")
    # Copy every incoming header except Host
    headers = {key: value for key, value in request.headers if key.lower() != 'host'}
    # Extract the URL from the request headers
    url = request.headers.get('hex-url')
    if url is None:
        return Response('hex-url header is missing', status=400)
    # Parse the original request URL to extract query parameters
    parsed_url = urlparse(url)
    query_params = parsed_url.query
    # For GET requests, append query parameters from the original URL
    if request.method == 'GET':
        print("Get Request")
        query_params = urlencode(request.args)
        new_url = urlunparse(parsed_url._replace(query=query_params))
        data = None  # No data for GET requests
Here is the full python script
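The excerpt above stops before the target URL is used. As a rough, standard-library-only sketch of the URL-building step, the helper below merges extra query parameters into the URL taken from the hex-url header. The function name and the merge behaviour are assumptions for illustration and are not taken from the full script.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def build_target_url(hex_url: str, extra_params: Dict[str, str]) -> str:
    """Merge extra query parameters into the URL taken from the hex-url header."""
    parsed = urlparse(hex_url)
    # Keep parameters already present in the target URL, then add the new ones.
    params = parse_qsl(parsed.query, keep_blank_values=True)
    params.extend(extra_params.items())
    return urlunparse(parsed._replace(query=urlencode(params)))


print(build_target_url("https://httpbin.org/get?a=1", {"b": "2"}))
# https://httpbin.org/get?a=1&b=2
```

Keeping this step in a small pure function makes it easy to test without starting the Flask server.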
In the proxy server, I added the following lines to pass requests on to the Flask server:
ProxyPass /proxy http://127.0.0.1:5000/proxy
ProxyPassReverse /proxy http://127.0.0.1:5000/proxy
This fixed the API issue and I got the cal.com API working.
Using an API may be a better approach, but it requires us to keep a Flask server running at all times. If the Flask server fails, our whole proxy fails and becomes unusable.
We have kept this as a temporary solution. If we can find a suitable solution in apache2, we will use that instead.
10 Important Lessons I got from this long journey: How you can use it
1. Do proper research on the technologies that could solve the problem
The lesson I took from the CORS problem is to research the technologies that could solve it. Sometimes we are simply not aware of the technology that would cure the problem. In this case, I had to learn what proxy servers are and how to set them up.
2. Think of the problem as a puzzle
If a solution isn't readily available on the internet, sometimes you have to think of the problem as a puzzle. The answer is out there waiting to be discovered, and you can find it without relying on external sources.
That's how we figured out the solution behind the cookie headers.
3. Simpler solutions do exist
We always need to remember that simpler solutions do exist. When I tried to solve the authentication problem, there were loads of threads and loads of new information to process, but in the end none of it was useful. Then one night an idea popped into my head, an organic one without any reference to the internet. At first I thought it might not work, but when I implemented it, it turned out to be the solution to my problem.
4. It doesn't hurt to toggle things around a bit
This lesson applies whenever something seems to be broken: it doesn't hurt to toggle things around a bit. Sometimes a line of configuration gets added without any particular reason. That's what I faced with ProxyPreserveHost. The line was just sitting there, and when I disabled it, the problem was solved.
5. Get your code exposed to outside conditions
However perfect our code and functionality may seem in our testing environment, it's important to expose the code to outside conditions. Doing so reveals problems you didn't even know existed, and it saves frustration later when others start using the code.
6. Expect Roadblocks
Cloudflare gave me an unpleasant lesson. However far you think you have progressed on a problem, there can be roadblocks along the way that force you to switch to a temporary solution for the time being.
7. Stick with software that's stood the test of time
We always tend to jump towards new technologies. At the same time, we need to remember to put our trust in software that has been around for a long time, because most of the features we wish for have probably already been implemented in it.
8. Watch out for drawbacks, even if it's a good solution
From Flask I learned that however good a solution may be, we should be on the lookout for its drawbacks. The Flask solution seemed quick and easy, but the chance of failure is higher because we now depend on an additional server.
9. Logging levels save time
Whatever the platform, use logging and its various levels effectively. That will speed up your debugging process. Here is an article I have written on Logging Levels and How to use them effectively.
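As a small illustration using Python's standard logging module (the logger name and messages are made up for this example), the level decides which messages are emitted:

```python
import logging

log = logging.getLogger("proxy")  # hypothetical logger name
log.setLevel(logging.INFO)        # switch to logging.DEBUG while investigating
log.addHandler(logging.StreamHandler())

# Suppressed at INFO level, so verbose details stay out of normal runs.
log.debug("raw headers: %s", {"hex-url": "https://httpbin.org/ip"})
# Emitted at INFO level and above.
log.info("forwarding request to %s", "https://httpbin.org/ip")
log.warning("upstream returned status %d", 404)
```

Raising or lowering a single level setting is usually quicker than adding and removing print statements during debugging.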
10. Avoid experimenting for longer periods without understanding
We often overlook this, but it's important: we should know exactly what we are trying to do. A lot of time was lost because we did not properly understand the regex in the rewrite rules. Once we had more clarity, we were able to solve the problem quickly.
Conclusion
The journey we have shown you highlights that solving engineering problems is rarely straightforward. There will be twists, turns and roadblocks. What matters is a sound problem-solving approach that gets you to the other end.
Even though we were successful with one solution, in this case apache2, a single problem towards the end pushed us to use Flask. We will investigate further to see whether the Flask solution is viable. Here is the link to the repo where both the Flask and proxy configurations are stored.
We will be focussing more on the Widget development for now, as the Proxy and the fundamental problems have been solved. You can check on our progress through my Twitter, where I will keep you guys updated on the status of LiveAPI and the new problems we have been solving lately.
Here is our latest UI Demo for LiveAPI.