Using a proxy framework to automate API robustness testing of apps

One of the realities of the new world of CI/CD that we live in is the “bad push”: a change that was not adequately tested before DevOps pushed it to production servers via Chef or Docker. This happens because it’s just too easy to make a change in the dev environment and promote it to production. In an ideal world there would be automated tests baked into the CI pipeline to catch these issues, but when your app uses 3rd party backends, you are at the mercy of the professionalism of those teams. One way to solve this is to build your own middleware to ensure the API responses are all kosher. Another is to bake defensive programming into your app’s model layer (as in MVC) so that even if bad responses are received, your app does not barf.
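As a rough illustration of the defensive approach, here is a minimal sketch of a model-layer parser that tolerates malformed responses. The `UserProfile` model, its fields and `parse_user_profile` are hypothetical names invented for this example. It is standalone Python 3 and not tied to any particular app or backend.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserProfile:
    # Hypothetical model with two fields, used only for illustration.
    user_id: int
    name: str


def parse_user_profile(raw_body: str) -> Optional[UserProfile]:
    """Return a UserProfile, or None if the body is not usable."""
    try:
        data = json.loads(raw_body)
    except ValueError:
        # Truncated or otherwise invalid JSON.
        return None
    if not isinstance(data, dict):
        return None
    user_id = data.get("user_id")
    name = data.get("name")
    # Reject wrong types instead of letting them leak into the app.
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        return None
    if not isinstance(name, str):
        return None
    return UserProfile(user_id=user_id, name=name)
```

The caller then only has to handle one failure case (`None`) instead of every possible way a backend response can go wrong.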

There are several ways to test this: (a) build a full-scale mock server that can record and replay backend responses, or (b) use a proxy to intercept the responses and modify them in several ways:
  1. Add/remove headers
  2. Modify the content body, e.g. if a value is an int, change it to a string
  3. Truncate the content body
There are several advantages to the proxy server approach: (a) you don’t have to build and maintain a mock server from scratch, and (b) you are working with traffic from real production backends.
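The three kinds of modification listed above could be sketched as small helper functions. This is only an illustration in standalone Python 3: it assumes headers are a plain dict and the body is a JSON string, which is a simplification of what a real proxy hands you, and all function names are made up for this example.

```python
import json


def remove_header(headers, name):
    """Return a copy of headers without `name` (case-insensitive)."""
    return {k: v for k, v in headers.items() if k.lower() != name.lower()}


def add_header(headers, name, value):
    """Return a copy of headers with `name` set to `value`."""
    updated = dict(headers)
    updated[name] = value
    return updated


def stringify_ints(body):
    """Turn every top-level int value of a JSON object into a string."""
    data = json.loads(body)
    for key, val in data.items():
        # bool is a subclass of int, so leave true/false alone.
        if isinstance(val, int) and not isinstance(val, bool):
            data[key] = str(val)
    return json.dumps(data)


def truncate_body(body, length):
    """Keep only the first `length` characters of the body."""
    return body[:length]
```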

I chanced upon “mitmproxy” while looking for a tool to do just this. There are some nice things I liked about it:
  1. Easy setup – binaries available for Mac OS X and pip install for Linux (Ubuntu)
  2. Inline scripts to intercept end point requests and manipulate responses are written in Python, so no need for complicated setups using Maven or Ant etc.
  3. Two modes of operation – interactive and CLI

It comes with the standard MITM certs to decrypt SSL traffic, just like Charles Proxy (which is a great tool btw). See http://mitmproxy.org for details.

Once it is installed and both mitmproxy and mitmdump are in your $PATH, you can start digging in to the tool. The best way is to use the interactive tool “mitmproxy” first to get a feel for it. There are of course flags to change the port etc. (default is 8080). This site (see section 2.6) gives a good intro to navigating the tool – http://blog.philippheckel.com/2013/07/01/how-to-use-mitmproxy-to-read-and-modify-https-traffic-of-your-phone/

However, the real power of this tool is that you can run what they call “inline scripts”: essentially Python handlers for request, response etc.

Here’s some example code to demonstrate this (you can find it at https://github.com/foohm71/mitmproxy-stuff – it’s the dumpInfo.py script):
def dumpInfo(flow):
   dict = flow.__dict__
   print "[Flow Info]"
   print "Host:" + dict["Host"]
   print "method:" + dict["method"]
   print "protocol:" + dict["protocol"]
   print "[Request Info]"
   print "request start time:" + dict["requestStartTime"]
   print "request end time:" + dict["requestEndTime"]
   print "request body:" + dict["requestBody"]
   headers = dict["requestHeaders"]
   for k in headers.keys():
      print "request header: " + k + " = " + headers[k][0]
   print "[Response Info]"
   print "response code:",dict["responseCode"]
   print "response start time:" + dict["responseStartTime"]
   print "response end time:" + dict["responseEndTime"]
   print "response body:" + dict["responseBody"]
   headers = dict["responseHeaders"]
   for k in headers.keys():
      print "response header: " + k + " = " + headers[k][0]

def request(context, flow):
   dict = flow.__dict__
   request = flow.request
   dict["Host"] = str(request.host)
   dict["method"] = str(request.method)
   dict["protocolVersion"] = str(request.httpversion)
   dict["protocol"] = str(request.scheme)
   dict["requestStartTime"] = str(request.timestamp_start)
   dict["requestEndTime"] = str(request.timestamp_end)
   dict["requestHeaders"] = request.headers
   dict["requestBody"] = request.get_decoded_content()

def response(context, flow):
   dict = flow.__dict__
   response = flow.response
   dict["responseCode"] = response.code
   dict["responseStartTime"] = str(response.timestamp_start)
   dict["responseEndTime"] = str(response.timestamp_end)
   dict["responseHeaders"] = response.headers
   dict["responseBody"] = response.get_decoded_content()

   dumpInfo(flow)

All this does is extract information about the request/response and put it in a dict object that is passed around. Once done, it just prints out the information. To run this, use the CLI version of the tool like this: mitmdump -s <script>

As there is a framework for this, we could extend this code to apply different types of response manipulation depending on the end point (or host), protocol or body.
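One way to picture this kind of dispatch is a small rule table that maps a host and path prefix to a mutation function. The sketch below is standalone Python 3 and does not use the mitmproxy API. The hosts, paths and function names are hypothetical placeholders.

```python
def truncate_half(body):
    """Cut the body to half its length."""
    return body[: len(body) // 2]


def empty_body(body):
    """Replace the body with an empty string."""
    return ""


# Hypothetical rules: (host, path prefix, mutation to apply).
RULES = [
    ("api.example.com", "/v1/users", truncate_half),
    ("api.example.com", "/v1/orders", empty_body),
]


def pick_mutation(host, path):
    """Return the first mutation whose host and path prefix match, else None."""
    for rule_host, prefix, mutation in RULES:
        if host == rule_host and path.startswith(prefix):
            return mutation
    return None


def mutate(host, path, body):
    """Apply the matching mutation, or pass the body through unchanged."""
    mutation = pick_mutation(host, path)
    return mutation(body) if mutation else body
```

Inside a response handler, the host and path would come from the intercepted request and the returned string would be written back as the response body.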

One example of response manipulation could be to just truncate a JSON response like this:
def truncateJSONString(jsonstr, length):
   return jsonstr[:int(length)]
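To see why truncation is a useful test, here is a small standalone Python 3 sketch showing that a truncated body no longer parses as JSON. The truncation helper restates the function above so the snippet runs on its own. `parses_as_json` and the sample payload are made up for illustration.

```python
import json


def truncate_json_string(jsonstr, length):
    # Same idea as truncateJSONString above, restated so this runs on its own.
    return jsonstr[:int(length)]


def parses_as_json(text):
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


original = '{"user": {"id": 42, "name": "Ann"}}'
truncated = truncate_json_string(original, 20)
print(parses_as_json(original))
print(parses_as_json(truncated))
```

An app that assumes the body always parses will fail on the second case, which is exactly the kind of failure this testing is meant to surface.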

Another could be to recursively parse the JSON response for a key and replace its value:
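def findReplaceValue(jsonobj, key, value):
   # Walk nested dicts and lists, replacing the value of every matching key
   if isinstance(jsonobj, dict):
      for k in jsonobj:
         if k == key:
            jsonobj[k] = value
         else:
            findReplaceValue(jsonobj[k], key, value)
   elif isinstance(jsonobj, list):
      for item in jsonobj:
         findReplaceValue(item, key, value)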
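As a usage sketch, the recursive replacement can change the type of a field everywhere it appears, for example swapping an integer id for a string. This is standalone Python 3 with the function restated under a different name, and it also walks lists, which is an addition of this sketch. The sample payload is invented.

```python
import json


def find_replace_value(jsonobj, key, value):
    """Replace the value of every `key` found in nested dicts and lists."""
    if isinstance(jsonobj, dict):
        for k in jsonobj:
            if k == key:
                jsonobj[k] = value
            else:
                find_replace_value(jsonobj[k], key, value)
    elif isinstance(jsonobj, list):
        for item in jsonobj:
            find_replace_value(item, key, value)


body = '{"id": 1, "items": [{"id": 2, "qty": 3}], "meta": {"id": 4}}'
data = json.loads(body)
# Swap every integer "id" for a string to see how the app copes.
find_replace_value(data, "id", "not-a-number")
mutated_body = json.dumps(data)
```
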
Sometimes your request is a form POST. In that case, you may need to extract a form field and base the response manipulation on that field or a combination of fields:
form = request.get_form_urlencoded()
username = form["username"]
dict["Username"] = username
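
The same idea can be tried outside the proxy with the standard library. The sketch below is standalone Python 3 and uses `urllib.parse.parse_qs` as a stand-in for the proxy's own form parsing. The `test_` username rule and both function names are hypothetical.

```python
from urllib.parse import parse_qs


def get_form_field(form_body, field, default=None):
    """Return the first value of `field` in a urlencoded form body."""
    values = parse_qs(form_body).get(field)
    return values[0] if values else default


def choose_mutation(form_body):
    """Pick a mutation name based on the username field (hypothetical rule)."""
    username = get_form_field(form_body, "username", "")
    return "truncate" if username.startswith("test_") else "none"
```

Keying the manipulation on a form field like this would let a single proxy session leave ordinary users alone while mangling responses only for designated test accounts.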