Sunday, January 23, 2011

An MD5 Hashing Function for Varnish

I was toying with Varnish, a very cool open source web accelerator that seems to be getting a lot of attention recently. Since we have done some fairly complex caching setups at my current employer using a well-known CDN, I figured I would dig in and see how capable Varnish really was.

For starters, the documentation is unfortunately a bit slim. There is virtually nothing on Google and no real advanced or complex examples anywhere. So when you need to do some serious tinkering and you hit an error, you are left with trial and error.

The first thing I noticed was that there are no builtin hash functions you can call, MD5 in particular. I needed to take three parameters from the query string (something you have to do with a regex), concatenate them, and compute the MD5 hash of the result. Since Varnish didn't supply this functionality, I figured I was pretty much out of luck. However, a little more digging turned up a couple of VERY interesting things. Firstly, the DSL that Varnish uses (VCL) is compiled into C to be super fast, which means you can put inline C directly into the config file. That's right, you can wrap your C code in C{ ... }C right in the config file and do almost anything in there. Now we are getting somewhere. But was I really going to find an implementation of md5.c and stick that in the middle of a config file? That seemed like serious overkill...
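As a rough illustration of the "three parameters, concatenated" step, here is a minimal C sketch that pulls values out of a query string without a regex. It is not the code used in this post: the helper names (get_query_param, build_hash_input) and the parameter names (user, item, ts) are made up for the example, and it does no URL-decoding.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the value of query parameter `name` from `url`
 * into `out` (truncating if needed). Returns 1 if found, 0 otherwise. */
static int get_query_param(const char *url, const char *name,
                           char *out, size_t outlen)
{
    size_t namelen = strlen(name);
    const char *p = strchr(url, '?');

    if (p == NULL || outlen == 0)
        return 0;
    p++;                                    /* skip the '?' */

    while (*p != '\0') {
        size_t pairlen = strcspn(p, "&");   /* length of this key=value pair */

        if (pairlen > namelen &&
            strncmp(p, name, namelen) == 0 && p[namelen] == '=') {
            size_t vallen = pairlen - namelen - 1;
            if (vallen >= outlen)
                vallen = outlen - 1;        /* truncate to fit */
            memcpy(out, p + namelen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += pairlen;
        if (*p == '&')
            p++;                            /* move on to the next pair */
    }
    return 0;
}

/* Hypothetical helper: concatenate the three (assumed) parameters into `out`.
 * Returns 1 on success, 0 if one is missing or `out` is too small. */
static int build_hash_input(const char *url, char *out, size_t outlen)
{
    static const char *names[] = { "user", "item", "ts" };
    char value[128];
    size_t used = 0, i;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    for (i = 0; i < 3; i++) {
        size_t len;

        if (!get_query_param(url, names[i], value, sizeof value))
            return 0;
        len = strlen(value);
        if (used + len >= outlen)
            return 0;
        memcpy(out + used, value, len + 1); /* copies the trailing NUL too */
        used += len;
    }
    return 1;
}
```

Calling build_hash_input() on a URL such as "/search?user=al&item=42&ts=7" fills the buffer with the three values back to back, and that string is what would then be passed to an MD5 function.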

Then I discovered my second secret feature, load_module. There is an example buried in the wiki on how to compile the MaxMind GeoIP library into VCL as a module and call its function to do an IP-to-country lookup. Ah, now we are talking. So what did I need to do? What anyone would do in this situation: grab an implementation of MD5 in C and write my own library, of course! So that is what I did. I downloaded the MD5 implementation written by L. Peter Deutsch (http://sourceforge.net/projects/libmd5-rfc/files/) and then wrote my own library around it. This involved a couple of steps...

First I had to write a small MD5 library that I could expose to Varnish, built around a function I conveniently named md5_hash. It is basically a C source file with the following contents:


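#include <stdio.h>
#include <string.h>
#include "md5.h"   /* md5_state_t, md5_byte_t, md5_init/append/finish */

/* Return the MD5 of a NUL-terminated string as 32 lowercase hex characters.
 * The result points into a static buffer. */
char *md5_hash(char *md5_string)
{
    md5_state_t state;
    md5_byte_t digest[16];
    static char hex_output[16*2 + 1];
    int di;

    md5_init(&state);
    md5_append(&state, (const md5_byte_t *)md5_string, (int)strlen(md5_string));
    md5_finish(&state, digest);

    /* Two hex characters per digest byte. */
    for (di = 0; di < 16; ++di)
        sprintf(hex_output + di * 2, "%02x", digest[di]);

    return hex_output;
}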
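One caveat with the function above: hex_output is a static buffer, so every call overwrites the previous result, and Varnish runs many worker threads. A possible variation, sketched here and not part of the library as published, is to have the caller pass in the output buffer. The helper name digest_to_hex is hypothetical, the sketch assumes the digest bytes are unsigned char (as md5_byte_t is in the libmd5-rfc header), and it only covers the hex-encoding step.

```c
#include <stdio.h>
#include <stddef.h>

#define MD5_DIGEST_LEN 16
#define MD5_HEX_LEN    (MD5_DIGEST_LEN * 2 + 1)   /* 32 hex chars + NUL */

/* Hypothetical helper: hex-encode a 16-byte digest into a caller-supplied
 * buffer. Returns `out`, or NULL if the buffer is missing or too small. */
static char *digest_to_hex(const unsigned char digest[MD5_DIGEST_LEN],
                           char *out, size_t outlen)
{
    int i;

    if (out == NULL || outlen < MD5_HEX_LEN)
        return NULL;

    /* Each byte becomes two lowercase hex characters; snprintf also writes
     * a NUL after them, which the next iteration overwrites. */
    for (i = 0; i < MD5_DIGEST_LEN; ++i)
        snprintf(out + i * 2, 3, "%02x", (unsigned int)digest[i]);

    return out;
}
```

Because each caller owns its buffer, two threads hashing at the same time no longer write into the same memory. The cost is that the function exposed to Varnish would need an extra buffer argument.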


Then I built a Makefile that turned this into my very own libmd5varnish.so to be used inside of Varnish. Now we need to load this into Varnish and make the md5_hash() function available. To do this you need to use inline C and place the following in your VCL file:


C{
#include <dlfcn.h>
#include <stdlib.h>
#include <stdio.h>

static const char* (*md5_hash)(char* str) = NULL;

__attribute__((constructor)) void
load_module()
{
    const char* symbol_name = "md5_hash";
    const char* plugin_name = "/etc/varnish/modules/md5/libmd5varnish.so";
    void* handle = NULL;

    handle = dlopen( plugin_name, RTLD_NOW );
    if (handle != NULL) {
        md5_hash = dlsym( handle, symbol_name );
        if (md5_hash == NULL)
            fprintf( stderr, "\nError: Could not load MD5 plugin:\n%s\n\n", dlerror() );
        else
            printf( "MD5 plugin loaded successfully.\n");
    }
    else
        fprintf( stderr, "\nError: Could not load MD5 plugin:\n%s\n\n", dlerror() );
}
}C


Excellent. Now I have a function called "md5_hash" available to me via inline C. So how do I call it? Say you want to set a header that is the MD5 sum of the string "random blog post" (it could just as easily be a header you extract via VRT_GetHdr). You stick this anywhere you need it in your config:


C{
    /* md5_hash stays NULL if the plugin failed to load, so guard the call. */
    if (md5_hash != NULL)
        VRT_SetHdr(sp, HDR_REQ, "\006X-MD5:", (*md5_hash)("random blog post"), vrt_magic_string_end);
}C



That's pretty much it. I did some basic testing and it works like a charm. To save someone else the hassle, I open-sourced the whole library I wrote and put it on GitHub here: https://github.com/denen99/libmd5varnish. Feel free to fork or improve it.

Hope this helps someone. I know I searched for hours with nothing in sight on how to solve this. There are a couple of posts on the mailing list that claim this will be natively part of version 3.0, but I didn't want to wait :-).

Good luck.
