Tuesday, March 11, 2008

A little FFI goes a long way

As my two faithful readers have surely noted, one of my little projects is a Leopard-specific GUI for R that tries to leverage as many of the new features as possible. As it happens, one of those features is Garbage Collection, which I can't use because of the way the R framework is built. Perhaps I'll be able to use Simon's method for building self-contained R applications with the framework built in the appropriate way.

What I'd really like to talk about though is the fact that having PyObjC and whatever the Ruby bridge is called means that I can be positive that a copy of libffi is available on every Leopard system. This let me finally build something I've wanted for a long time: an elegant C<->Cocoa thunk for the R function pointers. To wit:

ptr_R_WriteConsoleEx = ffi_bind(self,@selector(writeConsole:length:type:),&ffi_type_void,
    3,&ffi_type_pointer,&ffi_type_sint,&ffi_type_sint);


So, how do we do it? Well, first we need a little bit of context to keep a binding between a function pointer and a specific object/method combination. We also keep the types of arguments around as well as the return type. If we were doing a complete libffi bridge we would introspect this from the class at runtime rather than the static method I use here. In the case of R this would be overkill since the R function pointers are very much fixed entities. In any case, here's the little structure I use:

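typedef struct {
    id obj;            // target object
    SEL sel;           // target selector
    IMP fn;            // method implementation, cached at bind time
    int nargs;         // number of explicit arguments
    ffi_type *retval;  // return type
    ffi_cif *cif;      // call interface for the Obj-C method
    ffi_type *types[]; // argument types, nargs+2 entries
} ffi_call_struct;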
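
The trailing array in that structure means the whole context lives in a single allocation, sized at bind time. Here is a minimal sketch of that allocation pattern in plain C. The names `arg_type`, `call_context` and `call_context_new` are hypothetical stand-ins so that it compiles without libffi or the Obj-C runtime; they are not part of the code in this post.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ffi_type so the sketch compiles without libffi. */
typedef struct { const char *name; } arg_type;

typedef struct {
    void *target;       /* stand-in for the obj/sel/fn fields */
    int nargs;          /* number of explicit arguments */
    arg_type *types[];  /* flexible array member: nargs + 2 slots */
} call_context;

/* Allocate the header and the type slots in one block.
   Returns NULL on a negative count or allocation failure. */
call_context *call_context_new(void *target, int nargs) {
    if (nargs < 0) return NULL;
    size_t slots = (size_t)nargs + 2;  /* two implicit Obj-C arguments */
    call_context *ctx = malloc(sizeof *ctx + slots * sizeof ctx->types[0]);
    if (ctx == NULL) return NULL;
    ctx->target = target;
    ctx->nargs = nargs;
    for (size_t i = 0; i < slots; i++) ctx->types[i] = NULL;
    return ctx;
}
```

A single `free(ctx)` releases the header and the slots together, which is the main attraction of laying the context out this way.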


Next, we need to write a little function to use as the closure handler, which forwards the C function call to the Obj-C method call. All libffi closure handlers have the same function declaration:

void handler(ffi_cif *cif,void *retval,void *args[],void *user_data) {

First we build a place to hold our arguments:

    int i;
    ffi_call_struct *call = (ffi_call_struct*)user_data;
    void **values = (void**)malloc(sizeof(void*)*(call->nargs+2));

and simply transfer the arguments across, making sure that the first two are the target object and target selector (Obj-C methods have two implicit arguments). Note that libffi wants pointers to the argument values rather than the values themselves, hence the address-of operators:

    for(i=0;i<call->nargs;i++) { values[2+i] = args[i]; }
    values[0] = &(call->obj);
    values[1] = &(call->sel);

Then all we do is use ffi_call to make a call. Note that we use the NSAutoreleasePool method, which wouldn't be necessary if we could enable garbage collection:

    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    ffi_call(call->cif,(void (*)(void))call->fn,retval,values);
    [pool release];
    free(values);
}
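
To make the argument shuffling in the handler easier to see in isolation, here is a small sketch in plain C with no libffi or Obj-C involved. The function name `prepend_implicit_args` and its parameters are made up for illustration. It builds the same nargs+2 array of pointers to values, with the receiver and selector slots first.

```c
#include <stddef.h>
#include <stdlib.h>

/* Build the value array for the forwarded call.
   receiver_slot and selector_slot are the addresses of the variables
   holding the receiver and selector, since the array stores pointers
   to argument values, not the values themselves.
   The caller owns the returned array and must free() it. */
void **prepend_implicit_args(void *receiver_slot, void *selector_slot,
                             void *args[], int nargs) {
    if (nargs < 0) return NULL;
    void **values = malloc(sizeof *values * ((size_t)nargs + 2));
    if (values == NULL) return NULL;
    values[0] = receiver_slot;
    values[1] = selector_slot;
    /* The explicit arguments keep their order, shifted by two. */
    for (int i = 0; i < nargs; i++) values[2 + i] = args[i];
    return values;
}
```

The incoming `args` entries are already pointers to the caller's values, so they can be passed through untouched. Only the two implicit slots need new addresses.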


Next we need a little function to build an appropriate closure function to use as the R function pointer:

void *ffi_bind(id obj,SEL selector,ffi_type *retval,int nargs,...) {
    int i;
    ffi_cif *cif;
    ffi_closure *closure;
    va_list ap;


We use varargs so that we can define the argument list inline. First, we allocate our context structure and transfer the argument types:

    //Allocate the structure and copy the call information into the structure.
    ffi_call_struct *call = (ffi_call_struct*)malloc(sizeof(ffi_call_struct)+(nargs+2)*sizeof(ffi_type*));
    call->obj = obj;
    call->sel = selector;
    call->fn = [obj methodForSelector:selector];
    call->retval = retval;
    call->nargs = nargs;

    call->cif = (ffi_cif*)malloc(sizeof(ffi_cif));
    cif = (ffi_cif*)malloc(sizeof(ffi_cif));

    call->types[0] = &ffi_type_pointer; //The implicit target object
    call->types[1] = &ffi_type_pointer; //The implicit selector
    va_start(ap,nargs);
    for(i=0;i<nargs;i++) call->types[2+i] = va_arg(ap,ffi_type*);
    va_end(ap);
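
The varargs pattern is worth a quick sketch on its own, because the count has to match the number of variadic arguments exactly: `va_arg` has no way to detect that it has run off the end of the list. The following is plain C with strings standing in for the `ffi_type*` values. The function name `collect_types` is hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>

/* Collect nargs variadic type names behind two fixed leading slots.
   Every variadic argument must be a const char *, and there must be
   exactly nargs of them; anything else is undefined behaviour.
   The caller owns the returned array and must free() it. */
const char **collect_types(int nargs, ...) {
    if (nargs < 0) return NULL;
    const char **types = malloc(sizeof *types * ((size_t)nargs + 2));
    if (types == NULL) return NULL;
    types[0] = "pointer";  /* implicit receiver */
    types[1] = "pointer";  /* implicit selector */
    va_list ap;
    va_start(ap, nargs);
    for (int i = 0; i < nargs; i++) types[2 + i] = va_arg(ap, const char *);
    va_end(ap);            /* every va_start needs a matching va_end */
    return types;
}
```

For a three-argument method the call would look like `collect_types(3, "pointer", "sint", "sint")`, giving five slots in total.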


Next we define the two ffi_cifs. The first one is the ffi_cif of the Obj-C method call and the second is the one for the function pointer we're trying to create. Finally, we create the closure and return it:

    ffi_prep_cif(call->cif,FFI_DEFAULT_ABI,nargs+2,call->retval,call->types); //The CIF for the ObjC Method
    ffi_prep_cif(cif,FFI_DEFAULT_ABI,nargs,call->retval,&(call->types[2])); //The CIF for the function call
    closure = (ffi_closure*)malloc(sizeof(ffi_closure));
    ffi_prep_closure(closure,cif,handler,call);
    return closure;
}


And that's pretty much all there is to it. There are two minor caveats. The first is that anything that employs method swizzling on this class is likely to fail, since we cache the function pointer at bind time. Fortunately, swizzling is relatively rare these days, so this generally doesn't come up. The second is that this may technically be slower than the hand-coded method, but there are a couple of areas where we could probably speed up the thunking operation. The allocation of values in particular could be moved to the context object.
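
As a rough sketch of that last idea, the scratch array can be allocated once at bind time and the two implicit slots filled in up front, so each call only copies the explicit argument pointers. This is plain C with hypothetical names (`cached_context`, `cached_context_new`, `cached_context_fill`). It assumes the callback is never re-entered or called from two threads at once, because every call shares the same buffer.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    void *receiver;  /* stand-in for the Obj-C target object */
    void *selector;  /* stand-in for the selector */
    int nargs;       /* number of explicit arguments */
    void **values;   /* scratch array of nargs + 2 slots, allocated once */
} cached_context;

/* Allocate the context and its scratch array at bind time. */
cached_context *cached_context_new(void *receiver, void *selector, int nargs) {
    if (nargs < 0) return NULL;
    cached_context *ctx = malloc(sizeof *ctx);
    if (ctx == NULL) return NULL;
    ctx->values = malloc(sizeof *ctx->values * ((size_t)nargs + 2));
    if (ctx->values == NULL) { free(ctx); return NULL; }
    ctx->receiver = receiver;
    ctx->selector = selector;
    ctx->nargs = nargs;
    /* The implicit slots never change, so set them once. */
    ctx->values[0] = &ctx->receiver;
    ctx->values[1] = &ctx->selector;
    return ctx;
}

/* Per-call work: copy the explicit argument pointers, no malloc/free. */
void **cached_context_fill(cached_context *ctx, void *args[]) {
    for (int i = 0; i < ctx->nargs; i++) ctx->values[2 + i] = args[i];
    return ctx->values;
}
```

Whether this is a measurable win depends on how hot the callback is. For something like console output the malloc per call is unlikely to matter much.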
