2008 06 16
Cocoa Regular Expressions via JavascriptCore
Ever been stumped by the lack of regular expressions in Cocoa ? You can do quite a lot of complicated stuff in a few lines of code, but when the time comes to match a string against some pattern … where are the regexes ? On your Mac, they're living in PHP, Ruby, Perl, on webpages via WebKit, in some command line tools, but where in Cocoa ? Nowhere ! How about using WebKit ? Why not, but WebKit needs an NSWindow, a WebView, and an HTML page with some code. Someone on CocoaDev suggested JavascriptCore, so we'll use just that !
Using JavascriptCore
To use JavascriptCore and get some results out of it :
- create a context :
JSGlobalContextRef ctx = JSGlobalContextCreate(NULL);
- create a JS script string :
JSStringRef scriptJS = JSStringCreateWithCFString((CFStringRef)@"var p = new RegExp(pattern, flags); string.match(p)");
- evaluate our script and get the result :
JSValueRef result = JSEvaluateScript(ctx, scriptJS, NULL, NULL, 0, NULL);
- convert our Javascript result to a Cocoa string :
JSStringRef resultStringJS = JSValueToStringCopy(ctx, result, NULL);
CFStringRef resultString = JSStringCopyCFString(kCFAllocatorDefault, resultStringJS);
JSStringRelease(resultStringJS);
// Returns a Cocoa object by casting from a CFString
return (id)resultString;
There's more to it, as you have to convert values to and from Cocoa. WebKit bridges strings, arrays and numbers, but JavascriptCore does not, so we have to handle the conversions ourselves.
From there we do a direct mapping of Javascript's match and replace.
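The script string shown earlier refers to string, pattern and flags, so those values have to exist inside the Javascript context before the script is evaluated. Here is a minimal sketch of one way to do that, assuming the values are inlined into the script source as JSON string literals. The helper names buildMatchScript and evaluate are hypothetical, and eval stands in for JSEvaluateScript; the sample project may well do this differently.

```javascript
// Hypothetical helper: builds the script source that would be handed to
// JSEvaluateScript. Each value is inlined as a JSON string literal so that
// quotes and backslashes in the pattern survive the trip.
function buildMatchScript(string, pattern, flags) {
  return 'var string = ' + JSON.stringify(string) + ';' +
         'var pattern = ' + JSON.stringify(pattern) + ';' +
         'var flags = ' + JSON.stringify(flags) + ';' +
         'var p = new RegExp(pattern, flags); string.match(p)';
}

// Stand-in for JSEvaluateScript: an indirect eval runs the source in the
// global scope and returns the value of the last expression.
function evaluate(source) {
  return (0, eval)(source);
}

const matchResult = evaluate(
  buildMatchScript('That is a sample STRING', 'I[\\w]+', ''));

// String(...) plays the role of JSValueToStringCopy here.
console.log(String(matchResult)); // ING
```

Inlining through JSON.stringify avoids hand-escaping the pattern, which matters because regex patterns are full of backslashes.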
Augmenting NSString
We'll define three methods :
- match matches the first result found. Returns a string or nil. Equivalent to Javascript's string.match(/pattern/)
- matchAll returns all matches. Returns a string array or nil. Equivalent to Javascript's string.match(/pattern/g)
- replace replaces every match with the replacement string. Returns a string or nil. Equivalent to Javascript's string.replace(/pattern/g, replacement)
Given a sample string That is a sample STRING, a replacement of *** and a pattern of I[\w]+ (match capital I followed by one or more of [a-zA-Z0-9_]) :
// matches ING
- (id)matchWithPattern:(NSString*)pattern;
// matches ['ING']
- (id)matchAllWithPattern:(NSString*)pattern;
// returns That is a sample STR***
- (id)replaceWithString:(NSString*)replacement andPattern:(NSString*)pattern;
And case (in)sensitive versions :
// matches is
- (id)matchWithPattern:(NSString*)pattern isCaseSensitive:(BOOL)c;
// matches ['is', 'ING']
- (id)matchAllWithPattern:(NSString*)pattern isCaseSensitive:(BOOL)c;
// returns That *** a sample STR***
- (id)replaceWithString:(NSString*)replacement andPattern:(NSString*)pattern isCaseSensitive:(BOOL)c;
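To check the expected results, the Javascript side of these methods can be sketched on its own. The functions below are hypothetical stand-ins written for illustration, not the code from the sample project. They assume that replace uses the global flag, which is what the case-insensitive output above (both is and ING replaced) suggests.

```javascript
// Hypothetical Javascript equivalents of the NSString methods.
// caseSensitive defaults to true, like the short method variants.
function flagsFor(caseSensitive, global) {
  return (global ? 'g' : '') + (caseSensitive ? '' : 'i');
}

// First match as a string, or null when nothing matches.
function match(string, pattern, caseSensitive = true) {
  const m = string.match(new RegExp(pattern, flagsFor(caseSensitive, false)));
  return m ? m[0] : null;
}

// All matches as an array of strings, or null when nothing matches.
function matchAll(string, pattern, caseSensitive = true) {
  return string.match(new RegExp(pattern, flagsFor(caseSensitive, true)));
}

// Replaces every match (assumption: global flag, see above).
function replace(string, replacement, pattern, caseSensitive = true) {
  return string.replace(new RegExp(pattern, flagsFor(caseSensitive, true)), replacement);
}

const sample = 'That is a sample STRING';
const pattern = 'I[\\w]+';

console.log(match(sample, pattern));                 // ING
console.log(matchAll(sample, pattern));              // [ 'ING' ]
console.log(replace(sample, '***', pattern));        // That is a sample STR***
console.log(match(sample, pattern, false));          // is
console.log(matchAll(sample, pattern, false));       // [ 'is', 'ING' ]
console.log(replace(sample, '***', pattern, false)); // That *** a sample STR***
```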
Sample code
RegEx via JavascriptCore.zip
NOTE this uses JavascriptCore for regular expressions, and a WebView to display the results. That means there are actually two Javascript engines running :) one for matching, one for displaying. The displaying is of course completely optional: if you want to use regexes in your project, copy RegExJS.m and RegExJS.h, then add JavascriptCore.framework.
I used this approach in a recent release of my app, and I have to say — whilst it's quite clever, it blows out memory usage significantly due to the JS JIT compiler and other things being hosted in your app's memory space. If you don't have concerns about memory, this is a nice, simple approach — otherwise, I think you're better off finding something built on top of the simple POSIX regex tools.