Reliability comparison between Asirra, reCAPTCHA and Defined Questions

We are testing Anti-Splog with our beta launch. Since no one likes reCAPTCHA, and 12 cat/dog images seemed a few too many with Asirra, we decided to start with Defined Questions. Do you have any reliability statistics comparing the three methods? We are finding that a few valid sign-ups are not passing, yet some splogs are still getting through. How difficult do the questions need to be to effectively mitigate splogs? Is there any way for us to correlate which questions may be failing?