Clean up data that was copy/pasted

I have some custom user pages where users fill out forms, and posts are created from the submissions using the wp_insert_post() function. I assumed that function did all the clean-up for me, but when users copy/paste text, odd characters sometimes end up on the front-end site, including those annoying triangles with question marks inside them. Does anyone know of a good resource for best practices in this area?

Or maybe it's a character encoding issue. The site's meta tag declares UTF-8, and most of the WordPress tables are UTF-8 as well...
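
If it is an encoding problem, one thing that could be tried is checking that each submitted string is valid UTF-8 before it reaches wp_insert_post(). The following is only a minimal sketch: my_normalize_to_utf8() is a hypothetical helper, not a WordPress function, and it assumes that any invalid input was pasted from a Windows-1252 source such as Word, which may not hold for every submission.

```php
<?php
// Hypothetical helper (not part of WordPress): make sure a submitted
// string is valid UTF-8 before it is handed to wp_insert_post().
function my_normalize_to_utf8( $text ) {
    // Already valid UTF-8: return it unchanged.
    if ( mb_check_encoding( $text, 'UTF-8' ) ) {
        return $text;
    }

    // Assumption: invalid bytes come from Windows-1252 text
    // (for example curly quotes pasted from a word processor).
    return mb_convert_encoding( $text, 'UTF-8', 'Windows-1252' );
}
```

The helper would be applied to fields such as post_title and post_content before building the array passed to wp_insert_post(). It relies on PHP's mbstring extension being available.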