InformationWeek Article: “YouTube Wrestles With Scammer-Generated Content”
InformationWeek reports that YouTube is “struggling” with posted videos that show things such as stolen credit cards and PINs. The article goes on to describe how difficult it is to screen video content.
A single line mentions that meta-content can be used for screening (searching for keywords that can identify the content), but a YouTube spokesman goes on to say that they rely “on our community to know our community guidelines and flag content that violates the guidelines.”
First of all, the type of community that will be looking for that niche content isn’t going to be all that quick to flag it.
Secondly, how hard would it be to build a signature base of metadata keywords and behavioral signals that screens out the largest portion of objectionable (illegal) content? Here are a few ideas to think about as you read the article (feel free to post your own):
- SpamAssassin for content, anyone? Use the metadata to help weight the red flag.
- Watch the topics that users post to or visit, and use them to weight a flag. For instance, a little old lady who is concerned about “poodles” and “identity theft” will not affect the weight as much as someone looking for “Free credit card numbers” and “MS Windows licenses”.
- Use Natural Language Processing techniques to identify and weight actual posts (remember the “StupidFilter”?).
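To make the first two ideas a bit more concrete, here is a minimal sketch of SpamAssassin-style additive scoring in Python. Everything in it is an assumption made up for illustration: the rule tables, the weights, the threshold, and the function names (`rule_score`, `score_video`, `should_flag`) are hypothetical and have nothing to do with how YouTube or SpamAssassin actually work. It just shows that metadata hits and a poster's topic history can be summed into one flag weight.

```python
import re

# Hypothetical rule weights: every term and number below is illustrative only.
METADATA_RULES = {
    "free credit card numbers": 4.0,
    "pin": 1.5,
    "cvv": 2.5,
    "identity theft": 0.5,
}

# Hypothetical weights for topics a user posts to or visits.
INTEREST_RULES = {
    "free credit card numbers": 2.0,
    "ms windows licenses": 1.5,
    "identity theft": 0.2,
    "poodles": 0.0,
}

FLAG_THRESHOLD = 5.0  # assumed cut-off; a real system would tune this


def rule_score(text, rules):
    """Sum the weights of every rule term that appears as whole words in text."""
    text = text.lower()
    total = 0.0
    for term, weight in rules.items():
        # Word boundaries keep "pin" from matching "spinning".
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            total += weight
    return total


def score_video(title, description, tags, user_interests):
    """Combine the metadata score with a behavioral score from user topics."""
    metadata = " ".join([title, description, *tags])
    metadata_score = rule_score(metadata, METADATA_RULES)
    behavior_score = sum(rule_score(t, INTEREST_RULES) for t in user_interests)
    return metadata_score + behavior_score


def should_flag(score):
    """Queue the video for review once the combined weight crosses the threshold."""
    return score >= FLAG_THRESHOLD


scammer = score_video(
    "Free credit card numbers with PIN",
    "working cvv",
    ["cards"],
    ["Free credit card numbers", "MS Windows licenses"],
)
little_old_lady = score_video(
    "Protecting yourself from identity theft",
    "tips for seniors",
    ["poodles"],
    ["poodles", "identity theft"],
)
print(scammer, should_flag(scammer))
print(little_old_lady, should_flag(little_old_lady))
```

The point of the additive design is that no single keyword condemns a video: the same phrase “identity theft” barely moves the score for the poodle fan, but pushes a video well past the threshold when it sits alongside the other signals.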
I realize full well that these techniques can be gamed just like anything else, but it seems to me that they are viable, not so hard to implement (I use components of them in my work, although the scale is different!), and a darn sight better than relying on the crooks to report themselves!