spiders and s(pam)

Hmm. I was thinking about the problem of spiders lifting email addresses off websites.

Take the thing back a step.

1) get the identity of the agent.

2) allow or deny

3) leave mailto: intact, since the sitewall is working. Why bother with CAPTCHAs, images in place of the email address, JavaScript versions of mailto: and so on?

1a) if the agent is masquerading as a proper web browser, then use data and blacklists / greylists to provide a weighting.
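
A rough sketch of steps 1, 2 and 1a in Python, just to make the idea concrete. Everything in it is an assumption for illustration: the list contents, the weights, the 0.6 threshold and the function names `score_agent` and `allow` are all made up, and a real sitewall would need far better data than a User-Agent string and two sets of IP addresses.

```python
# Hypothetical data; none of these values come from a real blacklist.
BLACKLIST = {"203.0.113.7"}          # addresses that are always denied
GREYLIST = {"198.51.100.23"}         # addresses that look suspicious
DECLARED_CRAWLERS = ("googlebot", "bingbot")  # agents that say they are spiders
BROWSER_TOKENS = ("mozilla", "opera")         # agents that claim to be browsers


def score_agent(user_agent: str, ip: str) -> float:
    """Return a suspicion weighting from 0.0 (looks fine) to 1.0 (deny)."""
    if ip in BLACKLIST:
        return 1.0

    score = 0.0
    if ip in GREYLIST:
        score += 0.5

    ua = user_agent.lower()
    claims_browser = any(token in ua for token in BROWSER_TOKENS)
    declared_crawler = any(token in ua for token in DECLARED_CRAWLERS)
    if not claims_browser and not declared_crawler:
        # The agent gives no recognisable identity at all.
        score += 0.4

    return min(score, 1.0)


def allow(user_agent: str, ip: str, threshold: float = 0.6) -> bool:
    """Step 2: allow the request when the weighting is under the threshold."""
    return score_agent(user_agent, ip) < threshold
```

If `allow(...)` comes back true, the page can be served with its plain mailto: links untouched, which is the point of step 3. A User-Agent string is trivially forged, so on its own this only catches the lazy spiders; the weighting is where the blacklist / greylist data would have to do the real work.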

Also, if the agent is “good”, then get it to fill out a form giving some references, and allow it through if they check out.

Hmm. A fourth-year project, I wonder?
