Launching and maintaining an increasingly popular forum requires constant attention from the developer.
Sure, the initial forum launch went quite smoothly. And why wouldn’t it? No one outside your immediate circle knows it even exists. At this stage, it’s enough to assume the “happy path”. Some basic form validation will do the trick: “You must provide a title and description to create a new thread.”
People are generally nice. Of course they won’t take advantage of your new platform. Right?
Step One
In the first few weeks, your forum grows steadily. Conversations start up in each thread. How exciting! Remember that scene in Kindergarten Cop when Arnold Schwarzenegger learns that his unorthodox teaching techniques are starting to work with the kindergartners? “Yeah – it’s working. It’s working!” he says. …Of course you don’t remember that scene. But I do, and watching your budding platform grow feels exactly like that.
However, one day, you wake up, check in to the forum, and notice your first piece of spam. The thread reads:
Greeting. I really enjoy the teachings on this site. Click here for HD Megavideo Streaming.
Damn. They found us. I don’t know how, but they found us.
Keyword Censoring
You delete it. No big deal. Maybe some basic keyword validation will do the trick. In Laravel, we can easily create a custom validation rule:
php artisan make:rule NoInvalidKeywords
A few lines of code later, and we now have a list of invalid keywords – beginning with “MegaVideo.”
request()->validate([
    'body' => ['required', new NoInvalidKeywords],
]);
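The body of the rule isn’t shown here, so the following is only a minimal sketch of what the check might look like, written as a plain PHP function so it runs without the framework. The function name and the keyword list are assumptions for illustration; in a real rule class, this logic would sit inside passes($attribute, $value).

```php
<?php

// Hypothetical banned-word list; the article only names "MegaVideo".
const INVALID_KEYWORDS = ['megavideo'];

/**
 * Returns true when the text contains none of the banned keywords.
 * (Illustrative helper, not part of Laravel.)
 */
function passesKeywordCheck(string $value, array $keywords = INVALID_KEYWORDS): bool
{
    foreach ($keywords as $keyword) {
        // stripos() is case-insensitive, so "MEGAVIDEO" and "Megavideo" both match.
        if (stripos($value, $keyword) !== false) {
            return false;
        }
    }

    return true;
}
```

Called with the spam message above, passesKeywordCheck() returns false, and the validation fails.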
Problem solved! …Right?
Step Two
Spammers are like dinosaurs: life finds a way. Your original keyword checker didn’t account for clever formatting, such as “MEGA%%VID%EO DOWNLOAD H%D LINK.” So once again, you wake up to a new batch of spam. Back to the drawing board.
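One way to catch this kind of obfuscation is to normalize the text before matching. The sketch below is an assumption about how that might be done, not the approach the forum actually took: it lowercases the input and strips everything except ASCII letters and digits, so the filler characters disappear. The function names and keyword list are hypothetical.

```php
<?php

/**
 * Lowercases the text and removes every character that is not an ASCII
 * letter or digit, so "MEGA%%VID%EO" collapses to "megavideo".
 */
function normalizeForKeywordMatch(string $value): string
{
    return (string) preg_replace('/[^a-z0-9]/', '', strtolower($value));
}

/**
 * Returns true when the normalized text contains none of the banned keywords.
 * The default keyword list is a placeholder for illustration.
 */
function passesNormalizedKeywordCheck(string $value, array $keywords = ['megavideo']): bool
{
    $normalized = normalizeForKeywordMatch($value);

    foreach ($keywords as $keyword) {
        if (strpos($normalized, $keyword) !== false) {
            return false;
        }
    }

    return true;
}
```

The trade-off is that removing spaces can create false positives across word boundaries (“omega video” would also match), so a stricter or looser normalization may suit a real forum better.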
Email Confirmation
Perhaps the next step is not validation, but email confirmation. You cannot participate in my forums until you have confirmed your email address. That should take care of all those adsfasfaasfasfadsf@domain.com signups you’re suddenly seeing.
Laravel, again to the rescue, provides email confirmation via the MustVerifyEmail contract. Implement it on your User model and each new user will automatically receive a confirmation email after signing up. You can then protect your routes by applying the verified middleware.
Route::post('threads', function () {
    //
})->middleware('verified');
With this change, we are now effectively saying, “You cannot create a forum thread until you have verified your email address.”
Problem solved! …Right?
Step Three
Email confirmation helps a whole lot, but maybe not to the extent that you initially expected. Like the bad guy at the end of the movie who just won’t die, the spam continues.
You soon start encountering spam written in foreign languages. What’s the motivation? Who knows, but it comes in waves.
Set aside whatever projects you had planned for the morning. We need to fix this.
Language Detection
It’s time to spend the next hour researching how to validate against particular languages. What’s interesting is that the search results return a link to your own forum. You find a thread that recommends the following regular expression.
<?php

namespace App\Rules;

use Illuminate\Contracts\Validation\Rule;

class EnglishOnly implements Rule
{
    public function passes($attribute, $value)
    {
        // Fail as soon as the text contains any Han (Chinese) character.
        return ! preg_match("/\p{Han}+/u", $value);
    }

    public function message()
    {
        return 'Please use English for the :attribute.';
    }
}
You take a moment to smile. Wow, the tool I built to help others just helped me improve the tool, itself. That’s very cool.
Anyhow, focus Daniel-san. Back to work.
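This validation rule isn’t quite right: it only rejects text containing Han (Chinese) characters, so spam in other scripts would still slip through. But it’s a decent-enough first step to quickly push out a preventative fix.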
request()->validate([
    'body' => [
        'required',
        new NoInvalidKeywords,
        new EnglishOnly,
    ],
]);
Problem solved! …Right?
Step Four
Don’t rest on your laurels for too long. The next problem stems not from spammers or bots, but from people actively participating in your forum! It looks like someone has realized that you aren’t throttling new threads or replies. A wicked, Grinch-like grin begins to spread across his face.
What if I write a script that replies to this thread every five seconds with gibberish?
Now you’re surely thinking to yourself, “But why? What’s the point?” Like many things in life, the answer is simple: “To see if they can.” Rule number one: assume the worst. People are not naturally nice.
Throttling
Less chipper than your former self, you return to your codebase in search of solutions. Perhaps I should limit the frequency at which you can create new threads or replies. The same user responding multiple times in a minute is definitely a sign of malicious intent. …That, or Red Bull.
Hooray for Laravel’s custom validation rules. The gift that keeps on giving.
<?php

namespace App\Rules;

use App\User;
use Illuminate\Contracts\Validation\Rule;

class PostThrottling implements Rule
{
    public function __construct(protected User $user)
    {
    }

    public function passes($attribute, $value)
    {
        // Pass only if the user's latest thread is more than two minutes old.
        // A user with no threads yet yields null here, so default to true.
        return $this->user->latestThread?->created_at->lt(
            now()->subMinutes(2)
        ) ?? true;
    }

    public function message()
    {
        return 'You are posting too frequently.';
    }
}
This quick rule determines if the given user created a thread within the last two minutes. If so, that’s gonna be a no from us, dog. Let’s add this new rule to our primary validation logic.
request()->validate([
    'body' => [
        'required',
        new PostThrottling($user),
        new NoInvalidKeywords,
        new EnglishOnly,
    ],
]);
At this point, you’re somewhat bitter that you’ve spent the morning protecting your forum from the very users it was meant to serve. But at least it’s done. Let’s push it to production and get back to work.
Problem solved! …Right?
Step Five
A week goes by without spam. Cue the celebratory music from the end of Independence Day. You won.
Get on the wire and inform squadrons around the world. Tell them how to bring those sons of bitches down!
…If only.
Like many Hollywood movies these days, there’s always a sequel. The spammers haven’t gone away. They’re only regrouping, waiting for the right time to attack. They’ve decided that your upcoming family vacation is the right time. “We strike at dawn”, I imagine them saying as they laugh amongst themselves.
“Now boarding zone 3” says the flight attendant. You get up, collect your bags, and make your way to the boarding entrance. You feel a slight vibration in your left pocket. A new email. “I really need to disable these damn notifications”, you think. You remind yourself that you’re on vacation and work emails are not your concern. And yet, it’s the not knowing that gets us all. What if it’s important? What if the site is down? Come on, it’s only one pull of the slot machine. And then you’ll turn it off. Of course you will.
You set your bag down, reach into your pocket, and discreetly check your phone (to avoid your spouse’s judgmental look). The front page of your forum is covered in spam. We are under attack! How did they get through our defenses?
“Sorry dear. I have to deal with this”, you say sheepishly to your spouse. With urgency in your step, you make your way to your assigned seat, pull out your laptop, and quickly boot up your code editor. The flight doesn’t take off for another 25 minutes. There’s still time! I can fix it.
Honeypots
Maybe we need to be smarter about this. Is there any way to trick these bots, to distinguish their submissions from those of our actual users?
The answer, as it turns out, is yes! Honeypots to the rescue. Fortunately, it’s a simple-enough technique that we should be able to apply it before the plane takes off… if we stay focused.
A honeypot is a technique that involves adding invisible inputs to your form. Real users never see these fields, so they leave them blank. A bot, however, will scan your form, encounter these fields, and fill them out to the best of its ability.
Now, all we need to do is check whether these special fields arrive filled in. If so, ding ding ding: we have a spammer on our hands. No doubt. Abort, abort!
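As a rough sketch of that check, assuming a hidden input named website (a hypothetical field name chosen for illustration), the server-side test can be as small as this. The function is plain PHP rather than a Laravel rule, so it runs on its own.

```php
<?php

// Hypothetical honeypot field name. The form would carry a hidden input
// along the lines of:
//   <input type="text" name="website" style="display:none" tabindex="-1" autocomplete="off">
const HONEYPOT_FIELD = 'website';

/**
 * Returns true when the hidden honeypot field came back with a value,
 * which suggests the form was filled in by a bot.
 */
function honeypotTriggered(array $input, string $field = HONEYPOT_FIELD): bool
{
    $value = $input[$field] ?? '';

    // Whitespace-only strings count as empty; any other non-empty value trips the trap.
    return is_string($value) ? trim($value) !== '' : ! empty($value);
}
```

A request whose honeypot field is missing or blank is treated as human; anything else can be rejected before the usual validation even runs.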
It’s such a simple solution, yet it’s amazingly effective. I wish we had implemented this at the launch of the forum. Sadly, the obvious solution is rarely the first we consider.
Problem solved! …Right?
Step Six
Congratulations. You kicked ass with time to spare. With five minutes left before takeoff, you’ve won your vacation back. Don’t forget to disable those notifications… just like you promised.
Fast-forward a few months, and spam is a fraction of what it was before. And yet, even then, a fraction is not nothing. Despite your best efforts, you haven’t really won the battle. Like a near-defeated monster, it still crawls towards you, hoping for one last grab.
Looks like we need a reCAPTCHA service. Should we make a deal with Satan to solve this relentless problem? “Help me solve my never-ending spam issue, and my users will help train your AI to detect stop signs.”
Like every bloody thing on the web, it’s never that easy to implement. But, several browser tabs later, you manage to piece a solution together.
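For reference, the server-side half of reCAPTCHA boils down to posting the token from the form (submitted as g-recaptcha-response) along with your secret key to Google’s siteverify endpoint, then reading the success flag in the JSON reply. The sketch below is one possible shape for that check, not the solution the forum ended up with. The function name and the $post callable are assumptions, used here so the logic can be exercised without a network connection.

```php
<?php

const RECAPTCHA_VERIFY_URL = 'https://www.google.com/recaptcha/api/siteverify';

/**
 * Asks the reCAPTCHA verification endpoint whether the token is valid.
 *
 * $post is a hypothetical transport: it receives the URL and the form
 * fields, performs the POST, and returns the raw response body.
 */
function recaptchaPasses(string $token, string $secret, callable $post): bool
{
    if ($token === '') {
        return false; // No token submitted, so there is nothing to verify.
    }

    $body = $post(RECAPTCHA_VERIFY_URL, ['secret' => $secret, 'response' => $token]);
    $result = json_decode((string) $body, true);

    // Treat anything other than an explicit "success": true as a failure.
    return is_array($result) && ($result['success'] ?? false) === true;
}
```

In a real application, the callable would wrap whatever HTTP client you already use; failing closed on a malformed response keeps a broken integration from silently letting spam through.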
Surely, the combination of keyword censoring, email confirmation, language detection, post throttling, honeypots, and reCAPTCHA is enough to kill this beast. It certainly seems so. It’s been three days without a single piece of spam.