Blocking spammers with SFS

By Ronald van Belzen | May 13, 2018

Let's focus our attention on nodes. Core functionality does not save the IP address of the user together with the node when it is created, so we need to introduce that to our module. The most straightforward approach would be to save the IP address in our own database table. For that I need to introduce a database table (sfs_hostname) by defining a new entity called SfsHostname.

<?php
/* /src/Entity/SfsHostname.php */

namespace Drupal\sfs\Entity;

use Drupal\Core\Entity\ContentEntityBase;
use Drupal\Core\Entity\ContentEntityInterface;
use Drupal\Core\Entity\EntityTypeInterface;
use Drupal\Core\Field\BaseFieldDefinition;

/**
 * Defines the sfs hostname entity.
 *
 * @ContentEntityType(
 *   id = "sfs_hostname",
 *   label = @Translation("SFS Hostname"),
 *   base_table = "sfs_hostname",
 *   entity_keys = {
 *     "id" = "id",
 *     "uuid" = "uuid",
 *     "label" = "hostname",
 *   },
 *   handlers = {
 *     "storage_schema" = "Drupal\sfs\SfsHostnameStorageSchema",
 *   },
 *   admin_permission = "administer sfs",
 * )
 */
class SfsHostname extends ContentEntityBase implements ContentEntityInterface {

  /**
   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
   *
   * @return array|\Drupal\Core\Field\FieldDefinitionInterface[]|mixed
   */
  public static function baseFieldDefinitions(EntityTypeInterface $entity_type) {

    $fields['id'] = BaseFieldDefinition::create('integer')
      ->setLabel(t('ID'))
      ->setReadOnly(TRUE);

    $fields['uuid'] = BaseFieldDefinition::create('uuid')
      ->setLabel(t('UUID'))
      ->setReadOnly(TRUE);

    $fields['hostname'] = BaseFieldDefinition::create('string')
      ->setLabel(t('Host name'));

    // Indexed, see SfsHostnameStorageSchema.
    $fields['uid'] = BaseFieldDefinition::create('integer')
      ->setLabel(t('User ID'));

    $fields['entity_id'] = BaseFieldDefinition::create('integer')
      ->setLabel(t('Entity ID'));

    $fields['entity_type'] = BaseFieldDefinition::create('string')
      ->setLabel(t('Entity type'));

    $fields['created'] = BaseFieldDefinition::create('created')
      ->setLabel(t('Creation date'));

    return $fields;
  }
}

In case we later want to save IP addresses for entity types other than nodes, I included the field "entity_type" to make that possible. I also included indexes to speed up the lookup of IP addresses. These are defined in the storage schema handler SfsHostnameStorageSchema, which is referenced in the annotation of SfsHostname.

<?php
/* /src/SfsHostnameStorageSchema.php */

namespace Drupal\sfs;

use Drupal\Core\Entity\ContentEntityTypeInterface;
use Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema;

/**
 * Defines the sfs_hostname schema handler.
 */
class SfsHostnameStorageSchema extends SqlContentEntityStorageSchema {

  /**
   * {@inheritdoc}
   */
  protected function getEntitySchema(ContentEntityTypeInterface $entity_type, $reset = FALSE) {
    $schema = parent::getEntitySchema($entity_type, $reset);

    $schema['sfs_hostname']['indexes'] += [
      'sfs_hostname_field_uid_value' => ['uid'],
      'sfs_hostname_field_hostname_value' => ['hostname'],
      'sfs_hostname_field_entity_id_value' => ['entity_id'],
      'sfs_hostname_field_entity_type_value' => ['entity_type'],
    ];

    return $schema;
  }
}
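
The "+=" used in getEntitySchema() is PHP's array union operator: it only adds keys that are not yet present on the left-hand side, so index definitions coming from the parent class are not overwritten. The stand-alone sketch below illustrates that behaviour with a made-up schema array; the "existing_index" entry is hypothetical and merely stands in for whatever the parent handler might return.

```php
<?php
// Hypothetical schema fragment standing in for the result of
// parent::getEntitySchema(); the 'existing_index' entry is made up.
$schema = [
  'sfs_hostname' => [
    'indexes' => [
      'existing_index' => ['id'],
    ],
  ],
];

// Array union: keys already present on the left are kept untouched,
// keys that only exist on the right are appended.
$schema['sfs_hostname']['indexes'] += [
  'existing_index' => ['uuid'],
  'sfs_hostname_field_uid_value' => ['uid'],
  'sfs_hostname_field_hostname_value' => ['hostname'],
];

// List the resulting index names, one per line.
$index_names = array_keys($schema['sfs_hostname']['indexes']);
print implode("\n", $index_names) . "\n";
```

The practical consequence is that the custom index names have to be unique; a name that collides with an index defined by the parent would silently be ignored.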

Now we need to tell Drupal that we need to save the IP address in this database table when a node is created, and remove it when a node is deleted. For this we can use the hooks "hook_node_insert()" and "hook_node_delete()".

// sfs.module

/**
 * Implements hook_node_insert().
 */
function sfs_node_insert($node) {
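  // db_insert() is deprecated in Drupal 8, so the database service is used.
  // A direct insert bypasses the entity API, so uuid and created are set here.
  \Drupal::database()->insert('sfs_hostname')
    ->fields([
      'uuid' => \Drupal::service('uuid')->generate(),
      'entity_id' => $node->id(),
      'entity_type' => 'node',
      'uid' => $node->getOwnerId(),
      'hostname' => \Drupal::request()->getClientIp(),
      'created' => \Drupal::time()->getRequestTime(),
    ])
    ->execute();
}

/**
 * Implements hook_node_delete().
 */
function sfs_node_delete($node) {
  \Drupal::database()->delete('sfs_hostname')
    ->condition('entity_id', $node->id())
    ->condition('entity_type', 'node')
    ->execute();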
}

So now we have the IP addresses to check when we want to report the creators of nodes to www.stopforumspam.com. Other entity types like comments do save the IP address together with the other data. For users we need other techniques to retrieve their IP address(es). And lastly, we cannot report spammers that use the contact form unless we decide to create a copy of all contact messages locally. I will come back to that in the next blog entry.

First I'd like to concentrate on blocking the spam, like the title of the blog says.

For that we need to validate the incoming data from the forms that users fill in for posting comments, registering themselves, etc. This is also accomplished with the help of hooks.

<?php
/* sfs.module */
use GuzzleHttp\Exception\ServerException;
use Drupal\Core\Form\FormStateInterface;
use Drupal\Core\Entity\EntityInterface;
use Drupal\Core\Url;


/**
 * Add validation to form submissions of user registration, node, comment and 
 * contact when the configuration settings allow it.
 * 
 * Implements hook_form_alter().
 */
function sfs_form_alter(&$form, FormStateInterface $form_state, $form_id) {
  $config = \Drupal::config('sfs.settings');
  if ($config->get('sfs_check_user_registration') && $form_id == 'user_register_form') {
    $form['#validate'][] = 'sfs_user_registration_form_validate';
  }
  if ($config->get('sfs_check_node')) {
    if (stripos($form_id, 'node_') === 0 && stripos($form_id, '_form') !== FALSE) {
      $form['#validate'][] = 'sfs_node_form_validate';
    }
  }
  if ($config->get('sfs_check_comment')) {
    if (stripos($form_id, 'comment_') === 0 && stripos($form_id, '_form') !== FALSE) {
      $form['#validate'][] = 'sfs_comment_form_validate';
    }
  }
  if ($config->get('sfs_check_contact')) {
    if (stripos($form_id, 'contact_message_') === 0 && stripos($form_id, '_form') !== FALSE) {
      $form['#validate'][] = 'sfs_contact_form_validate';
    }
  }
}

The above code conditionally adds extra validation to the four user activities we are interested in. The condition is the configuration setting that tells us for which kinds of user activity we want to block known spammers. For comments, for example, it adds the extra validation defined in the function "sfs_comment_form_validate", which is shown next.
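
To see which form IDs the stripos() checks in sfs_form_alter() let through, the test can be isolated in a small stand-alone function. This is only a sketch: the helper name sfs_example_form_id_matches() is hypothetical and not part of the module, and the form IDs are sample values.

```php
<?php
/**
 * Mirrors the form ID test used in sfs_form_alter().
 *
 * Hypothetical helper for illustration; not part of the module as shown.
 */
function sfs_example_form_id_matches(string $form_id, string $prefix): bool {
  // The ID must start with the prefix and contain '_form' somewhere.
  // stripos() is case-insensitive.
  return stripos($form_id, $prefix) === 0 && stripos($form_id, '_form') !== FALSE;
}

// Sample form IDs to try against the 'node_' prefix.
$samples = [
  'node_article_form',
  'node_article_edit_form',
  'node_article_delete_form',
  'comment_comment_form',
  'user_login_form',
];

foreach ($samples as $form_id) {
  $result = sfs_example_form_id_matches($form_id, 'node_') ? 'match' : 'no match';
  print $form_id . ': ' . $result . "\n";
}
```

Because '_form' may appear anywhere after the prefix, every ID that starts with "node_" and contains "_form" qualifies, including the delete form ID in the sample list. It may be worth checking which forms on a given site end up with the extra validation.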

/* sfs.module */

/**
 * Validate callback for comment form.
 */
function sfs_comment_form_validate(&$form, FormStateInterface $form_state) {
  $config = \Drupal::config('sfs.settings');
  
  if ($config->get('sfs_check_comment')) {
    $form_errors = $form_state->getErrors();
    if (!$form_errors) {
      $sfs = \Drupal::service('sfs.detect.spam');
      $ip = \Drupal::request()->getClientIp();
      $values = $form_state->getValues();
      if ($sfs->isSpammer($values['name'], NULL, $ip)) {
        $form_state->setErrorByName('', $config->get('sfs_blocked_message'));
        \Drupal::logger('sfs')->notice('Blocked spam comment: name:@name  e-mail:@mail  IP:@ip',
          ['@name' => $values['name'], '@mail' => NULL, '@ip' => $ip]);
        $delay = $config->get('sfs_flood_delay');
        if ($delay) {
          sleep($delay);
        }
      }
    }
  }
}

The validation function checks again whether comments need to be blocked for known spammers. When the form already contains errors it is not going to check with www.stopforumspam.com, because other errors will stop the form submission already. But when it is free of errors it first retrieves the service 'sfs.detect.spam' that you may remember from the previous post.

Then it retrieves the IP address from the request and the values filled in on the form, which unfortunately contain only the name in the case of comments.

When the service returns that the user is a known spammer, it sets an error on the form and the form submission will not take place. When the configuration settings require it, the response to the user is also delayed by a few seconds.
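
The order of these checks can be modelled without Drupal. The sketch below is a simplified stand-in, not the module's code: sfs_example_should_block() is a hypothetical function, and the closure $fake_lookup replaces the isSpammer() call of the 'sfs.detect.spam' service with a hard-coded answer for one documentation-range IP address. It counts the lookups to show that a form which already has errors never triggers a request.

```php
<?php
/**
 * Decides whether a submission should be blocked.
 *
 * Hypothetical stand-alone model of the validate callback; $is_spammer
 * stands in for the isSpammer() method of the 'sfs.detect.spam' service.
 */
function sfs_example_should_block(bool $check_enabled, array $form_errors, callable $is_spammer, ?string $name, ?string $mail, string $ip): bool {
  // The configuration setting switches the check on or off.
  if (!$check_enabled) {
    return FALSE;
  }
  // Existing errors already stop the submission, so skip the lookup.
  if ($form_errors) {
    return FALSE;
  }
  return (bool) $is_spammer($name, $mail, $ip);
}

// Fake lookup: counts its calls and flags a single made-up IP address.
$lookups = 0;
$fake_lookup = function (?string $name, ?string $mail, string $ip) use (&$lookups): bool {
  $lookups++;
  return $ip === '203.0.113.7';
};

// Clean form, flagged IP: the lookup runs and the submission is blocked.
$blocked = sfs_example_should_block(TRUE, [], $fake_lookup, 'bob', NULL, '203.0.113.7');
// Form with an error: the lookup is skipped entirely.
$skipped = sfs_example_should_block(TRUE, ['name' => 'Required'], $fake_lookup, 'bob', NULL, '203.0.113.7');
// Clean form, unflagged IP: the lookup runs and the submission passes.
$allowed = sfs_example_should_block(TRUE, [], $fake_lookup, 'alice', NULL, '198.51.100.20');

var_dump($blocked, $skipped, $allowed, $lookups);
```

Skipping the lookup when the form already has errors saves a request to the remote service for submissions that would be rejected anyway.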

The submissions of the other forms are handled in similar ways. The user registration validation passes all fields to the service for checking at www.stopforumspam.com, but otherwise the validations are similar.

That is all it takes to block known spammers.

Next I will continue with how spammers who do get through can be reported to www.stopforumspam.com, so they will not hinder us and others in the future.

