How spam filters save us from intrusive ads (featuring @phenom as author)

Attention: This post has been written by @phenom

All of you have received spam in your mailboxes, but most of it was blocked before it ever reached your eyes. Do you know how spam filters work, how they decide that an email is spam, how they protect you from it, and what spammers do to circumvent these measures? We will discuss all of this in today's article.

Let’s understand what the word "spam" means.

  • Spam - advertising that is sent out against the will of the recipient.

Now it is easy to understand what a spam filter is.

  • Spam filter - software that automatically detects spam. It can be used by individual users or by servers, and it separates normal correspondence from spam mailings.

Almost all spam filters use two main methods of filtering:

Analysis of the content of the letter

This method uses statistical analysis of the message content. The filter first has to be "trained": messages are sorted manually so that the statistical characteristics of normal email and of spam can be identified. The method works very well on messages in which the advertising is provided as plain text or HTML. After training on a sufficiently large number of emails, it becomes possible to cut up to 95-97% of the spam.
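As a rough illustration of the "training" idea, here is a minimal sketch of a naive Bayes classifier, one common statistical technique for this kind of filtering. The article does not say which algorithm real filters use, so the class name, the tokenizer and the smoothing choices below are assumptions made only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Hypothetical toy filter; not the implementation of any real mail service."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one manually sorted message; label is 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens, label):
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        # Smoothed prior: how common this class was among the training messages.
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        for token in tokens:
            count = self.word_counts[label][token]
            # Add-one smoothing so an unseen word never gives a zero probability.
            score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
        return score

    def is_spam(self, text):
        """Return True when the spam score is higher than the normal-mail score."""
        tokens = tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")
```

To try it, call `train()` with a handful of messages labelled "spam" and "ham", then pass a new message to `is_spam()`. A real filter would train on thousands of messages and use far more features than single words.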

However, there are ways to circumvent these filters. Spammers add random text to the message and place the advertising itself in an image. The random text fools the filter and makes it hard to train. Many email services offer a "Report Spam" button to train the filter: information about which messages users consider spam is used both to filter those messages and to train the filters in the future. Gmail and Facebook use such a system.

Analysis of the sender

There are many blacklists containing the IP addresses of computers that send spam. To find out whether an IP is blacklisted, a request is made through DNS, which is why these lists are called DNSBL (DNS Black List).
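The DNS request can be sketched as follows. By the usual DNSBL convention, the octets of the IPv4 address are reversed and the blacklist's zone is appended; if that name resolves, the address is listed. The zone `dnsbl.example.org` used in the usage note is a placeholder, not a real blacklist, and the function names are made up for this example.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError for invalid input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the DNSBL zone has a record for this IP address.

    Any resolution failure is treated as "not listed", so a network
    outage looks the same as a clean address in this simple sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` produces the name `99.2.0.192.dnsbl.example.org`. To run a real check, pass the zone of an actual blacklist to `is_listed()`.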

This method is currently not very effective, because spammers find new servers for their purposes faster than filters can add them to the blacklists. In addition, a few computers that send spam can compromise an entire mail domain or subnet, and thousands of law-abiding people may be left unable to send e-mail to servers that use such a blacklist for an indefinite time. Irresponsible and incorrect use of blacklists by administrators also leads to the blocking of a large number of innocent users.

Greylisting

Greylisting is based on analyzing the "behavior" of software designed to send spam and comparing it with the normal behavior of ordinary mail servers. Spam-sending programs usually do not try to re-send a message after they receive an error. The simplest greylisting software works like this: all previously unknown SMTP servers are considered "gray."

Mail from such servers is not accepted, but it is not rejected outright either: the receiving server returns a temporary error code. If the sending server repeats its attempt after a certain period, the receiving server moves it to the whitelist. Therefore, normal messages are not lost but only delayed. This method can filter out up to 90% of spam with virtually no risk of losing normal emails. However, it is also not perfect.
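The greylisting logic described above can be sketched in a few lines. This is a simplified model keyed only by the sending server's IP address. The `Greylist` class, its method name and the 300-second retry delay are assumptions for illustration, not values taken from any particular mail server.

```python
import time


class Greylist:
    """Toy greylist: unknown servers must retry after a minimum delay."""

    def __init__(self, min_retry_delay=300.0):
        self.min_retry_delay = min_retry_delay  # seconds; assumed value
        self.first_seen = {}    # server IP -> time of its first attempt
        self.whitelist = set()  # servers that retried correctly

    def should_accept(self, server_ip, now=None):
        """Return True to accept the mail, False to answer with a temporary error."""
        if now is None:
            now = time.time()
        if server_ip in self.whitelist:
            return True
        first_attempt = self.first_seen.get(server_ip)
        if first_attempt is None:
            # Previously unknown server: remember it and ask it to retry later.
            self.first_seen[server_ip] = now
            return False
        if now - first_attempt >= self.min_retry_delay:
            # The server retried after the delay, as a normal mail server would.
            self.whitelist.add(server_ip)
            del self.first_seen[server_ip]
            return True
        # Retried too quickly: keep answering with a temporary error.
        return False
```

A mail server would call `should_accept()` for each incoming connection and reply with a temporary SMTP error whenever it returns False. Passing `now` explicitly is only a convenience for experimenting without waiting for real time to pass.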

There are also many other methods:

  • refusal to accept letters with the wrong return address (letter from non-existent domain);
  • analysis of message headers;
  • systems that detect the characteristics of mass mailings, etc.

But these methods are used so rarely that there is no need to say more about them.

At the end of my article, I would like to tell you how to defend your blog from spam.

Modern methods of fighting against spam use different types of captcha.

Captcha methods:

  • Captcha-picture - provided by the universal reCAPTCHA service; it asks us to type the words or numbers shown on an object in a picture.

  • Text captcha - a captcha that asks the user to answer a question, to retype the numbers or letters shown above, or to solve an equation.

  • Interactive captcha - a captcha that asks the user to interact with images and objects to prove that they are not a robot.
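To make the text captcha idea concrete, here is a minimal sketch of the "solve an equation" variant. The function names are hypothetical, and a real blog would also need to keep the expected answer on the server side and limit the number of attempts, which this example leaves out.

```python
import random


def make_captcha(rng=None):
    """Return a (question, expected_answer) pair for a simple addition captcha."""
    if rng is None:
        rng = random.Random()
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} + {b}?", str(a + b)


def check_captcha(expected_answer, user_input):
    """Compare the user's answer with the expected one, ignoring extra spaces."""
    return user_input.strip() == expected_answer
```

The blog would show the question next to the comment form and publish the comment only if `check_captcha()` returns True for the submitted answer.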


As you now know, even corporations such as Google can’t create software that filters spam with 100% accuracy and without losing any real messages that are mistaken for spam, so spam will continue to annoy us, but in small quantities.

There are still a lot of ways to circumvent spam filters, and spammers use them really well. So if you have your own blog, use captcha filters, which can help you cut up to 90% of spam. And if spam reaches your email, flag it as spam, so that the developers can figure out how this email got through the filters and teach their filter to fight spam of this type.

Image credit: 1, 2, 3, 4, 5, 6

Follow me if you're a GEEK like me or want to learn more about IT/Technological/Math topics

Alex aka @phenom


@knozaki2015 features authors and artists to promote them and a diversity of content. https://steemit.chat/channel/academy (if you want to get in touch)

The author will receive 100% of the STEEM Dollars from this post

Don't just follow me, follow the author as well, if you like their post @phenom
