Wednesday, 8 June 2011

Barclaycard robots read each link I post on Twitter

This is a real WTF.  As some may have noticed, I've experimented over the last couple of years with my own URL shortener, ejf.me. I use the data, along with amalgamated visitor stats from other sites I run, to compile my web trends charts.

I have also spent the last few years studying robots that visit my blogs and sites, and as part of this study I maintain a categorised database of web browser user agent strings, and a list of IP subnets where robots are likely to reside (e.g. Amazon Web Services, Google, Yahoo etc).
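As a rough illustration of the subnet side of such a database, here is a minimal Python sketch that checks whether a visitor's IP address falls inside a list of known robot subnets. The subnet labels and ranges are placeholders (reserved documentation ranges), not the real ranges of any provider, and the function name is made up for this example.

```python
import ipaddress

# Hypothetical subnet list. These are reserved documentation ranges
# (RFC 5737), standing in for real hosting or search-engine ranges.
ROBOT_SUBNETS = {
    "example-cloud": ipaddress.ip_network("192.0.2.0/24"),
    "example-search": ipaddress.ip_network("198.51.100.0/24"),
}


def robot_subnet_for(ip):
    """Return the label of the first listed subnet containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for label, network in ROBOT_SUBNETS.items():
        if addr in network:
            return label
    return None
```

A visitor whose address matches no listed subnet returns None, which is exactly the case where behavioural checks become necessary.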

I've identified several methods to detect anomalies, such as robot-like behaviour from visitors who don't identify themselves as robots in the user agent string.  Such behaviours include: frequent repeat visits; visits within a very short time of publication; failure to download images and other superfluous page elements during a visit; and odd combinations of capabilities in the user agent string itself.
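The first three of those behaviours could be sketched in Python along the following lines. This is only an illustration under stated assumptions: the Visit record, the signal names and both thresholds (one minute, five visits) are invented for the example, and the user agent check is left out because it depends on the contents of the user agent database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Visit:
    """One logged request for a link (hypothetical record layout)."""
    when: datetime
    fetched_images: bool


def robot_signals(visits, published,
                  quick=timedelta(minutes=1), repeat_limit=5):
    """Return the set of robot-like signals shown by one visitor's visits.

    quick and repeat_limit are assumed thresholds, not measured values.
    """
    signals = set()
    if len(visits) > repeat_limit:
        signals.add("frequent repeat visits")
    # A visit landing between publication and publication + quick.
    if any(timedelta(0) <= v.when - published <= quick for v in visits):
        signals.add("visit soon after publication")
    # No visit at all pulled the images or other page furniture.
    if visits and not any(v.fetched_images for v in visits):
        signals.add("no images downloaded")
    return signals
```

No single signal is conclusive on its own; the idea is that several together make a human visitor unlikely.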

Since 19th May 2011 I've noticed one such anomaly from an IP subnet listed as belonging to Barclaycard:

netname:        BARCLAYCARD
descr:          Barclaycard
country:        GB
admin-c:        IJE3-RIPE
tech-c:         RC1629-RIPE

In the last 7 days alone, this bot has visited 18 of the links I posted within a minute or so of posting (often within seconds):

| Visit from Barclaycard | Link first posted   | ejf handle |
| 2011-05-19 12:03:13    | 2011-05-19 11:49:27 | ej         |
| 2011-06-01 09:11:19    | 2011-06-01 09:11:04 | eq         |
| 2011-06-01 10:51:40    | 2011-06-01 10:51:10 | es         |
| 2011-06-01 13:24:21    | 2011-06-01 13:24:18 | et         |
| 2011-06-01 13:46:38    | 2011-06-01 13:46:18 | eu         |
| 2011-06-01 13:54:19    | 2011-06-01 13:53:03 | ev         |
| 2011-06-01 16:57:39    | 2011-06-01 16:57:11 | ex         |
| 2011-06-03 09:25:06    | 2011-06-03 09:24:15 | eC         |
| 2011-06-03 10:05:20    | 2011-06-03 10:05:14 | eD         |
| 2011-06-06 15:27:09    | 2011-06-06 15:26:20 | eK         |
| 2011-06-06 16:03:43    | 2011-06-06 16:03:41 | eL         |
| 2011-06-07 09:08:26    | 2011-06-07 09:06:33 | eM         |
| 2011-06-07 12:14:13    | 2011-06-07 12:14:05 | eO         |
| 2011-06-07 15:08:21    | 2011-06-07 15:08:18 | eQ         |
| 2011-06-07 16:56:22    | 2011-06-07 16:56:09 | eS         |
| 2011-06-07 17:19:29    | 2011-06-07 17:19:21 | eT         |
| 2011-06-07 20:29:25    | 2011-06-07 20:26:44 | eV         |
| 2011-06-08 19:02:33    | 2011-06-08 19:01:41 | eW         |
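
To put a number on "within seconds", the gap between the two timestamps in each row can be computed directly. The short Python sketch below does this for three rows copied from the table above; the function name is made up for the example, and it assumes both timestamps are recorded in the same time zone.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def delay_seconds(visit, posted):
    """Seconds between a link being first posted and the visit."""
    delta = datetime.strptime(visit, FMT) - datetime.strptime(posted, FMT)
    return int(delta.total_seconds())


# Three rows from the table: (visit, link first posted, ejf handle)
rows = [
    ("2011-06-06 16:03:43", "2011-06-06 16:03:41", "eL"),
    ("2011-06-07 15:08:21", "2011-06-07 15:08:18", "eQ"),
    ("2011-05-19 12:03:13", "2011-05-19 11:49:27", "ej"),
]

for visit, posted, handle in rows:
    print(handle, delay_seconds(visit, posted))
# prints: eL 2, eQ 3, ej 826 (one per line)
```

The ej visit, nearly 14 minutes after posting, is the slowest in the table; most of the others arrive in under a minute.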

So why would Barclaycard be so interested in my digital-policy-focussed Twitter stream that it wants to grab a copy of each link within minutes of posting?

I don't keep all the data I analyse for my web trends survey, but in the data still in my logs I've searched back to the earliest known visit from this Barclaycard-allocated IP subnet.  It's this: http://ejf.me/ej - a story I linked to in the Daily Telegraph about the Fred Goodwin Super-injunction!

Filed under: bizarre

@JamesFirth
