Rise of the Spewbots

Dear Reader,

More specifically,

Dear reader thinking that it is a good idea to write a twitter bot that automatically spews tweets without regard to anyone else because everyone obviously wants to know what your bot has to say,

I guess because Oprah is now on twitter, things are exploding. Today alone, I’ve noticed 5-7 new bots on twitter spouting drivel and polluting my search time lines to the point of almost uselessness.

How I use twitter

I guess first, I should explain how I use twitter since different people use it differently. One of my jobs at Ibuildings is to be a PHP Community Evangelist. It’s the awesome part of my job and I love it. My twitter persona revolves around that aspect of my life. I try to make most of my tweets valuable to at least someone. (ok, my recent rants against #skype were just therapeutic for me.) My general rule of thumb is that most of my tweets need to inform or amuse.

I like finding people asking questions tagged #php and helping by answering them when I can. I like finding new blog posts about #php and helping spread the good ones. The rise of using search and hashtags has really expanded the usefulness of twitter for me…until now.

Rise of the Spewbots

Over the past few months I’ve seen an increasing number of bots tweeting. I have coined the term spewbots because for the most part, these bots spew forth data that I can easily get elsewhere should I want it and they add very little to the conversation. These bots break down into four categories.

Accidental spewbots

Today I got 6 svn check in notices from a spewbot. Now honestly, I’m sure they weren’t trying to pollute anyone’s time line. However, because they use the Zend Framework, I get their notices. I’ll refrain from ranting about the stupidity of tweeting svn check in notices; if that’s your thing and it doesn’t affect me then hey, go for it. Please let me make one request though. Protect your account’s timeline. A simple check of a box on your account’s settings and your tweets won’t go out into the public time line. Anyone who wants can follow your bot and, once you approve them, can partake of your spewy goodness.

Hashtag repeaters

I really don’t understand why some people feel the need to automatically re-tweet things that people tag. @hashwordpress is an excellent example of this. If anyone wants every tweet that has the #wordpress hashtag, all they need to do is use search.twitter.com, twitterfall.com or any of a hundred other ways to get these tweets; re-tweeting them helps no one.

Can we please just get rid of these? If you are the owner of one of these bots, please either shut it down, protect its time line so that you are not a spewbot, or leave me a comment here telling me why I’m wrong and they are really a good thing.

Bot testing spewage

All of a sudden on Friday, my entire twitter time line was replaced by a series of tweets from a single account. Each of them was just random words and the #php hashtag. In my frustration, I tweeted to this bot to please stop polluting my time line. Much to my surprise, I got an email about 20 minutes later from the owner of the bot asking what the problem was. I explained to them that their testing was polluting the #php hashtag. I asked them to replace the # with another character so they wouldn’t be bothering anyone. Much to my surprise, they agreed and I haven’t seen a tweet from them again.

If you are testing, there is no need to tag your tests. You are monitoring them and can tell if your bot is working. If you refuse to protect your bot’s time line during testing at least don’t tag the tweets, please?

…and the rest

There’s not really a way to classify the rest of the spewbots. Most of them that pollute the #php time line are job related. If they only tweeted one or two tweets a day it probably wouldn’t annoy me so much but you get bots like @fresh_projects, @freelance_jobs and @joomlajobs that tweet multiple tweets at a time several times a day.

I really don’t have a good way to handle these. To be honest, they are like Democrats, they piss me off but they’ve got as much right to be here as I do. I really wish my client of choice, twhirl, would let me right click on an account and tell it to never ever show me a tweet from this account again. That way they are free to pollute, but I am free to ignore them. Actually, this would be an awesome feature to implement in twitter itself but since it takes 30 days to get a response from them on a simple bug report/request these days, the chances of that happening are slim. So for now, I’ll just have to put up with them.

If anyone has a good idea for how to cleanse the #php search timeline of these parasites, I’d love to hear it. I tried building a Y!Pipes filter to filter them out but Pipes won’t check the feed often enough to pick up all the tweets.

Conclusion

Until we find a permanent solution to the problem of spewbots, I guess we just have to put up with them. I filter the most egregious of the spewbots with the -from: option in search.twitter.com but there’s a 140 character limit to the search query so I constantly have to re-evaluate which ones I filter and which ones I put up with.
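
For anyone doing the same juggling act, here is a minimal sketch of how it might be scripted. It assumes the limit applies to the raw query text and that you keep your own list of offenders, worst first. The function name build_query is made up for illustration; only the -from: operator and the 140 limit come from my own experience above.

```shell
#!/usr/bin/env bash
# Sketch: append -from: exclusions to a search term until the query would
# exceed the limit. build_query and the bot list are illustrative only.

build_query() {
  local limit=$1 query=$2
  shift 2
  local bot candidate
  for bot in "$@"; do
    candidate="$query -from:$bot"
    # Stop adding exclusions once the next one would not fit.
    if [ "${#candidate}" -gt "$limit" ]; then
      break
    fi
    query=$candidate
  done
  printf '%s\n' "$query"
}

# Example: most annoying bots first, so they are the ones that make the cut.
build_query 140 '#php' fresh_projects freelance_jobs joomlajobs
```

Listing the bots in order of annoyance means the script, rather than you, decides who falls off the end when the query fills up.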

I’m open to ideas on how to combat the problem of spewbots, especially the last group of them. For now though, I’m out of ideas.

Until next time,
(l)(k)(bunny)
=C=

One Line Linux “Twitter From File” Command

Dear Reader,

Sometime recently I was surfing around and came across a blog post where a user wrote a PHP script – fully OO, it was very pretty – that would pop the top line off of a file and tweet it. That struck me as odd. Actually, first it struck me as stupid, since all twitter.com needs is another mindless bot spewing lines from a file at regular intervals. Once I got past stupid, it struck me as odd.
Update: ‘tweetFromFile’ PHP Class is the original blog post.

I used to have a friend named Michael Chaney. (I’m pretty sure he and I are friends but we’ve not spoken since June.) Michael once posted in an email that one of his hobbies/quirks was that he would try and find ways to condense complex tasks into a single line of bash script. (I’m paraphrasing) I thought of that email when I read the “Tweet a line from a file” post. It just strikes me that it’s a bit overkill to fire up PHP for a simple task like this.

Before you ask, no, I’m not falling out of love with PHP but it’s a “right tool for the job” issue. PHP can be used to do this, but it’s not really necessary. So, if we are going to ignore the web’s all-purpose sledge hammer, what can we use? How about the tools that come with Linux?

Source Material

First, you are going to need a file to tweet. Now if you are using this to do something stupid like tweet UTC every minute or the status changes of your dorm room lamp then stop right now. I do not want the powers I’m going to teach you used for evil. However, if you are planning something like tweeting the “Dead Parrot” skit one line at a time, read on my friend and make sure you ping me so I can follow you. Either way you will need a file full of single lines less than 140 characters each. For my testing, I started with a Unix fortune file, “bofh-excuses”.

If you are on CentOS:

yum install fortune-bofh-excuses

If you are not on CentOS, figure it out on your own.

I’ll leave it as an exercise for the user to figure out how to strip out the cruft from the file or just create your own file of wit ready for tweeting. Whatever you do, name the file tweets.txt and put it in your working directory.
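
If you want a head start on that exercise, one way to strip the cruft might look like the sketch below. It assumes the source file uses the usual fortune layout, with entries separated by lines containing only a percent sign, and that each entry is a single line. The function name make_tweets and the source path are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: turn a fortune-style file into tweets.txt.
# Assumes entries are single lines separated by lines holding only "%".

make_tweets() {
  local src=$1 dest=$2
  # Drop the "%" separators, then keep non-empty lines shorter than 140 characters.
  grep -v '^%$' "$src" | awk 'length($0) > 0 && length($0) < 140' > "$dest"
}

# Usage (the source path is a placeholder, adjust it for your system):
# make_tweets /path/to/bofh-excuses tweets.txt
```

Anything too long to tweet is silently dropped, so check the line count of tweets.txt if you care about what got left behind.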

Twitter Account

Yes Sparky, you will need a twitter account to play with this code. No Sparky, you can’t borrow mine. If you don’t have one or you don’t want to bother your followers with inane test messages, I suggest registering a new one. Grab one nobody would want. (Not that I did, my test account is @elePHPant)

The Command

Here it is in all its glory for those too anxious to wait for my explanation.

head -n 1 tweets.txt | xargs -r -s 140 -I {} curl -s -d "status={}" -u twitterAcct:p.ass.word  http://twitter.com/statuses/update.json > /dev/null ;sed -i '1d' tweets.txt

For those of you who see line noise from an old modem, let’s go through this line by line. Note: broken up like this it will not work. If you are copying and pasting, use the one above.

1  head -n 1 tweets.txt
2  xargs -r -s 140 -I {}
3  curl -s -d "status={}" -u twitterAcct:p.ass.word  http://twitter.com/statuses/update.json > /dev/null
4  sed -i '1d' tweets.txt
  1. This command returns the first line from the tweets.txt file.
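  2. This command helps us build a command to execute using the output from line #1. The -I {} is critical here as it’s what tells xargs to take the input – in this case coming from stdin – and replace every instance of {} with it. For safety’s sake, the -s 140 caps the length of the command line xargs will build, so an overlong line gets rejected instead of being sent to twitter. (xargs counts the whole command against that limit, not just the tweet text, so treat it as a rough guard rather than an exact 140 character check.) Finally, the -r makes sure that it doesn’t call curl if there is no line to pass in.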
  3. This is the heart of the command, a call to curl.
    • -s, tells curl to run silent. This does not prevent output but it suppresses curl’s normal output. Anything coming from twitter will still be output.
    • The next option, -d specifies the string of data to be POSTed. Since we need this to be POST instead of GET, we have to specify the data string this way. The string following the -d command is the data to be sent. This is normal HTML name=value pairs separated by ampersands.
    • -u allows us to specify the username:password pair. Twitter uses basic authentication so it’s easy to authenticate simple tools like this. It is also highly insecure since you have to actually type your username and password into the cron script. This is one of these don’t try this at home things.
    • The next parameter is the url to call. If you have questions about how to call twitter’s API, check out the twitter API page for details.
    • Finally, because this command will return data, the final portion of this command will dump anything twitter sends back, in our case, JSON, to /dev/null. There is no error checking on this command, it either works and you see the tweet, or it fails and you don’t.
  4. This command pops the first line off the file using sed, the stream editor. The -i option tells sed to edit the file in place. The ‘1d’ says delete the first line. The final parameter to the sed command is the name of the file.
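
Since the one-liner deletes the first line whether or not the tweet actually went out, a slightly longer sketch might only pop the line once the send step reports success. This is not the one-liner, just an illustration of the same steps with a check added. Here send_tweet is a hypothetical stand-in that only echoes the text; in the command above its job is done by the curl call. Both function names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: pop the first line of a file only if sending it succeeds.

# Hypothetical placeholder for the curl call; replace with a real sender.
send_tweet() {
  printf 'would tweet: %s\n' "$1"
}

tweet_next_line() {
  local file=$1 line
  # Read the first line; give up if the file is missing, empty,
  # or starts with a blank line.
  IFS= read -r line < "$file" || [ -n "$line" ] || return 1
  [ -n "$line" ] || return 1
  # Delete the line only after the send step returns success.
  send_tweet "$line" && sed -i '1d' "$file"
}

# Usage:
# tweet_next_line tweets.txt
```

The trade-off is that a line which always fails to send will sit at the top of the file forever, so this version wants a human checking on it now and then.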

That’s it, drop all of those into a cron job that runs every 10 minutes, place 100 lines in tweets.txt that advertise your blog, weight loss, male enhancement or a porn site and you too can reduce your followers to just the other bots who are following you in hopes that you will follow them so they can spam you.

I really don’t expect anyone to find this useful but it was an interesting exercise in bash, a tool I don’t get to use nearly as much as I would like to.

Until next time,
(l)(k)(bunny)
=C=

p.s. If you don’t follow me on twitter and want to, I’m a real person, not a bot. Follow me at @calevans.

Desert Island Twitter Game

Dear Reader,

Ok, you are lost on a desert island and you can only follow 5 people on twitter. (Don’t think about it too hard, it’s just a freakin’ game) Who do you choose and why?

Here are my 5.

5: @andigutmans
Ok, so I work for Andi. (Actually, I work for @markdevisser) Andi is new to twitter but has already begun to see the potential. He’s started monitoring Zend’s footprint on twitter and I’ve seen him answer people’s questions or gripes about Zend, even when they are not addressed to him.

4: @lizziekeiper
I only met Lizzy a few months ago but she’s fun and every few days she asks her “Question of the day”. I like people that make me stop and think for a minute during my day.

3: @weierophinney
I work with Matthew and we usually talk every day or so. Matthew only tweets when he has something to say (and that’s rare on twitter), so if Matthew says something, I stop and read it and if he posts a URL, I almost always visit it.

2: @mtabini
Marco is a good friend and a really bright guy. I follow him because his posts are almost always funny or insightful.

1: @everysandwich
Fred Leo is the funniest man I’ve never met. (No offense @SoupySales) We’ve been on-line friends for about 3 years now. I help him with his blogs from time to time and in return he makes me laugh almost every day. Before twitter, Fred and I would talk on AIM every day. Talking with Fred on AIM is difficult, not because he’s hard to understand but because I hated to just blather. Since the things he said made me laugh out loud, I felt I had to be funny too. Which was fine, we had some awesome discussions but it’s taxing on my brain. With twitter now, I can get my Fredisms without feeling the pressure to be funny back. (We still talk on AIM though, just not as much) If you’ve got time, go visit his blog and listen to PETA Girl or root around till you find the story of Aunt Mary. It’s worth it, I promise.

Until Next Time,
(l)(k)(bunny)
=C=

Using Twitter for a Competitive Advantage

Dear Reader,

Over at the Small Business Idea Forum, Staci asked about twitter and I replied. This, along with a couple of other things today, is pointing me towards a blog post and possibly a podcast this weekend.

Twitter has gone from WTH to ZOMG to “Hey, I can use this for my benefit!” I like any tool that hits that last stage.

Three things have come together today to prompt me to write this post.

First, my friend and editor Elizabeth Naramore tweeted today:

someone explain to me the reasoning behind a company “following me” on twitter; are they just hoping I follow them too?

She’s not the first person that has noticed this trend, just the latest. The trend of following everyone on twitter because a lot of people automatically follow you back is growing. The obvious benefit is if you follow 10,000 people on twitter and 10% follow you back because they don’t know any better, when you post, 1,000 people see your post. So as a side note to this blog let me just advise any twitter user out there, don’t auto-follow. When you get a twitter “follow” notice, check out who it is. If it’s not someone you know then it’s twitter spam. Don’t bother to follow them. (You don’t have to block them though, let them artificially inflate your follower number.)

Then I saw this post from Michael Arrington. (Whom I do not follow because I do not know him and usually don’t care to hear what he has to say outside of techcrunch.com.) It was an A-Ha! moment for me. I do a lot of scanning with Google Alerts but his point is very important.

Twitter is the place where conversations are exploding well before they even make it to mainstream blogs.

It’s not enough these days to just monitor the web via Google alerts or some paid clipping service. Blogs are a trailing indicator these days. To be on top of your brand you have got to take it to the next level. tweetscan.com lets you do just that.

Finally, a forum post over at the Small Business Idea Forum again mentioned twitter and my reply there got me thinking.

Twitter started as a way to connect friends but is fast becoming a powerful marketing and business intelligence tool. I cover the former briefly in my forum post and on Sixty Second Tech but it’s the latter that I really want to talk about.

tweetscan.com

tweetscan.com is just what you think it is, a search engine for twitter. Yes, Google indexes twitter but these days that’s just not fast enough. Thankfully the guys and gals behind tweetscan solve that problem for us. It looks like they database and index the public feed. I don’t know where they get their resources but I hope to god they stay alive because this is something that twitter really needs.

If you have looked at their page by now and can’t figure out how to use it, please turn in your Internet secret decoder ring and shut off your modem. If you did figure it out, bully for you, you are as smart as a fifth grader! A couple of notes. If you read their blog and wiki (these people are on the web 2.0 ball!) then you know that they support OR and “-” operators. This makes life ever so much more interesting. Go ahead, play with a few queries like cats OR dogs. Hopefully they will add AND and NOT in there soon.

So, you can scan for topics. That’s kind of cool but other than replacing google egosurfing with twitter egosurfing what’s the point, right? Here’s the point. Search for your brand! In my case I have searches for “Cal Evans”, Zend and ZF. All fine and good, as Arrington points out, I can now see things before they happen as twitter is a leading indicator. But who wants to go visit their page every so often and execute a series of searches?

FEED ME!

Thankfully, the people behind tweetscan are fully Web 2.0 compliant and they provide me with a custom feed for each search I execute. This means I can plug the RSS feed of the above search for “Cal Evans”, into ANY feed reader and voila, instant ego surfing!

Now, I use Google Reader as my primary feed reader and it does a wonderful job. However, these feeds (I’ve got 8 now) are much more important to me than anything I have in Google Reader. I almost need them to be push. The next best thing to push is pull in a program I already use. I did NOT want to have to install yet another piece of software to make this whole thing work. (Cue Attensa to enter stage right) I used to use this Outlook plugin back when I was at Jupiter Hosting. It’s a great way to add RSS feeds into Outlook. It’s made some progress since 2005 and now is very unobtrusive.

Wrap It Up

So, to summarize: tweetscan.com + Attensa’s Outlook plugin = quick and easy business intelligence. Don’t forget to add feeds for your major competitors’ brands as well!

Until next time,
(l)(k)(bunny)
=C=
Join the Podcast Generation!

Dear Reader,

I was talking to a friend of mine recently and mentioned that I have a podcast (yes, I routinely try and convert my friends to podcast listeners.) His response to me was “Yes, I have an iPod but I’m just not a member of the podcast generation”. His problem is that he is suffering from information overload. Others that I have talked to complain that they just don’t know where to start. No matter what your excuse, if you are not listening to podcasts on a regular basis then you are missing out on a lot (I mean a LOT) of good information, tips and entertainment that is there for the taking.

So, I’m going to make it my mission to try and convert people to be podcast listeners. I hope, along the way, I’ll pick up a few listeners to my podcasts but honestly, a rising tide floats all boats. So, if you don’t currently listen to podcasts on a regular basis, keep reading. If you are already a member of the podcast generation (you subscribe to at least one podcast), I want you to send the URL to this article to at least 5 friends that don’t. If you’ve got a twitter account, tweet it. Let’s see if we can’t increase the number of podcast listeners significantly in the next month.

Editor’s Note: If you are in a hurry, just skim the bullet points and visit the URLs.

Cal’s 4 Step Program to joining the Podcast Generation:

1: Figure out how you are going to listen.
If you don’t have an mp3 player, you are not out in the cold; it just means you will most likely have to listen to them on your computer. I have 2 iPods myself and a total of 8 in the family and I still listen to about 25% of my podcasts on my computer. So don’t fret if you don’t have an mp3 player.

Most podcast sites these days have an embedded flash player. If you are going to be listening via your computer, take advantage of these. The only downside here is that you have to go check each podcast’s site on a regular basis. The embedded players are great for testing out a podcast to see if you want to subscribe though.

If you don’t want to have to check each site regularly, however, you will want a “Podcatcher”. A podcatcher is a program similar to an RSS feed reader that gathers all the feeds from all the podcasts you want to subscribe to and puts them in one place. Your podcatcher checks each feed on a regular basis to make sure that when you are ready to listen, you have the content downloaded and ready to go.

By far, the most popular podcatcher on earth is iTunes. iTunes is free from apple and comes with every iPod. It works on Windows and on OSX. You don’t have to have an iPod to use iTunes but if you have one it makes life so easy.

There are a lot of other podcatchers and originally I was going to list all I could find. However, in researching the list I found that someone else has already done the research. If you don’t want to use iTunes, visit PodCatcherMatrix.org and find the podcatcher that is right for you.

I know that Microsoft packages software with the Zune but I do not yet have one so I can’t comment on it. If you have a Zune and the software, please leave a comment telling us what you like/don’t like about it.

2: Figure out what you are going to listen to.
This may sound simplistic but you really don’t want to listen to everything out there. (Actually, you probably couldn’t but that’s beside the point) To get started, pick one topic that you like and find a single podcast you like on that topic. This could be more difficult than you think. First, there are several good podcast directories out there. If you use iTunes, by far the most popular is the iTunes store. It is, however, not the only source.

Find one show that you like and subscribe. Then as you have time, find a second, a third, etc. Since most shows release every week or every other week, if you are using podcasts to fill time on your commute, you will eventually find how many you need to fill the void. Resist the urge to type a keyword into iTunes and then subscribe to every one of them. Podcasts vary in quality of content and production values. Not all the high quality content podcasts have high production values and that’s ok. However, you will want to be selective in who you subscribe to. Also, don’t be afraid to drop a podcast that is not filling a need. Speaking as a podcaster I want “listeners” not “subscribers”.

3: For the first month, commit time each week to look for new podcasts
If you do this for a month, it will become as natural as checking your email. Just check the directories for new podcasts that match your keywords. If there is a new one and the description looks interesting, either subscribe to it in your podcatcher or give it a listen on-line if you’ve got the time.

4: Participate
This is where most of the podcast generation fail. Podcasters want feedback, we want lots of it. So if you like a podcast, take the time to tell the podcaster you do.

  • If they have a blog, comment.
  • If they are listed in iTunes, rate them.
  • If they are on one of the directories, rate them.
  • If they have a forum, post.

Participation is one of the things that sets podcasting apart from traditional “broadcast” media and it’s an important part of being part of the podcast generation. While most podcasts resemble radio shows in that the host talks to you, almost all podcasts have some way for you to communicate back to the host and we really want to hear from you. Regardless of the topic, most podcasts are one side of a conversation, you are the other half. Make sure you hold up your end of the conversation.

Many podcasts these days have “listener call-in lines”. They actively encourage you to call in and be a part of the conversation. You don’t have to have any special equipment or even a “radio voice” to have your voice heard. If your favorite podcast has a listener call-in line, put it on your speed dial and when listening, pause the podcast and call in. Let the host know if you agree, disagree, or just have more information.

Podcasts I listen To

That’s my four points. Now, I’m going to share with you the podcasts I listen to on a regular basis. Not because I have some deep insight as to what you should be listening to but mainly as a small way of saying thank you to the people that produce these shows. Consider my listing them here as an endorsement, if they fall in your area of interest, I would highly recommend them to you. (These are in alphabetical order, not in any order of preference.)

So, there you have it, a primer on joining the podcast nation. As I told my friend, our dues are cheap and our benefits are plentiful. No more excuses, get out there and participate!

Until next time,
(l)(k)(bunny)
=C=