
#XFactor

Today I downloaded tweets for the show #XFactor, as this hashtag was sort of trending and I didn't want to pick yet another sports-related hashtag. I wasn't too hopeful because I thought the hashtag wasn't really trending. But I was wrong! I could quickly get 100K tweets, and to my surprise the wordcloud was dominated by a handful of words ("factor" is obvious in hindsight, but I didn't anticipate it). This is an indication of a lot of retweets, since a retweet repeats the same words verbatim. So now is the time to adjust the code to remove retweets. But I am not sure whether to remove them, because that would put the same weight on every tweet without taking into account its popularity (measured through retweets in this case; I haven't looked into favorited tweets so far). Anyway, here is the wordcloud.

The word "Olly" shows up because it's the name of the presenter of the show, and it seems that he slipped up by telling a contestant the outcome before the results were formally announced.
http://www.telegraph.co.uk/culture/tvandradio/x-factor/11989313/What-time-and-when-is-The-X-Factor-on-TV-this-weekend-Plus-One-Direction-will-perform.html
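On the question of whether to drop retweets, one option is to count words both ways and compare. The following is only a minimal sketch, not the code used for this post: it assumes each tweet is a plain string and that a retweet can be recognized by a leading "RT @user:" prefix. The function names and the sample tweets are made up for illustration.

```python
import re
from collections import Counter

# Assumption: retweets start with "RT @username:" followed by the original text.
RT_PREFIX = re.compile(r"^RT @\w+:\s*")
WORD = re.compile(r"[a-z']+")

def strip_rt(text):
    """Remove a leading 'RT @user:' marker, if present."""
    return RT_PREFIX.sub("", text)

def word_counts(tweets, collapse_retweets):
    """Count words; optionally count each distinct tweet text only once."""
    texts = [strip_rt(t).lower() for t in tweets]
    if collapse_retweets:
        # dict.fromkeys keeps the first occurrence of each text, in order
        texts = list(dict.fromkeys(texts))
    counts = Counter()
    for text in texts:
        counts.update(WORD.findall(text))
    return counts

# Invented sample data, for illustration only
sample = [
    "Olly gave away the result #XFactor",
    "RT @fan1: Olly gave away the result #XFactor",
    "RT @fan2: Olly gave away the result #XFactor",
    "Great show tonight #XFactor",
]

weighted = word_counts(sample, collapse_retweets=False)  # retweets add weight
unique = word_counts(sample, collapse_retweets=True)     # one vote per text

print(weighted.most_common(3))
print(unique.most_common(3))
```

Keeping both counters side by side makes the trade-off visible: the weighted counts reflect popularity, while the unique counts show what people wrote independently.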




To my utter surprise, and also supporting the view that most of these look like retweets, the sentiment score was super positive. Also note that a sentiment score was available for almost 80% of the tweets, which is very high compared to my past sentiment graphs, where it was only about 50%. This all points to a large number of retweets. Maybe someone would like to study who retweeted them. Email me if you want to work on it.
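A quick way to check the retweet hypothesis would be to measure the retweet share directly and compare sentiment coverage with and without retweets. This is a rough sketch under stated assumptions, not the pipeline behind the graphs here: each tweet is assumed to be a dict with a "text" field and a "score" field that is None when no sentiment could be assigned, and retweets are assumed to start with "RT @". The rows below are invented purely to show the mechanics.

```python
def is_retweet(text):
    """Assumption: retweets are marked by a leading 'RT @'."""
    return text.startswith("RT @")

def coverage(rows):
    """Fraction of tweets that received a sentiment score."""
    if not rows:
        return 0.0
    scored = sum(1 for r in rows if r["score"] is not None)
    return scored / len(rows)

# Invented rows, for illustration only (not real data from this post)
rows = [
    {"text": "RT @a: love it", "score": 0.8},
    {"text": "RT @b: love it", "score": 0.8},
    {"text": "RT @c: love it", "score": 0.8},
    {"text": "who won?", "score": None},
    {"text": "great final", "score": 0.6},
]

originals = [r for r in rows if not is_retweet(r["text"])]
retweet_share = 1 - len(originals) / len(rows)

print(f"retweet share:        {retweet_share:.0%}")
print(f"coverage (all):       {coverage(rows):.0%}")
print(f"coverage (originals): {coverage(originals):.0%}")
```

If one scored tweet is retweeted many times, coverage over all tweets rises even though coverage over the original tweets does not, which is the effect suspected above.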




Popular posts from this blog

#SanBernadino

Update: Check the companion wordcloud for this here. Even as I write this, the events are still unfolding in San Bernardino. Two or possibly three gunmen killed 14 innocent people. You can read more at the New York Times (http://www.nytimes.com/2015/12/03/us/san-bernardino-shooting.html) or the Los Angeles Times (http://www.latimes.com/local/lanow/la-me-ln-san-bernardino-shooting-live-updates-htmlstory.html). Anyway, the most trending hashtag is misspelt! It's worth noticing that at this point "NRA" was quite common. This is because up to this point people were blaming the NRA for the tragedy. This has changed since I downloaded the tweets, however! The sentiment for the hashtag is surprisingly neutral. I say surprisingly because the hashtag has since become very toxic, as the shooters have been identified as Muslims. Suddenly there are so many xenophobic comments that I am sure the sentiment would be quite negative now.

#io2016

Google's largest annual event, the I/O conference, is taking place right now. You can find more news about it on TechCrunch or other tech blogs. I have around 32K clean tweets for the analysis. I was interested in knowing which Google products get the most mentions. I didn't include "Google" in the wordcloud because it would obviously be the most commonly used word. Well, as it turns out, Android is the most popular product discussed in these tweets. Allo is the new chat app that Google introduced to take on iMessage. Overall sentiment was quite positive, but there were still around 18% negative tweets. These could be because Google invited people to suggest what its next version of Android should be called. The name starts with N, so it was basically open to a lot of racist jokes.

#PanamaPapers

By now you have probably heard about the biggest news of this year. Mossack Fonseca, a Panama-based company that helped create shell companies for many rich people around the world in order to do legal as well as illegal things (hiding taxable income, corruption money, etc.), sprang a major leak. More than 11 million documents amounting to 2.6 TB were leaked to journalists. You can read everything about this on the Süddeutsche Zeitung site (http://panamapapers.sueddeutsche.de/en/). The size of the leak is enormous! Here is a graph for comparison. In light of this, the #panamapapers hashtag has been trending today. I downloaded more than 200K tweets, but only around 34,000 were original and the rest were retweets. Here are the wordcloud and sentiment graph.