White Paper

Machine learning is becoming increasingly useful across the technology industry. Companies like Google, Apple, and Facebook are investing heavily in its development. Google's DeepMind recently beat a world champion at Go, a game far more complex than chess or checkers. The idea behind machine learning is that you give a program a set of training data, which it then uses to make decisions about inputs it has not seen before.

In our case, we want to build a program that detects the emotion of a tweet. We plan to start by developing the machine learning program itself alongside a custom-made dataset. Working on the two in tandem will let us tailor the dataset to the program. To create the dataset, we will most likely need a simple helper program with which we can manually label and store tweets. Once the program and dataset exist, we will train the program on the dataset; from there, it should be able to accept any other tweet as input and detect its emotion.

Depending on how well the core program performs, we may develop a browser extension that works within Twitter. It would query the core program for each tweet being viewed and display the detected emotion next to the tweet.
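
The plan above does not name a learning algorithm. As a minimal sketch of the train-then-classify idea, the example below assumes a multinomial naive Bayes model with add-one smoothing. This is a stand-in we chose for illustration, not necessarily what the final program will use. The class name, the `tokenize` helper, and the six training tweets are all made up; only the emotion-to-number mapping comes from our labeling program.

```python
import math
import re
from collections import Counter, defaultdict

# Emotion codes follow the mapping used by the labeling program.
EMOTIONS = {'love': 1, 'joy': 2, 'surprise': 3, 'anger': 4, 'sadness': 5, 'fear': 6}


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesEmotionClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()            # tweets seen per emotion
        self.word_counts = defaultdict(Counter)  # word frequencies per emotion
        self.vocabulary = set()

    def train(self, labeled_tweets):
        """labeled_tweets: iterable of (tweet_text, emotion_name) pairs."""
        for text, label in labeled_tweets:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        """Return the emotion name with the highest log-probability."""
        total_tweets = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float('-inf')
        for label, count in self.label_counts.items():
            score = math.log(count / total_tweets)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                seen = self.word_counts[label][word]
                score += math.log((seen + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training examples, for illustration only.
training_data = [
    ("I adore my family so much", 'love'),
    ("What a wonderful happy day", 'joy'),
    ("Wow I did not expect that at all", 'surprise'),
    ("This makes me furious", 'anger'),
    ("I feel so lonely and down", 'sadness'),
    ("I am terrified of the dark", 'fear'),
]

classifier = NaiveBayesEmotionClassifier()
classifier.train(training_data)
prediction = classifier.predict("such a happy wonderful morning")
print(prediction, EMOTIONS[prediction])
```

With these six toy examples, the new tweet shares the words "a", "happy", and "wonderful" with the joy example, so the sketch labels it as joy (code 2). A real dataset would need far more labeled tweets per emotion before the predictions mean anything.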

Dataset Sample (from MySQL Database):

[expand title=”Display Programs’ Code” swaptitle=”Hide Programs’ Code” swapalt=”Close Journal”]

import sys

import pymysql
import pymysql.cursors

# Numeric codes stored in the database for each emotion.
emotions = {'love': 1, 'joy': 2, 'surprise': 3, 'anger': 4, 'sadness': 5, 'fear': 6}

connection = pymysql.connect(host='127.0.0.1', user='root', password='meme', database='twitter',
                             charset='utf8mb4', cursorclass=pymysql.cursors.DictCursor)

try:
    while True:
        with connection.cursor() as cursor:
            # Fetch the next tweet that has not been rated yet.
            cursor.execute("SELECT `tweet` FROM `data` WHERE `rated_by_ferdi`=0 LIMIT 1;")
            row = cursor.fetchone()
            if row is None:
                print('No unrated tweets left.')
                break
            tweetText = row['tweet']

            print('Love (1), Joy (2), Surprise (3), Anger (4), Sadness (5), Fear (6)')
            answer = input(tweetText + ':\n').lower()

            # Accept either the number or the name of the emotion (first match wins).
            emotion = None
            for name, code in emotions.items():
                if str(code) in answer or name in answer:
                    emotion = code
                    break
            if emotion is None:
                print('error')
                sys.exit()

            # Parameterized query: the driver escapes any quotes inside the tweet text.
            updateTweetSQL = 'UPDATE `data` SET `emotion_ferdi` = %s, `rated_by_ferdi` = 1 WHERE `tweet` = %s'
            cursor.execute(updateTweetSQL, (emotion, tweetText))
            connection.commit()
except KeyboardInterrupt:
    # Ctrl+C ends the rating session.
    pass
finally:
    connection.close()

Below is the source code for the program that mines tweets from Twitter.

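import pymysql
import pymysql.cursors
from twitterscraper import query_tweets

keywords = ['nfl', 'ripaim']

# Same connection settings as the labeling program.
connection = pymysql.connect(host='127.0.0.1', user='root', password='meme', database='twitter',
                             charset='utf8mb4', cursorclass=pymysql.cursors.DictCursor)

# Sanity check: show one tweet that still needs a rating.
try:
    with connection.cursor() as cursor:
        ratedDataSQL = "SELECT `tweet` FROM `data` WHERE `rated_by_ferdi`=0"
        cursor.execute(ratedDataSQL)
        print(cursor.fetchone())
finally:
    connection.close()

# Scrape up to 10 tweets per keyword and print the ones that do not link to a picture.
for keyword in keywords:
    for tweet in query_tweets(keyword, 10)[:10]:
        tweetText = tweet.text + '\n'
        if 'pic.twitter' not in tweetText:
            print(tweetText)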

[/expand]