After the tool parsed the JSON, it let me download a CSV file for each conversation. The ones that interested me were the conversations with the most messages, which were with my close friends and my ex-girlfriends. These CSV files were not perfect: carriage returns, URLs, and the weird emoticons we liked to use disrupted the structure. So I had to clean up these files as well, which meant losing a little information.
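I don't have the exact cleanup steps written down, so here is only a rough sketch of the kind of normalization involved. The helper name `cleanMessage` is made up for illustration, and it only handles line breaks and URLs, not emoticons.

```javascript
// Hypothetical helper (not the original cleanup code): normalizes one raw message body.
function cleanMessage(text) {
  return text
    .replace(/\r\n?|\n/g, " ")      // turn carriage returns and newlines into spaces
    .replace(/https?:\/\/\S+/g, "") // drop URLs, which tend to break the structure
    .replace(/\s+/g, " ")           // collapse runs of whitespace left behind
    .trim();
}

console.log(cleanMessage("hey\r\ncheck this https://example.com/a?b=1 out"));
```

Dropping URLs outright is the simplest option, and it is also one of the places where a little information gets lost.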
I then used the csv-parse library to parse these CSVs. Because this project mainly focuses on my own messaging style, I only had to extract the messages that I sent in these conversations. After compiling all of my messages, I noticed that my messaging style does not lend itself to being analyzed as complete thoughts. I like to message rapidly
in a sort of
stream of consciousness,
kind
of like
this.
So I also needed a way to determine complete thoughts. I ended up appending any message that I sent within 5 seconds of my previous one to that previous message. I did this by computing the time difference between the timestamps of consecutive messages.
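Here is a minimal sketch of that merging step. It assumes each message has already been parsed into an object with a `text` string and a `timestamp` in milliseconds; that shape and the function name `mergeRapidMessages` are assumptions for illustration, not taken from Bot.js.

```javascript
// Assumed message shape: { text: string, timestamp: number (ms since epoch) },
// sorted oldest first.
const MERGE_WINDOW_MS = 5000; // the 5-second window described above

function mergeRapidMessages(messages, windowMs = MERGE_WINDOW_MS) {
  const thoughts = [];
  let lastTimestamp = null;
  for (const msg of messages) {
    // A message continues the current thought if it arrived soon after the previous one.
    const isContinuation =
      lastTimestamp !== null && msg.timestamp - lastTimestamp <= windowMs;
    if (isContinuation) {
      thoughts[thoughts.length - 1] += " " + msg.text;
    } else {
      thoughts.push(msg.text);
    }
    lastTimestamp = msg.timestamp;
  }
  return thoughts;
}

const sample = [
  { text: "I like to message rapidly", timestamp: 0 },
  { text: "in a sort of", timestamp: 2000 },
  { text: "stream of consciousness", timestamp: 4000 },
  { text: "anyway", timestamp: 60000 },
];
console.log(mergeRapidMessages(sample));
```

Each message is compared against the one right before it rather than against the start of the thought, so a long burst of quick messages stays together as one thought.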
After I was satisfied with the final text file, I used the RiTa library to analyze the text with Markov chains and generate sentences. The result was bizarre and familiar.
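To show roughly what a Markov chain does with the text, here is a tiny word-level version in plain JavaScript. This is a stand-in for illustration, not RiTa's implementation or API; `buildChain` and `generate` are names I made up, and RiTa handles sentence boundaries and tokenization far more carefully.

```javascript
// Builds a word-level model: each run of `order` words maps to the words seen right after it.
function buildChain(lines, order = 2) {
  const chain = new Map();
  const starts = []; // opening word groups, used to begin generated sentences
  for (const line of lines) {
    const words = line.split(/\s+/).filter(Boolean);
    if (words.length <= order) continue; // too short to learn anything from
    starts.push(words.slice(0, order));
    for (let i = 0; i + order < words.length; i++) {
      const key = words.slice(i, i + order).join(" ");
      if (!chain.has(key)) chain.set(key, []);
      chain.get(key).push(words[i + order]);
    }
  }
  return { chain, starts, order };
}

// Walks the model from a random opening until it hits a dead end or maxWords.
function generate(model, maxWords = 20, random = Math.random) {
  const { chain, starts, order } = model;
  if (starts.length === 0) return "";
  const pick = (arr) => arr[Math.floor(random() * arr.length)];
  const out = [...pick(starts)];
  while (out.length < maxWords) {
    const candidates = chain.get(out.slice(-order).join(" "));
    if (!candidates) break; // no word ever followed this pair in the source text
    out.push(pick(candidates));
  }
  return out.join(" ");
}

const model = buildChain([
  "i like to message rapidly in a stream",
  "i like to eat noodles in a hurry",
]);
console.log(generate(model));
```

Because every word is chosen only from words that actually followed the same pair in my messages, the output keeps my vocabulary and rhythm while wandering between unrelated conversations, which is probably why it reads as both bizarre and familiar.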
My main code, Bot.js, runs on a DigitalOcean server and is kept running with the forever library:
https://gist.github.com/bryanjhsu/4353b3cbb14dc6c2163db88148018d2f
For now, every time someone DMs the bot, it generates a sentence from the Markov chain and replies to the sender. The bot also follows back any user who follows it, which makes it easy for that user to DM the bot after following.
I am definitely going to continue working on this bot. I would like to have the bot recognize keywords in received messages and reply in a way that makes it more "conversational".
Bonus: here is a conversation my ShoobyBot had with Liarbot (a bot that tweets anything you DM it)