HOBBY PROJECTS
Project | We Need To Talk About Sentiment Analysis
In this blog post, I apply three pre-trained sentiment algorithms, ranging from dictionaries to machine learning models, to demonstrate the risks of applying these models indiscriminately and what we can do to mitigate them. More specifically, I scraped around 1,000 speeches and 11,000 tweets from politicians and compared the sentiment scores across the different algorithms and text types.
The presidential speeches and coded datasets are available on Kaggle. The code is available on GitHub.
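To give a feel for why dictionary-based scores can diverge, here is a minimal Python sketch. It is not the code from the blog post: the two lexicons, their weights, and the function names are hypothetical and chosen purely for illustration. It scores the same texts with two toy dictionaries and returns the results side by side.

```python
import re

# Two toy lexicons (hypothetical; real sentiment dictionaries hold thousands of entries).
LEXICON_A = {"great": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0}
LEXICON_B = {"great": 1.0, "fight": -1.0, "war": -1.0, "hope": 0.5}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon):
    """Mean polarity of the tokens found in the lexicon; 0.0 if none match."""
    hits = [lexicon[token] for token in tokenize(text) if token in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def compare(texts, lexicons):
    """Score every text with every lexicon and return one dict per text."""
    return [
        {name: lexicon_score(text, lexicon) for name, lexicon in lexicons.items()}
        for text in texts
    ]


texts = ["We will fight for a great future.", "This policy is bad."]
scores = compare(texts, {"A": LEXICON_A, "B": LEXICON_B})
for text, score in zip(texts, scores):
    print(text, score)
```

In this toy setup the first sentence is clearly positive under lexicon A but neutral under lexicon B, because B treats "fight" as negative. Political rhetoric is full of such words, which is one way pre-trained dictionaries can mislead on speeches and tweets.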
Project | Twitter Analytics: analyzing the twittersphere of Flemish politics
I scraped around 600,000 tweets from Flemish elected officials (251 accounts) and analyzed their communication patterns (i.e. who talks to whom), patterns in word usage, and emotional polarity.
The results were featured in Knack (print) and Datanews. The results and interpretation of the social network analysis (SNA) are reported in Knack (online), and you can explore the SNA, color-coded by network community or polarity, right here. For more info, consult this presentation I gave as a guest speaker at the 'media innovation week' (Ghent University).
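As a rough illustration of the "who talks to whom" step, the sketch below counts directed mentions between accounts. It is an assumption about how such a network could be built, not the project's actual code. The `(author, text)` input format, the account names, and the `mention_edges` helper are all hypothetical.

```python
import re
from collections import Counter

# Matches @handles; a simplification that ignores edge cases such as e-mail addresses.
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count directed (author -> mentioned account) pairs, ignoring self-mentions."""
    edges = Counter()
    for author, text in tweets:
        source = author.lower()
        for target in MENTION.findall(text):
            target = target.lower()
            if target != source:
                edges[(source, target)] += 1
    return edges


# Hypothetical sample data: (author, tweet text) pairs.
tweets = [
    ("alice", "Thanks @Bob for the debate, @carol was there too"),
    ("bob", "@alice agreed!"),
    ("alice", "@bob see you tomorrow"),
]
edges = mention_edges(tweets)
for (source, target), weight in edges.most_common():
    print(f"{source} -> {target}: {weight}")
```

The resulting weighted edge list is the kind of input a social network analysis tool could take for community detection or for coloring nodes by polarity.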
Project | NordVPN switcher: A Python script for server rotation
NordVPN switcher is a user-friendly and flexible Python script for switching between different servers. It works on both Windows and Linux without requiring any changes to your code. Until now, the available rotation tools only worked on Linux and were rather clunky. By contrast, this script guides the user through a couple of straightforward menu options. Alternatively, you can provide the function with preloaded settings or even a Python list of predetermined connection options. Check out the code and step-by-step tutorial on GitHub!
Project | Popular Times scraper: scraping Google maps using Selenium
I developed a Google Maps scraper that collects not only general place info (e.g. reviews) but also the valuable 'popular times' data, which is unavailable through the Google Maps API. To put the scraper to the test, I collected info on more than 13,000 places on Google Maps across Europe. Check out the demo (video), the code, and some example output data (1|2).
Read all about it in this blog post where I analyze the scraped data and discuss the potential and pitfalls of leveraging popular times data within a research context.
Project | PageRank manipulator: Influence Google search results
This script combines the NordVPN switcher (see above) with Selenium to influence the Google PageRank score of specific search results. In the demo (video), you can see how this works in a fictitious example for Elon Musk. The script handles the CAPTCHA pages that Google serves in response to suspicious network traffic without any trouble. To avoid bot detection, I adjusted several standard ChromeDriver settings to trick Google into believing that a real human is searching. The code is available on GitHub.
PUBLICATIONS
Research | Exploring the Effect of In-Game Purchases on Mobile Game Use with Smartphone Trace Data (1st author)
This study leverages smartphone trace data to explore the longitudinal effect of in-game purchase behavior on continual mobile game use. In total, approximately 100,000 hours of mobile game activity among 6,340 subjects were analyzed. (suppl. material)
Keywords
- Big Data
- Computational Methods
- Survival modeling
- R
Research | (What) can Journalism Studies learn from supervised machine learning? (2nd author)
Keywords
- Big Data
- Machine learning
- Content analysis
- Theory of communication
Research | Addressing the Temporality of Online Repositories When Working with Trace Data (1st author)
In this paper, we discuss the development of a sequential web scraper in R and the challenges that arose from (1) the inherent temporality of web repositories and (2) possible biases introduced by the platform architecture. The results of the app scraper are freely available on Kaggle.
Keywords
- Web scraping
- Trace data
Research | #Muslim? Instagram, Visual Culture and the Mediatization of Muslim Religiosity (3rd author)
Keywords
- Social media analysis
- Natural Language Processing (NLP)