QUEST Datathon 2021

Hack Islamophobia

Varun Singhai, Asher Gilani, Daniel Ben-Or

1. Introduction

2. Data Wrangling

I want to figure out where Islamophobic tweets originate (which tweets, which users, and so on) and what factors contribute to responses to these tweets.

The given dataset comes with many extra columns, so I will extract only the relevant data.

2.1 Imports

In [1]:
import pandas as pd
import statsmodels.api as sm
from collections import defaultdict, Counter
import json

import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from matplotlib.colors import Normalize
import seaborn as sns; sns.set()

import networkx as nx
from pyvis.network import Network

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import TweetTokenizer
nltk.download('punkt'); nltk.download('stopwords')
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

from wordcloud import WordCloud

import datetime
from tqdm import tqdm
import bar_chart_race as bcr

import twint

import folium
from geopy.geocoders import Nominatim
from folium.plugins import HeatMap
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\asher\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\asher\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!

2.2 Loading in the Data

In [2]:
df_50k = pd.read_csv('./noislamophobia-dataset-50k.csv')
df_75k = pd.read_csv('./noislamophobia-dataset-75k.csv')
df = pd.concat([df_50k,df_75k])
df.head()
c:\users\asher\appdata\local\programs\python\python37\lib\site-packages\IPython\core\interactiveshell.py:3049: DtypeWarning: Columns (2,8,32) have mixed types.Specify dtype option on import or set low_memory=False.
  interactivity=interactivity, compiler=compiler, result=result)
Out[2]:
_id contributors coordinates created_at entities extended_entities favorite_count favorited geo id ... quoted_status_id_str retweet_count retweeted retweeted_status source text full_text truncated user withheld_in_countries
0 ObjectId(59dbede4e6e465a2d67a1062) NaN NaN Sun Oct 22 10:34:06 +0000 2017 {"hashtags":[{"text":"BanIslam","indices":[93,... NaN 0 False NaN 9.220484e+17 ... NaN 4 False {"created_at":"Fri Oct 20 09:33:45 +0000 2017"... <a href="http://twitter.com/download/iphone" r... RT @Private34349909: @AmyMek @Dab7One 1/ @real... NaN False {"id":8.009259051368776e+17,"id_str":"80092590... NaN
1 ObjectId(59dbede4e6e465a2d67a106c) NaN NaN Sun Oct 22 10:13:40 +0000 2017 {"hashtags":[{"text":"muslim","indices":[16,23... NaN 0 False NaN 9.220433e+17 ... NaN 10 False {"created_at":"Sat Oct 21 18:37:07 +0000 2017"... <a href="http://twitter.com/download/android" ... RT @ensine: All #muslim rulers were savages bc... NaN False {"id":375089876,"id_str":"375089876","name":"द... NaN
2 ObjectId(59dbf266e6e465a2d67a4810) NaN NaN Sun Oct 22 07:50:47 +0000 2017 {"hashtags":[{"text":"Raqqa","indices":[37,43]... NaN 0 False NaN 9.220073e+17 ... NaN 172 False {"created_at":"Fri Oct 20 23:15:45 +0000 2017"... <a href="http://twitter.com/download/android" ... RT @SLandinSoCal: Liberated Women of #Raqqa‼️R... NaN False {"id":7.092975641898926e+17,"id_str":"70929756... NaN
3 ObjectId(59dbf266e6e465a2d67a4828) NaN NaN Sun Oct 22 10:38:23 +0000 2017 {"hashtags":[],"symbols":[],"user_mentions":[{... NaN 0 False NaN 9.220495e+17 ... NaN 1 False NaN <a href="http://twitter.com" rel="nofollow">Tw... @Stormtroepen @cdavandaag @sybrandbuma @gertja... NaN True {"id":1340408646,"id_str":"1340408646","name":... NaN
4 ObjectId(59e8e630e6e465a2d6238c9c) NaN NaN Sun Oct 22 08:59:53 +0000 2017 {"hashtags":[{"text":"RT","indices":[51,54]},{... NaN 0 False NaN 9.220247e+17 ... NaN 25 False {"created_at":"Sun Oct 22 01:40:55 +0000 2017"... <a href="http://twitter.com/download/android" ... RT @PoliticalIslam: Sharia at odds with Articl... NaN False {"id":150595824,"id_str":"150595824","name":"🇮... ["DE"]

5 rows × 33 columns

In [3]:
df.in_reply_to_screen_name.notna().sum()
Out[3]:
12162

2.3 Tidy Data

In [4]:
# Inspect a raw 'user' entry (a JSON string). Label 0 appears twice because pd.concat kept each file's own index.
df.user[0]
Out[4]:
0    {"id":8.009259051368776e+17,"id_str":"80092590...
0    {"id":8.009259051368776e+17,"id_str":"80092590...
Name: user, dtype: object
In [5]:
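# .copy() gives origin its own data, so adding columns later does not raise SettingWithCopyWarning
origin = df[['user', 'in_reply_to_screen_name', 'retweet_count', 'retweeted', 'created_at', 'text']].copy()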
origin.head()
Out[5]:
user in_reply_to_screen_name retweet_count retweeted created_at text
0 {"id":8.009259051368776e+17,"id_str":"80092590... NaN 4 False Sun Oct 22 10:34:06 +0000 2017 RT @Private34349909: @AmyMek @Dab7One 1/ @real...
1 {"id":375089876,"id_str":"375089876","name":"द... NaN 10 False Sun Oct 22 10:13:40 +0000 2017 RT @ensine: All #muslim rulers were savages bc...
2 {"id":7.092975641898926e+17,"id_str":"70929756... NaN 172 False Sun Oct 22 07:50:47 +0000 2017 RT @SLandinSoCal: Liberated Women of #Raqqa‼️R...
3 {"id":1340408646,"id_str":"1340408646","name":... Stormtroepen 1 False Sun Oct 22 10:38:23 +0000 2017 @Stormtroepen @cdavandaag @sybrandbuma @gertja...
4 {"id":150595824,"id_str":"150595824","name":"🇮... NaN 25 False Sun Oct 22 08:59:53 +0000 2017 RT @PoliticalIslam: Sharia at odds with Articl...
In [6]:
# want to get username, user_id_str, followers_count, verified out of 'user'
user_df = origin.user.apply(json.loads).apply(pd.Series)
user_df.head()
Out[6]:
id id_str name screen_name location description url entities protected followers_count ... profile_text_color profile_use_background_image has_extended_profile default_profile default_profile_image following follow_request_sent notifications translator_type withheld_in_countries
0 8.009259e+17 800925905136877569 Cl_USARocks 🇺🇸 usarocks_c United States Grateful 4 Trump fam! Support Trump 100%! Than... https://t.co/LiQOFDfZgS {'url': {'urls': [{'url': 'https://t.co/LiQOFD... False 10789 ... 333333 True False True False False False False none NaN
1 3.750899e+08 375089876 दिनेश कुमार सिंह vanrash02chahat बिलासपुर छत्तीसगढ़ None {'description': {'urls': []}} False 292 ... 333333 True False False False False False False none NaN
2 7.092976e+17 709297564189892608 CanadianDeplorable gtstuart1 British Columbia, Canada Little Wonder, You little Wonder You None {'description': {'urls': []}} False 154 ... 333333 True False True False False False False none NaN
3 1.340409e+09 1340408646 Hannesz Hannesz1956 Nederland Hannesz. Oud-journo. Het Nieuws van 1 Kant. An... https://t.co/MkLnSMT8PF {'url': {'urls': [{'url': 'https://t.co/MkLnSM... False 7010 ... 5E412F True False False False False False False none NaN
4 1.505958e+08 150595824 🇮🇱ReasonPrevail🇺🇸🇫🇷 ReasonPrevail US of A Exposing MEDIA bias & PaLIEstinian propaganda.... https://t.co/LlpRv4hPMT {'url': {'urls': [{'url': 'https://t.co/LlpRv4... False 359 ... 333333 True False False False False False False none NaN

5 rows × 43 columns
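
Expanding every JSON string with `.apply(pd.Series)` builds one Series per row, which can be slow on a dataset of this size. A possible alternative, sketched here on made-up data, is to parse the strings once and flatten them with `pd.json_normalize`. The `raw_users` series and its values are assumptions standing in for `origin.user`; only the four field names come from the notebook.

```python
import json
import pandas as pd

# Toy stand-in for the notebook's `origin.user` column (JSON strings); values are invented.
raw_users = pd.Series([
    '{"id_str": "1", "screen_name": "user_a", "followers_count": 10, "verified": false}',
    '{"id_str": "2", "screen_name": "user_b", "followers_count": 2500, "verified": true}',
])

wanted = ['id_str', 'screen_name', 'followers_count', 'verified']

# Parse each JSON string once, then flatten the list of dicts in a single call.
parsed = raw_users.map(json.loads).tolist()
user_small = pd.json_normalize(parsed)[wanted]

# Reuse the source index so the result lines up row-for-row with the original column.
user_small.index = raw_users.index
print(user_small)
```

Selecting only the wanted fields up front also avoids carrying the other profile columns through the rest of the analysis.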

In [7]:
origin['id_str'] = user_df.id_str.copy()
origin['username'] = user_df.screen_name.copy()
origin['followers_count'] = user_df.followers_count.copy()
origin['verified'] = user_df.verified.copy()
origin = origin[['username', 'id_str', 'followers_count', 'verified', 'in_reply_to_screen_name', 'retweet_count', 'retweeted', 'text', 'created_at']]
origin.head()
Out[7]:
username id_str followers_count verified in_reply_to_screen_name retweet_count retweeted text created_at
0 usarocks_c 800925905136877569 10789 False NaN 4 False RT @Private34349909: @AmyMek @Dab7One 1/ @real... Sun Oct 22 10:34:06 +0000 2017
1 vanrash02chahat 375089876 292 False NaN 10 False RT @ensine: All #muslim rulers were savages bc... Sun Oct 22 10:13:40 +0000 2017
2 gtstuart1 709297564189892608 154 False NaN 172 False RT @SLandinSoCal: Liberated Women of #Raqqa‼️R... Sun Oct 22 07:50:47 +0000 2017
3 Hannesz1956 1340408646 7010 False Stormtroepen 1 False @Stormtroepen @cdavandaag @sybrandbuma @gertja... Sun Oct 22 10:38:23 +0000 2017
4 ReasonPrevail 150595824 359 False NaN 25 False RT @PoliticalIslam: Sharia at odds with Articl... Sun Oct 22 08:59:53 +0000 2017
In [8]:
# The retweeted column is always false, so recalculate it
origin.retweeted = origin.text.str.startswith('RT')
print(f'Number of tweets that are RTs: {origin.retweeted.sum()}')

# created_at column is not of type datetime
origin.created_at = pd.to_datetime(origin.created_at)

origin[origin.retweeted].head()
Number of tweets that are RTs: 95890
Out[8]:
username id_str followers_count verified in_reply_to_screen_name retweet_count retweeted text created_at
0 usarocks_c 800925905136877569 10789 False NaN 4 True RT @Private34349909: @AmyMek @Dab7One 1/ @real... 2017-10-22 10:34:06+00:00
1 vanrash02chahat 375089876 292 False NaN 10 True RT @ensine: All #muslim rulers were savages bc... 2017-10-22 10:13:40+00:00
2 gtstuart1 709297564189892608 154 False NaN 172 True RT @SLandinSoCal: Liberated Women of #Raqqa‼️R... 2017-10-22 07:50:47+00:00
4 ReasonPrevail 150595824 359 False NaN 25 True RT @PoliticalIslam: Sharia at odds with Articl... 2017-10-22 08:59:53+00:00
5 NattieNexit 700721268354641920 546 False NaN 2 True RT @Hannesz1956: @Stormtroepen @NattieNexit @d... 2017-10-22 10:51:17+00:00

3. Data Exploration

I now have the dataframe origin, which holds each username along with some potential measures of how popular the user is and how much "penetration potential" their tweets have.

I will now begin exploring the data.

3.1 Plotting

In [9]:
sns.histplot(data=origin, x='followers_count', binrange=(0, 10000))
Out[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x2c467f6c9b0>

3.2 Verified Users

In [10]:
origin[origin.verified].head()
Out[10]:
username id_str followers_count verified in_reply_to_screen_name retweet_count retweeted text created_at
1358 LauraLoomer 537709549 105475 True NaN 28 False They deserve it. Maybe when they all decide to... 2017-11-01 06:43:34+00:00
13733 peterboykin 24493104 57752 True NaN 2167 True RT @OneVoiceUS: RT if you want to put #America... 2018-03-16 03:15:29+00:00
14808 WorldofIsaac 57328888 29603 True NaN 0 False In this guy's profile\n\n"West Michigander, na... 2018-03-30 03:52:18+00:00
14818 peterboykin 24493104 58888 True NaN 93 True RT @dodt2003: Although we’ve been distracted l... 2018-03-30 08:11:54+00:00
15592 JessieJaneDuff 478855762 137642 True NaN 36 True RT @OldManStoneZone: Ok America, this little v... 2018-04-11 14:52:35+00:00

3.3 Retweet Distribution

In [11]:
# I exclude tweets with < 5 retweets since they are a clear majority and distort the scale of the graph
sns.histplot(data=origin, x='retweet_count', binrange=(5, 500))
Out[11]:
<matplotlib.axes._subplots.AxesSubplot at 0x2c467dff0b8>

3.4 Time Distribution of Tweets

In [12]:
origin.groupby([origin.created_at.dt.year, origin.created_at.dt.month])['username'].count().plot(kind='bar', figsize=(20, 5))
Out[12]:
<matplotlib.axes._subplots.AxesSubplot at 0x2c467b27390>

3.5 Initial Conclusions

  • Most of the people tweeting in this dataset have very few followers. The follower distribution peaks near 0 and drops off roughly exponentially as the follower count grows.
  • There are very few verified users (around 28). As expected, these users have far more followers than other users.
  • These verified users are retweeted a lot. The retweet_count distribution also drops off roughly exponentially, and some tweets from verified users have retweet counts well beyond the range shown in the retweet_count histogram.
  • Verified users also retweet others a lot. The retweeted column did not reflect this, so I recalculated it.
  • Most of our tweets come from late 2017 and 2018, with fewer tweets in 2019 and 2020. November 2017 had a particularly high number of tweets, likely in reaction to an attempted bombing.
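
The conclusions above are read off the plots. As a rough sketch of how they could be checked numerically, the snippet below computes the same kinds of summaries on a small invented frame that reuses the column names of origin. The values and the 1,000-follower cutoff are assumptions chosen only for illustration.

```python
import pandas as pd

# Toy frame with the same column names as `origin`; every value is invented.
toy = pd.DataFrame({
    'username': ['a', 'b', 'c', 'd'],
    'followers_count': [12, 150, 90000, 40],
    'verified': [False, False, True, False],
    'retweet_count': [0, 3, 2100, 1],
    'created_at': pd.to_datetime(
        ['2017-11-01', '2017-11-15', '2018-03-16', '2019-06-02'], utc=True),
})

# Share of tweets written by accounts under an (arbitrary) 1,000-follower cutoff.
small_share = (toy.followers_count < 1000).mean()

# Number of distinct verified accounts, rather than the number of verified tweets.
n_verified = toy.loc[toy.verified, 'username'].nunique()

# Median retweet count for tweets by verified vs. unverified authors.
median_rt = toy.groupby('verified')['retweet_count'].median().to_dict()

# Tweets per calendar year.
per_year = toy.created_at.dt.year.value_counts().sort_index().to_dict()

print(small_share, n_verified, median_rt, per_year)
```

Running the same expressions against origin would replace "very few" and "a lot" with concrete numbers.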

4. Word Cloud

Show which words are the most popular to get a general idea of what people are tweeting about.

In [13]:
# Filter out words that start with things in filter_list or any stopwords
filter_list = ['@', '#', 'http', 'rt', '&', 'islam', 'muslim', 'religion', 'don', 'need', 'know']
stops = set(stopwords.words('english'))

def good_wd(wd):
    # Compare in lowercase: the filter prefixes and the NLTK stopwords are all lowercase
    wd = wd.lower()
    return not any(wd.startswith(x) for x in filter_list) and wd not in stops

# Clean up tweets
tweets = df.text.dropna().str.split().apply(lambda lst: ' '.join(filter(good_wd, lst)))

# Generate word cloud
wordcloud = WordCloud().generate(' '.join(tweets))
image = wordcloud.to_image()
#image.save('wordcloud.png')

# Display the rendered PIL image
image

5. Penetration Network

At this point, it seems clear that verified users have much higher Twitter penetration, since they are much more involved in both retweeting and being retweeted. There are not many verified users in this dataset, so I will also look at users with a large number of followers. I will try to visualize how tweets from these users spread and what factors influence this spread.

5.1 Generating Mentions Column

In [14]:
# Get a column that has all the @'d users for that tweet
origin['mentions'] = origin.text.str.split().apply(lambda lst: [(x[1:-1] if x.endswith(':') else x[1:]) for x in filter(lambda x: x.startswith('@'), lst)])
origin.head()
Out[14]:
username id_str followers_count verified in_reply_to_screen_name retweet_count retweeted text created_at mentions
0 usarocks_c 800925905136877569 10789 False NaN 4 True RT @Private34349909: @AmyMek @Dab7One 1/ @real... 2017-10-22 10:34:06+00:00 [Private34349909, AmyMek, Dab7One, realDonaldT...
1 vanrash02chahat 375089876 292 False NaN 10 True RT @ensine: All #muslim rulers were savages bc... 2017-10-22 10:13:40+00:00 [ensine]
2 gtstuart1 709297564189892608 154 False NaN 172 True RT @SLandinSoCal: Liberated Women of #Raqqa‼️R... 2017-10-22 07:50:47+00:00 [SLandinSoCal]
3 Hannesz1956 1340408646 7010 False Stormtroepen 1 False @Stormtroepen @cdavandaag @sybrandbuma @gertja... 2017-10-22 10:38:23+00:00 [Stormtroepen, cdavandaag, sybrandbuma, gertja...
4 ReasonPrevail 150595824 359 False NaN 25 True RT @PoliticalIslam: Sharia at odds with Articl... 2017-10-22 08:59:53+00:00 [PoliticalIslam]
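
The split-based extraction above only strips a trailing ':' from a handle, so tokens such as "@user," or "@user!" keep their punctuation. A hedged alternative is a regular expression that matches only the characters a handle can contain. This is a sketch rather than the method used in this notebook, and the helper name extract_mentions is hypothetical.

```python
import re

# Twitter handles are 1-15 letters, digits, or underscores.
# The lookbehind skips '@' inside words such as e-mail addresses.
MENTION_RE = re.compile(r'(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})')

def extract_mentions(text):
    """Return the handles mentioned in a tweet, without the leading '@'."""
    return MENTION_RE.findall(text)

print(extract_mentions('RT @ensine: All rulers ...'))
print(extract_mentions('thanks @a, @b! mail me at x@y.com'))
```

It could be applied with `origin.text.apply(extract_mentions)`. Note that it would produce slightly different handles from the column above wherever a mention is followed by punctuation.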

5.2 Aggregating Users + Preparing Network

In [15]:
cols = ['username', 'followers_count', 'retweet_count', 'mentions']

# Aggregate users to count their average num of followers, total retweets, and total mentions
graph_df = origin[cols].groupby('username').agg({'followers_count': 'mean', 'retweet_count': 'sum', 'mentions': 'sum'}).reset_index()

# Will use follower count to represent node size; scale them between sizes [1000, 51000] for the network
graph_df['scaled_count'] = 1000 + (graph_df.followers_count - graph_df.followers_count.min()) * 50000 / (graph_df.followers_count.max() - graph_df.followers_count.min())

# Explode df on mentions to add in edges
graph_df = graph_df.explode('mentions')

# Add underscore to usernames and mentions so they aren't treated as ints by the library
graph_df.username = graph_df.username + '_'
graph_df.mentions = graph_df.mentions + '_'

# Calculate hex color codes so that higher retweet count corresponds to darker red node
norm = Normalize(vmin=0, vmax=5000, clip=True)
mapper = plt.cm.ScalarMappable(norm=norm, cmap=plt.cm.Reds)
graph_df['colors'] = graph_df.retweet_count.apply(lambda x: mcolors.to_hex(mapper.to_rgba(x)))

graph_df.head()
Out[15]:
username followers_count retweet_count mentions scaled_count colors
0 0000DD02_ 2784.5 16 traybishop_ 1141.797923 #fff5f0
0 0000DD02_ 2784.5 16 RealDrGina_ 1141.797923 #fff5f0
0 0000DD02_ 2784.5 16 76rooster_ 1141.797923 #fff5f0
1 000_Gopal32_ 2574.5 248 TharkiBaba01_ 1131.103880 #ffede5
1 000_Gopal32_ 2574.5 248 Donotshit_ 1131.103880 #ffede5

5.3 Creating Network

In [16]:
# More red = more retweets, larger = more followers
edges = graph_df[['username', 'mentions']].head(500)
nodes = graph_df.head(500).drop_duplicates('username')

# `x in series` tests the index rather than the values, so compare against a set of usernames
node_names = set(nodes.username)
mentions = sorted({x for x in edges.mentions.dropna() if x not in node_names})
title_col = 'Followers: ' + nodes.followers_count.astype(int).astype(str) + '\nRetweets: ' + nodes.retweet_count.astype(str)

nt = Network(height=800, width='100%', notebook=True, directed=True)
nt.add_nodes(nodes.username.tolist(), title=title_col.tolist(), value=nodes.scaled_count.tolist(), color=nodes.colors.tolist())
nt.add_nodes(mentions, title=mentions, value=[50] * len(mentions), color=['#FFFFFF'] * len(mentions))
nt.add_edges(edges.dropna().to_records(index=False))

nt.show('pen_network.html')
Out[16]:

5.4 Actionable Insights

  • This graph makes it easy to identify the highest "penetration" users, that is, the users whose tweets have the most influence. Twitter could use this data to suspend users who spread a large amount of Islamophobic sentiment and so quickly mitigate the spread.
  • This graph also reveals a hidden structure in our data: the relationships between different users. Activists could focus their efforts on the central nodes, since those users have the highest potential to spread positive messages to their connected group.
  • This representation enables graph algorithms such as PageRank or shortest-path algorithms. These could be used to identify the most influential toxic users or to estimate how long news or Islamophobic content takes to spread through a network.
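
To make the PageRank idea concrete, here is a minimal power-iteration sketch in plain Python. It is not part of the original analysis: the edge list is invented, and the damping factor of 0.85 and the 50 iterations are conventional defaults assumed for illustration. networkx, which is already imported, provides `nx.pagerank` for real use.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new_rank = dict.fromkeys(nodes, base)
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank

# Hypothetical mention edges: (author, mentioned user).
edges = [('a_', 'hub_'), ('b_', 'hub_'), ('c_', 'hub_'), ('hub_', 'a_')]
scores = pagerank(edges)
top_user = max(scores, key=scores.get)
print(top_user, round(scores[top_user], 3))
```

Because edges point from the author to the mentioned user, a high score marks accounts that many others mention or retweet. The notebook's own edges could be passed in as `list(graph_df[['username', 'mentions']].dropna().itertuples(index=False, name=None))`.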

6. Sentiment Analysis

I will be using the VADER sentiment analysis library (the vaderSentiment package) since it works well on emojis and slang. VADER is lexicon- and rule-based, and it is also available through NLTK as nltk.sentiment.vader. It generates a compound score for each tweet from -1 to 1.

  • Score < -0.05: Negative Sentiment
  • Score > 0.05: Positive Sentiment
  • -0.05 <= Score <= 0.05: Neutral Sentiment
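
The thresholds above can be turned into a small labelling helper. This is only a sketch: the function name sentiment_label and the threshold parameter are assumptions, not part of VADER's API.

```python
def sentiment_label(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to 'negative', 'neutral', or 'positive'."""
    if compound > threshold:
        return 'positive'
    if compound < -threshold:
        return 'negative'
    return 'neutral'

for score in (0.4648, -0.5267, 0.0, 0.05):
    print(score, sentiment_label(score))
```

Once a column of compound scores exists, it could be applied with something like `origin.sentiment.apply(sentiment_label)` to count tweets per class.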

6.1 Sentiment Intensity Column

In [17]:
sent = SentimentIntensityAnalyzer()
origin['sentiment'] = origin.text.apply(lambda s: sent.polarity_scores(s)['compound'])
origin.head()
Out[17]:
username id_str followers_count verified in_reply_to_screen_name retweet_count retweeted text created_at mentions sentiment
0 usarocks_c 800925905136877569 10789 False NaN 4 True RT @Private34349909: @AmyMek @Dab7One 1/ @real... 2017-10-22 10:34:06+00:00 [Private34349909, AmyMek, Dab7One, realDonaldT... 0.4648
1 vanrash02chahat 375089876 292 False NaN 10 True RT @ensine: All #muslim rulers were savages bc... 2017-10-22 10:13:40+00:00 [ensine] -0.5267
2 gtstuart1 709297564189892608 154 False NaN 172 True RT @SLandinSoCal: Liberated Women of #Raqqa‼️R... 2017-10-22 07:50:47+00:00 [SLandinSoCal] 0.0000
3 Hannesz1956 1340408646 7010 False Stormtroepen 1 False @Stormtroepen @cdavandaag @sybrandbuma @gertja... 2017-10-22 10:38:23+00:00 [Stormtroepen, cdavandaag, sybrandbuma, gertja... 0.2960
4 ReasonPrevail 150595824 359 False NaN 25 True RT @PoliticalIslam: Sharia at odds with Articl... 2017-10-22 08:59:53+00:00 [PoliticalIslam] 0.0000

6.2 Showcase Best/Worst Tweets

In [18]:
# Five worst sentiment tweets
list(origin.sort_values('sentiment').text[:5])
Out[18]:
['@potus WE HAVE A #terrorism problem \nAmerica will NEVER submit to 🚫#Islam \n😠😠😠😠😠😠😠😠😠😠😠 @ICEgov @SecretService @FBI… https://t.co/LTzitgKuJk',
 '@lsarsour Explain why Islam also promotes animal abuse?!? #BanSharia #BanIslam #BanLindaSarsour \n😠😠😠😠😠😠😠😠😠😠😠😠\nhttps://t.co/L7yLwvQopf',
 '#Sweden :  #Girl #gang #raped , #VAGINA SET ON #FIRE 😠😠😠😠😠😠😠😠\n#EnoughIsEnough \nSTOP #rapeculture\nSTOP #CARNAGE… https://t.co/Lax7gOkkze',
 'RT @Chris_A10_USA: #Ireland 🇮🇪: #Muslim #migrant charged in random #stabbing #murder, Gardai still say no link to #terrorism 😠😠😠😠😠\n#BanShar…',
 'RT @Chris_A10_USA: #Ireland 🇮🇪: #Muslim #migrant charged in random #stabbing #murder, Gardai still say no link to #terrorism 😠😠😠😠😠\n#BanShar…']
In [19]:
# Five best sentiment tweets
list(origin.sort_values('sentiment', ascending=False).text[:5])
Out[19]:
['😂 😂😂 😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂\n#Banislam #MAGA\nhttps://t.co/Nyo1aLf9Ka',
 'Goodnight friends \n🇳🇱🇳🇱🇳🇱🇳🇱🇳🇱🇳🇱🇳🇱\n♥️♥️♥️♥️♥️♥️♥️\n🇺🇸🇺🇸🇺🇸🇺🇸🇺🇸🇺🇸🇺🇸\n♥️♥️♥️♥️♥️♥️♥️\n#Trump \n#TrumpLandslide2020… https://t.co/w7k752m9Dj',
 '@WindowFixed @Ilhan 🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣\nWhere are all the moderates\nwhen you need them!\n\nMuslim women, children, &amp; gays\nbein… https://t.co/V8U2TRBfSp',
 'RT @BoondockCat: @WindowFixed @Ilhan 🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣\nWhere are all the moderates\nwhen you need them!\n\nMuslim women, children, &amp; gays\nbeing raped…',
 'RT @n_gaged: @FearThouNot @lbc360 but, but they keep telling us that Islam is the religion(sic) of peace.🤫😂😂🤣🤣😂😂😡 #BanIslamQuranCitizenship…']
In [20]:
# Five tweets with most neutral sentiment
list(origin.sort_values('sentiment', key=abs).text[:5])
Out[20]:
['@kelly_syra @RealMAGASteve @RosieAndujar @sxdoc @starcrosswolf @Dr_Kaco @Stump_for_Trump @TrumpTrainMRA4… https://t.co/fnFezUy9ld',
 'RT @RevolutieNL: Kerken in Urk op slot vanwege oproepen tot aanslagen vanuit de moslimgemeenschap.\n\nDaarom #BanIslam\nhttps://t.co/mOAjapv07S',
 'RT @RevolutieNL: Kerken in Urk op slot vanwege oproepen tot aanslagen vanuit de moslimgemeenschap.\n\nDaarom #BanIslam\nhttps://t.co/mOAjapv07S',
 'RT @RevolutieNL: Kerken in Urk op slot vanwege oproepen tot aanslagen vanuit de moslimgemeenschap.\n\nDaarom #BanIslam\nhttps://t.co/mOAjapv07S',
 'RT @jodaka97: WAKE UP AMERICA! Mu$1ims cannot be allowed to take office!  There are over 90 of them running for office across the US.  They…']

6.3 Plotting Sentiment Distribution

In [21]:
origin.sentiment.plot.kde()
Out[21]:
<matplotlib.axes._subplots.AxesSubplot at 0x2c4679e5da0>

This sentiment analysis is not perfect. It does a better job of detecting hostile tweets, but it seems to rely heavily on the emojis used, since the five most negative tweets all contain angry emojis. The highest-sentiment tweets are still quite Islamophobic, but this is to be expected since this dataset primarily includes Islamophobic tweets. In the KDE plot of the sentiment distribution, most tweets are tagged with neutral sentiment. The peak for negative tweets is higher than the peak for positive tweets, which is a good sanity check.

7. Hashtag Bar Chart Race

This type of visualization helps reveal trends over time. We can see how the hashtags change in popularity, how they are distributed over time, and how their cumulative counts grow.

In [22]:
df = df.drop_duplicates()
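# Keep only columns that are at least 75% non-null (thresh expects a count of values, not a fraction)
df_na = df.dropna(axis=1, thresh=int(.75 * len(df)))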
In [23]:
table = defaultdict(int)

for entity in df['entities']:
    test = json.loads(entity)
    for x in test['hashtags']:
        table[x['text'].lower()] += 1

The bar chart race library needs a dataframe indexed purely by date, so we first extract the date of each tweet.

In [24]:
df_datetime = pd.to_datetime(df['created_at']).dt.date
df_datetime.head()
Out[24]:
0    2017-10-22
1    2017-10-22
2    2017-10-22
3    2017-10-22
4    2017-10-22
Name: created_at, dtype: object

7.1 Distribution of Tweets

This distribution shows an overall downward trend in Islamophobic tweets, with spikes that appear to coincide with terrorist attacks attributed to Islamist groups (such as ISIS). For example, the largest influx of Islamophobic tweets occurred in November 2017, after the terrorist car attack in New York City.
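
Spikes like these can also be located programmatically instead of by eye. The sketch below flags months whose tweet count lies more than z standard deviations above the mean monthly count. The helper name spike_months, the default z=2.0, and the sample dates are all assumptions for illustration. Matching a flagged month to a real-world event still has to be done by hand.

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev

def spike_months(dates, z=2.0):
    """Return 'YYYY-MM' months whose tweet count exceeds mean + z * std of the monthly counts."""
    counts = Counter(d.strftime('%Y-%m') for d in dates)
    values = list(counts.values())
    cutoff = mean(values) + z * pstdev(values)
    return sorted(month for month, n in counts.items() if n > cutoff)

# Made-up dates: one tweet per month in Jan-Oct 2018, plus a burst of 20 in Nov 2017.
dates = [date(2018, m, 1) for m in range(1, 11)] + [date(2017, 11, 5)] * 20
print(spike_months(dates))
```

Months with no tweets at all are absent from the counts, so the baseline is computed only over months that appear in the data.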

In [25]:
df_datetime.hist(figsize=(10,10), bins=100)
plt.show()
In [26]:
table = dict(sorted(table.items(), key=lambda item: item[1], reverse=True))
Counter(table).most_common(10)
Out[26]:
[('banislam', 28780),
 ('bansharia', 13500),
 ('islamistheproblem', 8704),
 ('islam', 6305),
 ('islamexposed', 6135),
 ('muslim', 5480),
 ('religionofpeace', 4469),
 ('uk', 4398),
 ('maga', 4156),
 ('bansharialaw', 4006)]

Next, create a new dataframe for the bar chart race. Its index is the tweet date and its only column is entities, which holds the hashtags.

In [27]:
new_df = df.copy()
new_df['date'] = pd.to_datetime(new_df['created_at']).dt.date
new_df.index = new_df['date']
new_df = new_df[['entities']]
new_df.head()
Out[27]:
entities
date
2017-10-22 {"hashtags":[{"text":"BanIslam","indices":[93,...
2017-10-22 {"hashtags":[{"text":"muslim","indices":[16,23...
2017-10-22 {"hashtags":[{"text":"Raqqa","indices":[37,43]...
2017-10-22 {"hashtags":[],"symbols":[],"user_mentions":[{...
2017-10-22 {"hashtags":[{"text":"RT","indices":[51,54]},{...
In [28]:
# Map each date to a Counter of (lower-cased) hashtag occurrences on that date
datetime_dict = defaultdict(Counter)

def get_counts(entity):
    """Count the hashtags in one tweet's JSON-encoded entities field."""
    hashtags = json.loads(entity)['hashtags']
    return Counter(tag['text'].lower() for tag in hashtags)

for idx, row in tqdm(new_df.iterrows()):
    datetime_dict[idx].update(get_counts(row['entities']))
119019it [00:39, 2996.13it/s]
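
Looping over rows with iterrows took about 40 seconds here. A vectorized alternative, sketched on invented data, is to explode the hashtags into one row per (date, hashtag) pair and cross-tabulate. The toy frame and the tag names are assumptions. Only the entities layout ("hashtags" holding a list of objects with a "text" field) follows the dataset.

```python
import json
import pandas as pd

# Toy frame mimicking the date and JSON-encoded entities of each tweet; tag names are invented.
toy = pd.DataFrame({
    'date': ['2017-10-22', '2017-10-22', '2017-10-23'],
    'entities': [
        '{"hashtags": [{"text": "TagA"}, {"text": "tagb"}]}',
        '{"hashtags": [{"text": "taga"}]}',
        '{"hashtags": []}',
    ],
})

# One list of lower-cased hashtags per tweet, then one row per (date, hashtag) pair.
tags = toy.entities.map(lambda s: [h['text'].lower() for h in json.loads(s)['hashtags']])
long_df = toy.assign(hashtag=tags).explode('hashtag').dropna(subset=['hashtag'])

# Dates as rows, hashtags as columns, counts as values; cumsum gives running totals.
daily = pd.crosstab(long_df['date'], long_df['hashtag'])
cumulative = daily.sort_index().cumsum()
print(cumulative)
```

Dates on which no tweet carries a hashtag drop out of the table, so they would need to be reindexed back in if the race should show every day.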

7.2 Video

The bar chart race video is quite revealing. Most of the Islamophobic tweets carry the hashtag #banIslam, while the remaining hashtags flow in gradually as time goes on. Another pattern is that the frequency of tweets slows down over time, which may suggest that Twitter is clamping down on racially charged tweets.

In [29]:
# Renders a 5 second video (disabled here; uncomment to render)
# sort_index keeps the cumulative sum in chronological order
# df_bar_chart_race = pd.DataFrame.from_dict(datetime_dict, orient='index').fillna(value=0).sort_index().cumsum()
# bcr.bar_chart_race(df=df_bar_chart_race.head(),
#                    n_bars=5,
#                    title="Popular Hashtags (2017-2021)",
#                    period_length=250,
#                    bar_kwargs={'alpha': .7},
#                    bar_label_size=7)

8. Live Tweet Flagging

This classification model is in its early phases. Simply by searching for the most popular hashtags found in the two datasets given to us, I am able to flag tweets in real time (or back to a given date), and it is evident that racially charged tweets are still very prevalent in our society. With more NLP in the classification step, I hope to flag tweets more accurately and efficiently than I do now. This live flagging is simply a tool for monitoring the problems identified in the analysis above.
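
The flagging rule described here amounts to checking a tweet's hashtags against a tracked set. A minimal sketch of that check follows. The helper name flag_tweet and the two placeholder tags are hypothetical, and the notebook derives its real list from the dataset.

```python
import re

HASHTAG_RE = re.compile(r'#(\w+)')

def flag_tweet(text, tracked):
    """Return the tracked hashtags found in `text`; an empty set means the tweet is not flagged."""
    found = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    return found & tracked

# Placeholder tags standing in for the most common hashtags in the dataset.
tracked = {'exampletag', 'othertag'}
hits = flag_tweet('Some text #ExampleTag #unrelated', tracked)
print(hits)
```

Matching on hashtags alone cannot tell a tweet that promotes a hashtag from one that criticizes it, which is one reason further NLP would be needed.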

8.1 Classification

In [30]:
# grab the hashtags we want to track
table = defaultdict(int)

for entity in df['entities']:
    test = json.loads(entity)
    for x in test['hashtags']:
        table[x['text'].lower()] += 1

table = Counter(table)
table = dict(table.most_common(100))

# Remove hashtags that won't hold relevance in flagging
remove  = ['islam', 'muslim', 'religionofpeace', 'uk', 'maga', 'britain', 'muslims', 'buildthewall', 'jihad', 'americafirst', 'religionbeliefs', 'patriotic', 'america', 'wakeupamerica', 'islamic', 'rt', 'stopcarnage', 'pvv', 'kag', 'breaking', 'makedclisten', 'christian',
'educateyourselfonislam', 'tcot', 'terror','freetommy', 'allah', 'migrant', 'usa', 'trump2020', 'travelban', 'us', 'freetommyrobinson', 'rape', 'immigrationreform', 'bannogozones', 'france', 'lilbulli', 'germany','draintheswamp', 'canada', 'europe','cspi','pakistan','trump','veterans', 'trumptrain', 'iran', 'bancair','ramadan', 'closernation','walkaway', 'tocatchathief', 'minnesota', 'wwg1wga', 'potus', 'hamas', 'quran', 'trudeaumustgo', 'murder','ovc16', 'sweden', 'christians',
'police', 'israel', 'isis']
for key in remove:
    table.pop(key, None)  # skip any hashtag that is not among the top 100
Counter(table).most_common(15)
Out[30]:
[('banislam', 28780),
 ('bansharia', 13500),
 ('islamistheproblem', 8704),
 ('islamexposed', 6135),
 ('bansharialaw', 4006),
 ('stopislam', 2660),
 ('rapejihad', 1096),
 ('nosharia', 1051),
 ('sharialaw', 908),
 ('billwarnerphd', 838),
 ('sharia', 827),
 ('cair', 814),
 ('islamicstate', 801),
 ('endislam', 716),
 ('qanon', 651)]
In [31]:
string = " OR ".join(list(table.keys())[:10])
print(string)
banislam OR bansharia OR islamistheproblem OR islamexposed OR bansharialaw OR stopislam OR rapejihad OR nosharia OR sharialaw OR billwarnerphd

8.2 Live Tweets

Notice how recent these tweets are. Through some further digging, I was able to get the location of some of the users. Given how little data is associated with some of the usernames, I hypothesize that some of these users may be bots.

In [33]:
# twint drives its own asyncio event loop, which clashes with the loop Jupyter is
# already running ("RuntimeError: This event loop is already running").
# Patching the loop with nest_asyncio (assumed to be installed) is a common workaround.
import nest_asyncio
nest_asyncio.apply()

# Configure
c = twint.Config()
c.Search = string
c.Since = '2021-02-13'
c.Limit = 100
c.Store_csv = True
c.Output = 'twitter2.csv'

# Run
twint.run.Search(c)
1363018547807588354 2021-02-20 01:52:00 -0500 <FrontSocial> @Swen_2017 @Nigel_Farage @AlohaHa59067534 C'est ça, vive le mondialisme islamo-collabo !💩  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC  https://t.co/YxcDHF5bkc
1363005304422694914 2021-02-20 00:59:22 -0500 <AllGoneTomorrow> @AriBerman @SusanSarandon THIS IS WHAT I'M ON ABOUT ⚠  #NoDominionism❗ #NoSharia❗ #SeparateChurchAndState❗ #FreedomFromReligion❕❗❕⚠‼
1362989659480023040 2021-02-19 23:57:12 -0500 <FrontSocial> @suivezlecoq C'est à cause d'eux et de gens comme vous que nous ne pouvons plus tuer le Cochon !🇫🇷😠  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/cPt6WMz2iR
1362984633546256386 2021-02-19 23:37:14 -0500 <FrontSocial> @PoliceNat44 Vous faites de la peine... #ProtectionAnimale 😏  Par contre pour les Français, eux ils peuvent se faire remplacer et massacrer au couteau , c'est la mode hein ?🤔  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/cPt6WMz2iR
1362977999822467076 2021-02-19 23:10:52 -0500 <FrontSocial> @BultotPatrice Ah bon, alors là, c'est la meilleur, la France n'est pas une poubelle depuis plus de 40 ans ?!🚮🤔🤭😂  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/pvPGx3ZpSf
1362976633158852610 2021-02-19 23:05:26 -0500 <FrontSocial> @AnneFarmer65 La faute au réchauffement climatique selon #Macron !🤭😂  #GrandRemplacement #DefendEurope #Remigration #StopIslam #FrontSocial #RIC #migrants  https://t.co/XjCpZMgXFr
1362958470325149703 2021-02-19 21:53:16 -0500 <FrontSocial> @KAYDM49 @BasedSpain1 @APES_Cat @thedukeoriginal @UnionJackGuy @DutchDL 84 billion: budgetary cost of immigration in France and and the new #migrants of #Macron are not in the lot !⤵️  Not to mention the cost of violence, crime, theft, rape, scams, trafficking, etc...  #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/z0LrDWN4HD
1362952881725399040 2021-02-19 21:31:04 -0500 <ChrisIsHere9> @Zeitgeschehen_ @dima973 🤢🤢 Abschieben diesen Islamisten. #StopIslam
1362951659928899585 2021-02-19 21:26:12 -0500 <FrontSocial> @Nigel_Farage @AlohaHa59067534 Merci !🇬🇧👍 #Brexit   J'adore...🇫🇷😂 #Frexit   #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/XOdpgly8OF
1362950060657508352 2021-02-19 21:19:51 -0500 <heidiEC5> #islamistheproblem
1362934947808481281 2021-02-19 20:19:48 -0500 <ChrisIsHere9> @tagesthemen @HadijaHaruna Es gibt noch viel zu tun, damit #Breitscheidplatz sich nicht wiederholt. #StopIslam
1362924754739597316 2021-02-19 19:39:18 -0500 <FrontSocial> @Suliv16 @F_Desouche C'est ça, prenez nous pour des cons !😏  #Erdogan : "Les mosquées sont nos casernes, les coupoles nos casques, les minarets nos baïonnettes et les croyants nos soldats"  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/l6VV2qJ4QD  https://t.co/HY2P50T7Pf
1362895330673573888 2021-02-19 17:42:22 -0500 <nelsonlahaya> @umarebru @NUnl @NUnl geen enkele normaal denkende Nederlandse vrouw laat zich onderdrukken door zo'n doek om haar hoofd te knopen. Laat staan dat ze ermee van de zon gaan genieten.  Is  https://t.co/v49NUvjc3m geworden tot islam slaaf? Inderdaad om van te kotsen. #stopislam
1362892766045483013 2021-02-19 17:32:11 -0500 <Steam7R> #stopislam #STOPimmigration
1362892150279659523 2021-02-19 17:29:44 -0500 <Steam7R> #TurkeyIsATerrorState #TurkenTerreur #stopislam
1362889187796676610 2021-02-19 17:17:58 -0500 <FrontSocial> @TF1LeJT Traduction: Français allez à la recherche des premières jonquilles en fleurs et pendant ce temps...😁  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/0TzSvWXTSy  https://t.co/ycciA4IW6e
1362887604887293953 2021-02-19 17:11:40 -0500 <FrontSocial> @Actu17 Seulement 7 ans et bientôt libre alors qu'il est clandestin !🤪  Pays de merde !🚮  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/izwRB5Qtbb
1362882532233125889 2021-02-19 16:51:31 -0500 <LeilaMansouri11> The #Iranian women's alpine ski team flew on Wednesday to Italy for the world championships without their coach, whose husband has barred her from leaving the country, Iranian media reports. Millions suffer under #ShariaLaw in 2021.   https://t.co/gQWN0lFrhh
1362881373715988480 2021-02-19 16:46:55 -0500 <espabilinator> @IreneMontero Ocurre a las mujeres en Europa y tiene una causa confirmada, pero os negais a admitirlo. El efecto llamada a la inmigración, los papeles para todos por un nicho de votos y la aceptación de sus costumbres bárbaras os convierten en hipócritas demagogos. #stopinvasion #stopislam
1362879356947857415 2021-02-19 16:38:54 -0500 <MortgagedHeart> #islamogauchisme #FoutagedeGueule #StopIslam #PauvreFrance
1362879226534322176 2021-02-19 16:38:23 -0500 <MortgagedHeart> #Remigration #StopImmigration #IslamHorsdEurope #StopIslam #PauvreFrance
1362874085760499715 2021-02-19 16:17:57 -0500 <nelsonlahaya> @TheEdgyVeggie1 Komt omdat mohamed een gewelddadige idioot was. En wie vernoemt zijn kind nou naar een pedofiel? #stopislam
1362871210237972480 2021-02-19 16:06:32 -0500 <FrontSocial> @F_Desouche Traduction: l’Assurance maladie s’oppose à #Macron qui avait choisi #Microsoft sans appel d'offre, sur la gestion des données de santé des Français ce qui est contraire à la loi.😏 #Grandremplacement #DefendEurope #Remigration #StopIslam #FrontSocial #RIC  https://t.co/vGVez1nqAf
1362868739759370246 2021-02-19 15:56:43 -0500 <FrontSocial> @CNEWS Darmanin vous êtes un #Rigolo !🤡  Le #migrant est resté en France et a décapité une personne !!!  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/GK9zajTf9T
1362867637437530115 2021-02-19 15:52:20 -0500 <FrontSocial> @Valeurs 20 ans pour démanteler un vaste réseau de mariages blancs  ?🤔  Bande de bras cassés et encore c'est gentil !🤥  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/qEoD7wGcST
1362866604028153856 2021-02-19 15:48:13 -0500 <FrontSocial> @F_Desouche Au moins c'est clair, le vrai problème c'est bien l'islam !  Contrairement à ce que nous disent les médias et les responsables politiques Français !  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/67FxQLOFyb
1362863183992619016 2021-02-19 15:34:38 -0500 <Jonnybravo662> @PrinsesChrissie @DevliegerErik Sigrid? Deze Sigrid? ⬇️  .@SigridKaag #D66 is 't vrouwelijk equivalent van #YasserArafat #SigridKaag is 'n achterbakse manipulerende machtswellusting die maar op 1 plek thuis hoort #sixfeetunder   https://t.co/G4fkiulpjy  #stopislamiseringvanNL #banislam #grenzendicht  #Nexit #PVV
1362862992157777921 2021-02-19 15:33:52 -0500 <TinyEssex> @JakeHepple1 #BanHalal #BanIslam
1362860068077785088 2021-02-19 15:22:15 -0500 <Jonnybravo662> @are_clouds @KingTweede @Zoeker21 @SigridKaag @D66 .@SigridKaag #D66 is het vrouwelijk equivalent van #YasserArafat #SigridKaag is een achterbakse manipulerende machtswellusting die maar op één plek thuis hoort #sixfeetunder    https://t.co/G4fkiulpjy  #stopislamiseringvanNL #banislam #grenzendicht #Nexit #PVV #StemNederlandTerug
1362859850347249664 2021-02-19 15:21:23 -0500 <Jonnybravo662> @KingTweede @SigridKaag @D66 .@SigridKaag #D66 is het vrouwelijk equivalent van #YasserArafat #SigridKaag is een achterbakse manipulerende machtswellusting die maar op één plek thuis hoort #sixfeetunder    https://t.co/G4fkiulpjy  #stopislamiseringvanNL #banislam #grenzendicht #Nexit #PVV #StemNederlandTerug
1362858532949938177 2021-02-19 15:16:09 -0500 <FrontSocial> @F_Desouche Loutfi ?! bug😂  Encore une chance pour la France, le vivre-ensemble et l'enrichissement culturel selon ceux qui nous gouvernent de gauche ou de droite...🤭🤥  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/wwvtUklqnX
1362857829066018820 2021-02-19 15:13:21 -0500 <FrontSocial> @F_Desouche Traduction: #Présidentielle2022 : #Hidalgo a «très peur» pour sa peau si les Français arrivaient au pouvoir !😁  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/FIQNwNDWwb
1362857169994063874 2021-02-19 15:10:44 -0500 <FrontSocial> @F_Desouche #Remigration immédiate des #migrants et il faut détruire ces navires de traites humaines pour le #GrandRemplacement !😠  #SoutienGenerationIdentitaire #ManifGenerationID  #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/hBNyNejUbu
1362855758182952962 2021-02-19 15:05:08 -0500 <FrontSocial> @F_Desouche Tiens les Marocains aussi ne veulent pas êtres envahis, mais là ce n'est pas raciste ?🤔😁  Qu'ils retournent chez eux et les vaches seront bien gardées !  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/zCJpIFJ8f0
1362852996367994881 2021-02-19 14:54:09 -0500 <NipperPB> This explains in detail the mistakes that Western governments are making in not limiting the spread of Islam. #BanSharia #Secularism #FreeSpeech #ReformIslam
1362851941018185729 2021-02-19 14:49:57 -0500 <FrontSocial> @Valeurs La barbarie et l'injustice contre les Français, voilà ce qu'est devenue la France !  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/amuOBR0I4y
1362851207455342596 2021-02-19 14:47:03 -0500 <FrontSocial> @F_Desouche Exactement le même laxisme qu'en France...  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/e5eE3tV3Ym
1362850838260154369 2021-02-19 14:45:35 -0500 <FrontSocial> @F_Desouche Traduction: La Commission européenne lance une procédure d'infraction contre plusieurs pays, dont la Belgique ou la Suède, accusés de ne pas en faire assez pour le #GrandRemplacement !🇪🇺  #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/z2ZttoYVOH
1362850238298521600 2021-02-19 14:43:12 -0500 <saraosalvatore> @liliaragnar Più corretto “ la presidente “
1362850010992369665 2021-02-19 14:42:17 -0500 <FrontSocial> @F_Desouche #Macron vient de perdre un fournisseur...😁  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/MSKtCP6VW5
1362849377031700480 2021-02-19 14:39:46 -0500 <saraosalvatore>  https://t.co/BgLtYFtTLp
1362849173402492931 2021-02-19 14:38:58 -0500 <FrontSocial> @Valeurs L'immigration, les #migrants sont des chances pour la France, le vivre ensemble et l'enrichissement culturel selon ceux qui nous gouvernent de gauche ou de droite...  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/HLhH9xznLp
1362847050140962816 2021-02-19 14:30:31 -0500 <FrontSocial> @Coy5774 @Girolata20 @akoualong Le #RN est devenu un parti mondialiste comme les autres...  - des cadres gays LGBT  - le #GrandRemplacement n'existe pas. - l'islam n'est pas un problème. ...  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/5IDwXm0FIq
1362846765024759810 2021-02-19 14:29:23 -0500 <BokBarbar> @socrate1231 Minalble #Darmanin ! Leur idéologie mortifère est hallucinante !! #Macronie #migrants #stopislam
1362843096715780101 2021-02-19 14:14:49 -0500 <FrontSocial> @Valeurs Qui sont les fournisseurs de drogues pour #LREM ?🤔  Les fameux jeunes qu'il ne faut plus contrôler ?😁  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC #BalanceTonPost   https://t.co/lzz2sEAqUK  https://t.co/NnL0VsgdTO
1362839939260551170 2021-02-19 14:02:16 -0500 <heidiEC5> #islamistheproblem
1362833580041379851 2021-02-19 13:37:00 -0500 <FrontSocial> @joka06482774 @Valeurs Exact, 84 milliards : coût budgétaire de l’immigration en France !  #SoutienGenerationIdentitaire !🇫🇷  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/z0LrDWN4HD
1362832794280480772 2021-02-19 13:33:53 -0500 <FrontSocial> @Valeurs Courage, fuyons, ça promet ! #RN    Pauvre France !🇫🇷 #SoutienGénérationIdentitaire   #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/5IDwXm0FIq
1362831376702205952 2021-02-19 13:28:15 -0500 <lanllaire> égorgé par un de ceux qu'il aidait !!! et le cas est pas rare !!! #remigration #StopIslam
1362831318309081095 2021-02-19 13:28:01 -0500 <FrontSocial> @Valeurs Les islamistes se croient déjà en terrain conquis avec #Macron !  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/2w8Ks2F7Me
1362823836102721536 2021-02-19 12:58:17 -0500 <adnzafar> The Power of SADAQAH  #business #entrepreneurs #Entrepreneur #Entrepreneurship #sadaqah #londonislovinit #corporate #BusinessGrowth #businessadvice #businesstravel #businessmodels #businessman #islam #islamteachesus #islamisation #IslamExposed   https://t.co/rhwONc1KfU
1362823802833485827 2021-02-19 12:58:09 -0500 <saraosalvatore> @liliaragnar @Paolo_ADP10 Li manteniamo tutti a debito perché? oltretutto non accettano la ns civiltà non vogliono integrarsi ma imporci la sharija...
1362822473142726664 2021-02-19 12:52:52 -0500 <saraosalvatore> @PicodaMirandola @Giandom84354994 Vaffa buro
1362820041482395653 2021-02-19 12:43:12 -0500 <AdamekMiroslav> @DanielHule @ODScz @H_Langsadlova @MarekVyborny Ujguři jsou nejkrvežíznivější teroristi z ISIS v Idlibu.  Viděl jsem tady na twitteru videa jak vařili malý děti Jezídů za živa v kotly ☝️☠ Jak upalovali jejich matky v klecích..apod.. Zjistěte si o Ujgurech nějaká fakta než něco napíšete.  Chcete ty videa nasdilet?  #StopIslam
1362820031567003650 2021-02-19 12:43:10 -0500 <saraosalvatore> @matteosalvinimi al Senato: 115 senatori il blocco del cdx supera i 110 del csx. Adesso potete bloccare tutte le cazzate dei sinistri!
1362819619761848324 2021-02-19 12:41:31 -0500 <pgr13000> @GDarmanin @bayrou #referendum immigration #remigration #stopislam #LaRacailleTue merci aux politiques pour ce magnifique vivre ensemble et au prochain remplacement
1362819534290382848 2021-02-19 12:41:11 -0500 <saraosalvatore> Non piangerò per la raggi, ma certo non voterò gualtieri...
1362816707107127305 2021-02-19 12:29:57 -0500 <saraosalvatore> SISTEMA PALAMARA: vietato indagare a sinistra!
1362815626264399873 2021-02-19 12:25:39 -0500 <saraosalvatore> @chiccotesta Rapine legalizzate?
1362812001953710088 2021-02-19 12:11:15 -0500 <BergVincentvd> @RenMid #StopIslam
1362811231921405953 2021-02-19 12:08:12 -0500 <Jonnybravo662> @jo_14leeuwen @GroenIn033 @SigridKaag #stopislamiseringvanNL  #stopsalafisme #banislam #PVV  https://t.co/1q0zfgemu7
1362809027890212865 2021-02-19 11:59:26 -0500 <VrijeMeinung> #StopIslam
1362808493812682752 2021-02-19 11:57:19 -0500 <AWJAvanHattem> #STOPISLAM  #PVV #Brabant
1362804473421258756 2021-02-19 11:41:20 -0500 <LansKKu> @SammyMahdi Omdat sommigen hun geloof de vrouw als sexobject ziet als ze niet gesluierd zijn. Verbiedcdie sekte de die gelijke mensenrechten niet respecteert. #StopIslam
1362804034428678149 2021-02-19 11:39:36 -0500 <FrontSocial> @Reine_Margot2 @VotezPoisson Plus il y a de monde à la manifestation de #GenerationIdentitaire à Paris et moins il y aura de risque de violence par l'extrême gauche !  #SoutienGenerationIdentitaire  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/Bl0dgpgnWp
1362801576591708160 2021-02-19 11:29:50 -0500 <ilove_sharialaw> @jkcarnah @mngop Thank you for pinning this!!
1362798921421103104 2021-02-19 11:19:17 -0500 <FrontSocial> @philippe_dormoy @F_Desouche Oui, ils nous imposent la #charia et c'est pire pour les enfants des Français les plus pauvres qui ne mangent pas de viande hormis à la cantine.  Des #islamocollabo #EELV #LFI #PC #PS #LR #LREM   #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC  https://t.co/hc5hb4kCcN
1362791245756653569 2021-02-19 10:48:47 -0500 <FrontSocial> @kevinbossuet Ah ah ah, quand tu découvre ce qu'est #LFI La France Islamique !⤵️ 🤭😂🤣  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/khWalvKw7g
1362787854968061958 2021-02-19 10:35:18 -0500 <FrontSocial> @AnkouJl @Andrgilles1 @F_Desouche Ah ça, pour avoir les pieds dans la merde, à croire que les Français adorent ça...💩🤥  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC  https://t.co/2idI5Gledr
1362784635789443079 2021-02-19 10:22:31 -0500 <marilynnefriedm> Shameful but not unexpected: in their ongoing bigotry towards (through low expectations of) #Palestinians, @AP @washingtonpost, @latimes blame #Israel when #Hamas Deny #WomensRights. #sharialaw  https://t.co/lFcAiqP8Rq @honestreporting
1362784038189232137 2021-02-19 10:20:08 -0500 <FrontSocial> @jfpoisson78 @VotezPoisson Mais le #GrandRemplacement n’existe pas, l'immigration apporte de l'enrichissement culturel, le vivre ensemble et les nouveaux petits #migrants sont gentils...  Tous des chances pour la France !   #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/6S0i3MC9bj  https://t.co/YDNknpyB0o
1362780668204310532 2021-02-19 10:06:45 -0500 <FrontSocial> @F_Desouche L'immigration, les #migrants sont des chances pour la France, le vivre ensemble et l'enrichissement culturel selon ceux qui nous gouvernent de gauche ou de droite...  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/qmFyNahqGA
1362779461830250498 2021-02-19 10:01:57 -0500 <FrontSocial> @F_Desouche Traduction: A Lyon, la mairie #EELV prive les écoliers de viande : c’est ce qui convient aux enfants qui ne mangent pas de porc.   Imposer aux Français l'islamisation !  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/RduS78yeAs
1362778441364484097 2021-02-19 09:57:54 -0500 <FrontSocial> @F_Desouche La seule chose qui compte pour les islamo-collabo c'est " les enfants qui ne mangent pas de porc" pour tous les autres enfants c'est privation de viande !😠 #EELV #LFI   #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/HhVL6ugkqz
1362773483961286662 2021-02-19 09:38:12 -0500 <Justicetrack9> @nelsonlahaya @ajaxfriend @tunahankuzu @MinPres @DenkNL Share everyone to vote Geert Wilder the next Dutch prime minister 2021 and let's wonderful year in 2021 for The Netherland  #stopislam
1362771181854687243 2021-02-19 09:29:03 -0500 <saraosalvatore> @TinoMazzini @kadreg1 Causa ed effetto. Compro caro e chi straguadagna mi manifesterà la sua felicitá
1362770360358162432 2021-02-19 09:25:47 -0500 <Justicetrack9> @VarunMhatre19 #HindusLivesMatter  #ChristianLivesMatter #BuddhaLivesMatter No persecution!  #StopIslam Islam is enemy all of religious, islam is first enemy and second is communism, but islam more dangerous
1362770111074082820 2021-02-19 09:24:48 -0500 <saraosalvatore> @kadreg1 Nessuno!!! Finora ha comprato chillo che costa chiù
1362769007959490562 2021-02-19 09:20:25 -0500 <saraosalvatore> Ministro della Giustizia!!!
1362768461919887360 2021-02-19 09:18:14 -0500 <saraosalvatore> Ancora blatera?  https://t.co/exR4CBKOnO
1362764497107132423 2021-02-19 09:02:29 -0500 <luishon2> @LaurenGruel Here I’m real (bot…?) but unfollow anybody that doesn’t follow me Or any treasonous Socialist Communist or #ShariaLaw enforcer  https://t.co/nq1Ke6IYqG
1362759424608198657 2021-02-19 08:42:20 -0500 <saraosalvatore> @andrea_news @NicolaPorro @a_meluzzi @UGiangrieco @claudio_2022 @adrianobusolin @bruno_luckn @AlfioKrancic @OllaPiero @Ilconservator @erpedrini @Gigadesires @Capezzone @italianfirst2 Che meriti ha la “signora arcuri”?
1362759091299483649 2021-02-19 08:41:00 -0500 <saraosalvatore> Il PD ha paura di presentarsi da solo e spinge ancora per allearsi con il M5S! Certo che Zingarello sembra non avere nessuna strategia, ha verificato se davvero hanno idee affini e compatibili?
1362753461599809536 2021-02-19 08:18:38 -0500 <saraosalvatore> @laltrodiego @irritatrix Questo dovrebbe rispondere degli omessi controlli al ponte Morandi di Genova, ma a sx non si indaga secondo il Sistema Palamara
1362751791364853770 2021-02-19 08:12:00 -0500 <CaPrTim> #newsom #TeachersFirst #Union learn where edu. happens @maryreinking1 @mayorlucy @leighannehiggin @vineyardtheatre @phillybizmedia #BanSharia #AmericaFirst @maryhowardVT @_Pirate_news @ElaineWeaverCre @mitchellreports @lmfeeney @KaviLadnier @NAMI_Washington @stevebradford  https://t.co/QffebRRMdV
1362751554521038852 2021-02-19 08:11:03 -0500 <BIGGGSS297> @TGhazniwal Such a B!tch Way to Do Battle in War Time..There's No Honor in Anything there doing.. I get it, u want them out &amp; have ur own "sharialaw"..But the way its being done is Not right at all..U "FIGHT" in wartime Not Plant IEDZ &amp; set it off from a Far..Its Weak.. Neway..✌💖
1362747041495977990 2021-02-19 07:53:07 -0500 <santroch60> ALÁ NO ES GRANDE. "Mahoma era un charlatán cobarde que odiaba haber nacido". #STOPIslam #islam  https://t.co/7jULilcfd2
1362728614903623683 2021-02-19 06:39:54 -0500 <FrontSocial> @F_Desouche Le fameux enrichissement culturel et le vivre-ensemble !😁  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/6A33MubOc4
1362721857154277379 2021-02-19 06:13:03 -0500 <FrontSocial> @Valeurs #Remigration des clandestins et prison pour ceux qui les  les aident !😠  #Grandremplacement #migrants #DefendEurope #StopIslam #FrontSocial #RIC   https://t.co/CxJb5VtNBG
1362714153241182208 2021-02-19 05:42:26 -0500 <saraosalvatore> Bill GATES HA SENTENZIATO CI VUOLE LA TERZA DOSE.
1362713747270303744 2021-02-19 05:40:50 -0500 <saraosalvatore> @GeMa7799 certamente come con Zingarello il PD è avanti nei sondaggi...
1362713554206482434 2021-02-19 05:40:03 -0500 <saraosalvatore> @Signorasinasce a sx non si indaga, vedi Sistema Palamara...
1362713398241349635 2021-02-19 05:39:26 -0500 <saraosalvatore> @isabellaisola3 @manu_etoile in Italia manca anagrafe e piano vaccinale. qui a Roma inserisco il mio codice fiscale e mi rispondono in automatico: i tuoi dati non corrispondono alla vaccinazione in corso... bravo zingarello
1362712688745414659 2021-02-19 05:36:37 -0500 <saraosalvatore> @giovanni_morici certamente li ha imposti Mattarella, ma se doveva avere un governo di "eccellenze" poteva rifiutarsi, evidentemente prevalgono gli interessi di parte comunista pidiota
1362708436190912513 2021-02-19 05:19:43 -0500 <afrvet> #StopIslam  https://t.co/LI4ogSG6LY
1362699179915681792 2021-02-19 04:42:56 -0500 <off_h2> @YaelBRAUNPIVET @_LICRA_ @GOLDMANNAriel @Le_CRIF @1ElisaMoreno @MarleneSchiappa @auroreberge @LaREM_AN @EliseFajgeles @DILCRAH @EnMarche78 "Cette fois ci ce sont les musulmans qui vont vous faire la peau..."  Quelle "magnifique" représentation de la religion de paix et d'amour que les islamogauchistes s'évertuent à nous faire gober ! TOUS doivent être expulsés sans ménagement ! #StopIslam #Dubalai #Dehors #Expulsion
1362698214743420931 2021-02-19 04:39:06 -0500 <LansKKu> @Outlaw__Mike Onderdrukte vrouwen in België. De sekte gaat met haar regels in tegen de universele gelijke mensenrechten. Verbieden die sekte #StopIslam
1362694516478783488 2021-02-19 04:24:25 -0500 <R_Van_Antwerpen> “Buitenlanders nemen er de straten over: mensen die de taal niet spreken, die zich crimineel gedragen, die geen enkele moeite doen om te integreren. Voor hen zijn wij 'kuffars', ongelovigen, van wie je je verre houdt.”  #stopislam
1362683904939982849 2021-02-19 03:42:15 -0500 <VrijeMeinung> #StopIslam
1362676883062685697 2021-02-19 03:14:20 -0500 <saraosalvatore> @savvucciu @Signorasinasce Anche qui a Roma sole splendente...
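A note on the RuntimeError in the traceback above: Jupyter already runs an asyncio event loop, and the traceback shows twint calling run_until_complete on that same loop. The sketch below reproduces the error with the standard library only; the helper name call_from_running_loop is made up for illustration and is not part of twint.

```python
import asyncio

async def call_from_running_loop():
    """Call run_until_complete while the loop is already running (as in Jupyter)."""
    loop = asyncio.get_running_loop()
    pending = loop.create_future()  # placeholder future, never completed
    try:
        loop.run_until_complete(pending)
    except RuntimeError as exc:
        pending.cancel()
        return str(exc)
    return None

message = asyncio.run(call_from_running_loop())
print(message)
```

A workaround often used in notebooks is the third-party nest_asyncio package (calling nest_asyncio.apply() before twint.run.Search(c)). This notebook does not use it, so treat that as an assumption to verify rather than as part of the original analysis.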
In [34]:
df_twitter = pd.read_csv('./twitter2.csv')
df_twitter.head()

# Print the first 20 tweets posted on 2021-02-19
sample = df_twitter[df_twitter['date'] == "2021-02-19"].head(20)
for _, row in sample.iterrows():
    print(f"{row['created_at']} {row['username']}: {row['tweet']}\n")
2021-02-19 23:57:12 Eastern Standard Time frontsocial: @suivezlecoq C'est à cause d'eux et de gens comme vous que nous ne pouvons plus tuer le Cochon !🇫🇷😠  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/cPt6WMz2iR

2021-02-19 23:37:14 Eastern Standard Time frontsocial: @PoliceNat44 Vous faites de la peine... #ProtectionAnimale 😏  Par contre pour les Français, eux ils peuvent se faire remplacer et massacrer au couteau , c'est la mode hein ?🤔  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/cPt6WMz2iR

2021-02-19 23:10:52 Eastern Standard Time frontsocial: @BultotPatrice Ah bon, alors là, c'est la meilleur, la France n'est pas une poubelle depuis plus de 40 ans ?!🚮🤔🤭😂  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/pvPGx3ZpSf

2021-02-19 23:05:26 Eastern Standard Time frontsocial: @AnneFarmer65 La faute au réchauffement climatique selon #Macron !🤭😂  #GrandRemplacement #DefendEurope #Remigration #StopIslam #FrontSocial #RIC #migrants  https://t.co/XjCpZMgXFr

2021-02-19 21:53:16 Eastern Standard Time frontsocial: @KAYDM49 @BasedSpain1 @APES_Cat @thedukeoriginal @UnionJackGuy @DutchDL 84 billion: budgetary cost of immigration in France and and the new #migrants of #Macron are not in the lot !⤵️  Not to mention the cost of violence, crime, theft, rape, scams, trafficking, etc...  #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/z0LrDWN4HD

2021-02-19 21:31:04 Eastern Standard Time chrisishere9: @Zeitgeschehen_ @dima973 🤢🤢 Abschieben diesen Islamisten. #StopIslam

2021-02-19 21:26:12 Eastern Standard Time frontsocial: @Nigel_Farage @AlohaHa59067534 Merci !🇬🇧👍 #Brexit   J'adore...🇫🇷😂 #Frexit   #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/XOdpgly8OF

2021-02-19 21:19:51 Eastern Standard Time heidiec5: #islamistheproblem

2021-02-19 20:19:48 Eastern Standard Time chrisishere9: @tagesthemen @HadijaHaruna Es gibt noch viel zu tun, damit #Breitscheidplatz sich nicht wiederholt. #StopIslam

2021-02-19 19:39:18 Eastern Standard Time frontsocial: @Suliv16 @F_Desouche C'est ça, prenez nous pour des cons !😏  #Erdogan : "Les mosquées sont nos casernes, les coupoles nos casques, les minarets nos baïonnettes et les croyants nos soldats"  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/l6VV2qJ4QD  https://t.co/HY2P50T7Pf

2021-02-19 17:42:22 Eastern Standard Time nelsonlahaya: @umarebru @NUnl @NUnl geen enkele normaal denkende Nederlandse vrouw laat zich onderdrukken door zo'n doek om haar hoofd te knopen. Laat staan dat ze ermee van de zon gaan genieten.  Is  https://t.co/v49NUvjc3m geworden tot islam slaaf? Inderdaad om van te kotsen. #stopislam

2021-02-19 17:32:11 Eastern Standard Time steam7r: #stopislam #STOPimmigration

2021-02-19 17:29:44 Eastern Standard Time steam7r: #TurkeyIsATerrorState #TurkenTerreur #stopislam

2021-02-19 17:17:58 Eastern Standard Time frontsocial: @TF1LeJT Traduction: Français allez à la recherche des premières jonquilles en fleurs et pendant ce temps...😁  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC   https://t.co/0TzSvWXTSy  https://t.co/ycciA4IW6e

2021-02-19 17:11:40 Eastern Standard Time frontsocial: @Actu17 Seulement 7 ans et bientôt libre alors qu'il est clandestin !🤪  Pays de merde !🚮  #Grandremplacement #migrants #DefendEurope #Remigration #StopIslam #FrontSocial #RIC    https://t.co/izwRB5Qtbb

2021-02-19 16:51:31 Eastern Standard Time leilamansouri11: The #Iranian women's alpine ski team flew on Wednesday to Italy for the world championships without their coach, whose husband has barred her from leaving the country, Iranian media reports. Millions suffer under #ShariaLaw in 2021.   https://t.co/gQWN0lFrhh

2021-02-19 16:46:55 Eastern Standard Time espabilinator: @IreneMontero Ocurre a las mujeres en Europa y tiene una causa confirmada, pero os negais a admitirlo. El efecto llamada a la inmigración, los papeles para todos por un nicho de votos y la aceptación de sus costumbres bárbaras os convierten en hipócritas demagogos. #stopinvasion #stopislam

2021-02-19 16:38:54 Eastern Standard Time mortgagedheart: #islamogauchisme #FoutagedeGueule #StopIslam #PauvreFrance

2021-02-19 16:38:23 Eastern Standard Time mortgagedheart: #Remigration #StopImmigration #IslamHorsdEurope #StopIslam #PauvreFrance

2021-02-19 16:17:57 Eastern Standard Time nelsonlahaya: @TheEdgyVeggie1 Komt omdat mohamed een gewelddadige idioot was. En wie vernoemt zijn kind nou naar een pedofiel? #stopislam

9. Twitter Heatmap

In [35]:
# Keep only the tweets that carry place information, as JSON strings
place_df = df.loc[df["place"].notna(), "place"].astype('str')
print(place_df)
94       {"id":"1d68da80ca90416d","url":"https://api.tw...
206      {"id":"53cef5332ac9d7d0","url":"https://api.tw...
638      {"id":"f632697d33274211","url":"https://api.tw...
642      {"id":"5635c19c2b5078d1","url":"https://api.tw...
997      {"id":"a81f9ed24c15d6af","url":"https://api.tw...
                               ...                        
73381    {"id":"2a8a74486cd0d519","url":"https://api.tw...
73915    {"id":"243cc16f6417a167","url":"https://api.tw...
73972    {"id":"8e9665cec9370f0f","url":"https://api.tw...
74221    {"id":"67d92742f1ebf307","url":"https://api.tw...
74229    {"id":"ad0818e2fb208dde","url":"https://api.tw...
Name: place, Length: 675, dtype: object
In [36]:
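# Parse each JSON string and expand its fields into columns
places = place_df.apply(json.loads).apply(pd.Series)
# Drop the ', USA' suffix so it is not repeated when the country is appended
places['full_name'] = places['full_name'].str.replace(', USA', '', regex=False)
places['location'] = places['full_name'] + ", " + places['country']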
places.head()
Out[36]:
index | id | url | place_type | name | full_name | country_code | country | contained_within | bounding_box | attributes | location
94 | 1d68da80ca90416d | https://api.twitter.com/1.1/geo/id/1d68da80ca9... | city | Kawartha Lakes | Kawartha Lakes, Ontario | CA | Canada | [] | {'type': 'Polygon', 'coordinates': [[[-79.2086... | {} | Kawartha Lakes, Ontario, Canada
206 | 53cef5332ac9d7d0 | https://api.twitter.com/1.1/geo/id/53cef5332ac... | city | Waunfawr | Waunfawr, Wales | GB | United Kingdom | [] | {'type': 'Polygon', 'coordinates': [[[-4.21700... | {} | Waunfawr, Wales, United Kingdom
638 | f632697d33274211 | https://api.twitter.com/1.1/geo/id/f632697d332... | city | Mount Airy | Mount Airy, MD | US | United States | [] | {'type': 'Polygon', 'coordinates': [[[-77.1957... | {} | Mount Airy, MD, United States
642 | 5635c19c2b5078d1 | https://api.twitter.com/1.1/geo/id/5635c19c2b5... | admin | Virginia | Virginia | US | United States | [] | {'type': 'Polygon', 'coordinates': [[[-83.6752... | {} | Virginia, United States
997 | a81f9ed24c15d6af | https://api.twitter.com/1.1/geo/id/a81f9ed24c1... | city | Grand Rapids | Grand Rapids, MI | US | United States | [] | {'type': 'Polygon', 'coordinates': [[[-85.7514... | {} | Grand Rapids, MI, United States
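
To make the parsing step concrete, here is a minimal sketch of how a single place record becomes a location string, using only the standard library. The sample record is illustrative (it keeps just two of the fields shown above), and build_location is a hypothetical helper, not a function from the notebook.

```python
import json

def build_location(place_json):
    """Turn one JSON-encoded place record into 'full_name, country'."""
    place = json.loads(place_json)
    # Drop the ', USA' suffix so the country is not named twice
    full_name = place["full_name"].replace(", USA", "")
    return full_name + ", " + place["country"]

# Illustrative record, trimmed to the two fields used here
sample = '{"full_name": "Virginia, USA", "country": "United States"}'
print(build_location(sample))
```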

9.1 Dot Map

In [37]:
locator = Nominatim(user_agent='myGeocoder')
m = folium.Map(location=[20,0], zoom_start=2)
data = []
for index, row in places.iterrows():
    # Geocode the full location; fall back to the country if that fails
    location = locator.geocode(row['location'])
    if location is None:
        location = locator.geocode(row['country'])
    if location is None:
        # Skip places that cannot be geocoded at all
        continue

    data.append([location.latitude, location.longitude, 1])

    # Draw one circle per geotagged tweet
    folium.Circle(
        radius=400,
        location=[location.latitude, location.longitude],
        popup=row["name"],
        color="crimson",
        fill=False,
    ).add_to(m)
places_copy = pd.DataFrame(data, columns=['latitude', 'longitude', 'count'])
m.save('50kdotmap.html')
m
Out[37]:
[Interactive folium dot map of geotagged tweet locations; saved to 50kdotmap.html]
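
Many tweets share the same place, so the geocoding loop can send the same query several times. The sketch below shows one way to cache lookups and to handle the case where both the full location and the country fail. It uses a hypothetical stand-in geocoder (fake_geocode) with made-up coordinates instead of Nominatim, so it runs offline; resolve is likewise an illustrative name.

```python
from functools import lru_cache

# Hypothetical lookup table standing in for a real geocoding service;
# the coordinates are rough illustrative values.
KNOWN = {
    "Virginia, United States": (37.1, -78.5),
    "United States": (39.8, -98.6),
}
calls = []  # records every query actually sent to the geocoder

def fake_geocode(name):
    """Return (latitude, longitude) for a known name, else None."""
    calls.append(name)
    return KNOWN.get(name)

@lru_cache(maxsize=None)
def resolve(location, country):
    """Try the full location first, then the country; None if both fail."""
    return fake_geocode(location) or fake_geocode(country)

rows = [
    ("Virginia, United States", "United States"),
    ("Virginia, United States", "United States"),    # repeat, served from cache
    ("Nowhere, Atlantis", "Atlantis"),               # cannot be geocoded
    ("Unknown Town, United States", "United States"),  # falls back to country
]
points = [p for p in (resolve(loc, c) for loc, c in rows) if p is not None]
print(points)
```

With a real service the same pattern also reduces the number of requests, which matters for rate-limited public geocoders.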

9.2 Heat Map

In [38]:
n = folium.Map(location=[20,0], zoom_start=2)
# Sum the counts of tweets that share the same coordinates
heat_data = places_copy.groupby(['latitude', 'longitude'])['count'].sum().reset_index().values.tolist()
HeatMap(data=heat_data, radius=10, max_zoom=10).add_to(n)
n.save('50kheatmap.html')
n
Out[38]:
[Interactive folium heat map of geotagged tweet locations; saved to 50kheatmap.html]
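
The heat map weights each coordinate by how many tweets were geocoded to it. As a rough illustration of that aggregation step, the sketch below does the same grouping with collections.Counter; the coordinates are made-up sample values, not rows from the dataset.

```python
from collections import Counter

# Illustrative (latitude, longitude) pairs; two tweets share the first point
points = [(43.0, -78.0), (43.0, -78.0), (52.0, 5.0)]

# Count how many tweets fall on each coordinate
weights = Counter(points)

# Build [latitude, longitude, weight] rows, the shape passed to HeatMap above
heat_rows = [[lat, lon, n] for (lat, lon), n in sorted(weights.items())]
print(heat_rows)
```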
In [39]:
total_entries = len(df.index)
location_entries = len(places.index)
print(total_entries)
print(location_entries)
print(str(100*(location_entries/total_entries)) + "% of total entries have location information available.")
# Studies suggest that approximately 0.85% of tweets are geotagged: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4636345/
# The study looked at a sample of 113 million tweets. Treating 0.85% as the baseline, the lower share in our data is statistically significant at the 99% confidence level.
# What is making this percentage so low? One plausible explanation is an abundance of bots posting negative content about Islam.
119019
675
0.567136339576034% of total entries have location information available.
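
The significance claim in the comments can be checked with a one-proportion z-test. The sketch below is a minimal version under two assumptions that the notebook does not state: the 0.85% baseline is treated as exact, and tweets are treated as independent draws. The function name one_proportion_z is illustrative.

```python
import math

def one_proportion_z(successes, n, p0):
    """z statistic for an observed proportion against a fixed baseline p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# 675 geotagged tweets out of 119019, compared with a 0.85% baseline
z = one_proportion_z(675, 119019, 0.0085)
print(z)

# A two-sided test at the 99% level rejects the baseline when |z| > 2.576
print(abs(z) > 2.576)
```

Under these assumptions the statistic lies far beyond the 2.576 threshold, which supports the comment that the geotagged share is lower than the baseline. It says nothing about why it is lower.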

10. Conclusion

We used this data to uncover two kinds of relationships:

  • Relationships between different users through mentions
  • The relationship between users and where they are tweeting from

The tools we presented allow activists to take data-driven action: by reaching out to specific users in specific regions, they can direct their efforts to combat Islamophobia more efficiently.