Catalan Referendum Twitter corpus

  1. Jiménez-Zafra, Salud María 1
  2. Martín-Valdivia, María Teresa 1
  3. Sáez-Castillo, Antonio José 1
  4. Conde-Sánchez, Antonio 1
  1. 1 Universidad de Jaén, Jaén, Spain

    ROR https://ror.org/0122p5f64

Publisher: Dryad

Publication date: 2020

Type: Dataset

CC0 1.0

Abstract

This corpus consists of 46,962 tweets related to the Catalan referendum, a highly controversial topic in Spain because it was an independence referendum called by the Catalan regional government and suspended by the Constitutional Court of Spain at the request of the Spanish government. All tweets were downloaded on October 1, 2017 using the hashtags #CatalanReferendum or #ReferendumCatalan. On October 31, 2017, we collected the features of these tweets in order to analyze their virality. Each item in the collection consists of the features we used from each tweet to perform the virality analysis:

  lang: Tweet language.
  retweet_count: Total number of retweets recorded for the tweet.
  favourite_count: Total number of favourites recorded for the tweet.
  is_quote_status: Whether the tweet quotes another tweet.
  num_hashtags: Total number of hashtags in the tweet.
  num_urls: Total number of URLs in the tweet.
  num_mentions: Total number of users mentioned in the tweet.
  interval_time: Interval of the day in which the tweet was published: morning (06:00-12:00), afternoon (12:00-18:00), evening (18:00-00:00) or night (00:00-06:00).
  positive_words_iSOL: Total number of positive words found in the tweet using the iSOL lexicon.
  negative_words_iSOL: Total number of negative words found in the tweet using the iSOL lexicon.
  positive_words_NRC: Total number of positive words found in the tweet using the NRC lexicon.
  negative_words_NRC: Total number of negative words found in the tweet using the NRC lexicon.
  positive_words_mlSenticon: Total number of positive words found in the tweet using the ML-SentiCon lexicon.
  negative_words_mlSenticon: Total number of negative words found in the tweet using the ML-SentiCon lexicon.
  verified_user: Whether the tweet's author is a verified user.
  followers_count_user: Total number of users who follow the author of the tweet.
  friends_count_user: Total number of users the author is following.
  listed_count_user: Total number of lists that include the author of the tweet.
  favourites_count_user: Total number of tweets the author has favourited.
  statuses_count_user: Total number of tweets posted by the author since the account was created.
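As an illustration of how the interval_time feature could be derived from a tweet's publication hour, the following is a minimal Python sketch. The function name `interval_time_from_hour`, the half-open treatment of the boundaries (for example, 12:00 counted as afternoon rather than morning), and the exact label strings are assumptions made for this example; the dataset description does not state how boundary times or time zones were handled, nor how the labels are encoded in the files.

```python
def interval_time_from_hour(hour: int) -> str:
    """Map an hour of the day (0-23) to one of the four day intervals.

    Assumption: each interval is half-open, i.e. it includes its start
    hour and excludes its end hour. The corpus description only gives
    the ranges, not how the boundaries were resolved.
    """
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    if hour < 6:
        return "night"      # 00:00-06:00
    if hour < 12:
        return "morning"    # 06:00-12:00
    if hour < 18:
        return "afternoon"  # 12:00-18:00
    return "evening"        # 18:00-00:00


if __name__ == "__main__":
    # Print the interval assigned to a few sample hours.
    for h in (0, 5, 6, 11, 12, 17, 18, 23):
        print(h, interval_time_from_hour(h))
```

Under these assumptions every hour of the day falls into exactly one interval, so the feature can be treated as a four-level categorical variable in a virality analysis.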