Answering student questions

by Lejla Dzanko
Number of replies: 1

Dear all,

a reminder that if you have questions about the code in general, you should ask them here: that way you may get an answer faster, and others can benefit from it too.

Questions specific to your project can be sent via email or posted here if you'd like.

I am going to post a few questions I got via email:


  • For some reason I cannot search for tweets prior to November 23rd 2022

In Lab 1 we used the recent search API, which can only access data from the last week. I talked about the limitations of the different access levels in class. To collect data from any point in time, you should try applying for Academic Research access. You can read a guide on how to do it, written by your classmate, in the Student forum of the website.

  • Is there a way to put a limit on how many tweets we want, given that the topic is very talked about and the code ran for almost an hour getting tweets within a 24-hour period?
What you can do is:
  1. change the code so that it does not paginate until the end of the results. Currently it runs as long as there is a "next_token" in the response, but you can choose to stop it after the first, second or n-th page. That way, if you collect data day by day, you'd get <=500 tweets from each day
  2. use smaller time periods (e.g. one hour, ten minutes, etc.)
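Both options can be sketched roughly as follows. This is only an illustration, not the Lab 1 code: `fetch_page` is a hypothetical function that sends one request with the given parameters and returns the parsed JSON response, and the names `collect_limited` and `time_windows` are made up for this example. The only things taken from the API response are the "data" list and the "next_token" inside "meta".

```python
from datetime import datetime, timedelta, timezone


def collect_limited(fetch_page, params, max_pages=1):
    """Collect tweets but stop after max_pages pages, even if more exist.

    fetch_page is a hypothetical callable: it takes a dict of query
    parameters and returns the parsed JSON response as a dict.
    """
    tweets = []
    query = dict(params)  # copy, so the caller's dict is left untouched
    for _ in range(max_pages):
        response = fetch_page(query)
        tweets.extend(response.get("data", []))
        next_token = response.get("meta", {}).get("next_token")
        if next_token is None:
            break  # no further results for this query
        query["next_token"] = next_token  # ask for the following page
    return tweets


def time_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering start..end."""
    current = start
    while current < end:
        window_end = min(current + step, end)
        yield current, window_end
        current = window_end


# Example: split one day into 24 one-hour windows.
day_start = datetime(2022, 11, 23, tzinfo=timezone.utc)
hourly = list(time_windows(day_start, day_start + timedelta(days=1),
                           timedelta(hours=1)))
```

You would then call `collect_limited` once per window (or once per day), passing the window boundaries as the start and end time parameters of your query, formatted the way the API expects them.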
  • We wanted to know whether the number of hashtags is important. Is it better to choose a general topic or a specific name? Can we use just a part of a specific hashtag, or should we use all of it?
Once you pick out a topic, you can search for it on Twitter (manually, here: https://twitter.com/explore) and see what hashtags people are using when talking about it. I would say that the number of hashtags is not important, as long as they are relevant. You should use hashtags the way they appear in the tweets. You can also do a trial run: collect tweets from a few key points in time, without downloading all the results, just a sample of 500 tweets per day or so. Then extract the hashtags like we did in Lab 2 and rank them (by frequency/degree or by PageRank). That would give you a good idea of which hashtags to include. Btw, you also don't have to use hashtags. It can be a keyword, or you can search for tweets posted by a specific account, etc.
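For the trial run, ranking hashtags by frequency could look roughly like this. It is a simplified sketch, not the Lab 2 code: it assumes each tweet is a dict with a "text" field, finds hashtags with a simple regular expression, and counts them case-insensitively (so #WorldCup and #worldcup are treated as the same tag). The function name `rank_hashtags` is made up for this example.

```python
import re
from collections import Counter

# Simplified pattern: a '#' followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def rank_hashtags(tweets, top_n=10):
    """Return the top_n hashtags as (hashtag, number_of_tweets) pairs."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.get("text", "")
        # Use a set so a hashtag repeated within one tweet counts once.
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        counts.update(tags)
    return counts.most_common(top_n)


sample = [
    {"text": "#WorldCup is on! #Qatar2022"},
    {"text": "Watching the #worldcup tonight"},
    {"text": "No hashtags here"},
]
print(rank_hashtags(sample))
```

The hashtags at the top of the list are good candidates for your search query. Ranking by degree or PageRank, as in Lab 2, would need the co-occurrence network instead of plain counts.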

Ciao,
L.