At the end of November, the article "Active Annotation in Evaluating the Credibility of Web-Based Medical Information: Guidelines for Creating Training Data Sets for Machine Learning" was published. One of its authors is Prof. Mikołaj Morzy, leader of our research group from Poznań.
The main result presented in the paper is the design and implementation of an active annotation process for manually labeling the credibility of medical information extracted from the Web. With active annotation, annotators discover and tag twice as many non-credible statements as with random annotation in the same amount of time. This is achieved by first clustering semantically similar statements into groups and then re-ranking the sentences awaiting annotation within those clusters. Sentences that belong to clusters containing many non-credible sentences are pushed to the top of the ranking, so the annotators find many more non-credible and dubious claims. This recursive procedure of first clustering sentences and then triggering their re-ranking is the original contribution of the #Webimmunization project. It is similar in spirit to the procedure the #Webimmunization team used when annotating fake news in tweets related to COVID-19.
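To make the re-ranking idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from the paper: it assumes cluster ids have already been computed (for example from sentence embeddings), it does not repeat the clustering step, and the smoothed scoring rule as well as the names rerank, active_annotation, cluster_of and annotate are hypothetical.

```python
from collections import defaultdict


def rerank(unlabeled, cluster_of, labels, prior=0.5):
    """Order unlabeled sentence ids so that sentences from clusters with a
    high share of non-credible labels come first.

    The smoothed score below is an assumption made for this sketch.
    """
    seen = defaultdict(int)  # labeled sentences per cluster
    bad = defaultdict(int)   # non-credible sentences per cluster
    for sid, is_non_credible in labels.items():
        cluster = cluster_of[sid]
        seen[cluster] += 1
        bad[cluster] += int(is_non_credible)

    def score(sid):
        cluster = cluster_of[sid]
        # Smoothing: a cluster with no labels yet gets the score `prior`.
        return (bad[cluster] + prior) / (seen[cluster] + 1.0)

    # Highest score first; ties are broken by sentence id for determinism.
    return sorted(unlabeled, key=lambda sid: (-score(sid), sid))


def active_annotation(cluster_of, annotate, budget, batch_size=1):
    """Label up to `budget` sentences, re-ranking after every batch.

    cluster_of: dict mapping sentence id -> cluster id (assumed precomputed).
    annotate:   callable returning True if the annotator marks the sentence
                as non-credible (stands in for the human annotator).
    """
    labels = {}
    unlabeled = set(cluster_of)
    while unlabeled and len(labels) < budget:
        ranking = rerank(unlabeled, cluster_of, labels)
        take = min(batch_size, budget - len(labels))
        for sid in ranking[:take]:
            labels[sid] = annotate(sid)
            unlabeled.discard(sid)
    return labels


if __name__ == "__main__":
    # Toy data: cluster "a" holds the non-credible sentences.
    clusters = {0: "b", 1: "a", 2: "a", 3: "a", 4: "b", 5: "b"}
    found = active_annotation(clusters, lambda sid: clusters[sid] == "a", budget=4)
    print(sum(found.values()), "non-credible sentences found in", len(found), "labels")
```

In this toy run, the first label falls in the credible cluster, which lowers that cluster's score, so the remaining budget is spent on the cluster where non-credible sentences turn up. A larger batch_size trades responsiveness of the ranking for fewer re-ranking passes.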