TrustAlert Health News Classifier and ICD9 Similarity Search

This is a demo for TrustAlert, a European project that aims to predict disease outbreaks from news articles extracted from GDELT.

How the Model Works:

  1. News Article Classification:

    • Classifies a news article into one of the predefined IPTC taxonomy topics (e.g., health, business, politics).
    • Uses a pre-trained sentence transformer model (all-mpnet-base-v2) and cosine similarity for classification.
  2. ICD9 Similarity Search:

    • If the article is classified as "health", searches for the top k most similar ICD9 codes using cosine similarity (k = 10 in this demo).
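
The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `embed` is a toy bag-of-words stand-in for the all-mpnet-base-v2 sentence encoder so that the example runs without extra dependencies, and the topic descriptions and the three ICD9 entries are small hypothetical samples chosen for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a sentence encoder: bag-of-words counts (assumption)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical topic descriptions; the demo uses IPTC taxonomy topics.
TOPICS = {
    "health": "health disease medicine hospital outbreak virus",
    "business": "economy business finance market company",
    "politics": "politics government election parliament policy",
}

# Small illustrative sample of ICD9 codes and descriptions.
ICD9 = {
    "487": "influenza",
    "001": "cholera",
    "084": "malaria",
}

def classify(article):
    """Step 1: return the topic whose description is most similar to the article."""
    vec = embed(article)
    return max(TOPICS, key=lambda t: cosine(vec, embed(TOPICS[t])))

def top_k_icd9(article, k=10):
    """Step 2: rank ICD9 entries by cosine similarity and keep the top k."""
    vec = embed(article)
    scored = [(code, desc, cosine(vec, embed(desc))) for code, desc in ICD9.items()]
    scored.sort(key=lambda row: row[2], reverse=True)
    return scored[:k]

def analyze(article, k=10):
    """Run classification, then the ICD9 search only for health articles."""
    topic = classify(article)
    codes = top_k_icd9(article, k) if topic == "health" else []
    return topic, codes

topic, codes = analyze("Hospital reports influenza outbreak as virus spreads", k=2)
print(topic, [code for code, _, _ in codes])
```

With a real sentence encoder, only `embed` and `cosine` would change (dense vectors instead of word counts); the control flow of classifying first and searching ICD9 codes only for health articles stays the same.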

Demo Interface:

  • Input: Textbox for entering the news article.
  • Output: Predicted topic and, if the topic is "health", the top 10 most similar ICD9 codes.
  • Interaction: Button to trigger classification and similarity search.
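
An interface with this layout could be wired up roughly as below. This is a sketch that assumes the demo is built with Gradio, which the description does not state. The `analyze` argument is a hypothetical callable that takes `(article, k)` and returns a topic plus a list of `(code, description, score)` tuples, and the widget labels are illustrative.

```python
def format_rows(matches):
    """Turn (code, description, score) tuples into table rows with rounded scores."""
    return [[code, desc, round(score, 3)] for code, desc, score in matches]

def build_demo(analyze, k=10):
    """Build a Gradio UI around a hypothetical analyze(article, k) callable."""
    import gradio as gr  # imported lazily so format_rows works without Gradio

    def on_click(article):
        topic, matches = analyze(article, k)
        return topic, format_rows(matches)

    with gr.Blocks() as demo:
        article = gr.Textbox(label="News article", lines=8)
        button = gr.Button("Classify")
        topic = gr.Textbox(label="Predicted topic")
        codes = gr.Dataframe(headers=["ICD9 code", "Description", "Similarity"])
        # The button triggers classification and the similarity search.
        button.click(fn=on_click, inputs=article, outputs=[topic, codes])
    return demo
```

Calling `build_demo(analyze).launch()` would then start the app locally, with the table left empty for articles that are not classified as health.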

This tool integrates NLP techniques with health informatics to support disease outbreak prediction and monitoring.
