TrustAlert Health News Classifier and ICD9 Similarity Search
This is a demo from the TrustAlert European project, which aims to predict disease outbreaks from news articles extracted from GDELT.
How the Model Works:
News Article Classification:
- Classifies a news article into one of the predefined IPTC taxonomy topics (e.g., health, business, politics).
- Uses a pre-trained sentence transformer model (all-mpnet-base-v2) and cosine similarity for classification.
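The classification step can be sketched as a nearest-label lookup: embed the article, embed each topic label, and pick the topic with the highest cosine similarity. This is a minimal sketch, not the project's actual code. The 3-dimensional vectors below are toy stand-ins for the embeddings that all-mpnet-base-v2 would produce, and the names `cosine_similarity`, `classify`, and `TOPIC_VECS` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def classify(article_vec, topic_vecs):
    """Return (topic, score) for the topic embedding closest to the article."""
    scores = {topic: cosine_similarity(article_vec, vec)
              for topic, vec in topic_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Assumption: toy vectors standing in for real sentence embeddings.
TOPIC_VECS = {
    "health": [0.9, 0.1, 0.0],
    "business": [0.1, 0.9, 0.1],
    "politics": [0.0, 0.2, 0.9],
}

article_vec = [0.8, 0.2, 0.1]  # pretend embedding of a health-related article
topic, score = classify(article_vec, TOPIC_VECS)
print(topic)  # health
```

In a real setup the vectors would come from the sentence transformer rather than being written by hand; the comparison logic stays the same.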
ICD9 Similarity Search:
- If the article is classified as "health", searches for the top k most similar ICD9 codes (k = 10 in this demo) using cosine similarity.
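The top-k search can be sketched in the same way: score every ICD9 code description against the article and keep the k best. This is an illustrative sketch under stated assumptions. The code vectors are hand-written toy values rather than real embeddings, only three codes are indexed, and the names `top_k_codes` and `CODE_VECS` are hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_codes(query_vec, code_vecs, k=10):
    """Return up to k (code, score) pairs, most similar first."""
    scored = ((cosine_similarity(query_vec, vec), code)
              for code, vec in code_vecs.items())
    return [(code, score) for score, code in heapq.nlargest(k, scored)]


# Assumption: toy vectors standing in for embedded ICD9 descriptions.
CODE_VECS = {
    "001 Cholera": [0.9, 0.1, 0.2],
    "487 Influenza": [0.2, 0.9, 0.1],
    "250 Diabetes mellitus": [0.1, 0.1, 0.9],
}

query_vec = [0.3, 0.8, 0.1]  # pretend embedding of an article about flu
results = top_k_codes(query_vec, CODE_VECS, k=2)
for code, score in results:
    print(code, round(score, 3))
```

`heapq.nlargest` avoids sorting the full list when k is much smaller than the number of codes. For a full ICD9 table, precomputing the code embeddings once would likely matter more than this choice.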
Demo Interface:
- Input: Textbox for entering the news article.
- Output: Predicted topic and, when the topic is "health", the top 10 most similar ICD9 codes.
- Interaction: Button to trigger classification and similarity search.
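The function behind the button could look roughly like the sketch below: it classifies the text and runs the ICD9 search only when the topic is "health". The name `run_demo` and the two stand-in functions are hypothetical placeholders for the real classifier and search, used here only to keep the example self-contained.

```python
def run_demo(article, classify_fn, search_fn, k=10):
    """Return (topic, codes); codes is empty unless the topic is 'health'."""
    topic = classify_fn(article)
    codes = search_fn(article, k) if topic == "health" else []
    return topic, codes


# Hypothetical stand-ins for the embedding-based classifier and search.
def fake_classify(text):
    return "health" if "outbreak" in text.lower() else "business"


def fake_search(text, k):
    return ["001 Cholera", "487 Influenza"][:k]


health_result = run_demo("Cholera outbreak reported in the region",
                         fake_classify, fake_search)
other_result = run_demo("Stock markets rallied on Tuesday",
                        fake_classify, fake_search)
print(health_result)
print(other_result)
```

A UI toolkit would call a function of this shape when the button is pressed and render the two return values in the output fields.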
This tool combines NLP techniques with health informatics to support disease outbreak prediction and monitoring.
Examples