Talha Ijlal

Graph Theory / NLP, Semester Project

Graph-Based Text Classifier

Takes three topics from the user, fetches 15 Medium articles per topic by querying Medium's internal GraphQL API, cleans and tokenizes the content, converts each article into a graph, and runs network analysis, including spanning trees and centrality metrics. Results are visualized with Gravis.

Context: Graph Theory Course, UET Lahore
Technologies: Python, GraphQL, NetworkX, Gravis, NLTK, BeautifulSoup

How It Works

  1. Accept three topics from the user as input.
  2. Query Medium's internal GraphQL API to fetch 15 articles per topic (45 total).
  3. Clean and tokenize article text, removing stopwords, punctuation, and low-frequency terms.
  4. Build a graph per article where nodes are tokens and edges represent co-occurrence within a window.
  5. Run network analysis: minimum spanning tree, degree centrality, betweenness centrality.
  6. Visualize graphs and spanning trees interactively using Gravis.
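
Steps 3 to 5 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`tokenize`, `build_cooccurrence_graph`, `analyze`), the tiny stopword set, the window size of 3, and the use of raw co-occurrence counts as edge weights are all assumptions made for the example. The project lists NLTK for text processing; a plain regex stands in for it here so the sketch runs without downloading corpora.

```python
import re
from collections import Counter

import networkx as nx

# Tiny illustrative stopword set (assumption); NLTK's English list is far larger.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "are", "on", "for", "with"}


def tokenize(text, min_count=1):
    """Lowercase, strip punctuation, drop stopwords and terms rarer than min_count."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    counts = Counter(words)
    return [w for w in words if counts[w] >= min_count]


def build_cooccurrence_graph(tokens, window=3):
    """Nodes are tokens; an edge links tokens that fall inside the same sliding window.

    The edge weight counts how often the pair co-occurs (an assumed weighting).
    """
    graph = nx.Graph()
    graph.add_nodes_from(tokens)
    for i, source in enumerate(tokens):
        # Each token is paired with the next (window - 1) tokens.
        for target in tokens[i + 1 : i + window]:
            if source == target:
                continue  # skip self-loops from repeated words
            if graph.has_edge(source, target):
                graph[source][target]["weight"] += 1
            else:
                graph.add_edge(source, target, weight=1)
    return graph


def analyze(graph):
    """Compute the spanning tree and the two centrality metrics named in step 5."""
    return {
        # Returns a spanning forest if the graph is disconnected.
        "mst": nx.minimum_spanning_tree(graph, weight="weight"),
        "degree": nx.degree_centrality(graph),
        "betweenness": nx.betweenness_centrality(graph),
    }


text = "Graphs model text. Graph theory links words, and words form graphs."
tokens = tokenize(text)
graph = build_cooccurrence_graph(tokens, window=3)
result = analyze(graph)
```

With counts as weights, a minimum spanning tree favors rarely co-occurring pairs. If the tree should keep the strongest associations instead, `nx.maximum_spanning_tree` or inverted weights would be the natural alternative; which one the project used is not stated here.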

A key engineering detail: Medium does not expose a public API. Reverse-engineering its internal GraphQL schema and constructing valid queries was a core part of the project.
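
To show the general shape of such a request, here is a sketch of how a GraphQL payload is typically assembled: a JSON body with `query`, `variables`, and an optional `operationName`. Everything specific in it is hypothetical. The operation name `TopicArticles`, the `search` field, and its arguments are placeholders invented for illustration and do not reflect Medium's actual schema, and the sketch deliberately builds the payloads without sending them anywhere.

```python
import json

# Hypothetical query: operation and field names are placeholders,
# NOT Medium's real (undocumented) schema.
SEARCH_QUERY = """
query TopicArticles($topic: String!, $limit: Int!) {
  search(query: $topic, first: $limit) {
    title
    url
  }
}
"""


def build_payload(topic, limit=15):
    """Build one GraphQL-over-HTTP request body for a single topic."""
    return {
        "operationName": "TopicArticles",
        "query": SEARCH_QUERY,
        "variables": {"topic": topic, "limit": limit},
    }


def build_payloads(topics, limit=15):
    """One payload per topic: three topics at 15 articles each gives 45 in total."""
    return [build_payload(topic, limit) for topic in topics]


payloads = build_payloads(["graph theory", "nlp", "python"])
body = json.dumps(payloads[0])  # the string that would be POSTed as the request body
```

Keeping the topic and limit in `variables` rather than formatting them into the query string avoids escaping problems with user-supplied topics.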