← Projects
UC Berkeley
Software2025

NGordnet — WordNet Hyponym Explorer

Web-based linguistic analysis tool built in Java as part of UC Berkeley's CS61B Data Structures course. Uses graph traversal and historical language data to explore semantic relationships between words.

NGordnet combines two large datasets: Princeton WordNet (semantic relationships between words) and Google N-Gram historical word frequencies. The system allows users to search for words and explore semantic relationships such as hyponyms — more specific versions of a word — while optionally ranking results by historical usage frequency.

For example, a query for the word change returns related hyponyms such as alteration, increase, jump, leap, modification, transition, and variation — ranked using historical language usage data from the Google Books corpus.

NGordnet hyponym query example

The Challenge

The goal of the project was to extend an existing N-Gram analysis tool with semantic graph queries. Given one or more words, the system must:

  • Find all matching WordNet synsets
  • Traverse a directed graph of semantic relationships
  • Return all reachable hyponyms
  • Compute intersections for multi-word queries
  • Rank results using historical frequency data
  • Return the top-k most frequently used words

The implementation needed to handle datasets containing tens of thousands of words and relationships efficiently.

Technical Implementation

The project is built around three primary components:

Graph

A custom directed graph implementation using adjacency lists. Supports graph construction, efficient neighbor lookup, Depth-First Search (DFS), and reachability queries.

WordNet

Parses synset files and hyponym relationship files. Builds mappings between word → synset IDs and synset ID → words, and exposes APIs for semantic lookups.

HyponymsHandler

The web-facing query handler. Responsible for processing user requests, computing graph traversals, finding intersections between multiple word queries, integrating N-Gram frequency rankings, and returning sorted results.

What I Learned

This project strengthened my understanding of:

  • Graph theory and directed graph design
  • Depth-First Search (DFS) traversal
  • Large-scale data parsing
  • HashMaps and adjacency-list representations
  • Algorithm design
  • Software architecture
  • Building APIs around complex data structures

Key Contributions

  • Implemented directed graph data structures from scratch
  • Built graph traversal algorithms for semantic search
  • Parsed and integrated large WordNet datasets
  • Added frequency-based ranking using N-Gram data
  • Implemented multi-word query intersections
  • Developed backend query handling in Java

Stack

JavaGraph TheoryDepth-First Search (DFS)WordNetGoogle N-GramsData StructuresAlgorithmsObject-Oriented DesignWeb BackendGit