Software2025

NGordnet — WordNet Hyponym Explorer

Web-based linguistic analysis tool built in Java as part of UC Berkeley's CS61B Data Structures course. Uses graph traversal and historical language data to explore semantic relationships between words.

GitHub

NGordnet combines two large datasets: Princeton WordNet (semantic relationships between words) and Google N-Gram historical word frequencies. The system allows users to search for words and explore semantic relationships such as hyponyms — more specific versions of a word — while optionally ranking results by historical usage frequency.

For example, a query for the word change returns related hyponyms such as alteration, increase, jump, leap, modification, transition, and variation — ranked using historical language usage data from the Google Books corpus.

The Challenge

The goal of the project was to extend an existing N-Gram analysis tool with semantic graph queries. Given one or more words, the system must:

Find all matching WordNet synsets
Traverse a directed graph of semantic relationships
Return all reachable hyponyms
Compute intersections for multi-word queries
Rank results using historical frequency data
Return the top-k most frequently used words

The implementation needed to handle datasets containing tens of thousands of words and relationships efficiently.

Technical Implementation

The project is built around three primary components:

Graph

A custom directed graph implementation using adjacency lists. Supports graph construction, efficient neighbor lookup, Depth-First Search (DFS), and reachability queries.

WordNet

Parses synset files and hyponym relationship files. Builds mappings between word → synset IDs and synset ID → words, and exposes APIs for semantic lookups.

HyponymsHandler

The web-facing query handler. Responsible for processing user requests, computing graph traversals, finding intersections between multiple word queries, integrating N-Gram frequency rankings, and returning sorted results.

What I Learned

This project strengthened my understanding of:

Graph theory and directed graph design
Depth-First Search (DFS) traversal
Large-scale data parsing
HashMaps and adjacency-list representations
Algorithm design
Software architecture
Building APIs around complex data structures

Key Contributions

Implemented directed graph data structures from scratch
Built graph traversal algorithms for semantic search
Parsed and integrated large WordNet datasets
Added frequency-based ranking using N-Gram data
Implemented multi-word query intersections
Developed backend query handling in Java

Stack

JavaGraph TheoryDepth-First Search (DFS)WordNetGoogle N-GramsData StructuresAlgorithmsObject-Oriented DesignWeb BackendGit

GitHub →