Files

Alexander Zinn 74fd80f91c Initial Commit - Sonnet 4.5

2025-11-22 18:18:23 -05:00

13 KiB

Raw Blame History

HashMap Implementation - Technical Documentation

Overview

This is a production-ready HashMap implementation in TypeScript that strictly follows OOP SOLID principles and best practices. The implementation uses separate chaining for collision resolution and provides automatic resizing based on load factor.

SOLID Principles Implementation

1. Single Responsibility Principle (SRP)

Each class has one clearly defined responsibility:

`HashMap` (`src/core/HashMap.ts`)

Responsibility: Managing the hash table and coordinating operations
Single Purpose: Provide efficient key-value storage and retrieval

`HashNode` (`src/models/HashNode.ts`)

Responsibility: Storing a single key-value pair and linking to the next node
Single Purpose: Data container for collision chains

`DefaultHashFunction` (`src/hash-functions/DefaultHashFunction.ts`)

Responsibility: Computing hash values for keys
Single Purpose: Convert keys to bucket indices

`NumericHashFunction` (`src/hash-functions/NumericHashFunction.ts`)

Responsibility: Optimized hashing for numeric keys
Single Purpose: Provide better distribution for numeric data

2. Open/Closed Principle (OCP)

Open for Extension, Closed for Modification

The implementation is extensible without modifying core code:

// Extend functionality by providing custom hash functions
class CustomHashFunction implements IHashFunction<string> {
  hash(key: string, capacity: number): number {
    // Custom hashing logic
    return /* computed hash */;
  }
}

// Use custom function without modifying HashMap
const map = new HashMap<string, number>(16, 0.75, new CustomHashFunction());

Key Design Decisions:

Hash function is injected via constructor (dependency injection)
New hash strategies can be added without changing HashMap
Generic types allow any key/value types without modification

3. Liskov Substitution Principle (LSP)

Subtypes must be substitutable for their base types

All implementations properly implement their interfaces:

// Any IHashFunction can replace another
function createMap<K, V>(hashFn: IHashFunction<K>): IHashMap<K, V> {
  return new HashMap<K, V>(16, 0.75, hashFn);
}

// All these work identically
const map1 = createMap(new DefaultHashFunction());
const map2 = createMap(new NumericHashFunction());
const map3 = createMap(new CustomHashFunction());

Guarantees:

All IHashFunction implementations provide correct hash values
HashMap correctly implements IHashMap interface
No unexpected behavior when substituting implementations

4. Interface Segregation Principle (ISP)

Clients shouldn't depend on interfaces they don't use

The codebase provides focused, minimal interfaces:

`IHashFunction<K>`

interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}

Single method interface
Only requires hash computation
No unnecessary methods

`IHashMap<K, V>`

interface IHashMap<K, V> {
  set(key: K, value: V): void;
  get(key: K): V | undefined;
  has(key: K): boolean;
  delete(key: K): boolean;
  clear(): void;
  // ... iterator methods
}

Focused on map operations
No coupling to hashing details
Clean separation of concerns

5. Dependency Inversion Principle (DIP)

Depend on abstractions, not concretions

High-level modules depend on abstractions:

export class HashMap<K, V> implements IHashMap<K, V> {
  private readonly hashFunction: IHashFunction<K>;  // Depends on abstraction
  
  constructor(
    initialCapacity: number = 16,
    loadFactorThreshold: number = 0.75,
    hashFunction?: IHashFunction<K>  // Inject dependency
  ) {
    this.hashFunction = hashFunction ?? new DefaultHashFunction<K>();
  }
}

Benefits:

HashMap doesn't depend on concrete hash implementations
Easy to test with mock hash functions
Can swap hash strategies at runtime
Follows Dependency Injection pattern

Architecture

Directory Structure

src/
├── core/                          # Core implementations
│   └── HashMap.ts                 # Main HashMap class
├── interfaces/                    # Contracts and abstractions
│   ├── IHashFunction.ts           # Hash function interface
│   └── IHashMap.ts                # HashMap interface
├── models/                        # Data structures
│   └── HashNode.ts                # Collision chain node
├── hash-functions/                # Hashing strategies
│   ├── DefaultHashFunction.ts     # General-purpose hashing
│   └── NumericHashFunction.ts     # Numeric optimization
├── examples/                      # Usage demonstrations
│   ├── basic-usage.ts
│   └── custom-hash-function.ts
└── index.ts                       # Public API exports

Design Patterns Used

1. Strategy Pattern

Where: Hash function selection
Why: Allows different hashing algorithms to be plugged in
Implementation: IHashFunction interface with multiple implementations

2. Iterator Pattern

Where: keys(), values(), entries() methods
Why: Provides consistent way to traverse the collection
Implementation: Generator functions with IterableIterator<T>

3. Dependency Injection

Where: Constructor accepts IHashFunction
Why: Decouples HashMap from specific hash implementations
Implementation: Constructor parameter with default

Data Structure Design

Collision Resolution: Separate Chaining

Buckets Array:
[0] -> Node(k1, v1) -> Node(k2, v2) -> null
[1] -> null
[2] -> Node(k3, v3) -> null
[3] -> Node(k4, v4) -> Node(k5, v5) -> Node(k6, v6) -> null
...

Advantages:

Simple to implement
No clustering issues
Can handle high load factors
Dynamic growth with chains

Trade-offs:

Extra memory for node references
Cache locality could be better
O(n) worst-case for long chains

Load Factor and Resizing

Default Configuration:

Initial Capacity: 16 buckets
Load Factor Threshold: 0.75

Resizing Strategy:

if (size / capacity >= loadFactorThreshold) {
  resize(capacity * 2);  // Double the capacity
}

Why 0.75?

Good balance between space and time
Keeps chains short on average
Industry standard (used by Java HashMap)

Performance Characteristics

Time Complexity

Operation	Average Case	Worst Case	Notes
`set(k, v)`	O(1)	O(n)	Worst case if all keys hash to same bucket
`get(k)`	O(1)	O(n)	Requires traversing collision chain
`has(k)`	O(1)	O(n)	Same as get
`delete(k)`	O(1)	O(n)	Requires finding and unlinking node
`clear()`	O(capacity)	O(capacity)	Must null all bucket references
`keys()`	O(n)	O(n)	Must visit all entries
`values()`	O(n)	O(n)	Must visit all entries
`entries()`	O(n)	O(n)	Must visit all entries

Space Complexity

Storage: O(n) where n is number of entries
Overhead: O(capacity) for buckets array
Per Entry: Constant overhead for HashNode

Load Factor Impact

Load Factor = size / capacity

Low Load Factor (< 0.5):
✓ Fewer collisions
✓ Faster operations
✗ Wastes memory

High Load Factor (> 0.9):
✓ Better memory usage
✗ More collisions
✗ Slower operations

Optimal (0.75):
✓ Good balance
✓ Reasonable memory usage
✓ Good performance

Best Practices Demonstrated

1. Type Safety

// Full generic support
const map = new HashMap<string, User>();  // Type-safe
map.set("id", user);  // ✓ Correct
map.set(123, user);   // ✗ Type error

2. Immutability Where Appropriate

// Read-only properties
private readonly hashFunction: IHashFunction<K>;
private readonly loadFactorThreshold: number;
private readonly initialCapacity: number;

3. Defensive Programming

// Validate constructor arguments
if (initialCapacity <= 0) {
  throw new Error("Initial capacity must be positive");
}
if (loadFactorThreshold <= 0 || loadFactorThreshold > 1) {
  throw new Error("Load factor must be between 0 and 1");
}

4. Clear Documentation

Every public method documented with JSDoc
Time complexity noted in comments
Usage examples provided

5. Comprehensive Testing

32 test cases covering all functionality
Edge cases (null, undefined, empty strings)
Performance tests (1000 entries)
Custom hash function tests

6. Iterator Support

// Makes HashMap usable in for...of loops
[Symbol.iterator](): IterableIterator<[K, V]> {
  return this.entries();
}

// Usage
for (const [key, value] of map) {
  console.log(key, value);
}

7. Separation of Concerns

Hashing logic separated from storage logic
Node structure separated from HashMap
Interfaces defined separately from implementations

Advanced Features

1. Custom Hash Functions

Create domain-specific hash functions:

// Case-insensitive string keys
class CaseInsensitiveHash implements IHashFunction<string> {
  hash(key: string, capacity: number): number {
    return computeHash(key.toLowerCase(), capacity);
  }
}

// Composite object keys
class PersonHashFunction implements IHashFunction<Person> {
  hash(person: Person, capacity: number): number {
    const str = `${person.firstName}:${person.lastName}:${person.age}`;
    return computeHash(str, capacity);
  }
}

2. Performance Monitoring

const map = new HashMap<string, number>();

// Monitor internal state
console.log(`Capacity: ${map.capacity}`);
console.log(`Size: ${map.size}`);
console.log(`Load Factor: ${map.loadFactor}`);

3. Bulk Operations

// Efficient bulk insertion
const entries: [string, number][] = [
  ["a", 1], ["b", 2], ["c", 3]
];

for (const [key, value] of entries) {
  map.set(key, value);
}

Testing Strategy

Test Coverage

bun test

Coverage Breakdown:

Core HashMap: 100% function/line coverage
Hash Functions: 66-87% (edge cases for special values)
Overall: 92% line coverage

Test Categories

Constructor Tests
- Default initialization
- Custom parameters
- Invalid input validation
Basic Operations
- Set/Get/Has/Delete
- Update existing values
- Non-existent keys
Iteration Tests
- Keys iterator
- Values iterator
- Entries iterator
- forEach callback
- for...of loops
Resizing Tests
- Automatic growth
- Data preservation
- Load factor triggers
Edge Cases
- Null values
- Undefined values
- Empty string keys
- Large datasets (1000 entries)
Custom Hash Functions
- NumericHashFunction
- Custom implementations

Usage Examples

Basic Usage

const scores = new HashMap<string, number>();
scores.set("Alice", 95);
scores.set("Bob", 87);
console.log(scores.get("Alice")); // 95

With TypeScript Interfaces

interface Product {
  id: number;
  name: string;
  price: number;
}

const products = new HashMap<number, Product>();
products.set(1, { id: 1, name: "Widget", price: 9.99 });

Custom Configuration

const map = new HashMap<string, number>(
  32,              // Initial capacity
  0.8,             // Load factor threshold
  customHashFn     // Custom hash function
);

Comparison with Native Map

Advantages of This Implementation

Educational Value: Shows internal workings
Customizable: Inject custom hash functions
Observable: Can monitor capacity and load factor
Extensible: Easy to add new features

Native Map Advantages

Performance: Highly optimized in V8/JSC
Battle-tested: Used in production worldwide
Standard API: Consistent across platforms

When to Use Each

Use HashMap (this implementation):

Learning data structures
Need custom hash functions
Want to understand internals
Require specific behavior

Use Native Map:

Production applications
Performance critical paths
Standard use cases
Browser compatibility needs

Future Enhancements

Possible improvements while maintaining SOLID principles:

Additional Hash Functions
- CryptoHashFunction (secure hashing)
- IdentityHashFunction (reference equality)
Performance Optimizations
- Red-black tree for long chains (like Java 8+)
- Dynamic shrinking on deletions
Additional Features
- Weak key references
- Computed values (getOrCompute)
- Batch operations
Observability
- Event listeners for changes
- Statistics tracking
- Performance metrics

Conclusion

This HashMap implementation demonstrates how to build a production-quality data structure while adhering to SOLID principles. The clean architecture makes it maintainable, testable, and extensible. It serves as both a practical tool and an educational resource for understanding hash tables and object-oriented design.

13 KiB Raw Blame History