# HashMap Implementation - Technical Documentation

## Overview

This is a production-ready HashMap implementation in TypeScript that follows the SOLID principles of object-oriented design. It resolves collisions with separate chaining and resizes automatically when the load factor reaches a configurable threshold.

## SOLID Principles Implementation

### 1. Single Responsibility Principle (SRP)

Each class has one clearly defined responsibility:

#### `HashMap` (`src/core/HashMap.ts`)
- **Responsibility**: Managing the hash table and coordinating operations
- **Single Purpose**: Provide efficient key-value storage and retrieval

#### `HashNode` (`src/models/HashNode.ts`)
- **Responsibility**: Storing a single key-value pair and linking to the next node
- **Single Purpose**: Data container for collision chains

#### `DefaultHashFunction` (`src/hash-functions/DefaultHashFunction.ts`)
- **Responsibility**: Computing hash values for keys
- **Single Purpose**: Convert keys to bucket indices

#### `NumericHashFunction` (`src/hash-functions/NumericHashFunction.ts`)
- **Responsibility**: Optimized hashing for numeric keys
- **Single Purpose**: Provide better distribution for numeric data

### 2. Open/Closed Principle (OCP)

**Open for Extension, Closed for Modification**

The implementation is extensible without modifying core code:

```typescript
// Extend functionality by providing custom hash functions
class CustomHashFunction implements IHashFunction<string> {
  hash(key: string, capacity: number): number {
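    // Illustrative logic: sum the character codes, then wrap into [0, capacity)
    let sum = 0;
    for (let i = 0; i < key.length; i++) sum += key.charCodeAt(i);
    return sum % capacity;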
  }
}

// Use custom function without modifying HashMap
const map = new HashMap<string, number>(16, 0.75, new CustomHashFunction());
```

**Key Design Decisions:**
- Hash function is injected via constructor (dependency injection)
- New hash strategies can be added without changing HashMap
- Generic types allow any key/value types without modification

### 3. Liskov Substitution Principle (LSP)

**Subtypes must be substitutable for their base types**

All implementations properly implement their interfaces:

```typescript
// Any IHashFunction can replace another
function createMap<K, V>(hashFn: IHashFunction<K>): IHashMap<K, V> {
  return new HashMap<K, V>(16, 0.75, hashFn);
}

// Each of these satisfies the same contract, so createMap accepts any of them
const map1 = createMap(new DefaultHashFunction());
const map2 = createMap(new NumericHashFunction());
const map3 = createMap(new CustomHashFunction());
```

**Guarantees:**
- Every `IHashFunction` implementation returns a valid bucket index for the given capacity, and the same index for the same key
- HashMap correctly implements IHashMap interface
- No unexpected behavior when substituting implementations
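
One way to make these guarantees checkable is a small contract test that any `IHashFunction` can be run through. The sketch below assumes the contract is "return an integer in `[0, capacity)` and return the same index for the same key"; `checkHashContract`, `lengthHash`, and `brokenHash` are hypothetical names used only for illustration, not part of the library.

```typescript
// Interface as shown in this document.
interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}

// Hypothetical helper: returns a list of contract violations (empty if none).
function checkHashContract<K>(
  hashFn: IHashFunction<K>,
  sampleKeys: K[],
  capacity: number
): string[] {
  const problems: string[] = [];
  for (const key of sampleKeys) {
    const first = hashFn.hash(key, capacity);
    const second = hashFn.hash(key, capacity);
    // Assumed rule 1: the result is an integer index inside the bucket array.
    if (!Number.isInteger(first) || first < 0 || first >= capacity) {
      problems.push(`index ${first} for key ${String(key)} is outside [0, ${capacity})`);
    }
    // Assumed rule 2: hashing the same key twice gives the same index.
    if (first !== second) {
      problems.push(`hash for key ${String(key)} is not deterministic`);
    }
  }
  return problems;
}

// Illustrative implementations, not the library's own classes.
const lengthHash: IHashFunction<string> = {
  hash: (key, capacity) => key.length % capacity,
};
const brokenHash: IHashFunction<string> = {
  hash: (key) => key.length, // ignores capacity, so it can point past the last bucket
};

const okProblems = checkHashContract(lengthHash, ["", "a", "hello world"], 4);
const badProblems = checkHashContract(brokenHash, ["hello world"], 4);
```

Running the same check against every hash function in a test suite is a cheap way to confirm that substituting one for another cannot produce an out-of-range bucket index.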

### 4. Interface Segregation Principle (ISP)

**Clients shouldn't depend on interfaces they don't use**

The codebase provides focused, minimal interfaces:

#### `IHashFunction<K>`
```typescript
interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}
```
- Single method interface
- Only requires hash computation
- No unnecessary methods

#### `IHashMap<K, V>`
```typescript
interface IHashMap<K, V> {
  set(key: K, value: V): void;
  get(key: K): V | undefined;
  has(key: K): boolean;
  delete(key: K): boolean;
  clear(): void;
  // ... iterator methods
}
```
- Focused on map operations
- No coupling to hashing details
- Clean separation of concerns

### 5. Dependency Inversion Principle (DIP)

**Depend on abstractions, not concretions**

High-level modules depend on abstractions:

```typescript
export class HashMap<K, V> implements IHashMap<K, V> {
  private readonly hashFunction: IHashFunction<K>;  // Depends on abstraction

  constructor(
    initialCapacity: number = 16,
    loadFactorThreshold: number = 0.75,
    hashFunction?: IHashFunction<K>  // Inject dependency
  ) {
    this.hashFunction = hashFunction ?? new DefaultHashFunction<K>();
  }
}
```

**Benefits:**
- HashMap doesn't depend on concrete hash implementations
- Easy to test with mock hash functions
- The hash strategy is chosen per instance at construction time (the field is `readonly`, so it cannot change afterwards)
- Follows Dependency Injection pattern

## Architecture

### Directory Structure

```
src/
├── core/                          # Core implementations
│   └── HashMap.ts                 # Main HashMap class
├── interfaces/                    # Contracts and abstractions
│   ├── IHashFunction.ts           # Hash function interface
│   └── IHashMap.ts                # HashMap interface
├── models/                        # Data structures
│   └── HashNode.ts                # Collision chain node
├── hash-functions/                # Hashing strategies
│   ├── DefaultHashFunction.ts     # General-purpose hashing
│   └── NumericHashFunction.ts     # Numeric optimization
├── examples/                      # Usage demonstrations
│   ├── basic-usage.ts
│   └── custom-hash-function.ts
└── index.ts                       # Public API exports
```

### Design Patterns Used

#### 1. Strategy Pattern
- **Where**: Hash function selection
- **Why**: Allows different hashing algorithms to be plugged in
- **Implementation**: `IHashFunction` interface with multiple implementations

#### 2. Iterator Pattern
- **Where**: `keys()`, `values()`, `entries()` methods
- **Why**: Provides consistent way to traverse the collection
- **Implementation**: Generator functions with `IterableIterator<T>`
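
As a rough illustration of the generator-based approach, the sketch below walks an array of bucket chains and yields entries lazily. `ChainNode` and `BucketWalker` are hypothetical stand-ins for the real `HashNode` and `HashMap`, whose internals are not shown in this document.

```typescript
// Assumed node shape: one key-value pair plus a link to the next node.
interface ChainNode<K, V> {
  key: K;
  value: V;
  next: ChainNode<K, V> | null;
}

// Hypothetical container that only demonstrates traversal.
class BucketWalker<K, V> implements Iterable<[K, V]> {
  private buckets: Array<ChainNode<K, V> | null>;

  constructor(buckets: Array<ChainNode<K, V> | null>) {
    this.buckets = buckets;
  }

  // Outer loop over buckets, inner loop along each chain; nothing is copied.
  *entries(): IterableIterator<[K, V]> {
    for (const head of this.buckets) {
      for (let node = head; node !== null; node = node.next) {
        yield [node.key, node.value];
      }
    }
  }

  // keys() and values() reuse entries() so the traversal logic lives in one place.
  *keys(): IterableIterator<K> {
    for (const [key] of this.entries()) yield key;
  }

  *values(): IterableIterator<V> {
    for (const [, value] of this.entries()) yield value;
  }

  [Symbol.iterator](): IterableIterator<[K, V]> {
    return this.entries();
  }
}

const walker = new BucketWalker<string, number>([
  { key: "a", value: 1, next: { key: "b", value: 2, next: null } },
  null,
  { key: "c", value: 3, next: null },
]);

const walkedEntries = [...walker];
const walkedKeys = [...walker.keys()];
const walkedValues = [...walker.values()];
```

Because the generators are lazy, a caller that stops early (for example with `break` in a `for...of` loop) never visits the remaining buckets.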

#### 3. Dependency Injection
- **Where**: Constructor accepts `IHashFunction`
- **Why**: Decouples HashMap from specific hash implementations
- **Implementation**: Constructor parameter with default

### Data Structure Design

#### Collision Resolution: Separate Chaining

```
Buckets Array:
[0] -> Node(k1, v1) -> Node(k2, v2) -> null
[1] -> null
[2] -> Node(k3, v3) -> null
[3] -> Node(k4, v4) -> Node(k5, v5) -> Node(k6, v6) -> null
...
```

**Advantages:**
- Simple to implement
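- Avoids the primary clustering that open addressing with linear probing suffers from
- Degrades gradually as the load factor grows, rather than failing when the table fills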
- Dynamic growth with chains

**Trade-offs:**
- Extra memory for node references
- Poor cache locality, since chain nodes are separate heap objects
- O(n) worst-case for long chains
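
The sketch below shows how separate chaining might look in code, under some assumptions: keys are compared with `===`, new nodes are prepended to their chain, and there is no resizing. `ChainNode` and `ChainedMap` are hypothetical names; the real `HashNode` and `HashMap` may differ in every one of these details.

```typescript
// Interface as shown in this document.
interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}

// Assumed node layout: one key-value pair plus a link to the next node.
class ChainNode<K, V> {
  key: K;
  value: V;
  next: ChainNode<K, V> | null;

  constructor(key: K, value: V, next: ChainNode<K, V> | null) {
    this.key = key;
    this.value = value;
    this.next = next;
  }
}

// Hypothetical fixed-capacity map: chaining only, no resizing.
class ChainedMap<K, V> {
  size = 0;
  private buckets: Array<ChainNode<K, V> | null>;
  private hashFn: IHashFunction<K>;

  constructor(capacity: number, hashFn: IHashFunction<K>) {
    this.buckets = new Array<ChainNode<K, V> | null>(capacity).fill(null);
    this.hashFn = hashFn;
  }

  // Walks the key's bucket chain until the key is found (assumes === equality).
  private find(key: K): ChainNode<K, V> | null {
    const index = this.hashFn.hash(key, this.buckets.length);
    let node = this.buckets[index] ?? null;
    while (node !== null && node.key !== key) node = node.next;
    return node;
  }

  set(key: K, value: V): void {
    const existing = this.find(key);
    if (existing !== null) {
      existing.value = value; // key already present: update in place
      return;
    }
    const index = this.hashFn.hash(key, this.buckets.length);
    // New key: prepend to the chain, so the link step itself is O(1).
    this.buckets[index] = new ChainNode(key, value, this.buckets[index] ?? null);
    this.size++;
  }

  get(key: K): V | undefined {
    return this.find(key)?.value;
  }

  delete(key: K): boolean {
    const index = this.hashFn.hash(key, this.buckets.length);
    let prev: ChainNode<K, V> | null = null;
    let node = this.buckets[index] ?? null;
    while (node !== null) {
      if (node.key === key) {
        // Unlink: either move the bucket head or bypass the node.
        if (prev === null) this.buckets[index] = node.next;
        else prev.next = node.next;
        this.size--;
        return true;
      }
      prev = node;
      node = node.next;
    }
    return false;
  }

  // Number of nodes in one bucket, exposed only to make collisions visible.
  chainLength(index: number): number {
    let count = 0;
    for (let node = this.buckets[index] ?? null; node !== null; node = node.next) count++;
    return count;
  }
}

// A constant hash function forces every key into bucket 0 (the O(n) worst case).
const allCollide: IHashFunction<string> = { hash: () => 0 };
const demo = new ChainedMap<string, number>(4, allCollide);
demo.set("k1", 1);
demo.set("k2", 2);
demo.set("k3", 3);
demo.set("k2", 20); // update, not a new node
```

Injecting a constant hash function like `allCollide` is also a convenient way to exercise the collision path deliberately in tests.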

#### Load Factor and Resizing

**Default Configuration:**
- Initial Capacity: 16 buckets
- Load Factor Threshold: 0.75

**Resizing Strategy:**
```typescript
if (size / capacity >= loadFactorThreshold) {
  resize(capacity * 2);  // Double the capacity
}
```

**Why 0.75?**
- Good balance between space and time
- Keeps chains short on average
- Widely used default (Java's `HashMap` uses the same value)
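
To make the arithmetic concrete: with the defaults, 11 / 16 = 0.6875 is below the threshold and 12 / 16 = 0.75 meets it, so the resize condition first holds at 12 entries. The sketch below shows that check and a rehash step. It is a simplification that stores each bucket as an array of `[key, value]` pairs instead of linked nodes; `needsResize` and `rehash` are hypothetical names, and whether the real implementation checks before or after an insertion is not stated here.

```typescript
// Interface as shown in this document.
interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}

// Same condition as the resizing snippet above.
function needsResize(size: number, capacity: number, threshold: number): boolean {
  return size / capacity >= threshold;
}

// Rebuilds the bucket array at a new capacity. Every entry must be rehashed
// because its bucket index depends on the capacity.
function rehash<K, V>(
  buckets: Array<Array<[K, V]>>,
  newCapacity: number,
  hashFn: IHashFunction<K>
): Array<Array<[K, V]>> {
  const next: Array<Array<[K, V]>> = Array.from({ length: newCapacity }, () => []);
  for (const bucket of buckets) {
    for (const entry of bucket) {
      next[hashFn.hash(entry[0], newCapacity)].push(entry);
    }
  }
  return next;
}

// Illustrative hash for non-negative integer keys only.
const modHash: IHashFunction<number> = { hash: (key, capacity) => key % capacity };

// Keys 1, 5 and 9 all collide in bucket 1 when the capacity is 4.
const small: Array<Array<[number, string]>> = [
  [],
  [[1, "one"], [5, "five"], [9, "nine"]],
  [],
  [],
];

// After doubling to 8, key 5 moves to bucket 5 while 1 and 9 stay in bucket 1.
const grown = rehash(small, small.length * 2, modHash);
```

Doubling keeps the amortized cost of insertion constant: each resize is O(n), but the number of insertions between resizes grows with the table.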

## Performance Characteristics

### Time Complexity

| Operation | Average Case | Worst Case | Notes |
|-----------|--------------|------------|-------|
| `set(k, v)` | O(1) amortized | O(n) | An occasional resize costs O(n); worst case also occurs if all keys hash to the same bucket |
| `get(k)` | O(1) | O(n) | Requires traversing collision chain |
| `has(k)` | O(1) | O(n) | Same as get |
| `delete(k)` | O(1) | O(n) | Requires finding and unlinking node |
| `clear()` | O(capacity) | O(capacity) | Must null all bucket references |
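| `keys()` | O(n + capacity) | O(n + capacity) | Visits all entries and scans every bucket, including empty ones |
| `values()` | O(n + capacity) | O(n + capacity) | Visits all entries and scans every bucket, including empty ones |
| `entries()` | O(n + capacity) | O(n + capacity) | Visits all entries and scans every bucket, including empty ones |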

### Space Complexity

- **Storage**: O(n) where n is number of entries
- **Overhead**: O(capacity) for buckets array
- **Per Entry**: Constant overhead for HashNode

### Load Factor Impact

```
Load Factor = size / capacity

Low Load Factor (< 0.5):
✓ Fewer collisions
✓ Faster operations
✗ Wastes memory

High Load Factor (> 0.9):
✓ Better memory usage
✗ More collisions
✗ Slower operations

Default (0.75):
✓ Good balance
✓ Reasonable memory usage
✓ Good performance
```

## Best Practices Demonstrated

### 1. Type Safety
```typescript
// Full generic support
const map = new HashMap<string, User>();  // Type-safe
map.set("id", user);  // ✓ Correct
map.set(123, user);   // ✗ Type error
```

### 2. Immutability Where Appropriate
```typescript
// Read-only properties
private readonly hashFunction: IHashFunction<K>;
private readonly loadFactorThreshold: number;
private readonly initialCapacity: number;
```

### 3. Defensive Programming
```typescript
// Validate constructor arguments
if (initialCapacity <= 0) {
  throw new Error("Initial capacity must be positive");
}
if (loadFactorThreshold <= 0 || loadFactorThreshold > 1) {
  throw new Error("Load factor must be between 0 and 1");
}
```

### 4. Clear Documentation
- Every public method documented with JSDoc
- Time complexity noted in comments
- Usage examples provided

### 5. Comprehensive Testing
- 32 test cases covering all functionality
- Edge cases (null, undefined, empty strings)
- Performance tests (1000 entries)
- Custom hash function tests

### 6. Iterator Support
```typescript
// Makes HashMap usable in for...of loops
[Symbol.iterator](): IterableIterator<[K, V]> {
  return this.entries();
}

// Usage
for (const [key, value] of map) {
  console.log(key, value);
}
```

### 7. Separation of Concerns
- Hashing logic separated from storage logic
- Node structure separated from HashMap
- Interfaces defined separately from implementations

## Advanced Features

### 1. Custom Hash Functions

Create domain-specific hash functions:

```typescript
// computeHash is a shared string-hashing helper that is not defined in this snippet

// Case-insensitive string keys
class CaseInsensitiveHash implements IHashFunction<string> {
  hash(key: string, capacity: number): number {
    return computeHash(key.toLowerCase(), capacity);
  }
}

// Composite object keys
class PersonHashFunction implements IHashFunction<Person> {
  hash(person: Person, capacity: number): number {
    const str = `${person.firstName}:${person.lastName}:${person.age}`;
    return computeHash(str, capacity);
  }
}
```
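
The examples above call `computeHash` without defining it. A minimal sketch of what such a helper could look like follows, assuming a 31-multiplier polynomial string hash reduced modulo the capacity; the actual helper in the codebase may use a different algorithm.

```typescript
// Hypothetical helper: this document calls computeHash but does not show it.
function computeHash(text: string, capacity: number): number {
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    // Math.imul multiplies as 32-bit integers; | 0 keeps the sum in that range too.
    hash = (Math.imul(hash, 31) + text.charCodeAt(i)) | 0;
  }
  // >>> 0 reinterprets the signed 32-bit value as unsigned, so the index is never negative.
  return (hash >>> 0) % capacity;
}

const indexEmpty = computeHash("", 16);
const indexA = computeHash("a", 16);
const indexAb = computeHash("ab", 16);

// Lowercasing before hashing sends both spellings to the same bucket.
const indexAlice = computeHash("Alice".toLowerCase(), 16);
const indexAliceUpper = computeHash("ALICE".toLowerCase(), 16);
```

Note that a hash function only selects the bucket. Whether `"Alice"` and `"ALICE"` end up as the same entry also depends on how `HashMap` compares keys within a chain, which this document does not specify.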

### 2. Performance Monitoring

```typescript
const map = new HashMap<string, number>();

// Monitor internal state
console.log(`Capacity: ${map.capacity}`);
console.log(`Size: ${map.size}`);
console.log(`Load Factor: ${map.loadFactor}`);
```

### 3. Bulk Operations

```typescript
const map = new HashMap<string, number>();

// Bulk insertion: one set() call per entry
const entries: [string, number][] = [
  ["a", 1], ["b", 2], ["c", 3]
];

for (const [key, value] of entries) {
  map.set(key, value);
}
```

## Testing Strategy

### Test Coverage

```bash
bun test
```

**Coverage Breakdown:**
- Core HashMap: 100% function/line coverage
- Hash Functions: 66-87% (the uncovered lines are edge-case branches for special values)
- Overall: 92% line coverage

### Test Categories

1. **Constructor Tests**
   - Default initialization
   - Custom parameters
   - Invalid input validation

2. **Basic Operations**
   - Set/Get/Has/Delete
   - Update existing values
   - Non-existent keys

3. **Iteration Tests**
   - Keys iterator
   - Values iterator
   - Entries iterator
   - forEach callback
   - for...of loops

4. **Resizing Tests**
   - Automatic growth
   - Data preservation
   - Load factor triggers

5. **Edge Cases**
   - Null values
   - Undefined values
   - Empty string keys
   - Large datasets (1000 entries)

6. **Custom Hash Functions**
   - NumericHashFunction
   - Custom implementations

## Usage Examples

### Basic Usage
```typescript
const scores = new HashMap<string, number>();
scores.set("Alice", 95);
scores.set("Bob", 87);
console.log(scores.get("Alice")); // 95
```

### With TypeScript Interfaces
```typescript
interface Product {
  id: number;
  name: string;
  price: number;
}

const products = new HashMap<number, Product>();
products.set(1, { id: 1, name: "Widget", price: 9.99 });
```

### Custom Configuration
```typescript
const map = new HashMap<string, number>(
  32,              // Initial capacity
  0.8,             // Load factor threshold
  customHashFn     // Custom hash function
);
```

## Comparison with Native Map

### Advantages of This Implementation

1. **Educational Value**: Shows internal workings
2. **Customizable**: Inject custom hash functions
3. **Observable**: Can monitor capacity and load factor
4. **Extensible**: Easy to add new features

### Native Map Advantages

1. **Performance**: Highly optimized in engines such as V8 and JavaScriptCore
2. **Battle-tested**: Used in production worldwide
3. **Standard API**: Consistent across platforms

### When to Use Each

**Use HashMap (this implementation):**
- Learning data structures
- Need custom hash functions
- Want to understand internals
- Require specific behavior

**Use Native Map:**
- Production applications
- Performance critical paths
- Standard use cases
- Browser compatibility needs

## Future Enhancements

Possible improvements while maintaining SOLID principles:

1. **Additional Hash Functions**
   - CryptoHashFunction (secure hashing)
   - IdentityHashFunction (reference equality)

2. **Performance Optimizations**
   - Red-black tree for long chains (like Java 8+)
   - Dynamic shrinking on deletions

3. **Additional Features**
   - Weak key references
   - Computed values (getOrCompute)
   - Batch operations

4. **Observability**
   - Event listeners for changes
   - Statistics tracking
   - Performance metrics
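
As one example of how such a feature could be added without touching `HashMap` itself, the sketch below implements `getOrCompute` as a free function over a narrow interface. `MapLike` and this `getOrCompute` signature are assumptions made for illustration, and a native `Map` stands in for `HashMap` because it offers the same three methods.

```typescript
// Narrow view of the map: only the three methods this helper needs.
interface MapLike<K, V> {
  get(key: K): V | undefined;
  has(key: K): boolean;
  set(key: K, value: V): void;
}

// Hypothetical helper: returns the stored value, computing and storing it on first access.
function getOrCompute<K, V>(map: MapLike<K, V>, key: K, compute: (key: K) => V): V {
  if (map.has(key)) {
    // has() returned true, so the entry exists (its value may itself be undefined).
    return map.get(key) as V;
  }
  const value = compute(key);
  map.set(key, value);
  return value;
}

// A native Map is used here only as a stand-in for HashMap.
const cache = new Map<string, number>();
let computeCalls = 0;
const lengthOf = (word: string): number => {
  computeCalls++;
  return word.length;
};

const firstLookup = getOrCompute(cache, "hello", lengthOf);  // computes and stores
const secondLookup = getOrCompute(cache, "hello", lengthOf); // served from the map
```

Keeping the helper outside the class leaves `HashMap` closed for modification, in line with the Open/Closed Principle described earlier.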

## Conclusion

This HashMap implementation demonstrates how to build a production-quality data structure while adhering to SOLID principles. The clean architecture makes it maintainable, testable, and extensible. It serves as both a practical tool and an educational resource for understanding hash tables and object-oriented design.