# HashMap Implementation - Technical Documentation

## Overview

This is a production-ready HashMap implementation in TypeScript that strictly follows OOP SOLID principles and best practices. The implementation uses separate chaining for collision resolution and provides automatic resizing based on load factor.

## SOLID Principles Implementation

### 1. Single Responsibility Principle (SRP)

Each class has one clearly defined responsibility:

#### `HashMap` (`src/core/HashMap.ts`)
- **Responsibility**: Managing the hash table and coordinating operations
- **Single Purpose**: Provide efficient key-value storage and retrieval

#### `HashNode` (`src/models/HashNode.ts`)
- **Responsibility**: Storing a single key-value pair and linking to the next node
- **Single Purpose**: Data container for collision chains

#### `DefaultHashFunction` (`src/hash-functions/DefaultHashFunction.ts`)
- **Responsibility**: Computing hash values for keys
- **Single Purpose**: Convert keys to bucket indices

#### `NumericHashFunction` (`src/hash-functions/NumericHashFunction.ts`)
- **Responsibility**: Optimized hashing for numeric keys
- **Single Purpose**: Provide better distribution for numeric data

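To make the `HashNode` responsibility described above more concrete, a chain node can be as small as the sketch below. This is an illustration only, not the contents of `src/models/HashNode.ts`; the field names `key`, `value`, and `next` and the helper `chainLength` are assumptions.

```typescript
// Hypothetical sketch of a collision-chain node; field names are assumed.
class HashNode<K, V> {
  readonly key: K;
  value: V;
  next: HashNode<K, V> | null;

  constructor(key: K, value: V, next: HashNode<K, V> | null = null) {
    this.key = key;
    this.value = value;
    this.next = next;
  }
}

// Build a two-node chain by hand: ("a", 1) -> ("b", 2) -> null
const tail = new HashNode<string, number>("b", 2);
const head = new HashNode<string, number>("a", 1, tail);

// Walk a chain and count its nodes.
function chainLength<K, V>(node: HashNode<K, V> | null): number {
  let count = 0;
  for (let cur = node; cur !== null; cur = cur.next) count++;
  return count;
}
```

Keeping the node free of any hashing or resizing logic is what lets it stay a plain data container.
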
### 2. Open/Closed Principle (OCP)

**Open for Extension, Closed for Modification**

The implementation is extensible without modifying core code:

```typescript
// Extend functionality by providing custom hash functions
class CustomHashFunction implements IHashFunction<string> {
  hash(key: string, capacity: number): number {
    // Custom hashing logic (illustrative): polynomial rolling hash
    let h = 0;
    for (let i = 0; i < key.length; i++) {
      h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep h an unsigned 32-bit integer
    }
    return h % capacity;
  }
}

// Use custom function without modifying HashMap
const map = new HashMap<string, number>(16, 0.75, new CustomHashFunction());
```

**Key Design Decisions:**
- Hash function is injected via constructor (dependency injection)
- New hash strategies can be added without changing HashMap
- Generic types allow any key/value types without modification

### 3. Liskov Substitution Principle (LSP)

**Subtypes must be substitutable for their base types**

All implementations properly implement their interfaces:

```typescript
// Any IHashFunction can replace another
function createMap<K, V>(hashFn: IHashFunction<K>): IHashMap<K, V> {
  return new HashMap<K, V>(16, 0.75, hashFn);
}

// Each map exposes the same IHashMap API, whatever the hash strategy
const map1 = createMap<string, number>(new DefaultHashFunction<string>());
const map2 = createMap<number, string>(new NumericHashFunction());
const map3 = createMap<string, number>(new CustomHashFunction());
```

**Guarantees:**
- Every `IHashFunction` implementation returns a valid bucket index for the given capacity, and the same index every time for the same key
- `HashMap` correctly implements the `IHashMap` interface
- Substituting one hash function for another changes how keys are distributed, not how the map behaves

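One way to check the substitutability guarantees above is a small contract test that any candidate hash function has to pass. The sketch below assumes the contract is "return an integer in `[0, capacity)` and be deterministic"; the helper `checkHashContract` and the two sample strategies are hypothetical and not part of the library.

```typescript
interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}

// Returns a list of contract violations (empty when none were found).
function checkHashContract<K>(
  hashFn: IHashFunction<K>,
  sampleKeys: K[],
  capacity: number
): string[] {
  const problems: string[] = [];
  for (const key of sampleKeys) {
    const index = hashFn.hash(key, capacity);
    if (!Number.isInteger(index) || index < 0 || index >= capacity) {
      problems.push(`index ${index} is outside [0, ${capacity})`);
    }
    if (hashFn.hash(key, capacity) !== index) {
      problems.push("hash is not deterministic for the same key");
    }
  }
  return problems;
}

// Two stand-in strategies, both hypothetical.
const lengthHash: IHashFunction<string> = {
  hash: (key, capacity) => key.length % capacity,
};
const brokenHash: IHashFunction<string> = {
  // Always negative, so it violates the range rule.
  hash: (key, _capacity) => -1 - key.length,
};

const okReport = checkHashContract(lengthHash, ["a", "bb", ""], 8);
const badReport = checkHashContract(brokenHash, ["a"], 8);
```

Running the same check against every strategy gives a mechanical reading of "substitutable": a strategy that passes can be injected without the map ever indexing outside its bucket array.
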
### 4. Interface Segregation Principle (ISP)

**Clients shouldn't depend on interfaces they don't use**

The codebase provides focused, minimal interfaces:

#### `IHashFunction<K>`
```typescript
interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}
```
- Single method interface
- Only requires hash computation
- No unnecessary methods

#### `IHashMap<K, V>`
```typescript
interface IHashMap<K, V> {
  set(key: K, value: V): void;
  get(key: K): V | undefined;
  has(key: K): boolean;
  delete(key: K): boolean;
  clear(): void;
  // ... iterator methods
}
```
- Focused on map operations
- No coupling to hashing details
- Clean separation of concerns

### 5. Dependency Inversion Principle (DIP)

**Depend on abstractions, not concretions**

High-level modules depend on abstractions:

```typescript
export class HashMap<K, V> implements IHashMap<K, V> {
  private readonly hashFunction: IHashFunction<K>; // Depends on abstraction

  constructor(
    initialCapacity: number = 16,
    loadFactorThreshold: number = 0.75,
    hashFunction?: IHashFunction<K> // Inject dependency
  ) {
    this.hashFunction = hashFunction ?? new DefaultHashFunction<K>();
  }
}
```

**Benefits:**
- HashMap depends only on the `IHashFunction` abstraction, apart from the `DefaultHashFunction` fallback
- Easy to test with mock hash functions
- The hash strategy is chosen per instance at construction time (the field is `readonly`, so it cannot be swapped afterwards)
- Follows Dependency Injection pattern

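The testing benefit listed above can be illustrated with a mock strategy that forces every key into one bucket. The sketch below uses a deliberately tiny stand-in class, `TinyMap`, instead of the real `HashMap`, and stores each chain as an array for brevity; `TinyMap` and `CollidingHash` are hypothetical names introduced only for this example.

```typescript
interface IHashFunction<K> {
  hash(key: K, capacity: number): number;
}

// Stand-in for the real HashMap: just enough chaining to show injection.
class TinyMap<K, V> {
  private readonly buckets: Array<Array<[K, V]>>;
  private readonly hashFunction: IHashFunction<K>;

  constructor(capacity: number, hashFunction: IHashFunction<K>) {
    this.buckets = Array.from({ length: capacity }, () => []);
    this.hashFunction = hashFunction;
  }

  set(key: K, value: V): void {
    const bucket = this.buckets[this.hashFunction.hash(key, this.buckets.length)];
    const entry = bucket.find(([k]) => k === key);
    if (entry) entry[1] = value; // key already present: update in place
    else bucket.push([key, value]);
  }

  get(key: K): V | undefined {
    const bucket = this.buckets[this.hashFunction.hash(key, this.buckets.length)];
    return bucket.find(([k]) => k === key)?.[1];
  }
}

// Mock strategy: every key collides in bucket 0, and calls are counted.
class CollidingHash implements IHashFunction<string> {
  calls = 0;
  hash(_key: string, _capacity: number): number {
    this.calls++;
    return 0;
  }
}

const mock = new CollidingHash();
const collisions = new TinyMap<string, number>(4, mock);
collisions.set("a", 1);
collisions.set("b", 2);
collisions.set("a", 3); // overwrites the first entry
```

Because the map only sees the interface, the worst-case collision path can be exercised deterministically without searching for real keys that happen to collide.
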
## Architecture

### Directory Structure

```
src/
├── core/                          # Core implementations
│   └── HashMap.ts                 # Main HashMap class
├── interfaces/                    # Contracts and abstractions
│   ├── IHashFunction.ts           # Hash function interface
│   └── IHashMap.ts                # HashMap interface
├── models/                        # Data structures
│   └── HashNode.ts                # Collision chain node
├── hash-functions/                # Hashing strategies
│   ├── DefaultHashFunction.ts     # General-purpose hashing
│   └── NumericHashFunction.ts     # Numeric optimization
├── examples/                      # Usage demonstrations
│   ├── basic-usage.ts
│   └── custom-hash-function.ts
└── index.ts                       # Public API exports
```

### Design Patterns Used

#### 1. Strategy Pattern
- **Where**: Hash function selection
- **Why**: Allows different hashing algorithms to be plugged in
- **Implementation**: `IHashFunction` interface with multiple implementations

#### 2. Iterator Pattern
- **Where**: `keys()`, `values()`, `entries()` methods
- **Why**: Provides consistent way to traverse the collection
- **Implementation**: Generator functions with `IterableIterator<T>`

#### 3. Dependency Injection
- **Where**: Constructor accepts `IHashFunction`
- **Why**: Decouples HashMap from specific hash implementations
- **Implementation**: Constructor parameter with default

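The Iterator Pattern entry above mentions generator functions typed as `IterableIterator<T>`. A minimal sketch of that idea, written as free functions over a bucket array rather than as the actual `HashMap` methods, could look like the following; the `ChainNode` shape and the sample data are assumptions.

```typescript
// Hypothetical node shape; the real HashNode may differ.
interface ChainNode<K, V> {
  key: K;
  value: V;
  next: ChainNode<K, V> | null;
}

// Yields every [key, value] pair, bucket by bucket, chain by chain.
function* entries<K, V>(
  buckets: Array<ChainNode<K, V> | null>
): IterableIterator<[K, V]> {
  for (const head of buckets) {
    for (let node = head; node !== null; node = node.next) {
      yield [node.key, node.value];
    }
  }
}

// keys() can be layered on top of entries().
function* keys<K, V>(buckets: Array<ChainNode<K, V> | null>): IterableIterator<K> {
  for (const [key] of entries(buckets)) yield key;
}

const sampleBuckets: Array<ChainNode<string, number> | null> = [
  { key: "k1", value: 1, next: { key: "k2", value: 2, next: null } },
  null,
  { key: "k3", value: 3, next: null },
];
const collectedKeys = [...keys(sampleBuckets)];
```

Generators keep the traversal lazy: nothing is copied into an intermediate array unless the caller asks for one, as the spread on the last line does.
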
### Data Structure Design

#### Collision Resolution: Separate Chaining

```
Buckets Array:
[0] -> Node(k1, v1) -> Node(k2, v2) -> null
[1] -> null
[2] -> Node(k3, v3) -> null
[3] -> Node(k4, v4) -> Node(k5, v5) -> Node(k6, v6) -> null
...
```

**Advantages:**
- Simple to implement
- No primary or secondary clustering, which is a problem specific to open addressing
- Degrades gradually as the load factor rises, since chains simply grow longer
- Chains grow dynamically, so a bucket never runs out of slots

**Trade-offs:**
- Extra memory for one `next` reference per node
- Poorer cache locality than open addressing, because nodes are scattered across the heap
- O(n) worst-case lookup when many keys land in the same chain

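The chain operations behind the diagram above can be sketched as three small functions over a single bucket. This is an illustration under assumptions, not the library's code: the `ChainNode` shape is hypothetical, keys are compared with `===`, and new nodes are prepended for simplicity (the diagram suggests the real implementation may append instead).

```typescript
// Hypothetical node shape; the real HashNode may differ.
interface ChainNode<K, V> {
  key: K;
  value: V;
  next: ChainNode<K, V> | null;
}
type Chain<K, V> = ChainNode<K, V> | null;

// Update the value if the key is already in the chain, else prepend a node.
// Returns the head of the resulting chain.
function chainSet<K, V>(head: Chain<K, V>, key: K, value: V): Chain<K, V> {
  for (let node = head; node !== null; node = node.next) {
    if (node.key === key) {
      node.value = value;
      return head;
    }
  }
  return { key, value, next: head };
}

// Walk the chain until the key is found or the chain ends.
function chainGet<K, V>(head: Chain<K, V>, key: K): V | undefined {
  for (let node = head; node !== null; node = node.next) {
    if (node.key === key) return node.value;
  }
  return undefined;
}

// Unlink the node holding the key, if any; returns the new head.
function chainDelete<K, V>(head: Chain<K, V>, key: K): Chain<K, V> {
  if (head === null) return null;
  if (head.key === key) return head.next;
  let prev = head;
  while (prev.next !== null) {
    if (prev.next.key === key) {
      prev.next = prev.next.next;
      return head;
    }
    prev = prev.next;
  }
  return head;
}

// One bucket that receives three colliding keys.
let bucket: Chain<string, number> = null;
bucket = chainSet(bucket, "k4", 4);
bucket = chainSet(bucket, "k5", 5);
bucket = chainSet(bucket, "k6", 6);
bucket = chainSet(bucket, "k5", 50); // update, no new node
bucket = chainDelete(bucket, "k4"); // unlinks the last node in the chain
```

Each function touches at most every node in one chain, which is where the O(n) worst case in the trade-offs list comes from.
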
#### Load Factor and Resizing

**Default Configuration:**
- Initial Capacity: 16 buckets
- Load Factor Threshold: 0.75

**Resizing Strategy:**
```typescript
if (size / capacity >= loadFactorThreshold) {
  resize(capacity * 2); // Double the capacity
}
```

**Why 0.75?**
- Good balance between space and time
- Keeps chains short on average
- Matches the default load factor of Java's `HashMap`

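What a resize has to do can be sketched as follows. Because a bucket index depends on the capacity, every entry must be hashed again into the new array. The sketch stores chains as arrays and takes the hash as a plain function for brevity; `resize`, `needsResize`, and `modHash` here are illustrative names, not the library's internals.

```typescript
type Entry<K, V> = [K, V];

// Rehash every entry into a bucket array of the new capacity.
function resize<K, V>(
  oldBuckets: Entry<K, V>[][],
  newCapacity: number,
  hash: (key: K, capacity: number) => number
): Entry<K, V>[][] {
  const newBuckets: Entry<K, V>[][] = Array.from({ length: newCapacity }, () => []);
  for (const bucket of oldBuckets) {
    for (const [key, value] of bucket) {
      // Indices depend on capacity, so each key is hashed again.
      newBuckets[hash(key, newCapacity)].push([key, value]);
    }
  }
  return newBuckets;
}

// Mirrors the condition shown above: grow once size / capacity reaches the threshold.
function needsResize(size: number, capacity: number, threshold = 0.75): boolean {
  return size / capacity >= threshold;
}

// Illustrative hash for non-negative integer keys.
const modHash = (key: number, capacity: number): number => key % capacity;

// With 4 buckets, keys 4 and 8 collide in bucket 0.
const before: Entry<number, string>[][] = [
  [[4, "four"], [8, "eight"]],
  [[1, "one"]],
  [],
  [],
];
// With 8 buckets, key 4 moves to bucket 4 and key 8 stays in bucket 0.
const after = resize(before, 8, modHash);
```

Doubling means a resize costs O(n) but happens rarely enough that insertion stays O(1) on average over a long run of `set` calls.
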
## Performance Characteristics

### Time Complexity

| Operation | Average Case | Worst Case | Notes |
|-----------|--------------|------------|-------|
| `set(k, v)` | O(1) amortized | O(n) | Worst case if all keys hash to same bucket, or when the insert triggers a resize |
| `get(k)` | O(1) | O(n) | Requires traversing collision chain |
| `has(k)` | O(1) | O(n) | Same as get |
| `delete(k)` | O(1) | O(n) | Requires finding and unlinking node |
| `clear()` | O(capacity) | O(capacity) | Must null all bucket references |
| `keys()` | O(n + capacity) | O(n + capacity) | Must visit all entries, scanning empty buckets along the way |
| `values()` | O(n + capacity) | O(n + capacity) | Must visit all entries, scanning empty buckets along the way |
| `entries()` | O(n + capacity) | O(n + capacity) | Must visit all entries, scanning empty buckets along the way |

### Space Complexity

- **Storage**: O(n) where n is number of entries
- **Overhead**: O(capacity) for buckets array
- **Per Entry**: Constant overhead for HashNode

### Load Factor Impact

```
Load Factor = size / capacity

Low Load Factor (< 0.5):
  ✓ Fewer collisions
  ✓ Faster operations
  ✗ Wastes memory

High Load Factor (> 0.9):
  ✓ Better memory usage
  ✗ More collisions
  ✗ Slower operations

Default (0.75):
  ✓ Good balance
  ✓ Reasonable memory usage
  ✓ Good performance
```

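The qualitative trade-off above can be given rough numbers. Under the textbook assumption of simple uniform hashing (every key equally likely to land in any bucket, which real key sets only approximate), a lookup that misses scans a whole chain of expected length `size / capacity`, while a lookup that hits examines about `1 + α/2 - α/(2n)` nodes, where `α` is the load factor and `n` the number of entries. The helper name `expectedProbes` is hypothetical.

```typescript
// Expected number of chain nodes examined per lookup, assuming simple
// uniform hashing. Requires size > 0 and capacity > 0.
function expectedProbes(size: number, capacity: number) {
  const alpha = size / capacity; // load factor
  return {
    loadFactor: alpha,
    miss: alpha, // an unsuccessful search scans the whole chain
    hit: 1 + alpha / 2 - alpha / (2 * size), // a successful search stops partway
  };
}

// 12 entries in 16 buckets is exactly the default 0.75 threshold.
const atDefault = expectedProbes(12, 16);
// 4 entries in 16 buckets: sparser table, shorter chains.
const atLowLoad = expectedProbes(4, 16);
```

Both figures grow linearly with the load factor, which is why capping it at a constant keeps lookups O(1) on average.
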
## Best Practices Demonstrated

### 1. Type Safety
```typescript
// Full generic support
interface User { name: string } // illustrative shape
const user: User = { name: "Alice" };

const map = new HashMap<string, User>(); // Type-safe
map.set("id", user);                     // ✓ Correct
// @ts-expect-error: the key must be a string
map.set(123, user);                      // ✗ Type error
```

### 2. Immutability Where Appropriate
```typescript
// Read-only properties
private readonly hashFunction: IHashFunction<K>;
private readonly loadFactorThreshold: number;
private readonly initialCapacity: number;
```

### 3. Defensive Programming
```typescript
// Validate constructor arguments
if (initialCapacity <= 0) {
  throw new Error("Initial capacity must be positive");
}
if (loadFactorThreshold <= 0 || loadFactorThreshold > 1) {
  throw new Error("Load factor must be between 0 and 1");
}
```

### 4. Clear Documentation
- Every public method documented with JSDoc
- Time complexity noted in comments
- Usage examples provided

### 5. Comprehensive Testing
- 32 test cases covering all functionality
- Edge cases (null, undefined, empty strings)
- Performance tests (1000 entries)
- Custom hash function tests

### 6. Iterator Support
```typescript
// Makes HashMap usable in for...of loops
[Symbol.iterator](): IterableIterator<[K, V]> {
  return this.entries();
}

// Usage
for (const [key, value] of map) {
  console.log(key, value);
}
```

### 7. Separation of Concerns
- Hashing logic separated from storage logic
- Node structure separated from HashMap
- Interfaces defined separately from implementations

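## Advanced Features

### 1. Custom Hash Functions

Create domain-specific hash functions:

```typescript
// Stand-in for the computeHash helper, which this document does not define:
// a polynomial string hash reduced to a bucket index.
function computeHash(str: string, capacity: number): number {
  let h = 0;
  for (let i = 0; i < str.length; i++) h = (h * 31 + str.charCodeAt(i)) >>> 0;
  return h % capacity;
}

// Case-insensitive string keys
class CaseInsensitiveHash implements IHashFunction<string> {
  hash(key: string, capacity: number): number {
    return computeHash(key.toLowerCase(), capacity);
  }
}

// Composite object keys (field types are assumed)
interface Person { firstName: string; lastName: string; age: number }

class PersonHashFunction implements IHashFunction<Person> {
  hash(person: Person, capacity: number): number {
    const str = `${person.firstName}:${person.lastName}:${person.age}`;
    return computeHash(str, capacity);
  }
}
```

A hash function only selects the bucket. Whether two keys count as the same entry also depends on how `HashMap` compares keys, which this document does not specify.
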
### 2. Performance Monitoring

```typescript
const map = new HashMap<string, number>();

// Monitor internal state
console.log(`Capacity: ${map.capacity}`);
console.log(`Size: ${map.size}`);
console.log(`Load Factor: ${map.loadFactor}`);
```

### 3. Bulk Operations

```typescript
// Bulk insertion via repeated set() calls
const map = new HashMap<string, number>();
const entries: [string, number][] = [
  ["a", 1], ["b", 2], ["c", 3]
];

for (const [key, value] of entries) {
  map.set(key, value);
}
```

## Testing Strategy

### Test Coverage

```bash
bun test
```

**Coverage Breakdown:**
- Core HashMap: 100% function/line coverage
- Hash Functions: 66-87% (edge cases for special values)
- Overall: 92% line coverage

### Test Categories

1. **Constructor Tests**
   - Default initialization
   - Custom parameters
   - Invalid input validation

2. **Basic Operations**
   - Set/Get/Has/Delete
   - Update existing values
   - Non-existent keys

3. **Iteration Tests**
   - Keys iterator
   - Values iterator
   - Entries iterator
   - forEach callback
   - for...of loops

4. **Resizing Tests**
   - Automatic growth
   - Data preservation
   - Load factor triggers

5. **Edge Cases**
   - Null values
   - Undefined values
   - Empty string keys
   - Large datasets (1000 entries)

6. **Custom Hash Functions**
   - NumericHashFunction
   - Custom implementations

## Usage Examples

### Basic Usage
```typescript
const scores = new HashMap<string, number>();
scores.set("Alice", 95);
scores.set("Bob", 87);
console.log(scores.get("Alice")); // 95
```

### With TypeScript Interfaces
```typescript
interface Product {
  id: number;
  name: string;
  price: number;
}

const products = new HashMap<number, Product>();
products.set(1, { id: 1, name: "Widget", price: 9.99 });
```

### Custom Configuration
```typescript
const map = new HashMap<string, number>(
  32,           // Initial capacity
  0.8,          // Load factor threshold
  customHashFn  // Custom hash function
);
```

## Comparison with Native Map

### Advantages of This Implementation

1. **Educational Value**: Shows internal workings
2. **Customizable**: Inject custom hash functions
3. **Observable**: Can monitor capacity and load factor
4. **Extensible**: Easy to add new features

### Native Map Advantages

1. **Performance**: Highly optimized in V8/JSC
2. **Battle-tested**: Used in production worldwide
3. **Standard API**: Consistent across platforms

### When to Use Each

**Use HashMap (this implementation):**
- Learning data structures
- Need custom hash functions
- Want to understand internals
- Require specific behavior

**Use Native Map:**
- Production applications
- Performance critical paths
- Standard use cases
- Browser compatibility needs

## Future Enhancements

Possible improvements while maintaining SOLID principles:

1. **Additional Hash Functions**
   - CryptoHashFunction (secure hashing)
   - IdentityHashFunction (reference equality)

2. **Performance Optimizations**
   - Red-black tree for long chains (like Java 8+)
   - Dynamic shrinking on deletions

3. **Additional Features**
   - Weak key references
   - Computed values (getOrCompute)
   - Batch operations

4. **Observability**
   - Event listeners for changes
   - Statistics tracking
   - Performance metrics

## Conclusion

This HashMap implementation demonstrates how to build a production-quality data structure while adhering to SOLID principles. The clean architecture makes it maintainable, testable, and extensible. It serves as both a practical tool and an educational resource for understanding hash tables and object-oriented design.