lemkin-resources

Timeline Generator Tool

Overview

An automated tool that extracts temporal information from documents and generates chronological timelines of events for legal proceedings and investigations.

Features

Installation

pip install dateparser spacy plotly
python -m spacy download en_core_web_lg

Quick Start

from timeline_generator import TimelineGenerator

# Initialize
tg = TimelineGenerator()

# Process documents
timeline = tg.generate_from_documents([
    'witness_statement_1.pdf',
    'police_report.pdf',
    'medical_records.pdf'
])

# Export timeline
timeline.to_html('case_timeline.html')
timeline.to_json('case_timeline.json')
timeline.to_csv('case_timeline.csv')

API Reference

TimelineGenerator.generate_from_documents(document_list)

Processes multiple documents and creates a unified timeline.

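Parameters:
  document_list: list of paths to the documents to process (for example, the PDF files shown in Quick Start)

Returns:
  A Timeline object that can be filtered and exported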

Timeline.to_html(output_path, interactive=True)

Exports the timeline as an HTML visualization, interactive by default.

Timeline.to_json(output_path)

Exports timeline data as JSON.

Timeline.filter_by_date(start_date, end_date)

Filters the timeline to events within a specific date range.

Output Format

JSON Structure

{
  "events": [
    {
      "date": "2024-03-15",
      "time": "14:30",
      "description": "Witness reported incident",
      "source": "witness_statement_1.pdf",
      "confidence": 0.95,
      "entities": ["John Doe", "Location X"]
    }
  ]
}
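
As a rough illustration of consuming this output, the sketch below parses a JSON document with the structure shown above and sorts its events chronologically using only the standard library. The `event_timestamp` helper is a hypothetical name, not part of the tool, and treating `time` as optional is an assumption; the second sample event is invented purely to exercise that case.

```python
import json
from datetime import datetime

# Sample payload mirroring the structure above. The second event is invented
# for illustration and omits "time" to exercise the date-only path.
SAMPLE = """
{
  "events": [
    {"date": "2024-03-15", "time": "14:30",
     "description": "Witness reported incident",
     "source": "witness_statement_1.pdf",
     "confidence": 0.95,
     "entities": ["John Doe", "Location X"]},
    {"date": "2024-03-14",
     "description": "Earlier event without a time",
     "source": "police_report.pdf",
     "confidence": 0.8,
     "entities": []}
  ]
}
"""


def event_timestamp(event):
    """Combine "date" and an optional "time" into a datetime (hypothetical helper)."""
    time_part = event.get("time") or "00:00"  # assumption: missing time means midnight
    return datetime.strptime(f'{event["date"]} {time_part}', "%Y-%m-%d %H:%M")


# Sort events chronologically regardless of their order in the file.
events = sorted(json.loads(SAMPLE)["events"], key=event_timestamp)
for event in events:
    print(event_timestamp(event).isoformat(), event["description"])
```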

Configuration

Create a config.yaml file to customize the tool's behavior:

timeline:
  date_formats:
    - "%Y-%m-%d"
    - "%d/%m/%Y"
    - "%B %d, %Y"

  languages:
    - en
    - fr
    - es

  conflict_resolution: "most_recent"
  min_confidence: 0.7
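
To make two of these settings concrete, the sketch below shows how a list of `date_formats` and a `min_confidence` threshold could be applied. It is not the tool's implementation: the values are mirrored in a plain dict to avoid a YAML dependency, `parse_with_formats` and `keep_confident` are hypothetical names, and trying the formats in listed order is an assumption.

```python
from datetime import datetime

# Mirrors the config.yaml values above as a plain dict (no YAML parser needed).
CONFIG = {
    "date_formats": ["%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"],
    "min_confidence": 0.7,
}


def parse_with_formats(text, formats):
    """Try each strptime format in order; return the first match or None."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format did not match; try the next one
    return None


def keep_confident(events, threshold):
    """Drop events whose confidence falls below the threshold."""
    return [e for e in events if e["confidence"] >= threshold]


for raw in ("2024-03-15", "15/03/2024", "March 15, 2024", "not a date"):
    print(repr(raw), "->", parse_with_formats(raw, CONFIG["date_formats"]))
```

Note that `%B` matches month names in the current locale, so "March 15, 2024" parses only under an English (or default C) locale.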

Supported Document Types

Performance

Examples

Basic Timeline Generation

from timeline_generator import TimelineGenerator

tg = TimelineGenerator()
timeline = tg.generate_from_text("""
  On March 15, 2024, the incident was first reported.
  Two days later, police arrived at the scene.
  The investigation concluded on March 30, 2024.
""")
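
The sample text mixes absolute dates with a relative expression ("Two days later"). The sketch below shows one way such a phrase might be resolved against the preceding date, using only `re` and `datetime`. It is a simplified illustration under stated assumptions, not how the tool works internally: `resolve_dates` is a hypothetical name, only "<N> day(s) later" with small number words or digits is handled, and each relative phrase is assumed to refer to the most recent dated sentence.

```python
import re
from datetime import datetime, timedelta

TEXT = """
On March 15, 2024, the incident was first reported.
Two days later, police arrived at the scene.
The investigation concluded on March 30, 2024.
"""

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
ABSOLUTE = re.compile(r"([A-Z][a-z]+ \d{1,2}, \d{4})")
RELATIVE = re.compile(r"(\w+) days? later", re.IGNORECASE)


def resolve_dates(text):
    """Return (date, sentence) pairs, anchoring relative phrases to the previous date."""
    results, anchor = [], None
    for sentence in filter(None, (line.strip() for line in text.splitlines())):
        absolute = ABSOLUTE.search(sentence)
        relative = RELATIVE.search(sentence)
        if absolute:
            try:
                anchor = datetime.strptime(absolute.group(1), "%B %d, %Y").date()
            except ValueError:
                continue  # capitalized word was not a month name
        elif relative and anchor is not None:
            word = relative.group(1).lower()
            days = NUMBER_WORDS.get(word)
            if days is None and word.isdigit():
                days = int(word)
            if days is None:
                continue  # unsupported number word; skip rather than guess
            anchor = anchor + timedelta(days=days)
        else:
            continue  # no recognizable date in this sentence
        results.append((anchor, sentence))
    return results


for day, sentence in resolve_dates(TEXT):
    print(day.isoformat(), "|", sentence)
```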

Advanced Filtering

# Generate timeline (documents is a list of file paths, as in Quick Start)
timeline = tg.generate_from_documents(documents)

# Filter to specific period
march_events = timeline.filter_by_date("2024-03-01", "2024-03-31")

# Filter by confidence
high_confidence = timeline.filter_by_confidence(min_confidence=0.9)

# Combine filters
filtered = timeline.filter(
    start_date="2024-03-01",
    end_date="2024-03-31",
    min_confidence=0.8,
    entities=["John Doe"]
)
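
The semantics of the combined filter are not spelled out here. One plausible reading is that all supplied criteria must hold, date bounds are inclusive, and an event matches `entities` if it mentions at least one of them. The sketch below applies that reading to plain event dicts; `filter_events` is a hypothetical stand-in, not the tool's `Timeline.filter`, and the sample events are invented for illustration.

```python
from datetime import date


def filter_events(events, start_date=None, end_date=None,
                  min_confidence=0.0, entities=None):
    """Keep events matching every supplied criterion (assumed AND semantics)."""
    start = date.fromisoformat(start_date) if start_date else date.min
    end = date.fromisoformat(end_date) if end_date else date.max
    wanted = set(entities or [])
    kept = []
    for event in events:
        day = date.fromisoformat(event["date"])
        if not (start <= day <= end):  # inclusive bounds (assumption)
            continue
        if event["confidence"] < min_confidence:
            continue
        if wanted and not wanted & set(event["entities"]):
            continue  # none of the requested entities appear in this event
        kept.append(event)
    return kept


# Invented sample events for illustration only.
events = [
    {"date": "2024-03-15", "confidence": 0.95, "entities": ["John Doe", "Location X"]},
    {"date": "2024-03-20", "confidence": 0.60, "entities": ["John Doe"]},
    {"date": "2024-04-02", "confidence": 0.90, "entities": ["John Doe"]},
]
filtered = filter_events(events, "2024-03-01", "2024-03-31", 0.8, ["John Doe"])
print([e["date"] for e in filtered])
```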

Visualization Options

The tool supports multiple visualization formats:

Error Handling

The tool handles various edge cases:

Contributing

See CONTRIBUTING.md in the main repository for guidelines.

License

MIT License