Architecture
Project Structure
munger-matics/
├── src/munger_matics/ # core package — business logic and data layer
├── app/ # Streamlit dashboard (presentation layer)
│ └── pages/ # multi-page app structure
├── tests/ # mirrors src/ structure
├── notebooks/ # exploratory analysis only
├── data/
│ ├── raw/ # source data, never modified
│ └── processed/ # derived outputs
├── flows/ # Prefect orchestration
├── config/ # business configuration (committed)
└── scripts/ # one-off utility scripts
Key Design Decisions
src/ layout
The core package lives in src/munger_matics/ rather than at the root. This enforces that the package must be installed before it can be imported, which catches packaging issues early and prevents accidental imports of uninstalled code.
Dashboard lives outside src/
app/ is a presentation layer, not business logic. Keeping it separate from src/ makes the boundary explicit: the dashboard depends on the core package, never the other way around. The core package must be usable without Streamlit.
data/ is gitignored
data/raw/ and data/processed/ are excluded from version control. This directory will contain personal financial data (bank statements, transaction exports) that must never be committed. The directory structure itself is tracked via .gitkeep files so it is preserved without the contents.
config/ vs .env
Two separate concerns:
.env— secrets and environment-specific values (never committed)config/— business configuration such as budget categories, account mappings, spending thresholds (committed, versioned)
Notebooks are exploration only
notebooks/ is for investigation and prototyping. No application code may import from notebooks. Useful logic discovered in a notebook must be promoted to src/ before it can be used by the app.
flows/ not prefect/
The orchestration directory is named after what it does, not the tool used to do it. This keeps the mental model clean if the orchestration layer ever changes.
Technology Choices
| Tool | Role | Why |
|---|---|---|
| uv | Environment & dependency management | Fast, modern, single tool for venv + packages + lockfile |
| Polars | Data manipulation | Faster than Pandas, expressive API, strong typing |
| Pydantic | Data validation | Enforces data model correctness at runtime using type annotations |
| Streamlit | Dashboard | Low-friction Python-native UI, no frontend knowledge required |
| Prefect | Orchestration | Manages scheduled and triggered data pipelines |
| Ruff | Linting & formatting | Replaces Flake8 + Black + isort in a single fast tool |
| mypy | Type checking | Catches type errors at development time before runtime |
| pytest | Testing | Standard, well-supported Python test framework |
| MkDocs Material | Documentation site | Markdown-based, Python-native, deploys to GitHub Pages |
| Dependabot | Dependency updates | Automated weekly PRs for outdated dependencies |