chore: restore kua-money-trace from Bruno (source was on Bruno, Gal copy was lost)

Kavi 2026-05-01 03:00:31 -04:00
commit cd5b02a4e3
39 changed files with 11513 additions and 0 deletions

6
.gitignore vendored Normal file
@@ -0,0 +1,6 @@
node_modules/
data/mail-archive/
data/mail-oauth/
data/*.state.json
.env
.env.*

4
.kua-vault.json Normal file
@@ -0,0 +1,4 @@
{
"project": "kua-money-trace",
"include": ["shared"]
}

163
README.md Normal file
@@ -0,0 +1,163 @@
# kua-money-trace
A separate service that reconstructs the tree of where money came from and where it went.
It does not replace accounting or human approval. It stores facts, proposes links, and shows where the money that ended up funding each expense came from.
## First MVP
- Ledger stored as local JSON.
- Graph engine with explicit links and per-account FIFO links.
- Backward traceability from any movement, document, or economic event.
- Minimal HTTP API with no external dependencies.
- CLI for summaries and quick tests.
## Run
```bash
npm test
npm run summarize
npm run demo
npm run serve
```
Then, with the server running:
```bash
curl http://localhost:3910/health
curl http://localhost:3910/nodes/event:black-spa-10804/origin-tree
```
## Download mail
Configured accounts:
```text
vjoati-gmail -> vjoati@gmail.com
kdoi-email -> kdoi@email.com
```
The password is not stored in the repo. For Gmail, use an app password or OAuth and expose the credential only through the environment:
```bash
export VJOATI_GMAIL_APP_PASSWORD='...'
npm run mail:dry-run
npm run mail:download -- --limit 25 --since 2025-01-01
```
### Gmail API OAuth
The recommended path for Gmail is OAuth with read-only permission. It needs a Google Cloud OAuth client
with redirect URI `http://127.0.0.1:3912/oauth2callback` and the Gmail API enabled.
```bash
export GOOGLE_OAUTH_CLIENT_ID='...apps.googleusercontent.com'
export GOOGLE_OAUTH_CLIENT_SECRET='...'
npm run mail:gmail-oauth
```
The command prints a URL, waits for the local callback, and saves the refresh token to:
```text
data/mail-oauth/vjoati-gmail.token.json
```
Then download directly from Google rather than from Apple Mail:
```bash
npm run mail:gmail-download -- --limit 100 --since 2025-01-01
```
The archive is stored under:
```text
data/mail-archive/vjoati-gmail/
raw-eml/yyyy/mm/*.eml
attachments/yyyy/mm/*
manifests/emails.ndjson
```
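The layout above can be sketched as a small path helper. This is an illustration only, not the project's downloader: `rawEmlPath`, its parameters, the use of UTC for the year and month, and the `fileStem` naming are all assumptions; only the `raw-eml/yyyy/mm/*.eml` shape comes from this README.

```javascript
// Hypothetical helper: builds the path of a raw message inside an account
// archive, following the raw-eml/yyyy/mm/*.eml layout shown above.
// Assumption: year and month are taken from the message date in UTC.
function rawEmlPath(archiveRoot, sentAt, fileStem) {
  const date = new Date(sentAt);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid date: ${sentAt}`);
  }
  const yyyy = String(date.getUTCFullYear());
  const mm = String(date.getUTCMonth() + 1).padStart(2, '0');
  return `${archiveRoot}/raw-eml/${yyyy}/${mm}/${fileStem}.eml`;
}

console.log(rawEmlPath('data/mail-archive/vjoati-gmail', '2025-12-30T15:04:05Z', 'abc123'));
```

Grouping by year and month keeps each directory small even for large mailboxes.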
To test without touching Gmail:
```bash
npm run mail:fixture
```
If Gmail does not allow an app password, two other paths are already supported: a local Apple Mail import and a Google Takeout / MBOX import.
### Local Apple Mail
List the local Mail.app accounts and folders:
```bash
npm run mail:list-apple
```
Import a local Apple Mail `.mbox` folder:
```bash
node src/mailCli.js import-apple-mail \
--account vjoati-gmail \
--source '/Users/kavi/Library/Mail/V10/.../INBOX.mbox' \
--limit 25
```
For accounts already configured in macOS, it is better to use the Mail index. This resolves the account from
`~/Library/Accounts/Accounts4.sqlite`, reads `Envelope Index`, and archives the local `.emlx` files by message id:
```bash
node src/mailCli.js import-apple-mail-index \
--account vjoati-gmail \
--mailbox all \
--since 2025-01-01 \
--limit 100
```
For Gmail, `--mailbox all` uses `[Gmail]/Todos` when it exists and otherwise falls back to `INBOX`.
For the `kdoi-email` account, the importer detects the direct IMAP account that holds messages and uses `INBOX`.
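The Gmail fallback rule can be sketched as follows. This is a minimal illustration under assumptions: `resolveMailbox` and its arguments are hypothetical names, and the real importer reads the available mailboxes from the Mail index rather than from an array.

```javascript
// Hypothetical sketch of the `--mailbox all` rule for Gmail accounts:
// prefer `[Gmail]/Todos` when present, otherwise fall back to `INBOX`.
// Any other value is treated as an explicit mailbox name.
function resolveMailbox(requested, availableMailboxes) {
  if (requested !== 'all') {
    return requested;
  }
  return availableMailboxes.includes('[Gmail]/Todos') ? '[Gmail]/Todos' : 'INBOX';
}

console.log(resolveMailbox('all', ['INBOX', '[Gmail]/Todos']));
console.log(resolveMailbox('all', ['INBOX']));
```

`[Gmail]/Todos` is the only "all mail" name this README mentions, so the sketch does not try to guess names used by other Gmail locales.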
Shortcuts:
```bash
npm run mail:import-apple-index -- --limit 100 --since 2025-01-01
npm run mail:import-kdoi -- --limit 100 --since 2025-01-01
```
### Google Takeout / MBOX
Export Gmail from Google Takeout as an `.mbox` file, then:
```bash
node src/mailCli.js import-mbox \
--account vjoati-gmail \
--file '/ruta/al/archivo.mbox' \
--limit 1000
```
## Core idea
A card purchase is not the final origin of the money. The correct tree can look like this:
```text
DTE / Muralla expense
└─ Darwin Visa charge
   └─ Visa payment from Darwin's checking account
      └─ Darwin pure income
```
And for European accounts:
```text
Revolut debit expense
└─ Revolut EUR balance
   └─ Revolut top-up with Chilean Visa
      └─ Visa payment from Chilean checking account
         └─ pure income
```
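A backward walk over links is enough to build such a tree. The sketch below is an assumption-laden illustration, not the service's graph engine: `originTree` is a hypothetical name, the link objects only borrow the `from`, `to`, `amount`, and `state` fields from `data/example-ledger.json`, and the two funding links between movements are invented for the example.

```javascript
// Hypothetical sketch: walk links backwards (to -> from) from a node and
// return a nested tree of the nodes that funded or explained it.
function originTree(links, nodeId, seen = new Set()) {
  if (seen.has(nodeId)) {
    // Defensive stop in case the links contain a cycle.
    return { id: nodeId, cycle: true, sources: [] };
  }
  const nextSeen = new Set(seen).add(nodeId);
  const sources = links
    .filter((link) => link.to === nodeId && link.state !== 'rejected')
    .map((link) => ({
      ...originTree(links, link.from, nextSeen),
      amount: link.amount.value,
    }));
  return { id: nodeId, sources };
}

// Illustrative links; the first two are assumed, not taken from the ledger.
const exampleLinks = [
  { from: 'mov:darwin-income-2025-12-05', to: 'mov:darwin-bank-pay-visa-2026-01-15',
    amount: { currency: 'CLP', value: 126616 }, state: 'proposed' },
  { from: 'mov:darwin-bank-pay-visa-2026-01-15', to: 'mov:darwin-visa-black-spa-2025-12-30',
    amount: { currency: 'CLP', value: 126616 }, state: 'proposed' },
  { from: 'mov:darwin-visa-black-spa-2025-12-30', to: 'event:black-spa-10804',
    amount: { currency: 'CLP', value: 126616 }, state: 'approved' },
];

const exampleTree = originTree(exampleLinks, 'event:black-spa-10804');
console.log(JSON.stringify(exampleTree, null, 2));
```

Rejected links are skipped so that a reviewer's decision removes a branch from the tree without deleting the underlying fact.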
## Next steps
- Importers for Banco de Chile, Santander, Revolut, and Mercado Pago statements and for the SII RCV.
- Mail and attachment archive on Storagebox.
- AI extraction from PDFs and emails with state `proposed`, never approved automatically.
- Postgres persistence with an audit trail of decisions.

34
config/entities.json Normal file
@@ -0,0 +1,34 @@
{
"entities": [
{
"id": "vicente",
"name": "Vicente Tirado",
"kind": "person",
"rut": "18.393.009-5"
},
{
"id": "darwin",
"name": "Darwin Bruna",
"kind": "person",
"rut": "17.194.206-3"
},
{
"id": "muralla",
"name": "Muralla SpA",
"kind": "company",
"rut": "78.188.363-8"
},
{
"id": "murallita",
"name": "Vicente Tirado Alimentos y Bebidas",
"kind": "company",
"rut": "78.225.723-4"
},
{
"id": "kua",
"name": "Kua",
"kind": "company",
"rut": "78.230.716-9"
}
]
}

38
config/mail-accounts.json Normal file
@@ -0,0 +1,38 @@
{
"accounts": [
{
"id": "vjoati-gmail",
"email": "vjoati@gmail.com",
"provider": "gmail",
"host": "imap.gmail.com",
"port": 993,
"secure": true,
"auth": {
"user": "vjoati@gmail.com",
"passwordEnv": "VJOATI_GMAIL_APP_PASSWORD"
},
"mailboxes": [
"INBOX"
],
"defaultSince": "2025-01-01",
"oauth": {
"clientIdEnv": "GOOGLE_OAUTH_CLIENT_ID",
"clientSecretEnv": "GOOGLE_OAUTH_CLIENT_SECRET",
"redirectUri": "http://127.0.0.1:3912/oauth2callback",
"tokenPath": "data/mail-oauth/vjoati-gmail.token.json"
},
"archive": {
"root": "data/mail-archive/vjoati-gmail"
}
},
{
"id": "kdoi-email",
"email": "kdoi@email.com",
"provider": "apple-mail-local",
"defaultSince": "2025-01-01",
"archive": {
"root": "data/mail-archive/kdoi-email"
}
}
]
}

102
data/assignments-v2.json Normal file
@@ -0,0 +1,102 @@
[
{"id":"1d1a3b94285d4817","labels":["Triage/Personal","Categoría/Familia","Idioma/Inglés"]},
{"id":"19dd817da6a4a375","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd80130bd41d0d","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd7f34f725b4c7","labels":["Triage/Promoción","Categoría/Noticias","Idioma/Inglés"]},
{"id":"19dd7f185bf0a77e","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd7da0b1390b9f","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd7b1cef8d92c4","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd7a98943b8915","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Español"]},
{"id":"19dd7660a6e81e0a","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd743ef60a213a","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Español"]},
{"id":"19dd6aa226bc6fb2","labels":["Triage/Promoción","Categoría/Cultura","Idioma/Español"]},
{"id":"19dd6a442e9459c3","labels":["Triage/Personal","Categoría/Trabajo","Idioma/Español"]},
{"id":"19dd6a0dbccfdf72","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Portugués"]},
{"id":"19dd68f20d65ae9c","labels":["Triage/Importante","Categoría/Servicios","Idioma/Inglés"]},
{"id":"19dd66ff283871d7","labels":["Triage/Importante","Categoría/Salud","Idioma/Español"]},
{"id":"19dd66fa85bd71af","labels":["Triage/Importante","Categoría/Bancos","Idioma/Alemán","Instrumento/Tarjeta de Crédito","Banco/Barclays"]},
{"id":"19dd668fea307df8","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Español"]},
{"id":"19dd6569ae9da11b","labels":["Triage/Prob Basura","Categoría/Otro","Idioma/Inglés"]},
{"id":"19dd6558d050b1f8","labels":["Triage/Prob Basura","Categoría/Otro","Idioma/Inglés"]},
{"id":"19dd64a24e68a6fe","labels":["Triage/Importante","Categoría/Salud","Idioma/Español"]},
{"id":"19dd629a0d2bc65d","labels":["Triage/Promoción","Categoría/Comida","Idioma/Español"]},
{"id":"19dd61e5108deafd","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Español"]},
{"id":"19dd61dbab965d08","labels":["Triage/Prob Basura","Categoría/Otro","Idioma/Inglés"]},
{"id":"19dd600007b7f8dc","labels":["Triage/Promoción","Categoría/Noticias","Idioma/Inglés"]},
{"id":"19dd5f99b9b23450","labels":["Triage/Promoción","Categoría/Cultura","Idioma/Inglés"]},
{"id":"19dd5f5928f399ad","labels":["Triage/Promoción","Categoría/Educación","Idioma/Portugués"]},
{"id":"19dd5f24c2cc7f4e","labels":["Triage/Promoción","Categoría/Noticias","Idioma/Inglés"]},
{"id":"19dd5eba8609ba87","labels":["Triage/Promoción","Categoría/Cultura","Idioma/Inglés"]},
{"id":"19dd5ea311507a9c","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Inglés"]},
{"id":"19dd5ddf30e936d7","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Español"]},
{"id":"19dd5ddba592dc9d","labels":["Triage/Importante","Categoría/Bancos","Idioma/Español","Instrumento/Tarjeta de Débito","Banco/MACH"]},
{"id":"19dd5cd27e750b3b","labels":["Triage/Personal","Categoría/Salud","Idioma/Español"]},
{"id":"19dd5c923b21b408","labels":["Triage/Promoción","Categoría/Noticias","Idioma/Inglés"]},
{"id":"19dd5b9ab59cc4fb","labels":["Triage/Importante","Categoría/Servicios","Idioma/Español"]},
{"id":"19dd5b163e9b5eef","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd5b0d7f90e34f","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd596ef6a7ef8b","labels":["Triage/Prob Basura","Categoría/Comida","Idioma/Inglés"]},
{"id":"19dd5932a834e279","labels":["Triage/Promoción","Categoría/Educación","Idioma/Inglés"]},
{"id":"19dd577597a7f388","labels":["Triage/Promoción","Categoría/Noticias","Idioma/Inglés"]},
{"id":"19dd56f69dbc6d2d","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Español"]},
{"id":"19dd567f2da117a3","labels":["Triage/Promoción","Categoría/Cultura","Idioma/Inglés"]},
{"id":"19dd5670d521d7e6","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Español"]},
{"id":"19dd55787be27824","labels":["Triage/Promoción","Categoría/Otro","Idioma/Inglés"]},
{"id":"19dd54756a0d843b","labels":["Triage/Prob Basura","Categoría/Trabajo","Idioma/Inglés"]},
{"id":"19dd5455f3c7898c","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd544923501f0b","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd5410e3586e5b","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Portugués"]},
{"id":"19dd53e22e296641","labels":["Triage/Promoción","Categoría/Otro","Idioma/Alemán"]},
{"id":"19dd52c9c05b0fee","labels":["Triage/Promoción","Categoría/Educación","Idioma/Español"]},
{"id":"19dd5298181b3d57","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd522318505d5f","labels":["Triage/Promoción","Categoría/Bancos","Idioma/Español","Banco/MACH"]},
{"id":"19dd516a03e49bec","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd50a08a723f19","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Portugués"]},
{"id":"19dd4ffa85f3b427","labels":["Triage/Importante","Categoría/Bancos","Idioma/Español","Instrumento/Tarjeta de Crédito","Banco/BCI"]},
{"id":"19dd4f6f1cb003fc","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Portugués"]},
{"id":"19dd4ed6a63c95cf","labels":["Triage/Promoción","Categoría/Salud","Idioma/Español"]},
{"id":"19dd4e587a9d6084","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Español"]},
{"id":"19dd4d88343f74bb","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd4d71f3110e75","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd4d6b90b7ba59","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd4c418b101f46","labels":["Triage/Promoción","Categoría/Cultura","Idioma/Español"]},
{"id":"19dd4a9838bbb685","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd4a77cfaea3aa","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd4a437b882b28","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Español"]},
{"id":"19dd4a188a42cf0d","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Portugués"]},
{"id":"19dd49ed235c77e9","labels":["Triage/Promoción","Categoría/Bancos","Idioma/Inglés","Banco/Revolut"]},
{"id":"19dd4949043d3ad7","labels":["Triage/Promoción","Categoría/Salud","Idioma/Español"]},
{"id":"19dd48f42a73d77c","labels":["Triage/Promoción","Categoría/Comida","Idioma/Español"]},
{"id":"19dd48c8197f1688","labels":["Triage/Promoción","Categoría/Comida","Idioma/Español"]},
{"id":"19dd47dcb3d663db","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd476ca39a9236","labels":["Triage/Promoción","Categoría/Cultura","Idioma/Español"]},
{"id":"19dd46710a0a559c","labels":["Triage/Promoción","Categoría/Noticias","Idioma/Inglés"]},
{"id":"19dd44f6a53e6781","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Inglés"]},
{"id":"19dd442fe209af0b","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd44074dde40e3","labels":["Triage/Promoción","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd4382b75fc5ec","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd437f1417ddaa","labels":["Triage/Promoción","Categoría/Educación","Idioma/Español"]},
{"id":"19dd4307a78a1e3f","labels":["Triage/Prob Basura","Categoría/Comida","Idioma/Inglés"]},
{"id":"19dd4302894fbec4","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd42e531dfcbe5","labels":["Triage/Promoción","Categoría/Transporte","Idioma/Alemán"]},
{"id":"19dd417818f7c530","labels":["Triage/Promoción","Categoría/Salud","Idioma/Alemán"]},
{"id":"19dd400bb62ed65e","labels":["Triage/Promoción","Categoría/Salud","Idioma/Español"]},
{"id":"19dd3daa0f073a6f","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd3d80eadd391c","labels":["Triage/Promoción","Categoría/Educación","Idioma/Inglés"]},
{"id":"19dd3cc1a7048946","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Francés"]},
{"id":"19dd3c1c2ed21a75","labels":["Triage/Promoción","Categoría/Noticias","Idioma/Inglés"]},
{"id":"19dd3bf35b58b880","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd3a763400feb1","labels":["Triage/Importante","Categoría/Bancos","Idioma/Inglés","Instrumento/Cuenta Vista","Banco/Revolut"]},
{"id":"19dd3a0b9e5dc517","labels":["Triage/Importante","Categoría/Servicios","Idioma/Español","Banco/MercadoPago"]},
{"id":"19dd39d939ddcb17","labels":["Triage/Prob Basura","Categoría/Viajes","Idioma/Alemán"]},
{"id":"19dd38cc0dddbacb","labels":["Triage/Importante","Categoría/Bancos","Idioma/Español","Banco/Tenpo"]},
{"id":"19dd33e47e4ac97b","labels":["Triage/Promoción","Categoría/Viajes","Idioma/Alemán"]},
{"id":"19dd33ba26d6c094","labels":["Triage/Importante","Categoría/Bancos","Idioma/Español","Instrumento/Tarjeta de Débito","Banco/CopecPay"]},
{"id":"19dd335e52ba9b9c","labels":["Triage/Promoción","Categoría/Viajes","Idioma/Inglés"]},
{"id":"19dd329c7ff21ae7","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd31b9022bb76d","labels":["Triage/Prob Basura","Categoría/Salud","Idioma/Inglés"]},
{"id":"19dd31aed50b4458","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd31a95cb226a8","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Alemán"]},
{"id":"19dd30c92829e557","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]},
{"id":"19dd30c8141508e0","labels":["Triage/Prob Basura","Categoría/Compras","Idioma/Inglés"]}
]

1330
data/assignments-v3.json Normal file

File diff suppressed because it is too large

32
data/assignments-v4.json Normal file
@@ -0,0 +1,32 @@
[
{
"id": "19dd66fa85bd71af",
"labels": [],
"removeLabels": ["Tarjeta de Crédito"],
"note": "#16 Barclays support email — no cartola, remove instrument label"
},
{
"id": "19dd5ddba592dc9d",
"labels": ["Comprobante"],
"removeLabels": ["Tarjeta de Débito"],
"note": "#31 MACH comprobante de pago (TOKU *VIDA CAMARA SEGUR, Visa Débito 7253)"
},
{
"id": "19dd4ffa85f3b427",
"labels": [],
"removeLabels": ["Tarjeta de Crédito"],
"note": "#54 BCI payment reminder — no cartola, remove instrument label"
},
{
"id": "19dd3a763400feb1",
"labels": [],
"removeLabels": ["Cuenta Vista"],
"note": "#88 Revolut statement notification — no PDF attached, remove instrument label"
},
{
"id": "19dd33ba26d6c094",
"labels": [],
"removeLabels": ["Tarjeta de Débito"],
"note": "#93 CopecPay failed payment notification — no cartola, remove instrument label"
}
]

1012
data/assignments-v5.json Normal file

File diff suppressed because it is too large

4570
data/assignments-v6.json Normal file

File diff suppressed because it is too large

128
data/example-ledger.json Normal file
@@ -0,0 +1,128 @@
{
"entities": [
{
"id": "darwin",
"name": "Darwin Bruna",
"kind": "person",
"rut": "17.194.206-3"
},
{
"id": "muralla",
"name": "Muralla SpA",
"kind": "company",
"rut": "78.188.363-8"
},
{
"id": "black-spa",
"name": "BLACK SPA",
"kind": "vendor"
}
],
"accounts": [
{
"id": "darwin-bchile-cc-5402",
"ownerEntityId": "darwin",
"institution": "Banco de Chile / Edwards",
"instrument": "checking_account",
"currency": "CLP",
"label": "Cuenta corriente Darwin terminada 5402"
},
{
"id": "darwin-lider-visa-5018",
"ownerEntityId": "darwin",
"institution": "Tarjeta Lider",
"instrument": "credit_card",
"currency": "CLP",
"label": "Tarjeta credito Darwin terminada 5018"
}
],
"movements": [
{
"id": "darwin-income-2025-12-05",
"date": "2025-12-05",
"accountId": "darwin-bchile-cc-5402",
"direction": "in",
"amount": { "currency": "CLP", "value": 500000 },
"description": "Ingreso puro Darwin sin trazabilidad anterior",
"counterparty": "Ingreso externo",
"economicType": "pure_income"
},
{
"id": "darwin-bank-pay-visa-2026-01-15",
"date": "2026-01-15",
"accountId": "darwin-bchile-cc-5402",
"direction": "out",
"amount": { "currency": "CLP", "value": 300000 },
"description": "PAGO TARJETA LIDER VISA 5018",
"counterparty": "Tarjeta Lider",
"economicType": "card_payment",
"cardAccountId": "darwin-lider-visa-5018"
},
{
"id": "darwin-visa-black-spa-2025-12-30",
"date": "2025-12-30",
"accountId": "darwin-lider-visa-5018",
"direction": "out",
"amount": { "currency": "CLP", "value": 126616 },
"description": "CAFE CULTURA BLACK SPA",
"counterparty": "BLACK SPA",
"economicType": "card_charge",
"beneficiaryEntityId": "muralla"
},
{
"id": "darwin-visa-cocina-con-alma-2025-12-30",
"date": "2025-12-30",
"accountId": "darwin-lider-visa-5018",
"direction": "out",
"amount": { "currency": "CLP", "value": 78255 },
"description": "MERPAGO*COCINACONALMA",
"counterparty": "Cocina con Alma",
"economicType": "card_charge",
"beneficiaryEntityId": "por_confirmar"
}
],
"documents": [
{
"id": "black-spa-10804",
"kind": "dte_invoice",
"issuerName": "BLACK SPA",
"receiverEntityId": "muralla",
"folio": "10804",
"documentDate": "2025-12-30",
"amount": { "currency": "CLP", "value": 126616 },
"status": "matched"
}
],
"events": [
{
"id": "black-spa-10804",
"kind": "real_expense",
"entityId": "muralla",
"date": "2025-12-30",
"amount": { "currency": "CLP", "value": 126616 },
"description": "Gasto Muralla BLACK SPA folio 10804"
}
],
"links": [
{
"from": "doc:black-spa-10804",
"to": "event:black-spa-10804",
"type": "documents_event",
"amount": { "currency": "CLP", "value": 126616 },
"method": "exact",
"confidence": "exact",
"state": "approved",
"note": "Factura RCV Muralla 2026-01, fecha documento 2025-12-30."
},
{
"from": "mov:darwin-visa-black-spa-2025-12-30",
"to": "event:black-spa-10804",
"type": "finances_event",
"amount": { "currency": "CLP", "value": 126616 },
"method": "manual",
"confidence": "exact",
"state": "approved",
"note": "Cargo CAFE CULTURA BLACK SPA pagado con tarjeta Darwin 5018."
}
]
}

29
debug-labels.js Normal file
@@ -0,0 +1,29 @@
import { google } from 'googleapis';
import { loadAuthorizedOAuthClient } from './src/gmailApi.js';
import { getMailAccount } from './src/mailConfig.js';
// Debug helper: prints the label IDs attached to one Gmail message and
// resolves each ID to its human-readable label name.
async function check() {
  const msgId = '19de22e8637bfecf'; // ZEIT SPRACHEN from May 1
  try {
    const account = await getMailAccount('vjoati-gmail');
    const auth = await loadAuthorizedOAuthClient({ account });
    const gmail = google.gmail({ version: 'v1', auth });
    const res = await gmail.users.messages.get({ userId: 'me', id: msgId });
    // labelIds may be missing when the message carries no labels.
    const labelIds = res.data.labelIds ?? [];
    console.log('--- Message Metadata ---');
    console.log('ID:', res.data.id);
    console.log('Snippet:', res.data.snippet);
    console.log('Label IDs:', JSON.stringify(labelIds));
    const labelsRes = await gmail.users.labels.list({ userId: 'me' });
    const labels = labelsRes.data.labels ?? [];
    console.log('\n--- Label Mapping for this Message ---');
    for (const id of labelIds) {
      const match = labels.find((l) => l.id === id);
      console.log(id + ' -> ' + (match ? match.name : 'Unknown'));
    }
  } catch (e) {
    // Covers config, auth, and API failures alike.
    console.error('Error:', e.message);
    process.exitCode = 1;
  }
}
check();

109
docs/architecture.md Normal file
@@ -0,0 +1,109 @@
# Architecture
## Scope
`kua-money-trace` maintains a single source of truth for:
- financial movements
- tax and non-tax documents
- original files and emails
- links between movements
- fund origin trees
- human review decisions
## Principles
1. Every original file is preserved intact.
2. Every AI analysis is a reviewable proposal.
3. Every movement has an economic type.
4. A card payment is not an expense: it settles card debt.
5. Funding a wallet or a foreign account is not operating income.
6. Money in an account is fungible; when there is no exact link, FIFO per account and currency is used.
7. Every link has a method, a confidence, and a state.
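Principle 6 can be sketched as a FIFO allocation over one account and currency at a time. The code below is a hedged illustration, not the project's graph engine: `proposeFifoLinks` is a hypothetical name, the movement and link fields follow the shapes in `data/example-ledger.json` and `docs/data-model.md`, and integer amounts (as with CLP) are assumed so that subtraction stays exact.

```javascript
// Hypothetical sketch: propose FIFO funding links per account and currency.
function proposeFifoLinks(movements) {
  // Process movements oldest first. Array.prototype.sort is stable, so
  // same-day movements keep their input order.
  const ordered = [...movements].sort((a, b) => a.date.localeCompare(b.date));
  const queues = new Map(); // "accountId|currency" -> unspent inflows, oldest first
  const links = [];
  for (const mov of ordered) {
    const key = `${mov.accountId}|${mov.amount.currency}`;
    if (!queues.has(key)) queues.set(key, []);
    const queue = queues.get(key);
    if (mov.direction === 'in') {
      if (mov.amount.value > 0) {
        queue.push({ id: mov.id, remaining: mov.amount.value });
      }
      continue;
    }
    // Outflow: consume the oldest inflows until the amount is covered.
    let needed = mov.amount.value;
    while (needed > 0 && queue.length > 0) {
      const source = queue[0];
      const used = Math.min(needed, source.remaining);
      links.push({
        from: `mov:${source.id}`,
        to: `mov:${mov.id}`,
        type: 'funds',
        amount: { currency: mov.amount.currency, value: used },
        method: 'fifo',
        confidence: 'rule',
        state: 'proposed',
      });
      source.remaining -= used;
      needed -= used;
      if (source.remaining === 0) queue.shift();
    }
    // Whatever is still needed has no known origin and stays unlinked.
  }
  return links;
}

const fifoDemo = proposeFifoLinks([
  { id: 'in-1', date: '2025-12-05', accountId: 'acc', direction: 'in', amount: { currency: 'CLP', value: 500000 } },
  { id: 'in-2', date: '2025-12-20', accountId: 'acc', direction: 'in', amount: { currency: 'CLP', value: 200000 } },
  { id: 'out-1', date: '2026-01-15', accountId: 'acc', direction: 'out', amount: { currency: 'CLP', value: 300000 } },
  { id: 'out-2', date: '2026-01-20', accountId: 'acc', direction: 'out', amount: { currency: 'CLP', value: 300000 } },
]);
console.log(fifoDemo.map((l) => `${l.from} -> ${l.to}: ${l.amount.value}`));
```

Every generated link is emitted as `proposed`, in line with principle 2. The sketch only allocates within a single account; hops between accounts, such as a bank payment settling a card, would need explicit links.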
## Pipeline
```text
IMAP / manual files / bank APIs
-> raw file in Storagebox
-> attachment extraction
-> deterministic parser
-> AI classifier
-> normalized ledger
-> money graph
-> human review
-> accounting export
```
## Initial Gmail ingestion
Initial account:
```text
vjoati@gmail.com
```
Local config:
```text
config/mail-accounts.json
```
Expected credential:
```text
VJOATI_GMAIL_APP_PASSWORD
```
For safety, the downloader stores only:
- the original `.eml` message
- original attachments
- SHA-256 hash
- NDJSON manifest
- text preview
The AI must read from this archive and create separate proposals. It must not modify the originals.
## Alternatives when Gmail does not allow an app password
1. **Local Apple Mail (`.emlx`)**: if the account is already synced in Mail.app, the local messages are imported without asking for new credentials.
2. **Google Takeout (`.mbox`)**: useful for a bulk historical load without OAuth or IMAP.
3. **Gmail API OAuth**: the right path for continuous sync when there is no app password. It would use the `gmail.readonly` scope and `users.messages.get` with `format=raw`.
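With `format=raw`, the Gmail API returns the full RFC 822 message in the `raw` field as a base64url string, so it must be decoded before it is written out as `.eml`. A minimal sketch, assuming a hypothetical `decodeGmailRaw` helper and a hand-built sample object in place of a real API response:

```javascript
// Hypothetical helper: turns the `raw` field of a Gmail API message
// (base64url-encoded RFC 822 bytes) into a Buffer ready to be saved as .eml.
function decodeGmailRaw(message) {
  if (typeof message.raw !== 'string') {
    throw new Error('Message has no raw payload; request it with format=raw');
  }
  return Buffer.from(message.raw, 'base64url');
}

// Sample object built by hand for illustration; not a real API response.
const sampleEml = 'Subject: hola\r\n\r\ncuerpo';
const sampleMessage = {
  id: 'example-id',
  raw: Buffer.from(sampleEml, 'utf8').toString('base64url'),
};
console.log(decodeGmailRaw(sampleMessage).toString('utf8'));
```

Keeping the decoded bytes untouched, rather than re-serializing a parsed message, keeps the stored original intact and its SHA-256 hash stable.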
## Existing services
- `kua-mail`: reusable as a reference for IMAP, sync, folders, threads, and `mailparser`.
- `kua-notify`: not useful for inbound mail, but a good reference for a Fastify service with Vault, auditing, healthcheck, and auth.
- Storagebox: reuse the SFTP pattern via `ssh2-sftp-client`.
## Suggested Storagebox layout
```text
/accounting-mail/
raw-eml/{mailbox}/{yyyy}/{mm}/{message_hash}.eml
attachments/{yyyy}/{mm}/{sha256}-{original_filename}
normalized/{source}/{entity}/{account}/{period}/{sha256}.json
manifests/documents.ndjson
manifests/extractions.ndjson
```
## Graph nodes
- `mov:*`: bank statement, card, Mercado Pago, Revolut, etc.
- `doc:*`: factura, boleta, invoice, receipt, transfer record.
- `event:*`: real expense, real income, debt, reimbursement, adjustment.
- `file:*`: original file in Storagebox.
## Graph links
Direction: `from` funds, explains, or settles `to`.
Example:
```text
mov:darwin-income-001 -> mov:darwin-bank-pay-visa-001
mov:darwin-bank-pay-visa-001 -> mov:darwin-visa-black-spa-001
mov:darwin-visa-black-spa-001 -> event:black-spa-10804
doc:black-spa-10804 -> event:black-spa-10804
```

122
docs/data-model.md Normal file
@@ -0,0 +1,122 @@
# MVP Data Model
## Entity
A person or a company.
```json
{
"id": "darwin",
"name": "Darwin Bruna",
"kind": "person",
"rut": "17.194.206-3"
}
```
## Account
An account, card, or wallet.
```json
{
"id": "darwin-bchile-cc-5402",
"ownerEntityId": "darwin",
"institution": "Banco de Chile",
"instrument": "checking_account",
"currency": "CLP",
"label": "Cuenta corriente 5402"
}
```
Initial instruments:
- `checking_account`
- `current_account`
- `savings_account`
- `credit_card`
- `debit_card`
- `prepaid_card`
- `wallet`
- `foreign_account`
- `cash`
## Movement
An observed financial movement.
```json
{
"id": "mov:darwin-visa-black-spa",
"date": "2025-12-30",
"accountId": "darwin-lider-visa-5018",
"direction": "out",
"amount": { "currency": "CLP", "value": 126616 },
"description": "CAFE CULTURA BLACK SPA",
"counterparty": "BLACK SPA",
"economicType": "card_charge",
"beneficiaryEntityId": "muralla"
}
```
## Document
A supporting document.
```json
{
"id": "doc:black-spa-10804",
"kind": "dte_invoice",
"issuerName": "BLACK SPA",
"issuerRut": "76.xxx.xxx-x",
"receiverEntityId": "muralla",
"folio": "10804",
"documentDate": "2025-12-30",
"amount": { "currency": "CLP", "value": 126616 }
}
```
## Economic Event
A real economic event: expense, income, debt, or reimbursement.
```json
{
"id": "event:black-spa-10804",
"kind": "real_expense",
"entityId": "muralla",
"amount": { "currency": "CLP", "value": 126616 },
"description": "Gasto Muralla BLACK SPA folio 10804"
}
```
## Link
A link between nodes.
```json
{
"from": "mov:darwin-bank-income-001",
"to": "mov:darwin-bank-pay-visa-001",
"type": "funds",
"amount": { "currency": "CLP", "value": 126616 },
"method": "fifo",
"confidence": "rule",
"state": "proposed"
}
```
States:
- `proposed`
- `approved`
- `rejected`
- `needs_review`
Confidence levels:
- `exact`
- `strong`
- `probable`
- `rule`
- `manual`
- `unknown`
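The allowed states and confidence levels lend themselves to a small validation step before a link enters the ledger. The following is only a sketch under assumptions: `validateLink` and the error messages are hypothetical, and the required-field rules are inferred from the Link example above rather than taken from the service code.

```javascript
// Allowed values, copied from the lists in this document.
const LINK_STATES = ['proposed', 'approved', 'rejected', 'needs_review'];
const LINK_CONFIDENCES = ['exact', 'strong', 'probable', 'rule', 'manual', 'unknown'];

// Hypothetical validator: returns a list of problems, empty when the link is valid.
function validateLink(link) {
  const errors = [];
  for (const field of ['from', 'to', 'type', 'method']) {
    if (typeof link[field] !== 'string' || link[field] === '') {
      errors.push(`${field} is required`);
    }
  }
  if (!LINK_STATES.includes(link.state)) {
    errors.push(`unknown state: ${link.state}`);
  }
  if (!LINK_CONFIDENCES.includes(link.confidence)) {
    errors.push(`unknown confidence: ${link.confidence}`);
  }
  const amount = link.amount;
  if (!amount || typeof amount.currency !== 'string' || !(amount.value > 0)) {
    errors.push('amount must have a currency and a positive value');
  }
  return errors;
}

const sampleLink = {
  from: 'mov:darwin-bank-income-001',
  to: 'mov:darwin-bank-pay-visa-001',
  type: 'funds',
  amount: { currency: 'CLP', value: 126616 },
  method: 'fifo',
  confidence: 'rule',
  state: 'proposed',
};
console.log(validateLink(sampleLink));
console.log(validateLink({ ...sampleLink, state: 'auto_approved' }));
```

Returning a list of errors instead of throwing lets a reviewer see every problem with a proposed link at once.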

121
kua.json Normal file
@@ -0,0 +1,121 @@
{
"name": "kua-money-trace",
"version": "0.1.0",
"description": "Money trace and mail/document ingestion service for accounting-grade fund origin trees.",
"type": "service",
"framework": "node",
"team": {
"owner": "kavi",
"status": "active"
},
"stack": {
"backend": {
"runtime": "node",
"framework": "node",
"version": "20"
},
"database": {
"type": "local-json",
"version": "mvp"
},
"storage": {
"type": "local-files",
"planned": "storagebox"
}
},
"services": {
"api": {
"port": 3910,
"healthcheck": "/health",
"command": "npm run serve"
},
"gmail_oauth": {
"port": 3912,
"redirect_uri": "http://127.0.0.1:3912/oauth2callback",
"command": "npm run mail:gmail-oauth"
}
},
"environments": {
"development": {
"server": "gal",
"path": "/home/kavi/kua-money-trace",
"port": 3910,
"secrets": {
"project": "kua-money-trace",
"env": "dev",
"include": ["shared"]
}
},
"production": {
"server": "bruno",
"path": "/home/kavi/kua-money-trace",
"port": 3910,
"secrets": {
"project": "kua-money-trace",
"env": "prod",
"include": ["shared"]
}
}
},
"secrets": {
"project": "kua-money-trace",
"include": ["shared"],
"required": [
"GOOGLE_OAUTH_CLIENT_ID",
"GOOGLE_OAUTH_CLIENT_SECRET"
]
},
"migration": {
"databases": [],
"volumes": [
{
"name": "mail-archive",
"path": "data/mail-archive",
"backup_required": true,
"note": "Raw emails, attachments, and manifests. Move to Storagebox before production ingestion."
},
{
"name": "mail-oauth",
"path": "data/mail-oauth",
"backup_required": true,
"secret": true,
"note": "OAuth refresh tokens. Excluded from source transfer and should not be committed."
}
],
"env_files": [],
"required_secrets": [
"GOOGLE_OAUTH_CLIENT_ID",
"GOOGLE_OAUTH_CLIENT_SECRET"
],
"post_migration": [
"npm install",
"npm test"
]
},
"infrastructure": {
"database": "local-json",
"cache": "none",
"queue": "none",
"storage": "local-files",
"secrets": "kuavault"
},
"dependencies": {
"runtime": {
"node": ">=20.0.0"
},
"system": [
"sqlite3"
]
},
"health": {
"endpoint": "/health",
"timeout": 15,
"expected_status": 200
},
"deploy": {
"production": {
"mode": "direct",
"server": "bruno"
}
}
}

13
list-labels.js Normal file
@@ -0,0 +1,13 @@
import { google } from 'googleapis';
import { loadAuthorizedOAuthClient } from './src/gmailApi.js';
import { getMailAccount } from './src/mailConfig.js';
// Prints every Gmail label of the `vjoati-gmail` account as JSON.
async function run() {
  const account = await getMailAccount('vjoati-gmail');
  const auth = await loadAuthorizedOAuthClient({ account });
  const gmail = google.gmail({ version: 'v1', auth });
  const res = await gmail.users.labels.list({ userId: 'me' });
  console.log(JSON.stringify(res.data.labels, null, 2));
}
// Report failures instead of leaving an unhandled rejection.
run().catch((e) => {
  console.error('Error:', e.message);
  process.exitCode = 1;
});

1126
package-lock.json generated Normal file

File diff suppressed because it is too large

30
package.json Normal file
@@ -0,0 +1,30 @@
{
"name": "kua-money-trace",
"version": "0.1.0",
"description": "Money trace service for accounting-grade fund origin trees.",
"type": "module",
"private": true,
"scripts": {
"test": "node --test",
"demo": "node src/cli.js trace data/example-ledger.json event:black-spa-10804",
"summarize": "node src/cli.js summarize data/example-ledger.json",
"mail:dry-run": "node src/mailCli.js dry-run --account vjoati-gmail",
"mail:download": "node src/mailCli.js download --account vjoati-gmail",
"mail:gmail-oauth": "node src/mailCli.js gmail-oauth --account vjoati-gmail",
"mail:gmail-download": "node src/mailCli.js gmail-download --account vjoati-gmail",
"mail:gmail-trash": "node src/mailCli.js gmail-trash --account vjoati-gmail",
"mail:fixture": "node src/mailCli.js parse-fixture test/fixtures/sample-bank-email.eml --account vjoati-gmail",
"mail:list-apple": "node src/mailCli.js list-apple-mail",
"mail:import-apple-index": "node src/mailCli.js import-apple-mail-index --account vjoati-gmail",
"mail:import-kdoi": "node src/mailCli.js import-apple-mail-index --account kdoi-email",
"serve": "node src/server.js --ledger data/example-ledger.json --port 3910"
},
"engines": {
"node": ">=20"
},
"dependencies": {
"googleapis": "^171.4.0",
"imapflow": "^1.3.2",
"mailparser": "^3.9.8"
}
}

113
schema.sql Normal file
@@ -0,0 +1,113 @@
-- kua-money-trace MVP schema draft
-- Designed for Postgres. The current MVP uses JSON files, but this schema
-- defines the service boundary before adding persistence.
create table if not exists entities (
id text primary key,
name text not null,
kind text not null check (kind in ('person', 'company', 'vendor', 'bank', 'platform', 'unknown')),
rut text,
metadata jsonb not null default '{}'::jsonb,
created_at timestamptz not null default now()
);
create table if not exists financial_accounts (
id text primary key,
owner_entity_id text references entities(id),
institution text not null,
instrument text not null,
currency text not null,
label text,
account_number_hint text,
metadata jsonb not null default '{}'::jsonb,
created_at timestamptz not null default now()
);
create table if not exists source_files (
id text primary key,
storage_key text not null unique,
sha256 text not null,
original_filename text,
mime_type text,
size_bytes bigint,
source_kind text not null,
metadata jsonb not null default '{}'::jsonb,
created_at timestamptz not null default now()
);
create table if not exists mail_messages (
id text primary key,
mailbox text not null,
message_id text,
from_addr jsonb,
to_addrs jsonb,
subject text,
date_sent timestamptz,
raw_file_id text references source_files(id),
metadata jsonb not null default '{}'::jsonb,
created_at timestamptz not null default now()
);
create table if not exists documents (
id text primary key,
kind text not null,
issuer_name text,
issuer_rut text,
receiver_entity_id text references entities(id),
folio text,
document_date date,
currency text not null,
amount numeric(18, 4) not null,
source_file_id text references source_files(id),
extraction_state text not null default 'proposed',
metadata jsonb not null default '{}'::jsonb,
created_at timestamptz not null default now()
);
create table if not exists financial_movements (
id text primary key,
account_id text not null references financial_accounts(id),
occurred_at timestamptz not null,
direction text not null check (direction in ('in', 'out')),
currency text not null,
amount numeric(18, 4) not null,
description text,
counterparty text,
economic_type text not null default 'unknown',
beneficiary_entity_id text references entities(id),
source_file_id text references source_files(id),
metadata jsonb not null default '{}'::jsonb,
created_at timestamptz not null default now()
);
create table if not exists economic_events (
id text primary key,
kind text not null,
entity_id text references entities(id),
occurred_at timestamptz,
currency text not null,
amount numeric(18, 4) not null,
description text,
state text not null default 'proposed',
metadata jsonb not null default '{}'::jsonb,
created_at timestamptz not null default now()
);
create table if not exists money_links (
id text primary key,
from_node text not null,
to_node text not null,
link_type text not null,
currency text not null,
amount numeric(18, 4) not null,
method text not null,
confidence text not null default 'unknown',
state text not null default 'proposed',
note text,
created_at timestamptz not null default now()
);
create index if not exists idx_money_links_from on money_links(from_node);
create index if not exists idx_money_links_to on money_links(to_node);
create index if not exists idx_financial_movements_account_date on financial_movements(account_id, occurred_at);
create index if not exists idx_documents_receiver_date on documents(receiver_entity_id, document_date);
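
As a hedged illustration of how the JSON ledger could map onto this draft schema, the sketch below converts an in-memory link into a `money_links` row. The `linkToRow` helper and the sample link are hypothetical. The JavaScript field names (`from`, `to`, `type`, `amount`, `method`, `confidence`, `state`) follow the shapes used elsewhere in this commit, and nothing here talks to a database.

```javascript
// Hypothetical helper: maps a ledger link object to the money_links column names.
// Assumes link.amount is { currency, value }, as returned by amount() in src/domain.js.
function linkToRow(link) {
  return {
    id: link.id,
    from_node: link.from,
    to_node: link.to,
    link_type: link.type,
    currency: link.amount.currency,
    amount: link.amount.value,
    method: link.method,
    confidence: link.confidence ?? 'unknown', // same default as the column
    state: link.state ?? 'proposed', // same default as the column
    note: link.note ?? null,
  };
}

const row = linkToRow({
  id: 'link:example-1', // illustrative id, not taken from the repository
  from: 'mov:salary-2025-01',
  to: 'event:black-spa-10804',
  type: 'finances_event',
  amount: { currency: 'CLP', value: 150000 },
  method: 'rule_fifo',
});
console.log(row);
```

Keeping the column defaults in the mapper as well means a row built in memory looks the same as one the database would have filled in.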

79
src/classifier.js Normal file

@ -0,0 +1,79 @@
import { ECONOMIC_TYPES } from './domain.js';

// Outgoing bank movements that pay down a credit card balance.
const CARD_PAYMENT_PATTERNS = [
  /pago\s*(tc|tarjeta|visa|mastercard|cmr)/i,
  /cargo por pago tc/i,
  /pago.*tarjeta/i,
];

// Top-up descriptions. Despite the name, these are only consulted for foreign accounts below.
const WALLET_FUNDING_PATTERNS = [
  /ingreso de dinero/i,
  /carga.*mercado pago/i,
  /top[-\s]?up/i,
  /revolut/i,
];

// Statement rows that are not economic events (interest, fees, balance lines).
// The character classes accept both accented and unaccented Spanish spellings.
const NON_ACCOUNTING_PATTERNS = [
  /inter[eé]s/i,
  /impuesto l[ií]nea de cr[eé]dito/i,
  /comisi[oó]n admin/i,
  /saldo inicial/i,
  /saldo final/i,
];

// Rules are evaluated in order and the first match wins.
// An economicType already present on the movement is never overridden.
export function classifyMovement(movement, account) {
  if (movement.economicType) return movement.economicType;
  const text = [movement.description, movement.counterparty].filter(Boolean).join(' ');
  if (NON_ACCOUNTING_PATTERNS.some((pattern) => pattern.test(text))) {
    return ECONOMIC_TYPES.NON_ACCOUNTING;
  }
  if (account?.instrument === 'credit_card' && movement.direction === 'out') {
    return ECONOMIC_TYPES.CARD_CHARGE;
  }
  if (movement.direction === 'out' && CARD_PAYMENT_PATTERNS.some((pattern) => pattern.test(text))) {
    return ECONOMIC_TYPES.CARD_PAYMENT;
  }
  if (movement.direction === 'in' && account?.instrument === 'wallet') {
    return ECONOMIC_TYPES.WALLET_FUNDING;
  }
  if (movement.direction === 'in' && account?.instrument === 'foreign_account' && WALLET_FUNDING_PATTERNS.some((pattern) => pattern.test(text))) {
    return ECONOMIC_TYPES.FOREIGN_ACCOUNT_FUNDING;
  }
  if (movement.direction === 'in') {
    return account?.ownerKind === 'company' ? ECONOMIC_TYPES.OPERATING_INCOME : ECONOMIC_TYPES.PURE_INCOME;
  }
  if (movement.direction === 'out') {
    return ECONOMIC_TYPES.REAL_EXPENSE;
  }
  return ECONOMIC_TYPES.UNKNOWN;
}

export function classifyAllMovements(ledger) {
  const entityById = new Map(ledger.entities.map((entity) => [entity.id, entity]));
  // Attach the owner's kind to each account so income can be split into personal vs company.
  const accountById = new Map(ledger.accounts.map((account) => {
    const owner = entityById.get(account.ownerEntityId);
    return [account.id, { ...account, ownerKind: owner?.kind || null }];
  }));
  return {
    ...ledger,
    movements: ledger.movements.map((movement) => {
      // Movements may reference accounts with a node prefix; keep only the last segment.
      const rawAccountId = movement.accountId?.includes(':')
        ? movement.accountId.split(':').at(-1)
        : movement.accountId;
      const account = accountById.get(rawAccountId);
      return {
        ...movement,
        economicType: classifyMovement(movement, account),
      };
    }),
  };
}
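
The rule order in `classifyMovement` decides the result when several rules could apply. The snippet below is a condensed, hypothetical re-statement of that order (`classifySketch`, the trimmed pattern lists and the `checking` instrument name are assumptions made for illustration) so it can run without the rest of the repository.

```javascript
// Condensed sketch of the rule order; only a subset of the real rules is reproduced
// and type names are plain strings instead of ECONOMIC_TYPES constants.
const NON_ACCOUNTING = [/interes/i, /saldo inicial/i];
const CARD_PAYMENT = [/pago\s*(tc|tarjeta|visa|mastercard|cmr)/i];

function classifySketch(movement, account) {
  if (movement.economicType) return movement.economicType; // an explicit value always wins
  const text = [movement.description, movement.counterparty].filter(Boolean).join(' ');
  if (NON_ACCOUNTING.some((p) => p.test(text))) return 'non_accounting';
  if (account?.instrument === 'credit_card' && movement.direction === 'out') return 'card_charge';
  if (movement.direction === 'out' && CARD_PAYMENT.some((p) => p.test(text))) return 'card_payment';
  if (movement.direction === 'out') return 'real_expense';
  return 'unknown';
}

const checking = { instrument: 'checking' }; // hypothetical instrument name
const card = { instrument: 'credit_card' };

const results = [
  // Same description, different account: paying the card vs a charge on the card.
  classifySketch({ direction: 'out', description: 'Pago TC Visa' }, checking),
  classifySketch({ direction: 'out', description: 'Pago TC Visa' }, card),
  // The non-accounting patterns are checked before the card payment patterns.
  classifySketch({ direction: 'out', description: 'Intereses pago tarjeta' }, checking),
  // A pre-set economicType short-circuits every rule.
  classifySketch({ direction: 'out', description: 'Supermercado', economicType: 'reimbursement' }, checking),
];
console.log(results);
```

Because the first match wins, putting the non-accounting check first keeps interest and balance rows out of the expense totals even when their description also mentions a card payment.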

40
src/cli.js Normal file

@ -0,0 +1,40 @@
#!/usr/bin/env node
import { buildMoneyGraph, destinationTree, originTree, summarizeLedger } from './moneyGraph.js';
import { formatDestinationTree, formatOriginTree } from './formatTree.js';
import { loadLedger, resolveAccountOwnerHints } from './ledgerStore.js';
const [command, ledgerPath, nodeId] = process.argv.slice(2);
if (!command || !ledgerPath) {
usage();
process.exit(1);
}
const ledger = resolveAccountOwnerHints(await loadLedger(ledgerPath));
const graph = buildMoneyGraph(ledger);
if (command === 'summarize') {
console.log(JSON.stringify(summarizeLedger(ledger, graph), null, 2));
} else if (command === 'trace') {
if (!nodeId) {
console.error('trace requires node id');
process.exit(1);
}
console.log(formatOriginTree(originTree(graph, nodeId)));
} else if (command === 'destinations') {
if (!nodeId) {
console.error('destinations requires node id');
process.exit(1);
}
console.log(formatDestinationTree(destinationTree(graph, nodeId)));
} else {
usage();
process.exit(1);
}
function usage() {
console.error(`Usage:
node src/cli.js summarize <ledger.json>
node src/cli.js trace <ledger.json> <node-id>
node src/cli.js destinations <ledger.json> <node-id>`);
}

133
src/domain.js Normal file

@ -0,0 +1,133 @@
export const ECONOMIC_TYPES = Object.freeze({
PURE_INCOME: 'pure_income',
OPERATING_INCOME: 'operating_income',
REAL_EXPENSE: 'real_expense',
CARD_CHARGE: 'card_charge',
CARD_PAYMENT: 'card_payment',
WALLET_FUNDING: 'wallet_funding',
FOREIGN_ACCOUNT_FUNDING: 'foreign_account_funding',
INTERNAL_TRANSFER: 'internal_transfer',
REIMBURSEMENT: 'reimbursement',
PARTNER_LOAN: 'partner_loan',
PARTNER_WITHDRAWAL: 'partner_withdrawal',
REFUND: 'refund',
ADJUSTMENT: 'adjustment',
NON_ACCOUNTING: 'non_accounting',
UNKNOWN: 'unknown',
});
export const LINK_TYPES = Object.freeze({
FUNDS: 'funds',
SETTLES_CARD_CHARGE: 'settles_card_charge',
FINANCES_EVENT: 'finances_event',
DOCUMENTS_EVENT: 'documents_event',
INTERNAL_TRANSFER: 'internal_transfer',
REIMBURSES: 'reimburses',
FX_CONVERSION: 'fx_conversion',
PLATFORM_PAYMENT: 'platform_payment',
});
export const LINK_METHODS = Object.freeze({
EXACT: 'exact',
RULE_FIFO: 'rule_fifo',
MANUAL: 'manual',
AI_PROPOSED: 'ai_proposed',
IMPORTED: 'imported',
});
export function amount(currency, value) {
if (!currency || typeof currency !== 'string') {
throw new Error('amount.currency is required');
}
if (!Number.isFinite(value)) {
throw new Error('amount.value must be a finite number');
}
return { currency, value };
}
export function signedAmount(movement) {
const sign = movement.direction === 'out' ? -1 : 1;
return sign * movement.amount.value;
}
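export function nodeId(kind, id) {
  if (id == null) throw new Error(`${kind} id is required`);
  const text = String(id); // numeric ids from JSON are coerced instead of throwing
  if (text.includes(':')) return text;
  return `${kind}:${text}`;
}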
export function compareDateThenId(a, b) {
const byDate = String(a.date || '').localeCompare(String(b.date || ''));
if (byDate !== 0) return byDate;
return String(a.id).localeCompare(String(b.id));
}
export function assertLedgerShape(ledger) {
if (!ledger || typeof ledger !== 'object') throw new Error('ledger must be an object');
for (const key of ['entities', 'accounts', 'movements']) {
if (!Array.isArray(ledger[key])) throw new Error(`ledger.${key} must be an array`);
}
if (ledger.documents && !Array.isArray(ledger.documents)) {
throw new Error('ledger.documents must be an array');
}
if (ledger.events && !Array.isArray(ledger.events)) {
throw new Error('ledger.events must be an array');
}
if (ledger.links && !Array.isArray(ledger.links)) {
throw new Error('ledger.links must be an array');
}
}
export function normalizeLedger(raw) {
assertLedgerShape(raw);
return {
entities: raw.entities,
accounts: raw.accounts,
movements: raw.movements.map((movement) => ({
...movement,
id: nodeId('mov', movement.id),
})),
documents: (raw.documents || []).map((document) => ({
...document,
id: nodeId('doc', document.id),
})),
events: (raw.events || []).map((event) => ({
...event,
id: nodeId('event', event.id),
})),
links: (raw.links || []).map((link) => ({
state: 'proposed',
confidence: 'unknown',
...link,
from: normalizeExistingNodeId(link.from),
to: normalizeExistingNodeId(link.to),
})),
};
}
function normalizeExistingNodeId(id) {
if (typeof id !== 'string') throw new Error('link node ids must be strings');
if (id.includes(':')) return id;
throw new Error(`link node id must include prefix: ${id}`);
}
export function buildNodeIndex(ledger) {
const nodes = new Map();
for (const entity of ledger.entities) {
nodes.set(nodeId('entity', entity.id), { kind: 'entity', ...entity, id: nodeId('entity', entity.id) });
}
for (const account of ledger.accounts) {
nodes.set(nodeId('account', account.id), { kind: 'account', ...account, id: nodeId('account', account.id) });
}
for (const movement of ledger.movements) {
nodes.set(movement.id, { kind: 'movement', ...movement });
}
for (const document of ledger.documents) {
nodes.set(document.id, { kind: 'document', ...document });
}
for (const event of ledger.events) {
nodes.set(event.id, { kind: 'event', ...event });
}
return nodes;
}
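
To make the id convention concrete, here is a small sketch of how `nodeId` prefixes raw ids and leaves already-prefixed ids alone. The function is re-stated so the snippet runs on its own, and the ledger fragment is invented for illustration.

```javascript
// Re-statement of nodeId from src/domain.js (string ids only).
function nodeId(kind, id) {
  if (id.includes(':')) return id;
  return `${kind}:${id}`;
}

// Hypothetical raw ledger fragment; ids are made up.
const raw = {
  movements: [
    { id: 'salary-jan', direction: 'in' },
    { id: 'mov:rent-jan', direction: 'out' }, // already prefixed, kept as-is
  ],
  events: [{ id: 'rent-2025-01' }],
};

const movementIds = raw.movements.map((movement) => nodeId('mov', movement.id));
const eventIds = raw.events.map((event) => nodeId('event', event.id));
console.log(movementIds, eventIds);
```

Links are stricter than nodes: `normalizeLedger` rejects any `from` or `to` value that does not already carry a prefix, so a link can never point at an ambiguous id.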

55
src/formatTree.js Normal file

@ -0,0 +1,55 @@
function labelNode(node) {
  const amount = node.amount ? ` ${formatAmount(node.amount)}` : '';
  const date = node.date || node.documentDate ? ` ${node.date || node.documentDate}` : '';
  const text = node.description || node.subject || node.label || node.name || node.issuerName || node.id;
  return `${node.id}${date}${amount} - ${text}`;
}

function formatAmount(amount) {
  const value = new Intl.NumberFormat('es-CL').format(amount.value);
  return `${amount.currency} ${value}`;
}

function labelLink(link) {
  const amount = link.amount ? ` ${formatAmount(link.amount)}` : '';
  const method = link.method ? ` (${link.method})` : '';
  return `${link.type}${amount}${method}`;
}

// Padding strings are three characters wide so nested lines align under the connectors.
export function formatOriginTree(tree) {
  const lines = [];
  function visit(branch, prefix, isLast) {
    const connector = prefix ? (isLast ? '└─ ' : '├─ ') : '';
    lines.push(`${prefix}${connector}${labelNode(branch.node)}`);
    const nextPrefix = prefix + (prefix ? (isLast ? '   ' : '│  ') : '');
    branch.incoming.forEach((incoming, index) => {
      const lastIncoming = index === branch.incoming.length - 1;
      const linkPrefix = nextPrefix + (lastIncoming ? '└─ ' : '├─ ');
      lines.push(`${linkPrefix}${labelLink(incoming.link)}`);
      visit(incoming.source, nextPrefix + (lastIncoming ? '   ' : '│  '), true);
    });
  }
  visit(tree, '', true);
  return lines.join('\n');
}

export function formatDestinationTree(tree) {
  const lines = [];
  function visit(branch, prefix, isLast) {
    const connector = prefix ? (isLast ? '└─ ' : '├─ ') : '';
    lines.push(`${prefix}${connector}${labelNode(branch.node)}`);
    const nextPrefix = prefix + (prefix ? (isLast ? '   ' : '│  ') : '');
    branch.outgoing.forEach((outgoing, index) => {
      const lastOutgoing = index === branch.outgoing.length - 1;
      const linkPrefix = nextPrefix + (lastOutgoing ? '└─ ' : '├─ ');
      lines.push(`${linkPrefix}${labelLink(outgoing.link)}`);
      visit(outgoing.target, nextPrefix + (lastOutgoing ? '   ' : '│  '), true);
    });
  }
  visit(tree, '', true);
  return lines.join('\n');
}
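
The formatters expect a recursive branch shape: `{ node, incoming: [{ link, source }] }` for origin trees and `{ node, outgoing: [{ link, target }] }` for destination trees. The sketch below builds a hypothetical origin tree by hand and walks it with a simplified helper (`originPath` is not part of the repository and does not reproduce the real box-drawing output).

```javascript
// Hypothetical origin tree; node ids, link types and amounts are illustrative only.
const tree = {
  node: { id: 'event:dinner', amount: { currency: 'CLP', value: 30000 } },
  incoming: [
    {
      link: { type: 'finances_event', method: 'manual' },
      source: {
        node: { id: 'mov:card-charge' },
        incoming: [
          {
            link: { type: 'settles_card_charge', method: 'rule_fifo' },
            source: { node: { id: 'mov:salary' }, incoming: [] },
          },
        ],
      },
    },
  ],
};

// Simplified walker: lists node ids from the traced node back to its sources,
// indenting two spaces per hop.
function originPath(branch, depth = 0, out = []) {
  out.push(`${'  '.repeat(depth)}${branch.node.id}`);
  for (const incoming of branch.incoming) {
    originPath(incoming.source, depth + 1, out);
  }
  return out;
}

const lines = originPath(tree);
console.log(lines.join('\n'));
```

Leaves are branches with an empty `incoming` array, which is what ends the recursion in both the sketch and the real formatter.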

368
src/gmailApi.js Normal file

@ -0,0 +1,368 @@
import fs from 'node:fs/promises';
import http from 'node:http';
import path from 'node:path';
import { google } from 'googleapis';
import { archiveRawEmail, sha256 } from './mailArchive.js';
const GMAIL_READONLY_SCOPE = 'https://www.googleapis.com/auth/gmail.readonly';
const GMAIL_MODIFY_SCOPE = 'https://www.googleapis.com/auth/gmail.modify';
const DEFAULT_OAUTH_SCOPES = [GMAIL_MODIFY_SCOPE];
const DEFAULT_REDIRECT_URI = 'http://127.0.0.1:3912/oauth2callback';
export async function runGmailOAuthFlow({ account, redirectUri = DEFAULT_REDIRECT_URI, scopes = DEFAULT_OAUTH_SCOPES }) {
const oauth2Client = createOAuthClient({ account, redirectUri });
const authUrl = oauth2Client.generateAuthUrl({
access_type: 'offline',
prompt: 'consent',
scope: scopes,
});
const result = await waitForOAuthCallback({ authUrl, redirectUri, scopes });
const { tokens } = await oauth2Client.getToken(result.code);
oauth2Client.setCredentials(tokens);
const tokenPath = resolveTokenPath(account);
await writeToken(tokenPath, tokens);
return {
accountId: account.id,
email: account.email,
tokenPath,
scopes,
authUrl,
};
}
export async function trashGmailMessages({ account, ids, gmail }) {
if (!Array.isArray(ids) || ids.length === 0) {
throw new Error('trashGmailMessages requires a non-empty ids array');
}
const gmailClient = gmail || google.gmail({
version: 'v1',
auth: await loadAuthorizedOAuthClient({ account }),
});
const results = [];
for (const id of ids) {
try {
const response = await gmailClient.users.messages.trash({ userId: 'me', id });
results.push({
id,
trashed: true,
labelIds: response.data.labelIds || [],
});
} catch (error) {
results.push({
id,
trashed: false,
error: { message: error.message, code: error.code },
});
}
}
return {
accountId: account.id,
requested: ids.length,
trashed: results.filter((r) => r.trashed).length,
failed: results.filter((r) => !r.trashed).length,
results,
};
}
export async function downloadGmailApiAccount({
account,
limit = 25,
since = account.defaultSince,
query,
archiveRoot,
includeSpamTrash = false,
gmail,
}) {
const gmailClient = gmail || google.gmail({
version: 'v1',
auth: await loadAuthorizedOAuthClient({ account }),
});
const q = query || buildGmailQuery({ since });
const messageRefs = await listGmailMessageRefs({
gmail: gmailClient,
q,
limit,
includeSpamTrash,
});
const root = archiveRoot || account.archive?.root || `data/mail-archive/${account.id}`;
const uidIndexDir = path.join(root, 'manifests', '.uid-index');
await fs.mkdir(uidIndexDir, { recursive: true });
const records = [];
for (const ref of messageRefs) {
const uidMarker = path.join(uidIndexDir, sha256(ref.id));
try {
await fs.access(uidMarker);
records.push({ id: ref.id, skipped: true, reason: 'already archived' });
continue;
} catch { /* not yet archived, proceed */ }
const response = await gmailClient.users.messages.get({
userId: 'me',
id: ref.id,
format: 'raw',
});
const raw = response.data.raw;
if (!raw) {
records.push({ id: ref.id, skipped: true, reason: 'missing raw message body' });
continue;
}
const record = await archiveRawEmail({
account,
mailbox: 'gmail-api',
uid: ref.id,
source: decodeBase64Url(raw),
archiveRoot: root,
});
await fs.writeFile(uidMarker, '', { flag: 'wx' }).catch(() => {});
record.gmail = {
id: ref.id,
threadId: ref.threadId || response.data.threadId || null,
historyId: response.data.historyId || null,
internalDate: response.data.internalDate || null,
labelIds: response.data.labelIds || [],
};
records.push(record);
}
return {
accountId: account.id,
email: account.email,
query: q,
archiveRoot: root,
requested: Number(limit),
listed: messageRefs.length,
imported: records.filter((record) => !record.skipped).length,
records,
};
}
export async function listGmailMessageRefs({ gmail, q, limit = 25, includeSpamTrash = false }) {
const refs = [];
let pageToken;
while (refs.length < limit) {
const response = await gmail.users.messages.list({
userId: 'me',
q,
includeSpamTrash,
maxResults: Math.min(500, limit - refs.length),
pageToken,
});
refs.push(...(response.data.messages || []));
pageToken = response.data.nextPageToken;
if (!pageToken) break;
}
return refs.slice(0, limit);
}
export async function gmailEngagement({ account, sender, sample = 50, gmail }) {
if (!sender) throw new Error('gmailEngagement requires a sender');
const gmailClient = gmail || google.gmail({
version: 'v1',
auth: await loadAuthorizedOAuthClient({ account }),
});
const q = `from:${sender}`;
const refs = await listGmailMessageRefs({ gmail: gmailClient, q, limit: sample, includeSpamTrash: false });
let unread = 0;
for (const ref of refs) {
const meta = await gmailClient.users.messages.get({
userId: 'me',
id: ref.id,
format: 'metadata',
metadataHeaders: ['From'],
});
if ((meta.data.labelIds || []).includes('UNREAD')) unread += 1;
}
const total = refs.length;
const read = total - unread;
return {
sender,
sampled: total,
read,
unread,
readRate: total === 0 ? null : Number((read / total).toFixed(3)),
};
}
async function ensureGmailLabels({ gmail, names }) {
const existing = await gmail.users.labels.list({ userId: 'me' });
const byName = new Map((existing.data.labels || []).map((l) => [l.name, l.id]));
const out = {};
for (const name of names) {
if (byName.has(name)) {
out[name] = byName.get(name);
continue;
}
const created = await gmail.users.labels.create({
userId: 'me',
requestBody: {
name,
labelListVisibility: 'labelShow',
messageListVisibility: 'show',
},
});
out[name] = created.data.id;
byName.set(name, created.data.id);
}
return out;
}
export async function applyGmailLabels({ account, assignments, gmail }) {
if (!Array.isArray(assignments) || assignments.length === 0) {
throw new Error('applyGmailLabels requires a non-empty assignments array');
}
const gmailClient = gmail || google.gmail({
version: 'v1',
auth: await loadAuthorizedOAuthClient({ account }),
});
const allNames = [...new Set(assignments.flatMap((a) => [
...(a.labels || []),
...(a.removeLabels || []),
]))];
const labelIdByName = await ensureGmailLabels({ gmail: gmailClient, names: allNames });
const results = [];
for (const { id, labels = [], removeLabels = [] } of assignments) {
const addLabelIds = labels.map((name) => labelIdByName[name]).filter(Boolean);
const removeLabelIds = removeLabels.map((name) => labelIdByName[name]).filter(Boolean);
if (addLabelIds.length === 0 && removeLabelIds.length === 0) {
results.push({ id, applied: false, reason: 'no resolvable labels' });
continue;
}
try {
const response = await gmailClient.users.messages.modify({
userId: 'me',
id,
requestBody: { addLabelIds, removeLabelIds },
});
results.push({
id,
applied: true,
labels,
removeLabels,
labelIds: response.data.labelIds || [],
});
} catch (error) {
results.push({
id,
applied: false,
labels,
removeLabels,
error: { message: error.message, code: error.code },
});
}
}
return {
accountId: account.id,
requested: assignments.length,
applied: results.filter((r) => r.applied).length,
failed: results.filter((r) => !r.applied).length,
labels: labelIdByName,
results,
};
}
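export async function loadAuthorizedOAuthClient({ account, redirectUri = DEFAULT_REDIRECT_URI }) {
  const oauth2Client = createOAuthClient({ account, redirectUri });
  const tokenPath = resolveTokenPath(account);
  let tokens;
  try {
    tokens = JSON.parse(await fs.readFile(tokenPath, 'utf8'));
  } catch (error) {
    throw new Error(`cannot read OAuth token at ${tokenPath}; run the OAuth flow first (${error.message})`);
  }
  oauth2Client.setCredentials(tokens);
  // Persist refreshed tokens. Merging keeps the refresh_token, which Google only sends once.
  oauth2Client.on('tokens', (newTokens) => {
    tokens = { ...tokens, ...newTokens };
    writeToken(tokenPath, tokens).catch((error) => {
      console.error(`failed to persist refreshed OAuth token: ${error.message}`);
    });
  });
  return oauth2Client;
}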
export function createOAuthClient({ account, redirectUri = DEFAULT_REDIRECT_URI }) {
const clientId = readOAuthValue(account.oauth?.clientIdEnv, 'GOOGLE_OAUTH_CLIENT_ID');
const clientSecret = readOAuthValue(account.oauth?.clientSecretEnv, 'GOOGLE_OAUTH_CLIENT_SECRET', false);
if (!clientId) {
throw new Error(`missing OAuth client id. Set ${account.oauth?.clientIdEnv || 'GOOGLE_OAUTH_CLIENT_ID'}`);
}
return new google.auth.OAuth2(clientId, clientSecret || undefined, account.oauth?.redirectUri || redirectUri);
}
export function buildGmailQuery({ since }) {
if (!since) return '';
const date = new Date(since);
if (Number.isNaN(date.getTime())) throw new Error(`invalid since date: ${since}`);
const yyyy = date.getUTCFullYear();
const mm = String(date.getUTCMonth() + 1).padStart(2, '0');
const dd = String(date.getUTCDate()).padStart(2, '0');
return `after:${yyyy}/${mm}/${dd}`;
}
export function decodeBase64Url(value) {
const normalized = String(value).replaceAll('-', '+').replaceAll('_', '/');
const padded = normalized.padEnd(Math.ceil(normalized.length / 4) * 4, '=');
return Buffer.from(padded, 'base64');
}
export function resolveTokenPath(account) {
return path.resolve(account.oauth?.tokenPath || `data/mail-oauth/${account.id}.token.json`);
}
async function waitForOAuthCallback({ authUrl, redirectUri, scopes = DEFAULT_OAUTH_SCOPES }) {
const redirect = new URL(redirectUri);
if (!['127.0.0.1', 'localhost'].includes(redirect.hostname)) {
throw new Error(`local OAuth callback requires localhost redirect URI, got ${redirectUri}`);
}
return new Promise((resolve, reject) => {
const server = http.createServer((request, response) => {
const requestUrl = new URL(request.url, redirectUri);
if (requestUrl.pathname !== redirect.pathname) {
response.writeHead(404);
response.end('Not found');
return;
}
const code = requestUrl.searchParams.get('code');
const error = requestUrl.searchParams.get('error');
if (error) {
response.writeHead(400, { 'Content-Type': 'text/plain' });
response.end(`OAuth failed: ${error}`);
server.close();
reject(new Error(`OAuth failed: ${error}`));
return;
}
if (!code) {
response.writeHead(400, { 'Content-Type': 'text/plain' });
response.end('Missing code');
return;
}
response.writeHead(200, { 'Content-Type': 'text/plain' });
response.end('OAuth complete. You can close this tab.');
server.close();
resolve({ code });
});
server.on('error', reject);
server.listen(Number(redirect.port), redirect.hostname, () => {
console.log(JSON.stringify({
message: 'Open this URL in your browser and approve Gmail access (read + modify, includes trash/label).',
authUrl,
callback: redirectUri,
scopes,
}, null, 2));
});
});
}
async function writeToken(tokenPath, tokens) {
await fs.mkdir(path.dirname(tokenPath), { recursive: true });
await fs.writeFile(tokenPath, `${JSON.stringify(tokens, null, 2)}\n`, { mode: 0o600 });
}
// Reads the account-specific env var first, then the shared fallback.
// Returns null when neither is set; callers decide whether that is an error.
function readOAuthValue(primaryEnv, fallbackEnv) {
  const value = (primaryEnv && process.env[primaryEnv]) || (fallbackEnv && process.env[fallbackEnv]);
  return value || null;
}
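
Two of the helpers above are pure functions and easy to check in isolation. The snippet copies `buildGmailQuery` and `decodeBase64Url` so it runs standalone; the sample inputs are illustrative.

```javascript
// Copies of two pure helpers from src/gmailApi.js.
function buildGmailQuery({ since }) {
  if (!since) return '';
  const date = new Date(since);
  if (Number.isNaN(date.getTime())) throw new Error(`invalid since date: ${since}`);
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, '0');
  const dd = String(date.getUTCDate()).padStart(2, '0');
  return `after:${yyyy}/${mm}/${dd}`;
}

function decodeBase64Url(value) {
  const normalized = String(value).replaceAll('-', '+').replaceAll('_', '/');
  const padded = normalized.padEnd(Math.ceil(normalized.length / 4) * 4, '=');
  return Buffer.from(padded, 'base64');
}

// A date-only ISO string is parsed as UTC midnight, so the UTC getters return the same day.
const query = buildGmailQuery({ since: '2025-01-01' });
const emptyQuery = buildGmailQuery({});

// "-_8" is the URL-safe, unpadded form of "+/8=", which encodes the bytes 0xfb 0xff.
const bytes = [...decodeBase64Url('-_8')];
console.log(query, JSON.stringify(emptyQuery), bytes);
```

The Gmail API returns `raw` message bodies in the URL-safe base64 alphabet, which is why the helper maps `-` and `_` back to `+` and `/` before decoding.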

26
src/ledgerStore.js Normal file

@ -0,0 +1,26 @@
import fs from 'node:fs/promises';
import path from 'node:path';
import { classifyAllMovements } from './classifier.js';
import { normalizeLedger } from './domain.js';
export async function loadLedger(filePath) {
const absolutePath = path.resolve(filePath);
const rawText = await fs.readFile(absolutePath, 'utf8');
const raw = JSON.parse(rawText);
const normalized = normalizeLedger(raw);
return classifyAllMovements(normalized);
}
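export function resolveAccountOwnerHints(ledger) {
  const accountById = new Map(ledger.accounts.map((account) => [account.id, account]));
  return {
    ...ledger,
    movements: ledger.movements.map((movement) => {
      // Accept both raw and prefixed account ids, consistent with classifyAllMovements.
      const rawAccountId = movement.accountId?.includes(':')
        ? movement.accountId.split(':').at(-1)
        : movement.accountId;
      const account = accountById.get(movement.accountId) || accountById.get(rawAccountId);
      return {
        ...movement,
        ownerEntityId: account?.ownerEntityId || movement.ownerEntityId || null,
      };
    }),
  };
}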

402
src/localMailImport.js Normal file

@ -0,0 +1,402 @@
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { archiveRawEmail } from './mailArchive.js';

// Defaults resolve against the current user's home directory instead of a hardcoded user path.
const DEFAULT_MAIL_ROOT = path.join(os.homedir(), 'Library', 'Mail', 'V10');
const DEFAULT_ENVELOPE_INDEX = path.join(DEFAULT_MAIL_ROOT, 'MailData', 'Envelope Index');
const DEFAULT_ACCOUNTS_DB = path.join(os.homedir(), 'Library', 'Accounts', 'Accounts4.sqlite');
const execFileAsync = promisify(execFile);
export async function listAppleMailSources(mailRoot = DEFAULT_MAIL_ROOT) {
const sources = [];
const accountDirs = await safeReaddir(mailRoot, { withFileTypes: true });
for (const accountDir of accountDirs.filter((entry) => entry.isDirectory())) {
if (accountDir.name === 'MailData') continue;
const accountPath = path.join(mailRoot, accountDir.name);
const mailboxes = await findMailboxDirs(accountPath);
if (!mailboxes.length) continue;
sources.push({
accountDir: accountDir.name,
accountPath,
mailboxes: mailboxes.map((mailbox) => ({
path: mailbox,
label: mailboxLabel(mailbox, accountPath),
})),
});
}
return sources;
}
export async function importAppleMailEmlx({ account, sourcePath, archiveRoot, limit = 25, since }) {
const files = await findFiles(sourcePath, (file) => file.endsWith('.emlx') && !file.endsWith('.partial.emlx'));
const selectedFiles = [];
const sinceDate = since ? new Date(since) : null;
for (const file of files.sort()) {
if (sinceDate) {
const stat = await fs.stat(file);
if (stat.mtime < sinceDate) continue;
}
selectedFiles.push(file);
if (selectedFiles.length >= limit) break;
}
const records = [];
for (const file of selectedFiles) {
const source = await readEmlxRawMessage(file);
const record = await archiveRawEmail({
account,
mailbox: mailboxLabel(file, sourcePath),
uid: path.basename(file, '.emlx'),
source,
archiveRoot,
});
record.localSource = file;
records.push(record);
}
return {
sourcePath,
archiveRoot,
imported: records.length,
scanned: files.length,
records,
};
}
export async function importAppleMailFromIndex({
account,
email = account.email || account.auth?.user,
mailRoot = DEFAULT_MAIL_ROOT,
indexPath = DEFAULT_ENVELOPE_INDEX,
accountsDbPath = DEFAULT_ACCOUNTS_DB,
mailbox = 'all',
archiveRoot,
limit = 25,
since,
}) {
if (!email) throw new Error('email is required to resolve Apple Mail account');
const { imapAccount, mailboxRecord } = await resolveAppleMailAccountMailbox({
email,
accountsDbPath,
indexPath,
mailbox,
});
const sourcePath = mailboxPathForUrl(mailRoot, mailboxRecord.url);
const sinceTimestamp = since ? Math.floor(new Date(since).getTime() / 1000) : 0;
const importLimit = Number(limit);
const candidateLimit = Math.max(importLimit * 20, importLimit + 100);
const messages = await queryJson(indexPath, `
select m.ROWID as rowid, m.remote_id as remoteId, m.date_received as dateReceived
from messages m
where m.mailbox = ${Number(mailboxRecord.rowid)}
and m.deleted = 0
and m.date_received >= ${Number.isFinite(sinceTimestamp) ? sinceTimestamp : 0}
order by m.date_received desc
limit ${Number.isFinite(candidateLimit) ? candidateLimit : 25}
`);
const records = [];
const missing = [];
for (const message of messages) {
// Stop before touching the filesystem once enough messages have been imported.
if (records.length >= importLimit) break;
const localSource = await findEmlxByRowId(sourcePath, message.rowid);
if (!localSource) {
missing.push(message.rowid);
continue;
}
const source = await readEmlxRawMessage(localSource);
const record = await archiveRawEmail({
account,
mailbox: decodeMailboxUrl(mailboxRecord.url),
uid: String(message.rowid),
source,
archiveRoot,
});
record.localSource = localSource;
record.appleMail = {
accountIdentifier: imapAccount.identifier,
mailboxUrl: mailboxRecord.url,
rowid: message.rowid,
remoteId: message.remoteId,
dateReceived: message.dateReceived,
};
records.push(record);
}
return {
email,
imapAccount,
mailbox: mailboxRecord,
sourcePath,
archiveRoot,
imported: records.length,
scanned: messages.length,
missing,
records,
};
}
export async function resolveAppleMailAccountMailbox({
email,
accountsDbPath = DEFAULT_ACCOUNTS_DB,
indexPath = DEFAULT_ENVELOPE_INDEX,
mailbox = 'all',
}) {
const imapAccounts = await resolveAppleMailImapAccounts({ email, accountsDbPath });
const attempts = [];
for (const imapAccount of imapAccounts) {
try {
const mailboxRecord = await resolveMailboxRecord({ indexPath, imapAccountId: imapAccount.identifier, mailbox });
attempts.push({ imapAccount, mailboxRecord });
} catch {
// Some macOS account records are stale or have no Mail mailbox. Try the next candidate.
}
}
if (!attempts.length) {
throw new Error(`Apple Mail mailbox not found for ${email}/${mailbox}`);
}
return attempts.sort((a, b) => Number(b.mailboxRecord.totalCount || 0) - Number(a.mailboxRecord.totalCount || 0))[0];
}
export async function resolveAppleMailImapAccount({ email, accountsDbPath = DEFAULT_ACCOUNTS_DB }) {
const rows = await resolveAppleMailImapAccounts({ email, accountsDbPath });
if (!rows.length) {
throw new Error(`Apple Mail IMAP account not found for ${email}`);
}
return rows[0];
}
export async function resolveAppleMailImapAccounts({ email, accountsDbPath = DEFAULT_ACCOUNTS_DB }) {
const childRows = await queryJson(accountsDbPath, `
select child.ZIDENTIFIER as identifier,
parent.ZUSERNAME as email,
parent.ZACCOUNTDESCRIPTION as description,
parent.ZIDENTIFIER as parentIdentifier,
'child' as source
from ZACCOUNT parent
join ZACCOUNT child on child.ZPARENTACCOUNT = parent.Z_PK
join ZACCOUNTTYPE childType on child.ZACCOUNTTYPE = childType.Z_PK
where lower(parent.ZUSERNAME) = lower('${sqlString(email)}')
and childType.ZIDENTIFIER = 'com.apple.account.IMAP'
order by child.Z_PK
`);
const directRows = await queryJson(accountsDbPath, `
select account.ZIDENTIFIER as identifier,
account.ZUSERNAME as email,
account.ZACCOUNTDESCRIPTION as description,
null as parentIdentifier,
'direct' as source
from ZACCOUNT account
join ZACCOUNTTYPE accountType on account.ZACCOUNTTYPE = accountType.Z_PK
where lower(account.ZUSERNAME) = lower('${sqlString(email)}')
and accountType.ZIDENTIFIER = 'com.apple.account.IMAP'
order by account.Z_PK
`);
const rows = [...directRows, ...childRows];
const unique = new Map();
for (const row of rows) {
if (!unique.has(row.identifier)) unique.set(row.identifier, row);
}
const result = [...unique.values()];
if (!rows.length) {
throw new Error(`Apple Mail IMAP account not found for ${email}`);
}
return result;
}
export async function resolveMailboxRecord({ indexPath = DEFAULT_ENVELOPE_INDEX, imapAccountId, mailbox = 'all' }) {
const accountPrefix = `imap://${imapAccountId}/`;
const candidates = mailboxCandidates(mailbox)
.map((name) => `${accountPrefix}${encodeMailboxPath(name)}`);
for (const url of candidates) {
const rows = await queryJson(indexPath, `
select ROWID as rowid, url, total_count as totalCount, unread_count as unreadCount, unseen_count as unseenCount
from mailboxes
where url = '${sqlString(url)}'
limit 1
`);
if (rows.length) return rows[0];
}
const available = await queryJson(indexPath, `
select ROWID as rowid, url, total_count as totalCount
from mailboxes
where url like '${sqlString(accountPrefix)}%'
order by total_count desc
limit 20
`);
throw new Error(`Apple Mail mailbox not found for ${imapAccountId}/${mailbox}. Available: ${available.map((row) => row.url).join(', ')}`);
}
export async function readEmlxRawMessage(filePath) {
const content = await fs.readFile(filePath);
const newlineIndex = content.indexOf(0x0a);
if (newlineIndex < 0) throw new Error(`invalid emlx file without first line: ${filePath}`);
const sizeText = content.subarray(0, newlineIndex).toString('utf8').trim();
const declaredSize = Number(sizeText);
if (!Number.isFinite(declaredSize) || declaredSize <= 0) {
return content.subarray(newlineIndex + 1);
}
const start = newlineIndex + 1;
const end = Math.min(start + declaredSize, content.length);
return content.subarray(start, end);
}
export async function findEmlxByRowId(sourcePath, rowid) {
const fileName = `${rowid}.emlx`;
const baseDirs = await mailDataBaseDirs(sourcePath);
const bucketParts = bucketPathParts(rowid);
for (const baseDir of baseDirs) {
const candidate = path.join(baseDir, 'Data', ...bucketParts, 'Messages', fileName);
if (await fileExists(candidate)) return candidate;
}
return findFirstFile(sourcePath, fileName);
}
export function bucketPathParts(rowid) {
const bucket = Math.floor(Number(rowid) / 1000);
if (!bucket) return [];
return String(bucket).split('').reverse();
}
export function mailboxPathForUrl(mailRoot, url) {
const withoutScheme = url.replace(/^imap:\/\//, '');
const slashIndex = withoutScheme.indexOf('/');
const accountDir = slashIndex >= 0 ? withoutScheme.slice(0, slashIndex) : withoutScheme;
const mailboxPath = slashIndex >= 0 ? withoutScheme.slice(slashIndex + 1) : '';
const mailboxParts = mailboxPath
.split('/')
.filter(Boolean)
.map((part) => decodeURIComponent(part));
return path.join(mailRoot, accountDir, ...mailboxParts.map((part) => `${part}.mbox`));
}
export function decodeMailboxUrl(url) {
const withoutScheme = url.replace(/^imap:\/\//, '');
const slashIndex = withoutScheme.indexOf('/');
const mailboxPath = slashIndex >= 0 ? withoutScheme.slice(slashIndex + 1) : '';
return mailboxPath
.split('/')
.filter(Boolean)
.map((part) => decodeURIComponent(part))
.join('/');
}
async function findMailboxDirs(root) {
const dirs = [];
const entries = await safeReaddir(root, { withFileTypes: true });
for (const entry of entries) {
if (!entry.isDirectory()) continue;
const fullPath = path.join(root, entry.name);
if (entry.name.endsWith('.mbox')) dirs.push(fullPath);
const nested = await findMailboxDirs(fullPath);
dirs.push(...nested);
}
return dirs;
}
async function findFiles(root, predicate) {
const found = [];
const entries = await safeReaddir(root, { withFileTypes: true });
for (const entry of entries) {
const fullPath = path.join(root, entry.name);
if (entry.isDirectory()) {
found.push(...await findFiles(fullPath, predicate));
} else if (entry.isFile() && predicate(fullPath)) {
found.push(fullPath);
}
}
return found;
}
async function findFirstFile(root, fileName) {
const entries = await safeReaddir(root, { withFileTypes: true });
for (const entry of entries) {
const fullPath = path.join(root, entry.name);
if (entry.isFile() && entry.name === fileName) return fullPath;
if (entry.isDirectory()) {
const nested = await findFirstFile(fullPath, fileName);
if (nested) return nested;
}
}
return null;
}
async function mailDataBaseDirs(sourcePath) {
const baseDirs = [sourcePath];
const entries = await safeReaddir(sourcePath, { withFileTypes: true });
for (const entry of entries) {
if (!entry.isDirectory()) continue;
const fullPath = path.join(sourcePath, entry.name);
if (await fileExists(path.join(fullPath, 'Data'))) baseDirs.push(fullPath);
}
return baseDirs;
}
async function queryJson(dbPath, sql) {
const { stdout } = await execFileAsync('sqlite3', ['-json', dbPath, sql], {
maxBuffer: 1024 * 1024 * 20,
});
return stdout.trim() ? JSON.parse(stdout) : [];
}
function mailboxCandidates(mailbox) {
if (!mailbox || mailbox === 'all') return ['[Gmail]/Todos', '[Gmail]/All Mail', 'INBOX'];
if (mailbox === 'inbox') return ['INBOX'];
return [mailbox];
}
function encodeMailboxPath(mailboxPath) {
return mailboxPath
.split('/')
.map((part) => encodeURIComponent(part))
.join('/');
}
function sqlString(value) {
return String(value).replaceAll("'", "''");
}
async function fileExists(filePath) {
try {
await fs.access(filePath);
return true;
} catch {
return false;
}
}
async function safeReaddir(dir, options) {
try {
return await fs.readdir(dir, options);
} catch {
return [];
}
}
function mailboxLabel(itemPath, rootPath) {
return path.relative(rootPath, itemPath)
.split(path.sep)
.filter((part) => part && part !== 'Data' && part !== 'Messages')
.join('/');
}
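// Worked example (a sketch, not part of the module): how a message rowid maps
// to the relative .emlx path that findEmlxByRowId probes first. bucketParts is
// a local copy of bucketPathParts so the snippet runs on its own, and
// emlxRelativePath is a hypothetical helper that joins with '/' instead of
// path.join to stay import-free.

```javascript
// Local copy of bucketPathParts from the listing above.
function bucketParts(rowid) {
  const bucket = Math.floor(Number(rowid) / 1000);
  if (!bucket) return []; // rowids below 1000 (or invalid ones) have no bucket dirs
  return String(bucket).split('').reverse();
}

// Hypothetical helper: relative path under a mailbox directory.
function emlxRelativePath(rowid) {
  return ['Data', ...bucketParts(rowid), 'Messages', `${rowid}.emlx`].join('/');
}

console.log(emlxRelativePath(42));      // Data/Messages/42.emlx
console.log(emlxRelativePath(12345));   // Data/2/1/Messages/12345.emlx
console.log(emlxRelativePath(1234567)); // Data/4/3/2/1/Messages/1234567.emlx
```

// The thousands part of the rowid is written digit by digit, least significant
// digit first, which is why 12345 lands under Data/2/1.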

123
src/mailArchive.js Normal file

@ -0,0 +1,123 @@
import crypto from 'node:crypto';
import fs from 'node:fs/promises';
import path from 'node:path';
import { simpleParser } from 'mailparser';
export function sha256(buffer) {
return crypto.createHash('sha256').update(buffer).digest('hex');
}
export function safeFileName(input) {
return String(input || 'unnamed')
.normalize('NFKD')
.replace(/[^\w.\-]+/g, '_')
.replace(/^_+|_+$/g, '')
.slice(0, 180) || 'unnamed';
}
export function dateParts(dateLike) {
const date = dateLike ? new Date(dateLike) : new Date();
if (Number.isNaN(date.getTime())) return dateParts(new Date());
return {
yyyy: String(date.getUTCFullYear()),
mm: String(date.getUTCMonth() + 1).padStart(2, '0'),
dd: String(date.getUTCDate()).padStart(2, '0'),
iso: date.toISOString(),
};
}
export async function archiveRawEmail({ account, mailbox, uid, source, archiveRoot }) {
const rawBuffer = Buffer.isBuffer(source) ? source : Buffer.from(source);
const hash = sha256(rawBuffer);
const parsed = await simpleParser(rawBuffer);
const parts = dateParts(parsed.date);
const rawDir = path.join(archiveRoot, 'raw-eml', parts.yyyy, parts.mm);
await fs.mkdir(rawDir, { recursive: true });
const emlName = `${parts.dd}-${uid || 'no-uid'}-${hash.slice(0, 16)}.eml`;
const emlPath = path.join(rawDir, emlName);
await writeFileIfMissing(emlPath, rawBuffer);
const attachmentRecords = await archiveAttachments({
parsed,
archiveRoot,
dateParts: parts,
emailHash: hash,
});
const record = {
id: `mail:${account.id}:${mailbox}:${uid || hash.slice(0, 16)}`,
accountId: account.id,
mailbox,
uid: uid || null,
messageId: parsed.messageId || null,
date: parts.iso,
from: parsed.from?.text || null,
to: parsed.to?.text || null,
subject: parsed.subject || null,
raw: {
path: emlPath,
sha256: hash,
size: rawBuffer.length,
},
attachments: attachmentRecords,
textPreview: (parsed.text || '').replace(/\s+/g, ' ').trim().slice(0, 500),
archivedAt: new Date().toISOString(),
};
record.manifestAppended = await appendManifest(archiveRoot, 'emails.ndjson', record);
return record;
}
export async function archiveAttachments({ parsed, archiveRoot, dateParts: parts, emailHash }) {
const attachments = [];
const attachDir = path.join(archiveRoot, 'attachments', parts.yyyy, parts.mm);
await fs.mkdir(attachDir, { recursive: true });
for (const attachment of parsed.attachments || []) {
const content = Buffer.from(attachment.content || []);
const hash = sha256(content);
const filename = safeFileName(attachment.filename || `attachment-${hash.slice(0, 8)}`);
const storedName = `${hash.slice(0, 16)}-${filename}`;
const storedPath = path.join(attachDir, storedName);
await writeFileIfMissing(storedPath, content);
attachments.push({
filename: attachment.filename || null,
contentType: attachment.contentType || null,
size: content.length,
sha256: hash,
path: storedPath,
emailSha256: emailHash,
});
}
return attachments;
}
export async function appendManifest(archiveRoot, name, record) {
  const manifestDir = path.join(archiveRoot, 'manifests');
  await fs.mkdir(manifestDir, { recursive: true });
  const markerDir = path.join(manifestDir, '.index', safeFileName(name));
  await fs.mkdir(markerDir, { recursive: true });
  const markerKey = sha256(`${record.id}|${record.raw?.sha256 || ''}`);
  const markerPath = path.join(markerDir, markerKey);
  try {
    // 'wx' fails with EEXIST when the marker exists, i.e. the record was already appended.
    await fs.writeFile(markerPath, new Date().toISOString(), { flag: 'wx' });
  } catch (error) {
    if (error.code === 'EEXIST') return false;
    throw error;
  }
  try {
    await fs.appendFile(path.join(manifestDir, name), `${JSON.stringify(record)}\n`, 'utf8');
  } catch (error) {
    // Roll back the marker so a later run can retry this record.
    await fs.rm(markerPath, { force: true });
    throw error;
  }
  return true;
}
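async function writeFileIfMissing(filePath, content) {
  // Exclusive create ('wx') avoids the check-then-write race of fs.access and
  // no longer hides unrelated errors such as EACCES.
  try {
    await fs.writeFile(filePath, content, { flag: 'wx' });
  } catch (error) {
    if (error.code !== 'EEXIST') throw error;
  }
}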
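// Worked example (a sketch): archive folders are bucketed by the UTC date of
// the message, not by the sender's local date. The function below is a local
// copy of dateParts so the snippet runs on its own; the sample timestamp is
// made up for illustration.

```javascript
// Local copy of dateParts from the listing above.
function dateParts(dateLike) {
  const date = dateLike ? new Date(dateLike) : new Date();
  if (Number.isNaN(date.getTime())) return dateParts(new Date());
  return {
    yyyy: String(date.getUTCFullYear()),
    mm: String(date.getUTCMonth() + 1).padStart(2, '0'),
    dd: String(date.getUTCDate()).padStart(2, '0'),
    iso: date.toISOString(),
  };
}

// 22:00 on Dec 31 at UTC-3 is already 01:00 on Jan 1 in UTC.
const lateEvening = dateParts('2025-12-31T22:00:00-03:00');
console.log(`${lateEvening.yyyy}/${lateEvening.mm}/${lateEvening.dd}`); // 2026/01/01
console.log(lateEvening.iso); // 2026-01-01T01:00:00.000Z
```

// A message without a parseable Date header falls back to the current time, so
// its archive folder depends on when the import ran.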

202
src/mailCli.js Normal file

@ -0,0 +1,202 @@
#!/usr/bin/env node
import fs from 'node:fs/promises';
import { archiveRawEmail } from './mailArchive.js';
import { downloadAccount, dryRunAccount } from './mailDownloader.js';
import { downloadGmailApiAccount, runGmailOAuthFlow, trashGmailMessages, gmailEngagement, applyGmailLabels } from './gmailApi.js';
import { getMailAccount } from './mailConfig.js';
import { importAppleMailEmlx, importAppleMailFromIndex, listAppleMailSources } from './localMailImport.js';
import { importMbox } from './mboxImport.js';
const [command, ...rest] = process.argv.slice(2);
const args = parseArgs(rest);
try {
if (command === 'dry-run') {
const account = await getMailAccount(required(args.account, '--account'));
const result = await dryRunAccount(account);
console.log(JSON.stringify(result, null, 2));
} else if (command === 'download') {
const account = await getMailAccount(required(args.account, '--account'));
const result = await downloadAccount(account, {
limit: args.limit ? Number(args.limit) : 25,
since: args.since,
archiveRoot: args.archiveRoot,
});
console.log(JSON.stringify(result, null, 2));
} else if (command === 'gmail-oauth') {
const account = await getMailAccount(required(args.account, '--account'));
const result = await runGmailOAuthFlow({
account,
redirectUri: args.redirectUri,
});
console.log(JSON.stringify(result, null, 2));
} else if (command === 'gmail-download') {
const account = await getMailAccount(required(args.account, '--account'));
const result = await downloadGmailApiAccount({
account,
limit: args.limit ? Number(args.limit) : 25,
since: args.since || account.defaultSince,
query: args.query,
includeSpamTrash: parseBoolean(args.includeSpamTrash),
archiveRoot: args.archiveRoot || account.archive?.root || `data/mail-archive/${account.id}`,
});
console.log(JSON.stringify(result, null, 2));
} else if (command === 'gmail-engagement') {
const account = await getMailAccount(required(args.account, '--account'));
const sender = required(args.sender, '--sender');
const result = await gmailEngagement({
account,
sender,
sample: args.sample ? Number(args.sample) : 50,
});
console.log(JSON.stringify(result, null, 2));
} else if (command === 'gmail-apply-labels') {
const account = await getMailAccount(required(args.account, '--account'));
const filePath = required(args.file, '--file');
const raw = await fs.readFile(filePath, 'utf8');
let assignments;
try {
const parsed = JSON.parse(raw);
assignments = Array.isArray(parsed) ? parsed : parsed.assignments;
} catch (error) {
throw new Error(`could not parse JSON at ${filePath}: ${error.message}`);
}
if (!Array.isArray(assignments)) {
throw new Error('--file must contain an array or {assignments:[...]} of {id,labels}');
}
if (parseBoolean(args.dryRun ?? args['dry-run'])) {
console.log(JSON.stringify({
accountId: account.id,
dryRun: true,
assignments,
uniqueLabels: [...new Set(assignments.flatMap((a) => a.labels || []))].sort(),
}, null, 2));
} else {
const result = await applyGmailLabels({ account, assignments });
console.log(JSON.stringify(result, null, 2));
}
} else if (command === 'gmail-trash') {
const account = await getMailAccount(required(args.account, '--account'));
const ids = parseIdList(args.id, args.ids);
if (ids.length === 0) throw new Error('gmail-trash requires --id <messageId> or --ids <id1,id2,...>');
const result = await trashGmailMessages({ account, ids });
console.log(JSON.stringify(result, null, 2));
} else if (command === 'parse-fixture') {
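// Skip tokens that parseArgs consumed as flag values (for example the account id after --account).
const fixturePath = rest.find((value, index) => !value.startsWith('--') && !rest[index - 1]?.startsWith('--'));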
if (!fixturePath) throw new Error('parse-fixture requires an .eml path');
const account = await getMailAccount(required(args.account, '--account'));
const source = await fs.readFile(fixturePath);
const result = await archiveRawEmail({
account,
mailbox: 'fixture',
uid: 'fixture',
source,
archiveRoot: args.archiveRoot || account.archive?.root || `data/mail-archive/${account.id}`,
});
console.log(JSON.stringify(result, null, 2));
} else if (command === 'list-apple-mail') {
const result = await listAppleMailSources(args.root);
console.log(JSON.stringify({ sources: result }, null, 2));
} else if (command === 'import-apple-mail') {
const account = await getMailAccount(required(args.account, '--account'));
const sourcePath = required(args.source, '--source');
const result = await importAppleMailEmlx({
account,
sourcePath,
limit: args.limit ? Number(args.limit) : 25,
since: args.since,
archiveRoot: args.archiveRoot || account.archive?.root || `data/mail-archive/${account.id}`,
});
console.log(JSON.stringify(result, null, 2));
} else if (command === 'import-apple-mail-index') {
const account = await getMailAccount(required(args.account, '--account'));
const result = await importAppleMailFromIndex({
account,
email: args.email || account.email || account.auth?.user,
mailRoot: args.root,
indexPath: args.index,
accountsDbPath: args.accountsDb,
mailbox: args.mailbox || 'all',
limit: args.limit ? Number(args.limit) : 25,
since: args.since || account.defaultSince,
archiveRoot: args.archiveRoot || account.archive?.root || `data/mail-archive/${account.id}`,
});
console.log(JSON.stringify(result, null, 2));
} else if (command === 'import-mbox') {
const account = await getMailAccount(required(args.account, '--account'));
const mboxPath = required(args.file, '--file');
const result = await importMbox({
account,
mboxPath,
limit: args.limit ? Number(args.limit) : 25,
archiveRoot: args.archiveRoot || account.archive?.root || `data/mail-archive/${account.id}`,
});
console.log(JSON.stringify(result, null, 2));
} else {
usage();
process.exit(1);
}
} catch (error) {
console.error(formatError(error));
process.exit(1);
}
function parseArgs(values) {
const parsed = {};
for (let i = 0; i < values.length; i += 1) {
const value = values[i];
if (!value.startsWith('--')) continue;
parsed[value.slice(2)] = values[i + 1];
i += 1;
}
return parsed;
}
function required(value, name) {
if (!value) throw new Error(`${name} is required`);
return value;
}
function parseIdList(single, csv) {
const ids = [];
if (single) ids.push(...String(single).split(',').map((v) => v.trim()).filter(Boolean));
if (csv) ids.push(...String(csv).split(',').map((v) => v.trim()).filter(Boolean));
return [...new Set(ids)];
}
function parseBoolean(value) {
if (value === undefined) return false;
return ['1', 'true', 'yes', 'y'].includes(String(value).toLowerCase());
}
function usage() {
console.error(`Usage:
node src/mailCli.js dry-run --account vjoati-gmail
node src/mailCli.js download --account vjoati-gmail [--since 2025-01-01] [--limit 25]
node src/mailCli.js gmail-oauth --account vjoati-gmail
node src/mailCli.js gmail-download --account vjoati-gmail [--since 2025-01-01] [--limit 25]
node src/mailCli.js gmail-trash --account vjoati-gmail (--id <messageId> | --ids id1,id2,...)
node src/mailCli.js gmail-engagement --account vjoati-gmail --sender <addr-or-domain> [--sample 50]
node src/mailCli.js gmail-apply-labels --account vjoati-gmail --file <assignments.json> [--dry-run true]
node src/mailCli.js parse-fixture <file.eml> --account vjoati-gmail
node src/mailCli.js list-apple-mail
node src/mailCli.js import-apple-mail --account vjoati-gmail --source /Users/kavi/Library/Mail/V10/... --limit 25
node src/mailCli.js import-apple-mail-index --account vjoati-gmail [--mailbox all] [--since 2025-01-01] [--limit 25]
node src/mailCli.js import-mbox --account vjoati-gmail --file /path/to/takeout.mbox --limit 25`);
}
function formatError(error) {
const details = {
message: error.message,
name: error.name,
code: error.code,
response: error.response,
responseText: error.responseText,
authenticationFailed: error.authenticationFailed,
};
return JSON.stringify(
Object.fromEntries(Object.entries(details).filter(([, value]) => value !== undefined)),
null,
2
);
}
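// Sketch of an alternative parser (not the one used above): parseArgs always
// consumes the token after a flag as its value, so `--dry-run` has to be
// written as `--dry-run true`. The hypothetical parseArgsWithBareFlags below
// treats a flag followed by another flag, or by nothing, as 'true' and
// collects positional tokens under `_`. Both the function name and the `_`
// key are assumptions made for this example.

```javascript
// Hypothetical variant of parseArgs that supports bare boolean flags.
function parseArgsWithBareFlags(values) {
  const parsed = { _: [] };
  for (let i = 0; i < values.length; i += 1) {
    const value = values[i];
    if (!value.startsWith('--')) {
      parsed._.push(value); // positional argument, e.g. a fixture path
      continue;
    }
    const next = values[i + 1];
    if (next === undefined || next.startsWith('--')) {
      parsed[value.slice(2)] = 'true'; // bare flag
    } else {
      parsed[value.slice(2)] = next; // flag with a value
      i += 1;
    }
  }
  return parsed;
}

const demoArgs = parseArgsWithBareFlags(['--dry-run', '--account', 'vjoati-gmail', 'sample.eml']);
console.log(JSON.stringify(demoArgs));
// {"_":["sample.eml"],"dry-run":"true","account":"vjoati-gmail"}
```

// The string 'true' keeps the result compatible with parseBoolean above.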

30
src/mailConfig.js Normal file

@ -0,0 +1,30 @@
import fs from 'node:fs/promises';
import path from 'node:path';
const DEFAULT_CONFIG = 'config/mail-accounts.json';
export async function loadMailConfig(configPath = DEFAULT_CONFIG) {
const absolutePath = path.resolve(configPath);
const raw = JSON.parse(await fs.readFile(absolutePath, 'utf8'));
if (!Array.isArray(raw.accounts)) throw new Error('mail config must include accounts[]');
return raw;
}
export async function getMailAccount(accountId, configPath = DEFAULT_CONFIG) {
const config = await loadMailConfig(configPath);
const account = config.accounts.find((candidate) => candidate.id === accountId);
if (!account) throw new Error(`mail account not found: ${accountId}`);
return account;
}
export function requireAccountPassword(account) {
const envName = account.auth?.passwordEnv;
if (!envName) throw new Error(`account ${account.id} missing auth.passwordEnv`);
const password = process.env[envName];
if (!password) {
throw new Error(
`missing ${envName}. For Gmail use an app password or OAuth token; do not store it in the repo.`
);
}
return password;
}

87
src/mailDownloader.js Normal file

@ -0,0 +1,87 @@
import { ImapFlow } from 'imapflow';
import { archiveRawEmail } from './mailArchive.js';
import { requireAccountPassword } from './mailConfig.js';
export async function dryRunAccount(account) {
const password = requireAccountPassword(account);
const client = makeClient(account, password);
await client.connect();
try {
const mailboxes = await client.list();
const selected = [];
for (const mailbox of mailboxes) {
if (mailbox.flags?.has?.('\\Noselect')) continue;
if (account.mailboxes?.length && !account.mailboxes.includes(mailbox.path)) continue;
const status = await client.status(mailbox.path, { messages: true, unseen: true, uidNext: true });
selected.push({
path: mailbox.path,
messages: status.messages,
unseen: status.unseen,
uidNext: status.uidNext,
});
}
return { accountId: account.id, email: account.email, mailboxes: selected };
} finally {
await client.logout().catch(() => {});
}
}
export async function downloadAccount(account, options = {}) {
const password = requireAccountPassword(account);
const client = makeClient(account, password);
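// Fall back to 25 when the limit is missing or not a number >= 1 (a NaN limit would select every message).
const requestedLimit = Number(options.limit);
const limit = requestedLimit >= 1 ? Math.floor(requestedLimit) : 25;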
const since = options.since || account.defaultSince || null;
const archiveRoot = options.archiveRoot || account.archive?.root || `data/mail-archive/${account.id}`;
const results = [];
await client.connect();
try {
const mailboxes = account.mailboxes?.length ? account.mailboxes : ['INBOX'];
for (const mailbox of mailboxes) {
const lock = await client.getMailboxLock(mailbox);
try {
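const criteria = since ? { since: new Date(since) } : { all: true };
// search() may resolve to a non-array value when the command fails; treat that as "no matches".
const found = await client.search(criteria, { uid: true });
const uids = Array.isArray(found) ? found : [];
const selectedUids = uids.slice(Math.max(0, uids.length - limit));
if (selectedUids.length === 0) continue;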
for await (const msg of client.fetch(selectedUids, { uid: true, source: true }, { uid: true })) {
const record = await archiveRawEmail({
account,
mailbox,
uid: msg.uid,
source: msg.source,
archiveRoot,
});
results.push(record);
}
} finally {
lock.release();
}
}
} finally {
await client.logout().catch(() => {});
}
return {
accountId: account.id,
archiveRoot,
downloaded: results.length,
records: results,
};
}
function makeClient(account, password) {
return new ImapFlow({
host: account.host,
port: account.port || 993,
secure: account.secure !== false,
auth: {
user: account.auth?.user || account.email,
pass: password,
},
logger: false,
tls: {
rejectUnauthorized: true,
},
});
}

56
src/mboxImport.js Normal file

@ -0,0 +1,56 @@
import fs from 'node:fs/promises';
import path from 'node:path';
import { archiveRawEmail } from './mailArchive.js';
export async function importMbox({ account, mboxPath, archiveRoot, limit = 25 }) {
const content = await fs.readFile(mboxPath);
const messages = splitMbox(content);
const selected = messages.slice(0, limit);
const records = [];
for (let index = 0; index < selected.length; index += 1) {
const record = await archiveRawEmail({
account,
mailbox: `mbox:${path.basename(mboxPath)}`,
uid: `mbox-${index + 1}`,
source: selected[index],
archiveRoot,
});
record.localSource = mboxPath;
records.push(record);
}
return {
mboxPath,
archiveRoot,
imported: records.length,
scanned: messages.length,
records,
};
}
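// Splits on lines that begin with "From " (the mbox separator line). Body lines
// that start with "From " and were not quoted by the writer (for example as
// ">From ") are also treated as separators, and the whole file is held in memory.
export function splitMbox(buffer) {
// 'binary' (latin1) maps one byte to one character, so string offsets equal byte offsets.
const text = buffer.toString('binary');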
const starts = [];
if (text.startsWith('From ')) starts.push(0);
let position = 0;
while (true) {
const next = text.indexOf('\nFrom ', position);
if (next === -1) break;
starts.push(next + 1);
position = next + 1;
}
if (!starts.length) return [buffer];
const messages = [];
for (let i = 0; i < starts.length; i += 1) {
const start = starts[i];
const end = starts[i + 1] || buffer.length;
const chunk = buffer.subarray(start, end);
const firstNewline = chunk.indexOf(0x0a);
if (firstNewline >= 0) messages.push(chunk.subarray(firstNewline + 1));
}
return messages.filter((message) => message.length > 0);
}
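// Sketch of a possible follow-up step (not part of the module): splitMbox
// returns message bodies exactly as stored, so files written in the "mboxrd"
// convention still carry the extra ">" on quoted "From " lines. Whether a
// given export uses that convention is an assumption to verify per file; the
// helper name unescapeMboxrd is hypothetical.

```javascript
// Hypothetical helper: removes one leading '>' from lines matching /^>+From /.
function unescapeMboxrd(message) {
  // latin1 keeps a one-byte-per-character mapping, so no bytes are altered
  // apart from the removed '>' characters.
  const text = message.toString('latin1');
  return Buffer.from(text.replace(/^>(>*From )/gm, '$1'), 'latin1');
}

const quoted = Buffer.from(
  'Subject: hi\n\n>From the desk of Ana\n>>From a quote\nFrom here on\n',
  'latin1'
);
const restored = unescapeMboxrd(quoted).toString('latin1');
console.log(JSON.stringify(restored));
// "Subject: hi\n\nFrom the desk of Ana\n>From a quote\nFrom here on\n"
```

// Applying this to a file that was not written with mboxrd quoting would strip
// a legitimate ">" from quoted replies, so it should stay opt-in.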

236
src/moneyGraph.js Normal file

@ -0,0 +1,236 @@
import {
ECONOMIC_TYPES,
LINK_METHODS,
LINK_TYPES,
buildNodeIndex,
compareDateThenId,
} from './domain.js';
function makeLink({ from, to, type, amount, method, confidence = 'rule', state = 'proposed', note }) {
return { from, to, type, amount, method, confidence, state, note };
}
export function buildMoneyGraph(ledger) {
const nodes = buildNodeIndex(ledger);
const links = [...ledger.links];
links.push(...inferAccountFifoFundingLinks(ledger, links));
links.push(...inferCardSettlementLinks(ledger, links));
return {
nodes,
links,
incoming: indexLinks(links, 'to'),
outgoing: indexLinks(links, 'from'),
};
}
function indexLinks(links, field) {
const index = new Map();
for (const link of links) {
if (!index.has(link[field])) index.set(link[field], []);
index.get(link[field]).push(link);
}
return index;
}
function inferAccountFifoFundingLinks(ledger, existingLinks) {
const movementsByAccountCurrency = new Map();
const hasIncomingFunding = new Set(
existingLinks
.filter((link) => [LINK_TYPES.FUNDS, LINK_TYPES.INTERNAL_TRANSFER, LINK_TYPES.FX_CONVERSION].includes(link.type))
.map((link) => link.to)
);
for (const movement of ledger.movements) {
const key = `${movement.accountId}|${movement.amount.currency}`;
if (!movementsByAccountCurrency.has(key)) movementsByAccountCurrency.set(key, []);
movementsByAccountCurrency.get(key).push(movement);
}
const inferred = [];
for (const movements of movementsByAccountCurrency.values()) {
movements.sort(compareDateThenId);
const lots = [];
for (const movement of movements) {
if (movement.direction === 'in') {
if (isFundingLot(movement)) {
lots.push({ movement, remaining: movement.amount.value });
}
continue;
}
if (movement.direction !== 'out') continue;
if (!isCashOutflowNeedingOrigin(movement)) continue;
if (hasIncomingFunding.has(movement.id)) continue;
let needed = movement.amount.value;
for (const lot of lots) {
if (needed <= 0) break;
if (lot.remaining <= 0) continue;
const used = Math.min(lot.remaining, needed);
inferred.push(makeLink({
from: lot.movement.id,
to: movement.id,
type: LINK_TYPES.FUNDS,
amount: { currency: movement.amount.currency, value: used },
method: LINK_METHODS.RULE_FIFO,
note: 'FIFO per account and currency: an earlier inflow funds a later outflow.',
}));
lot.remaining -= used;
needed -= used;
}
}
}
return inferred;
}
function isFundingLot(movement) {
return [
ECONOMIC_TYPES.PURE_INCOME,
ECONOMIC_TYPES.OPERATING_INCOME,
ECONOMIC_TYPES.REFUND,
ECONOMIC_TYPES.REIMBURSEMENT,
ECONOMIC_TYPES.PARTNER_LOAN,
ECONOMIC_TYPES.INTERNAL_TRANSFER,
ECONOMIC_TYPES.FOREIGN_ACCOUNT_FUNDING,
ECONOMIC_TYPES.WALLET_FUNDING,
].includes(movement.economicType);
}
function isCashOutflowNeedingOrigin(movement) {
return [
ECONOMIC_TYPES.REAL_EXPENSE,
ECONOMIC_TYPES.CARD_PAYMENT,
ECONOMIC_TYPES.WALLET_FUNDING,
ECONOMIC_TYPES.FOREIGN_ACCOUNT_FUNDING,
ECONOMIC_TYPES.INTERNAL_TRANSFER,
ECONOMIC_TYPES.REIMBURSEMENT,
ECONOMIC_TYPES.PARTNER_WITHDRAWAL,
].includes(movement.economicType);
}
function inferCardSettlementLinks(ledger, existingLinks) {
const movementsByAccountCurrency = new Map();
const manuallySettledCharges = new Set(
existingLinks
.filter((link) => link.type === LINK_TYPES.SETTLES_CARD_CHARGE)
.map((link) => link.to)
);
for (const movement of ledger.movements) {
if (!movement.cardAccountId && movement.economicType !== ECONOMIC_TYPES.CARD_CHARGE) continue;
const accountId = movement.cardAccountId || movement.accountId;
const key = `${accountId}|${movement.amount.currency}`;
if (!movementsByAccountCurrency.has(key)) movementsByAccountCurrency.set(key, []);
movementsByAccountCurrency.get(key).push(movement);
}
const inferred = [];
for (const movements of movementsByAccountCurrency.values()) {
const charges = movements
.filter((movement) => movement.economicType === ECONOMIC_TYPES.CARD_CHARGE && !manuallySettledCharges.has(movement.id))
.sort(compareDateThenId)
.map((movement) => ({ movement, remaining: movement.amount.value }));
const payments = movements
.filter((movement) => movement.economicType === ECONOMIC_TYPES.CARD_PAYMENT)
.sort(compareDateThenId);
for (const payment of payments) {
let remainingPayment = payment.amount.value;
for (const charge of charges) {
if (remainingPayment <= 0) break;
if (charge.remaining <= 0) continue;
if (String(charge.movement.date) > String(payment.date)) continue;
const settled = Math.min(charge.remaining, remainingPayment);
inferred.push(makeLink({
from: payment.id,
to: charge.movement.id,
type: LINK_TYPES.SETTLES_CARD_CHARGE,
amount: { currency: payment.amount.currency, value: settled },
method: LINK_METHODS.RULE_FIFO,
note: 'Card payment settles earlier charges by FIFO within the same card and currency.',
}));
charge.remaining -= settled;
remainingPayment -= settled;
}
}
}
return inferred;
}
export function originTree(graph, nodeId, options = {}) {
const maxDepth = options.maxDepth ?? 20;
const seen = new Set();
function walk(id, depth) {
const node = graph.nodes.get(id) || { id, kind: 'unknown', description: 'Node not found' };
if (depth >= maxDepth) return { node, truncated: true, incoming: [] };
if (seen.has(id)) return { node, cycle: true, incoming: [] };
seen.add(id);
const incoming = (graph.incoming.get(id) || []).map((link) => ({
link,
source: walk(link.from, depth + 1),
}));
seen.delete(id);
return { node, incoming };
}
return walk(nodeId, 0);
}
export function destinationTree(graph, nodeId, options = {}) {
const maxDepth = options.maxDepth ?? 20;
const seen = new Set();
function walk(id, depth) {
const node = graph.nodes.get(id) || { id, kind: 'unknown', description: 'Node not found' };
if (depth >= maxDepth) return { node, truncated: true, outgoing: [] };
if (seen.has(id)) return { node, cycle: true, outgoing: [] };
seen.add(id);
const outgoing = (graph.outgoing.get(id) || []).map((link) => ({
link,
target: walk(link.to, depth + 1),
}));
seen.delete(id);
return { node, outgoing };
}
return walk(nodeId, 0);
}
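// Totals add amount.value regardless of currency or direction, so they are
// rough volume hints per type and entity, not balances.
export function summarizeLedger(ledger, graph) {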
const byType = new Map();
const byEntity = new Map();
for (const movement of ledger.movements) {
const key = movement.economicType || ECONOMIC_TYPES.UNKNOWN;
byType.set(key, (byType.get(key) || 0) + movement.amount.value);
const entityKey = movement.beneficiaryEntityId || movement.ownerEntityId || 'sin_entidad';
byEntity.set(entityKey, (byEntity.get(entityKey) || 0) + movement.amount.value);
}
return {
counts: {
entities: ledger.entities.length,
accounts: ledger.accounts.length,
movements: ledger.movements.length,
documents: ledger.documents.length,
events: ledger.events.length,
links: graph.links.length,
},
movementAmountByType: Object.fromEntries([...byType.entries()].sort()),
movementAmountByEntityHint: Object.fromEntries([...byEntity.entries()].sort()),
};
}
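// Worked example (a sketch): the FIFO allocation used by
// inferAccountFifoFundingLinks, reduced to plain numbers. allocateFifo and the
// `unfunded` field are hypothetical names for this example; the module itself
// does not report the part of an outflow that no earlier inflow covers.

```javascript
// lots: [{ id, remaining }] ordered oldest first; mutated in place.
function allocateFifo(lots, outflow) {
  const links = [];
  let needed = outflow.value;
  for (const lot of lots) {
    if (needed <= 0) break;
    if (lot.remaining <= 0) continue; // lot already consumed by earlier outflows
    const used = Math.min(lot.remaining, needed);
    links.push({ from: lot.id, to: outflow.id, value: used });
    lot.remaining -= used;
    needed -= used;
  }
  return { links, unfunded: needed };
}

const lots = [
  { id: 'in-1', remaining: 100 },
  { id: 'in-2', remaining: 50 },
];

const first = allocateFifo(lots, { id: 'out-1', value: 120 });
console.log(JSON.stringify(first.links));
// [{"from":"in-1","to":"out-1","value":100},{"from":"in-2","to":"out-1","value":20}]

const second = allocateFifo(lots, { id: 'out-2', value: 80 });
console.log(JSON.stringify(second.links), second.unfunded);
// [{"from":"in-2","to":"out-2","value":30}] 50
```

// The second outflow is only partly covered: 30 comes from what is left of
// in-2 and the remaining 50 has no inferred origin.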

79
src/server.js Normal file

@ -0,0 +1,79 @@
#!/usr/bin/env node
import http from 'node:http';
import { URL } from 'node:url';
import { buildMoneyGraph, destinationTree, originTree, summarizeLedger } from './moneyGraph.js';
import { loadLedger, resolveAccountOwnerHints } from './ledgerStore.js';
const args = parseArgs(process.argv.slice(2));
const ledgerPath = args.ledger || 'data/example-ledger.json';
const port = Number(args.port || 3910);
const host = args.host || '127.0.0.1';
let ledger = resolveAccountOwnerHints(await loadLedger(ledgerPath));
let graph = buildMoneyGraph(ledger);
const server = http.createServer(async (req, res) => {
try {
const url = new URL(req.url, `http://${req.headers.host}`);
if (req.method === 'GET' && url.pathname === '/health') {
return sendJson(res, { status: 'ok', service: 'kua-money-trace', ledger: ledgerPath });
}
if (req.method === 'POST' && url.pathname === '/reload') {
ledger = resolveAccountOwnerHints(await loadLedger(ledgerPath));
graph = buildMoneyGraph(ledger);
return sendJson(res, { reloaded: true, summary: summarizeLedger(ledger, graph) });
}
if (req.method === 'GET' && url.pathname === '/summary') {
return sendJson(res, summarizeLedger(ledger, graph));
}
if (req.method === 'GET' && url.pathname === '/entities') {
return sendJson(res, { entities: ledger.entities });
}
if (req.method === 'GET' && url.pathname === '/accounts') {
return sendJson(res, { accounts: ledger.accounts });
}
if (req.method === 'GET' && url.pathname === '/movements') {
return sendJson(res, { movements: ledger.movements });
}
const originMatch = url.pathname.match(/^\/nodes\/(.+)\/origin-tree$/);
if (req.method === 'GET' && originMatch) {
return sendJson(res, originTree(graph, decodeURIComponent(originMatch[1])));
}
const destinationMatch = url.pathname.match(/^\/nodes\/(.+)\/destination-tree$/);
if (req.method === 'GET' && destinationMatch) {
return sendJson(res, destinationTree(graph, decodeURIComponent(destinationMatch[1])));
}
sendJson(res, { error: 'not found' }, 404);
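} catch (error) {
// Malformed percent-encoding in a node id is a client error, not a server fault.
sendJson(res, { error: error.message }, error instanceof URIError ? 400 : 500);
}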
});
server.listen(port, host, () => {
console.log(`kua-money-trace listening on http://${host}:${port}`);
});
function sendJson(res, body, status = 200) {
res.writeHead(status, { 'content-type': 'application/json; charset=utf-8' });
res.end(JSON.stringify(body, null, 2));
}
function parseArgs(argv) {
const parsed = {};
for (let i = 0; i < argv.length; i += 1) {
if (argv[i].startsWith('--')) {
parsed[argv[i].slice(2)] = argv[i + 1];
i += 1;
}
}
return parsed;
}
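// Worked example (a sketch): how the /nodes/<id>/origin-tree route extracts a
// node id. matchOriginTree is a hypothetical wrapper around the same regular
// expression and decodeURIComponent call used in the request handler; the
// second node id is made up to show percent-encoding.

```javascript
// Hypothetical wrapper around the route pattern used by the server.
function matchOriginTree(pathname) {
  const match = pathname.match(/^\/nodes\/(.+)\/origin-tree$/);
  return match ? decodeURIComponent(match[1]) : null;
}

console.log(matchOriginTree('/nodes/event:black-spa-10804/origin-tree'));
// event:black-spa-10804
console.log(matchOriginTree('/nodes/mail%3Aacct%3AINBOX%2F7/origin-tree'));
// mail:acct:INBOX/7
console.log(matchOriginTree('/nodes/event:black-spa-10804/destination-tree'));
// null
```

// Ids that contain "/" or other reserved characters should be percent-encoded
// by the client; the handler decodes them before the graph lookup.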

20
test/fixtures/sample-bank-email.eml vendored Normal file

@ -0,0 +1,20 @@
From: Banco Demo <no-reply@banco.example>
To: vjoati@gmail.com
Subject: Estado de cuenta tarjeta credito terminado 5018
Date: Tue, 30 Dec 2025 10:30:00 -0300
Message-ID: <sample-bank-email-5018@example>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="boundary-demo"

--boundary-demo
Content-Type: text/plain; charset="UTF-8"

Adjuntamos estado de cuenta de tarjeta de credito terminada 5018.

--boundary-demo
Content-Type: application/pdf; name="estado-cuenta-5018.pdf"
Content-Disposition: attachment; filename="estado-cuenta-5018.pdf"
Content-Transfer-Encoding: base64

JVBERi0xLjQKMSAwIG9iago8PCAvVHlwZSAvQ2F0YWxvZyA+PgplbmRvYmoKdHJhaWxlcgo8PCAvUm9vdCAxIDAgUiA+PgolJUVPRgo=
--boundary-demo--

82
test/gmailApi.test.js Normal file

@ -0,0 +1,82 @@
import assert from 'node:assert/strict';
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';
import test from 'node:test';
import {
buildGmailQuery,
decodeBase64Url,
downloadGmailApiAccount,
listGmailMessageRefs,
} from '../src/gmailApi.js';
const sample = await fs.readFile('test/fixtures/sample-bank-email.eml');
test('builds gmail after query from ISO date', () => {
assert.equal(buildGmailQuery({ since: '2025-01-01' }), 'after:2025/01/01');
});
test('decodes gmail base64url raw message', () => {
const encoded = sample.toString('base64').replaceAll('+', '-').replaceAll('/', '_').replaceAll('=', '');
assert.equal(decodeBase64Url(encoded).toString(), sample.toString());
});
test('lists gmail message refs across pages', async () => {
const calls = [];
const gmail = {
users: {
messages: {
async list(params) {
calls.push(params);
if (!params.pageToken) {
return { data: { messages: [{ id: 'a' }], nextPageToken: 'next' } };
}
return { data: { messages: [{ id: 'b' }] } };
},
},
},
};
const refs = await listGmailMessageRefs({ gmail, q: 'after:2025/01/01', limit: 2 });
assert.deepEqual(refs.map((ref) => ref.id), ['a', 'b']);
assert.equal(calls[1].pageToken, 'next');
});
test('downloads gmail raw messages into archive', async () => {
const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'kua-gmail-api-'));
const encoded = sample.toString('base64').replaceAll('+', '-').replaceAll('/', '_').replaceAll('=', '');
const gmail = {
users: {
messages: {
async list() {
return { data: { messages: [{ id: 'gmail-message-1', threadId: 'thread-1' }] } };
},
async get(params) {
assert.equal(params.format, 'raw');
return {
data: {
id: params.id,
threadId: 'thread-1',
historyId: 'history-1',
internalDate: '1764547200000',
labelIds: ['INBOX'],
raw: encoded,
},
};
},
},
},
};
const result = await downloadGmailApiAccount({
account: { id: 'vjoati-gmail', email: 'vjoati@gmail.com' },
gmail,
archiveRoot: tmp,
since: '2025-01-01',
limit: 1,
});
assert.equal(result.imported, 1);
assert.equal(result.records[0].gmail.id, 'gmail-message-1');
assert.equal(result.records[0].attachments.length, 1);
});


@ -0,0 +1,228 @@
import assert from 'node:assert/strict';
import { execFile } from 'node:child_process';
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';
import test from 'node:test';
import { promisify } from 'node:util';
import {
  bucketPathParts,
  importAppleMailEmlx,
  importAppleMailFromIndex,
  mailboxPathForUrl,
  readEmlxRawMessage,
} from '../src/localMailImport.js';
import { importMbox, splitMbox } from '../src/mboxImport.js';
const sample = await fs.readFile('test/fixtures/sample-bank-email.eml');
const execFileAsync = promisify(execFile);
test('reads raw message from emlx declared byte size', async () => {
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'kua-emlx-'));
  const emlxPath = path.join(tmp, '1.emlx');
  await fs.writeFile(emlxPath, Buffer.concat([
    Buffer.from(`${sample.length}\n`, 'utf8'),
    sample,
    Buffer.from('\n<?xml version="1.0"?><plist></plist>\n', 'utf8'),
  ]));
  const raw = await readEmlxRawMessage(emlxPath);
  assert.equal(raw.toString(), sample.toString());
});
test('imports apple mail emlx folder into archive', async () => {
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'kua-emlx-import-'));
  const source = path.join(tmp, 'INBOX.mbox', 'Data', 'Messages');
  const archive = path.join(tmp, 'archive');
  await fs.mkdir(source, { recursive: true });
  await fs.writeFile(path.join(source, '42.emlx'), Buffer.concat([
    Buffer.from(`${sample.length}\n`, 'utf8'),
    sample,
  ]));
  const result = await importAppleMailEmlx({
    account: { id: 'vjoati-gmail' },
    sourcePath: path.join(tmp, 'INBOX.mbox'),
    archiveRoot: archive,
    limit: 10,
  });
  assert.equal(result.imported, 1);
  assert.equal(result.records[0].attachments.length, 1);
});
test('imports apple mail by account and envelope index', async () => {
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'kua-apple-index-'));
  const mailRoot = path.join(tmp, 'Mail', 'V10');
  const archive = path.join(tmp, 'archive');
  const accountsDb = path.join(tmp, 'Accounts4.sqlite');
  const envelopeIndex = path.join(mailRoot, 'MailData', 'Envelope Index');
  const accountUuid = 'AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE';
  const sourcePath = path.join(mailRoot, accountUuid, '[Gmail].mbox', 'Todos.mbox');
  const rowid = 12345;
  const emlxPath = path.join(sourcePath, 'Data', ...bucketPathParts(rowid), 'Messages', `${rowid}.emlx`);
  await fs.mkdir(path.dirname(envelopeIndex), { recursive: true });
  await fs.mkdir(path.dirname(emlxPath), { recursive: true });
  await fs.writeFile(emlxPath, Buffer.concat([
    Buffer.from(`${sample.length}\n`, 'utf8'),
    sample,
  ]));
  await execSql(accountsDb, `
    create table ZACCOUNT (
      Z_PK integer primary key,
      ZPARENTACCOUNT integer,
      ZACCOUNTDESCRIPTION varchar,
      ZUSERNAME varchar,
      ZIDENTIFIER varchar,
      ZACCOUNTTYPE integer
    );
    create table ZACCOUNTTYPE (Z_PK integer primary key, ZIDENTIFIER varchar);
    insert into ZACCOUNTTYPE values (1, 'com.apple.account.Google');
    insert into ZACCOUNTTYPE values (2, 'com.apple.account.IMAP');
    insert into ZACCOUNT values (25, null, 'Google', 'vjoati@gmail.com', 'parent-google', 1);
    insert into ZACCOUNT values (26, 25, null, null, '${accountUuid}', 2);
  `);
  await execSql(envelopeIndex, `
    create table mailboxes (
      ROWID integer primary key,
      url text,
      total_count integer,
      unread_count integer,
      unseen_count integer
    );
    create table messages (
      ROWID integer primary key,
      remote_id integer,
      date_received integer,
      mailbox integer,
      deleted integer
    );
    insert into mailboxes values (16, 'imap://${accountUuid}/%5BGmail%5D/Todos', 1, 0, 0);
    insert into messages values (${rowid}, 999, 1764547200, 16, 0);
  `);
  const result = await importAppleMailFromIndex({
    account: { id: 'vjoati-gmail', email: 'vjoati@gmail.com' },
    mailRoot,
    indexPath: envelopeIndex,
    accountsDbPath: accountsDb,
    archiveRoot: archive,
    since: '2025-01-01',
    limit: 10,
  });
  assert.equal(result.imported, 1);
  assert.equal(result.records[0].appleMail.rowid, rowid);
  assert.equal(result.records[0].attachments.length, 1);
});
test('prefers populated direct IMAP account over empty child IMAP account', async () => {
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'kua-apple-direct-index-'));
  const mailRoot = path.join(tmp, 'Mail', 'V10');
  const archive = path.join(tmp, 'archive');
  const accountsDb = path.join(tmp, 'Accounts4.sqlite');
  const envelopeIndex = path.join(mailRoot, 'MailData', 'Envelope Index');
  const directUuid = 'DIRECT-IMAP-ACCOUNT';
  const childUuid = 'CHILD-EMPTY-ACCOUNT';
  const rowid = 22345;
  const sourcePath = path.join(mailRoot, directUuid, 'INBOX.mbox');
  const emlxPath = path.join(sourcePath, 'Data', ...bucketPathParts(rowid), 'Messages', `${rowid}.emlx`);
  await fs.mkdir(path.dirname(envelopeIndex), { recursive: true });
  await fs.mkdir(path.dirname(emlxPath), { recursive: true });
  await fs.writeFile(emlxPath, Buffer.concat([
    Buffer.from(`${sample.length}\n`, 'utf8'),
    sample,
  ]));
  await execSql(accountsDb, `
    create table ZACCOUNT (
      Z_PK integer primary key,
      ZPARENTACCOUNT integer,
      ZACCOUNTDESCRIPTION varchar,
      ZUSERNAME varchar,
      ZIDENTIFIER varchar,
      ZACCOUNTTYPE integer
    );
    create table ZACCOUNTTYPE (Z_PK integer primary key, ZIDENTIFIER varchar);
    insert into ZACCOUNTTYPE values (1, 'com.apple.account.Google');
    insert into ZACCOUNTTYPE values (2, 'com.apple.account.IMAP');
    insert into ZACCOUNT values (22, null, 'Kdoi Email', 'kdoi@email.com', '${directUuid}', 2);
    insert into ZACCOUNT values (39, null, 'kdoi@email.com', 'kdoi@email.com', 'parent-google', 1);
    insert into ZACCOUNT values (40, 39, null, null, '${childUuid}', 2);
  `);
  await execSql(envelopeIndex, `
    create table mailboxes (
      ROWID integer primary key,
      url text,
      total_count integer,
      unread_count integer,
      unseen_count integer
    );
    create table messages (
      ROWID integer primary key,
      remote_id integer,
      date_received integer,
      mailbox integer,
      deleted integer
    );
    insert into mailboxes values (10, 'imap://${directUuid}/INBOX', 10, 0, 0);
    insert into mailboxes values (48, 'imap://${childUuid}/INBOX', 0, 0, 0);
    insert into messages values (${rowid}, 888, 1764547200, 10, 0);
  `);
  const result = await importAppleMailFromIndex({
    account: { id: 'kdoi-email', email: 'kdoi@email.com' },
    mailRoot,
    indexPath: envelopeIndex,
    accountsDbPath: accountsDb,
    archiveRoot: archive,
    since: '2025-01-01',
    limit: 10,
  });
  assert.equal(result.imported, 1);
  assert.equal(result.imapAccount.identifier, directUuid);
  assert.equal(result.mailbox.url, `imap://${directUuid}/INBOX`);
});
test('maps apple mailbox urls and message buckets to filesystem paths', () => {
  assert.deepEqual(bucketPathParts(563), []);
  assert.deepEqual(bucketPathParts(27431), ['7', '2']);
  assert.deepEqual(bucketPathParts(258680), ['8', '5', '2']);
  assert.equal(
    mailboxPathForUrl('/Mail/V10', 'imap://ACCOUNT/%5BGmail%5D/Todos'),
    path.join('/Mail/V10', 'ACCOUNT', '[Gmail].mbox', 'Todos.mbox')
  );
});
test('splits and imports mbox messages', async () => {
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'kua-mbox-'));
  const mboxPath = path.join(tmp, 'takeout.mbox');
  const archive = path.join(tmp, 'archive');
  const mbox = Buffer.concat([
    Buffer.from('From sender@example Tue Dec 30 10:30:00 2025\n'),
    sample,
    Buffer.from('\nFrom sender@example Wed Dec 31 10:30:00 2025\n'),
    sample,
  ]);
  await fs.writeFile(mboxPath, mbox);
  assert.equal(splitMbox(mbox).length, 2);
  const result = await importMbox({
    account: { id: 'vjoati-gmail' },
    mboxPath,
    archiveRoot: archive,
    limit: 1,
  });
  assert.equal(result.scanned, 2);
  assert.equal(result.imported, 1);
});
async function execSql(dbPath, sql) {
  await execFileAsync('sqlite3', [dbPath, sql]);
}
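
The Apple Mail tests above rely on two storage conventions: an `.emlx` file starts with a line holding the message's byte count, and messages are bucketed into directories derived from the row id. A minimal sketch of both, inferred only from the assertions in these tests; `extractEmlxMessage` and `bucketPartsSketch` are hypothetical names, and the repository's `readEmlxRawMessage` and `bucketPathParts` may behave differently for inputs the tests do not cover:

```javascript
// Hypothetical sketch: pull the raw message out of an in-memory .emlx buffer.
// Assumes the first line is the decimal byte count of the message that follows.
function extractEmlxMessage(emlx) {
  const newline = emlx.indexOf(0x0a);
  if (newline === -1) throw new Error('emlx: missing byte-count line');
  const size = Number.parseInt(emlx.subarray(0, newline).toString('ascii').trim(), 10);
  if (!Number.isInteger(size) || size < 0) throw new Error('emlx: invalid byte count');
  const start = newline + 1;
  if (start + size > emlx.length) throw new Error('emlx: file shorter than declared size');
  // Anything after the declared size (for example the trailing plist) is ignored.
  return emlx.subarray(start, start + size);
}

// Hypothetical sketch: bucket directories are the digits of floor(rowid / 1000),
// least significant digit first. Row ids below 1000 have no bucket directories.
function bucketPartsSketch(rowid) {
  const thousands = Math.floor(rowid / 1000);
  if (thousands === 0) return [];
  return String(thousands).split('').reverse();
}
```

Under this rule, row id 12345 from the index test would land in `Data/2/1/Messages/12345.emlx`.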

30
test/mailArchive.test.js Normal file

@ -0,0 +1,30 @@
import assert from 'node:assert/strict';
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';
import test from 'node:test';
import { archiveRawEmail } from '../src/mailArchive.js';
test('archives raw email and attachments with manifest records', async () => {
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'kua-money-trace-mail-'));
  const source = await fs.readFile('test/fixtures/sample-bank-email.eml');
  const record = await archiveRawEmail({
    account: { id: 'vjoati-gmail' },
    mailbox: 'fixture',
    uid: 1,
    source,
    archiveRoot: tmp,
  });
  assert.equal(record.accountId, 'vjoati-gmail');
  assert.equal(record.mailbox, 'fixture');
  assert.equal(record.subject, 'Estado de cuenta tarjeta credito terminado 5018');
  assert.equal(record.attachments.length, 1);
  assert.equal(record.attachments[0].filename, 'estado-cuenta-5018.pdf');
  await fs.access(record.raw.path);
  await fs.access(record.attachments[0].path);
  const manifest = await fs.readFile(path.join(tmp, 'manifests', 'emails.ndjson'), 'utf8');
  assert.match(manifest, /sample-bank-email-5018/);
});
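
The archive test above reads `manifests/emails.ndjson`, a newline-delimited JSON file with one record per line. A minimal sketch of how such lines could be written and read back; `toNdjsonLine` and `parseNdjson` are hypothetical helpers, and the record fields shown in the usage note are illustrative rather than the repository's manifest schema:

```javascript
// Hypothetical sketch: one JSON document per line, terminated by "\n".
// JSON.stringify never emits raw newlines, so each record stays on one line.
function toNdjsonLine(record) {
  return `${JSON.stringify(record)}\n`;
}

// Hypothetical sketch: parse NDJSON text back into records, skipping blank lines.
function parseNdjson(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}
```

Appending `toNdjsonLine({ accountId: 'vjoati-gmail', mailbox: 'fixture' })` to the manifest for each archived message keeps the file append-only, which is why a regular expression match over the raw text is enough for the test's final assertion.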

117
test/moneyGraph.test.js Normal file

@ -0,0 +1,117 @@
import assert from 'node:assert/strict';
import test from 'node:test';
import { classifyAllMovements } from '../src/classifier.js';
import { normalizeLedger } from '../src/domain.js';
import { buildMoneyGraph, originTree } from '../src/moneyGraph.js';
const ledgerFixture = normalizeLedger({
  entities: [
    { id: 'darwin', name: 'Darwin Bruna', kind: 'person' },
    { id: 'muralla', name: 'Muralla SpA', kind: 'company' },
  ],
  accounts: [
    {
      id: 'darwin-cc',
      ownerEntityId: 'darwin',
      institution: 'Banco de Chile',
      instrument: 'checking_account',
      currency: 'CLP',
    },
    {
      id: 'darwin-visa',
      ownerEntityId: 'darwin',
      institution: 'Visa',
      instrument: 'credit_card',
      currency: 'CLP',
    },
  ],
  movements: [
    {
      id: 'income',
      date: '2025-12-01',
      accountId: 'darwin-cc',
      direction: 'in',
      amount: { currency: 'CLP', value: 200000 },
      description: 'Ingreso puro',
      economicType: 'pure_income',
    },
    {
      id: 'pay-card',
      date: '2026-01-15',
      accountId: 'darwin-cc',
      cardAccountId: 'darwin-visa',
      direction: 'out',
      amount: { currency: 'CLP', value: 150000 },
      description: 'PAGO TARJETA VISA',
    },
    {
      id: 'card-charge',
      date: '2025-12-30',
      accountId: 'darwin-visa',
      direction: 'out',
      amount: { currency: 'CLP', value: 126616 },
      description: 'CAFE CULTURA BLACK SPA',
    },
  ],
  documents: [
    {
      id: 'black-spa-10804',
      kind: 'dte_invoice',
      issuerName: 'BLACK SPA',
      receiverEntityId: 'muralla',
      amount: { currency: 'CLP', value: 126616 },
    },
  ],
  events: [
    {
      id: 'black-spa-10804',
      kind: 'real_expense',
      entityId: 'muralla',
      amount: { currency: 'CLP', value: 126616 },
      description: 'Gasto Muralla BLACK SPA',
    },
  ],
  links: [
    {
      from: 'mov:card-charge',
      to: 'event:black-spa-10804',
      type: 'finances_event',
      amount: { currency: 'CLP', value: 126616 },
      method: 'manual',
    },
    {
      from: 'doc:black-spa-10804',
      to: 'event:black-spa-10804',
      type: 'documents_event',
      amount: { currency: 'CLP', value: 126616 },
      method: 'exact',
    },
  ],
});
test('builds origin tree from real expense to card, card payment and pure income', () => {
  const ledger = classifyAllMovements(ledgerFixture);
  const graph = buildMoneyGraph(ledger);
  const tree = originTree(graph, 'event:black-spa-10804');
  const json = JSON.stringify(tree);
  assert.match(json, /mov:card-charge/);
  assert.match(json, /mov:pay-card/);
  assert.match(json, /mov:income/);
  assert.match(json, /settles_card_charge/);
  assert.match(json, /funds/);
});
test('classifies untyped credit-card outflow as card charge', () => {
  const ledger = classifyAllMovements(ledgerFixture);
  const charge = ledger.movements.find((movement) => movement.id === 'mov:card-charge');
  assert.equal(charge.economicType, 'card_charge');
});
test('classifies card payment by description', () => {
  const ledger = classifyAllMovements(ledgerFixture);
  const payment = ledger.movements.find((movement) => movement.id === 'mov:pay-card');
  assert.equal(payment.economicType, 'card_payment');
});
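
The two classification tests above imply at least two rules: an untyped outflow on a credit-card account is a card charge, and an untyped outflow whose description mentions a card payment is a card payment. A minimal sketch of rules that would satisfy those assertions; `classifyMovementSketch`, the `PAGO TARJETA` pattern, and the `'unclassified'` fallback are assumptions, not the contents of `src/classifier.js`:

```javascript
// Hypothetical sketch: rule-based economic typing for a single movement.
// `accountsById` is a Map from account id to account (with an `instrument` field).
function classifyMovementSketch(movement, accountsById) {
  // An explicit type on the movement always wins.
  if (movement.economicType) return movement.economicType;
  const account = accountsById.get(movement.accountId);
  const isOutflow = movement.direction === 'out';
  // Money leaving a credit-card account is a purchase charged to the card.
  if (isOutflow && account?.instrument === 'credit_card') return 'card_charge';
  // Money leaving another account with a card-payment description settles the card.
  if (isOutflow && /PAGO\s+TARJETA/i.test(movement.description ?? '')) return 'card_payment';
  return 'unclassified'; // placeholder label, assumed for this sketch
}
```

Checking the account instrument before the description keeps a card charge from being mislabeled when a merchant name happens to contain the payment phrase.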