V 1.9: Fixed bugs in coarse sorting, backup of original PDFs, and ZUGFeRD detection
parent 013b037322
commit 6e85481f52
52 changed files with 9338 additions and 3093 deletions
0
.env.example
Normal file → Executable file
0
.gitignore
vendored
Normal file → Executable file
BIN
Docker - Image/V1.9.tar
Executable file
Binary file not shown.
149
README.md
@@ -1,149 +0,0 @@
-<<<<<<< HEAD
-# docker.dateiverwaltung
-
-=======
-# Dateiverwaltung
-
-Modular document management system for automatically processing, sorting, and naming documents.
-
-## Features
-
-- **Mail retrieval**: Automatic retrieval of attachments from IMAP mailboxes
-- **PDF processing**: Text extraction and OCR for scanned documents
-- **ZUGFeRD detection**: Automatic detection and separate filing of ZUGFeRD invoices
-- **Rule engine**: Flexible, extensible rules for detection and naming
-- **Pipeline system**: Multiple independent pipelines (company, private, etc.)
-
-## Quick start
-
-### With Docker (recommended)
-
-```bash
-# Build and start the image
-docker-compose up -d
-
-# View logs
-docker-compose logs -f
-
-# Stop
-docker-compose down
-```
-
-Then open in the browser: http://localhost:8000
-
-### Without Docker
-
-```bash
-# Create a virtual environment
-cd backend
-python -m venv venv
-source venv/bin/activate  # Linux/Mac
-# or: venv\Scripts\activate  # Windows
-
-# Install dependencies
-pip install -r requirements.txt
-
-# Start
-uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
-```
-
-## Naming scheme
-
-### Recurring documents (invoices)
-```
-{Jahr}.{Monat}.{Tag} - {Kategorie} - {Ersteller} - {Dokumentennummer} - {Sammelbegriff} - {Preis} EUR.pdf
-
-Example:
-2026.02.01 - Rechnung - Sonepar - 10023934 - Material - 1600 EUR.pdf
-```
-
-### One-off documents (contracts, certificates)
-```
-{Typ} - {Aussteller} - {Beschreibung} - {Jahr}.pdf
-
-Example:
-Zeugnis - Schule X - Grundschulzeugnis - 2026.pdf
-```
-
-## Project structure
-
-```
-dateiverwaltung/
-├── backend/
-│   ├── app/
-│   │   ├── models/          # Database models
-│   │   ├── modules/         # Core modules (Mail, PDF, Sorter)
-│   │   ├── routes/          # API endpoints
-│   │   ├── services/        # Business logic
-│   │   └── main.py          # FastAPI app
-│   └── requirements.txt
-├── frontend/
-│   ├── static/
-│   │   ├── css/
-│   │   └── js/
-│   └── templates/
-├── data/                    # Persistent data
-│   ├── inbox/               # New files
-│   ├── processed/           # Processed files
-│   ├── archive/             # Sorted files
-│   └── zugferd/             # ZUGFeRD invoices
-├── regeln/                  # Rule examples
-├── docker-compose.yml
-├── Dockerfile
-└── README.md
-```
-
-## Modules
-
-### Mail-Fetcher
-Fetches attachments from IMAP mailboxes with configurable filters:
-- File types (.pdf, .jpg, etc.)
-- Maximum size
-- IMAP folder
-
-### PDF-Processor
-- **Text extraction**: With pdfplumber/pypdf
-- **OCR**: With ocrmypdf + Tesseract (German)
-- **ZUGFeRD**: Detection via the factur-x library
-
-### Sorter
-Rule-based detection and naming:
-- Pattern matching (text, sender, file name)
-- Regex-based field extraction
-- Configurable naming scheme
-
-## API Endpoints
-
-| Method | Endpoint | Description |
-|--------|----------|-------------|
-| GET | /api/pipelines | All pipelines |
-| POST | /api/pipelines | New pipeline |
-| POST | /api/pipelines/{id}/run | Run pipeline |
-| GET | /api/pipelines/{id}/mail-configs | Mail configurations |
-| POST | /api/pipelines/{id}/mail-configs | Add mailbox |
-| GET | /api/pipelines/{id}/regeln | Sorting rules |
-| POST | /api/pipelines/{id}/regeln | Add rule |
-| POST | /api/regeln/test | Test rule |
-| GET | /api/dokumente | Processed documents |
-| GET | /api/stats | Statistics |
-
-## Regex examples for rules
-
-```yaml
-# Date (DD.MM.YYYY)
-(\d{2}[./]\d{2}[./]\d{4})
-
-# Invoice number
-(?:Rechnungsnummer|Invoice)[:\s]*(\d+)
-
-# Amount with EUR
-(?:Gesamtbetrag|Summe)[:\s]*([\d.,]+)\s*(?:EUR|€)
-```
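The three regex examples in this (deleted) README section can be tried out directly. The following is a minimal sketch, assuming each rule simply takes the first match of its pattern; the function name `extract_fields`, the result keys, and the sample text are hypothetical and not part of the project.

```python
import re

# Patterns copied verbatim from the README section above
DATE_RE = re.compile(r"(\d{2}[./]\d{2}[./]\d{4})")
INVOICE_NO_RE = re.compile(r"(?:Rechnungsnummer|Invoice)[:\s]*(\d+)")
AMOUNT_RE = re.compile(r"(?:Gesamtbetrag|Summe)[:\s]*([\d.,]+)\s*(?:EUR|€)")


def extract_fields(text: str) -> dict:
    """Return the first match of each pattern, or None if it does not match."""
    patterns = (("datum", DATE_RE), ("nummer", INVOICE_NO_RE), ("betrag", AMOUNT_RE))
    result = {}
    for name, pattern in patterns:
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


# Hypothetical sample text, not taken from a real document
sample = "Rechnungsnummer: 10023934\nRechnungsdatum 01.02.2026\nGesamtbetrag: 1.600,00 EUR"
fields = extract_fields(sample)
```

Note that the amount is returned as the raw string found in the text (here with German thousands and decimal separators); converting it to a number would be a separate step.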
-
-## Extensions (planned)
-
-- [ ] Claude API integration for AI validation
-- [ ] Scheduler for automatic execution
-- [ ] Dolibarr integration
-- [ ] Dashboard with charts
->>>>>>> 8585cc3 (Dateiverwaltung Email attachment abruf läuft)
6
Dockerfile → Source/Dockerfile
Normal file → Executable file
@@ -9,6 +9,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     poppler-utils \
     ghostscript \
     libmagic1 \
+    curl \
     && rm -rf /var/lib/apt/lists/*
 
 # Working directory
|
|
@ -21,15 +22,10 @@ RUN pip install --no-cache-dir -r requirements.txt
|
|||
# Anwendung kopieren
|
||||
COPY backend/ ./backend/
|
||||
COPY frontend/ ./frontend/
|
||||
COPY config/ ./config/
|
||||
COPY regeln/ ./regeln/
|
||||
|
||||
# Daten-Verzeichnis
|
||||
RUN mkdir -p /app/data/inbox /app/data/processed /app/data/archive /app/data/zugferd
|
||||
|
||||
# Umgebungsvariablen
|
||||
ENV PYTHONPATH=/app
|
||||
ENV DATABASE_URL=sqlite:////app/data/dateiverwaltung.db
|
||||
|
||||
# Port
|
||||
EXPOSE 8000
|
||||
150
Source/README.md
Executable file
@@ -0,0 +1,150 @@
+# Dateiverwaltung
+
+Document management system for automatically processing, sorting, and naming documents.
+
+## Features
+
+- **Mail retrieval**: Automatic retrieval of attachments from IMAP mailboxes
+- **Coarse sorting**: Move files by type (PDF, images, ZUGFeRD, signed)
+- **PDF processing**: Text extraction and OCR for scanned documents
+- **ZUGFeRD detection**: Automatic detection of ZUGFeRD invoices
+- **Rule engine**: Flexible rules for detection and automatic naming
+- **Schedules**: Automatic execution via scheduler
+
+## Deployment with Portainer
+
+### 1. Build or pull the image
+
+**Option A: Load the image from a tar file**
+```bash
+docker load -i dateiverwaltung-image.tar
+```
+
+**Option B: Build the image yourself**
+```bash
+docker build -t dateiverwaltung:latest .
+```
+
+### 2. Create the container in Portainer
+
+Create a new container with the following settings:
+
+**Image:** `dateiverwaltung:latest`
+
+**Port Mapping:**
+| Host | Container |
+|------|-----------|
+| 8080 | 8000 |
+
+**Volumes:**
+| Host | Container | Description |
+|------|-----------|-------------|
+| `/mnt/user/...` | `/mnt/user/...` | Access to NAS folders |
+
+**Environment Variables:**
+
+| Variable | Description | Example |
+|----------|-------------|---------|
+| `DATABASE_URL` | Database connection (MariaDB/MySQL) | `mysql+pymysql://user:pass@host/db` |
+| `TZ` | Time zone | `Europe/Berlin` |
+
+**Example DATABASE_URL formats:**
+
+```
+# MariaDB/MySQL
+mysql+pymysql://benutzer:passwort@192.168.1.100:3306/dateiverwaltung
+
+# SQLite (for testing only)
+sqlite:///dateiverwaltung.db
+```
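To make the parts of the two `DATABASE_URL` formats explicit, here is a minimal sketch using only the standard library. SQLAlchemy parses these URLs itself; the helper `describe_database_url` is hypothetical and only illustrates which component is which. The `is_sqlite` check mirrors the `DATABASE_URL.startswith("sqlite")` test used in `database.py` in this commit.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a SQLAlchemy-style URL into its visible components (illustration only)."""
    parts = urlsplit(url)
    # The scheme has the form "dialect+driver"; the driver part is optional
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
        "is_sqlite": url.startswith("sqlite"),
    }


mysql_info = describe_database_url(
    "mysql+pymysql://benutzer:passwort@192.168.1.100:3306/dateiverwaltung"
)
sqlite_info = describe_database_url("sqlite:///dateiverwaltung.db")
```

For the SQLite URL there is no host or port; the three slashes mean "empty host, then a relative file path".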
+
+### 3. Prepare the database
+
+For MariaDB/MySQL, create the database beforehand:
+
+```sql
+CREATE DATABASE dateiverwaltung CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
+CREATE USER 'dateiverwaltung'@'%' IDENTIFIED BY 'sicheres_passwort';
+GRANT ALL PRIVILEGES ON dateiverwaltung.* TO 'dateiverwaltung'@'%';
+FLUSH PRIVILEGES;
+```
+
+The tables are created automatically on first start.
+
+### 4. Start the container
+
+After startup, the web interface is available at:
+```
+http://<server-ip>:8080
+```
+
+## Docker Compose (alternative)
+
+```yaml
+version: '3.8'
+
+services:
+  dateiverwaltung:
+    image: dateiverwaltung:latest
+    container_name: dateiverwaltung
+    restart: unless-stopped
+    ports:
+      - "8080:8000"
+    volumes:
+      - /mnt:/mnt
+    environment:
+      - TZ=Europe/Berlin
+      - DATABASE_URL=mysql+pymysql://user:pass@db-host/dateiverwaltung
+```
+
+## Configuration
+
+All settings are stored in the database:
+
+- **Mailboxes**: IMAP server, credentials, filters
+- **Source folders**: Paths, file types, ZUGFeRD/signed handling
+- **Sorting rules**: Detection patterns, extraction, naming scheme
+- **Schedules**: Automatic execution
+
+## Modules
+
+### Mail-Fetcher
+Fetches attachments from IMAP mailboxes:
+- Filter by file type and size
+- Only unread mails or all mails
+- Search all IMAP folders
+
+### Coarse sorting
+Sorts files by type:
+- Configurable file types
+- ZUGFeRD detection
+- Detection of signed PDFs
+- Optional: move directly without rules
+
+### PDF-Processor
+- Text extraction with pdfplumber/pypdf
+- OCR with ocrmypdf + Tesseract (German)
+- ZUGFeRD detection via factur-x
+
+### Sorting rules
+- Keyword-based detection
+- Regex for field extraction (date, amount, number)
+- Flexible naming scheme
+
+## Naming scheme examples
+
+```
+# Invoices
+{datum} - {firma} - Rechnung {nummer}.pdf
+-> 2026-02-01 - Amazon - Rechnung 123456.pdf
+
+# With amount
+{datum} - {firma} - {betrag} EUR.pdf
+-> 2026-02-01 - Amazon - 49.99 EUR.pdf
+```
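The naming scheme examples above use `{placeholder}` syntax, which maps naturally onto Python's `str.format`. The following is a minimal sketch under that assumption; the sorter's actual implementation is not part of this excerpt, and both the helper `build_filename` and the sanitizing step are hypothetical additions for illustration.

```python
import re


def build_filename(schema: str, fields: dict) -> str:
    """Fill a naming schema such as '{datum} - {firma} - Rechnung {nummer}.pdf'.

    Raises KeyError if the schema references a field that was not extracted.
    """
    name = schema.format(**fields)
    # Assumption: replace characters that are invalid in file names on common file systems
    return re.sub(r'[\\/:*?"<>|]', "_", name)


fields = {"datum": "2026-02-01", "firma": "Amazon", "nummer": "123456", "betrag": "49.99"}
invoice_name = build_filename("{datum} - {firma} - Rechnung {nummer}.pdf", fields)
amount_name = build_filename("{datum} - {firma} - {betrag} EUR.pdf", fields)
```

Unused fields are ignored by `str.format`, so one extraction result can serve several schemas.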
+
+## System requirements
+
+- Docker or Python 3.11+
+- MariaDB/MySQL (recommended) or SQLite
+- For OCR: tesseract-ocr, ocrmypdf
0
backend/app/__init__.py → Source/backend/app/__init__.py
Normal file → Executable file
25
Source/backend/app/config.py
Executable file
@@ -0,0 +1,25 @@
+"""Central configuration"""
+import os
+from pathlib import Path
+
+# Base paths
+BASE_DIR = Path(__file__).parent.parent.parent
+REGELN_DIR = BASE_DIR / "regeln"
+
+# Database (SQLite default only for local development)
+DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///dateiverwaltung.db")
+
+# Fallback folders (only used when no target folder is specified)
+# These are NOT created automatically; they are only defined as fallback paths
+DATA_DIR = Path("/app/data")  # Container-internal path
+INBOX_DIR = DATA_DIR / "inbox"
+PROCESSED_DIR = DATA_DIR / "processed"
+ARCHIVE_DIR = DATA_DIR / "archive"
+ZUGFERD_DIR = DATA_DIR / "zugferd"
+
+# OCR settings
+OCR_LANGUAGE = "deu"  # German
+OCR_DPI = 300
+
+# Only create the rules folder (mounted via volume)
+REGELN_DIR.mkdir(parents=True, exist_ok=True)
31
backend/app/main.py → Source/backend/app/main.py
Normal file → Executable file
@@ -2,6 +2,7 @@
 Dateiverwaltung - Modular document management system
 Main application with FastAPI
 """
+from contextlib import asynccontextmanager
 from fastapi import FastAPI
 from fastapi.staticfiles import StaticFiles
 from fastapi.templating import Jinja2Templates
@@ -13,18 +14,39 @@ import logging
 from .models import init_db
 from .routes.api import router as api_router
 from .config import BASE_DIR
+from .services.scheduler_service import init_scheduler, shutdown_scheduler
 
 # Configure logging
 logging.basicConfig(
     level=logging.INFO,
     format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
 )
+logger = logging.getLogger(__name__)
+
+
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    """Lifecycle management for the app"""
+    # Startup
+    print("=== Dateiverwaltung startet ===", flush=True)
+    init_db()
+    print("Datenbank initialisiert", flush=True)
+    init_scheduler()
+    print("Scheduler initialisiert", flush=True)
+
+    yield
+
+    # Shutdown
+    shutdown_scheduler()
+    print("Scheduler beendet", flush=True)
+
 
 # Create the app
 app = FastAPI(
     title="Dateiverwaltung",
     description="Modulares Dokumenten-Management-System",
-    version="1.0.0"
+    version="1.0.0",
+    lifespan=lifespan
 )
 
 # Static files
@@ -38,13 +60,6 @@ templates = Jinja2Templates(directory=frontend_dir / "templates")
 app.include_router(api_router)
 
 
-@app.on_event("startup")
-async def startup():
-    """Initialization at startup"""
-    init_db()
-    logging.info("Datenbank initialisiert")
-
-
 @app.get("/", response_class=HTMLResponse)
 async def index(request: Request):
     """Main page"""
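The change in `main.py` replaces the `@app.on_event("startup")` hook with a `lifespan` context manager: code before `yield` runs at startup, code after it at shutdown. The ordering can be illustrated without FastAPI, using only `contextlib` and `asyncio`. This is a sketch, not the project's code: the `events` list and the string markers are hypothetical stand-ins for `init_db()`, `init_scheduler()` and `shutdown_scheduler()`.

```python
import asyncio
from contextlib import asynccontextmanager

events = []  # records the order of calls for illustration


@asynccontextmanager
async def lifespan(app):
    # Startup: runs before the application starts serving requests
    events.append("init_db")
    events.append("init_scheduler")
    yield
    # Shutdown: runs after the application stops serving
    events.append("shutdown_scheduler")


async def main():
    # A framework such as FastAPI enters the context once around the serving phase
    async with lifespan(app=None):
        events.append("serving")


asyncio.run(main())
```

Keeping startup and shutdown in one function makes it harder to forget the matching cleanup, here stopping the scheduler that was started.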
4
Source/backend/app/models/__init__.py
Executable file
@@ -0,0 +1,4 @@
+from .database import (
+    Postfach, QuellOrdner, SortierRegel, VerarbeiteteDatei, Zeitplan,
+    VerarbeiteteMail, init_db, get_db, SessionLocal
+)
276
Source/backend/app/models/database.py
Executable file
@@ -0,0 +1,276 @@
+"""Database models - separate areas: mail retrieval and file sorting"""
+from sqlalchemy import create_engine, Column, Integer, String, Boolean, DateTime, Text, JSON, ForeignKey
+from sqlalchemy.ext.declarative import declarative_base
+from sqlalchemy.orm import sessionmaker
+from datetime import datetime
+
+from sqlalchemy import event
+from ..config import DATABASE_URL
+
+# Create the database engine (SQLite or MariaDB)
+is_sqlite = DATABASE_URL.startswith("sqlite")
+
+if is_sqlite:
+    # SQLite with WAL mode and timeout for better concurrency
+    engine = create_engine(
+        DATABASE_URL,
+        echo=False,
+        connect_args={"check_same_thread": False, "timeout": 30}
+    )
+
+    # Enable WAL mode for better concurrent access
+    @event.listens_for(engine, "connect")
+    def set_sqlite_pragma(dbapi_connection, connection_record):
+        cursor = dbapi_connection.cursor()
+        cursor.execute("PRAGMA journal_mode=WAL")
+        cursor.execute("PRAGMA busy_timeout=30000")
+        cursor.close()
+else:
+    # MariaDB/MySQL
+    engine = create_engine(
+        DATABASE_URL,
+        echo=False,
+        pool_pre_ping=True,
+        pool_recycle=3600
+    )
+
+SessionLocal = sessionmaker(bind=engine)
+Base = declarative_base()
+
+
+# ============ AREA 1: Mail retrieval ============
+
+class Postfach(Base):
+    """IMAP mailbox configuration"""
+    __tablename__ = "postfaecher"
+
+    id = Column(Integer, primary_key=True)
+    name = Column(String(100), nullable=False)
+
+    # IMAP
+    imap_server = Column(String(255), nullable=False)
+    imap_port = Column(Integer, default=993)
+    email = Column(String(255), nullable=False)
+    passwort = Column(String(255), nullable=False)
+    ordner = Column(String(100), default="INBOX")
+    alle_ordner = Column(Boolean, default=False)  # Search all IMAP folders
+    nur_ungelesen = Column(Boolean, default=False)  # Only unread mails (False = all)
+
+    # Target
+    ziel_ordner = Column(String(500), nullable=False)
+
+    # Filters
+    erlaubte_typen = Column(JSON, default=lambda: [".pdf"])
+    max_groesse_mb = Column(Integer, default=25)
+    min_groesse_kb = Column(Integer, default=10)  # Minimum size in KB (to skip icons)
+    ab_datum = Column(DateTime)  # Only process mails from this date onwards
+    # Size filter per file type: {".pdf": {"min_kb": 10, "max_mb": 25}, ".jpg": {"min_kb": 50, "max_mb": 10}}
+    groessen_filter = Column(JSON, default=lambda: {})
+
+    # Status
+    aktiv = Column(Boolean, default=True)
+    letzter_abruf = Column(DateTime)
+    letzte_anzahl = Column(Integer, default=0)
+
+
+# ============ AREA 2: File sorting ============
+
+class QuellOrdner(Base):
+    """Folder that is scanned for files"""
+    __tablename__ = "quell_ordner"
+
+    id = Column(Integer, primary_key=True)
+    name = Column(String(100), nullable=False)
+    pfad = Column(String(500), nullable=False)
+    ziel_ordner = Column(String(500), nullable=False)
+    rekursiv = Column(Boolean, default=True)  # Include subfolders
+    dateitypen = Column(JSON, default=lambda: [".pdf", ".jpg", ".jpeg", ".png", ".tiff"])
+    # ZUGFeRD handling: "separieren", "regel", "normal", "ignorieren"
+    zugferd_behandlung = Column(String(20), default="separieren")
+    # Signed PDFs: "normal", "separieren", "regel", "ignorieren"
+    signiert_behandlung = Column(String(20), default="normal")
+    aktiv = Column(Boolean, default=True)
+    # NEW: Move directly without checking rules
+    direkt_verschieben = Column(Boolean, default=False)
+    # OCR options
+    ocr_aktivieren = Column(Boolean, default=True)  # OCR for scanned PDFs
+    original_sichern = Column(String(500))  # Folder for backup of the original (before OCR)
+
+    # Status (as for mailboxes)
+    letzte_verarbeitung = Column(DateTime)
+    letzte_anzahl = Column(Integer, default=0)
+
+
+class SortierRegel(Base):
+    """Rules for file detection and naming"""
+    __tablename__ = "sortier_regeln"
+
+    id = Column(Integer, primary_key=True)
+    name = Column(String(100), nullable=False)
+    prioritaet = Column(Integer, default=100)
+    aktiv = Column(Boolean, default=True)
+
+    # Detection patterns (JSON with keywords, text_match, text_regex, etc.)
+    # NEW: Negative patterns are also possible (keywords_nicht, text_not_match)
+    muster = Column(JSON, default=dict)
+
+    # Extraction (JSON with datum, betrag, nummer, firma, etc.)
+    extraktion = Column(JSON, default=dict)
+
+    # Output
+    schema = Column(String(500), default="{datum} - Dokument.pdf")
+    unterordner = Column(String(100))  # Optional: subfolder in the target
+
+    # NEW: Fallback rule (applies when no other rule matches)
+    ist_fallback = Column(Boolean, default=False)
+
+    # Free folders (in addition to the assigned source folders)
+    freie_ordner = Column(JSON, default=list)
+
+    # Target folder for this rule (optional, overrides the source folder's target)
+    ziel_ordner = Column(String(500))
+
+    # Rename only, do not move (files stay in the source folder)
+    nur_umbenennen = Column(Boolean, default=False)
+
+
+class OrdnerRegel(Base):
+    """Link between source folders and sorting rules"""
+    __tablename__ = "ordner_regeln"
+
+    id = Column(Integer, primary_key=True)
+    ordner_id = Column(Integer, ForeignKey("quell_ordner.id", ondelete="CASCADE"), nullable=False)
+    regel_id = Column(Integer, ForeignKey("sortier_regeln.id", ondelete="CASCADE"), nullable=False)
+
+
+class Zeitplan(Base):
+    """Scheduler configuration for automatic execution"""
+    __tablename__ = "zeitplaene"
+
+    id = Column(Integer, primary_key=True)
+    name = Column(String(100), nullable=False)
+    aktiv = Column(Boolean, default=True)
+
+    # What is executed
+    typ = Column(String(50), nullable=False)  # "mail_abruf", "grobsortierung", "sortierregeln"
+    postfach_id = Column(Integer)  # Optional: specific mailbox
+    quell_ordner_id = Column(Integer)  # Optional: specific source folder
+    regel_id = Column(Integer)  # Optional: specific rule (for sortierregeln)
+
+    # Schedule interval
+    intervall = Column(String(20), nullable=False)  # "stündlich", "täglich", "wöchentlich", "monatlich"
+    stunde = Column(Integer, default=6)  # Hour of day (0-23)
+    minute = Column(Integer, default=0)  # Minute (0-59)
+    wochentag = Column(Integer)  # 0=Monday, 6=Sunday (for wöchentlich)
+    monatstag = Column(Integer)  # 1-28 (for monatlich)
+
+    # Status
+    letzte_ausfuehrung = Column(DateTime)
+    naechste_ausfuehrung = Column(DateTime)
+    letzter_status = Column(String(50))  # "erfolg", "fehler"
+    letzte_meldung = Column(Text)
+
+    erstellt_am = Column(DateTime, default=datetime.utcnow)
+
+
+class VerarbeiteteMail(Base):
+    """Tracks which mails have already been processed"""
+    __tablename__ = "verarbeitete_mails"
+
+    id = Column(Integer, primary_key=True)
+    postfach_id = Column(Integer, nullable=False)
+    message_id = Column(String(500), nullable=False)  # Email Message-ID header
+    ordner = Column(String(200))  # IMAP folder
+    betreff = Column(String(500))
+    absender = Column(String(255))
+    anzahl_attachments = Column(Integer, default=0)
+    verarbeitet_am = Column(DateTime, default=datetime.utcnow)
+
+
+class VerarbeiteteDatei(Base):
+    """Log of processed files"""
+    __tablename__ = "verarbeitete_dateien"
+
+    id = Column(Integer, primary_key=True)
+    original_pfad = Column(String(1000))
+    original_name = Column(String(500))
+    neuer_pfad = Column(String(1000))
+    neuer_name = Column(String(500))
+
+    ist_zugferd = Column(Boolean, default=False)
+    ocr_durchgefuehrt = Column(Boolean, default=False)
+
+    status = Column(String(50))  # sortiert, zugferd, fehler, keine_regel
+    fehler = Column(Text)
+
+    extrahierte_daten = Column(JSON)
+    verarbeitet_am = Column(DateTime, default=datetime.utcnow)
+
+
+def migrate_db():
+    """Adds missing columns without deleting data"""
+    from sqlalchemy import inspect, text
+
+    inspector = inspect(engine)
+
+    # Migration definitions: {table: {column: sql_type}}
+    migrations = {
+        "postfaecher": {
+            "alle_ordner": "BOOLEAN DEFAULT 0",
+            "nur_ungelesen": "BOOLEAN DEFAULT 0",
+            "min_groesse_kb": "INTEGER DEFAULT 10",
+            "ab_datum": "DATETIME",
+            "groessen_filter": "JSON"
+        },
+        "quell_ordner": {
+            "rekursiv": "BOOLEAN DEFAULT 1",
+            "dateitypen": "JSON",
+            "zugferd_behandlung": "VARCHAR(20) DEFAULT 'separieren'",
+            "signiert_behandlung": "VARCHAR(20) DEFAULT 'normal'",
+            "direkt_verschieben": "BOOLEAN DEFAULT 0",
+            "ocr_aktivieren": "BOOLEAN DEFAULT 1",
+            "original_sichern": "VARCHAR(500)",
+            "letzte_verarbeitung": "DATETIME",
+            "letzte_anzahl": "INTEGER DEFAULT 0"
+        },
+        "sortier_regeln": {
+            "ist_fallback": "BOOLEAN DEFAULT 0",
+            "freie_ordner": "JSON",
+            "ziel_ordner": "VARCHAR(500)",
+            "nur_umbenennen": "BOOLEAN DEFAULT 0"
+        },
+        "zeitplaene": {
+            "regel_id": "INTEGER"
+        }
+    }
+
+    with engine.connect() as conn:
+        for table, columns in migrations.items():
+            if table not in inspector.get_table_names():
+                continue
+
+            existing = [col["name"] for col in inspector.get_columns(table)]
+
+            for col_name, col_type in columns.items():
+                if col_name not in existing:
+                    try:
+                        conn.execute(text(f"ALTER TABLE {table} ADD COLUMN {col_name} {col_type}"))
+                        conn.commit()
+                        print(f"Migration: {table}.{col_name} hinzugefügt")
+                    except Exception as e:
+                        print(f"Migration übersprungen: {table}.{col_name} - {e}")
+
+
+def init_db():
+    """Initialize the database"""
+    Base.metadata.create_all(engine)
+    migrate_db()
+
+
+def get_db():
+    """Database session generator"""
+    db = SessionLocal()
+    try:
+        yield db
+    finally:
+        db.close()
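The idea behind `migrate_db()` is an additive migration: compare the columns a table already has with the columns the code expects, and issue `ALTER TABLE ... ADD COLUMN` only for the missing ones. The following is a minimal sketch of that idea using the standard-library `sqlite3` module instead of SQLAlchemy; the helper `add_missing_columns` and the example table layout are hypothetical, and `PRAGMA table_info` is SQLite-specific.

```python
import sqlite3


def add_missing_columns(conn: sqlite3.Connection, migrations: dict) -> list:
    """Add columns listed in `migrations` that do not exist yet; return what was added."""
    added = []
    for table, columns in migrations.items():
        # PRAGMA table_info yields one row per column; index 1 is the column name
        existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        if not existing:
            continue  # table does not exist, nothing to migrate
        for name, sql_type in columns.items():
            if name not in existing:
                conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
                added.append(f"{table}.{name}")
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zeitplaene (id INTEGER PRIMARY KEY, name VARCHAR(100))")

migrations = {"zeitplaene": {"regel_id": "INTEGER"}, "fehlt": {"x": "INTEGER"}}
first_run = add_missing_columns(conn, migrations)
second_run = add_missing_columns(conn, migrations)
```

Because existing columns are skipped, the function can run on every start, which is how `init_db()` uses `migrate_db()` in the commit. Table and column names are interpolated into the SQL string, so this pattern is only safe with names hard-coded in the source, as they are here.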
0
backend/app/modules/__init__.py → Source/backend/app/modules/__init__.py
Normal file → Executable file
22
backend/app/modules/extraktoren.py → Source/backend/app/modules/extraktoren.py
Normal file → Executable file
@@ -12,22 +12,22 @@ logger = logging.getLogger(__name__)
 
 # ============ DATE ============
 DATUM_MUSTER = [
-    # With context (more reliable)
-    {"regex": r"Rechnungsdatum[:\s]*(\d{2})[./](\d{2})[./](\d{4})", "order": "dmy"},
-    {"regex": r"Belegdatum[:\s]*(\d{2})[./](\d{2})[./](\d{4})", "order": "dmy"},
-    {"regex": r"Datum[:\s]*(\d{2})[./](\d{2})[./](\d{4})", "order": "dmy"},
-    {"regex": r"Date[:\s]*(\d{2})[./](\d{2})[./](\d{4})", "order": "dmy"},
-    {"regex": r"vom[:\s]*(\d{2})[./](\d{2})[./](\d{4})", "order": "dmy"},
+    # With context (more reliable) - accepts 1 or 2 digits for day/month
+    {"regex": r"Rechnungsdatum[:\s]*(\d{1,2})[./](\d{1,2})[./](\d{4})", "order": "dmy"},
+    {"regex": r"Belegdatum[:\s]*(\d{1,2})[./](\d{1,2})[./](\d{4})", "order": "dmy"},
+    {"regex": r"Datum[:\s]*(\d{1,2})[./](\d{1,2})[./](\d{4})", "order": "dmy"},
+    {"regex": r"Date[:\s]*(\d{1,2})[./](\d{1,2})[./](\d{4})", "order": "dmy"},
+    {"regex": r"vom[:\s]*(\d{1,2})[./](\d{1,2})[./](\d{4})", "order": "dmy"},
 
-    # ISO format
+    # ISO format (always 2 digits)
     {"regex": r"(\d{4})-(\d{2})-(\d{2})", "order": "ymd"},
 
-    # German format without context
-    {"regex": r"(\d{2})\.(\d{2})\.(\d{4})", "order": "dmy"},
-    {"regex": r"(\d{2})/(\d{2})/(\d{4})", "order": "dmy"},
+    # German format without context - accepts 1 or 2 digits
+    {"regex": r"(\d{1,2})\.(\d{1,2})\.(\d{4})", "order": "dmy"},
+    {"regex": r"(\d{1,2})/(\d{1,2})/(\d{4})", "order": "dmy"},
 
     # American format
-    {"regex": r"(\d{2})/(\d{2})/(\d{4})", "order": "mdy"},
+    {"regex": r"(\d{1,2})/(\d{1,2})/(\d{4})", "order": "mdy"},
 
     # Written-out months
     {"regex": r"(\d{1,2})\.\s*(Januar|Februar|März|April|Mai|Juni|Juli|August|September|Oktober|November|Dezember)\s*(\d{4})", "order": "dMy"},
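The effect of widening `\d{2}` to `\d{1,2}` can be checked with a short sketch. It assumes the extractor tries the patterns in list order and uses the first one that yields a valid date; that matching strategy, the helper `find_date`, and the shortened pattern list `DATE_PATTERNS` are assumptions for illustration, since the code that consumes `DATUM_MUSTER` is not shown in this diff.

```python
import re
from datetime import date

# Shortened copy of the new patterns (numeric orders only)
DATE_PATTERNS = [
    {"regex": r"Rechnungsdatum[:\s]*(\d{1,2})[./](\d{1,2})[./](\d{4})", "order": "dmy"},
    {"regex": r"(\d{4})-(\d{2})-(\d{2})", "order": "ymd"},
    {"regex": r"(\d{1,2})\.(\d{1,2})\.(\d{4})", "order": "dmy"},
]

OLD_GERMAN_PATTERN = r"(\d{2})\.(\d{2})\.(\d{4})"  # pattern before this commit


def find_date(text: str):
    """Return the first valid date found by any pattern, or None."""
    for pattern in DATE_PATTERNS:
        match = re.search(pattern["regex"], text)
        if not match:
            continue
        # Map each captured group to the letter at the same position in "order"
        parts = dict(zip(pattern["order"], map(int, match.groups())))
        try:
            return date(parts["y"], parts["m"], parts["d"])
        except ValueError:
            continue  # syntactically matched but not a real date, e.g. 31.02.2026
    return None


short_date = find_date("Rechnungsdatum: 1.2.2026")
iso_date = find_date("Lieferung am 2026-03-15")
old_match = re.search(OLD_GERMAN_PATTERN, "1.2.2026")
```

Under a first-match strategy, the `mdy` pattern in `DATUM_MUSTER` would only be reached if the identical `dmy` slash pattern before it is rejected, for example by a validity check like the `ValueError` handling above.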
89
backend/app/modules/mail_fetcher.py → Source/backend/app/modules/mail_fetcher.py
Normal file → Executable file
@@ -102,6 +102,8 @@ class MailFetcher:
         ergebnisse = []
         erlaubte_typen = self.config.get("erlaubte_typen", [".pdf"])
         max_groesse = self.config.get("max_groesse_mb", 25) * 1024 * 1024
+        min_groesse = self.config.get("min_groesse_kb", 10) * 1024  # Minimum size in bytes
+        groessen_filter = self.config.get("groessen_filter", {})  # Per file type: {".pdf": {"min_kb": 10, "max_mb": 25}}
         bereits_verarbeitet = bereits_verarbeitet or set()
 
         # Determine folders
@@ -113,14 +115,15 @@ class MailFetcher:
 
         for ordner in ordner_liste:
             ergebnisse.extend(self._fetch_from_folder(
-                ordner, ziel, erlaubte_typen, max_groesse,
-                nur_ungelesen, markiere_gelesen, bereits_verarbeitet
+                ordner, ziel, erlaubte_typen, max_groesse, min_groesse,
+                groessen_filter, nur_ungelesen, markiere_gelesen, bereits_verarbeitet
             ))
 
         return ergebnisse
 
     def _fetch_from_folder(self, ordner: str, ziel: Path,
                            erlaubte_typen: List[str], max_groesse: int,
+                           min_groesse: int, groessen_filter: dict,
                            nur_ungelesen: bool, markiere_gelesen: bool,
                            bereits_verarbeitet: set) -> List[Dict]:
         """Fetches attachments from a single folder"""
@@ -129,9 +132,25 @@ class MailFetcher:
         try:
             # Select folder
             status, _ = self.connection.select(ordner)
+            if status != "OK":
+                logger.debug(f"Ordner nicht zugreifbar: {ordner}")
+                return []
 
-            # Search for mails
-            search_criteria = "(UNSEEN)" if nur_ungelesen else "ALL"
+            # Search for mails with an optional date filter
+            ab_datum = self.config.get("ab_datum")
+            if nur_ungelesen:
+                search_criteria = "(UNSEEN)"
+            elif ab_datum:
+                # IMAP SINCE expects the format dd-Mon-yyyy
+                try:
+                    if isinstance(ab_datum, str):
+                        ab_datum = datetime.fromisoformat(ab_datum.replace("Z", "+00:00"))
+                    datum_str = ab_datum.strftime("%d-%b-%Y")
+                    search_criteria = f'(SINCE {datum_str})'
+                except:
+                    search_criteria = "ALL"
+            else:
+                search_criteria = "ALL"
             status, messages = self.connection.search(None, search_criteria)
 
             if status != "OK":
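The hunk above builds the IMAP `SINCE` criterion with `strftime("%d-%b-%Y")`. IMAP expects English month abbreviations, while `%b` follows the process locale, so the result could differ if a non-English locale is active. A locale-independent variant might look like the sketch below; the helper `imap_since_criteria` and the constant `IMAP_MONTHS` are hypothetical and not part of the commit.

```python
from datetime import datetime

# Month abbreviations as used in IMAP date strings (always English)
IMAP_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_since_criteria(ab_datum) -> str:
    """Build an IMAP search criterion like '(SINCE 01-Feb-2026)'.

    Accepts a datetime or an ISO 8601 string (a trailing 'Z' is treated as UTC).
    """
    if isinstance(ab_datum, str):
        ab_datum = datetime.fromisoformat(ab_datum.replace("Z", "+00:00"))
    month = IMAP_MONTHS[ab_datum.month - 1]
    return f"(SINCE {ab_datum.day:02d}-{month}-{ab_datum.year:04d})"


from_string = imap_since_criteria("2026-02-01T00:00:00Z")
from_datetime = imap_since_criteria(datetime(2025, 12, 24))
```

In the commit, a parsing failure falls back to `"ALL"`; a caller of this sketch could keep that behavior by catching `ValueError` around the call.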
@@ -181,8 +200,17 @@ class MailFetcher:
                         if not payload:
                             continue
 
-                        if len(payload) > max_groesse:
-                            logger.warning(f"Überspringe {filename}: Zu groß ({len(payload)} bytes)")
+                        # Size limits: per file type or global
+                        typ_filter = groessen_filter.get(datei_endung, {})
+                        typ_max = typ_filter.get("max_mb", max_groesse / (1024 * 1024)) * 1024 * 1024
+                        typ_min = typ_filter.get("min_kb", min_groesse / 1024) * 1024
+
+                        if len(payload) > typ_max:
+                            logger.warning(f"Überspringe {filename}: Zu groß ({len(payload)} bytes, max {typ_max})")
+                            continue
+
+                        if len(payload) < typ_min:
+                            logger.debug(f"Überspringe {filename}: Zu klein ({len(payload)} bytes, min {typ_min})")
                             continue
 
                         # Save
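The size check introduced in this hunk resolves limits per file type and falls back to the global limits when a type has no entry in `groessen_filter`. The logic can be isolated into two small functions, sketched here with the same arithmetic as the diff. The function names `size_limits` and `is_size_allowed` are hypothetical, and the sketch assumes `datei_endung` is already in the same form (for example lower case with a leading dot) as the keys of `groessen_filter`.

```python
def size_limits(datei_endung: str, groessen_filter: dict,
                max_groesse: int, min_groesse: int) -> tuple:
    """Return (min_bytes, max_bytes) for a file extension.

    max_groesse and min_groesse are the global limits in bytes; entries in
    groessen_filter use "max_mb" and "min_kb" and override them per type.
    """
    typ_filter = groessen_filter.get(datei_endung, {})
    typ_max = typ_filter.get("max_mb", max_groesse / (1024 * 1024)) * 1024 * 1024
    typ_min = typ_filter.get("min_kb", min_groesse / 1024) * 1024
    return typ_min, typ_max


def is_size_allowed(size: int, datei_endung: str, groessen_filter: dict,
                    max_groesse: int, min_groesse: int) -> bool:
    """True if an attachment of `size` bytes passes the limits for its type."""
    typ_min, typ_max = size_limits(datei_endung, groessen_filter, max_groesse, min_groesse)
    return typ_min <= size <= typ_max


# Global limits as in the defaults of the commit: 25 MB maximum, 10 KB minimum
MAX_BYTES = 25 * 1024 * 1024
MIN_BYTES = 10 * 1024
filters = {".jpg": {"min_kb": 50, "max_mb": 10}}

jpg_limits = size_limits(".jpg", filters, MAX_BYTES, MIN_BYTES)
small_jpg_ok = is_size_allowed(20 * 1024, ".jpg", filters, MAX_BYTES, MIN_BYTES)
small_pdf_ok = is_size_allowed(20 * 1024, ".pdf", filters, MAX_BYTES, MIN_BYTES)
```

A 20 KB JPEG is rejected by its type-specific 50 KB minimum, while a 20 KB PDF passes the global 10 KB minimum. The same block appears twice in `mail_fetcher.py` (list and generator variants), so a shared helper of this kind would remove the duplication.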
|
|
@ -219,8 +247,11 @@ class MailFetcher:
|
|||
logger.error(f"Fehler bei Mail {mail_id}: {e}")
|
||||
continue
|
||||
|
||||
except imaplib.IMAP4.error as e:
|
||||
# IMAP-Fehler beim Ordner-Zugriff (z.B. nicht existent, keine Berechtigung)
|
||||
logger.debug(f"Ordner übersprungen: {ordner} - {e}")
|
||||
except Exception as e:
|
||||
logger.error(f"Fehler beim Abrufen: {e}")
|
||||
logger.error(f"Fehler beim Abrufen aus {ordner}: {e}")
|
||||
|
||||
return ergebnisse
|
||||
|
||||
|
|
@ -269,6 +300,8 @@ class MailFetcher:
|
|||
|
||||
erlaubte_typen = self.config.get("erlaubte_typen", [".pdf"])
|
||||
max_groesse = self.config.get("max_groesse_mb", 25) * 1024 * 1024
|
||||
min_groesse = self.config.get("min_groesse_kb", 10) * 1024
|
||||
groessen_filter = self.config.get("groessen_filter", {}) # Pro Dateityp
|
||||
bereits_verarbeitet = bereits_verarbeitet or set()
|
||||
|
||||
# Determine folders
|
||||
|
|
@@ -283,7 +316,25 @@ class MailFetcher:
|
|||
|
||||
try:
|
||||
status, _ = self.connection.select(ordner)
|
||||
search_criteria = "(UNSEEN)" if nur_ungelesen else "ALL"
|
||||
if status != "OK":
|
||||
# Folder could not be opened (not accessible)
|
||||
logger.debug(f"Ordner nicht zugreifbar: {ordner}")
|
||||
continue
|
||||
|
||||
# Search with optional date filter
|
||||
ab_datum = self.config.get("ab_datum")
|
||||
if nur_ungelesen:
|
||||
search_criteria = "(UNSEEN)"
|
||||
elif ab_datum:
|
||||
try:
|
||||
if isinstance(ab_datum, str):
|
||||
ab_datum = datetime.fromisoformat(ab_datum.replace("Z", "+00:00"))
|
||||
datum_str = ab_datum.strftime("%d-%b-%Y")
|
||||
search_criteria = f'(SINCE {datum_str})'
|
||||
except (ValueError, TypeError, AttributeError):
|
||||
search_criteria = "ALL"
|
||||
else:
|
||||
search_criteria = "ALL"
|
||||
status, messages = self.connection.search(None, search_criteria)
|
||||
|
||||
if status != "OK":
|
||||
|
|
@@ -326,8 +377,17 @@ class MailFetcher:
|
|||
if not payload:
|
||||
continue
|
||||
|
||||
if len(payload) > max_groesse:
|
||||
yield {"type": "skip", "datei": filename, "grund": "zu groß"}
|
||||
# Size limits: per file type, falling back to the global limits
|
||||
typ_filter = groessen_filter.get(datei_endung, {})
|
||||
typ_max = typ_filter.get("max_mb", max_groesse / (1024 * 1024)) * 1024 * 1024
|
||||
typ_min = typ_filter.get("min_kb", min_groesse / 1024) * 1024
|
||||
|
||||
if len(payload) > typ_max:
|
||||
yield {"type": "skip", "datei": filename, "grund": f"zu groß ({len(payload)//1024}KB > {int(typ_max//1024)}KB)"}
|
||||
continue
|
||||
|
||||
if len(payload) < typ_min:
|
||||
yield {"type": "skip", "datei": filename, "grund": f"zu klein ({len(payload)//1024}KB < {int(typ_min//1024)}KB)"}
|
||||
continue
|
||||
|
||||
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||
|
|
@@ -360,8 +420,15 @@ class MailFetcher:
|
|||
yield {"type": "fehler", "nachricht": f"Mail-Fehler: {str(e)[:100]}"}
|
||||
continue
|
||||
|
||||
except imaplib.IMAP4.error as e:
|
||||
# IMAP error while accessing the folder (e.g. folder missing, no permission)
|
||||
# These errors are expected with "alle_ordner" and are only logged
|
||||
logger.debug(f"Ordner übersprungen (nicht zugreifbar): {ordner} - {e}")
|
||||
continue
|
||||
except Exception as e:
|
||||
yield {"type": "fehler", "nachricht": f"Ordner-Fehler {ordner}: {str(e)[:100]}"}
|
||||
# Other unexpected errors are logged as warnings
|
||||
logger.warning(f"Ordner-Fehler {ordner}: {e}")
|
||||
continue
|
||||
|
||||
def test_connection(self) -> Dict:
|
||||
"""Testet die Verbindung und gibt Status zurück"""
|
||||
510
Source/backend/app/modules/pdf_processor.py
Executable file
|
|
@@ -0,0 +1,510 @@
|
|||
"""
|
||||
PDF processor module
|
||||
Text extraction, OCR and ZUGFeRD detection
|
||||
"""
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Tuple
|
||||
import logging
|
||||
import re
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Try to import optional libraries
|
||||
try:
|
||||
import pdfplumber
|
||||
PDFPLUMBER_AVAILABLE = True
|
||||
except ImportError:
|
||||
PDFPLUMBER_AVAILABLE = False
|
||||
logger.warning("pdfplumber nicht installiert")
|
||||
|
||||
try:
|
||||
from pypdf import PdfReader
|
||||
PYPDF_AVAILABLE = True
|
||||
except ImportError:
|
||||
PYPDF_AVAILABLE = False
|
||||
logger.warning("pypdf nicht installiert")
|
||||
|
||||
|
||||
class PDFProcessor:
|
||||
"""Verarbeitet PDFs: Text-Extraktion, OCR, ZUGFeRD-Erkennung"""
|
||||
|
||||
def __init__(self, ocr_language: str = "deu", ocr_dpi: int = 300):
|
||||
self.ocr_language = ocr_language
|
||||
self.ocr_dpi = ocr_dpi
|
||||
|
||||
def verarbeite(self, pdf_pfad: str, ocr_erlaubt: bool = True, original_backup_pfad: Optional[str] = None) -> Dict:
|
||||
"""
|
||||
Vollständige PDF-Verarbeitung
|
||||
|
||||
Args:
|
||||
pdf_pfad: Pfad zur PDF
|
||||
ocr_erlaubt: OCR durchführen wenn nötig
|
||||
original_backup_pfad: Ordner für Original-Backup vor OCR
|
||||
|
||||
Returns:
|
||||
Dict with: pfad, text, ist_zugferd, zugferd_xml, ist_signiert, hat_text, ocr_durchgefuehrt, original_gesichert, seiten (or only "fehler" if the file does not exist)
|
||||
"""
|
||||
pfad = Path(pdf_pfad)
|
||||
if not pfad.exists():
|
||||
return {"fehler": f"Datei nicht gefunden: {pdf_pfad}"}
|
||||
|
||||
ergebnis = {
|
||||
"pfad": str(pfad),
|
||||
"text": "",
|
||||
"ist_zugferd": False,
|
||||
"zugferd_xml": None,
|
||||
"ist_signiert": False,
|
||||
"hat_text": False,
|
||||
"ocr_durchgefuehrt": False,
|
||||
"original_gesichert": None,
|
||||
"seiten": 0
|
||||
}
|
||||
|
||||
# 1. Check for ZUGFeRD
|
||||
zugferd_result = self.pruefe_zugferd(pdf_pfad)
|
||||
ergebnis["ist_zugferd"] = zugferd_result["ist_zugferd"]
|
||||
ergebnis["zugferd_xml"] = zugferd_result.get("xml")
|
||||
|
||||
# 2. Check for a digital signature
|
||||
ergebnis["ist_signiert"] = self.pruefe_signatur(pdf_pfad)
|
||||
|
||||
# 3. Extract text
|
||||
text, seiten = self.extrahiere_text(pdf_pfad)
|
||||
ergebnis["text"] = text
|
||||
ergebnis["seiten"] = seiten
|
||||
ergebnis["hat_text"] = bool(text and len(text.strip()) > 50)
|
||||
|
||||
# 4. Run OCR if no text was found (but NOT for ZUGFeRD or signed PDFs, which OCR would rewrite)
|
||||
if ocr_erlaubt and not ergebnis["hat_text"] and not ergebnis["ist_zugferd"] and not ergebnis["ist_signiert"]:
|
||||
# Additional safety check: look for a ZUGFeRD attachment
|
||||
# (in case the regular ZUGFeRD detection above failed)
|
||||
hat_zugferd_attachment = self._hat_zugferd_attachment(pdf_pfad)
|
||||
if hat_zugferd_attachment:
|
||||
ergebnis["ist_zugferd"] = True
|
||||
logger.info(f"ZUGFeRD-Attachment gefunden, überspringe OCR: {pfad.name}")
|
||||
else:
|
||||
logger.info(f"Kein Text gefunden, starte OCR für {pfad.name}")
|
||||
|
||||
# Back up the original if requested
|
||||
if original_backup_pfad:
|
||||
backup_pfad = self.sichere_original(pdf_pfad, original_backup_pfad)
|
||||
if backup_pfad:
|
||||
ergebnis["original_gesichert"] = backup_pfad
|
||||
logger.info(f"Original gesichert: {backup_pfad}")
|
||||
|
||||
ocr_text, ocr_erfolg = self.fuehre_ocr_aus(pdf_pfad)
|
||||
if ocr_erfolg:
|
||||
ergebnis["text"] = ocr_text
|
||||
ergebnis["hat_text"] = bool(ocr_text and len(ocr_text.strip()) > 50)
|
||||
ergebnis["ocr_durchgefuehrt"] = True
|
||||
|
||||
return ergebnis
|
||||
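Review note: the OCR decision in `verarbeite` depends on four flags and a text-length threshold of 50 characters. A minimal sketch of that gate as two pure functions, which could make the rule easier to unit-test. The function names are hypothetical; the threshold and the conditions are taken from the diff.

```python
def hat_genug_text(text: str, min_zeichen: int = 50) -> bool:
    """A PDF counts as 'has text' only above a minimum number of characters."""
    return bool(text and len(text.strip()) > min_zeichen)


def ocr_noetig(ocr_erlaubt: bool, hat_text: bool,
               ist_zugferd: bool, ist_signiert: bool) -> bool:
    """OCR only runs when allowed, when no usable text exists, and when the
    file is neither a ZUGFeRD invoice nor digitally signed."""
    return ocr_erlaubt and not hat_text and not ist_zugferd and not ist_signiert
```

Keeping ZUGFeRD and signed files out of OCR matters because OCR replaces the file on disk, which would invalidate a signature and may drop the embedded XML.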
|
||||
def sichere_original(self, pdf_pfad: str, backup_ordner: str) -> Optional[str]:
|
||||
"""Sichert das Original-PDF vor OCR"""
|
||||
try:
|
||||
import shutil
|
||||
pfad = Path(pdf_pfad)
|
||||
backup_dir = Path(backup_ordner)
|
||||
backup_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Generate a unique file name
|
||||
backup_pfad = backup_dir / pfad.name
|
||||
counter = 1
|
||||
while backup_pfad.exists():
|
||||
backup_pfad = backup_dir / f"{pfad.stem}_{counter}{pfad.suffix}"
|
||||
counter += 1
|
||||
|
||||
shutil.copy2(str(pfad), str(backup_pfad))
|
||||
return str(backup_pfad)
|
||||
except Exception as e:
|
||||
logger.error(f"Original-Sicherung fehlgeschlagen: {e}")
|
||||
return None
|
||||
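Review note: the collision handling in `sichere_original` appends `_1`, `_2`, ... to the file stem until the name is free. A small sketch of just that naming step, assuming a hypothetical helper name; the copy itself is left out.

```python
from pathlib import Path


def eindeutiger_pfad(backup_dir: Path, dateiname: str) -> Path:
    """Return a path inside backup_dir that does not exist yet.

    The first candidate is the plain file name; on collision a counter is
    appended to the stem: rechnung.pdf, rechnung_1.pdf, rechnung_2.pdf, ...
    """
    original = Path(dateiname)
    kandidat = backup_dir / original.name
    counter = 1
    while kandidat.exists():
        kandidat = backup_dir / f"{original.stem}_{counter}{original.suffix}"
        counter += 1
    return kandidat
```

The check and the later copy are two separate steps, so two workers backing up the same file name at the same moment could still pick the same candidate. If the scheduler ever runs jobs in parallel, that is worth keeping in mind.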
|
||||
def extrahiere_text(self, pdf_pfad: str) -> Tuple[str, int]:
|
||||
"""
|
||||
Extrahiert Text aus PDF
|
||||
|
||||
Returns:
|
||||
Tuple von (text, seitenanzahl)
|
||||
"""
|
||||
text_parts = []
|
||||
seiten = 0
|
||||
|
||||
# Method 1: pdfplumber (better for tables)
|
||||
if PDFPLUMBER_AVAILABLE:
|
||||
try:
|
||||
with pdfplumber.open(pdf_pfad) as pdf:
|
||||
seiten = len(pdf.pages)
|
||||
for page in pdf.pages:
|
||||
page_text = page.extract_text()
|
||||
if page_text:
|
||||
text_parts.append(page_text)
|
||||
if text_parts:
|
||||
return "\n\n".join(text_parts), seiten
|
||||
except Exception as e:
|
||||
logger.debug(f"pdfplumber Fehler: {e}")
|
||||
|
||||
# Methode 2: pypdf (Fallback)
|
||||
if PYPDF_AVAILABLE:
|
||||
try:
|
||||
reader = PdfReader(pdf_pfad)
|
||||
seiten = len(reader.pages)
|
||||
for page in reader.pages:
|
||||
page_text = page.extract_text()
|
||||
if page_text:
|
||||
text_parts.append(page_text)
|
||||
if text_parts:
|
||||
return "\n\n".join(text_parts), seiten
|
||||
except Exception as e:
|
||||
logger.debug(f"pypdf Fehler: {e}")
|
||||
|
||||
# Methode 3: pdftotext CLI (Fallback)
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["pdftotext", "-layout", pdf_pfad, "-"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=30
|
||||
)
|
||||
if result.returncode == 0 and result.stdout.strip():
|
||||
return result.stdout, seiten
|
||||
except Exception as e:
|
||||
logger.debug(f"pdftotext Fehler: {e}")
|
||||
|
||||
return "", seiten
|
||||
|
||||
def pruefe_zugferd(self, pdf_pfad: str) -> Dict:
|
||||
"""
|
||||
Prüft ob PDF eine ZUGFeRD/Factur-X Rechnung ist
|
||||
|
||||
Returns:
|
||||
Dict mit: ist_zugferd, xml (falls vorhanden)
|
||||
"""
|
||||
ergebnis = {"ist_zugferd": False, "xml": None}
|
||||
|
||||
# Methode 1: factur-x Library
|
||||
try:
|
||||
from facturx import get_facturx_xml_from_pdf
|
||||
xml_bytes = get_facturx_xml_from_pdf(pdf_pfad)
|
||||
if xml_bytes:
|
||||
ergebnis["ist_zugferd"] = True
|
||||
ergebnis["xml"] = xml_bytes.decode("utf-8") if isinstance(xml_bytes, bytes) else xml_bytes
|
||||
logger.info(f"ZUGFeRD erkannt: {Path(pdf_pfad).name}")
|
||||
return ergebnis
|
||||
except ImportError:
|
||||
logger.debug("factur-x nicht installiert")
|
||||
except Exception as e:
|
||||
logger.debug(f"factur-x Fehler: {e}")
|
||||
|
||||
# Methode 2: Manuell nach eingebettetem ZUGFeRD-XML suchen
|
||||
if PYPDF_AVAILABLE:
|
||||
try:
|
||||
reader = PdfReader(pdf_pfad)
|
||||
|
||||
# ZUGFeRD/Factur-X XML-Dateinamen
|
||||
zugferd_dateinamen = [
|
||||
"factur-x.xml",
|
||||
"zugferd-invoice.xml",
|
||||
"xrechnung.xml",
|
||||
"ZUGFeRD-invoice.xml",
|
||||
]
|
||||
|
||||
# Eingebettete Dateien aus dem Catalog extrahieren
|
||||
if reader.trailer and "/Root" in reader.trailer:
|
||||
root = reader.trailer["/Root"]
|
||||
if hasattr(root, "get_object"):
|
||||
root = root.get_object()
|
||||
|
||||
if "/Names" in root:
|
||||
names = root["/Names"]
|
||||
if hasattr(names, "get_object"):
|
||||
names = names.get_object()
|
||||
|
||||
if "/EmbeddedFiles" in names:
|
||||
embedded = names["/EmbeddedFiles"]
|
||||
if hasattr(embedded, "get_object"):
|
||||
embedded = embedded.get_object()
|
||||
|
||||
# Namen-Array durchsuchen
|
||||
if "/Names" in embedded:
|
||||
names_array = embedded["/Names"]
|
||||
# Format: [name1, ref1, name2, ref2, ...]
|
||||
for i in range(0, len(names_array), 2):
|
||||
if i < len(names_array):
|
||||
dateiname = str(names_array[i]).lower()
|
||||
if any(zf.lower() in dateiname for zf in zugferd_dateinamen):
|
||||
ergebnis["ist_zugferd"] = True
|
||||
logger.info(f"ZUGFeRD-XML gefunden: {Path(pdf_pfad).name}")
|
||||
return ergebnis
|
||||
except Exception as e:
|
||||
logger.debug(f"ZUGFeRD-Prüfung Fehler: {e}")
|
||||
|
||||
return ergebnis
|
||||
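Review note: the manual fallback in `pruefe_zugferd` walks the flat `/Names` array of the embedded-files name tree, where names and file specs alternate. The matching step can be sketched without pypdf as below. The constant and function names are hypothetical, and the list of file names is the one from the diff, lowercased.

```python
from typing import List, Sequence

# Known ZUGFeRD / Factur-X attachment names, compared case-insensitively.
ZUGFERD_DATEINAMEN = ("factur-x.xml", "zugferd-invoice.xml", "xrechnung.xml")


def ist_zugferd_dateiname(dateiname: str) -> bool:
    """True if the attachment name contains one of the known invoice names."""
    name = dateiname.lower()
    return any(zf in name for zf in ZUGFERD_DATEINAMEN)


def finde_zugferd_namen(names_array: Sequence) -> List[str]:
    """names_array has the layout [name1, ref1, name2, ref2, ...];
    only the even positions are names."""
    return [str(n) for n in names_array[0::2] if ist_zugferd_dateiname(str(n))]
```

Since the comparison lowercases both sides, a single lowercase entry per file name is enough; listing `ZUGFeRD-invoice.xml` in addition to `zugferd-invoice.xml` does not change the result.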
|
||||
def pruefe_signatur(self, pdf_pfad: str) -> bool:
|
||||
"""
|
||||
Prüft ob PDF digital signiert ist
|
||||
|
||||
Returns:
|
||||
True wenn signiert, False sonst
|
||||
"""
|
||||
if not PYPDF_AVAILABLE:
|
||||
return False
|
||||
|
||||
try:
|
||||
reader = PdfReader(pdf_pfad)
|
||||
|
||||
# Methode 1: AcroForm mit Sig-Feldern prüfen
|
||||
if "/AcroForm" in reader.trailer.get("/Root", {}):
|
||||
root = reader.trailer["/Root"]
|
||||
if "/AcroForm" in root:
|
||||
acro_form = root["/AcroForm"]
|
||||
if "/SigFlags" in acro_form:
|
||||
sig_flags = acro_form["/SigFlags"]
|
||||
if sig_flags and int(sig_flags) > 0:
|
||||
logger.info(f"Signatur erkannt (SigFlags): {Path(pdf_pfad).name}")
|
||||
return True
|
||||
|
||||
# Methode 2: Nach /Sig Objekten suchen
|
||||
for page in reader.pages:
|
||||
if "/Annots" in page:
|
||||
annots = page["/Annots"]
|
||||
if annots:
|
||||
for annot in annots:
|
||||
try:
|
||||
annot_obj = annot.get_object() if hasattr(annot, 'get_object') else annot
|
||||
if annot_obj.get("/Subtype") == "/Widget":
|
||||
ft = annot_obj.get("/FT")
|
||||
if ft == "/Sig":
|
||||
logger.info(f"Signatur erkannt (Annot): {Path(pdf_pfad).name}")
|
||||
return True
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
except Exception as e:
|
||||
logger.debug(f"Signatur-Prüfung Fehler: {e}")
|
||||
|
||||
return False
|
||||
|
||||
def _hat_zugferd_attachment(self, pdf_pfad: str) -> bool:
|
||||
"""
|
||||
Prüft ob die PDF ein ZUGFeRD/Factur-X XML-Attachment enthält.
|
||||
Zusätzliche Sicherheitsprüfung vor OCR.
|
||||
|
||||
Returns:
|
||||
True wenn ZUGFeRD-Attachment gefunden
|
||||
"""
|
||||
zugferd_dateinamen = [
|
||||
"factur-x.xml",
|
||||
"zugferd-invoice.xml",
|
||||
"xrechnung.xml",
|
||||
]
|
||||
|
||||
attachments = self._extrahiere_attachments(pdf_pfad)
|
||||
for dateiname, _ in attachments:
|
||||
dateiname_lower = dateiname.lower()
|
||||
if any(zf.lower() in dateiname_lower for zf in zugferd_dateinamen):
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
def _extrahiere_attachments(self, pdf_pfad: str) -> list:
|
||||
"""
|
||||
Extrahiert alle eingebetteten Dateien (Attachments) aus einer PDF
|
||||
|
||||
Returns:
|
||||
Liste von Tuples: (dateiname, daten_bytes)
|
||||
"""
|
||||
attachments = []
|
||||
|
||||
if not PYPDF_AVAILABLE:
|
||||
return attachments
|
||||
|
||||
try:
|
||||
reader = PdfReader(pdf_pfad)
|
||||
|
||||
if reader.trailer and "/Root" in reader.trailer:
|
||||
root = reader.trailer["/Root"]
|
||||
if hasattr(root, "get_object"):
|
||||
root = root.get_object()
|
||||
|
||||
if "/Names" in root:
|
||||
names = root["/Names"]
|
||||
if hasattr(names, "get_object"):
|
||||
names = names.get_object()
|
||||
|
||||
if "/EmbeddedFiles" in names:
|
||||
embedded = names["/EmbeddedFiles"]
|
||||
if hasattr(embedded, "get_object"):
|
||||
embedded = embedded.get_object()
|
||||
|
||||
if "/Names" in embedded:
|
||||
names_array = embedded["/Names"]
|
||||
# Format: [name1, filespec1, name2, filespec2, ...]
|
||||
for i in range(0, len(names_array), 2):
|
||||
if i + 1 < len(names_array):
|
||||
dateiname = str(names_array[i])
|
||||
filespec = names_array[i + 1]
|
||||
if hasattr(filespec, "get_object"):
|
||||
filespec = filespec.get_object()
|
||||
|
||||
if "/EF" in filespec:
|
||||
ef = filespec["/EF"]
|
||||
if hasattr(ef, "get_object"):
|
||||
ef = ef.get_object()
|
||||
|
||||
if "/F" in ef:
|
||||
stream = ef["/F"]
|
||||
if hasattr(stream, "get_object"):
|
||||
stream = stream.get_object()
|
||||
|
||||
daten = stream.get_data()
|
||||
attachments.append((dateiname, daten))
|
||||
logger.debug(f"Attachment extrahiert: {dateiname}")
|
||||
|
||||
except Exception as e:
|
||||
logger.debug(f"Attachment-Extraktion Fehler: {e}")
|
||||
|
||||
return attachments
|
||||
|
||||
def _fuege_attachments_ein(self, pdf_pfad: str, attachments: list) -> bool:
|
||||
"""
|
||||
Fügt Attachments in eine PDF ein
|
||||
|
||||
Args:
|
||||
pdf_pfad: Pfad zur PDF
|
||||
attachments: Liste von Tuples (dateiname, daten_bytes)
|
||||
|
||||
Returns:
|
||||
True bei Erfolg
|
||||
"""
|
||||
if not attachments:
|
||||
return True
|
||||
|
||||
if not PYPDF_AVAILABLE:
|
||||
return False
|
||||
|
||||
try:
|
||||
from pypdf import PdfWriter
|
||||
|
||||
# PDF lesen
|
||||
reader = PdfReader(pdf_pfad)
|
||||
writer = PdfWriter()
|
||||
|
||||
# Alle Seiten kopieren
|
||||
for page in reader.pages:
|
||||
writer.add_page(page)
|
||||
|
||||
# Metadaten kopieren
|
||||
if reader.metadata:
|
||||
writer.add_metadata(reader.metadata)
|
||||
|
||||
# Attachments hinzufügen
|
||||
for dateiname, daten in attachments:
|
||||
writer.add_attachment(dateiname, daten)
|
||||
logger.debug(f"Attachment eingefügt: {dateiname}")
|
||||
|
||||
# Temporäre Datei schreiben
|
||||
temp_pfad = Path(pdf_pfad).with_suffix(".attached.pdf")
|
||||
with open(temp_pfad, "wb") as f:
|
||||
writer.write(f)
|
||||
|
||||
# Original ersetzen
|
||||
Path(pdf_pfad).unlink()
|
||||
temp_pfad.rename(pdf_pfad)
|
||||
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Attachment-Einfügung Fehler: {e}")
|
||||
return False
|
||||
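Review note: `_fuege_attachments_ein` deletes the original with `unlink()` and then renames the temporary file. If the process stops between those two calls, the PDF is gone. A minimal sketch of a single-step replacement with `os.replace`, assuming the temporary file lives in the same directory as the target; the helper name is hypothetical.

```python
import os
from pathlib import Path


def ersetze_datei_atomar(ziel: Path, daten: bytes) -> None:
    """Write daten to a sibling temp file, then swap it in with one call.

    os.replace overwrites an existing target, so no separate unlink() is
    needed and the old file stays in place until the new one is complete.
    """
    temp = ziel.with_name(ziel.name + ".tmp")
    with open(temp, "wb") as f:
        f.write(daten)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before the swap
    os.replace(temp, ziel)
```

The same pattern would apply to the `pfad.unlink()` / `temp_pfad.rename(pfad)` pair in `fuehre_ocr_aus`.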
|
||||
def fuehre_ocr_aus(self, pdf_pfad: str) -> Tuple[str, bool]:
|
||||
"""
|
||||
Führt OCR mit ocrmypdf durch, erhält dabei eingebettete Attachments
|
||||
|
||||
Returns:
|
||||
Tuple von (text, erfolg)
|
||||
"""
|
||||
pfad = Path(pdf_pfad)
|
||||
temp_pfad = pfad.with_suffix(".ocr.pdf")
|
||||
|
||||
# Extract attachments BEFORE OCR (they may otherwise be lost in the OCR output)
|
||||
attachments = self._extrahiere_attachments(pdf_pfad)
|
||||
if attachments:
|
||||
logger.info(f"{len(attachments)} Attachment(s) gesichert vor OCR")
|
||||
|
||||
try:
|
||||
# Run ocrmypdf
|
||||
result = subprocess.run(
|
||||
[
|
||||
"ocrmypdf",
|
||||
"--language", self.ocr_language,
|
||||
"--deskew", # straighten skewed scans
|
||||
"--clean", # clean up the page image before OCR
|
||||
"--skip-text", # skip pages that already contain text
|
||||
# "--force-ocr" removed: ocrmypdf accepts only one of --skip-text / --force-ocr / --redo-ocr
|
||||
str(pfad),
|
||||
str(temp_pfad)
|
||||
],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=120 # 2 minute timeout
|
||||
)
|
||||
|
||||
if result.returncode == 0 and temp_pfad.exists():
|
||||
# Replace the original with the OCR version
|
||||
pfad.unlink()
|
||||
temp_pfad.rename(pfad)
|
||||
|
||||
# Re-insert the attachments
|
||||
if attachments:
|
||||
if self._fuege_attachments_ein(str(pfad), attachments):
|
||||
logger.info("Attachments wiederhergestellt nach OCR")
|
||||
else:
|
||||
logger.warning("Attachments konnten nicht wiederhergestellt werden")
|
||||
|
||||
# Extract text from the OCR'd PDF
|
||||
text, _ = self.extrahiere_text(str(pfad))
|
||||
return text, True
|
||||
else:
|
||||
logger.error(f"OCR Fehler: {result.stderr}")
|
||||
if temp_pfad.exists():
|
||||
temp_pfad.unlink()
|
||||
return "", False
|
||||
|
||||
except subprocess.TimeoutExpired:
|
||||
logger.error(f"OCR Timeout für {pfad.name}")
|
||||
if temp_pfad.exists():
|
||||
temp_pfad.unlink()
|
||||
return "", False
|
||||
except FileNotFoundError:
|
||||
logger.error("ocrmypdf nicht installiert")
|
||||
return "", False
|
||||
except Exception as e:
|
||||
logger.error(f"OCR Fehler: {e}")
|
||||
if temp_pfad.exists():
|
||||
temp_pfad.unlink()
|
||||
return "", False
|
||||
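Review note: the argument list in `fuehre_ocr_aus` passes both `--skip-text` and `--force-ocr`. To my knowledge ocrmypdf treats these as mutually exclusive and exits with an error when both are given, so it seems safer to pick one. A sketch of a command builder that can only emit one of them; the function and the `erzwingen` parameter are hypothetical.

```python
from typing import List


def baue_ocr_befehl(quelle: str, ziel: str, sprache: str = "deu",
                    erzwingen: bool = False) -> List[str]:
    """Build the ocrmypdf command line with exactly one OCR mode flag."""
    befehl = ["ocrmypdf", "--language", sprache, "--deskew", "--clean"]
    # Exactly one mode: force OCR on every page, or skip pages that have text.
    befehl.append("--force-ocr" if erzwingen else "--skip-text")
    befehl += [quelle, ziel]
    return befehl
```

The returned list can be handed to `subprocess.run(...)` unchanged. Because the caller only reaches OCR when no text was found, `--skip-text` is the conservative default here.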
|
||||
def extrahiere_metadaten(self, pdf_pfad: str) -> Dict:
|
||||
"""Extrahiert PDF-Metadaten"""
|
||||
metadaten = {}
|
||||
|
||||
if PYPDF_AVAILABLE:
|
||||
try:
|
||||
reader = PdfReader(pdf_pfad)
|
||||
if reader.metadata:
|
||||
metadaten = {
|
||||
"titel": reader.metadata.get("/Title", ""),
|
||||
"autor": reader.metadata.get("/Author", ""),
|
||||
"ersteller": reader.metadata.get("/Creator", ""),
|
||||
"erstellt": reader.metadata.get("/CreationDate", ""),
|
||||
}
|
||||
except Exception as e:
|
||||
logger.debug(f"Metadaten-Fehler: {e}")
|
||||
|
||||
return metadaten
|
||||
112
backend/app/modules/sorter.py → Source/backend/app/modules/sorter.py
Normal file → Executable file
|
|
@@ -83,8 +83,9 @@ class Sorter:
|
|||
patterns = muster["text_match"]
|
||||
if isinstance(patterns, str):
|
||||
patterns = [patterns]
|
||||
# Skip empty patterns
|
||||
for pattern in patterns:
|
||||
if pattern.lower() not in text:
|
||||
if pattern and pattern.lower() not in text:
|
||||
return False
|
||||
|
||||
# text_match_any (mindestens einer muss enthalten sein)
|
||||
|
|
@@ -92,7 +93,8 @@ class Sorter:
|
|||
patterns = muster["text_match_any"]
|
||||
if isinstance(patterns, str):
|
||||
patterns = [patterns]
|
||||
if not any(p.lower() in text for p in patterns):
|
||||
# Only check if the list is not empty
|
||||
if patterns and not any(p.lower() in text for p in patterns if p):
|
||||
return False
|
||||
|
||||
# text_regex
|
||||
|
|
@@ -101,6 +103,33 @@ class Sorter:
|
|||
if not re.search(pattern, text, re.IGNORECASE):
|
||||
return False
|
||||
|
||||
# ============ NEGATIVE PATTERNS (must NOT occur) ============
|
||||
|
||||
# keywords_nicht (none of them may occur in the text or the original file name)
|
||||
if "keywords_nicht" in muster:
|
||||
keywords = muster["keywords_nicht"]
|
||||
if isinstance(keywords, str):
|
||||
keywords = [k.strip() for k in keywords.split(",")]
|
||||
for keyword in keywords:
|
||||
keyword = keyword.lower().strip()
|
||||
if keyword and (keyword in text or keyword in original_name):
|
||||
return False # Verbotenes Keyword gefunden
|
||||
|
||||
# text_not_match (none of them may be contained in the text)
|
||||
if "text_not_match" in muster:
|
||||
patterns = muster["text_not_match"]
|
||||
if isinstance(patterns, str):
|
||||
patterns = [patterns]
|
||||
for pattern in patterns:
|
||||
if pattern and pattern.lower() in text:
|
||||
return False # Verbotenes Pattern gefunden
|
||||
|
||||
# text_not_regex (the regex must not match)
|
||||
if "text_not_regex" in muster:
|
||||
pattern = muster["text_not_regex"]
|
||||
if re.search(pattern, text, re.IGNORECASE):
|
||||
return False # Verbotenes Regex gefunden
|
||||
|
||||
return True
|
||||
|
||||
def extrahiere_felder(self, regel: Dict, dokument_info: Dict) -> Dict[str, Any]:
|
||||
|
|
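Review note: the three new negative checks (`keywords_nicht`, `text_not_match`, `text_not_regex`) can be summarised as one predicate. The sketch below follows the logic in this hunk under the assumption that `text` and `original_name` are already lowercased in the surrounding method, as the positive checks suggest. The function name is hypothetical.

```python
import re
from typing import Dict


def verletzt_negativmuster(muster: Dict, text: str, original_name: str = "") -> bool:
    """True if any forbidden keyword, substring or regex occurs."""
    text = text.lower()
    original_name = original_name.lower()

    # keywords_nicht: comma-separated string or list; checked in text and file name
    keywords = muster.get("keywords_nicht", [])
    if isinstance(keywords, str):
        keywords = keywords.split(",")
    for keyword in keywords:
        keyword = keyword.lower().strip()
        if keyword and (keyword in text or keyword in original_name):
            return True

    # text_not_match: single string or list of substrings; empty entries are ignored
    patterns = muster.get("text_not_match", [])
    if isinstance(patterns, str):
        patterns = [patterns]
    if any(p and p.lower() in text for p in patterns):
        return True

    # text_not_regex: one regular expression, case-insensitive
    regex = muster.get("text_not_regex")
    return bool(regex and re.search(regex, text, re.IGNORECASE))
```

A rule would then match only if all positive checks pass and this predicate returns `False`.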
@@ -133,11 +162,57 @@ class Sorter:
|
|||
return felder
|
||||
|
||||
def _extrahiere_mit_regex(self, config: Dict, text: str) -> Optional[str]:
|
||||
"""Extrahiert ein Feld mit einem einzelnen Regex"""
|
||||
"""
|
||||
Extrahiert ein Feld mit Regex - unterstützt einzelne oder mehrere Alternativen
|
||||
|
||||
Unterstützt "auswahl" Option für mehrere Treffer:
|
||||
- "first": Erster Treffer (Standard)
|
||||
- "last": Letzter Treffer
|
||||
- "max": Größter numerischer Wert
|
||||
- "min": Kleinster numerischer Wert
|
||||
"""
|
||||
regex_pattern = config.get("regex")
|
||||
if not regex_pattern:
|
||||
return None
|
||||
|
||||
# Multiple regex alternatives (list)
|
||||
patterns = regex_pattern if isinstance(regex_pattern, list) else [regex_pattern]
|
||||
|
||||
# Selection mode (max, min, first, last)
|
||||
auswahl = config.get("auswahl", "first")
|
||||
|
||||
for pattern in patterns:
|
||||
try:
|
||||
match = re.search(config["regex"], text, re.IGNORECASE | re.MULTILINE)
|
||||
if match:
|
||||
# For max/min/last: find all matches
|
||||
if auswahl in ("max", "min", "last"):
|
||||
alle_matches = list(re.finditer(pattern, text, re.IGNORECASE | re.MULTILINE))
|
||||
if not alle_matches:
|
||||
continue
|
||||
|
||||
# Werte extrahieren
|
||||
werte = []
|
||||
for match in alle_matches:
|
||||
wert = match.group(1) if match.groups() else match.group(0)
|
||||
werte.append(wert.strip())
|
||||
|
||||
if not werte:
|
||||
continue
|
||||
|
||||
# Auswahl treffen
|
||||
if auswahl == "last":
|
||||
wert = werte[-1]
|
||||
elif auswahl in ("max", "min"):
|
||||
# Versuche numerische Auswahl
|
||||
wert = self._waehle_numerisch(werte, auswahl)
|
||||
else:
|
||||
wert = werte[0]
|
||||
else:
|
||||
# Standard: Erster Treffer (first)
|
||||
match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
|
||||
if not match:
|
||||
continue
|
||||
wert = match.group(1) if match.groups() else match.group(0)
|
||||
wert = wert.strip()
|
||||
|
||||
# Datum formatieren
|
||||
if "format" in config:
|
||||
|
|
@@ -151,12 +226,35 @@ class Sorter:
|
|||
if config.get("typ") == "betrag":
|
||||
wert = self._formatiere_betrag(wert)
|
||||
|
||||
return wert.strip()
|
||||
return wert
|
||||
except Exception as e:
|
||||
logger.debug(f"Regex-Extraktion fehlgeschlagen: {e}")
|
||||
logger.debug(f"Regex-Extraktion fehlgeschlagen für '{pattern}': {e}")
|
||||
|
||||
return None
|
||||
|
||||
def _waehle_numerisch(self, werte: List[str], modus: str) -> str:
|
||||
"""Wählt max oder min aus einer Liste von Werten (versucht numerisch zu parsen)"""
|
||||
# Versuche alle Werte als Zahlen zu parsen
|
||||
numerische_werte = []
|
||||
for wert in werte:
|
||||
try:
|
||||
# German number format: 1.234,56 -> 1234.56
|
||||
clean = wert.replace(" ", "").replace(".", "").replace(",", ".")
|
||||
zahl = float(clean)
|
||||
numerische_werte.append((zahl, wert))
|
||||
except ValueError:
|
||||
# Wenn nicht parsebar, ignorieren
|
||||
pass
|
||||
|
||||
if not numerische_werte:
|
||||
# Fallback: first value if none of the values could be parsed as a number
|
||||
return werte[0]
|
||||
|
||||
if modus == "max":
|
||||
return max(numerische_werte, key=lambda x: x[0])[1]
|
||||
else: # min
|
||||
return min(numerische_werte, key=lambda x: x[0])[1]
|
||||
|
||||
def _formatiere_betrag(self, betrag: str) -> str:
|
||||
"""Formatiert Betrag einheitlich"""
|
||||
betrag = betrag.replace(" ", "").replace(".", "").replace(",", ".")
|
||||
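Review note: `_waehle_numerisch` picks the largest or smallest match by parsing German-formatted amounts, but returns the original string so later formatting still sees the raw value. A standalone sketch of the same idea, with hypothetical function names:

```python
from typing import List


def parse_deutsche_zahl(wert: str) -> float:
    """Parse German number format: '1.234,56' -> 1234.56."""
    return float(wert.replace(" ", "").replace(".", "").replace(",", "."))


def waehle_numerisch(werte: List[str], modus: str) -> str:
    """Return the original string whose numeric value is max or min."""
    numerisch = []
    for wert in werte:
        try:
            numerisch.append((parse_deutsche_zahl(wert), wert))
        except ValueError:
            continue  # not a number, ignore for the comparison
    if not numerisch:
        return werte[0]  # fallback: first match
    auswahl = max if modus == "max" else min
    return auswahl(numerisch, key=lambda paar: paar[0])[1]
```

One caveat of this parsing rule: a value written with a decimal point, such as `12.50`, is read as 1250, because every dot is treated as a thousands separator. That is fine for German invoices but worth knowing for mixed input.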
0
backend/app/routes/__init__.py → Source/backend/app/routes/__init__.py
Normal file → Executable file
2174
Source/backend/app/routes/api.py
Executable file
File diff suppressed because it is too large
0
backend/app/services/__init__.py → Source/backend/app/services/__init__.py
Normal file → Executable file
0
backend/app/services/pipeline_service.py → Source/backend/app/services/pipeline_service.py
Normal file → Executable file
909
Source/backend/app/services/scheduler_service.py
Executable file
|
|
@@ -0,0 +1,909 @@
|
|||
"""
|
||||
Scheduler service for running tasks automatically
|
||||
"""
|
||||
from apscheduler.schedulers.background import BackgroundScheduler
|
||||
from apscheduler.triggers.cron import CronTrigger
|
||||
from datetime import datetime
|
||||
import logging
|
||||
import os
|
||||
import stat
|
||||
from pathlib import Path
|
||||
from typing import Optional, Dict, List
|
||||
from zoneinfo import ZoneInfo
|
||||
|
||||
from ..models.database import SessionLocal, Zeitplan, Postfach, QuellOrdner
|
||||
from ..modules.mail_fetcher import MailFetcher
|
||||
from ..modules.sorter import Sorter
|
||||
from ..config import INBOX_DIR
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def check_folder_permissions(pfad: str, name: str = "") -> Dict:
|
||||
"""
|
||||
Checks folder permissions and returns detailed information.
|
||||
Prints debug information to stdout so that it shows up in the container logs.
|
||||
"""
|
||||
result = {
|
||||
"pfad": pfad,
|
||||
"name": name,
|
||||
"existiert": False,
|
||||
"lesbar": False,
|
||||
"schreibbar": False,
|
||||
"ist_ordner": False,
|
||||
"dateien_anzahl": 0,
|
||||
"fehler": None
|
||||
}
|
||||
|
||||
prefix = f"[DEBUG] [{name}]" if name else "[DEBUG]"
|
||||
|
||||
try:
|
||||
p = Path(pfad)
|
||||
result["existiert"] = p.exists()
|
||||
|
||||
if not p.exists():
|
||||
print(f"{prefix} ❌ Pfad existiert NICHT: {pfad}", flush=True)
|
||||
result["fehler"] = "Pfad existiert nicht"
|
||||
return result
|
||||
|
||||
result["ist_ordner"] = p.is_dir()
|
||||
if not p.is_dir():
|
||||
print(f"{prefix} ❌ Pfad ist KEIN Ordner: {pfad}", flush=True)
|
||||
result["fehler"] = "Pfad ist kein Ordner"
|
||||
return result
|
||||
|
||||
# Check permissions
|
||||
result["lesbar"] = os.access(pfad, os.R_OK)
|
||||
result["schreibbar"] = os.access(pfad, os.W_OK)
|
||||
|
||||
# Count files (-1 if the folder cannot be listed)
|
||||
try:
|
||||
dateien = list(p.glob("*"))
|
||||
result["dateien_anzahl"] = len([f for f in dateien if f.is_file()])
|
||||
except PermissionError:
|
||||
result["dateien_anzahl"] = -1
|
||||
|
||||
# Stat-Infos
|
||||
try:
|
||||
st = p.stat()
|
||||
mode = stat.filemode(st.st_mode)
|
||||
uid = st.st_uid
|
||||
gid = st.st_gid
|
||||
except Exception:
|
||||
mode = "?"
|
||||
uid = "?"
|
||||
gid = "?"
|
||||
|
||||
# Status-Symbol
|
||||
if result["lesbar"] and result["schreibbar"]:
|
||||
status = "✅"
|
||||
elif result["lesbar"]:
|
||||
status = "⚠️ NUR LESBAR"
|
||||
else:
|
||||
status = "❌ KEIN ZUGRIFF"
|
||||
|
||||
print(f"{prefix} {status} {pfad}", flush=True)
|
||||
print(f"{prefix} Rechte: {mode} | UID: {uid} | GID: {gid} | Dateien: {result['dateien_anzahl']}", flush=True)
|
||||
|
||||
if not result["schreibbar"]:
|
||||
result["fehler"] = "Keine Schreibrechte"
|
||||
|
||||
except Exception as e:
|
||||
print(f"{prefix} ❌ FEHLER beim Prüfen: {pfad} - {e}", flush=True)
|
||||
result["fehler"] = str(e)
|
||||
|
||||
return result
|
||||
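Review note: `check_folder_permissions` mixes the classification of a folder with printing. The classification alone can be sketched as a small function that returns a status word, which is easier to test than console output. The function name and the status strings are hypothetical.

```python
import os
from pathlib import Path


def ordner_status(pfad: str) -> str:
    """Classify a path the same way the startup check does."""
    p = Path(pfad)
    if not p.exists():
        return "fehlt"
    if not p.is_dir():
        return "kein_ordner"
    lesbar = os.access(pfad, os.R_OK)
    schreibbar = os.access(pfad, os.W_OK)
    if lesbar and schreibbar:
        return "ok"
    if lesbar:
        return "nur_lesbar"
    return "kein_zugriff"
```

Note that `os.access` answers for the real UID and GID of the process. Inside a container running as root it will usually report write access even if the later PUID/PGID user would not have it.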
|
||||
|
||||
def check_all_folders_on_startup():
|
||||
"""Prüft alle konfigurierten Ordner beim Start"""
|
||||
print("", flush=True)
|
||||
print("=" * 60, flush=True)
|
||||
print("[DEBUG] === ORDNER-BERECHTIGUNGEN PRÜFEN ===", flush=True)
|
||||
print("=" * 60, flush=True)
|
||||
|
||||
# Process info
|
||||
print(f"[DEBUG] Prozess läuft als UID: {os.getuid()}, GID: {os.getgid()}", flush=True)
|
||||
|
||||
# Environment variables for PUID/PGID (Unraid style)
|
||||
puid = os.environ.get("PUID", "nicht gesetzt")
|
||||
pgid = os.environ.get("PGID", "nicht gesetzt")
|
||||
print(f"[DEBUG] Umgebung: PUID={puid}, PGID={pgid}", flush=True)
|
||||
|
||||
# Aktueller User
|
||||
try:
|
||||
import pwd
|
||||
import grp
|
||||
user_info = pwd.getpwuid(os.getuid())
|
||||
group_info = grp.getgrgid(os.getgid())
|
||||
print(f"[DEBUG] User: {user_info.pw_name}, Gruppe: {group_info.gr_name}", flush=True)
|
||||
except Exception:
|
||||
print("[DEBUG] User/Gruppe konnte nicht ermittelt werden", flush=True)
|
||||
|
||||
db = SessionLocal()
|
||||
try:
|
||||
# Quellordner prüfen
|
||||
quell_ordner = db.query(QuellOrdner).filter(QuellOrdner.aktiv == True).all()
|
||||
print(f"[DEBUG] Gefunden: {len(quell_ordner)} aktive Quellordner", flush=True)
|
||||
print("", flush=True)
|
||||
|
||||
for qo in quell_ordner:
|
||||
print(f"[DEBUG] --- {qo.name} ---", flush=True)
|
||||
# Quellpfad prüfen
|
||||
check_folder_permissions(qo.pfad, f"{qo.name}/Quelle")
|
||||
# Zielpfad prüfen
|
||||
check_folder_permissions(qo.ziel_ordner, f"{qo.name}/Ziel")
|
||||
print("", flush=True)
|
||||
|
||||
# Postfächer Zielordner prüfen
|
||||
postfaecher = db.query(Postfach).filter(Postfach.aktiv == True).all()
|
||||
print(f"[DEBUG] Gefunden: {len(postfaecher)} aktive Postfächer", flush=True)
|
||||
|
||||
for pf in postfaecher:
|
||||
if pf.ziel_ordner:
|
||||
check_folder_permissions(pf.ziel_ordner, f"Postfach/{pf.name}")
|
||||
|
||||
print("=" * 60, flush=True)
|
||||
print("", flush=True)
|
||||
|
||||
except Exception as e:
|
||||
print(f"[DEBUG] ❌ Fehler bei Ordner-Prüfung: {e}", flush=True)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
# Globaler Scheduler
|
||||
scheduler: Optional[BackgroundScheduler] = None
|
||||
|
||||
# Timezone for the scheduler (robust against invalid TZ variables)
|
||||
def get_timezone():
|
||||
"""Ermittelt eine gültige Timezone"""
|
||||
tz_env = os.environ.get("TZ", "Europe/Berlin")
|
||||
# Reject unexpanded placeholders such as ${TZ} and names without a region ("/"); note that this also rejects "UTC"
|
||||
if tz_env.startswith("$") or "/" not in tz_env:
|
||||
tz_env = "Europe/Berlin"
|
||||
try:
|
||||
return ZoneInfo(tz_env)
|
||||
except Exception:
|
||||
return ZoneInfo("Europe/Berlin")
|
||||
|
||||
|
||||
def get_scheduler() -> BackgroundScheduler:
|
||||
"""Gibt den globalen Scheduler zurück"""
|
||||
global scheduler
|
||||
if scheduler is None:
|
||||
scheduler = BackgroundScheduler(timezone=get_timezone())
|
||||
return scheduler
|
||||
|
||||
|
||||
def init_scheduler():
|
||||
"""Initialisiert den Scheduler beim App-Start"""
|
||||
global scheduler
|
||||
scheduler = BackgroundScheduler(timezone=get_timezone())
|
||||
|
||||
# Check folder permissions at startup
|
||||
check_all_folders_on_startup()
|
||||
|
||||
# Load schedules from the DB and create jobs
|
||||
sync_zeitplaene()
|
||||
|
||||
scheduler.start()
|
||||
logger.info("Scheduler gestartet")
|
||||
|
||||
# Run overdue schedules at startup (in a background thread, after 5 seconds)
|
||||
import threading
|
||||
def delayed_overdue_check():
|
||||
import time
|
||||
time.sleep(5) # Warte bis App vollständig gestartet
|
||||
execute_overdue_zeitplaene()
|
||||
|
||||
thread = threading.Thread(target=delayed_overdue_check, daemon=True)
|
||||
thread.start()
|
||||
|
||||
|
||||
def execute_overdue_zeitplaene():
|
||||
"""Führt alle überfälligen Zeitpläne aus"""
|
||||
from datetime import datetime
|
||||
|
||||
print("Prüfe überfällige Zeitpläne...", flush=True)
|
||||
db = SessionLocal()
|
||||
try:
|
||||
zeitplaene = db.query(Zeitplan).filter(Zeitplan.aktiv == True).all()
|
||||
now = datetime.utcnow()
|
||||
print(f"Gefunden: {len(zeitplaene)} aktive Zeitpläne, aktuelle Zeit (UTC): {now}", flush=True)
|
||||
|
||||
for zp in zeitplaene:
|
||||
print(f" Zeitplan '{zp.name}': nächste={zp.naechste_ausfuehrung}, letzte={zp.letzte_ausfuehrung}", flush=True)
|
||||
# Prüfen ob überfällig (naechste_ausfuehrung liegt in der Vergangenheit)
|
||||
if zp.naechste_ausfuehrung and zp.naechste_ausfuehrung < now:
|
||||
print(f" -> Führe überfälligen Zeitplan aus: {zp.name}", flush=True)
|
||||
try:
|
||||
execute_zeitplan(zp.id)
|
||||
except Exception as e:
|
||||
print(f" -> Fehler bei überfälligem Zeitplan {zp.name}: {e}", flush=True)
|
||||
# Oder: Noch nie ausgeführt
|
||||
elif zp.letzte_ausfuehrung is None:
|
||||
print(f" -> Führe noch nie ausgeführten Zeitplan aus: {zp.name}", flush=True)
|
||||
try:
|
||||
execute_zeitplan(zp.id)
|
||||
except Exception as e:
|
||||
print(f" -> Fehler bei Zeitplan {zp.name}: {e}", flush=True)
|
||||
else:
|
||||
print(f" -> Nicht überfällig", flush=True)
|
||||
except Exception as e:
|
||||
print(f"Fehler bei Zeitplan-Prüfung: {e}", flush=True)
|
||||
finally:
|
||||
db.close()
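The overdue decision can be sketched as a pure function. `is_overdue` is a hypothetical helper, not part of the commit. It additionally normalizes naive timestamps before comparing, under the assumption that naive values read back from the database are in UTC; if they are stored in local time instead, the normalization would have to use that zone.

```python
from datetime import datetime, timezone


def is_overdue(next_run, last_run, now=None):
    """Return True if a schedule should be run immediately at startup."""
    now = now or datetime.now(timezone.utc)
    if next_run is not None:
        if next_run.tzinfo is None:
            # Assumption: naive values from the database are UTC
            next_run = next_run.replace(tzinfo=timezone.utc)
        if next_run < now:
            return True  # planned run time lies in the past
    # Not overdue by time: run only if the schedule has never been executed
    return last_run is None
```

Comparing only timezone-aware values avoids the `TypeError` Python raises when a naive and an aware `datetime` are compared.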
|
||||
|
||||
|
||||
def shutdown_scheduler():
    """Shut down the scheduler."""
    global scheduler
    if scheduler:
        scheduler.shutdown()
        logger.info("Scheduler beendet")
|
||||
|
||||
|
||||
def sync_zeitplaene():
    """Synchronize schedules from the DB with the scheduler."""
    global scheduler
    if not scheduler:
        return

    # Remove all existing jobs
    scheduler.remove_all_jobs()

    db = SessionLocal()
    try:
        zeitplaene = db.query(Zeitplan).filter(Zeitplan.aktiv == True).all()

        for zp in zeitplaene:
            add_job_for_zeitplan(zp)
            logger.info(f"Job hinzugefügt: {zp.name} ({zp.intervall})")
    finally:
        db.close()
|
||||
|
||||
|
||||
def add_job_for_zeitplan(zp: Zeitplan):
|
||||
"""Fügt einen Job für einen Zeitplan hinzu"""
|
||||
global scheduler
|
||||
if not scheduler:
|
||||
return
|
||||
|
||||
job_id = f"zeitplan_{zp.id}"
|
||||
|
||||
# CronTrigger basierend auf Intervall erstellen
|
||||
trigger = create_trigger(zp)
|
||||
if not trigger:
|
||||
return
|
||||
|
||||
# Job hinzufügen
|
||||
scheduler.add_job(
|
||||
func=execute_zeitplan,
|
||||
trigger=trigger,
|
||||
args=[zp.id],
|
||||
id=job_id,
|
||||
name=zp.name,
|
||||
replace_existing=True
|
||||
)
|
||||
|
||||
# Nächste Ausführungszeit berechnen und speichern
|
||||
job = scheduler.get_job(job_id)
|
||||
if job:
|
||||
try:
|
||||
# APScheduler 3.x: next_run_time als Attribut
|
||||
# APScheduler 4.x: get_next_fire_time() oder scheduled_fire_time
|
||||
next_time = getattr(job, 'next_run_time', None)
|
||||
if next_time is None and hasattr(job, 'get_next_fire_time'):
|
||||
next_time = job.get_next_fire_time()
|
||||
|
||||
if next_time:
|
||||
db = SessionLocal()
|
||||
try:
|
||||
zeitplan = db.query(Zeitplan).filter(Zeitplan.id == zp.id).first()
|
||||
if zeitplan:
|
||||
zeitplan.naechste_ausfuehrung = next_time
|
||||
db.commit()
|
||||
finally:
|
||||
db.close()
|
||||
except Exception as e:
|
||||
logger.warning(f"Konnte nächste Ausführungszeit nicht ermitteln: {e}")
|
||||
|
||||
|
||||
def create_trigger(zp: Zeitplan) -> Optional[CronTrigger]:
    """Create a CronTrigger based on the schedule."""
    try:
        tz = get_timezone()
        # Explicit None check: `zp.stunde or 6` would turn a stored hour of 0 (midnight) into 6
        stunde = zp.stunde if zp.stunde is not None else 6

        if zp.intervall == "stündlich":
            return CronTrigger(minute=zp.minute or 0, timezone=tz)

        elif zp.intervall == "täglich":
            return CronTrigger(hour=stunde, minute=zp.minute or 0, timezone=tz)

        elif zp.intervall == "wöchentlich":
            return CronTrigger(
                day_of_week=zp.wochentag or 0,
                hour=stunde,
                minute=zp.minute or 0,
                timezone=tz
            )

        elif zp.intervall == "monatlich":
            return CronTrigger(
                day=zp.monatstag or 1,
                hour=stunde,
                minute=zp.minute or 0,
                timezone=tz
            )

        return None
    except Exception as e:
        logger.error(f"Fehler beim Erstellen des Triggers: {e}")
        return None
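The mapping from interval name to cron fields can be sketched without APScheduler as a plain function that returns keyword arguments. `cron_fields` and `pick` are hypothetical names, not part of the commit. The sketch uses explicit `None` checks for every default so that a stored `0` (midnight, minute zero, Monday) is kept rather than replaced.

```python
def cron_fields(intervall, minute=None, stunde=None, wochentag=None, monatstag=None):
    """Return cron keyword arguments for an interval name, or None if unknown."""

    def pick(value, default):
        # Only fall back when the value is missing; 0 is a valid setting
        return default if value is None else value

    if intervall == "stündlich":
        return {"minute": pick(minute, 0)}
    if intervall == "täglich":
        return {"hour": pick(stunde, 6), "minute": pick(minute, 0)}
    if intervall == "wöchentlich":
        return {
            "day_of_week": pick(wochentag, 0),
            "hour": pick(stunde, 6),
            "minute": pick(minute, 0),
        }
    if intervall == "monatlich":
        return {
            "day": pick(monatstag, 1),
            "hour": pick(stunde, 6),
            "minute": pick(minute, 0),
        }
    return None
```

The resulting dictionary could be unpacked into a trigger constructor, for example `CronTrigger(**fields, timezone=tz)`, assuming the field names match the scheduler library in use.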
|
||||
|
||||
|
||||
def execute_zeitplan(zeitplan_id: int):
|
||||
"""Führt einen Zeitplan aus"""
|
||||
db = SessionLocal()
|
||||
try:
|
||||
zeitplan = db.query(Zeitplan).filter(Zeitplan.id == zeitplan_id).first()
|
||||
if not zeitplan:
|
||||
return
|
||||
|
||||
logger.info(f"Starte Zeitplan: {zeitplan.name}")
|
||||
|
||||
try:
|
||||
if zeitplan.typ == "mail_abruf":
|
||||
result = execute_mail_abruf(db, zeitplan)
|
||||
elif zeitplan.typ == "grobsortierung":
|
||||
result = execute_grobsortierung(db, zeitplan)
|
||||
elif zeitplan.typ == "sortierregeln":
|
||||
result = execute_sortierregeln(db, zeitplan)
|
||||
elif zeitplan.typ == "sortierung":
|
||||
# Legacy: alte "sortierung" wird wie "grobsortierung" behandelt
|
||||
result = execute_grobsortierung(db, zeitplan)
|
||||
else:
|
||||
result = {"erfolg": False, "meldung": f"Unbekannter Typ: {zeitplan.typ}"}
|
||||
|
||||
# Status aktualisieren
|
||||
zeitplan.letzte_ausfuehrung = datetime.utcnow()
|
||||
zeitplan.letzter_status = "erfolg" if result.get("erfolg") else "fehler"
|
||||
zeitplan.letzte_meldung = result.get("meldung", "")[:500]
|
||||
|
||||
# Nächste Ausführung berechnen
|
||||
job = scheduler.get_job(f"zeitplan_{zeitplan_id}") if scheduler else None
|
||||
if job:
|
||||
try:
|
||||
next_time = getattr(job, 'next_run_time', None)
|
||||
if next_time is None and hasattr(job, 'get_next_fire_time'):
|
||||
next_time = job.get_next_fire_time()
|
||||
if next_time:
|
||||
zeitplan.naechste_ausfuehrung = next_time
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
db.commit()
|
||||
logger.info(f"Zeitplan abgeschlossen: {zeitplan.name} - {zeitplan.letzter_status}")
|
||||
|
||||
except Exception as e:
|
||||
zeitplan.letzte_ausfuehrung = datetime.utcnow()
|
||||
zeitplan.letzter_status = "fehler"
|
||||
zeitplan.letzte_meldung = str(e)[:500]
|
||||
db.commit()
|
||||
logger.error(f"Fehler bei Zeitplan {zeitplan.name}: {e}")
|
||||
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
|
||||
def execute_mail_abruf(db, zeitplan: Zeitplan) -> Dict:
|
||||
"""Führt Mail-Abruf aus"""
|
||||
from ..models.database import VerarbeiteteMail
|
||||
|
||||
# Postfächer bestimmen
|
||||
if zeitplan.postfach_id:
|
||||
postfaecher = db.query(Postfach).filter(
|
||||
Postfach.id == zeitplan.postfach_id,
|
||||
Postfach.aktiv == True
|
||||
).all()
|
||||
else:
|
||||
postfaecher = db.query(Postfach).filter(Postfach.aktiv == True).all()
|
||||
|
||||
if not postfaecher:
|
||||
return {"erfolg": True, "meldung": "Keine aktiven Postfächer gefunden"}
|
||||
|
||||
gesamt_dateien = 0
|
||||
fehler = []
|
||||
|
||||
for postfach in postfaecher:
|
||||
try:
|
||||
# Bereits verarbeitete Message-IDs laden
|
||||
bereits_verarbeitet = set(
|
||||
m.message_id for m in db.query(VerarbeiteteMail)
|
||||
.filter(VerarbeiteteMail.postfach_id == postfach.id)
|
||||
.all()
|
||||
)
|
||||
|
||||
config = {
|
||||
"imap_server": postfach.imap_server,
|
||||
"imap_port": postfach.imap_port,
|
||||
"email": postfach.email,
|
||||
"passwort": postfach.passwort,
|
||||
"ordner": postfach.ordner,
|
||||
"erlaubte_typen": postfach.erlaubte_typen or [".pdf"],
|
||||
"max_groesse_mb": postfach.max_groesse_mb or 25,
|
||||
"min_groesse_kb": postfach.min_groesse_kb or 10
|
||||
}
|
||||
|
||||
fetcher = MailFetcher(config)
|
||||
if not fetcher.connect():
|
||||
fehler.append(f"{postfach.name}: Verbindung fehlgeschlagen")
|
||||
continue
|
||||
|
||||
from pathlib import Path
|
||||
ziel = Path(postfach.ziel_ordner) if postfach.ziel_ordner else INBOX_DIR
|
||||
|
||||
ergebnisse = fetcher.fetch_attachments(
|
||||
ziel_ordner=ziel,
|
||||
nur_ungelesen=postfach.nur_ungelesen,
|
||||
markiere_gelesen=True,
|
||||
alle_ordner=postfach.alle_ordner,
|
||||
bereits_verarbeitet=bereits_verarbeitet
|
||||
)
|
||||
|
||||
# Verarbeitete Mails speichern
|
||||
for ergebnis in ergebnisse:
|
||||
if ergebnis.get("message_id"):
|
||||
db.add(VerarbeiteteMail(
|
||||
postfach_id=postfach.id,
|
||||
message_id=ergebnis["message_id"],
|
||||
ordner=ergebnis.get("ordner"),
|
||||
betreff=ergebnis.get("betreff", "")[:500],
|
||||
absender=ergebnis.get("absender", "")[:255],
|
||||
anzahl_attachments=1
|
||||
))
|
||||
|
||||
# Postfach-Status aktualisieren
|
||||
postfach.letzter_abruf = datetime.utcnow()
|
||||
postfach.letzte_anzahl = len(ergebnisse)
|
||||
db.commit()
|
||||
|
||||
gesamt_dateien += len(ergebnisse)
|
||||
fetcher.disconnect()
|
||||
|
||||
except Exception as e:
|
||||
fehler.append(f"{postfach.name}: {str(e)[:100]}")
|
||||
|
||||
if fehler:
|
||||
return {
|
||||
"erfolg": len(fehler) < len(postfaecher),
|
||||
"meldung": f"{gesamt_dateien} Dateien geholt. Fehler: {'; '.join(fehler)}"
|
||||
}
|
||||
|
||||
return {"erfolg": True, "meldung": f"{gesamt_dateien} Dateien aus {len(postfaecher)} Postfächern geholt"}
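The result convention of the mail fetch (partial failure still counts as success as long as at least one mailbox worked) can be sketched as a standalone function. `summarize_mail_run` is a hypothetical name, not part of the commit; the message texts are copied from the code above.

```python
def summarize_mail_run(total_files, mailbox_count, errors):
    """Build the result dict for a mail fetch over several mailboxes."""
    if errors:
        return {
            # Success only if fewer mailboxes failed than were processed
            "erfolg": len(errors) < mailbox_count,
            "meldung": f"{total_files} Dateien geholt. Fehler: {'; '.join(errors)}",
        }
    return {
        "erfolg": True,
        "meldung": f"{total_files} Dateien aus {mailbox_count} Postfächern geholt",
    }
```

With this rule a run over a single mailbox that fails is reported as an error, while one failure out of several mailboxes is reported as success with the error text appended.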
|
||||
|
||||
|
||||
def execute_grobsortierung(db, zeitplan: Zeitplan) -> Dict:
|
||||
"""Führt Grobsortierung aus (QuellOrdner verarbeiten)"""
|
||||
from ..models.database import SortierRegel, VerarbeiteteDatei
|
||||
from ..modules.pdf_processor import PDFProcessor
|
||||
from pathlib import Path
|
||||
|
||||
print("", flush=True)
|
||||
print("[GROBSORTIERUNG] === START ===", flush=True)
|
||||
print(f"[GROBSORTIERUNG] Zeitplan: {zeitplan.name} (ID: {zeitplan.id})", flush=True)
|
||||
|
||||
# QuellOrdner bestimmen
|
||||
if zeitplan.quell_ordner_id:
|
||||
quell_ordner = db.query(QuellOrdner).filter(
|
||||
QuellOrdner.id == zeitplan.quell_ordner_id,
|
||||
QuellOrdner.aktiv == True
|
||||
).all()
|
||||
else:
|
||||
quell_ordner = db.query(QuellOrdner).filter(QuellOrdner.aktiv == True).all()
|
||||
|
||||
print(f"[GROBSORTIERUNG] Gefunden: {len(quell_ordner)} aktive Quellordner", flush=True)
|
||||
|
||||
if not quell_ordner:
|
||||
print("[GROBSORTIERUNG] ⚠️ Keine aktiven Quellordner - Abbruch", flush=True)
|
||||
return {"erfolg": True, "meldung": "Keine aktiven Quellordner gefunden"}
|
||||
|
||||
# Regeln laden
|
||||
regeln = db.query(SortierRegel).filter(SortierRegel.aktiv == True).order_by(SortierRegel.prioritaet).all()
|
||||
print(f"[GROBSORTIERUNG] Gefunden: {len(regeln)} aktive Regeln", flush=True)
|
||||
|
||||
if not regeln:
|
||||
print("[GROBSORTIERUNG] ⚠️ Keine aktiven Regeln - Abbruch", flush=True)
|
||||
return {"erfolg": False, "meldung": "Keine aktiven Regeln definiert"}
|
||||
|
||||
# Regeln in Dict-Format
|
||||
regeln_dicts = [{
|
||||
"id": r.id,
|
||||
"name": r.name,
|
||||
"prioritaet": r.prioritaet,
|
||||
"muster": r.muster,
|
||||
"extraktion": r.extraktion,
|
||||
"schema": r.schema,
|
||||
"unterordner": r.unterordner
|
||||
} for r in regeln]
|
||||
|
||||
sorter = Sorter(regeln_dicts)
|
||||
pdf_processor = PDFProcessor()
|
||||
|
||||
gesamt_sortiert = 0
|
||||
gesamt_fehler = 0
|
||||
gesamt_ohne_regel = 0
|
||||
fehler_meldungen = []
|
||||
|
||||
for qo in quell_ordner:
|
||||
ordner_sortiert = 0 # Zähler pro Ordner
|
||||
print("", flush=True)
|
||||
print(f"[GROBSORTIERUNG] --- Verarbeite: {qo.name} ---", flush=True)
|
||||
print(f"[GROBSORTIERUNG] Quelle: {qo.pfad}", flush=True)
|
||||
print(f"[GROBSORTIERUNG] Ziel: {qo.ziel_ordner}", flush=True)
|
||||
print(f"[GROBSORTIERUNG] Einstellungen: direkt_verschieben={qo.direkt_verschieben}, zugferd={qo.zugferd_behandlung}", flush=True)
|
||||
|
||||
try:
|
||||
pfad = Path(qo.pfad)
|
||||
|
||||
# Debug: Ordner-Checks
|
||||
quelle_check = check_folder_permissions(str(pfad), f"{qo.name}/Quelle")
|
||||
ziel_check = check_folder_permissions(qo.ziel_ordner, f"{qo.name}/Ziel")
|
||||
|
||||
if not pfad.exists():
|
||||
print(f"[GROBSORTIERUNG] ❌ Quellpfad existiert nicht - überspringe", flush=True)
|
||||
fehler_meldungen.append(f"{qo.name}: Quellpfad existiert nicht")
|
||||
continue
|
||||
|
||||
if not quelle_check["lesbar"]:
|
||||
print(f"[GROBSORTIERUNG] ❌ Quellpfad nicht lesbar - überspringe", flush=True)
|
||||
fehler_meldungen.append(f"{qo.name}: Keine Leserechte auf Quelle")
|
||||
continue
|
||||
|
||||
if not ziel_check["schreibbar"]:
|
||||
print(f"[GROBSORTIERUNG] ❌ Zielpfad nicht beschreibbar - überspringe", flush=True)
|
||||
fehler_meldungen.append(f"{qo.name}: Keine Schreibrechte auf Ziel")
|
||||
continue
|
||||
|
||||
ziel_basis = Path(qo.ziel_ordner)
|
||||
|
||||
# Dateien sammeln
|
||||
pattern = "**/*" if qo.rekursiv else "*"
|
||||
erlaubte = [t.lower() for t in (qo.dateitypen or [".pdf"])]
|
||||
print(f"[GROBSORTIERUNG] Suche nach: {erlaubte} (rekursiv={qo.rekursiv})", flush=True)
|
||||
|
||||
dateien = [f for f in pfad.glob(pattern) if f.is_file() and f.suffix.lower() in erlaubte]
|
||||
print(f"[GROBSORTIERUNG] ✓ Gefunden: {len(dateien)} Dateien", flush=True)
|
||||
|
||||
if len(dateien) == 0:
|
||||
print(f"[GROBSORTIERUNG] Keine passenden Dateien im Ordner", flush=True)
|
||||
|
||||
for datei in dateien:
|
||||
try:
|
||||
ist_pdf = datei.suffix.lower() == ".pdf"
|
||||
text = ""
|
||||
|
||||
if ist_pdf:
|
||||
pdf_result = pdf_processor.verarbeite(str(datei))
|
||||
if pdf_result.get("fehler"):
|
||||
raise Exception(pdf_result["fehler"])
|
||||
text = pdf_result.get("text", "")
|
||||
|
||||
# ZUGFeRD-Behandlung basierend auf Einstellung
|
||||
# Optionen: "separieren", "regel", "normal", "ignorieren"
|
||||
zugferd_behandlung = getattr(qo, 'zugferd_behandlung', 'normal') or 'normal'
|
||||
ist_zugferd = pdf_result.get("ist_zugferd", False) if ist_pdf else False
|
||||
|
||||
if zugferd_behandlung == "separieren":
|
||||
# NUR ZUGFeRD-PDFs verarbeiten
|
||||
if not ist_zugferd:
|
||||
# Keine ZUGFeRD-PDF -> überspringen
|
||||
continue
|
||||
# ZUGFeRD-PDF in separaten Ordner verschieben
|
||||
zugferd_ziel = ziel_basis / "zugferd"
|
||||
zugferd_ziel.mkdir(parents=True, exist_ok=True)
|
||||
neuer_pfad = zugferd_ziel / datei.name
|
||||
counter = 1
|
||||
while neuer_pfad.exists():
|
||||
neuer_pfad = zugferd_ziel / f"{datei.stem}_{counter}{datei.suffix}"
|
||||
counter += 1
|
||||
datei.rename(neuer_pfad)
|
||||
db.add(VerarbeiteteDatei(
|
||||
original_pfad=str(datei),
|
||||
original_name=datei.name,
|
||||
neuer_pfad=str(neuer_pfad),
|
||||
neuer_name=neuer_pfad.name,
|
||||
ist_zugferd=True,
|
||||
status="zugferd"
|
||||
))
|
||||
gesamt_sortiert += 1
|
||||
ordner_sortiert += 1
|
||||
continue
|
||||
|
||||
elif zugferd_behandlung == "ignorieren":
|
||||
# ZUGFeRD-PDFs überspringen, nur normale verarbeiten
|
||||
if ist_zugferd:
|
||||
continue
|
||||
# Weiter mit Regelprüfung für normale PDFs
|
||||
|
||||
# Bei "regel" oder "normal": Alle PDFs durch Regeln prüfen
|
||||
|
||||
# Direkt verschieben (ohne Regelprüfung)?
|
||||
direkt_verschieben = getattr(qo, 'direkt_verschieben', False)
|
||||
if direkt_verschieben:
|
||||
# Datei direkt in Zielordner verschieben
|
||||
print(f"[GROBSORTIERUNG] → Verschiebe direkt: {datei.name}", flush=True)
|
||||
try:
|
||||
ziel_basis.mkdir(parents=True, exist_ok=True)
|
||||
except PermissionError as pe:
|
||||
print(f"[GROBSORTIERUNG] ❌ Kann Zielordner nicht erstellen: {pe}", flush=True)
|
||||
raise
|
||||
neuer_pfad = ziel_basis / datei.name
|
||||
counter = 1
|
||||
while neuer_pfad.exists():
|
||||
neuer_pfad = ziel_basis / f"{datei.stem}_{counter}{datei.suffix}"
|
||||
counter += 1
|
||||
try:
|
||||
datei.rename(neuer_pfad)
|
||||
print(f"[GROBSORTIERUNG] ✓ Verschoben nach: {neuer_pfad}", flush=True)
|
||||
except PermissionError as pe:
|
||||
print(f"[GROBSORTIERUNG] ❌ Keine Berechtigung zum Verschieben: {pe}", flush=True)
|
||||
raise
|
||||
except Exception as me:
|
||||
print(f"[GROBSORTIERUNG] ❌ Fehler beim Verschieben: {me}", flush=True)
|
||||
raise
|
||||
db.add(VerarbeiteteDatei(
|
||||
original_pfad=str(datei),
|
||||
original_name=datei.name,
|
||||
neuer_pfad=str(neuer_pfad),
|
||||
neuer_name=neuer_pfad.name,
|
||||
status="direkt"
|
||||
))
|
||||
gesamt_sortiert += 1
|
||||
ordner_sortiert += 1
|
||||
continue
|
||||
|
||||
# Regel finden
|
||||
doc_info = {"text": text, "original_name": datei.name, "absender": "", "dateityp": datei.suffix.lower()}
|
||||
regel = sorter.finde_passende_regel(doc_info)
|
||||
|
||||
if not regel:
|
||||
gesamt_ohne_regel += 1
|
||||
db.add(VerarbeiteteDatei(
|
||||
original_pfad=str(datei),
|
||||
original_name=datei.name,
|
||||
status="keine_regel",
|
||||
fehler="Keine passende Regel gefunden"
|
||||
))
|
||||
continue
|
||||
|
||||
# Felder extrahieren und verschieben
|
||||
extrahiert = sorter.extrahiere_felder(regel, doc_info)
|
||||
schema = regel.get("schema", "{datum} - Dokument.pdf")
|
||||
if schema.endswith(".pdf"):
|
||||
schema = schema[:-4] + datei.suffix
|
||||
neuer_name = sorter.generiere_dateinamen({"schema": schema, **regel}, extrahiert)
|
||||
|
||||
ziel = ziel_basis
|
||||
if regel.get("unterordner"):
|
||||
ziel = ziel / regel["unterordner"]
|
||||
ziel.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
sorter.verschiebe_datei(str(datei), str(ziel), neuer_name)
|
||||
gesamt_sortiert += 1
|
||||
ordner_sortiert += 1
|
||||
|
||||
except Exception as e:
|
||||
gesamt_fehler += 1
|
||||
print(f"[GROBSORTIERUNG] ❌ FEHLER bei {datei.name}: {e}", flush=True)
|
||||
logger.error(f"Fehler bei Datei {datei}: {e}")
|
||||
db.add(VerarbeiteteDatei(
|
||||
original_pfad=str(datei),
|
||||
original_name=datei.name,
|
||||
status="fehler",
|
||||
fehler=str(e)[:500]
|
||||
))
|
||||
|
||||
# Ordner-Status aktualisieren (wie bei Postfächern)
|
||||
qo.letzte_verarbeitung = datetime.utcnow()
|
||||
qo.letzte_anzahl = ordner_sortiert
|
||||
print(f"[GROBSORTIERUNG] ✓ {qo.name} abgeschlossen: {ordner_sortiert} Dateien verschoben", flush=True)
|
||||
|
||||
except Exception as e:
|
||||
print(f"[GROBSORTIERUNG] ❌ FEHLER bei Ordner {qo.name}: {e}", flush=True)
|
||||
fehler_meldungen.append(f"{qo.name}: {str(e)[:100]}")
|
||||
|
||||
db.commit()
|
||||
|
||||
meldung = f"{gesamt_sortiert} Dateien sortiert"
|
||||
if gesamt_ohne_regel > 0:
|
||||
meldung += f", {gesamt_ohne_regel} ohne passende Regel"
|
||||
if gesamt_fehler > 0:
|
||||
meldung += f", {gesamt_fehler} Fehler"
|
||||
if fehler_meldungen:
|
||||
meldung += f" ({'; '.join(fehler_meldungen)})"
|
||||
|
||||
print("", flush=True)
|
||||
print(f"[GROBSORTIERUNG] === ENDE === {meldung}", flush=True)
|
||||
print("", flush=True)
|
||||
|
||||
# Erfolg wenn keine echten Fehler (ohne_regel zählt nicht als Fehler)
|
||||
return {"erfolg": gesamt_fehler == 0 and not fehler_meldungen, "meldung": meldung}
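The collision handling that appears twice in the function above (append `_1`, `_2`, ... until the name is free) can be sketched as one helper. `unique_target` and `move_without_overwrite` are hypothetical names, not part of the commit. The sketch swaps `Path.rename` for `shutil.move`, since a plain rename fails when source and target are on different filesystems, which may happen with separately mounted volumes.

```python
import shutil
from pathlib import Path


def unique_target(directory: Path, name: str) -> Path:
    """Return a path in `directory` that does not exist yet."""
    candidate = directory / name
    stem, suffix = Path(name).stem, Path(name).suffix
    counter = 1
    while candidate.exists():
        # Same naming scheme as above: name_1.pdf, name_2.pdf, ...
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def move_without_overwrite(source: Path, directory: Path) -> Path:
    """Move `source` into `directory` without replacing an existing file."""
    directory.mkdir(parents=True, exist_ok=True)
    target = unique_target(directory, source.name)
    # shutil.move renames when possible and falls back to copy + delete
    shutil.move(str(source), str(target))
    return target
```

The check and the move are not atomic, so two workers writing to the same target folder at the same time could still collide; for a single scheduler thread this is usually acceptable.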
|
||||
|
||||
|
||||
def execute_sortierregeln(db, zeitplan: Zeitplan) -> Dict:
|
||||
"""Führt nur Sortierregeln aus (freie Ordner von Regeln)"""
|
||||
from ..models.database import SortierRegel, VerarbeiteteDatei
|
||||
from ..modules.pdf_processor import PDFProcessor
|
||||
from pathlib import Path
|
||||
|
||||
# Regeln laden (optional spezifische Regel)
|
||||
if zeitplan.regel_id:
|
||||
regeln = db.query(SortierRegel).filter(
|
||||
SortierRegel.id == zeitplan.regel_id,
|
||||
SortierRegel.aktiv == True
|
||||
).all()
|
||||
else:
|
||||
regeln = db.query(SortierRegel).filter(SortierRegel.aktiv == True).all()
|
||||
|
||||
if not regeln:
|
||||
return {"erfolg": True, "meldung": "Keine aktiven Regeln gefunden"}
|
||||
|
||||
pdf_processor = PDFProcessor()
|
||||
gesamt_sortiert = 0
|
||||
gesamt_fehler = 0
|
||||
fehler_meldungen = []
|
||||
|
||||
for regel in regeln:
|
||||
freie_ordner = regel.freie_ordner if regel.freie_ordner else []
|
||||
if not freie_ordner:
|
||||
continue
|
||||
|
||||
regel_dict = {
|
||||
"id": regel.id,
|
||||
"name": regel.name,
|
||||
"prioritaet": regel.prioritaet,
|
||||
"muster": regel.muster,
|
||||
"extraktion": regel.extraktion,
|
||||
"schema": regel.schema,
|
||||
"unterordner": regel.unterordner,
|
||||
"ziel_ordner": getattr(regel, 'ziel_ordner', None),
|
||||
"nur_umbenennen": getattr(regel, 'nur_umbenennen', False)
|
||||
}
|
||||
regel_sorter = Sorter([regel_dict])
|
||||
|
||||
for freier_ordner_pfad in freie_ordner:
|
||||
freier_pfad = Path(freier_ordner_pfad)
|
||||
if not freier_pfad.exists() or not freier_pfad.is_dir():
|
||||
continue
|
||||
|
||||
# Dateien sammeln
|
||||
dateien = [f for f in freier_pfad.glob("**/*") if f.is_file() and f.suffix.lower() == ".pdf"]
|
||||
|
||||
for datei in dateien:
|
||||
try:
|
||||
ist_pdf = datei.suffix.lower() == ".pdf"
|
||||
text = ""
|
||||
ist_zugferd = False
|
||||
|
||||
if ist_pdf:
|
||||
pdf_result = pdf_processor.verarbeite(str(datei))
|
||||
if pdf_result.get("fehler"):
|
||||
raise Exception(pdf_result["fehler"])
|
||||
text = pdf_result.get("text", "")
|
||||
ist_zugferd = pdf_result.get("ist_zugferd", False)
|
||||
|
||||
doc_info = {
|
||||
"text": text,
|
||||
"original_name": datei.name,
|
||||
"absender": "",
|
||||
"dateityp": datei.suffix.lower()
|
||||
}
|
||||
|
||||
# Prüfe ob Regel passt
|
||||
passend = regel_sorter.finde_passende_regel(doc_info)
|
||||
if not passend:
|
||||
continue
|
||||
|
||||
# Felder extrahieren
|
||||
extrahiert = regel_sorter.extrahiere_felder(passend, doc_info)
|
||||
|
||||
# Dateiname generieren
|
||||
schema = passend.get("schema", "{datum} - Dokument.pdf")
|
||||
if schema.endswith(".pdf"):
|
||||
schema = schema[:-4] + datei.suffix
|
||||
neuer_name = regel_sorter.generiere_dateinamen(
|
||||
{"schema": schema, **passend}, extrahiert
|
||||
)
|
||||
|
||||
# Zielordner bestimmen
|
||||
if passend.get("nur_umbenennen"):
|
||||
# Nur umbenennen - Datei bleibt im aktuellen Ordner
|
||||
ziel = datei.parent
|
||||
elif passend.get("ziel_ordner"):
|
||||
# Regel hat eigenen Zielordner
|
||||
ziel = Path(passend["ziel_ordner"])
|
||||
if passend.get("unterordner"):
|
||||
ziel = ziel / passend["unterordner"]
|
||||
else:
|
||||
# Kein Zielordner - bleibt im freien Ordner
|
||||
ziel = freier_pfad
|
||||
if passend.get("unterordner"):
|
||||
ziel = ziel / passend["unterordner"]
|
||||
ziel.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Verschieben/Umbenennen
|
||||
regel_sorter.verschiebe_datei(str(datei), str(ziel), neuer_name)
|
||||
gesamt_sortiert += 1
|
||||
|
||||
db.add(VerarbeiteteDatei(
|
||||
original_pfad=str(datei),
|
||||
original_name=datei.name,
|
||||
neuer_pfad=str(ziel / neuer_name),
|
||||
neuer_name=neuer_name,
|
||||
ist_zugferd=ist_zugferd,
|
||||
status="sortiert",
|
||||
extrahierte_daten=extrahiert
|
||||
))
|
||||
|
||||
except Exception as e:
|
||||
gesamt_fehler += 1
|
||||
logger.error(f"Fehler bei Datei {datei}: {e}")
|
||||
|
||||
db.commit()
|
||||
|
||||
meldung = f"{gesamt_sortiert} Dateien mit Regeln sortiert"
|
||||
if gesamt_fehler > 0:
|
||||
meldung += f", {gesamt_fehler} Fehler"
|
||||
if fehler_meldungen:
|
||||
meldung += f" ({'; '.join(fehler_meldungen)})"
|
||||
|
||||
return {"erfolg": gesamt_fehler == 0, "meldung": meldung}
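The three-way choice of target folder in the function above (rename in place, rule-specific target, or the free folder itself) can be sketched as a pure function. `resolve_target_dir` is a hypothetical name, not part of the commit; the dictionary keys are the ones used in `regel_dict`.

```python
from pathlib import Path


def resolve_target_dir(rule: dict, file_path: Path, free_dir: Path) -> Path:
    """Return the folder a matched file should end up in."""
    if rule.get("nur_umbenennen"):
        # Rename only: the file stays where it is
        return file_path.parent
    # A rule-specific target wins; otherwise stay in the scanned folder
    base = Path(rule["ziel_ordner"]) if rule.get("ziel_ordner") else free_dir
    sub = rule.get("unterordner")
    return base / sub if sub else base
```

Keeping this decision free of filesystem calls makes it easy to test; creating the folder with `mkdir(parents=True, exist_ok=True)` can then remain at the call site.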
|
||||
|
||||
|
||||
def get_scheduler_status() -> Dict:
|
||||
"""Gibt den Status aller Zeitpläne zurück"""
|
||||
global scheduler
|
||||
|
||||
db = SessionLocal()
|
||||
try:
|
||||
zeitplaene = db.query(Zeitplan).all()
|
||||
|
||||
result = []
|
||||
for zp in zeitplaene:
|
||||
job = scheduler.get_job(f"zeitplan_{zp.id}") if scheduler else None
|
||||
|
||||
result.append({
|
||||
"id": zp.id,
|
||||
"name": zp.name,
|
||||
"typ": zp.typ,
|
||||
"intervall": zp.intervall,
|
||||
"aktiv": zp.aktiv,
|
||||
"letzte_ausfuehrung": zp.letzte_ausfuehrung.isoformat() if zp.letzte_ausfuehrung else None,
|
||||
"naechste_ausfuehrung": zp.naechste_ausfuehrung.isoformat() if zp.naechste_ausfuehrung else None,
|
||||
"letzter_status": zp.letzter_status,
|
||||
"letzte_meldung": zp.letzte_meldung,
|
||||
"job_aktiv": job is not None
|
||||
})
|
||||
|
||||
return {
|
||||
"scheduler_laeuft": scheduler is not None and scheduler.running if scheduler else False,
|
||||
"zeitplaene": result
|
||||
}
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
|
||||
def trigger_zeitplan_manuell(zeitplan_id: int) -> Dict:
|
||||
"""Löst einen Zeitplan manuell aus"""
|
||||
db = SessionLocal()
|
||||
try:
|
||||
zeitplan = db.query(Zeitplan).filter(Zeitplan.id == zeitplan_id).first()
|
||||
if not zeitplan:
|
||||
return {"erfolg": False, "meldung": "Zeitplan nicht gefunden"}
|
||||
|
||||
# Synchron ausführen
|
||||
execute_zeitplan(zeitplan_id)
|
||||
|
||||
return {"erfolg": True, "meldung": f"Zeitplan '{zeitplan.name}' wurde ausgeführt"}
|
||||
finally:
|
||||
db.close()
|
||||
0
backend/app/utils/__init__.py → Source/backend/app/utils/__init__.py
Normal file → Executable file
4
backend/requirements.txt → Source/backend/requirements.txt
Normal file → Executable file
|
|
@ -7,6 +7,7 @@ jinja2==3.1.3
|
|||
# Database
|
||||
sqlalchemy==2.0.25
|
||||
aiosqlite==0.19.0
|
||||
pymysql==1.1.0
|
||||
|
||||
# PDF Processing
|
||||
pypdf==4.0.1
|
||||
|
|
@ -18,3 +19,6 @@ factur-x==3.0
|
|||
# Utilities
|
||||
pydantic==2.6.1
|
||||
python-dotenv==1.0.1
|
||||
|
||||
# Scheduler
|
||||
apscheduler==3.10.4
|
||||
56
Source/docker-compose-unraid.yml
Executable file
|
|
@ -0,0 +1,56 @@
|
|||
version: '3.8'
|
||||
|
||||
# Unraid Docker Compose für Dateiverwaltung
|
||||
# ==========================================
|
||||
# Projektpfad auf Unraid: /mnt/user/17 - Entwicklungen/20 - Projekte/Dateiverwaltung/
|
||||
#
|
||||
# WICHTIG: Image muss vorher per SSH gebaut werden!
|
||||
#
|
||||
# Verwendung (SSH auf Unraid):
|
||||
# 1. Image bauen:
|
||||
# cd "/mnt/user/17 - Entwicklungen/20 - Projekte/Dateiverwaltung"
|
||||
# docker build -t dateiverwaltung:local .
|
||||
#
|
||||
# 2. In Portainer: Stack deployen (oder per SSH):
|
||||
# docker-compose -f docker-compose-unraid.yml up -d
|
||||
#
|
||||
# 3. Nach Code-Änderungen: Schritt 1 + 2 wiederholen
|
||||
|
||||
services:
|
||||
dateiverwaltung:
|
||||
image: dateiverwaltung:local
|
||||
container_name: dateiverwaltung
|
||||
restart: unless-stopped
|
||||
|
||||
ports:
|
||||
- "8080:8000"
|
||||
|
||||
volumes:
|
||||
# Persistente Daten (Datenbank)
|
||||
- /mnt/user/appdata/firma/dateiverwaltung/data:/app/data
|
||||
|
||||
# Regeln-Konfiguration
|
||||
- /mnt/user/appdata/firma/dateiverwaltung/regeln:/app/regeln
|
||||
|
||||
# Zugriff auf alle Unraid Shares
|
||||
- /mnt/user:/mnt/user
|
||||
|
||||
environment:
|
||||
- TZ=Europe/Berlin
|
||||
- DATABASE_URL=mysql+pymysql://data:8715@192.168.155.83/dateiverwaltung
|
||||
- PUID=99
|
||||
- PGID=100
|
||||
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 40s
|
||||
|
||||
labels:
|
||||
- "net.unraid.docker.managed=dockerman"
|
||||
|
||||
networks:
|
||||
default:
|
||||
name: dateiverwaltung-net
|
||||
12
docker-compose.yml → Source/docker-compose.yml
Normal file → Executable file
|
|
@ -6,17 +6,15 @@ services:
|
|||
container_name: dateiverwaltung
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- "8000:8000"
|
||||
- "8080:8000"
|
||||
volumes:
|
||||
# Persistente Daten
|
||||
- ./data:/app/data
|
||||
# Regeln können außerhalb bearbeitet werden
|
||||
# Regeln mounten
|
||||
- ./regeln:/app/regeln
|
||||
# Archiv auf Host mounten (optional, für direkten Zugriff)
|
||||
# - /mnt/user/archiv:/archiv
|
||||
# Zugriff auf externe Mounts (NAS, etc.)
|
||||
- /mnt:/mnt
|
||||
environment:
|
||||
- TZ=Europe/Berlin
|
||||
- DATABASE_URL=sqlite:////app/data/dateiverwaltung.db
|
||||
- DATABASE_URL=mysql+pymysql://data:8715@192.168.155.83/dateiverwaltung
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
|
||||
interval: 30s
|
||||
1521
Source/frontend/static/css/style.css
Executable file
File diff suppressed because it is too large
2546
Source/frontend/static/js/app.js
Executable file
File diff suppressed because it is too large
937
Source/frontend/templates/index.html
Executable file
|
|
@ -0,0 +1,937 @@
|
|||
<!DOCTYPE html>
|
||||
<html lang="de">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>Dateiverwaltung</title>
|
||||
<link rel="stylesheet" href="/static/css/style.css">
|
||||
</head>
|
||||
<body>
|
||||
<div id="app">
|
||||
<!-- Header -->
|
||||
<header class="header">
|
||||
<div class="header-left">
|
||||
<h1>Dateiverwaltung</h1>
|
||||
</div>
|
||||
<div class="header-right">
|
||||
<span id="status-indicator"></span>
|
||||
<button class="btn-icon" onclick="zeigeLogModal()" title="Debug-Log">📋</button>
|
||||
<button class="btn-icon" onclick="zeigeEinstellungenModal()" title="Einstellungen">⚙️</button>
|
||||
</div>
|
||||
</header>
|
||||
|
||||
<!-- Main Content -->
|
||||
<div class="main-container">
|
||||
<!-- Bereich 1: Mail-Abruf -->
|
||||
<section class="bereich">
|
||||
<div class="bereich-header">
|
||||
<h2>📧 Mail-Abruf</h2>
|
||||
<p class="bereich-desc">Attachments aus Postfächern in Ordner speichern</p>
|
||||
</div>
|
||||
|
||||
<div class="bereich-content">
|
||||
<!-- Postfächer Liste -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Postfächer</h3>
|
||||
<button class="btn btn-sm btn-primary" onclick="zeigePostfachModal()">+ Hinzufügen</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="postfaecher-liste">
|
||||
<p class="empty-state">Keine Postfächer konfiguriert</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Abruf starten -->
|
||||
<div class="action-bar">
|
||||
<button class="btn btn-success btn-large" onclick="allePostfaecherAbrufen()">
|
||||
▶ Alle Postfächer abrufen
|
||||
</button>
|
||||
</div>
|
||||
|
||||
<!-- Letzter Abruf Log -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Letzter Abruf</h3>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="abruf-log" class="log-output">
|
||||
<p class="empty-state">Noch kein Abruf durchgeführt</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<!-- Bereich 2: Datei-Sortierung -->
|
||||
<section class="bereich">
|
||||
<div class="bereich-header">
|
||||
<h2>📁 Datei-Sortierung</h2>
|
||||
<p class="bereich-desc">Dateien nach Regeln umbenennen und verschieben</p>
|
||||
</div>
|
||||
|
||||
<div class="bereich-content">
|
||||
<!-- Grobsortierung -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Grobsortierung</h3>
|
||||
<button class="btn btn-sm btn-primary" onclick="zeigeOrdnerModal()">+ Hinzufügen</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="ordner-liste">
|
||||
<p class="empty-state">Keine Ordner konfiguriert</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Regeln -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Sortier-Regeln</h3>
|
||||
<button class="btn btn-sm btn-primary" onclick="zeigeRegelModal()">+ Hinzufügen</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="regeln-liste">
|
||||
<p class="empty-state">Keine Regeln definiert</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Sortierung starten -->
|
||||
<div class="action-bar">
|
||||
<button class="btn btn-success btn-large" onclick="sortierungStarten()">
|
||||
▶ Sortierung starten
|
||||
</button>
|
||||
</div>
|
||||
|
||||
<!-- Sortierungs-Log -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Verarbeitete Dateien</h3>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="sortierung-log" class="log-output">
|
||||
<p class="empty-state">Noch keine Dateien verarbeitet</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<!-- Bereich 3: Zeitpläne / Scheduler -->
|
||||
<section class="bereich">
|
||||
<div class="bereich-header">
|
||||
<h2>⏰ Zeitpläne</h2>
|
||||
<p class="bereich-desc">Automatische Ausführung von Mail-Abruf und Sortierung</p>
|
||||
</div>
|
||||
|
||||
<div class="bereich-content">
|
||||
<!-- Status-Übersicht -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Status-Übersicht</h3>
|
||||
<button class="btn btn-sm" onclick="ladeStatus()">🔄 Aktualisieren</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="status-uebersicht">
|
||||
<p class="empty-state">Status wird geladen...</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Zeitpläne Liste -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Zeitpläne</h3>
|
||||
<button class="btn btn-sm btn-primary" onclick="zeigeZeitplanModal()">+ Hinzufügen</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="zeitplaene-liste">
|
||||
<p class="empty-state">Keine Zeitpläne konfiguriert</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Postfach hinzufügen -->
|
||||
<div id="postfach-modal" class="modal hidden">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h3>Postfach hinzufügen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('postfach-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="form-group">
|
||||
<label>Name</label>
|
||||
<input type="text" id="pf-name" placeholder="z.B. Firma Rechnungen">
|
||||
</div>
|
||||
<div class="form-row">
|
||||
<div class="form-group">
|
||||
<label>IMAP Server</label>
|
||||
<input type="text" id="pf-server" placeholder="imap.example.com">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Port</label>
|
||||
<input type="number" id="pf-port" value="993">
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>E-Mail</label>
|
||||
<input type="email" id="pf-email" placeholder="mail@example.com">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Passwort</label>
|
||||
<input type="password" id="pf-passwort">
|
||||
</div>
|
||||
<div class="form-row">
|
||||
<div class="form-group">
|
||||
<label>IMAP-Ordner</label>
|
||||
<input type="text" id="pf-ordner" value="INBOX">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Alle Ordner durchsuchen</label>
|
||||
<select id="pf-alle-ordner">
|
||||
<option value="false">Nein (nur angegebenen Ordner)</option>
|
||||
<option value="true">Ja (alle Ordner)</option>
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Welche Mails durchsuchen</label>
|
||||
<select id="pf-nur-ungelesen">
|
||||
<option value="false" selected>Alle Mails</option>
|
||||
<option value="true">Nur ungelesene Mails</option>
|
||||
</select>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Mails ab Datum</label>
|
||||
<input type="date" id="pf-ab-datum">
|
||||
<small>Nur Mails ab diesem Datum verarbeiten (leer = alle)</small>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Ziel-Ordner</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="pf-ziel" value="/srv/http/dateiverwaltung/data/inbox/">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('pf-ziel')">📁</button>
|
||||
</div>
|
||||
<small>Hier landen die Attachments</small>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Erlaubte Dateitypen</label>
|
||||
<div class="checkbox-group" id="pf-typen-gruppe">
|
||||
<label class="checkbox-item"><input type="checkbox" value=".pdf" checked> PDF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".jpg"> JPG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".jpeg"> JPEG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".png"> PNG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".gif"> GIF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".tiff"> TIFF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".doc"> DOC</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".docx"> DOCX</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xls"> XLS</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xlsx"> XLSX</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".csv"> CSV</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".txt"> TXT</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".zip"> ZIP</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xml"> XML</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-row">
|
||||
<div class="form-group">
|
||||
<label>Standard Max. Größe (MB)</label>
|
||||
<input type="number" id="pf-max-groesse" value="25" style="width: 100px;">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Standard Min. Größe (KB)</label>
|
||||
<input type="number" id="pf-min-groesse" value="10" style="width: 100px;">
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Größenfilter pro Dateityp (optional)</label>
|
||||
<small>Überschreibt die Standard-Werte für einzelne Dateitypen</small>
|
||||
<div id="pf-groessen-filter" class="groessen-filter-container">
|
||||
<!-- Dynamisch generiert -->
|
||||
</div>
|
||||
<button type="button" class="btn btn-sm" onclick="toggleGroessenFilter()">
|
||||
Größenfilter pro Typ anzeigen/bearbeiten
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('postfach-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="speicherePostfach()">Speichern</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Ordner hinzufügen/bearbeiten - Breit mit 3 Spalten -->
|
||||
<div id="ordner-modal" class="modal hidden">
|
||||
<div class="modal-content modal-fullwidth">
|
||||
<div class="modal-header">
|
||||
<h3 id="ordner-modal-title">Grobsortierung hinzufügen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('ordner-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body ordner-editor-body">
|
||||
<!-- LINKE SPALTE: Grundeinstellungen -->
|
||||
<div class="ordner-spalte">
|
||||
<h4>📁 Grundeinstellungen</h4>
|
||||
|
||||
<div class="form-group">
|
||||
<label title="Eindeutiger Name zur Identifikation dieser Grobsortierung">Name</label>
|
||||
<input type="text" id="ord-name" placeholder="z.B. Firma Inbox" title="Gib der Grobsortierung einen Namen, z.B. 'E-Mail Anhänge' oder 'Scanner Eingang'">
|
||||
</div>
|
||||
|
||||
<div class="form-group">
|
||||
<label title="Ordner, in dem neue Dateien eingehen (z.B. Inbox, Download-Ordner)">Quell-Pfad (wo liegen die Dateien?)</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="ord-pfad" value="/srv/http/dateiverwaltung/data/inbox/" title="Absoluter Pfad zum Ordner, der überwacht werden soll">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('ord-pfad')" title="Ordner auswählen">📁</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="form-group">
|
||||
<label title="Ordner, in den die Dateien nach der Grobsortierung verschoben werden">Ziel-Ordner (wohin nach Sortierung?)</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="ord-ziel" value="/srv/http/dateiverwaltung/data/archiv/" title="Hier landen die Dateien nach der Grobsortierung. Sortierregeln greifen dann auf diesen Ordner zu.">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('ord-ziel')" title="Ordner auswählen">📁</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="form-group">
|
||||
<label title="Sollen auch Dateien in Unterordnern verarbeitet werden?">Unterordner einschließen</label>
|
||||
<select id="ord-rekursiv" title="Ja = alle Unterordner werden durchsucht. Nein = nur der Hauptordner.">
|
||||
<option value="true" selected>Ja (rekursiv)</option>
|
||||
<option value="false">Nein (nur dieser Ordner)</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<!-- Sortier-Modus -->
|
||||
<div class="ordner-section">
|
||||
<h4 title="Wie sollen die Dateien verarbeitet werden?">Sortier-Modus</h4>
|
||||
<div class="radio-group">
|
||||
<label class="radio-item" title="Dateien werden analysiert und mit Sortierregeln verarbeitet">
|
||||
<input type="radio" name="ord-modus" value="regeln" checked>
|
||||
<span>Mit Regeln sortieren</span>
|
||||
</label>
|
||||
<label class="radio-item" title="Dateien werden direkt in den Zielordner verschoben, ohne Regeln anzuwenden">
|
||||
<input type="radio" name="ord-modus" value="direkt">
|
||||
<span>Direkt verschieben (ohne Regeln)</span>
|
||||
</label>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- MITTLERE SPALTE: Dateitypen -->
|
||||
<div class="ordner-spalte">
|
||||
<h4 title="Welche Dateitypen sollen verarbeitet werden?">📄 Dateitypen</h4>
|
||||
<div class="checkbox-group dateitypen-grid" id="ord-typen-gruppe">
|
||||
<label class="checkbox-item" title="PDF-Dokumente (Textextraktion + OCR möglich)"><input type="checkbox" value=".pdf" checked> PDF</label>
|
||||
<label class="checkbox-item" title="JPEG-Bilder"><input type="checkbox" value=".jpg" checked> JPG</label>
|
||||
<label class="checkbox-item" title="JPEG-Bilder"><input type="checkbox" value=".jpeg" checked> JPEG</label>
|
||||
<label class="checkbox-item" title="PNG-Bilder"><input type="checkbox" value=".png" checked> PNG</label>
|
||||
<label class="checkbox-item" title="GIF-Bilder"><input type="checkbox" value=".gif"> GIF</label>
|
||||
<label class="checkbox-item" title="TIFF-Bilder (oft von Scannern)"><input type="checkbox" value=".tiff" checked> TIFF</label>
|
||||
<label class="checkbox-item" title="Bitmap-Bilder"><input type="checkbox" value=".bmp"> BMP</label>
|
||||
<label class="checkbox-item" title="Word-Dokumente (alt)"><input type="checkbox" value=".doc"> DOC</label>
|
||||
<label class="checkbox-item" title="Word-Dokumente (neu)"><input type="checkbox" value=".docx"> DOCX</label>
|
||||
<label class="checkbox-item" title="Excel-Dateien (alt)"><input type="checkbox" value=".xls"> XLS</label>
|
||||
<label class="checkbox-item" title="Excel-Dateien (neu)"><input type="checkbox" value=".xlsx"> XLSX</label>
|
||||
<label class="checkbox-item" title="CSV-Dateien"><input type="checkbox" value=".csv"> CSV</label>
|
||||
<label class="checkbox-item" title="Text-Dateien"><input type="checkbox" value=".txt"> TXT</label>
|
||||
<label class="checkbox-item" title="XML-Dateien"><input type="checkbox" value=".xml"> XML</label>
|
||||
</div>
|
||||
|
||||
<!-- Besondere Dateiarten -->
|
||||
<div class="ordner-section" style="margin-top: 1.5rem;">
|
||||
<h4 title="Spezielle Behandlung für bestimmte Dokumenttypen">🧾 Besondere Dateiarten</h4>
|
||||
<div class="checkbox-group dateitypen-grid">
|
||||
<label class="checkbox-item" title="ZUGFeRD-Rechnungen in separaten Unterordner verschieben"><input type="checkbox" id="ord-zugferd-sep" checked> ZUGFeRD</label>
|
||||
<label class="checkbox-item" title="Digital signierte PDFs in separaten Unterordner verschieben"><input type="checkbox" id="ord-signiert-sep"> Signiert</label>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- RECHTE SPALTE: PDF-Verarbeitung -->
|
||||
<div class="ordner-spalte">
|
||||
<h4 title="Einstellungen für die PDF-Texterkennung">⚙️ PDF-Verarbeitung</h4>
|
||||
|
||||
<div class="ordner-section">
|
||||
<label class="checkbox-label" title="OCR (Texterkennung) für gescannte PDFs aktivieren">
|
||||
<input type="checkbox" id="ord-ocr" checked>
|
||||
<span>OCR aktivieren</span>
|
||||
</label>
|
||||
<small style="color: var(--text-secondary); display: block; margin-top: 0.25rem;">
|
||||
Gescannte PDFs werden durchsuchbar gemacht
|
||||
</small>
|
||||
</div>
|
||||
|
||||
<div class="form-group" style="margin-top: 1rem;">
|
||||
<label title="Optionaler Backup-Ordner für Originale vor OCR-Verarbeitung">Original sichern vor OCR (optional)</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="ord-original-sichern" placeholder="Leer = kein Backup" title="Wenn angegeben, wird das Original vor der OCR-Verarbeitung hierhin kopiert">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('ord-original-sichern')" title="Ordner auswählen">📁</button>
|
||||
</div>
|
||||
<small style="color: var(--text-secondary);">Das Original wird vor OCR hierhin kopiert</small>
|
||||
</div>
|
||||
|
||||
<!-- Signatur-Prüfung Info -->
|
||||
<div class="ordner-section" style="margin-top: 1.5rem;">
|
||||
<h4>ℹ️ Info</h4>
|
||||
<div style="font-size: 0.85rem; color: var(--text-secondary);">
|
||||
<p><strong>ZUGFeRD:</strong> Elektronische Rechnungen mit eingebettetem XML. Enthalten strukturierte Daten für automatische Verarbeitung.</p>
|
||||
<p style="margin-top: 0.5rem;"><strong>Signierte PDFs:</strong> Dokumente mit digitaler Unterschrift. Jede Änderung an der Datei macht die Signatur ungültig, daher wird OCR bei diesen Dateien übersprungen.</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('ordner-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="speichereOrdner()">Speichern</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Regel hinzufügen/bearbeiten - 3 Spalten Layout -->
|
||||
<div id="regel-modal" class="modal hidden">
|
||||
<div class="modal-content modal-fullwidth">
|
||||
<div class="modal-header">
|
||||
<h3 id="regel-modal-title">Regel hinzufügen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('regel-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body regel-editor-body">
|
||||
<!-- LINKS: Regex-Hilfe -->
|
||||
<div class="regel-spalte regex-hilfe">
|
||||
<h4>📚 Regex-Hilfe</h4>
|
||||
<!-- Regex aus Markierung Button -->
|
||||
<div style="margin-bottom: 1rem;">
|
||||
<button class="btn btn-sm btn-primary" onclick="regexAusMarkierung()" title="Text im PDF markieren, dann klicken">
|
||||
🎯 Regex aus Markierung
|
||||
</button>
|
||||
<div id="regex-helfer-ergebnis" class="hidden"></div>
|
||||
</div>
|
||||
<div class="regex-cheatsheet">
|
||||
<div class="regex-gruppe">
|
||||
<strong>Zeichen</strong>
|
||||
<div class="regex-item"><code>.</code> Beliebiges Zeichen</div>
|
||||
<div class="regex-item"><code>\d</code> Ziffer (0-9)</div>
|
||||
<div class="regex-item"><code>\w</code> Wortzeichen (Buchstaben, Ziffern, _)</div>
|
||||
<div class="regex-item"><code>\s</code> Whitespace (Leerzeichen, Tab, Zeilenumbruch)</div>
|
||||
<div class="regex-item"><code>\S</code> Nicht-Whitespace</div>
|
||||
</div>
|
||||
<div class="regex-gruppe">
|
||||
<strong>Mengen</strong>
|
||||
<div class="regex-item"><code>*</code> 0 oder mehr</div>
|
||||
<div class="regex-item"><code>+</code> 1 oder mehr</div>
|
||||
<div class="regex-item"><code>?</code> 0 oder 1</div>
|
||||
<div class="regex-item"><code>{3}</code> Genau 3 mal</div>
|
||||
<div class="regex-item"><code>{2,4}</code> 2 bis 4 mal</div>
|
||||
</div>
|
||||
<div class="regex-gruppe">
|
||||
<strong>Gruppen</strong>
|
||||
<div class="regex-item"><code>(...)</code> Erfassungsgruppe</div>
|
||||
<div class="regex-item"><code>[abc]</code> a, b oder c</div>
|
||||
<div class="regex-item"><code>[0-9]</code> Ziffer</div>
|
||||
<div class="regex-item"><code>[^abc]</code> Nicht a, b, c</div>
|
||||
<div class="regex-item"><code>a|b</code> a oder b</div>
|
||||
</div>
|
||||
<div class="regex-gruppe">
|
||||
<strong>Anker</strong>
|
||||
<div class="regex-item"><code>^</code> Zeilenanfang</div>
|
||||
<div class="regex-item"><code>$</code> Zeilenende</div>
|
||||
<div class="regex-item"><code>\b</code> Wortgrenze</div>
|
||||
</div>
|
||||
<div class="regex-gruppe">
|
||||
<strong>Escape</strong>
|
||||
<div class="regex-item"><code>\.</code> Punkt literal</div>
|
||||
<div class="regex-item"><code>\/</code> Slash literal</div>
|
||||
<div class="regex-item"><code>\-</code> Minus literal</div>
|
||||
</div>
|
||||
<div class="regex-gruppe">
|
||||
<strong>Beispiele</strong>
|
||||
<div class="regex-beispiel">
|
||||
<code>\d{2}\.\d{2}\.\d{4}</code>
|
||||
<small>Datum: 31.12.2024</small>
|
||||
</div>
|
||||
<div class="regex-beispiel">
|
||||
<code>[\d.,]+\s*€</code>
|
||||
<small>Betrag: 123,45 €</small>
|
||||
</div>
|
||||
<div class="regex-beispiel">
|
||||
<code>RE-?\d{4,}</code>
|
||||
<small>Nummer: RE-12345</small>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- MITTE: Eingabefelder -->
|
||||
<div class="regel-spalte regel-eingabe">
|
||||
<!-- Grundeinstellungen kompakt -->
|
||||
<div class="regel-section">
|
||||
<div class="form-row">
|
||||
<div class="form-group" style="flex: 2;">
|
||||
<label title="Eindeutiger Name zur Identifikation der Regel">Name</label>
|
||||
<input type="text" id="regel-name" placeholder="z.B. Sonepar Rechnung" title="Gib der Regel einen aussagekräftigen Namen, z.B. 'Sonepar Rechnung' oder 'Telekom Vertrag'">
|
||||
</div>
|
||||
<div class="form-group" style="flex: 1;">
|
||||
<label title="Niedrigere Zahl = höhere Priorität. Regeln werden in dieser Reihenfolge geprüft.">Priorität</label>
|
||||
<input type="number" id="regel-prioritaet" value="100" title="1-999: Niedrig = wichtig. Regel mit Prio 10 wird vor Regel mit Prio 100 geprüft.">
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-row" style="margin-top: 0.5rem; gap: 1rem;">
|
||||
<label class="checkbox-label compact" title="Fallback-Regeln greifen nur, wenn keine andere Regel passt">
|
||||
<input type="checkbox" id="regel-ist-fallback">
|
||||
<span>Fallback</span>
|
||||
</label>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Erkennung -->
|
||||
<div class="regel-section">
|
||||
<h4 title="Bedingungen, die erfüllt sein müssen, damit die Regel greift">Erkennung</h4>
|
||||
<div class="form-group">
|
||||
<label title="Komma-getrennte Wörter, die ALLE im Dokument vorkommen müssen">Keywords (müssen vorkommen)</label>
|
||||
<input type="text" id="regel-keywords" placeholder="rechnung, sonepar" title="Alle Keywords müssen im PDF-Text enthalten sein (Groß-/Kleinschreibung egal). Beispiel: 'rechnung, sonepar' passt auf 'Rechnung von Sonepar'">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label title="Komma-getrennte Wörter, die NICHT im Dokument vorkommen dürfen">Ausschluss-Keywords</label>
|
||||
<input type="text" id="regel-keywords-nicht" placeholder="gutschrift, storno" title="Wenn eines dieser Wörter vorkommt, greift die Regel NICHT. Nützlich, um z.B. Gutschriften von Rechnungen zu unterscheiden.">
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Feld-Extraktion -->
|
||||
<div class="regel-section">
|
||||
<h4>Feld-Extraktion</h4>
|
||||
<table class="extraktion-tabelle compact" id="extraktion-tabelle">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Feldname</th>
|
||||
<th>Typ</th>
|
||||
<th>Regex-Muster / Fester Wert</th>
|
||||
<th title="Bei mehreren Treffern">Auswahl</th>
|
||||
<th></th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody id="extraktion-tbody">
|
||||
<!-- Wird dynamisch befüllt -->
|
||||
</tbody>
|
||||
</table>
|
||||
<button type="button" class="btn btn-sm" onclick="fuegeExtraktionsFeldHinzu()">+ Feld</button>
|
||||
</div>
|
||||
|
||||
<!-- Ausgabe -->
|
||||
<div class="regel-section">
|
||||
<h4 title="Wie und wohin die Dateien sortiert werden">Ausgabe</h4>
|
||||
<div class="form-group">
|
||||
<label class="checkbox-label" title="Datei wird nur umbenannt, bleibt aber im gleichen Ordner">
|
||||
<input type="checkbox" id="regel-nur-umbenennen" onchange="toggleZielOrdnerGruppe()">
|
||||
<span>Nur umbenennen (nicht verschieben)</span>
|
||||
</label>
|
||||
<small style="display: block; margin-top: 0.25rem; color: var(--text-secondary);">
|
||||
Dateien bleiben im Quellordner und werden nur umbenannt
|
||||
</small>
|
||||
</div>
|
||||
<div class="form-group" id="ziel-ordner-gruppe">
|
||||
<label title="Hauptordner, in den die Dateien verschoben werden">Ziel-Ordner</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="regel-ziel-ordner" placeholder="Wo sollen die Dateien hin?" title="Absoluter Pfad zum Zielordner, z.B. /mnt/user/Dokumente/Rechnungen">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('regel-ziel-ordner')" title="Ordner auswählen">📁</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label title="So wird die Datei benannt. Platzhalter werden durch extrahierte Werte ersetzt.">Dateiname-Schema</label>
|
||||
<input type="text" id="regel-schema" value="{datum} - Rechnung - {firma} - {nummer} - {betrag} EUR.pdf" title="Verfügbare Platzhalter: {datum}, {firma}, {nummer}, {betrag}, {typ}. Fehlende Felder werden automatisch weggelassen.">
|
||||
<small>Platzhalter: {datum}, {firma}, {nummer}, {betrag}, {typ}</small>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label title="Optionaler Unterordner innerhalb des Ziel-Ordners">Unterordner (optional)</label>
|
||||
<input type="text" id="regel-unterordner" placeholder="rechnungen/sonepar" title="Wird dem Ziel-Ordner angehängt. Kann mehrere Ebenen haben: firma/rechnungen/2024">
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Ordner-Zuweisung -->
|
||||
<div class="regel-section">
|
||||
<h4>Ordner-Zuweisung</h4>
|
||||
<div id="regel-ordner-liste" class="ordner-checkboxen compact">
|
||||
<p style="color: var(--text-secondary);">Lade...</p>
|
||||
</div>
|
||||
<details style="margin-top: 0.5rem;">
|
||||
<summary style="cursor: pointer; color: var(--text-secondary);">+ Freie Ordner</summary>
|
||||
<div id="regel-freie-ordner" class="freie-ordner-liste" style="margin-top: 0.5rem;"></div>
|
||||
<div class="input-with-btn" style="margin-top: 0.5rem;">
|
||||
<input type="text" id="regel-neuer-ordner" placeholder="/pfad/zum/ordner/">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('regel-neuer-ordner')">📁</button>
|
||||
<button class="btn btn-primary" type="button" onclick="fuegeFreienOrdnerHinzu()">+</button>
|
||||
</div>
|
||||
</details>
|
||||
</div>
|
||||
|
||||
<!-- Versteckte Felder -->
|
||||
<input type="hidden" id="regel-muster">
|
||||
<input type="hidden" id="regel-extraktion">
|
||||
<input type="hidden" id="regel-text-regex" value="">
|
||||
</div>
|
||||
|
||||
<!-- RECHTS: Live-Vorschau -->
|
||||
<div class="regel-spalte regel-vorschau">
|
||||
<h4>📄 Live-Vorschau</h4>
|
||||
<div class="test-controls">
|
||||
<input type="file" id="regel-test-datei" accept=".pdf" onchange="ladeTestPDF()" style="display:none">
|
||||
<button class="btn btn-sm" onclick="document.getElementById('regel-test-datei').click()">📄 PDF laden</button>
|
||||
<button class="btn btn-sm btn-success" onclick="autoRegexGenerieren()">🔮 Auto</button>
|
||||
<button class="btn btn-sm btn-primary" onclick="testeRegelLive()">🔍 Testen</button>
|
||||
</div>
|
||||
<div id="test-datei-name" style="font-size: 0.8rem; color: var(--text-secondary); margin: 0.5rem 0;"></div>
|
||||
|
||||
<!-- PDF-Text Anzeige -->
|
||||
<div class="pdf-text-container">
|
||||
<div id="regel-test-text-display" class="pdf-text-display" contenteditable="false">
|
||||
<p style="color: var(--text-secondary); text-align: center; padding: 2rem;">
|
||||
PDF hochladen, um Text anzuzeigen
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Test-Ergebnisse -->
|
||||
<div id="regel-test-ergebnis" class="test-result hidden">
|
||||
<div id="test-status" class="test-status-box"></div>
|
||||
<div id="test-extrahiert" class="test-extrahiert-box"></div>
|
||||
<div id="test-dateiname" class="test-dateiname-box" style="display: none;"></div>
|
||||
</div>
|
||||
|
||||
<!-- Verstecktes Textarea für Kompatibilität -->
|
||||
<textarea id="regel-test-text" style="display:none;"></textarea>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('regel-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="speichereRegel()">Speichern</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Verzeichnis-Browser -->
|
||||
<div id="browser-modal" class="modal hidden">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h3>Verzeichnis wählen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('browser-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="file-browser">
|
||||
<div class="file-browser-path">
|
||||
<input type="text" id="browser-path-input" value="/"
|
||||
onkeydown="if(event.key==='Enter'){navigiereToPfad();}"
|
||||
placeholder="Pfad eingeben...">
|
||||
<button class="btn btn-sm" onclick="navigiereToPfad()" title="Zu Pfad navigieren">↵</button>
|
||||
</div>
|
||||
<ul class="file-browser-list" id="browser-list"></ul>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('browser-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="browserAuswahl()">Auswählen</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Zeitplan hinzufügen -->
|
||||
<div id="zeitplan-modal" class="modal hidden">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h3 id="zeitplan-modal-title">Zeitplan hinzufügen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('zeitplan-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="form-group">
|
||||
<label>Name</label>
|
||||
<input type="text" id="zp-name" placeholder="z.B. Täglicher Mail-Abruf">
|
||||
</div>
|
||||
|
||||
<div class="form-group">
|
||||
<label>Was soll ausgeführt werden?</label>
|
||||
<select id="zp-typ" onchange="zeitplanTypChanged()">
|
||||
<option value="mail_abruf">Mail-Abruf</option>
|
||||
<option value="grobsortierung">Grobsortierung</option>
|
||||
<option value="sortierregeln">Nur Sortierregeln</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="form-group" id="zp-postfach-gruppe">
|
||||
<label>Postfach (leer = alle aktiven)</label>
|
||||
<select id="zp-postfach">
|
||||
<option value="">Alle aktiven Postfächer</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="form-group hidden" id="zp-ordner-gruppe">
|
||||
<label>Grobsortierung (leer = alle aktiven)</label>
|
||||
<select id="zp-ordner">
|
||||
<option value="">Alle aktiven Ordner</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="form-group hidden" id="zp-regel-gruppe">
|
||||
<label>Sortierregel (leer = alle aktiven)</label>
|
||||
<select id="zp-regel">
|
||||
<option value="">Alle aktiven Regeln</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="form-group">
|
||||
<label>Intervall</label>
|
||||
<select id="zp-intervall" onchange="zeitplanIntervallChanged()">
|
||||
<option value="stündlich">Stündlich (jede Stunde)</option>
|
||||
<option value="täglich" selected>Täglich (einmal pro Tag)</option>
|
||||
<option value="wöchentlich">Wöchentlich (einmal pro Woche)</option>
|
||||
<option value="monatlich">Monatlich (einmal pro Monat)</option>
|
||||
</select>
|
||||
<small id="zp-intervall-info" style="color: #666; display: block; margin-top: 4px;"></small>
|
||||
</div>
|
||||
|
||||
<div class="form-row" id="zp-zeit-gruppe">
|
||||
<div class="form-group">
|
||||
<label>Uhrzeit (Stunde)</label>
|
||||
<input type="number" id="zp-stunde" value="6" min="0" max="23" style="width: 80px;">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Minute</label>
|
||||
<input type="number" id="zp-minute" value="0" min="0" max="59" style="width: 80px;">
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="form-group hidden" id="zp-wochentag-gruppe">
|
||||
<label>Wochentag</label>
|
||||
<select id="zp-wochentag">
|
||||
<option value="0">Montag</option>
|
||||
<option value="1">Dienstag</option>
|
||||
<option value="2">Mittwoch</option>
|
||||
<option value="3">Donnerstag</option>
|
||||
<option value="4">Freitag</option>
|
||||
<option value="5">Samstag</option>
|
||||
<option value="6">Sonntag</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="form-group hidden" id="zp-monatstag-gruppe">
|
||||
<label>Tag im Monat</label>
|
||||
<input type="number" id="zp-monatstag" value="1" min="1" max="28" style="width: 80px;">
|
||||
<small>1-28 (für alle Monate gültig)</small>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('zeitplan-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="speichereZeitplan()">Speichern</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Einstellungen -->
|
||||
<div id="einstellungen-modal" class="modal hidden">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h3>⚙️ Einstellungen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('einstellungen-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="form-group">
|
||||
<label>Farbschema</label>
|
||||
<div class="theme-options">
|
||||
<button class="theme-option" data-theme="dark" onclick="setzeTheme('dark')">
|
||||
<span class="theme-preview dark"></span>
|
||||
<span>Dunkel</span>
|
||||
</button>
|
||||
<button class="theme-option" data-theme="light" onclick="setzeTheme('light')">
|
||||
<span class="theme-preview light"></span>
|
||||
<span>Hell</span>
|
||||
</button>
|
||||
<button class="theme-option" data-theme="blue" onclick="setzeTheme('blue')">
|
||||
<span class="theme-preview blue"></span>
|
||||
<span>Blau</span>
|
||||
</button>
|
||||
<button class="theme-option" data-theme="green" onclick="setzeTheme('green')">
|
||||
<span class="theme-preview green"></span>
|
||||
<span>Grün</span>
|
||||
</button>
|
||||
<button class="theme-option" data-theme="breeze" onclick="setzeTheme('breeze')">
|
||||
<span class="theme-preview breeze"></span>
|
||||
<span>Breeze Dark</span>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn btn-primary" onclick="schliesseModal('einstellungen-modal')">Schließen</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Debug Log -->
|
||||
<div id="log-modal" class="modal hidden">
|
||||
<div class="modal-content modal-large">
|
||||
<div class="modal-header">
|
||||
<h3>📋 Debug-Log</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('log-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="form-group">
|
||||
<div class="log-controls">
|
||||
<button class="btn btn-sm" onclick="ladeLog()">🔄 Aktualisieren</button>
|
||||
<button class="btn btn-sm" onclick="leereLog()">🗑️ Leeren</button>
|
||||
<select id="log-filter" onchange="ladeLog()">
|
||||
<option value="">Alle</option>
|
||||
<option value="ERROR">Fehler</option>
|
||||
<option value="WARNING">Warnungen</option>
|
||||
<option value="INFO">Info</option>
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
<div id="log-container" class="log-container">
|
||||
<p class="empty-state">Lade Log...</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn btn-primary" onclick="schliesseModal('log-modal')">Schließen</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Rule assistant -->
|
||||
<div id="assistent-modal" class="modal hidden">
|
||||
<div class="modal-content modal-large">
|
||||
<div class="modal-header">
|
||||
<h3>🧙 Regel-Assistent</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('assistent-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<p style="margin-bottom: 1rem; color: var(--text-secondary);">
|
||||
Beantworte die Fragen und ich erstelle die Regel für dich automatisch.
|
||||
</p>
|
||||
|
||||
<!-- Step 1: Detection -->
|
||||
<div class="assistent-section">
|
||||
<h4>1. Woran erkenne ich diese Dokumente?</h4>
|
||||
<div class="form-group">
|
||||
<label>Welche Wörter müssen im Dokument vorkommen? (Komma-getrennt)</label>
|
||||
<input type="text" id="ass-keywords" placeholder="z.B. rechnung, sonepar">
|
||||
<small>Tipp: Firmenname + Dokumenttyp (z.B. "telekom, rechnung")</small>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Step 2: Company -->
|
||||
<div class="assistent-section">
|
||||
<h4>2. Von welcher Firma ist das Dokument?</h4>
|
||||
<div class="form-group">
|
||||
<label>Firmenname</label>
|
||||
<input type="text" id="ass-firma" placeholder="z.B. Sonepar, Telekom, Amazon">
|
||||
<small>Wird im Dateinamen verwendet</small>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Step 3: Which fields to extract? -->
|
||||
<div class="assistent-section">
|
||||
<h4>3. Was soll aus dem Dokument extrahiert werden?</h4>
|
||||
|
||||
<div class="assistent-feld">
|
||||
<label class="checkbox-item">
|
||||
<input type="checkbox" id="ass-datum-aktiv" checked>
|
||||
<strong>📅 Datum</strong>
|
||||
</label>
|
||||
<select id="ass-datum-typ">
|
||||
<option value="auto">Automatisch erkennen</option>
|
||||
<option value="rechnungsdatum">Nach "Rechnungsdatum" suchen</option>
|
||||
<option value="datum">Nach "Datum" suchen</option>
|
||||
<option value="beliebig">Erstes Datum im Text</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="assistent-feld">
|
||||
<label class="checkbox-item">
|
||||
<input type="checkbox" id="ass-betrag-aktiv" checked>
|
||||
<strong>💰 Betrag</strong>
|
||||
</label>
|
||||
<select id="ass-betrag-typ">
|
||||
<option value="auto">Automatisch erkennen</option>
|
||||
<option value="gesamtbetrag">Nach "Gesamtbetrag" suchen</option>
|
||||
<option value="summe">Nach "Summe" suchen</option>
|
||||
<option value="brutto">Nach "Brutto" suchen</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="assistent-feld">
|
||||
<label class="checkbox-item">
|
||||
<input type="checkbox" id="ass-nummer-aktiv" checked>
|
||||
<strong>🔢 Rechnungsnummer</strong>
|
||||
</label>
|
||||
<select id="ass-nummer-typ">
|
||||
<option value="auto">Automatisch erkennen</option>
|
||||
<option value="rechnungsnummer">Nach "Rechnungsnummer" suchen</option>
|
||||
<option value="belegnr">Nach "Beleg-Nr" suchen</option>
|
||||
<option value="invoice">Nach "Invoice" suchen</option>
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Step 4: File name -->
|
||||
<div class="assistent-section">
|
||||
<h4>4. Wie soll die Datei heißen?</h4>
|
||||
<div class="form-group">
|
||||
<label>Dateiname-Schema</label>
|
||||
<select id="ass-schema">
|
||||
<option value="{datum} - Rechnung - {firma} - {nummer} - {betrag} EUR.pdf">Datum - Rechnung - Firma - Nummer - Betrag EUR.pdf</option>
|
||||
<option value="{datum} - {firma} - Rechnung {nummer}.pdf">Datum - Firma - Rechnung Nummer.pdf</option>
|
||||
<option value="{firma} - {datum} - {nummer}.pdf">Firma - Datum - Nummer.pdf</option>
|
||||
<option value="{datum} - {firma} - {betrag} EUR.pdf">Datum - Firma - Betrag EUR.pdf</option>
|
||||
</select>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Unterordner (optional)</label>
|
||||
<input type="text" id="ass-unterordner" placeholder="z.B. rechnungen/sonepar">
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Preview -->
|
||||
<div class="assistent-section" style="background: var(--bg); padding: 1rem; border-radius: var(--radius);">
|
||||
<h4>📋 Vorschau</h4>
|
||||
<div id="ass-vorschau" style="font-family: monospace; font-size: 0.85rem;">
|
||||
<em>Fülle die Felder aus um eine Vorschau zu sehen</em>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('assistent-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="assistentUebernehmen()">✓ Übernehmen</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Universal dialog modal -->
|
||||
<div id="dialog-modal" class="modal hidden">
|
||||
<div class="modal-content dialog-modal-content">
|
||||
<div class="modal-header">
|
||||
<h3 id="dialog-title">Hinweis</h3>
|
||||
<button class="modal-close" onclick="dialogSchliessen(false)">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div id="dialog-icon" class="dialog-icon"></div>
|
||||
<div id="dialog-message" class="dialog-message"></div>
|
||||
</div>
|
||||
<div class="modal-footer" id="dialog-footer">
|
||||
<button class="btn" id="dialog-cancel-btn" onclick="dialogSchliessen(false)">Abbrechen</button>
|
||||
<button class="btn btn-primary" id="dialog-ok-btn" onclick="dialogSchliessen(true)">OK</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Loading Overlay -->
|
||||
<div id="loading-overlay" class="loading-overlay hidden">
|
||||
<div class="spinner"></div>
|
||||
<div class="loading-text" id="loading-text">Wird geladen...</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<script src="/static/js/app.js"></script>
|
||||
</body>
|
||||
</html>
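The `ass-schema` options in the template above carry placeholders such as `{datum}`, `{firma}`, `{nummer}` and `{betrag}`. How the backend fills them is not shown in this part of the diff, so the following is only a minimal sketch of one way to do it. The function name `render_filename`, the `unbekannt` fallback and the set of replaced characters are assumptions made for illustration, not the project's actual implementation.

```python
import re

# Characters that are unsafe in file names on common file systems (assumed set).
_UNSAFE = re.compile(r'[\\/:*?"<>|]')


class _DefaultFields(dict):
    """Dict that yields a placeholder for fields the extraction did not find."""

    def __missing__(self, key):
        return "unbekannt"


def render_filename(schema: str, fields: dict) -> str:
    """Fill a schema such as '{datum} - {firma}.pdf' with extracted values."""
    # Sanitize the values only, so separators written in the schema stay intact.
    safe = {key: _UNSAFE.sub("_", str(value)).strip() for key, value in fields.items()}
    return schema.format_map(_DefaultFields(safe))
```

Sanitizing the extracted values rather than the finished name keeps a slash inside an invoice number from turning into an unintended sub-folder.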
|
||||
0
regeln/beispiel_regeln.yaml → Source/regeln/beispiel_regeln.yaml
Normal file → Executable file
Binary file not shown.
Binary file not shown.
Binary file not shown.
|
|
@ -1,26 +0,0 @@
|
|||
"""Zentrale Konfiguration"""
|
||||
import os
from pathlib import Path

# Base paths
BASE_DIR = Path(__file__).parent.parent.parent
DATA_DIR = BASE_DIR / "data"
CONFIG_DIR = BASE_DIR / "config"
REGELN_DIR = BASE_DIR / "regeln"

# Database
DATABASE_URL = os.getenv("DATABASE_URL", f"sqlite:///{DATA_DIR}/dateiverwaltung.db")

# Folder structure
INBOX_DIR = DATA_DIR / "inbox"
PROCESSED_DIR = DATA_DIR / "processed"
ARCHIVE_DIR = DATA_DIR / "archive"
ZUGFERD_DIR = DATA_DIR / "zugferd"

# OCR settings
OCR_LANGUAGE = "deu"  # German (Tesseract language code)
OCR_DPI = 300

# Create the folders if they do not exist yet
for dir_path in [INBOX_DIR, PROCESSED_DIR, ARCHIVE_DIR, ZUGFERD_DIR, REGELN_DIR]:
    dir_path.mkdir(parents=True, exist_ok=True)
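The configuration module creates its folders as a side effect of being imported. The sketch below shows the same folder layout behind an explicit function, which can be easier to test against a temporary directory. The name `ensure_data_dirs` and the returned dictionary are hypothetical and not part of the original module.

```python
from pathlib import Path


def ensure_data_dirs(base_dir: Path) -> dict:
    """Create the data sub-folders under base_dir and return them by name."""
    data_dir = base_dir / "data"
    dirs = {name: data_dir / name for name in ("inbox", "processed", "archive", "zugferd")}
    dirs["regeln"] = base_dir / "regeln"
    for path in dirs.values():
        # parents=True also creates data/ itself, exist_ok=True makes repeated calls harmless.
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```

Calling it twice in a row is safe, which mirrors the `exist_ok=True` behaviour of the loop in the module.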
|
||||
|
|
@ -1,4 +0,0 @@
|
|||
from .database import (
|
||||
Postfach, QuellOrdner, SortierRegel, VerarbeiteteDatei,
|
||||
init_db, get_db, SessionLocal
|
||||
)
|
||||
Binary file not shown.
Binary file not shown.
|
|
@ -1,161 +0,0 @@
|
|||
"""Datenbank-Modelle - Getrennte Bereiche: Mail-Abruf und Datei-Sortierung"""
|
||||
from sqlalchemy import create_engine, Column, Integer, String, Boolean, DateTime, Text, JSON
|
||||
from sqlalchemy.orm import declarative_base, sessionmaker
|
||||
from datetime import datetime
|
||||
|
||||
from ..config import DATABASE_URL
|
||||
|
||||
engine = create_engine(DATABASE_URL, echo=False)
|
||||
SessionLocal = sessionmaker(bind=engine)
|
||||
Base = declarative_base()
|
||||
|
||||
|
||||
# ============ AREA 1: Mail retrieval ============
|
||||
|
||||
class Postfach(Base):
|
||||
"""IMAP-Postfach Konfiguration"""
|
||||
__tablename__ = "postfaecher"
|
||||
|
||||
id = Column(Integer, primary_key=True)
|
||||
name = Column(String(100), nullable=False)
|
||||
|
||||
# IMAP
|
||||
imap_server = Column(String(255), nullable=False)
|
||||
imap_port = Column(Integer, default=993)
|
||||
email = Column(String(255), nullable=False)
|
||||
passwort = Column(String(255), nullable=False)
|
||||
ordner = Column(String(100), default="INBOX")
|
||||
    alle_ordner = Column(Boolean, default=False)  # Search all IMAP folders
    nur_ungelesen = Column(Boolean, default=False)  # Only unread mails (False = all)
|
||||
|
||||
# Ziel
|
||||
ziel_ordner = Column(String(500), nullable=False)
|
||||
|
||||
# Filter
|
||||
erlaubte_typen = Column(JSON, default=lambda: [".pdf"])
|
||||
max_groesse_mb = Column(Integer, default=25)
|
||||
|
||||
# Status
|
||||
aktiv = Column(Boolean, default=True)
|
||||
letzter_abruf = Column(DateTime)
|
||||
letzte_anzahl = Column(Integer, default=0)
|
||||
|
||||
|
||||
# ============ AREA 2: File sorting ============
|
||||
|
||||
class QuellOrdner(Base):
|
||||
"""Ordner der nach Dateien gescannt wird"""
|
||||
__tablename__ = "quell_ordner"
|
||||
|
||||
id = Column(Integer, primary_key=True)
|
||||
name = Column(String(100), nullable=False)
|
||||
pfad = Column(String(500), nullable=False)
|
||||
ziel_ordner = Column(String(500), nullable=False)
|
||||
    rekursiv = Column(Boolean, default=True)  # Include subfolders
|
||||
dateitypen = Column(JSON, default=lambda: [".pdf", ".jpg", ".jpeg", ".png", ".tiff"])
|
||||
aktiv = Column(Boolean, default=True)
|
||||
|
||||
|
||||
class SortierRegel(Base):
|
||||
"""Regeln für Datei-Erkennung und Benennung"""
|
||||
__tablename__ = "sortier_regeln"
|
||||
|
||||
id = Column(Integer, primary_key=True)
|
||||
name = Column(String(100), nullable=False)
|
||||
prioritaet = Column(Integer, default=100)
|
||||
aktiv = Column(Boolean, default=True)
|
||||
|
||||
# Erkennungsmuster
|
||||
muster = Column(JSON, default=dict)
|
||||
|
||||
# Extraktion
|
||||
extraktion = Column(JSON, default=dict)
|
||||
|
||||
# Ausgabe
|
||||
schema = Column(String(500), default="{datum} - Dokument.pdf")
|
||||
unterordner = Column(String(100)) # Optional: Unterordner im Ziel
|
||||
|
||||
|
||||
class VerarbeiteteMail(Base):
|
||||
"""Tracking welche Mails bereits verarbeitet wurden"""
|
||||
__tablename__ = "verarbeitete_mails"
|
||||
|
||||
id = Column(Integer, primary_key=True)
|
||||
postfach_id = Column(Integer, nullable=False)
|
||||
message_id = Column(String(500), nullable=False) # Email Message-ID Header
|
||||
ordner = Column(String(200)) # IMAP Ordner
|
||||
betreff = Column(String(500))
|
||||
absender = Column(String(255))
|
||||
anzahl_attachments = Column(Integer, default=0)
|
||||
verarbeitet_am = Column(DateTime, default=datetime.utcnow)
|
||||
|
||||
|
||||
class VerarbeiteteDatei(Base):
|
||||
"""Log verarbeiteter Dateien"""
|
||||
__tablename__ = "verarbeitete_dateien"
|
||||
|
||||
id = Column(Integer, primary_key=True)
|
||||
original_pfad = Column(String(1000))
|
||||
original_name = Column(String(500))
|
||||
neuer_pfad = Column(String(1000))
|
||||
neuer_name = Column(String(500))
|
||||
|
||||
ist_zugferd = Column(Boolean, default=False)
|
||||
ocr_durchgefuehrt = Column(Boolean, default=False)
|
||||
|
||||
status = Column(String(50)) # sortiert, zugferd, fehler, keine_regel
|
||||
fehler = Column(Text)
|
||||
|
||||
extrahierte_daten = Column(JSON)
|
||||
verarbeitet_am = Column(DateTime, default=datetime.utcnow)
|
||||
|
||||
|
||||
def migrate_db():
    """Add missing columns without deleting existing data."""
    from sqlalchemy import inspect, text

    inspector = inspect(engine)

    # Migration definitions: {table: {column: sql_type}}
    migrations = {
        "postfaecher": {
            "alle_ordner": "BOOLEAN DEFAULT 0",
            "nur_ungelesen": "BOOLEAN DEFAULT 0"
        },
        "quell_ordner": {
            "rekursiv": "BOOLEAN DEFAULT 1",
            "dateitypen": "JSON"
        }
    }

    with engine.connect() as conn:
        for table, columns in migrations.items():
            if table not in inspector.get_table_names():
                continue

            existing = [col["name"] for col in inspector.get_columns(table)]

            for col_name, col_type in columns.items():
                if col_name not in existing:
                    try:
                        # Table and column names come from the constant dict above, never from user input
                        conn.execute(text(f"ALTER TABLE {table} ADD COLUMN {col_name} {col_type}"))
                        conn.commit()
                        print(f"Migration: {table}.{col_name} hinzugefügt")
                    except Exception as e:
                        print(f"Migration übersprungen: {table}.{col_name} - {e}")


def init_db():
    """Initialize the database."""
    Base.metadata.create_all(engine)
    migrate_db()


def get_db():
    """Database session generator."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()
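`migrate_db` follows an "add the column only if it is missing" pattern. The sketch below isolates that pattern with the standard-library `sqlite3` module instead of SQLAlchemy, so it can be run against an in-memory database. It assumes SQLite as the backend (the default `DATABASE_URL` suggests this), and the helper name `add_missing_columns` is hypothetical.

```python
import sqlite3


def add_missing_columns(conn: sqlite3.Connection, migrations: dict) -> list:
    """Add every column from migrations that its table lacks; return what was added."""
    added = []
    tables = {row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")}
    for table, columns in migrations.items():
        if table not in tables:
            continue  # table does not exist yet, nothing to migrate
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        for name, sql_type in columns.items():
            if name not in existing:
                # Identifiers come from a trusted constant mapping, not from user input.
                conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{name}" {sql_type}')
                added.append(f"{table}.{name}")
    conn.commit()
    return added
```

Running it a second time returns an empty list, which is the property that makes the migration safe to call on every start.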
|
||||
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
|
|
@ -1,248 +0,0 @@
|
|||
"""
|
||||
PDF processor module
Text extraction, OCR and ZUGFeRD detection
|
||||
"""
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional, Tuple
|
||||
import logging
|
||||
import re
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Try to import the optional libraries
|
||||
try:
|
||||
import pdfplumber
|
||||
PDFPLUMBER_AVAILABLE = True
|
||||
except ImportError:
|
||||
PDFPLUMBER_AVAILABLE = False
|
||||
logger.warning("pdfplumber nicht installiert")
|
||||
|
||||
try:
|
||||
from pypdf import PdfReader
|
||||
PYPDF_AVAILABLE = True
|
||||
except ImportError:
|
||||
PYPDF_AVAILABLE = False
|
||||
logger.warning("pypdf nicht installiert")
|
||||
|
||||
|
||||
class PDFProcessor:
|
||||
"""Verarbeitet PDFs: Text-Extraktion, OCR, ZUGFeRD-Erkennung"""
|
||||
|
||||
def __init__(self, ocr_language: str = "deu", ocr_dpi: int = 300):
|
||||
self.ocr_language = ocr_language
|
||||
self.ocr_dpi = ocr_dpi
|
||||
|
||||
def verarbeite(self, pdf_pfad: str) -> Dict:
|
||||
"""
|
||||
Vollständige PDF-Verarbeitung
|
||||
|
||||
Returns:
|
||||
Dict mit: text, ist_zugferd, zugferd_xml, hat_text, ocr_durchgefuehrt
|
||||
"""
|
||||
pfad = Path(pdf_pfad)
|
||||
if not pfad.exists():
|
||||
return {"fehler": f"Datei nicht gefunden: {pdf_pfad}"}
|
||||
|
||||
ergebnis = {
|
||||
"pfad": str(pfad),
|
||||
"text": "",
|
||||
"ist_zugferd": False,
|
||||
"zugferd_xml": None,
|
||||
"hat_text": False,
|
||||
"ocr_durchgefuehrt": False,
|
||||
"seiten": 0
|
||||
}
|
||||
|
||||
# 1. ZUGFeRD prüfen
|
||||
zugferd_result = self.pruefe_zugferd(pdf_pfad)
|
||||
ergebnis["ist_zugferd"] = zugferd_result["ist_zugferd"]
|
||||
ergebnis["zugferd_xml"] = zugferd_result.get("xml")
|
||||
|
||||
# 2. Text extrahieren
|
||||
text, seiten = self.extrahiere_text(pdf_pfad)
|
||||
ergebnis["text"] = text
|
||||
ergebnis["seiten"] = seiten
|
||||
ergebnis["hat_text"] = bool(text and len(text.strip()) > 50)
|
||||
|
||||
# 3. OCR falls kein Text (aber NICHT bei ZUGFeRD!)
|
||||
if not ergebnis["hat_text"] and not ergebnis["ist_zugferd"]:
|
||||
logger.info(f"Kein Text gefunden, starte OCR für {pfad.name}")
|
||||
ocr_text, ocr_erfolg = self.fuehre_ocr_aus(pdf_pfad)
|
||||
if ocr_erfolg:
|
||||
ergebnis["text"] = ocr_text
|
||||
ergebnis["hat_text"] = bool(ocr_text and len(ocr_text.strip()) > 50)
|
||||
ergebnis["ocr_durchgefuehrt"] = True
|
||||
|
||||
return ergebnis
|
||||
|
||||
    def extrahiere_text(self, pdf_pfad: str) -> Tuple[str, int]:
        """
        Extract text from a PDF.

        Returns:
            Tuple of (text, page count)
        """
        seiten = 0

        # Method 1: pdfplumber (better for tables)
        if PDFPLUMBER_AVAILABLE:
            try:
                # Fresh list per method, so a failed attempt leaves no partial text behind
                text_parts = []
                with pdfplumber.open(pdf_pfad) as pdf:
                    seiten = len(pdf.pages)
                    for page in pdf.pages:
                        page_text = page.extract_text()
                        if page_text:
                            text_parts.append(page_text)
                if text_parts:
                    return "\n\n".join(text_parts), seiten
            except Exception as e:
                logger.debug(f"pdfplumber Fehler: {e}")

        # Method 2: pypdf (fallback)
        if PYPDF_AVAILABLE:
            try:
                text_parts = []
                reader = PdfReader(pdf_pfad)
                seiten = len(reader.pages)
                for page in reader.pages:
                    page_text = page.extract_text()
                    if page_text:
                        text_parts.append(page_text)
                if text_parts:
                    return "\n\n".join(text_parts), seiten
            except Exception as e:
                logger.debug(f"pypdf Fehler: {e}")

        # Method 3: pdftotext CLI (fallback)
        try:
            result = subprocess.run(
                ["pdftotext", "-layout", pdf_pfad, "-"],
                capture_output=True,
                text=True,
                timeout=30
            )
            if result.returncode == 0 and result.stdout.strip():
                return result.stdout, seiten
        except Exception as e:
            logger.debug(f"pdftotext Fehler: {e}")

        return "", seiten
|
||||
|
||||
def pruefe_zugferd(self, pdf_pfad: str) -> Dict:
|
||||
"""
|
||||
Prüft ob PDF eine ZUGFeRD/Factur-X Rechnung ist
|
||||
|
||||
Returns:
|
||||
Dict mit: ist_zugferd, xml (falls vorhanden)
|
||||
"""
|
||||
ergebnis = {"ist_zugferd": False, "xml": None}
|
||||
|
||||
# Methode 1: factur-x Library
|
||||
try:
|
||||
from facturx import get_facturx_xml_from_pdf
|
||||
xml_bytes = get_facturx_xml_from_pdf(pdf_pfad)
|
||||
if xml_bytes:
|
||||
ergebnis["ist_zugferd"] = True
|
||||
ergebnis["xml"] = xml_bytes.decode("utf-8") if isinstance(xml_bytes, bytes) else xml_bytes
|
||||
logger.info(f"ZUGFeRD erkannt: {Path(pdf_pfad).name}")
|
||||
return ergebnis
|
||||
except ImportError:
|
||||
logger.debug("factur-x nicht installiert")
|
||||
except Exception as e:
|
||||
logger.debug(f"factur-x Fehler: {e}")
|
||||
|
||||
# Methode 2: Manuell nach XML-Attachment suchen
|
||||
if PYPDF_AVAILABLE:
|
||||
try:
|
||||
reader = PdfReader(pdf_pfad)
|
||||
if "/Names" in reader.trailer.get("/Root", {}):
|
||||
# Embedded Files prüfen
|
||||
pass # Komplexere Logik hier
|
||||
|
||||
# Alternativ: Im Text nach ZUGFeRD-Markern suchen
|
||||
for page in reader.pages[:1]: # Nur erste Seite
|
||||
text = page.extract_text() or ""
|
||||
if any(marker in text.upper() for marker in ["ZUGFERD", "FACTUR-X", "EN 16931"]):
|
||||
ergebnis["ist_zugferd"] = True
|
||||
logger.info(f"ZUGFeRD-Marker gefunden: {Path(pdf_pfad).name}")
|
||||
break
|
||||
except Exception as e:
|
||||
logger.debug(f"ZUGFeRD-Prüfung Fehler: {e}")
|
||||
|
||||
return ergebnis
|
||||
|
||||
    def fuehre_ocr_aus(self, pdf_pfad: str) -> Tuple[str, bool]:
        """
        Run OCR with ocrmypdf.

        Returns:
            Tuple of (text, success)
        """
        pfad = Path(pdf_pfad)
        temp_pfad = pfad.with_suffix(".ocr.pdf")

        try:
            # Run ocrmypdf. --skip-text and --force-ocr are mutually exclusive,
            # so only --skip-text is passed.
            result = subprocess.run(
                [
                    "ocrmypdf",
                    "--language", self.ocr_language,
                    "--deskew",     # straighten skewed scans
                    "--clean",      # clean up the image before OCR
                    "--skip-text",  # skip pages that already contain text
                    str(pfad),
                    str(temp_pfad)
                ],
                capture_output=True,
                text=True,
                timeout=120  # 2 minute timeout
            )

            if result.returncode == 0 and temp_pfad.exists():
                # Replace the original with the OCR version in one step
                temp_pfad.replace(pfad)

                # Extract the text from the OCR'd PDF
                text, _ = self.extrahiere_text(str(pfad))
                return text, True
            else:
                logger.error(f"OCR Fehler: {result.stderr}")
                if temp_pfad.exists():
                    temp_pfad.unlink()
                return "", False

        except subprocess.TimeoutExpired:
            logger.error(f"OCR Timeout für {pfad.name}")
            if temp_pfad.exists():
                temp_pfad.unlink()
            return "", False
        except FileNotFoundError:
            logger.error("ocrmypdf nicht installiert")
            return "", False
        except Exception as e:
            logger.error(f"OCR Fehler: {e}")
            if temp_pfad.exists():
                temp_pfad.unlink()
            return "", False
|
||||
|
||||
def extrahiere_metadaten(self, pdf_pfad: str) -> Dict:
|
||||
"""Extrahiert PDF-Metadaten"""
|
||||
metadaten = {}
|
||||
|
||||
if PYPDF_AVAILABLE:
|
||||
try:
|
||||
reader = PdfReader(pdf_pfad)
|
||||
if reader.metadata:
|
||||
metadaten = {
|
||||
"titel": reader.metadata.get("/Title", ""),
|
||||
"autor": reader.metadata.get("/Author", ""),
|
||||
"ersteller": reader.metadata.get("/Creator", ""),
|
||||
"erstellt": reader.metadata.get("/CreationDate", ""),
|
||||
}
|
||||
except Exception as e:
|
||||
logger.debug(f"Metadaten-Fehler: {e}")
|
||||
|
||||
return metadaten
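`PDFProcessor.verarbeite` only starts OCR when the extracted text is too short and the file is not a ZUGFeRD invoice. The sketch below restates that decision as two small pure functions so the threshold behaviour can be checked in isolation. The function names and the `min_chars` parameter are introduced here for illustration; the value 50 is the threshold used in the source.

```python
def has_usable_text(text: str, min_chars: int = 50) -> bool:
    """True if the stripped text is longer than min_chars characters."""
    return bool(text and len(text.strip()) > min_chars)


def needs_ocr(text: str, ist_zugferd: bool, min_chars: int = 50) -> bool:
    """OCR is only worthwhile for non-ZUGFeRD files without usable text."""
    return not ist_zugferd and not has_usable_text(text, min_chars)
```

Note that the comparison is strict: a text of exactly 50 non-blank characters still counts as "no text" and would trigger OCR.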
|
||||
Binary file not shown.
Binary file not shown.
|
|
@ -1,851 +0,0 @@
|
|||
"""
|
||||
API routes - separate areas: mail retrieval and file sorting
|
||||
"""
|
||||
from fastapi import APIRouter, Depends, HTTPException
|
||||
from fastapi.responses import StreamingResponse
|
||||
from sqlalchemy.orm import Session
|
||||
from typing import List, Optional
|
||||
from pydantic import BaseModel
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
import json
|
||||
import asyncio
|
||||
|
||||
from ..models.database import get_db, Postfach, QuellOrdner, SortierRegel, VerarbeiteteDatei, VerarbeiteteMail
|
||||
from ..modules.mail_fetcher import MailFetcher
|
||||
from ..modules.pdf_processor import PDFProcessor
|
||||
from ..modules.sorter import Sorter
|
||||
|
||||
router = APIRouter(prefix="/api", tags=["api"])
|
||||
|
||||
|
||||
# ============ Pydantic Models ============
|
||||
|
||||
class PostfachCreate(BaseModel):
|
||||
name: str
|
||||
imap_server: str
|
||||
imap_port: int = 993
|
||||
email: str
|
||||
passwort: str
|
||||
ordner: str = "INBOX"
|
||||
alle_ordner: bool = False # Alle IMAP-Ordner durchsuchen
|
||||
nur_ungelesen: bool = False # Nur ungelesene Mails (False = alle)
|
||||
ziel_ordner: str
|
||||
erlaubte_typen: List[str] = [".pdf"]
|
||||
max_groesse_mb: int = 25
|
||||
|
||||
|
||||
class PostfachResponse(BaseModel):
|
||||
id: int
|
||||
name: str
|
||||
imap_server: str
|
||||
email: str
|
||||
ordner: str
|
||||
alle_ordner: bool
|
||||
nur_ungelesen: bool
|
||||
ziel_ordner: str
|
||||
erlaubte_typen: List[str]
|
||||
max_groesse_mb: int
|
||||
letzter_abruf: Optional[datetime]
|
||||
letzte_anzahl: int
|
||||
|
||||
class Config:
|
||||
from_attributes = True
|
||||
|
||||
|
||||
class OrdnerCreate(BaseModel):
|
||||
name: str
|
||||
pfad: str
|
||||
ziel_ordner: str
|
||||
rekursiv: bool = True
|
||||
dateitypen: List[str] = [".pdf", ".jpg", ".jpeg", ".png", ".tiff"]
|
||||
|
||||
|
||||
class OrdnerResponse(BaseModel):
|
||||
id: int
|
||||
name: str
|
||||
pfad: str
|
||||
ziel_ordner: str
|
||||
rekursiv: bool
|
||||
dateitypen: List[str]
|
||||
aktiv: bool
|
||||
|
||||
class Config:
|
||||
from_attributes = True
|
||||
|
||||
|
||||
class RegelCreate(BaseModel):
|
||||
name: str
|
||||
prioritaet: int = 100
|
||||
muster: dict = {}
|
||||
extraktion: dict = {}
|
||||
schema: str = "{datum} - Dokument.pdf"
|
||||
unterordner: Optional[str] = None
|
||||
|
||||
|
||||
class RegelResponse(BaseModel):
|
||||
id: int
|
||||
name: str
|
||||
prioritaet: int
|
||||
aktiv: bool
|
||||
muster: dict
|
||||
extraktion: dict
|
||||
schema: str
|
||||
unterordner: Optional[str]
|
||||
|
||||
class Config:
|
||||
from_attributes = True
|
||||
|
||||
|
||||
class RegelTestRequest(BaseModel):
|
||||
regel: dict
|
||||
text: str
|
||||
|
||||
|
||||
# ============ Directory browser ============
|
||||
|
||||
@router.get("/browse")
def browse_directory(path: str = "/"):
    """List directories for the file browser"""
    import os

    # Security: only allow certain base paths
    allowed_bases = ["/srv", "/home", "/mnt", "/media", "/data", "/tmp"]
    path = os.path.abspath(path)

    # Check whether the path is allowed. Whole path components are compared,
    # so that e.g. "/srv-backup" does not pass as a child of "/srv".
    is_allowed = path == "/" or any(
        path == base or path.startswith(base + os.sep) for base in allowed_bases
    )
    if not is_allowed:
        return {"error": "Pfad nicht erlaubt", "entries": []}

    if not os.path.exists(path):
        return {"error": "Pfad existiert nicht", "entries": []}

    if not os.path.isdir(path):
        return {"error": "Kein Verzeichnis", "entries": []}

    try:
        entries = []
        for entry in sorted(os.listdir(path)):
            full_path = os.path.join(path, entry)
            if os.path.isdir(full_path):
                entries.append({
                    "name": entry,
                    "path": full_path,
                    "type": "directory"
                })

        return {
            "current": path,
            "parent": os.path.dirname(path) if path != "/" else None,
            "entries": entries
        }
    except PermissionError:
        return {"error": "Zugriff verweigert", "entries": []}
|
||||
|
||||
|
||||
# ============ AREA 1: Mailboxes ============
|
||||
|
||||
@router.get("/postfaecher", response_model=List[PostfachResponse])
|
||||
def liste_postfaecher(db: Session = Depends(get_db)):
|
||||
return db.query(Postfach).all()
|
||||
|
||||
|
||||
@router.post("/postfaecher", response_model=PostfachResponse)
|
||||
def erstelle_postfach(data: PostfachCreate, db: Session = Depends(get_db)):
|
||||
    postfach = Postfach(**data.model_dump())
|
||||
db.add(postfach)
|
||||
db.commit()
|
||||
db.refresh(postfach)
|
||||
return postfach
|
||||
|
||||
|
||||
@router.put("/postfaecher/{id}", response_model=PostfachResponse)
|
||||
def aktualisiere_postfach(id: int, data: PostfachCreate, db: Session = Depends(get_db)):
|
||||
postfach = db.query(Postfach).filter(Postfach.id == id).first()
|
||||
if not postfach:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
|
||||
    update_data = data.model_dump()
    # Only update the password if it is not empty
|
||||
if not update_data.get("passwort"):
|
||||
del update_data["passwort"]
|
||||
|
||||
for key, value in update_data.items():
|
||||
setattr(postfach, key, value)
|
||||
|
||||
db.commit()
|
||||
db.refresh(postfach)
|
||||
return postfach
|
||||
|
||||
|
||||
@router.delete("/postfaecher/{id}")
|
||||
def loesche_postfach(id: int, db: Session = Depends(get_db)):
|
||||
postfach = db.query(Postfach).filter(Postfach.id == id).first()
|
||||
if not postfach:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
db.delete(postfach)
|
||||
db.commit()
|
||||
return {"message": "Gelöscht"}
|
||||
|
||||
|
||||
@router.post("/postfaecher/{id}/test")
|
||||
def teste_postfach(id: int, db: Session = Depends(get_db)):
|
||||
postfach = db.query(Postfach).filter(Postfach.id == id).first()
|
||||
if not postfach:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
|
||||
fetcher = MailFetcher({
|
||||
"imap_server": postfach.imap_server,
|
||||
"imap_port": postfach.imap_port,
|
||||
"email": postfach.email,
|
||||
"passwort": postfach.passwort,
|
||||
"ordner": postfach.ordner
|
||||
})
|
||||
return fetcher.test_connection()
|
||||
|
||||
|
||||
@router.get("/postfaecher/{id}/abrufen/stream")
|
||||
def rufe_postfach_ab_stream(id: int, db: Session = Depends(get_db)):
|
||||
"""Streaming-Endpoint für Mail-Abruf mit Live-Updates"""
|
||||
postfach = db.query(Postfach).filter(Postfach.id == id).first()
|
||||
if not postfach:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
|
||||
    # Copy the data for the generator (the session is no longer available after return)
|
||||
pf_data = {
|
||||
"id": postfach.id,
|
||||
"name": postfach.name,
|
||||
"imap_server": postfach.imap_server,
|
||||
"imap_port": postfach.imap_port,
|
||||
"email": postfach.email,
|
||||
"passwort": postfach.passwort,
|
||||
"ordner": postfach.ordner,
|
||||
"alle_ordner": postfach.alle_ordner,
|
||||
"erlaubte_typen": postfach.erlaubte_typen,
|
||||
"max_groesse_mb": postfach.max_groesse_mb,
|
||||
"ziel_ordner": postfach.ziel_ordner
|
||||
}
|
||||
|
||||
# Bereits verarbeitete Message-IDs laden
|
||||
bereits_verarbeitet = set(
|
||||
row.message_id for row in
|
||||
db.query(VerarbeiteteMail.message_id)
|
||||
.filter(VerarbeiteteMail.postfach_id == id)
|
||||
.all()
|
||||
)
|
||||
|
||||
def event_generator():
|
||||
from ..models.database import SessionLocal
|
||||
|
||||
def send_event(data):
|
||||
return f"data: {json.dumps(data)}\n\n"
|
||||
|
||||
yield send_event({"type": "start", "postfach": pf_data["name"], "bereits_verarbeitet": len(bereits_verarbeitet)})
|
||||
|
||||
# Zielordner erstellen
|
||||
ziel = Path(pf_data["ziel_ordner"])
|
||||
ziel.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
fetcher = MailFetcher({
|
||||
"imap_server": pf_data["imap_server"],
|
||||
"imap_port": pf_data["imap_port"],
|
||||
"email": pf_data["email"],
|
||||
"passwort": pf_data["passwort"],
|
||||
"ordner": pf_data["ordner"],
|
||||
"erlaubte_typen": pf_data["erlaubte_typen"],
|
||||
"max_groesse_mb": pf_data["max_groesse_mb"]
|
||||
})
|
||||
|
||||
attachments = []
|
||||
|
||||
try:
|
||||
# Generator für streaming
|
||||
for event in fetcher.fetch_attachments_generator(
|
||||
ziel,
|
||||
nur_ungelesen=False,
|
||||
alle_ordner=pf_data["alle_ordner"],
|
||||
bereits_verarbeitet=bereits_verarbeitet
|
||||
):
|
||||
yield send_event(event)
|
||||
|
||||
if event.get("type") == "datei":
|
||||
attachments.append(event)
|
||||
|
||||
# DB-Session für Speicherung
|
||||
session = SessionLocal()
|
||||
try:
|
||||
verarbeitete_msg_ids = set()
|
||||
for att in attachments:
|
||||
msg_id = att.get("message_id")
|
||||
if msg_id and msg_id not in verarbeitete_msg_ids:
|
||||
verarbeitete_msg_ids.add(msg_id)
|
||||
session.add(VerarbeiteteMail(
|
||||
postfach_id=pf_data["id"],
|
||||
message_id=msg_id,
|
||||
ordner=att.get("ordner", ""),
|
||||
betreff=att.get("betreff", "")[:500] if att.get("betreff") else None,
|
||||
absender=att.get("absender", "")[:255] if att.get("absender") else None,
|
||||
anzahl_attachments=1
|
||||
))
|
||||
|
||||
# Postfach aktualisieren
|
||||
pf = session.query(Postfach).filter(Postfach.id == pf_data["id"]).first()
|
||||
if pf:
|
||||
pf.letzter_abruf = datetime.utcnow()
|
||||
pf.letzte_anzahl = len(attachments)
|
||||
session.commit()
|
||||
finally:
|
||||
session.close()
|
||||
|
||||
yield send_event({"type": "fertig", "anzahl": len(attachments)})
|
||||
|
||||
except Exception as e:
|
||||
yield send_event({"type": "fehler", "nachricht": str(e)})
|
||||
finally:
|
||||
fetcher.disconnect()
|
||||
|
||||
return StreamingResponse(
|
||||
event_generator(),
|
||||
media_type="text/event-stream",
|
||||
headers={
|
||||
"Cache-Control": "no-cache",
|
||||
"Connection": "keep-alive",
|
||||
"X-Accel-Buffering": "no"
|
||||
}
|
||||
)
|
||||
|
||||
|
||||
@router.post("/postfaecher/{id}/abrufen")
|
||||
def rufe_postfach_ab(id: int, db: Session = Depends(get_db)):
|
||||
postfach = db.query(Postfach).filter(Postfach.id == id).first()
|
||||
if not postfach:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
|
||||
# Bereits verarbeitete Message-IDs laden
|
||||
bereits_verarbeitet = set(
|
||||
row.message_id for row in
|
||||
db.query(VerarbeiteteMail.message_id)
|
||||
.filter(VerarbeiteteMail.postfach_id == id)
|
||||
.all()
|
||||
)
|
||||
|
||||
# Zielordner erstellen
|
||||
ziel = Path(postfach.ziel_ordner)
|
||||
ziel.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
fetcher = MailFetcher({
|
||||
"imap_server": postfach.imap_server,
|
||||
"imap_port": postfach.imap_port,
|
||||
"email": postfach.email,
|
||||
"passwort": postfach.passwort,
|
||||
"ordner": postfach.ordner,
|
||||
"erlaubte_typen": postfach.erlaubte_typen,
|
||||
"max_groesse_mb": postfach.max_groesse_mb
|
||||
})
|
||||
|
||||
try:
|
||||
attachments = fetcher.fetch_attachments(
|
||||
ziel,
|
||||
nur_ungelesen=False, # Search all mails
|
||||
alle_ordner=postfach.alle_ordner,
|
||||
bereits_verarbeitet=bereits_verarbeitet
|
||||
)
|
||||
|
||||
# Store processed mails in the DB
|
||||
verarbeitete_msg_ids = set()
|
||||
for att in attachments:
|
||||
msg_id = att.get("message_id")
|
||||
if msg_id and msg_id not in verarbeitete_msg_ids:
|
||||
verarbeitete_msg_ids.add(msg_id)
|
||||
db.add(VerarbeiteteMail(
|
||||
postfach_id=id,
|
||||
message_id=msg_id,
|
||||
ordner=att.get("ordner", ""),
|
||||
betreff=att.get("betreff", "")[:500] if att.get("betreff") else None,
|
||||
absender=att.get("absender", "")[:255] if att.get("absender") else None,
|
||||
anzahl_attachments=1
|
||||
))
|
||||
|
||||
postfach.letzter_abruf = datetime.utcnow()
|
||||
postfach.letzte_anzahl = len(attachments)
|
||||
db.commit()
|
||||
|
||||
return {
|
||||
"ergebnisse": [{
|
||||
"postfach": postfach.name,
|
||||
"anzahl": len(attachments),
|
||||
"dateien": [a["original_name"] for a in attachments],
|
||||
"bereits_verarbeitet": len(bereits_verarbeitet)
|
||||
}]
|
||||
}
|
||||
except Exception as e:
|
||||
return {
|
||||
"ergebnisse": [{
|
||||
"postfach": postfach.name,
|
||||
"fehler": str(e)
|
||||
}]
|
||||
}
|
||||
finally:
|
||||
fetcher.disconnect()
|
||||
|
||||
|
||||
@router.post("/postfaecher/abrufen-alle")
|
||||
def rufe_alle_postfaecher_ab(db: Session = Depends(get_db)):
|
||||
postfaecher = db.query(Postfach).filter(Postfach.aktiv == True).all()
|
||||
ergebnisse = []
|
||||
|
||||
for postfach in postfaecher:
|
||||
ziel = Path(postfach.ziel_ordner)
|
||||
ziel.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
fetcher = MailFetcher({
|
||||
"imap_server": postfach.imap_server,
|
||||
"imap_port": postfach.imap_port,
|
||||
"email": postfach.email,
|
||||
"passwort": postfach.passwort,
|
||||
"ordner": postfach.ordner,
|
||||
"erlaubte_typen": postfach.erlaubte_typen,
|
||||
"max_groesse_mb": postfach.max_groesse_mb
|
||||
})
|
||||
|
||||
try:
|
||||
attachments = fetcher.fetch_attachments(ziel)
|
||||
postfach.letzter_abruf = datetime.utcnow()
|
||||
postfach.letzte_anzahl = len(attachments)
|
||||
|
||||
ergebnisse.append({
|
||||
"postfach": postfach.name,
|
||||
"anzahl": len(attachments),
|
||||
"dateien": [a["original_name"] for a in attachments]
|
||||
})
|
||||
except Exception as e:
|
||||
ergebnisse.append({
|
||||
"postfach": postfach.name,
|
||||
"fehler": str(e)
|
||||
})
|
||||
finally:
|
||||
fetcher.disconnect()
|
||||
|
||||
db.commit()
|
||||
return {"ergebnisse": ergebnisse}
|
||||
|
||||
|
||||
# ============ AREA 2: Source folders ============
|
||||
|
||||
@router.get("/ordner", response_model=List[OrdnerResponse])
|
||||
def liste_ordner(db: Session = Depends(get_db)):
|
||||
return db.query(QuellOrdner).all()
|
||||
|
||||
|
||||
@router.post("/ordner", response_model=OrdnerResponse)
|
||||
def erstelle_ordner(data: OrdnerCreate, db: Session = Depends(get_db)):
|
||||
ordner = QuellOrdner(**data.dict())
|
||||
db.add(ordner)
|
||||
db.commit()
|
||||
db.refresh(ordner)
|
||||
return ordner
|
||||
|
||||
|
||||
@router.delete("/ordner/{id}")
|
||||
def loesche_ordner(id: int, db: Session = Depends(get_db)):
|
||||
ordner = db.query(QuellOrdner).filter(QuellOrdner.id == id).first()
|
||||
if not ordner:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
db.delete(ordner)
|
||||
db.commit()
|
||||
return {"message": "Gelöscht"}
|
||||
|
||||
|
||||
@router.get("/ordner/{id}/scannen")
def scanne_ordner(id: int, db: Session = Depends(get_db)):
    ordner = db.query(QuellOrdner).filter(QuellOrdner.id == id).first()
    if not ordner:
        raise HTTPException(status_code=404, detail="Nicht gefunden")

    pfad = Path(ordner.pfad)
    if not pfad.exists():
        return {"anzahl": 0, "fehler": "Ordner existiert nicht"}

    # Collect files (recursively or not). Fall back to PDFs when no file types
    # are configured, matching sammle_dateien, instead of failing on None.
    erlaubte = [t.lower() for t in (ordner.dateitypen or [".pdf"])]
    pattern = "**/*" if ordner.rekursiv else "*"
    dateien = [
        f for f in pfad.glob(pattern)
        if f.is_file() and f.suffix.lower() in erlaubte
    ]

    # "anzahl" is the full count; only the first 30 paths are returned
    return {"anzahl": len(dateien), "dateien": [str(f.relative_to(pfad)) for f in dateien[:30]]}
|
||||
|
||||
|
||||
# ============ Rules ============
|
||||
|
||||
@router.get("/regeln", response_model=List[RegelResponse])
|
||||
def liste_regeln(db: Session = Depends(get_db)):
|
||||
return db.query(SortierRegel).order_by(SortierRegel.prioritaet).all()
|
||||
|
||||
|
||||
@router.post("/regeln", response_model=RegelResponse)
|
||||
def erstelle_regel(data: RegelCreate, db: Session = Depends(get_db)):
|
||||
regel = SortierRegel(**data.dict())
|
||||
db.add(regel)
|
||||
db.commit()
|
||||
db.refresh(regel)
|
||||
return regel
|
||||
|
||||
|
||||
@router.put("/regeln/{id}", response_model=RegelResponse)
|
||||
def aktualisiere_regel(id: int, data: RegelCreate, db: Session = Depends(get_db)):
|
||||
regel = db.query(SortierRegel).filter(SortierRegel.id == id).first()
|
||||
if not regel:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
for key, value in data.dict().items():
|
||||
setattr(regel, key, value)
|
||||
db.commit()
|
||||
db.refresh(regel)
|
||||
return regel
|
||||
|
||||
|
||||
@router.delete("/regeln/{id}")
|
||||
def loesche_regel(id: int, db: Session = Depends(get_db)):
|
||||
regel = db.query(SortierRegel).filter(SortierRegel.id == id).first()
|
||||
if not regel:
|
||||
raise HTTPException(status_code=404, detail="Nicht gefunden")
|
||||
db.delete(regel)
|
||||
db.commit()
|
||||
return {"message": "Gelöscht"}
|
||||
|
||||
|
||||
@router.post("/regeln/test")
def teste_regel(data: RegelTestRequest):
    regel = data.regel
    regel["aktiv"] = True
    regel["prioritaet"] = 1

    sorter = Sorter([regel])
    doc_info = {"text": data.text, "original_name": "test.pdf", "absender": ""}

    passend = sorter.finde_passende_regel(doc_info)

    if passend:
        extrahiert = sorter.extrahiere_felder(passend, doc_info)
        dateiname = sorter.generiere_dateinamen(passend, extrahiert)
        return {"passt": True, "extrahiert": extrahiert, "dateiname": dateiname}

    return {"passt": False}
|
||||
|
||||
|
||||
# ============ Sorting ============
|
||||
|
||||
def sammle_dateien(ordner: QuellOrdner) -> list:
    """Collect all files from a folder (recursively or not)."""
    pfad = Path(ordner.pfad)
    if not pfad.exists():
        return []

    dateien = []
    pattern = "**/*" if ordner.rekursiv else "*"
    erlaubte = [t.lower() for t in (ordner.dateitypen or [".pdf"])]

    for f in pfad.glob(pattern):
        if f.is_file() and f.suffix.lower() in erlaubte:
            dateien.append(f)

    return dateien
|
||||
|
||||
|
||||
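# Note: the glob-based collection in sammle_dateien can be tried in isolation.
# The sketch below is a standalone illustration, not part of the original module:
# collect_files and demo are hypothetical names, and a temporary directory stands
# in for a configured QuellOrdner.

```python
import tempfile
from pathlib import Path


def collect_files(root: Path, recursive: bool, suffixes=None) -> list:
    """Return files under root whose suffix is allowed (case-insensitive)."""
    allowed = {s.lower() for s in (suffixes or [".pdf"])}
    # "**/*" also matches files directly in root, "*" only the top level.
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in root.glob(pattern)
        if p.is_file() and p.suffix.lower() in allowed
    )


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.PDF").write_text("x")
        (root / "note.txt").write_text("x")
        (root / "sub").mkdir()
        (root / "sub" / "b.pdf").write_text("x")
        flat = [p.name for p in collect_files(root, recursive=False)]
        deep = [p.relative_to(root).as_posix() for p in collect_files(root, recursive=True)]
        return flat, deep


if __name__ == "__main__":
    print(demo())
```

# Lower-casing both sides of the suffix comparison is what lets "a.PDF" match ".pdf".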
@router.post("/sortierung/starten")
|
||||
def starte_sortierung(db: Session = Depends(get_db)):
|
||||
ordner_liste = db.query(QuellOrdner).filter(QuellOrdner.aktiv == True).all()
|
||||
regeln = db.query(SortierRegel).filter(SortierRegel.aktiv == True).order_by(SortierRegel.prioritaet).all()
|
||||
|
||||
if not ordner_liste:
|
||||
return {"fehler": "Keine Quell-Ordner konfiguriert", "verarbeitet": []}
|
||||
if not regeln:
|
||||
return {"fehler": "Keine Regeln definiert", "verarbeitet": []}
|
||||
|
||||
# Convert the rules to dict format
|
||||
regeln_dicts = []
|
||||
for r in regeln:
|
||||
regeln_dicts.append({
|
||||
"id": r.id,
|
||||
"name": r.name,
|
||||
"prioritaet": r.prioritaet,
|
||||
"muster": r.muster,
|
||||
"extraktion": r.extraktion,
|
||||
"schema": r.schema,
|
||||
"unterordner": r.unterordner
|
||||
})
|
||||
|
||||
sorter = Sorter(regeln_dicts)
|
||||
pdf_processor = PDFProcessor()
|
||||
|
||||
ergebnis = {
|
||||
"gesamt": 0,
|
||||
"sortiert": 0,
|
||||
"zugferd": 0,
|
||||
"fehler": 0,
|
||||
"verarbeitet": []
|
||||
}
|
||||
|
||||
for quell_ordner in ordner_liste:
|
||||
pfad = Path(quell_ordner.pfad)
|
||||
if not pfad.exists():
|
||||
continue
|
||||
|
||||
ziel_basis = Path(quell_ordner.ziel_ordner)
|
||||
dateien = sammle_dateien(quell_ordner)
|
||||
|
||||
for datei in dateien:
|
||||
ergebnis["gesamt"] += 1
|
||||
# Relative path for display
|
||||
try:
|
||||
rel_pfad = str(datei.relative_to(pfad))
|
||||
except ValueError:
|
||||
rel_pfad = datei.name
|
||||
datei_info = {"original": rel_pfad}
|
||||
|
||||
try:
|
||||
ist_pdf = datei.suffix.lower() == ".pdf"
|
||||
text = ""
|
||||
ist_zugferd = False
|
||||
ocr_gemacht = False
|
||||
|
||||
# Only PDFs go through the PDF processor
|
||||
if ist_pdf:
|
||||
pdf_result = pdf_processor.verarbeite(str(datei))
|
||||
|
||||
if pdf_result.get("fehler"):
|
||||
raise Exception(pdf_result["fehler"])
|
||||
|
||||
text = pdf_result.get("text", "")
|
||||
ist_zugferd = pdf_result.get("ist_zugferd", False)
|
||||
ocr_gemacht = pdf_result.get("ocr_durchgefuehrt", False)
|
||||
|
||||
# Handle ZUGFeRD separately
|
||||
if ist_zugferd:
|
||||
zugferd_ziel = ziel_basis / "zugferd"
|
||||
zugferd_ziel.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
neuer_pfad = zugferd_ziel / datei.name
|
||||
counter = 1
|
||||
while neuer_pfad.exists():
|
||||
neuer_pfad = zugferd_ziel / f"{datei.stem}_{counter}{datei.suffix}"
|
||||
counter += 1
|
||||
|
||||
datei.rename(neuer_pfad)
|
||||
|
||||
ergebnis["zugferd"] += 1
|
||||
datei_info["zugferd"] = True
|
||||
datei_info["neuer_name"] = neuer_pfad.name
|
||||
|
||||
db.add(VerarbeiteteDatei(
|
||||
original_pfad=str(datei),
|
||||
original_name=datei.name,
|
||||
neuer_pfad=str(neuer_pfad),
|
||||
neuer_name=neuer_pfad.name,
|
||||
ist_zugferd=True,
|
||||
status="zugferd"
|
||||
))
|
||||
ergebnis["verarbeitet"].append(datei_info)
|
||||
continue
|
||||
|
||||
# Find a rule (PDFs match on text, other files on the filename only)
|
||||
doc_info = {
|
||||
"text": text,
|
||||
"original_name": datei.name,
|
||||
"absender": "",
|
||||
"dateityp": datei.suffix.lower()
|
||||
}
|
||||
|
||||
regel = sorter.finde_passende_regel(doc_info)
|
||||
|
||||
if not regel:
|
||||
datei_info["fehler"] = "Keine passende Regel"
|
||||
ergebnis["fehler"] += 1
|
||||
ergebnis["verarbeitet"].append(datei_info)
|
||||
continue
|
||||
|
||||
# Extract fields
|
||||
extrahiert = sorter.extrahiere_felder(regel, doc_info)
|
||||
|
||||
# Keep the file extension
|
||||
schema = regel.get("schema", "{datum} - Dokument.pdf")
|
||||
# Strip the extension from the schema and append the original one
|
||||
if schema.endswith(".pdf"):
|
||||
schema = schema[:-4] + datei.suffix
|
||||
neuer_name = sorter.generiere_dateinamen({**regel, "schema": schema}, extrahiert)
|
||||
|
||||
# Target folder
|
||||
ziel = ziel_basis
|
||||
if regel.get("unterordner"):
|
||||
ziel = ziel / regel["unterordner"]
|
||||
ziel.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Move the file
|
||||
neuer_pfad = sorter.verschiebe_datei(str(datei), str(ziel), neuer_name)
|
||||
|
||||
ergebnis["sortiert"] += 1
|
||||
datei_info["neuer_name"] = neuer_name
|
||||
|
||||
db.add(VerarbeiteteDatei(
|
||||
original_pfad=str(datei),
|
||||
original_name=datei.name,
|
||||
neuer_pfad=neuer_pfad,
|
||||
neuer_name=neuer_name,
|
||||
ist_zugferd=False,
|
||||
ocr_durchgefuehrt=ocr_gemacht,
|
||||
status="sortiert",
|
||||
extrahierte_daten=extrahiert
|
||||
))
|
||||
|
||||
except Exception as e:
|
||||
ergebnis["fehler"] += 1
|
||||
datei_info["fehler"] = str(e)
|
||||
|
||||
ergebnis["verarbeitet"].append(datei_info)
|
||||
|
||||
db.commit()
|
||||
return ergebnis
|
||||
|
||||
|
||||
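# Note: the ZUGFeRD branch above avoids overwriting existing files by appending
# _1, _2, ... to the stem until the target name is free. The sketch below isolates
# that loop; free_target_path and demo are hypothetical names used only for
# illustration, and the check-then-write sequence is not safe against concurrent writers.

```python
import tempfile
from pathlib import Path


def free_target_path(target_dir: Path, filename: str) -> Path:
    """Return a path in target_dir that does not exist yet."""
    candidate = target_dir / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        # Same naming scheme as above: <stem>_<counter><suffix>
        candidate = target_dir / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        target_dir = Path(tmp)
        names = []
        for _ in range(3):
            path = free_target_path(target_dir, "rechnung.pdf")
            path.write_text("x")
            names.append(path.name)
        return names


if __name__ == "__main__":
    print(demo())
```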
@router.get("/health")
|
||||
def health():
|
||||
return {"status": "ok"}
|
||||
|
||||
|
||||
# ============ Simple rules (UI-friendly) ============
|
||||
|
||||
@router.get("/dokumenttypen")
|
||||
def liste_dokumenttypen():
|
||||
"""Return all available document types for the UI"""
|
||||
from ..modules.sorter import DOKUMENTTYPEN
|
||||
return [
|
||||
{"id": key, "name": config["name"], "schema": config["schema"], "unterordner": config["unterordner"]}
|
||||
for key, config in DOKUMENTTYPEN.items()
|
||||
]
|
||||
|
||||
|
||||
class EinfacheRegelCreate(BaseModel):
|
||||
name: str
|
||||
dokumenttyp: str # e.g. "rechnung", "vertrag"
|
||||
keywords: str # comma-separated
|
||||
firma: Optional[str] = None # fixed company value
|
||||
unterordner: Optional[str] = None
|
||||
prioritaet: int = 50
|
||||
|
||||
|
||||
@router.post("/regeln/einfach")
|
||||
def erstelle_einfache_regel_api(data: EinfacheRegelCreate, db: Session = Depends(get_db)):
|
||||
"""Create a rule based on a document type, for the simple UI"""
|
||||
from ..modules.sorter import DOKUMENTTYPEN
|
||||
|
||||
typ_config = DOKUMENTTYPEN.get(data.dokumenttyp, DOKUMENTTYPEN["sonstiges"])
|
||||
|
||||
# Pattern as a dict (keywords are parsed by the Sorter)
|
||||
muster = {"keywords": data.keywords}
|
||||
|
||||
# Extraction (company only, if given)
|
||||
extraktion = {}
|
||||
if data.firma:
|
||||
extraktion["firma"] = {"wert": data.firma}
|
||||
|
||||
regel = SortierRegel(
|
||||
name=data.name,
|
||||
prioritaet=data.prioritaet,
|
||||
aktiv=True,
|
||||
muster=muster,
|
||||
extraktion=extraktion,
|
||||
schema=typ_config["schema"],
|
||||
unterordner=data.unterordner or typ_config["unterordner"]
|
||||
)
|
||||
|
||||
db.add(regel)
|
||||
db.commit()
|
||||
db.refresh(regel)
|
||||
|
||||
return {
|
||||
"id": regel.id,
|
||||
"name": regel.name,
|
||||
"dokumenttyp": data.dokumenttyp,
|
||||
"keywords": data.keywords,
|
||||
"schema": regel.schema
|
||||
}
|
||||
|
||||
|
||||
class ExtraktionTestRequest(BaseModel):
|
||||
text: str
|
||||
dateiname: Optional[str] = "test.pdf"
|
||||
|
||||
|
||||
@router.post("/extraktion/test")
|
||||
def teste_extraktion(data: ExtraktionTestRequest):
|
||||
"""Test the automatic extraction on a text"""
|
||||
from ..modules.extraktoren import extrahiere_alle_felder, baue_dateiname
|
||||
|
||||
dokument_info = {
|
||||
"original_name": data.dateiname,
|
||||
"absender": ""
|
||||
}
|
||||
|
||||
# Extract fields
|
||||
felder = extrahiere_alle_felder(data.text, dokument_info)
|
||||
|
||||
# Generate example filenames for the different types
|
||||
beispiele = {}
|
||||
from ..modules.sorter import DOKUMENTTYPEN
|
||||
for typ_id, typ_config in DOKUMENTTYPEN.items():
|
||||
beispiele[typ_id] = baue_dateiname(typ_config["schema"], felder, ".pdf")
|
||||
|
||||
return {
|
||||
"extrahiert": felder,
|
||||
"beispiel_dateinamen": beispiele
|
||||
}
|
||||
|
||||
|
||||
@router.post("/regeln/{id}/vorschau")
|
||||
def regel_vorschau(id: int, data: ExtraktionTestRequest, db: Session = Depends(get_db)):
|
||||
"""Preview how a rule would be applied to a text"""
|
||||
regel = db.query(SortierRegel).filter(SortierRegel.id == id).first()
|
||||
if not regel:
|
||||
raise HTTPException(status_code=404, detail="Regel nicht gefunden")
|
||||
|
||||
from ..modules.sorter import Sorter
|
||||
|
||||
sorter = Sorter([{
|
||||
"id": regel.id,
|
||||
"name": regel.name,
|
||||
"prioritaet": regel.prioritaet,
|
||||
"aktiv": True,
|
||||
"muster": regel.muster,
|
||||
"extraktion": regel.extraktion,
|
||||
"schema": regel.schema,
|
||||
"unterordner": regel.unterordner
|
||||
}])
|
||||
|
||||
dokument_info = {
|
||||
"text": data.text,
|
||||
"original_name": data.dateiname or "test.pdf",
|
||||
"absender": ""
|
||||
}
|
||||
|
||||
# Check whether the rule matches
|
||||
passende_regel = sorter.finde_passende_regel(dokument_info)
|
||||
|
||||
if not passende_regel:
|
||||
return {
|
||||
"matched": False,
|
||||
"grund": "Keywords nicht gefunden"
|
||||
}
|
||||
|
||||
# Extract fields
|
||||
felder = sorter.extrahiere_felder(passende_regel, dokument_info)
|
||||
|
||||
# Generate the filename
|
||||
dateiname = sorter.generiere_dateinamen(passende_regel, felder)
|
||||
|
||||
return {
|
||||
"matched": True,
|
||||
"extrahiert": felder,
|
||||
"dateiname": dateiname,
|
||||
"unterordner": passende_regel.get("unterordner")
|
||||
}
|
||||
Binary file not shown.
Binary file not shown.
Binary file not shown.
|
|
@@ -1,543 +0,0 @@
|
|||
/* ============ Variables ============ */
|
||||
:root {
|
||||
--primary: #3b82f6;
|
||||
--primary-dark: #2563eb;
|
||||
--success: #22c55e;
|
||||
--danger: #ef4444;
|
||||
--warning: #f59e0b;
|
||||
--bg: #0f172a;
|
||||
--bg-secondary: #1e293b;
|
||||
--bg-tertiary: #334155;
|
||||
--text: #f1f5f9;
|
||||
--text-secondary: #94a3b8;
|
||||
--border: #475569;
|
||||
--radius: 8px;
|
||||
--shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.3);
|
||||
}
|
||||
|
||||
/* ============ Reset & Base ============ */
|
||||
* {
|
||||
margin: 0;
|
||||
padding: 0;
|
||||
box-sizing: border-box;
|
||||
}
|
||||
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
|
||||
background: var(--bg);
|
||||
color: var(--text);
|
||||
line-height: 1.6;
|
||||
}
|
||||
|
||||
/* ============ Layout ============ */
|
||||
#app {
|
||||
min-height: 100vh;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
}
|
||||
|
||||
.header {
|
||||
background: var(--bg-secondary);
|
||||
padding: 1rem 1.5rem;
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
|
||||
.header h1 {
|
||||
font-size: 1.25rem;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.main-container {
|
||||
display: grid;
|
||||
grid-template-columns: 1fr 1fr;
|
||||
gap: 1px;
|
||||
flex: 1;
|
||||
background: var(--border);
|
||||
}
|
||||
|
||||
@media (max-width: 1200px) {
|
||||
.main-container {
|
||||
grid-template-columns: 1fr;
|
||||
}
|
||||
}
|
||||
|
||||
/* ============ Areas ============ */
|
||||
.bereich {
|
||||
background: var(--bg);
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
}
|
||||
|
||||
.bereich-header {
|
||||
padding: 1.5rem;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
|
||||
.bereich-header h2 {
|
||||
font-size: 1.25rem;
|
||||
margin-bottom: 0.25rem;
|
||||
}
|
||||
|
||||
.bereich-desc {
|
||||
color: var(--text-secondary);
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
.bereich-content {
|
||||
padding: 1rem;
|
||||
flex: 1;
|
||||
overflow-y: auto;
|
||||
}
|
||||
|
||||
/* ============ Buttons ============ */
|
||||
.btn {
|
||||
padding: 0.5rem 1rem;
|
||||
border: none;
|
||||
border-radius: var(--radius);
|
||||
font-size: 0.875rem;
|
||||
cursor: pointer;
|
||||
transition: all 0.2s;
|
||||
background: var(--bg-tertiary);
|
||||
color: var(--text);
|
||||
}
|
||||
|
||||
.btn:hover {
|
||||
filter: brightness(1.1);
|
||||
}
|
||||
|
||||
.btn:disabled {
|
||||
opacity: 0.5;
|
||||
cursor: not-allowed;
|
||||
}
|
||||
|
||||
.btn-primary {
|
||||
background: var(--primary);
|
||||
color: white;
|
||||
}
|
||||
|
||||
.btn-success {
|
||||
background: var(--success);
|
||||
color: white;
|
||||
}
|
||||
|
||||
.btn-danger {
|
||||
background: var(--danger);
|
||||
color: white;
|
||||
}
|
||||
|
||||
.btn-sm {
|
||||
padding: 0.25rem 0.5rem;
|
||||
font-size: 0.75rem;
|
||||
}
|
||||
|
||||
.btn-large {
|
||||
padding: 0.75rem 1.5rem;
|
||||
font-size: 1rem;
|
||||
}
|
||||
|
||||
/* ============ Cards ============ */
|
||||
.card {
|
||||
background: var(--bg-secondary);
|
||||
border-radius: var(--radius);
|
||||
margin-bottom: 1rem;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.card-header {
|
||||
padding: 0.75rem 1rem;
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
border-bottom: 1px solid var(--border);
|
||||
background: var(--bg-tertiary);
|
||||
}
|
||||
|
||||
.card-header h3 {
|
||||
font-size: 0.875rem;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.card-body {
|
||||
padding: 1rem;
|
||||
}
|
||||
|
||||
/* ============ Action Bar ============ */
|
||||
.action-bar {
|
||||
padding: 1rem;
|
||||
text-align: center;
|
||||
background: var(--bg-secondary);
|
||||
border-radius: var(--radius);
|
||||
margin-bottom: 1rem;
|
||||
}
|
||||
|
||||
/* ============ Config Items ============ */
|
||||
.config-item {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
padding: 0.75rem;
|
||||
background: var(--bg-tertiary);
|
||||
border-radius: var(--radius);
|
||||
margin-bottom: 0.5rem;
|
||||
}
|
||||
|
||||
.config-item:last-child {
|
||||
margin-bottom: 0;
|
||||
}
|
||||
|
||||
.config-item-info h4 {
|
||||
font-size: 0.875rem;
|
||||
margin-bottom: 0.125rem;
|
||||
}
|
||||
|
||||
.config-item-info small {
|
||||
color: var(--text-secondary);
|
||||
font-size: 0.75rem;
|
||||
}
|
||||
|
||||
.config-item-actions {
|
||||
display: flex;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
/* ============ Forms ============ */
|
||||
.form-group {
|
||||
margin-bottom: 1rem;
|
||||
}
|
||||
|
||||
.form-group label {
|
||||
display: block;
|
||||
margin-bottom: 0.5rem;
|
||||
font-size: 0.875rem;
|
||||
color: var(--text-secondary);
|
||||
}
|
||||
|
||||
.form-group input,
|
||||
.form-group textarea,
|
||||
.form-group select {
|
||||
width: 100%;
|
||||
padding: 0.75rem;
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
background: var(--bg-tertiary);
|
||||
color: var(--text);
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
.form-group input:focus,
|
||||
.form-group textarea:focus {
|
||||
outline: none;
|
||||
border-color: var(--primary);
|
||||
}
|
||||
|
||||
.form-group small {
|
||||
display: block;
|
||||
margin-top: 0.25rem;
|
||||
color: var(--text-secondary);
|
||||
font-size: 0.75rem;
|
||||
}
|
||||
|
||||
.form-row {
|
||||
display: grid;
|
||||
grid-template-columns: 1fr 1fr;
|
||||
gap: 1rem;
|
||||
}
|
||||
|
||||
.code-input {
|
||||
font-family: 'Consolas', 'Monaco', monospace;
|
||||
font-size: 0.8rem;
|
||||
}
|
||||
|
||||
/* ============ Log Output ============ */
|
||||
.log-output {
|
||||
font-family: 'Consolas', 'Monaco', monospace;
|
||||
font-size: 0.8rem;
|
||||
max-height: 350px;
|
||||
min-height: 100px;
|
||||
overflow-y: auto;
|
||||
}
|
||||
|
||||
.log-entry {
|
||||
padding: 0.5rem;
|
||||
border-radius: 4px;
|
||||
margin-bottom: 0.25rem;
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
}
|
||||
|
||||
.log-entry.success {
|
||||
background: rgba(34, 197, 94, 0.2);
|
||||
border-left: 3px solid var(--success);
|
||||
}
|
||||
|
||||
.log-entry.error {
|
||||
background: rgba(239, 68, 68, 0.2);
|
||||
border-left: 3px solid var(--danger);
|
||||
}
|
||||
|
||||
.log-entry.info {
|
||||
background: rgba(59, 130, 246, 0.2);
|
||||
border-left: 3px solid var(--primary);
|
||||
}
|
||||
|
||||
.empty-state {
|
||||
color: var(--text-secondary);
|
||||
text-align: center;
|
||||
padding: 1rem;
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
/* ============ Modals ============ */
|
||||
.modal {
|
||||
position: fixed;
|
||||
top: 0;
|
||||
left: 0;
|
||||
right: 0;
|
||||
bottom: 0;
|
||||
background: rgba(0, 0, 0, 0.7);
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
z-index: 1000;
|
||||
}
|
||||
|
||||
.modal-content {
|
||||
background: var(--bg-secondary);
|
||||
border-radius: var(--radius);
|
||||
width: 90%;
|
||||
max-width: 500px;
|
||||
max-height: 90vh;
|
||||
overflow-y: auto;
|
||||
}
|
||||
|
||||
.modal-large {
|
||||
max-width: 700px;
|
||||
}
|
||||
|
||||
.modal-header {
|
||||
padding: 1rem;
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
|
||||
.modal-header h3 {
|
||||
font-size: 1.125rem;
|
||||
}
|
||||
|
||||
.modal-close {
|
||||
background: none;
|
||||
border: none;
|
||||
color: var(--text-secondary);
|
||||
font-size: 1.5rem;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
.modal-body {
|
||||
padding: 1rem;
|
||||
}
|
||||
|
||||
.modal-footer {
|
||||
padding: 1rem;
|
||||
display: flex;
|
||||
justify-content: flex-end;
|
||||
gap: 0.5rem;
|
||||
border-top: 1px solid var(--border);
|
||||
}
|
||||
|
||||
/* ============ Test Result ============ */
|
||||
.test-result {
|
||||
margin-top: 0.5rem;
|
||||
padding: 0.75rem;
|
||||
border-radius: var(--radius);
|
||||
background: var(--bg-tertiary);
|
||||
font-family: monospace;
|
||||
font-size: 0.8rem;
|
||||
white-space: pre-wrap;
|
||||
}
|
||||
|
||||
.test-result.success {
|
||||
border-left: 3px solid var(--success);
|
||||
}
|
||||
|
||||
.test-result.error {
|
||||
border-left: 3px solid var(--danger);
|
||||
}
|
||||
|
||||
/* ============ Status Badges ============ */
|
||||
.badge {
|
||||
display: inline-block;
|
||||
padding: 0.125rem 0.5rem;
|
||||
border-radius: 4px;
|
||||
font-size: 0.7rem;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.badge-success { background: var(--success); }
|
||||
.badge-warning { background: var(--warning); color: #000; }
|
||||
.badge-danger { background: var(--danger); }
|
||||
.badge-info { background: var(--primary); }
|
||||
|
||||
/* ============ Loading Overlay ============ */
|
||||
.loading-overlay {
|
||||
position: fixed;
|
||||
top: 0;
|
||||
left: 0;
|
||||
right: 0;
|
||||
bottom: 0;
|
||||
background: rgba(0, 0, 0, 0.7);
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
z-index: 2000;
|
||||
}
|
||||
|
||||
.spinner {
|
||||
width: 50px;
|
||||
height: 50px;
|
||||
border: 4px solid var(--border);
|
||||
border-top-color: var(--primary);
|
||||
border-radius: 50%;
|
||||
animation: spin 1s linear infinite;
|
||||
}
|
||||
|
||||
@keyframes spin {
|
||||
to { transform: rotate(360deg); }
|
||||
}
|
||||
|
||||
.loading-text {
|
||||
margin-top: 1rem;
|
||||
color: var(--text);
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
.progress-bar {
|
||||
width: 200px;
|
||||
height: 6px;
|
||||
background: var(--bg-tertiary);
|
||||
border-radius: 3px;
|
||||
margin-top: 1rem;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.progress-bar-fill {
|
||||
height: 100%;
|
||||
background: var(--primary);
|
||||
transition: width 0.3s ease;
|
||||
}
|
||||
|
||||
/* ============ File Browser ============ */
|
||||
.file-browser {
|
||||
max-height: 300px;
|
||||
overflow-y: auto;
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
margin-bottom: 1rem;
|
||||
}
|
||||
|
||||
.file-browser-path {
|
||||
padding: 0.75rem;
|
||||
background: var(--bg-tertiary);
|
||||
border-bottom: 1px solid var(--border);
|
||||
font-family: monospace;
|
||||
font-size: 0.8rem;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.file-browser-list {
|
||||
list-style: none;
|
||||
}
|
||||
|
||||
.file-browser-item {
|
||||
padding: 0.5rem 1rem;
|
||||
cursor: pointer;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
|
||||
.file-browser-item:hover {
|
||||
background: var(--bg-tertiary);
|
||||
}
|
||||
|
||||
.file-browser-item.selected {
|
||||
background: var(--primary);
|
||||
}
|
||||
|
||||
.file-browser-item:last-child {
|
||||
border-bottom: none;
|
||||
}
|
||||
|
||||
.file-icon {
|
||||
font-size: 1rem;
|
||||
}
|
||||
|
||||
/* ============ Checkbox Group ============ */
|
||||
.checkbox-group {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 0.5rem;
|
||||
margin-top: 0.5rem;
|
||||
}
|
||||
|
||||
.checkbox-item {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.25rem;
|
||||
padding: 0.25rem 0.5rem;
|
||||
background: var(--bg-tertiary);
|
||||
border-radius: 4px;
|
||||
font-size: 0.75rem;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
.checkbox-item input {
|
||||
width: auto;
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
.checkbox-item:has(input:checked) {
|
||||
background: var(--primary);
|
||||
}
|
||||
|
||||
/* ============ Input with Button ============ */
|
||||
.input-with-btn {
|
||||
display: flex;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.input-with-btn input {
|
||||
flex: 1;
|
||||
}
|
||||
|
||||
/* ============ Utilities ============ */
|
||||
.hidden {
|
||||
display: none !important;
|
||||
}
|
||||
|
||||
/* ============ Scrollbar ============ */
|
||||
::-webkit-scrollbar {
|
||||
width: 8px;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-track {
|
||||
background: var(--bg);
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-thumb {
|
||||
background: var(--border);
|
||||
border-radius: 4px;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-thumb:hover {
|
||||
background: var(--text-secondary);
|
||||
}
|
||||
|
|
@@ -1,693 +0,0 @@
|
|||
/**
|
||||
* Dateiverwaltung Frontend
|
||||
* Two separate areas: mail retrieval and file sorting
|
||||
*/
|
||||
|
||||
// ============ API ============
|
||||
|
||||
async function api(endpoint, options = {}) {
    // Spread options first so a caller-supplied `headers` object is merged
    // with the default Content-Type instead of replacing it.
    const response = await fetch(`/api${endpoint}`, {
        ...options,
        headers: { 'Content-Type': 'application/json', ...options.headers }
    });
    if (!response.ok) {
        const error = await response.json().catch(() => ({}));
        throw new Error(error.detail || 'API Fehler');
    }
    return response.json();
}
|
||||
|
||||
// ============ Loading Overlay ============
|
||||
|
||||
function zeigeLoading(text = 'Wird geladen...') {
|
||||
document.getElementById('loading-text').textContent = text;
|
||||
document.getElementById('loading-overlay').classList.remove('hidden');
|
||||
}
|
||||
|
||||
function versteckeLoading() {
|
||||
document.getElementById('loading-overlay').classList.add('hidden');
|
||||
}
|
||||
|
||||
// ============ File Browser ============
|
||||
|
||||
let browserTargetInput = null;
|
||||
let browserCurrentPath = '/srv/http/dateiverwaltung/data';
|
||||
|
||||
function oeffneBrowser(inputId) {
|
||||
browserTargetInput = inputId;
|
||||
const currentValue = document.getElementById(inputId).value;
|
||||
browserCurrentPath = currentValue || '/srv/http/dateiverwaltung/data';
|
||||
ladeBrowserInhalt(browserCurrentPath);
|
||||
document.getElementById('browser-modal').classList.remove('hidden');
|
||||
}
|
||||
|
||||
async function ladeBrowserInhalt(path) {
|
||||
try {
|
||||
const data = await api(`/browse?path=${encodeURIComponent(path)}`);
|
||||
|
||||
if (data.error) {
|
||||
document.getElementById('browser-list').innerHTML =
|
||||
`<li class="file-browser-item" style="color: var(--danger);">${escapeHtml(data.error)}</li>`;
|
||||
return;
|
||||
}
|
||||
|
||||
browserCurrentPath = data.current;
|
||||
document.getElementById('browser-current-path').textContent = data.current;
|
||||
|
||||
let html = '';
|
||||
|
||||
// Parent directory
|
||||
if (data.parent) {
|
||||
html += `<li class="file-browser-item" onclick="ladeBrowserInhalt('${data.parent}')">
|
||||
<span class="file-icon">📁</span> ..
|
||||
</li>`;
|
||||
}
|
||||
|
||||
// Directories
|
||||
for (const entry of data.entries) {
|
||||
html += `<li class="file-browser-item" ondblclick="ladeBrowserInhalt('${entry.path}')" onclick="browserSelect(this, '${entry.path}')">
|
||||
<span class="file-icon">📁</span> ${escapeHtml(entry.name)}
|
||||
</li>`;
|
||||
}
|
||||
|
||||
if (data.entries.length === 0 && !data.parent) {
|
||||
html = '<li class="file-browser-item">Keine Unterordner</li>';
|
||||
}
|
||||
|
||||
document.getElementById('browser-list').innerHTML = html;
|
||||
} catch (error) {
|
||||
document.getElementById('browser-list').innerHTML =
|
||||
`<li class="file-browser-item" style="color: var(--danger);">Fehler: ${escapeHtml(error.message)}</li>`;
|
||||
}
|
||||
}
|
||||
|
||||
function browserSelect(element, path) {
|
||||
document.querySelectorAll('.file-browser-item.selected').forEach(el => el.classList.remove('selected'));
|
||||
element.classList.add('selected');
|
||||
browserCurrentPath = path;
|
||||
}
|
||||
|
||||
function browserAuswahl() {
|
||||
if (browserTargetInput && browserCurrentPath) {
|
||||
document.getElementById(browserTargetInput).value = browserCurrentPath + '/';
|
||||
}
|
||||
schliesseModal('browser-modal');
|
||||
}
|
||||
|
||||
// ============ Checkbox Helpers ============
|
||||
|
||||
function getCheckedTypes(groupId) {
|
||||
const checkboxes = document.querySelectorAll(`#${groupId} input[type="checkbox"]:checked`);
|
||||
return Array.from(checkboxes).map(cb => cb.value);
|
||||
}
|
||||
|
||||
function setCheckedTypes(groupId, types) {
|
||||
const checkboxes = document.querySelectorAll(`#${groupId} input[type="checkbox"]`);
|
||||
checkboxes.forEach(cb => {
|
||||
cb.checked = types.includes(cb.value);
|
||||
});
|
||||
}
|
||||
|
||||
// ============ Init ============
|
||||
|
||||
document.addEventListener('DOMContentLoaded', () => {
|
||||
ladePostfaecher();
|
||||
ladeOrdner();
|
||||
ladeRegeln();
|
||||
});
|
||||
|
||||
// ============ SECTION 1: Mail retrieval ============
|
||||
|
||||
async function ladePostfaecher() {
|
||||
try {
|
||||
const postfaecher = await api('/postfaecher');
|
||||
renderPostfaecher(postfaecher);
|
||||
} catch (error) {
|
||||
console.error('Fehler:', error);
|
||||
}
|
||||
}
|
||||
|
||||
let bearbeitetesPostfachId = null;
|
||||
|
||||
function renderPostfaecher(postfaecher) {
|
||||
const container = document.getElementById('postfaecher-liste');
|
||||
|
||||
if (!postfaecher || postfaecher.length === 0) {
|
||||
container.innerHTML = '<p class="empty-state">Keine Postfächer konfiguriert</p>';
|
||||
return;
|
||||
}
|
||||
|
||||
container.innerHTML = postfaecher.map(p => `
|
||||
<div class="config-item">
|
||||
<div class="config-item-info">
|
||||
<h4>${escapeHtml(p.name)}</h4>
|
||||
<small>${escapeHtml(p.email)} → ${escapeHtml(p.ziel_ordner)}</small>
|
||||
</div>
|
||||
<div class="config-item-actions">
|
||||
<button class="btn btn-sm" onclick="postfachAbrufen(${p.id})">Abrufen</button>
|
||||
<button class="btn btn-sm" onclick="postfachBearbeiten(${p.id})">Bearbeiten</button>
|
||||
<button class="btn btn-sm" onclick="postfachTesten(${p.id})">Testen</button>
|
||||
<button class="btn btn-sm btn-danger" onclick="postfachLoeschen(${p.id})">×</button>
|
||||
</div>
|
||||
</div>
|
||||
`).join('');
|
||||
}
|
||||
|
||||
function zeigePostfachModal(postfach = null) {
|
||||
bearbeitetesPostfachId = postfach?.id || null;
|
||||
|
||||
document.getElementById('pf-name').value = postfach?.name || '';
|
||||
document.getElementById('pf-server').value = postfach?.imap_server || '';
|
||||
document.getElementById('pf-port').value = postfach?.imap_port || '993';
|
||||
document.getElementById('pf-email').value = postfach?.email || '';
|
||||
document.getElementById('pf-passwort').value = ''; // Do not prefill the password
|
||||
document.getElementById('pf-ordner').value = postfach?.ordner || 'INBOX';
|
||||
document.getElementById('pf-alle-ordner').value = postfach?.alle_ordner ? 'true' : 'false';
|
||||
document.getElementById('pf-ziel').value = postfach?.ziel_ordner || '/srv/http/dateiverwaltung/data/inbox/';
|
||||
setCheckedTypes('pf-typen-gruppe', postfach?.erlaubte_typen || ['.pdf']);
|
||||
document.getElementById('pf-max-groesse').value = postfach?.max_groesse_mb || '25';
|
||||
|
||||
document.getElementById('postfach-modal').classList.remove('hidden');
|
||||
}
|
||||
|
||||
async function postfachBearbeiten(id) {
|
||||
try {
|
||||
const postfaecher = await api('/postfaecher');
|
||||
const postfach = postfaecher.find(p => p.id === id);
|
||||
if (postfach) {
|
||||
zeigePostfachModal(postfach);
|
||||
}
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
async function speicherePostfach() {
|
||||
const erlaubteTypen = getCheckedTypes('pf-typen-gruppe');
|
||||
if (erlaubteTypen.length === 0) {
|
||||
alert('Bitte mindestens einen Dateityp auswählen');
|
||||
return;
|
||||
}
|
||||
|
||||
const data = {
|
||||
name: document.getElementById('pf-name').value.trim(),
|
||||
imap_server: document.getElementById('pf-server').value.trim(),
|
||||
imap_port: parseInt(document.getElementById('pf-port').value, 10),
|
||||
email: document.getElementById('pf-email').value.trim(),
|
||||
passwort: document.getElementById('pf-passwort').value,
|
||||
ordner: document.getElementById('pf-ordner').value.trim(),
|
||||
alle_ordner: document.getElementById('pf-alle-ordner').value === 'true',
|
||||
ziel_ordner: document.getElementById('pf-ziel').value.trim(),
|
||||
erlaubte_typen: erlaubteTypen,
|
||||
max_groesse_mb: parseInt(document.getElementById('pf-max-groesse').value, 10)
|
||||
};
|
||||
|
||||
if (!data.name || !data.imap_server || !data.email || !data.ziel_ordner) {
|
||||
alert('Bitte alle Pflichtfelder ausfüllen');
|
||||
return;
|
||||
}
|
||||
|
||||
// When editing: send the password only if one was entered
|
||||
if (bearbeitetesPostfachId && !data.passwort) {
|
||||
delete data.passwort;
|
||||
} else if (!data.passwort) {
|
||||
alert('Passwort ist erforderlich');
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
if (bearbeitetesPostfachId) {
|
||||
await api(`/postfaecher/${bearbeitetesPostfachId}`, { method: 'PUT', body: JSON.stringify(data) });
|
||||
} else {
|
||||
await api('/postfaecher', { method: 'POST', body: JSON.stringify(data) });
|
||||
}
|
||||
schliesseModal('postfach-modal');
|
||||
ladePostfaecher();
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
async function postfachTesten(id) {
|
||||
try {
|
||||
const result = await api(`/postfaecher/${id}/test`, { method: 'POST' });
|
||||
alert(result.erfolg ? 'Verbindung erfolgreich!' : 'Fehler: ' + result.nachricht);
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
async function postfachAbrufen(id) {
|
||||
const logContainer = document.getElementById('abruf-log');
|
||||
logContainer.innerHTML = '<div class="log-entry info"><span>Verbinde...</span></div>';
|
||||
|
||||
// EventSource for Server-Sent Events
|
||||
const eventSource = new EventSource(`/api/postfaecher/${id}/abrufen/stream`);
|
||||
let dateiCount = 0;
|
||||
let currentOrdner = '';
|
||||
|
||||
eventSource.onmessage = (event) => {
|
||||
const data = JSON.parse(event.data);
|
||||
|
||||
switch (data.type) {
|
||||
case 'start':
|
||||
logContainer.innerHTML = `<div class="log-entry info">
|
||||
<span>Starte Abruf: ${escapeHtml(data.postfach)}</span>
|
||||
<small>${data.bereits_verarbeitet} bereits verarbeitet</small>
|
||||
</div>`;
|
||||
break;
|
||||
|
||||
case 'info':
|
||||
logContainer.innerHTML += `<div class="log-entry info">
|
||||
<span>${escapeHtml(data.nachricht)}</span>
|
||||
</div>`;
|
||||
break;
|
||||
|
||||
case 'ordner':
|
||||
currentOrdner = data.name;
|
||||
logContainer.innerHTML += `<div class="log-entry info" id="ordner-status">
|
||||
<span>📁 ${escapeHtml(data.name)}</span>
|
||||
</div>`;
|
||||
break;
|
||||
|
||||
case 'mails': {
// The braces scope the const declaration to this case.
const ordnerStatus = document.getElementById('ordner-status');
if (ordnerStatus) {
ordnerStatus.innerHTML = `<span>📁 ${escapeHtml(data.ordner)}: ${data.anzahl} Mails</span>`;
ordnerStatus.id = ''; // Clear the ID so the next folder gets its own status element
}
break;
}
|
||||
|
||||
case 'datei':
|
||||
dateiCount++;
|
||||
logContainer.innerHTML += `<div class="log-entry success">
|
||||
<span>✓ ${escapeHtml(data.original_name)}</span>
|
||||
<small>${formatBytes(data.groesse)}</small>
|
||||
</div>`;
|
||||
// Scroll to the bottom
|
||||
logContainer.scrollTop = logContainer.scrollHeight;
|
||||
break;
|
||||
|
||||
case 'skip':
|
||||
logContainer.innerHTML += `<div class="log-entry" style="opacity:0.6;">
|
||||
<span>⊘ ${escapeHtml(data.datei)}: ${escapeHtml(data.grund)}</span>
|
||||
</div>`;
|
||||
break;
|
||||
|
||||
case 'fehler':
|
||||
logContainer.innerHTML += `<div class="log-entry error">
|
||||
<span>✗ ${escapeHtml(data.nachricht)}</span>
|
||||
</div>`;
|
||||
break;
|
||||
|
||||
case 'fertig':
|
||||
logContainer.innerHTML += `<div class="log-entry success" style="font-weight:bold;">
|
||||
<span>✓ Fertig: ${data.anzahl} Dateien gespeichert</span>
|
||||
</div>`;
|
||||
eventSource.close();
|
||||
ladePostfaecher();
|
||||
break;
|
||||
}
|
||||
};
|
||||
|
||||
eventSource.onerror = (error) => {
|
||||
logContainer.innerHTML += `<div class="log-entry error">
|
||||
<span>✗ Verbindung unterbrochen</span>
|
||||
</div>`;
|
||||
eventSource.close();
|
||||
};
|
||||
}
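// A minimal sketch, not part of the original file: the event types handled in
// postfachAbrufen() above could be mapped to log entries by a pure function,
// which is easier to test than string concatenation into innerHTML. The name
// logEintrag and the returned { klasse, text } shape are assumptions made for
// illustration; the field names (postfach, nachricht, name, ordner, anzahl,
// original_name, datei, grund) are taken from the switch statement above.

```javascript
// Hypothetical helper: maps one parsed stream event to a CSS class and a plain-text label.
function logEintrag(data) {
    switch (data.type) {
        case 'start':  return { klasse: 'info', text: `Starte Abruf: ${data.postfach}` };
        case 'info':   return { klasse: 'info', text: data.nachricht };
        case 'ordner': return { klasse: 'info', text: data.name };
        case 'mails':  return { klasse: 'info', text: `${data.ordner}: ${data.anzahl} Mails` };
        case 'datei':  return { klasse: 'success', text: data.original_name };
        case 'skip':   return { klasse: 'skip', text: `${data.datei}: ${data.grund}` };
        case 'fehler': return { klasse: 'error', text: data.nachricht };
        case 'fertig': return { klasse: 'success', text: `Fertig: ${data.anzahl} Dateien gespeichert` };
        default:       return null; // unknown event types are ignored
    }
}

// Usage sketch with a payload shaped like the 'mails' event above.
const eintrag = logEintrag(JSON.parse('{"type":"mails","ordner":"INBOX","anzahl":3}'));
console.log(eintrag);
```

// A caller would then create a div, set its className from klasse and assign
// text via textContent, so no manual HTML escaping is needed.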
|
||||
|
||||
function formatBytes(bytes) {
|
||||
if (bytes < 1024) return bytes + ' B';
|
||||
if (bytes < 1024 * 1024) return (bytes / 1024).toFixed(1) + ' KB';
|
||||
return (bytes / (1024 * 1024)).toFixed(1) + ' MB';
|
||||
}
|
||||
|
||||
async function allePostfaecherAbrufen() {
|
||||
try {
|
||||
zeigeLoading('Rufe alle Postfächer ab...');
|
||||
const result = await api('/postfaecher/abrufen-alle', { method: 'POST' });
|
||||
zeigeAbrufLog(result);
|
||||
ladePostfaecher();
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
} finally {
|
||||
versteckeLoading();
|
||||
}
|
||||
}
|
||||
|
||||
async function postfachLoeschen(id) {
|
||||
if (!confirm('Postfach wirklich löschen?')) return;
|
||||
try {
|
||||
await api(`/postfaecher/${id}`, { method: 'DELETE' });
|
||||
ladePostfaecher();
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
function zeigeAbrufLog(result) {
|
||||
const container = document.getElementById('abruf-log');
|
||||
|
||||
if (!result.ergebnisse || result.ergebnisse.length === 0) {
|
||||
container.innerHTML = '<p class="empty-state">Keine neuen Attachments gefunden</p>';
|
||||
return;
|
||||
}
|
||||
|
||||
let html = '';
|
||||
for (const r of result.ergebnisse) {
|
||||
const status = r.fehler ? 'error' : 'success';
|
||||
const icon = r.fehler ? '✗' : '✓';
|
||||
html += `<div class="log-entry ${status}">
|
||||
<span>${icon} ${escapeHtml(r.postfach)}: ${r.anzahl || 0} Dateien</span>
|
||||
${r.fehler ? `<small>${escapeHtml(r.fehler)}</small>` : ''}
|
||||
</div>`;
|
||||
|
||||
if (r.dateien) {
|
||||
for (const d of r.dateien) {
|
||||
html += `<div class="log-entry info">
|
||||
<span style="padding-left: 1rem;">→ ${escapeHtml(d)}</span>
|
||||
</div>`;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
container.innerHTML = html;
|
||||
}
|
||||
|
||||
// ============ SECTION 2: File sorting ============
|
||||
|
||||
async function ladeOrdner() {
|
||||
try {
|
||||
const ordner = await api('/ordner');
|
||||
renderOrdner(ordner);
|
||||
} catch (error) {
|
||||
console.error('Fehler:', error);
|
||||
}
|
||||
}
|
||||
|
||||
function renderOrdner(ordner) {
|
||||
const container = document.getElementById('ordner-liste');
|
||||
|
||||
if (!ordner || ordner.length === 0) {
|
||||
container.innerHTML = '<p class="empty-state">Keine Ordner konfiguriert</p>';
|
||||
return;
|
||||
}
|
||||
|
||||
container.innerHTML = ordner.map(o => `
|
||||
<div class="config-item">
|
||||
<div class="config-item-info">
|
||||
<h4>${escapeHtml(o.name)} ${o.rekursiv ? '<span class="badge badge-info">rekursiv</span>' : ''}</h4>
|
||||
<small>${escapeHtml(o.pfad)} → ${escapeHtml(o.ziel_ordner)}</small>
|
||||
<small style="display:block;">${escapeHtml((o.dateitypen || []).join(', '))}</small>
|
||||
</div>
|
||||
<div class="config-item-actions">
|
||||
<button class="btn btn-sm" onclick="ordnerScannen(${o.id})">Scannen</button>
|
||||
<button class="btn btn-sm btn-danger" onclick="ordnerLoeschen(${o.id})">×</button>
|
||||
</div>
|
||||
</div>
|
||||
`).join('');
|
||||
}
|
||||
|
||||
function zeigeOrdnerModal() {
|
||||
document.getElementById('ord-name').value = '';
|
||||
document.getElementById('ord-pfad').value = '/srv/http/dateiverwaltung/data/inbox/';
|
||||
document.getElementById('ord-ziel').value = '/srv/http/dateiverwaltung/data/archiv/';
|
||||
setCheckedTypes('ord-typen-gruppe', ['.pdf', '.jpg', '.jpeg', '.png', '.tiff']);
|
||||
document.getElementById('ord-rekursiv').value = 'true';
|
||||
document.getElementById('ordner-modal').classList.remove('hidden');
|
||||
}
|
||||
|
||||
async function speichereOrdner() {
|
||||
const dateitypen = getCheckedTypes('ord-typen-gruppe');
|
||||
if (dateitypen.length === 0) {
|
||||
alert('Bitte mindestens einen Dateityp auswählen');
|
||||
return;
|
||||
}
|
||||
|
||||
const data = {
|
||||
name: document.getElementById('ord-name').value.trim(),
|
||||
pfad: document.getElementById('ord-pfad').value.trim(),
|
||||
ziel_ordner: document.getElementById('ord-ziel').value.trim(),
|
||||
rekursiv: document.getElementById('ord-rekursiv').value === 'true',
|
||||
dateitypen: dateitypen
|
||||
};
|
||||
|
||||
if (!data.name || !data.pfad || !data.ziel_ordner) {
|
||||
alert('Bitte alle Felder ausfüllen');
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
zeigeLoading('Speichere Ordner...');
|
||||
await api('/ordner', { method: 'POST', body: JSON.stringify(data) });
|
||||
schliesseModal('ordner-modal');
|
||||
ladeOrdner();
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
} finally {
|
||||
versteckeLoading();
|
||||
}
|
||||
}
|
||||
|
||||
async function ordnerLoeschen(id) {
|
||||
if (!confirm('Ordner wirklich löschen?')) return;
|
||||
try {
|
||||
await api(`/ordner/${id}`, { method: 'DELETE' });
|
||||
ladeOrdner();
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
async function ordnerScannen(id) {
|
||||
try {
|
||||
const result = await api(`/ordner/${id}/scannen`);
|
||||
alert(`${result.anzahl} Dateien im Ordner gefunden`);
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
// ============ Rules ============
|
||||
|
||||
let editierteRegelId = null;
|
||||
|
||||
async function ladeRegeln() {
|
||||
try {
|
||||
const regeln = await api('/regeln');
|
||||
renderRegeln(regeln);
|
||||
} catch (error) {
|
||||
console.error('Fehler:', error);
|
||||
}
|
||||
}
|
||||
|
||||
function renderRegeln(regeln) {
|
||||
const container = document.getElementById('regeln-liste');
|
||||
|
||||
if (!regeln || regeln.length === 0) {
|
||||
container.innerHTML = '<p class="empty-state">Keine Regeln definiert</p>';
|
||||
return;
|
||||
}
|
||||
|
||||
container.innerHTML = regeln.map(r => `
|
||||
<div class="config-item">
|
||||
<div class="config-item-info">
|
||||
<h4>${escapeHtml(r.name)} <span class="badge badge-info">Prio ${r.prioritaet}</span></h4>
|
||||
<small>${escapeHtml(r.schema)}</small>
|
||||
</div>
|
||||
<div class="config-item-actions">
|
||||
<button class="btn btn-sm" onclick="bearbeiteRegel(${r.id})">Bearbeiten</button>
|
||||
<button class="btn btn-sm btn-danger" onclick="regelLoeschen(${r.id})">×</button>
|
||||
</div>
|
||||
</div>
|
||||
`).join('');
|
||||
}
|
||||
|
||||
function zeigeRegelModal(regel = null) {
|
||||
editierteRegelId = regel?.id || null;
|
||||
document.getElementById('regel-modal-title').textContent = regel ? 'Regel bearbeiten' : 'Regel hinzufügen';
|
||||
|
||||
document.getElementById('regel-name').value = regel?.name || '';
|
||||
document.getElementById('regel-prioritaet').value = regel?.prioritaet || 100;
|
||||
document.getElementById('regel-muster').value = JSON.stringify(regel?.muster || {"text_match_any": [], "text_match": []}, null, 2);
|
||||
document.getElementById('regel-extraktion').value = JSON.stringify(regel?.extraktion || {}, null, 2);
|
||||
document.getElementById('regel-schema').value = regel?.schema || '{datum} - Dokument.pdf';
|
||||
document.getElementById('regel-unterordner').value = regel?.unterordner || '';
|
||||
document.getElementById('regel-test-text').value = '';
|
||||
document.getElementById('regel-test-ergebnis').classList.add('hidden');
|
||||
|
||||
document.getElementById('regel-modal').classList.remove('hidden');
|
||||
}
|
||||
|
||||
async function bearbeiteRegel(id) {
|
||||
try {
|
||||
const regeln = await api('/regeln');
|
||||
const regel = regeln.find(r => r.id === id);
|
||||
if (regel) zeigeRegelModal(regel);
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
async function speichereRegel() {
|
||||
let muster, extraktion;
|
||||
|
||||
try {
|
||||
muster = JSON.parse(document.getElementById('regel-muster').value);
|
||||
} catch (e) {
|
||||
alert('Ungültiges JSON im Muster-Feld');
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
extraktion = JSON.parse(document.getElementById('regel-extraktion').value);
|
||||
} catch (e) {
|
||||
alert('Ungültiges JSON im Extraktion-Feld');
|
||||
return;
|
||||
}
|
||||
|
||||
const data = {
|
||||
name: document.getElementById('regel-name').value.trim(),
|
||||
prioritaet: parseInt(document.getElementById('regel-prioritaet').value, 10),
|
||||
muster,
|
||||
extraktion,
|
||||
schema: document.getElementById('regel-schema').value.trim(),
|
||||
unterordner: document.getElementById('regel-unterordner').value.trim() || null
|
||||
};
|
||||
|
||||
if (!data.name) {
|
||||
alert('Bitte einen Namen eingeben');
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
if (editierteRegelId) {
|
||||
await api(`/regeln/${editierteRegelId}`, { method: 'PUT', body: JSON.stringify(data) });
|
||||
} else {
|
||||
await api('/regeln', { method: 'POST', body: JSON.stringify(data) });
|
||||
}
|
||||
schliesseModal('regel-modal');
|
||||
ladeRegeln();
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
async function regelLoeschen(id) {
|
||||
if (!confirm('Regel wirklich löschen?')) return;
|
||||
try {
|
||||
await api(`/regeln/${id}`, { method: 'DELETE' });
|
||||
ladeRegeln();
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
|
||||
|
||||
async function testeRegel() {
|
||||
const text = document.getElementById('regel-test-text').value;
|
||||
if (!text) {
|
||||
alert('Bitte Testtext eingeben');
|
||||
return;
|
||||
}
|
||||
|
||||
let muster, extraktion;
|
||||
try {
|
||||
muster = JSON.parse(document.getElementById('regel-muster').value);
|
||||
extraktion = JSON.parse(document.getElementById('regel-extraktion').value);
|
||||
} catch (e) {
|
||||
alert('Ungültiges JSON');
|
||||
return;
|
||||
}
|
||||
|
||||
const regel = {
|
||||
name: 'Test',
|
||||
muster,
|
||||
extraktion,
|
||||
schema: document.getElementById('regel-schema').value.trim()
|
||||
};
|
||||
|
||||
try {
|
||||
const result = await api('/regeln/test', {
|
||||
method: 'POST',
|
||||
body: JSON.stringify({ regel, text })
|
||||
});
|
||||
|
||||
const container = document.getElementById('regel-test-ergebnis');
|
||||
container.classList.remove('hidden', 'success', 'error');
|
||||
|
||||
if (result.passt) {
|
||||
container.classList.add('success');
|
||||
container.textContent = `✓ Regel passt!\n\nExtrahiert:\n${JSON.stringify(result.extrahiert, null, 2)}\n\nDateiname:\n${result.dateiname}`;
|
||||
} else {
|
||||
container.classList.add('error');
|
||||
container.textContent = '✗ Regel passt nicht';
|
||||
}
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
}
|
||||
}
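// A minimal sketch, not the backend implementation: testeRegel() above sends
// the rule to /regeln/test, and the rule form's hint text describes the
// pattern semantics as "text_match_any: at least one, text_match: all must
// match". The function below illustrates that reading on the client side.
// The name regelPasst, the case-insensitive substring comparison, and the
// treatment of an empty text_match_any list as "no constraint" are
// assumptions; the server may behave differently.

```javascript
// Hypothetical matcher for a pattern object of the form
// { text_match_any: [...], text_match: [...] }.
function regelPasst(muster, text) {
    const haystack = String(text).toLowerCase();
    const enthaelt = begriff => haystack.includes(String(begriff).toLowerCase());

    const irgendeiner = muster.text_match_any || [];
    const alle = muster.text_match || [];

    // At least one term of text_match_any must occur (if any are given) ...
    const anyOk = irgendeiner.length === 0 || irgendeiner.some(enthaelt);
    // ... and every term of text_match must occur.
    const allOk = alle.every(enthaelt);
    return anyOk && allOk;
}

// Usage sketch with the default pattern from the rule form.
const beispielMuster = { text_match_any: ['sonepar'], text_match: ['rechnung'] };
console.log(regelPasst(beispielMuster, 'Sonepar Deutschland GmbH, Rechnung Nr. 4711'));
console.log(regelPasst(beispielMuster, 'Sonepar Lieferschein'));
```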
|
||||
|
||||
// ============ Start sorting ============
|
||||
|
||||
async function sortierungStarten() {
|
||||
try {
|
||||
zeigeLoading('Sortiere Dateien...');
|
||||
const result = await api('/sortierung/starten', { method: 'POST' });
|
||||
zeigeSortierungLog(result);
|
||||
} catch (error) {
|
||||
alert('Fehler: ' + error.message);
|
||||
} finally {
|
||||
versteckeLoading();
|
||||
}
|
||||
}
|
||||
|
||||
function zeigeSortierungLog(result) {
|
||||
const container = document.getElementById('sortierung-log');
|
||||
|
||||
if (!result.verarbeitet || result.verarbeitet.length === 0) {
|
||||
container.innerHTML = '<p class="empty-state">Keine Dateien verarbeitet</p>';
|
||||
return;
|
||||
}
|
||||
|
||||
let html = `<div class="log-entry info">
|
||||
<span>Gesamt: ${result.gesamt} | Sortiert: ${result.sortiert} | ZUGFeRD: ${result.zugferd} | Fehler: ${result.fehler}</span>
|
||||
</div>`;
|
||||
|
||||
for (const d of result.verarbeitet) {
|
||||
const status = d.fehler ? 'error' : (d.zugferd ? 'info' : 'success');
|
||||
const icon = d.fehler ? '✗' : (d.zugferd ? '🧾' : '✓');
|
||||
html += `<div class="log-entry ${status}">
|
||||
<span>${icon} ${escapeHtml(d.neuer_name || d.original)}</span>
|
||||
${d.fehler ? `<small>${escapeHtml(d.fehler)}</small>` : ''}
|
||||
</div>`;
|
||||
}
|
||||
|
||||
container.innerHTML = html;
|
||||
}
|
||||
|
||||
// ============ Utilities ============
|
||||
|
||||
function schliesseModal(id) {
|
||||
document.getElementById(id).classList.add('hidden');
|
||||
}
|
||||
|
||||
function escapeHtml(text) {
|
||||
if (!text) return '';
|
||||
const div = document.createElement('div');
|
||||
div.textContent = text;
|
||||
return div.innerHTML;
|
||||
}
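// Note and sketch, not part of the original file: escapeHtml() above relies
// on textContent/innerHTML, which escapes &, < and > but leaves single and
// double quotes untouched. That is sufficient for element content, but not
// for values placed inside quoted attributes. The helper below is a
// hypothetical addition (the name escapeAttr is an assumption) that also
// escapes both quote characters and works without a DOM.

```javascript
// Hypothetical helper: escapes the five characters that are significant
// in HTML text and in quoted attribute values.
function escapeAttr(text) {
    if (text === null || text === undefined) return '';
    const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
    return String(text).replace(/[&<>"']/g, ch => map[ch]);
}

// Usage sketch: building an attribute value from untrusted input.
const titelAttribut = `title="${escapeAttr('O\'Neil "docs" <a&b>')}"`;
console.log(titelAttribut);
```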
|
||||
|
||||
document.addEventListener('click', (e) => {
|
||||
if (e.target.classList.contains('modal')) {
|
||||
e.target.classList.add('hidden');
|
||||
}
|
||||
});
|
||||
|
||||
document.addEventListener('keydown', (e) => {
|
||||
if (e.key === 'Escape') {
|
||||
document.querySelectorAll('.modal:not(.hidden)').forEach(m => m.classList.add('hidden'));
|
||||
}
|
||||
});
|
||||
|
|
@@ -1,366 +0,0 @@
|
|||
<!DOCTYPE html>
|
||||
<html lang="de">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>Dateiverwaltung</title>
|
||||
<link rel="stylesheet" href="/static/css/style.css">
|
||||
</head>
|
||||
<body>
|
||||
<div id="app">
|
||||
<!-- Header -->
|
||||
<header class="header">
|
||||
<div class="header-left">
|
||||
<h1>Dateiverwaltung</h1>
|
||||
</div>
|
||||
<div class="header-right">
|
||||
<span id="status-indicator"></span>
|
||||
</div>
|
||||
</header>
|
||||
|
||||
<!-- Main Content -->
|
||||
<div class="main-container">
|
||||
<!-- Section 1: Mail retrieval -->
|
||||
<section class="bereich">
|
||||
<div class="bereich-header">
|
||||
<h2>📧 Mail-Abruf</h2>
|
||||
<p class="bereich-desc">Attachments aus Postfächern in Ordner speichern</p>
|
||||
</div>
|
||||
|
||||
<div class="bereich-content">
|
||||
<!-- Mailbox list -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Postfächer</h3>
|
||||
<button class="btn btn-sm btn-primary" onclick="zeigePostfachModal()">+ Hinzufügen</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="postfaecher-liste">
|
||||
<p class="empty-state">Keine Postfächer konfiguriert</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Start retrieval -->
|
||||
<div class="action-bar">
|
||||
<button class="btn btn-success btn-large" onclick="allePostfaecherAbrufen()">
|
||||
▶ Alle Postfächer abrufen
|
||||
</button>
|
||||
</div>
|
||||
|
||||
<!-- Log of the last retrieval -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Letzter Abruf</h3>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="abruf-log" class="log-output">
|
||||
<p class="empty-state">Noch kein Abruf durchgeführt</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<!-- Section 2: File sorting -->
|
||||
<section class="bereich">
|
||||
<div class="bereich-header">
|
||||
<h2>📁 Datei-Sortierung</h2>
|
||||
<p class="bereich-desc">Dateien nach Regeln umbenennen und verschieben</p>
|
||||
</div>
|
||||
|
||||
<div class="bereich-content">
|
||||
<!-- Source folders -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Quell-Ordner</h3>
|
||||
<button class="btn btn-sm btn-primary" onclick="zeigeOrdnerModal()">+ Hinzufügen</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="ordner-liste">
|
||||
<p class="empty-state">Keine Ordner konfiguriert</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Rules -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Sortier-Regeln</h3>
|
||||
<button class="btn btn-sm btn-primary" onclick="zeigeRegelModal()">+ Hinzufügen</button>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="regeln-liste">
|
||||
<p class="empty-state">Keine Regeln definiert</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Start sorting -->
|
||||
<div class="action-bar">
|
||||
<button class="btn btn-success btn-large" onclick="sortierungStarten()">
|
||||
▶ Sortierung starten
|
||||
</button>
|
||||
</div>
|
||||
|
||||
<!-- Sorting log -->
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<h3>Verarbeitete Dateien</h3>
|
||||
</div>
|
||||
<div class="card-body">
|
||||
<div id="sortierung-log" class="log-output">
|
||||
<p class="empty-state">Noch keine Dateien verarbeitet</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Add mailbox -->
|
||||
<div id="postfach-modal" class="modal hidden">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h3>Postfach hinzufügen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('postfach-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="form-group">
|
||||
<label>Name</label>
|
||||
<input type="text" id="pf-name" placeholder="z.B. Firma Rechnungen">
|
||||
</div>
|
||||
<div class="form-row">
|
||||
<div class="form-group">
|
||||
<label>IMAP Server</label>
|
||||
<input type="text" id="pf-server" placeholder="imap.example.com">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Port</label>
|
||||
<input type="number" id="pf-port" value="993">
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>E-Mail</label>
|
||||
<input type="email" id="pf-email" placeholder="mail@example.com">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Passwort</label>
|
||||
<input type="password" id="pf-passwort">
|
||||
</div>
|
||||
<div class="form-row">
|
||||
<div class="form-group">
|
||||
<label>IMAP-Ordner</label>
|
||||
<input type="text" id="pf-ordner" value="INBOX">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Alle Ordner durchsuchen</label>
|
||||
<select id="pf-alle-ordner">
|
||||
<option value="false">Nein (nur angegebenen Ordner)</option>
|
||||
<option value="true">Ja (alle Ordner)</option>
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Welche Mails durchsuchen</label>
|
||||
<select id="pf-nur-ungelesen">
|
||||
<option value="false" selected>Alle Mails</option>
|
||||
<option value="true">Nur ungelesene Mails</option>
|
||||
</select>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Ziel-Ordner</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="pf-ziel" value="/srv/http/dateiverwaltung/data/inbox/">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('pf-ziel')">📁</button>
|
||||
</div>
|
||||
<small>Hier landen die Attachments</small>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Erlaubte Dateitypen</label>
|
||||
<div class="checkbox-group" id="pf-typen-gruppe">
|
||||
<label class="checkbox-item"><input type="checkbox" value=".pdf" checked> PDF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".jpg"> JPG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".jpeg"> JPEG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".png"> PNG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".gif"> GIF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".tiff"> TIFF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".doc"> DOC</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".docx"> DOCX</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xls"> XLS</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xlsx"> XLSX</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".csv"> CSV</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".txt"> TXT</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".zip"> ZIP</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xml"> XML</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Max. Größe (MB)</label>
|
||||
<input type="number" id="pf-max-groesse" value="25" style="width: 100px;">
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('postfach-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="speicherePostfach()">Speichern</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Add source folder -->
|
||||
<div id="ordner-modal" class="modal hidden">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h3>Quell-Ordner hinzufügen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('ordner-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="form-group">
|
||||
<label>Name</label>
|
||||
<input type="text" id="ord-name" placeholder="z.B. Firma Inbox">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Quell-Pfad (wo liegen die Dateien?)</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="ord-pfad" value="/srv/http/dateiverwaltung/data/inbox/">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('ord-pfad')">📁</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Ziel-Ordner (wohin nach Sortierung?)</label>
|
||||
<div class="input-with-btn">
|
||||
<input type="text" id="ord-ziel" value="/srv/http/dateiverwaltung/data/archiv/">
|
||||
<button class="btn" type="button" onclick="oeffneBrowser('ord-ziel')">📁</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Dateitypen</label>
|
||||
<div class="checkbox-group" id="ord-typen-gruppe">
|
||||
<label class="checkbox-item"><input type="checkbox" value=".pdf" checked> PDF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".jpg" checked> JPG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".jpeg" checked> JPEG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".png" checked> PNG</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".gif"> GIF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".tiff" checked> TIFF</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".bmp"> BMP</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".doc"> DOC</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".docx"> DOCX</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xls"> XLS</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xlsx"> XLSX</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".csv"> CSV</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".txt"> TXT</label>
|
||||
<label class="checkbox-item"><input type="checkbox" value=".xml"> XML</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Unterordner einschließen</label>
|
||||
<select id="ord-rekursiv">
|
||||
<option value="true" selected>Ja (rekursiv)</option>
|
||||
<option value="false">Nein (nur dieser Ordner)</option>
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button class="btn" onclick="schliesseModal('ordner-modal')">Abbrechen</button>
|
||||
<button class="btn btn-primary" onclick="speichereOrdner()">Speichern</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Modal: Add rule -->
|
||||
<div id="regel-modal" class="modal hidden">
|
||||
<div class="modal-content modal-large">
|
||||
<div class="modal-header">
|
||||
<h3 id="regel-modal-title">Regel hinzufügen</h3>
|
||||
<button class="modal-close" onclick="schliesseModal('regel-modal')">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<div class="form-row">
|
||||
<div class="form-group">
|
||||
<label>Name</label>
|
||||
<input type="text" id="regel-name" placeholder="z.B. Sonepar Rechnung">
|
||||
</div>
|
||||
<div class="form-group">
|
||||
<label>Priorität (niedriger = wichtiger)</label>
|
||||
<input type="number" id="regel-prioritaet" value="100">
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="form-group">
<label>Detection pattern (JSON)</label>
<textarea id="regel-muster" class="code-input" rows="4">{
"text_match_any": ["sonepar"],
"text_match": ["rechnung"]
}</textarea>
<small>text_match_any: at least one term must match | text_match: all terms must match</small>
</div>

<div class="form-group">
<label>Field extraction (JSON)</label>
<textarea id="regel-extraktion" class="code-input" rows="6">{
"datum": {"regex": "(\\d{2}[./]\\d{2}[./]\\d{4})", "format": "%d.%m.%Y"},
"rechnungsnummer": {"regex": "Rechnungsnummer[:\\s]*(\\d+)"},
"betrag": {"regex": "Gesamtbetrag[:\\s]*([\\d.,]+)", "typ": "betrag"},
"ersteller": {"wert": "Sonepar"}
}</textarea>
</div>

<div class="form-group">
<label>Filename schema</label>
<input type="text" id="regel-schema"
value="{datum} - Rechnung - {ersteller} - {rechnungsnummer} - {betrag} EUR.pdf">
</div>

<div class="form-group">
<label>Target subfolder (optional)</label>
<input type="text" id="regel-unterordner" placeholder="sonepar">
<small>Appended to the target folder configured for the source folder</small>
</div>

<!-- Tester -->
<div class="form-group">
<label>Test rule</label>
<textarea id="regel-test-text" rows="3" placeholder="Paste text to test..."></textarea>
<button class="btn btn-sm" onclick="testeRegel()">Test</button>
<div id="regel-test-ergebnis" class="test-result hidden"></div>
</div>
</div>
<div class="modal-footer">
<button class="btn" onclick="schliesseModal('regel-modal')">Cancel</button>
<button class="btn btn-primary" onclick="speichereRegel()">Save</button>
</div>
</div>
</div>

<!-- Modal: Directory browser -->
<div id="browser-modal" class="modal hidden">
<div class="modal-content">
<div class="modal-header">
<h3>Choose directory</h3>
<button class="modal-close" onclick="schliesseModal('browser-modal')">×</button>
</div>
<div class="modal-body">
<div class="file-browser">
<div class="file-browser-path">
<span id="browser-current-path">/</span>
</div>
<ul class="file-browser-list" id="browser-list"></ul>
</div>
</div>
<div class="modal-footer">
<button class="btn" onclick="schliesseModal('browser-modal')">Cancel</button>
<button class="btn btn-primary" onclick="browserAuswahl()">Select</button>
</div>
</div>
</div>

<!-- Loading Overlay -->
<div id="loading-overlay" class="loading-overlay hidden">
<div class="spinner"></div>
<div class="loading-text" id="loading-text">Loading...</div>
</div>
</div>

<script src="/static/js/app.js"></script>
</body>
</html>
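
The rule modal in this template takes a detection pattern, a field-extraction map, and a filename schema. The following is a minimal Python sketch of how such a rule could be evaluated. It is not taken from the project's backend: the function names `rule_matches`, `extract_fields`, and `build_filename` are hypothetical, case-insensitive substring matching is an assumption, rendering dates as ISO `YYYY-MM-DD` is an assumption, and the `"typ": "betrag"` hint is left uninterpreted.

```python
import re
from datetime import datetime

# Default values copied from the modal's textareas and schema input.
PATTERN = {"text_match_any": ["sonepar"], "text_match": ["rechnung"]}
EXTRACTION = {
    "datum": {"regex": r"(\d{2}[./]\d{2}[./]\d{4})", "format": "%d.%m.%Y"},
    "rechnungsnummer": {"regex": r"Rechnungsnummer[:\s]*(\d+)"},
    "betrag": {"regex": r"Gesamtbetrag[:\s]*([\d.,]+)", "typ": "betrag"},
    "ersteller": {"wert": "Sonepar"},
}
SCHEMA = "{datum} - Rechnung - {ersteller} - {rechnungsnummer} - {betrag} EUR.pdf"


def rule_matches(text, pattern):
    """text_match_any: at least one term occurs; text_match: every term occurs.

    Assumption: matching is a case-insensitive substring test.
    """
    lowered = text.lower()
    any_terms = pattern.get("text_match_any", [])
    all_terms = pattern.get("text_match", [])
    any_ok = not any_terms or any(t.lower() in lowered for t in any_terms)
    all_ok = all(t.lower() in lowered for t in all_terms)
    return any_ok and all_ok


def extract_fields(text, extraction):
    """Return a dict of field values; fields whose regex finds nothing are omitted."""
    fields = {}
    for name, spec in extraction.items():
        if "wert" in spec:
            # Fixed value, no regex needed.
            fields[name] = spec["wert"]
            continue
        match = re.search(spec["regex"], text)
        if match is None:
            continue
        value = match.group(1)
        if "format" in spec:
            # Assumption: normalise "/" to "." so both separators parse with
            # the configured format, then render the date as ISO.
            parsed = datetime.strptime(value.replace("/", "."), spec["format"])
            value = parsed.strftime("%Y-%m-%d")
        fields[name] = value
    return fields


def build_filename(schema, fields):
    """Fill the {placeholders}; raises KeyError if a referenced field is missing."""
    return schema.format(**fields)
```

A caller would first check `rule_matches`, then pass the result of `extract_fields` to `build_filename`. Because `build_filename` raises `KeyError` for a missing field, a real implementation would probably want a fallback value or an explicit error message at that point.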