recipe-document-converter
🍳 A Docker-based import service for converting recipe documents (PDF, Word, Excel, Images) into structured JSON data.
🎯 Project Overview
This repository contains a microservice designed to:
- Extract recipe content from multiple document formats
- Structure unorganized data using an LLM (Mistral) for intelligent parsing
- Integrate seamlessly with recipe applications via REST API
- Scale independently using Docker containerization
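To make "structured JSON data" concrete, here is a minimal sketch of what a parsed recipe could look like, with a hand-written runtime check. The `Recipe` and `Ingredient` shapes and every field name are assumptions for illustration only; the service's actual output schema is defined in the service documentation, not here.

```typescript
// Hypothetical output shape. Field names are assumptions, not the service's real schema.
interface Ingredient {
  name: string;
  quantity?: number;
  unit?: string;
}

interface Recipe {
  title: string;
  ingredients: Ingredient[];
  steps: string[];
}

// Runtime check for a single ingredient parsed from untrusted JSON.
function isIngredient(value: unknown): value is Ingredient {
  if (typeof value !== "object" || value === null) return false;
  const { name, quantity, unit } = value as Record<string, unknown>;
  return (
    typeof name === "string" &&
    (quantity === undefined || typeof quantity === "number") &&
    (unit === undefined || typeof unit === "string")
  );
}

// Runtime check for a whole recipe, e.g. before handing LLM output to a client.
function isRecipe(value: unknown): value is Recipe {
  if (typeof value !== "object" || value === null) return false;
  const { title, ingredients, steps } = value as Record<string, unknown>;
  return (
    typeof title === "string" &&
    Array.isArray(ingredients) &&
    ingredients.every(isIngredient) &&
    Array.isArray(steps) &&
    steps.every((s) => typeof s === "string")
  );
}
```

A guard like this is one way to reject malformed parser or LLM output early; a schema library could replace it once one is adopted.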
📦 Structure
recipe-document-converter/
├── recipe-document-converter/ # Main import service (this is the actual service)
│ ├── src/ # TypeScript source code
│ ├── Dockerfile # Container definition
│ ├── docker-compose.yml # Multi-service orchestration
│ ├── package.json # Dependencies
│ └── README.md # Service documentation
└── README.md # This file
🚀 Quick Start
Using Docker Compose (Recommended)
cd recipe-document-converter
# Start the service
docker-compose up -d
# Test the service
curl http://localhost:3000/health
Local Development
cd recipe-document-converter
# Install dependencies
npm install
# Start development server
npm run start:dev
📖 API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /import/pdf | Import and extract PDF recipe |
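The two endpoints can be called from TypeScript roughly as follows. This is a sketch under assumptions: the base URL comes from the Quick Start `curl` example, the multipart field name `file` is hypothetical, and the response of `/import/pdf` is treated as opaque JSON because its shape is not documented here. It relies on the `fetch`, `FormData`, and `Blob` globals available in Node.js 18+ and browsers.

```typescript
// Base URL taken from the Quick Start example; adjust for your deployment.
const BASE_URL = "http://localhost:3000";

// Resolve an endpoint path against the service base URL.
function endpointUrl(path: string, baseUrl: string = BASE_URL): string {
  return new URL(path, baseUrl).toString();
}

// GET /health: resolves to true when the service answers with a 2xx status.
async function checkHealth(baseUrl: string = BASE_URL): Promise<boolean> {
  const res = await fetch(endpointUrl("/health", baseUrl));
  return res.ok;
}

// POST /import/pdf: uploads a PDF as multipart form data.
// The field name "file" is an assumption; check the service README for the real one.
async function importPdf(
  pdf: Blob,
  filename: string,
  baseUrl: string = BASE_URL,
): Promise<unknown> {
  const form = new FormData();
  form.append("file", pdf, filename);
  const res = await fetch(endpointUrl("/import/pdf", baseUrl), {
    method: "POST",
    body: form,
  });
  if (!res.ok) {
    throw new Error(`Import failed with status ${res.status}`);
  }
  return res.json();
}
```

With the service running, `await checkHealth()` followed by `await importPdf(pdfBlob, "recipe.pdf")` would exercise both routes, where `pdfBlob` is a `Blob` you have built from the PDF bytes.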
📚 Full Documentation
See recipe-document-converter/README.md for complete documentation.
🔮 Planned Features
- PDF extraction
- Basic recipe structuring
- Mistral LLM integration
- Excel support
- Word support
- Image OCR support
- Web scraping
🛠️ Tech Stack
- NestJS — Node.js framework
- TypeScript — Type safety
- Docker — Containerization
- pdf-parse — PDF extraction
- Zod — Schema validation (coming soon)
- Mistral AI — LLM integration (coming soon)
🤝 Contributing
Contributions welcome! Open an issue or submit a PR.
📄 License
MIT