Build Your Own OCR Pipeline

An open-source document understanding framework with complete training scripts, a modular architecture, and full control over the pipeline. Train your own models.

Open Source | End-to-End OCR | 100+ Models | Modular | DDP Support

Why Doctane?

Complete control over your document processing pipeline

End-to-End OCR

Complete pipeline from text detection to recognition. Handles straight and rotated text seamlessly.

Layout Understanding

Automatic page orientation detection and straightening. Multi-column document support.

Modular Architecture

Plug-and-play model components. Easy to extend with custom models and pipelines.

High Performance

Optimized inference with PyTorch. DDP training support for multi-GPU acceleration.

Multi-Language

Supports English and French, and is extensible to other languages.

Structured Output

Hierarchical Document objects. Export to JSON and hOCR formats.
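A hierarchical document with JSON and hOCR export might look roughly like the sketch below. The class names (`Word`, `Line`, `Page`) and the functions `to_json` and `to_hocr` are hypothetical and chosen for illustration; they are not taken from Doctane's code. The hOCR output uses the standard `ocr_page`, `ocr_line`, and `ocrx_word` classes with `bbox` and `x_wconf` properties.

```python
import json
from dataclasses import asdict, dataclass, field
from html import escape
from typing import List, Tuple

# Hypothetical types for illustration; not Doctane's actual classes.
BBox = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Word:
    text: str
    bbox: BBox
    confidence: float  # in [0, 1]


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)


@dataclass
class Page:
    width: int
    height: int
    lines: List[Line] = field(default_factory=list)


def to_json(page: Page) -> str:
    """Serialize the whole hierarchy; tuples become JSON arrays."""
    return json.dumps(asdict(page), indent=2)


def to_hocr(page: Page) -> str:
    """Render a minimal hOCR fragment for one page."""
    parts = [f"<div class='ocr_page' title='bbox 0 0 {page.width} {page.height}'>"]
    for line in page.lines:
        parts.append("  <span class='ocr_line'>")
        for word in line.words:
            x0, y0, x1, y1 = word.bbox
            title = f"bbox {x0} {y0} {x1} {y1}; x_wconf {round(word.confidence * 100)}"
            parts.append(
                f"    <span class='ocrx_word' title='{title}'>{escape(word.text)}</span>"
            )
        parts.append("  </span>")
    parts.append("</div>")
    return "\n".join(parts)


page = Page(100, 50, [Line([Word("Hi", (1, 2, 30, 20), 0.98)])])
print(to_json(page))
print(to_hocr(page))
```

Keeping geometry and confidence on the leaf objects means both exporters can walk the same tree without any format-specific state.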

Processing Pipeline

Four-stage pipeline for complete document understanding

1. Preprocessing

Page orientation detection and optional straightening.

2. Text Detection

A segmentation-based model identifies text regions.

3. Text Recognition

Detected regions are cropped and fed to a recognition model for transcription.

4. Document Assembly

Structured output with geometry and confidence scores.
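The four stages above can be sketched as a simple composition of callables. This is a minimal illustration under stated assumptions: `run_pipeline`, `WordResult`, and the toy stage functions are hypothetical names, and the stand-in stages contain no real models. In practice the detector and recognizer would be trained networks.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]           # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max)

Preprocess = Callable[[Image], Image]
Detect = Callable[[Image], List[Box]]
Recognize = Callable[[Image], Tuple[str, float]]


@dataclass
class WordResult:
    text: str
    box: Box
    confidence: float


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def run_pipeline(
    image: Image, preprocess: Preprocess, detect: Detect, recognize: Recognize
) -> List[WordResult]:
    page = preprocess(image)                          # 1. preprocessing
    boxes = detect(page)                              # 2. text detection
    results = []
    for box in boxes:
        text, conf = recognize(crop(page, box))       # 3. text recognition
        results.append(WordResult(text, box, conf))   # 4. document assembly
    return results


# Stand-in stages: identity preprocessing, two fixed boxes, and a
# "recognizer" that just reports the crop size as its text.
image = [[0] * 8 for _ in range(4)]
words = run_pipeline(
    image,
    preprocess=lambda img: img,
    detect=lambda img: [(0, 0, 4, 2), (4, 2, 8, 4)],
    recognize=lambda c: (f"{len(c[0])}x{len(c)}", 0.9),
)
for w in words:
    print(w)
```

Passing the stages in as callables is one way to keep each of them swappable, which fits the modular design the page describes.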

100+ Supported Models

State-of-the-art architectures for detection and recognition

Detection: LinkNet, DeepLabV3+, SegFormer, UNet, UNet++, FPN, PSPNet, PAN, MAnet, Faster R-CNN

Recognition: SAR, ViTSTR, CRNN, MASTER, TRBA, ABINet, LSTR, ViTPTR, MATRN, PARSeq

Plus 80+ encoder variants (ResNet, EfficientNet, VGG, MobileNet, DenseNet, etc.)
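One common way to make architectures and encoders interchangeable is a name-based registry. The sketch below shows that general pattern only. The `register` and `build` functions, the registry layout, and the default encoder names are assumptions for illustration and do not describe Doctane's actual API. The builders return plain dictionaries as placeholders for real PyTorch modules.

```python
from typing import Callable, Dict

# Hypothetical registry keyed by task, then by architecture name.
_REGISTRY: Dict[str, Dict[str, Callable]] = {"detection": {}, "recognition": {}}


def register(task: str, name: str):
    """Decorator that records a builder function under (task, name)."""
    def decorator(builder: Callable) -> Callable:
        _REGISTRY[task][name] = builder
        return builder
    return decorator


def build(task: str, name: str, **kwargs):
    """Look up a builder by task and name, then call it with kwargs."""
    try:
        builder = _REGISTRY[task][name]
    except KeyError:
        raise ValueError(f"Unknown {task} model: {name!r}") from None
    return builder(**kwargs)


@register("detection", "unet")
def build_unet(encoder: str = "resnet34"):
    # Placeholder: a real builder would return an nn.Module.
    return {"arch": "unet", "encoder": encoder}


@register("recognition", "crnn")
def build_crnn(encoder: str = "vgg16"):
    return {"arch": "crnn", "encoder": encoder}


print(build("detection", "unet", encoder="efficientnet-b0"))
print(build("recognition", "crnn"))
```

With this layout, adding a custom model only requires a new decorated builder, and swapping the encoder is a keyword argument.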

Get Started

Clone, install, and run in minutes

    # Clone the repository
    git clone https://github.com/Purushothaman-natarajan/doctane.git
    cd doctane

    # Install dependencies
    pip install -r requirements.txt

    # Start the API server
    python api/main.py

1. Clone

Download from GitHub

2. Install

Run pip install

3. Launch

Open localhost:8000/app
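To confirm from a script that the app is reachable after launch, a small standard-library check like the one below may help. The helper names `app_url` and `is_server_up` are hypothetical. Only the host, port, and `/app` path come from the step above, and nothing here assumes any particular response body from the server.

```python
import urllib.error
import urllib.request


def app_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the address of the web app served by the API server."""
    return f"http://{host}:{port}/app"


def is_server_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at url within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True    # the server answered, though with an error status
    except OSError:
        return False   # connection refused, timeout, DNS failure, ...


if __name__ == "__main__":
    url = app_url()
    print(url, "is up" if is_server_up(url) else "is not reachable")
```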

No Pre-trained Weights

We provide the code only, not model weights. Train your own models using the provided scripts.

Bring Your Own Infrastructure

Training requires a GPU. Use your own hardware or a cloud provider (AWS, GCP, or Azure). Distributed Data Parallel (DDP) training is supported.
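For readers new to DDP, the sketch below shows the general shape of a PyTorch DistributedDataParallel training loop. It is not Doctane's training script: the `train` function, the tiny linear model, and the random data are placeholders standing in for a real detector or recognizer and its dataset. It falls back to a single CPU process when it is not started by a distributed launcher.

```python
import os

import torch
import torch.distributed as dist
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def train(epochs: int = 2) -> float:
    # torchrun sets these variables; the defaults allow a single-process run.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")

    use_cuda = torch.cuda.is_available()
    dist.init_process_group(backend="nccl" if use_cuda else "gloo")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    device = torch.device(f"cuda:{local_rank}" if use_cuda else "cpu")
    if use_cuda:
        torch.cuda.set_device(device)

    # Placeholder model and random data (assumptions for illustration).
    model = DDP(
        nn.Linear(16, 2).to(device),
        device_ids=[local_rank] if use_cuda else None,
    )
    dataset = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
    sampler = DistributedSampler(dataset)   # each rank sees a distinct shard
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    loss = torch.tensor(0.0)
    for epoch in range(epochs):
        sampler.set_epoch(epoch)            # reshuffle the shards every epoch
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x.to(device)), y.to(device))
            loss.backward()                 # gradients are all-reduced here
            optimizer.step()

    dist.destroy_process_group()
    return loss.item()
```

Saved as a script with a final `train()` call (say `train_ddp.py`, a hypothetical file name), this could be launched on two GPUs with `torchrun --nproc_per_node=2 train_ddp.py`.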

Purushothaman Natarajan

AI Engineer & Full-Stack Developer specializing in Computer Vision, NLP, and Deep Learning systems.

Featured Projects

Doctane

Multimodal intelligent document analysis and understanding system with OCR and layout understanding.

Python | PyTorch | OCR

Exploit2Patch

AI-Powered Vulnerability Intelligence Platform with autonomous CVE research and patch generation.

AI Agents | Cybersecurity

DL-Studio

Local deep learning development environment with 20+ algorithms, built-in XAI, and web interface.

Deep Learning | XAI | Streamlit