Open Source Projects
GITTXT
Gittxt is an open-source CLI with a plugin system that makes any GitHub repository AI-ready in one step. It converts local and remote Git repositories into structured datasets that Large Language Models (LLMs) can consume, which supports prompt engineering, code analysis, summarization, and training pipelines.
Gittxt is built for developers, researchers, and AI enthusiasts. It automates the extraction and organization of code and documentation, so users can spend their time refining and implementing ideas rather than preparing data by hand.
-
Scan & Extract: Pull clean .txt, .json, and .md summaries from GitHub or local repos.
Plugin System: gittxt-api for REST access (FastAPI) and gittxt-streamlit for a visual, hosted interface.
Reverse Engineering: Reconstruct repo structures from AI summary reports.
ZIP Bundles & Token Summaries: Complete with directory tree, content, and file-type breakdowns.
Smart Filtering: .gittxtignore, glob patterns, size limits, and doc-only modes.
Async Performance: Handles large repos fast, offline or in constrained environments.
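To make the Smart Filtering item concrete, here is a minimal Python sketch of glob-based exclusion combined with a size limit. It is not Gittxt's actual implementation: the function names `load_ignore_patterns` and `select_files` are hypothetical, and the sketch assumes the ignore file holds one glob pattern per line with `#` comments, which may differ from the real .gittxtignore syntax.

```python
from fnmatch import fnmatch
from pathlib import Path


def load_ignore_patterns(ignore_file):
    """Read glob patterns, one per line, skipping blanks and '#' comments.

    Assumption: this line-based format is illustrative only.
    """
    if not ignore_file.is_file():
        return []
    lines = ignore_file.read_text(encoding="utf-8").splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


def select_files(root, patterns, max_bytes):
    """Return sorted relative paths under root that pass both filters."""
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # Skip anything matching an exclusion pattern.
        if any(fnmatch(rel, pattern) for pattern in patterns):
            continue
        # Skip files above the size limit.
        if path.stat().st_size > max_bytes:
            continue
        selected.append(rel)
    return selected
```

Note that `fnmatch` lets `*` match path separators, so a pattern such as `*.log` also excludes log files in subdirectories. That is simpler than .gitignore semantics and is a deliberate shortcut in this sketch.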
-
Turn GitHub repos into datasets for LLM fine-tuning
Build context windows for document agents or retrieval-augmented generation (RAG)
Automate code summarization pipelines
Generate structured artifacts from messy repos for AI chat interfaces
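As an illustration of the context-window use case, the sketch below splits extracted text into overlapping chunks that could feed a retrieval index. It is an assumption-laden example, not part of Gittxt: `chunk_text` is a hypothetical helper, and whitespace-separated words stand in for real model tokens, which a production pipeline would count with the target model's tokenizer.

```python
def chunk_text(text, max_tokens=200, overlap=20):
    """Split text into overlapping word-based chunks.

    Assumption: whitespace-separated words approximate tokens.
    """
    if overlap >= max_tokens:
        raise ValueError("overlap must be smaller than max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        # Stop once the current chunk reaches the end of the text.
        if start + max_tokens >= len(words):
            break
    return chunks
```

The overlap keeps a little shared context between neighboring chunks, so a passage that straddles a boundary still appears intact in at least one of them.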
-
Python 3.9+
FastAPI, Streamlit, Poetry, MkDocs
CLI-first design powered by modular plugins
Cloud-ready: compatible with AWS Lambda and Streamlit Cloud
-
Run pip install gittxt and start scanning right away.
-
"I created Gittxt out of necessity. Every time I needed to summarize, debug, or fine-tune from a repo, I had to do the prep work manually. Now it's one command away."