Simplifi.AI - Making Complex Text Accessible with AI
A proof-of-concept mobile application using NLP, OCR, and LLMs to simplify complex legal documents and academic papers.
Legal documents and academic papers are often inaccessible to many people due to their complexity. Simplifi.ai aims to bridge this gap, making information more understandable and accessible for everyone.
Over the past few days, I’ve developed a prototype as a personal tool to tackle this challenge. Simplifi.ai is a proof-of-concept mobile application built around a pipeline that uses NLP, OCR, and an LLM (Gemini 1.5 Flash) to simplify and break down complex information.
Features
The tool offers users several features:
- Text Simplification - Uses the Flesch-Kincaid method to calculate the document’s reading level (see the sketch after this list) and applies guidelines from the Citizens Information Board to pitch the language accordingly
- Advanced Search - Lets the user redefine the topic, change the target reading level, and submit an additional query
- Read Aloud - Reads the simplified document aloud
- Sentiment Analysis - Analyses the overall sentiment of the document and determines whether it is positive or negative, giving the user extra context
- Document Upload and OCR - Uses text recognition to extract text from uploaded images and PDFs and passes it to the server for processing
- Document History - Saves previous results and lets users return to them at any time
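For the reading-level calculation, the Flesch-Kincaid grade formula combines average sentence length with average syllables per word. Here is a minimal TypeScript sketch of that calculation; the function names and the vowel-group syllable heuristic are illustrative, not the app’s actual code:

```typescript
// Rough syllable count for an English word: count vowel groups,
// ignoring a trailing silent "e". A heuristic, not a dictionary lookup.
function countSyllables(word: string): number {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length <= 3) return 1;
  const groups = w.replace(/e$/, "").match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 1);
}

// Flesch-Kincaid grade level:
// 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
function fleschKincaidGrade(text: string): number {
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return (
    0.39 * (words.length / Math.max(1, sentences.length)) +
    11.8 * (syllables / Math.max(1, words.length)) -
    15.59
  );
}
```

The score maps onto US school grades, so a result around 8 means roughly an eighth-grade (age 13-14) reading level, which gives the app a concrete target when asking the LLM to simplify.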
Tech Stack
The tech stack for this project is minimal, requiring only familiarity with a few libraries and the ability to generate an API token for a Gemini model on Google Cloud.
Frontend (React Native + TypeScript + Expo)
I chose React Native with TypeScript and the Expo development framework. This combination made building a cross-platform application incredibly straightforward and allowed for quick iteration during development.
External libraries used:
- wink-nlp - Implements sentiment analysis on the submitted document (see the sketch after this list)
- react-native-paper - UI library that makes everything a bit more user-friendly
- expo-speech - Text-to-speech tool that reads the simplified document aloud
- react-native-markdown-renderer - Renders the returned document as Markdown for easy reading
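As a rough sketch of how the sentiment and read-aloud features fit together on the client, assuming wink-nlp’s English lite web model is installed (the function names here are illustrative, not the app’s actual code):

```typescript
import winkNLP from "wink-nlp";
import model from "wink-eng-lite-web-model";
import * as Speech from "expo-speech";

const nlp = winkNLP(model);
const its = nlp.its;

// Overall sentiment of the simplified document; wink-nlp returns
// a score roughly in the range [-1, 1] (negative to positive).
function documentSentiment(text: string): number {
  return nlp.readDoc(text).out(its.sentiment) as number;
}

// Read the simplified document aloud via the device's TTS engine.
function readAloud(text: string): void {
  Speech.speak(text, { language: "en" });
}
```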
Backend (Node.js + Express)
For the back-end, I implemented a Node.js server with Express to handle the application’s processing, using the following key libraries (a sketch of the full pipeline follows the list):
- multer - Handles uploads of large files such as images and PDFs
- tesseract.js - Scans an image and returns its text for parsing
- spellchecker - Fixes common spelling mistakes introduced by tesseract.js
- @google/generative-ai - Google’s client library for interacting with Gemini (1.5 Flash)
- pdf-parse - Extracts the text from any uploaded PDF
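Put together, the server-side pipeline looks roughly like the sketch below. The route name, form field, and prompt wording are assumptions for illustration, the spellchecker pass and error handling are omitted for brevity, and it assumes tesseract.js v5’s createWorker API:

```typescript
import express from "express";
import multer from "multer";
import { createWorker } from "tesseract.js";
import pdfParse from "pdf-parse";
import { GoogleGenerativeAI } from "@google/generative-ai";
import { readFile } from "fs/promises";

const app = express();
const upload = multer({ dest: "uploads/" });

// Gemini client; GEMINI_API_KEY is an assumed environment variable.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const gemini = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

// Extract text from the upload: pdf-parse for PDFs, tesseract.js OCR otherwise.
async function extractText(path: string, mimetype: string): Promise<string> {
  if (mimetype === "application/pdf") {
    return (await pdfParse(await readFile(path))).text;
  }
  const worker = await createWorker("eng");
  const { data } = await worker.recognize(path);
  await worker.terminate();
  return data.text;
}

// POST /simplify with a multipart "document" field and an optional reading level.
app.post("/simplify", upload.single("document"), async (req, res) => {
  const text = await extractText(req.file!.path, req.file!.mimetype);
  const level = req.body.readingLevel ?? "plain English";
  const prompt =
    `Rewrite the following document at a ${level} reading level, ` +
    `keeping all key facts:\n\n${text}`;
  const result = await gemini.generateContent(prompt);
  res.json({ simplified: result.response.text() });
});

app.listen(3000);
```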
Future Plans
Looking ahead, I plan to move away from hosted LLM services and explore running local models through tools such as Ollama. While I currently lack the hardware for local fine-tuning, I’m looking at services like Runpod to handle that process.
Fine-tuning would let me tailor the models more closely to user needs, and I also plan to experiment with agent frameworks such as bee-agent to improve the tooling and error handling around the LLM.
Links
- YouTube Demo: Watch the demo
- GitHub: starmunchies/simplifi