Simplifi.AI - Making Complex Text Accessible with AI
A proof-of-concept mobile application using NLP, OCR, and LLMs to simplify complex legal documents and academic papers.
Legal documents and academic papers are often inaccessible to many people due to their complexity. Simplifi.ai aims to bridge this gap, making information more understandable and accessible for everyone.
Over the past few days, I’ve developed a prototype as a personal tool to tackle this challenge. Simplifi.ai is a proof-of-concept mobile application built around a pipeline that uses NLP, OCR, and an LLM (Gemini 1.5 Flash) to simplify and break down complex information.
Features
The tool offers users several features:
- Text Simplification - Uses the Flesch-Kincaid method to calculate the document’s reading level (see the sketch after this list) and applies guidelines from the Citizens Information Board to pitch the language accordingly
- Advanced Search - Lets the user redefine the topic, change the target reading level, and submit an additional query
- Read Aloud - Reads the simplified document aloud
- Sentiment Analysis - Analyses the overall sentiment of the document and determines whether it is positive or negative, giving the user extra context
- Document Upload and OCR - Uses text recognition to extract text from uploaded images and PDFs and passes it to the server for processing
- Document History - Saves previous results and lets users return to them at any time
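For the reading-level calculation, the Flesch-Kincaid grade formula combines average sentence length with average syllables per word. Here is a minimal TypeScript sketch of that calculation; the function names and the vowel-group syllable heuristic are illustrative, not the app’s actual code:

```typescript
// Rough syllable count for an English word: count vowel groups,
// ignoring a trailing silent "e". A heuristic, not a dictionary lookup.
function countSyllables(word: string): number {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length <= 3) return 1;
  const groups = w.replace(/e$/, "").match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 1);
}

// Flesch-Kincaid grade level:
// 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
function fleschKincaidGrade(text: string): number {
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return (
    0.39 * (words.length / Math.max(1, sentences.length)) +
    11.8 * (syllables / Math.max(1, words.length)) -
    15.59
  );
}
```

The score maps onto US school grades, so a result around 8 means roughly an eighth-grade (age 13-14) reading level, which gives the app a concrete target when asking the LLM to simplify.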
Tech Stack
The tech stack for this project is minimal, requiring only familiarity with a few libraries and the ability to generate an API token for a Gemini model on Google Cloud.
Frontend (React Native + TypeScript + Expo)
I chose React Native with TypeScript and the Expo development framework. This combination made building a cross-platform application incredibly straightforward and allowed for quick iteration during development.
External libraries used:
- wink-nlp - Implements sentiment analysis on the submitted document (see the sketch after this list)
- react-native-paper - UI library that makes everything a bit more user-friendly
- expo-speech - Text-to-speech tool that reads the simplified document aloud
- react-native-markdown-renderer - Renders the returned document as Markdown for easy reading
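As a rough sketch of how the sentiment and read-aloud features fit together on the client, assuming wink-nlp’s English lite web model is installed (the function names here are illustrative, not the app’s actual code):

```typescript
import winkNLP from "wink-nlp";
import model from "wink-eng-lite-web-model";
import * as Speech from "expo-speech";

const nlp = winkNLP(model);
const its = nlp.its;

// Overall sentiment of the simplified document; wink-nlp returns
// a score roughly in the range [-1, 1] (negative to positive).
function documentSentiment(text: string): number {
  return nlp.readDoc(text).out(its.sentiment) as number;
}

// Read the simplified document aloud via the device's TTS engine.
function readAloud(text: string): void {
  Speech.speak(text, { language: "en" });
}
```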
Backend (Node.js + Express)
For the back-end, I implemented a Node.js server with Express to handle the application’s processing, using the following key libraries (a sketch of the full pipeline follows the list):
- multer - Handles uploads of large files such as images and PDFs
- tesseract.js - Scans an image and returns its text for parsing
- spellchecker - Fixes common spelling mistakes introduced by tesseract.js
- @google/generative-ai - Google’s client library for interacting with Gemini (1.5 Flash)
- pdf-parse - Extracts the text from any uploaded PDF
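Put together, the server-side pipeline looks roughly like the sketch below. The route name, form field, and prompt wording are assumptions for illustration, the spellchecker pass and error handling are omitted for brevity, and it assumes tesseract.js v5’s createWorker API:

```typescript
import express from "express";
import multer from "multer";
import { createWorker } from "tesseract.js";
import pdfParse from "pdf-parse";
import { GoogleGenerativeAI } from "@google/generative-ai";
import { readFile } from "fs/promises";

const app = express();
const upload = multer({ dest: "uploads/" });

// Gemini client; GEMINI_API_KEY is an assumed environment variable.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const gemini = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

// Extract text from the upload: pdf-parse for PDFs, tesseract.js OCR otherwise.
async function extractText(path: string, mimetype: string): Promise<string> {
  if (mimetype === "application/pdf") {
    return (await pdfParse(await readFile(path))).text;
  }
  const worker = await createWorker("eng");
  const { data } = await worker.recognize(path);
  await worker.terminate();
  return data.text;
}

// POST /simplify with a multipart "document" field and an optional reading level.
app.post("/simplify", upload.single("document"), async (req, res) => {
  const text = await extractText(req.file!.path, req.file!.mimetype);
  const level = req.body.readingLevel ?? "plain English";
  const prompt =
    `Rewrite the following document at a ${level} reading level, ` +
    `keeping all key facts:\n\n${text}`;
  const result = await gemini.generateContent(prompt);
  res.json({ simplified: result.response.text() });
});

app.listen(3000);
```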
Future Plans
Looking ahead, I plan to move away from hosted LLM services and explore running local models through tools such as Ollama. While I currently lack the hardware for local fine-tuning, I’m looking at services like Runpod to handle that process.
Fine-tuning would let me tailor the models more closely to user needs, and I also plan to experiment with agent frameworks such as bee-agent to improve the tooling and error handling around the LLM.
Links
- YouTube Demo: Watch the demo
- GitHub: starmunchies/simplifi