Overview
I built an iOS app that performs real-time OCR through Meta Ray-Ban smart glasses, with AI text simplification designed for dyslexia support. Using the Meta DAT SDK, the app streams camera frames from the glasses, runs on-device text recognition, and delivers simplified audio through parallel local and cloud processing paths.
The Challenge
Students with dyslexia need instant, seamless reading help — not another app to fiddle with. The solution had to:
- Work through smart glasses with minimal latency for natural reading flow
- Process text on-device to keep response times under 100ms
- Support offline use for school environments with limited connectivity
- Meet COPPA and FERPA compliance for use with minors in educational settings
- Provide audio-guided navigation accessible to users with reading difficulties
What I Built
1. Real-Time OCR Pipeline
A high-performance text recognition system optimized for smart glasses:
- Apple Vision on-device OCR — Sub-100ms text recognition without network dependency
- Frame selection optimization — filters the incoming stream so OCR runs only on high-quality frames rather than on every frame
- 24fps streaming — Real-time camera feed from Meta Ray-Ban glasses via DAT SDK
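The frame-selection step can be sketched as a simple gate in front of OCR. The sharpness score and thresholds below are illustrative stand-ins for whatever per-frame quality metrics the capture pipeline actually computes:

```swift
import Foundation

// Frame-selection gate: only pass frames worth spending OCR time on.
// A frame is accepted when it is sharp enough and enough time has passed
// since the last accepted frame (avoids re-OCRing near-duplicate frames).
struct FrameGate {
    let sharpnessThreshold: Double
    let minInterval: TimeInterval
    private var lastAccepted: TimeInterval = -.infinity

    init(sharpnessThreshold: Double, minInterval: TimeInterval) {
        self.sharpnessThreshold = sharpnessThreshold
        self.minInterval = minInterval
    }

    mutating func shouldProcess(sharpness: Double, timestamp: TimeInterval) -> Bool {
        guard sharpness >= sharpnessThreshold,
              timestamp - lastAccepted >= minInterval else { return false }
        lastAccepted = timestamp
        return true
    }
}
```

At 24fps a new frame arrives roughly every 42ms; with a 0.25s minimum interval, OCR runs at most four times per second regardless of the stream rate, which keeps the recognition path well inside the latency budget.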
2. Dual-Path Audio System
Two parallel audio processing paths to balance speed and quality:
- Instant local TTS — Sub-400ms response using on-device text-to-speech
- Enhanced cloud processing — AI-simplified text with ElevenLabs high-quality voice synthesis
- Seamless handoff — the local path plays immediately while cloud processing completes in the background, so the enhanced audio takes over without a silent wait
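A condensed sketch of the dual-path handoff. `CloudTTSClient` is a hypothetical wrapper around the simplification call and ElevenLabs synthesis; the real pipeline has more states (e.g. cancelling stale requests when new text arrives):

```swift
import AVFoundation

// Hypothetical wrapper around AI simplification + ElevenLabs synthesis;
// returns a local file URL for the rendered audio.
protocol CloudTTSClient {
    func simplifiedAudio(for text: String) async throws -> URL
}

final class DualPathSpeaker {
    private let synthesizer = AVSpeechSynthesizer()
    private let player = AVQueuePlayer()
    private let cloud: CloudTTSClient

    init(cloud: CloudTTSClient) { self.cloud = cloud }

    func speak(_ text: String) {
        // Path 1: on-device voice starts within ~400ms.
        synthesizer.speak(AVSpeechUtterance(string: text))

        // Path 2: enhanced audio takes over once the cloud round-trip finishes.
        Task { [weak self] in
            guard let self,
                  let url = try? await self.cloud.simplifiedAudio(for: text) else { return }
            self.synthesizer.stopSpeaking(at: .word)
            self.player.removeAllItems()
            self.player.insert(AVPlayerItem(url: url), after: nil)
            self.player.play()
        }
    }
}
```

Stopping the local voice at a word boundary (`.word`) rather than mid-syllable keeps the switchover from sounding like a glitch.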
3. Gamified Learning Experience
Engagement features designed to make reading practice rewarding:
- XP system with levels and achievement tracking
- Reading streaks to encourage daily practice
- Character-driven practice sessions
- Vocabulary building with personalized word lists
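To give a flavor of the progression logic, here is a minimal sketch of the XP/level mechanic. The quadratic thresholds are illustrative, not the app's actual tuning:

```swift
// Illustrative XP/level curve: reaching level n+1 requires 100 * n^2
// total XP, so each level costs progressively more than the last.
struct Progress {
    var totalXP: Int = 0

    var level: Int {
        // Largest n with 100 * n * n <= totalXP, plus 1 (levels start at 1).
        var n = 0
        while 100 * (n + 1) * (n + 1) <= totalXP { n += 1 }
        return n + 1
    }

    var xpToNextLevel: Int {
        // Next threshold is 100 * level^2 total XP.
        100 * level * level - totalXP
    }

    mutating func award(_ xp: Int) { totalXP += xp }
}
```

A fresh profile starts at level 1 needing 100 XP; level 3 requires 400 XP total, level 4 requires 900, and so on.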
Technical Architecture
Built as a modular iOS application with 14 feature modules:
- Device Layer: Meta DAT SDK integration for glasses camera streaming and control
- Vision Layer: Apple Vision framework for on-device OCR processing
- Audio Layer: Dual-path system with local AVSpeechSynthesizer and cloud ElevenLabs TTS
- Backend: Firebase for user data, progress tracking, and analytics
- Offline-First: SwiftData for local persistence with background sync
- AI Services: Google Gemini and OpenAI for text simplification and comprehension
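The offline-first piece hinges on queuing mutations locally and draining them when connectivity returns. A minimal in-memory sketch of that flow (in the app, the pending queue is backed by SwiftData so it survives restarts):

```swift
// Offline-first sync sketch: writes are recorded locally and uploaded
// later; anything that fails to upload stays queued for the next pass.
struct SyncQueue {
    private(set) var pending: [String] = []   // serialized mutations, in order

    init() {}

    mutating func record(_ mutation: String) {
        pending.append(mutation)
    }

    // Attempt every pending mutation; keep only the ones that failed,
    // preserving order so retries replay in the original sequence.
    mutating func drain(upload: (String) -> Bool) {
        var failed: [String] = []
        for m in pending where !upload(m) { failed.append(m) }
        pending = failed
    }
}
```

Because failures stay queued in order, a classroom device that loses Wi-Fi mid-session simply replays its progress updates the next time sync runs.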
Security & Quality
Privacy and accessibility are foundational requirements for an education tool targeting minors:
- COPPA/FERPA compliant — No personal data collected without consent; parental controls built in
- Camera privacy — Frames are processed on-device and never persisted to storage or cloud
- Offline-first architecture — Core reading features work without internet connectivity
- Audio-guided navigation — Full app usability without reading the screen
Outcome
- Production-ready iOS application, tested on physical Meta Ray-Ban glasses and in the simulator
- Sub-100ms on-device OCR with dual-path audio for instant and enhanced responses
- 14 feature modules with gamified learning experience
- Full Statement of Work delivered with COPPA/FERPA compliance documentation