AR Visual Aid for the Visually Impaired – IT & Computer Engineering Guide
1. Project Overview
The AR Visual Aid for the Visually Impaired is a wearable or mobile assistive application designed to help users with low or no vision by enhancing environmental awareness through computer vision, auditory feedback, and spatial mapping. The system detects objects, reads signs and text, and provides navigational support through audio and haptic signals.
2. System Architecture Overview
- Camera Module: Captures real-time surroundings.
- Object Detection Engine: Identifies and classifies objects.
- Scene Interpretation: Describes layout and hazards.
- Audio Feedback System: Converts visual info into spoken or spatial cues.
- Optional Haptics: Feedback via vibrations.
- Backend Services: Optional cloud-based enhancement and learning.
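A minimal Python sketch of how these modules might be wired into a per-frame loop is shown below; the class and method names are illustrative assumptions, not a fixed API.

```python
# Illustrative wiring of the modules above; names are assumptions, not a fixed API.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str          # e.g. "person", "vehicle"
    distance_m: float   # estimated distance from depth data
    bearing_deg: float  # angle relative to the user's heading

class ARVisualAidPipeline:
    def __init__(self, camera, detector, interpreter, audio_out, haptics=None):
        self.camera = camera            # Camera Module
        self.detector = detector        # Object Detection Engine
        self.interpreter = interpreter  # Scene Interpretation
        self.audio_out = audio_out      # Audio Feedback System
        self.haptics = haptics          # Optional Haptics

    def tick(self):
        """Process one camera frame end to end."""
        frame = self.camera.capture()
        detections = self.detector.detect(frame)
        self.audio_out.speak(self.interpreter.describe(detections))
        if self.haptics is not None:
            for d in detections:
                if d.distance_m < 1.5:          # close obstacle -> tactile alert
                    self.haptics.pulse(direction=d.bearing_deg)
```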
3. Hardware Components
| Component | Specifications | Description |
| --- | --- | --- |
| Smart Glasses / Mobile Device | AR-enabled with camera & audio output | Captures surroundings and delivers output |
| Camera | HD RGB / depth sensing (LiDAR or ToF) | Captures a 3D view of the environment |
| Bone Conduction Headphones | Stereo, Bluetooth/USB-C | Provides audio cues without blocking the ears |
| Vibration Motors (Optional) | Mini linear actuators | Used in wearables to provide tactile signals |
4. Software Components
4.1 Development Tools
- Development Platform: Android/iOS or Unity with AR Foundation
- Computer Vision: OpenCV, TensorFlow Lite, YOLOv5
- Text-to-Speech: Android TTS, Google TTS, Amazon Polly
- Edge AI: Models optimized with ONNX/TFLite for low-power devices
4.2 Programming Languages
- Python (AI models), Java/Kotlin (Android), Swift (iOS), C# (Unity)
4.3 Libraries and SDKs
- ARCore/ARKit for spatial awareness
- OpenCV for image processing
- MediaPipe for hand/face detection
- TTS APIs for voice feedback
- TensorFlow Lite or CoreML for local inference
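As a rough illustration of the local-inference path, the sketch below runs one frame through a TensorFlow Lite detector; the model file name, input preprocessing, and output layout are assumptions that depend on the exported model.

```python
# Minimal on-device inference sketch with TensorFlow Lite.
# "detector.tflite" is a placeholder for whichever quantized model is deployed.
import numpy as np
from tflite_runtime.interpreter import Interpreter  # or tf.lite.Interpreter

interpreter = Interpreter(model_path="detector.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def run_inference(frame_rgb):
    """frame_rgb: HxWx3 array already resized and typed to match the model input."""
    interpreter.set_tensor(input_details[0]["index"], np.expand_dims(frame_rgb, axis=0))
    interpreter.invoke()
    # For a typical SSD/YOLO export the first output tensor holds boxes/scores;
    # the exact layout depends on how the model was converted.
    return interpreter.get_tensor(output_details[0]["index"])
```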
5. Core Functional Modules
- Object Detection: Real-time identification of people, vehicles, and obstacles
- Text Recognition: OCR to read signs, boards, documents
- Scene Description: AI-generated spoken summaries (e.g., 'a person crossing the street')
- Navigation Aid: Detects clear paths, stairs, doorways
- Gesture Controls: Use hand gestures for interaction (optional)
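A small desktop-prototype sketch of the Text Recognition module follows, using pytesseract for OCR and pyttsx3 for speech as stand-ins for the mobile OCR/TTS services listed in section 4; both libraries are prototyping assumptions, not the deployed stack.

```python
# Prototype of the Text Recognition module: OCR a frame and speak the result.
import cv2
import pytesseract
import pyttsx3

def read_text_aloud(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)   # OCR works better on grayscale
    text = pytesseract.image_to_string(gray).strip()
    if text:
        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
    return text
```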
6. Audio and Haptic Feedback System
- Audio Modes: Mono (for blind users), stereo spatial (for low-vision users)
- Feedback Content: Distance alerts, object names, scene summaries
- Haptic Feedback: Directional vibration cues for alerts or navigation
- Alert Prioritization: Urgent threats override background descriptions
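One way to realize the alert-prioritization rule is a small priority queue in which urgent threats are spoken first and queued background descriptions are discarded; the priority levels below are illustrative.

```python
# Sketch of alert prioritization: urgent threats override background descriptions.
import heapq

URGENT, NAVIGATION, DESCRIPTION = 0, 1, 2   # lower value = higher priority

class FeedbackQueue:
    def __init__(self):
        self._heap = []
        self._count = 0                      # tie-breaker keeps FIFO order per priority

    def push(self, priority, message):
        heapq.heappush(self._heap, (priority, self._count, message))
        self._count += 1

    def next_message(self):
        if not self._heap:
            return None
        priority, _, message = heapq.heappop(self._heap)
        if priority == URGENT:
            # Drop queued scene descriptions so the threat is not delayed by chatter.
            self._heap = [e for e in self._heap if e[0] != DESCRIPTION]
            heapq.heapify(self._heap)
        return message
```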
7. User Interface Design (Minimalist)
- Audio-First UI: Voice prompts and gesture/voice commands
- Simplified Controls: Few buttons or automatic operation
- Configuration Options: Adjust TTS speed, languages, detection preferences
- Emergency Mode: SOS feature with voice activation
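The configuration options could be captured in a simple settings object such as the sketch below; the field names and defaults are assumptions rather than a fixed schema.

```python
# Illustrative user settings for the minimalist UI; fields are assumptions.
from dataclasses import dataclass, field

@dataclass
class UserSettings:
    tts_rate: float = 1.0                    # speech-speed multiplier
    language: str = "en-US"
    detect_classes: list = field(default_factory=lambda: ["person", "vehicle", "stairs"])
    haptics_enabled: bool = True
    emergency_contact: str = ""              # used by the voice-activated SOS mode
```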
8. AI Model Optimization and Testing
- Lightweight Models: TFLite, quantized YOLO, or MobileNet for speed
- Dataset: Urban/street datasets + accessibility-specific datasets
- Real-world Testing: Simulate crowded, low-light, and noisy environments
- Performance Metrics: FPS, latency, detection accuracy, battery use
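A common route to a lightweight model is post-training quantization with the TFLite converter, sketched below; the saved-model path and output file name are placeholders.

```python
# Post-training quantization sketch; paths are placeholders.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("exported_detector/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # enables weight quantization

tflite_model = converter.convert()
with open("detector_quant.tflite", "wb") as f:
    f.write(tflite_model)
```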
9. Deployment and Maintenance
- Device Support: Mid-range Android/iOS phones and AR glasses
- Updates: AI model retraining and OTA delivery
- Offline Mode: Works without internet for core functions
- Cloud Mode (Optional): Improves recognition via remote processing
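The offline/cloud split could follow an offline-first pattern like the sketch below, where recognize_local and recognize_cloud are hypothetical stand-ins for the on-device model and an optional remote service.

```python
# Offline-first recognition with optional cloud refinement (hypothetical hooks).
def recognize_local(frame):
    return {"labels": ["person"], "source": "edge"}     # stub: on-device model

def recognize_cloud(frame):
    raise ConnectionError("no network")                 # stub: remote service

def recognize(frame, online: bool):
    result = recognize_local(frame)          # core functions always work offline
    if online:
        try:
            result = recognize_cloud(frame)  # optional accuracy boost when connected
        except ConnectionError:
            pass                             # keep the local result on failure
    return result
```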
10. Security and Privacy
- Data Handling: Minimal/no storage of personal video/audio
- Edge Processing: AI runs locally for privacy
- Anonymization: Blurs faces if storing data (optional for research)
- User Control: Toggle data collection/sharing features
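The optional anonymization step could be prototyped with OpenCV's bundled Haar face cascade, as in the sketch below; the detection parameters and blur kernel size are illustrative.

```python
# Blur detected faces before any frame is stored (optional research mode).
import cv2

_face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def blur_faces(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in _face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
        roi = frame_bgr[y:y+h, x:x+w]
        frame_bgr[y:y+h, x:x+w] = cv2.GaussianBlur(roi, (51, 51), 0)
    return frame_bgr
```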
11. Future Enhancements
- Voice-controlled assistant integration
- Smart object detection (e.g., familiar faces or objects)
- Braille output or tactile screens
- Real-time translation of signs/text
- Integration with GPS for outdoor navigation