Every day, 33 babies across the United States are born with permanent hearing loss, a common birth defect. Without learning sign language, those children risk developing Language Deprivation Syndrome (LDS), which can negatively affect their social life.
Statistically, 90% of babies with permanent hearing impairment are born to parents without disabilities. This poses a serious challenge, as learning American Sign Language (ASL) requires resources that parents typically lack.
Abto Software took part in an ambitious competition to design a model that recognizes and classifies ASL signs for use in education gamification.
Having covered initial discovery, approach selection, and evaluation, our team created a TensorFlow Lite model trained on data extracted using the MediaPipe Solution.
To keep latency low and let the algorithm run across devices, the videos aren’t stored in a public cloud. That means inference must run smoothly on the user’s device.
The project’s main goals:
- To develop an ASL recognition model prioritizing accuracy
- To ensure the machine learning model can run on Android and iOS
Education gamification – a glance at the legacy application
Abto Software took part in the competition to help expand an educational bubble game with 500 ASL signs and 220,000 examples in total. The app tracks hand gestures as they are performed and confirms in real time whether they’re correct, making learning more fun and accessible for children, who need more interaction, and for parents, who don’t have time for courses.
The concept is simple:
- The app receives live video input
- The algorithm, enabled by artificial intelligence, processes the video input and evaluates whether the demonstrated signs are correct
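The evaluation step in the concept above can be sketched as follows. This is a minimal illustration in Python, not the app's actual logic: the function name `is_sign_correct`, the 0.5 confidence threshold, and the toy sign labels and scores are all assumptions. The sketch only presumes that the model outputs one score per known sign.

```python
from typing import Sequence


def is_sign_correct(
    scores: Sequence[float],
    labels: Sequence[str],
    target: str,
    min_confidence: float = 0.5,  # assumed threshold, not taken from the project
) -> bool:
    """Return True if the top-scoring sign is the target and is confident enough."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Index of the highest score (top-1 prediction).
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best] == target and scores[best] >= min_confidence


# Toy example with three hypothetical signs and made-up scores.
labels = ["mom", "dad", "milk"]
scores = [0.10, 0.15, 0.75]
print(is_sign_correct(scores, labels, target="milk"))  # True
print(is_sign_correct(scores, labels, target="dad"))   # False
```

A confidence threshold of this kind lets a game avoid rewarding a low-certainty guess, at the cost of occasionally rejecting a correct but poorly captured sign.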
Our model was designed to recognize ASL signs with high accuracy using computer vision.
The stages our team has covered:
1. Preliminary discovery – we explored different approaches to gain more insight into the best practices
2. Architecture design – we designed a model to associate data with the corresponding ASL signs
3. Model training – we fed the model the training dataset to help it learn different patterns
4. Model evaluation – we ran the model on the testing dataset to compare predicted output with the actual labels
5. Model fine-tuning and optimization – we iteratively tweaked and retrained the model to achieve better accuracy
6. Model conversion into the requested format (TensorFlow Lite)
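The evaluation stage, comparing predicted output with the actual labels, can be illustrated with a small sketch. It assumes top-1 accuracy as the metric, which the text does not specify; the helper names `accuracy` and `per_sign_accuracy` and the toy labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of examples whose predicted sign equals the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot evaluate an empty test set")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def per_sign_accuracy(
    predicted: Sequence[str], actual: Sequence[str]
) -> Dict[str, float]:
    """Accuracy broken down by actual sign, to show which signs are confused."""
    totals = Counter(actual)
    hits = Counter(a for p, a in zip(predicted, actual) if p == a)
    return {sign: hits[sign] / totals[sign] for sign in totals}


# Toy test set: the model mistakes "dad" for "milk".
actual = ["mom", "dad", "milk", "milk"]
predicted = ["mom", "milk", "milk", "milk"]
print(accuracy(predicted, actual))           # 0.75
print(per_sign_accuracy(predicted, actual))  # {'mom': 1.0, 'dad': 0.0, 'milk': 1.0}
```

With hundreds of signs, a per-sign breakdown like this tends to be more useful during fine-tuning than a single overall number, because it points at the specific signs that need attention.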
The challenges our team has handled:
1. Strict limits
To submit our model, we had to meet several requirements:
- The solution should weigh no more than 40 MB
- The solution should process the input within 100 milliseconds
2. Data inaccessibility
While working on the solution, we could access only pre-processed data, not the original videos.
3. Data distortion
All videos had been processed with the MediaPipe Solution, which introduced several issues:
- Artifact distortion
- Frame omission
4. Data instability
All videos were of different lengths, which caused data instability.
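One common way to deal with omitted frames and varying video lengths is to fill the gaps and then resample every sequence to a fixed number of frames. The sketch below illustrates that idea only; it is not the team's actual preprocessing. It assumes each frame is a flat list of landmark coordinates, that an omitted frame is represented as `None`, and that forward filling plus linear interpolation are acceptable choices. The function names and the target length are hypothetical.

```python
from typing import List, Optional, Sequence

Frame = List[float]  # assumed layout: flattened landmark coordinates


def fill_missing(frames: Sequence[Optional[Frame]]) -> List[Frame]:
    """Replace None frames with the latest valid frame seen so far.

    Leading None frames are filled with the first valid frame.
    """
    first_valid = next((f for f in frames if f is not None), None)
    if first_valid is None:
        raise ValueError("sequence contains no valid frames")
    filled: List[Frame] = []
    last = first_valid
    for frame in frames:
        if frame is not None:
            last = frame
        filled.append(list(last))
    return filled


def resample(frames: Sequence[Frame], target_len: int) -> List[Frame]:
    """Linearly interpolate a sequence to exactly target_len frames."""
    if target_len < 1 or not frames:
        raise ValueError("need at least one frame and target_len >= 1")
    if len(frames) == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    step = (len(frames) - 1) / (target_len - 1)
    out: List[Frame] = []
    for i in range(target_len):
        pos = i * step                       # fractional source position
        lo = int(pos)                        # frame just before pos
        hi = min(lo + 1, len(frames) - 1)    # frame just after pos
        w = pos - lo                         # blend weight toward hi
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


# Toy clip: three frames of two coordinates, the middle one omitted.
raw = [[0.0, 0.0], None, [2.0, 4.0]]
fixed = resample(fill_missing(raw), target_len=5)
print(len(fixed))  # 5
print(fixed[3])    # [1.0, 2.0]
```

Giving every example the same length keeps the model's input shape constant, which also makes the per-input processing cost predictable.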
Tools and technologies
Tech & AI/ML stack:
- TensorFlow Lite
- Deep Learning
- Pose estimation
- Skeleton analysis
Timeline:
- March 2023 – May 2023
Team:
- 3 R&D engineers
The designed ASL model is optimized for poor internet connectivity, which makes it suitable for use even in developing countries.
The described ASL model doesn’t use the original input (users’ videos), which preserves user security and privacy and supports regulatory compliance.
With ASL recognition integrated into products and services, people with hearing loss can enjoy:
- Improved communication
- Educational support and gamification
- Social inclusion
- Practical implementations (convenient control over technology, including phones, wearable devices, and home automation systems)
By incorporating ASL recognition, businesses can achieve greater:
- Business opportunities and reach by:
– Making products more accessible to customers who use sign language
– Developing interactive and engaging learning platforms for teaching sign language
- Customer satisfaction and loyalty by:
– Transforming their customer service