30 Sep 2022 | Artificial Intelligence
Production Deep Learning is Accessible to Startups
Deep learning, one of the streams of artificial intelligence, is in vogue these days. Deep learning uses networks capable of learning unsupervised from data that is unstructured or unlabeled. It is also known as deep neural learning or deep neural networks.
Contrary to the perception that only companies with large volumes of data, abundant funding, and dedicated research teams can work in deep learning, even startups have begun to explore this subset of artificial intelligence with confidence.
A common understanding is that if you are researching deep learning, you fare better with the support of a research team. If you are building a product that applies existing research, you don’t need one. Many open-source model architectures work well for most production deep learning use cases, provided you have some basic software engineering skills.
To give you an example, let us say you want to build a conversational agent, commonly known as a chatbot. If you just want a chatbot that answers questions based on your own documentation, you could build that yourself in a few hours. If you want an advanced, state-of-the-art chatbot, you would need to outperform Google’s Meena, a sophisticated chatbot with 2.6 billion parameters that was announced in early 2020.
It is commonly assumed that deep learning models must be trained on vast data sets in order to find patterns in the information they are crunching. You don’t need data sets as vast as Google’s, although the new wave of deep learning models does tend to be trained on enormous datasets. In October 2019, Google published the results of its new NLP framework, the Text-to-Text Transfer Transformer (T5), which was trained on roughly 750 GB of text.
This is feasible for Google because crawling the web is core to its business. Startups do not have that advantage. But to apply a state-of-the-art model to your task, you do not need a massive data set, because you don’t need to train a model from scratch.
Google’s Text-to-Text Transfer Transformer (T5) framework pushes the field of transfer learning forward. Transfer learning is an approach in which a neural network trained for one general task is ‘fine-tuned’ for a related and more specific task. Fine-tuning can be done with a small amount of data because the network’s knowledge generalizes to both tasks.
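To make the idea concrete, here is a toy, pure-Python sketch of fine-tuning. It is not a real deep network. `pretrained_features` is a hypothetical stand-in for a network already trained on a general task, and its behavior is frozen. Only a small linear "head" is trained on a handful of labeled examples for the specific task (here, an invented task: deciding whether |x| > 1). All names and numbers are assumptions chosen for illustration.

```python
import math

def pretrained_features(x):
    """Hypothetical stand-in for a frozen, pre-trained network.

    Its output is never changed during fine-tuning.
    """
    return [x, x * x]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune_head(examples, epochs=2000, lr=0.1):
    """Train only a small linear head on top of the frozen features."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, label in examples:
            feats = pretrained_features(x)
            pred = sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)
            error = pred - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias

def predict(x, weights, bias):
    feats = pretrained_features(x)
    score = sum(w * f for w, f in zip(weights, feats)) + bias
    return sigmoid(score) >= 0.5

# A deliberately tiny labeled data set for the specific task: is |x| > 1?
examples = [(-2.0, 1), (-1.5, 1), (-0.5, 0), (0.0, 0),
            (0.5, 0), (1.5, 1), (2.0, 1)]

weights, bias = fine_tune_head(examples)
for x in (3.0, 0.2, -1.8):
    print(x, predict(x, weights, bias))
```

Seven examples are enough here only because the frozen features already contain what the task needs. That is the same reason fine-tuning a real pre-trained model can work with a small data set.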
Training a deep neural network from scratch requires deep pockets. For example, in 2019 a team from Google Research and Carnegie Mellon released XLNet, a language model. At the time of its release, XLNet produced state-of-the-art results on various NLP tasks. The researchers disclosed that XLNet was trained on 128 Cloud TPU v3 devices for 2.5 days. To put that in perspective, at $8 per hour for each Cloud TPU v3 device, the estimated cost of training this one model is roughly $61,440 (128 devices × $8 × 60 hours).
That sum is just the cost of training the model once. If the model performs below expectations or the system needs retraining, the cost increases substantially.
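The estimate can be reproduced with simple arithmetic. The sketch below assumes 128 devices, the count that yields the quoted $61,440 total at $8 per device-hour over 2.5 days. The `runs` parameter is a hypothetical addition to show how retraining multiplies the bill.

```python
def training_cost(devices, hourly_rate_usd, days, runs=1):
    """Estimated cost in USD of `runs` full training runs."""
    hours = days * 24
    return devices * hourly_rate_usd * hours * runs

single_run = training_cost(devices=128, hourly_rate_usd=8, days=2.5)
three_runs = training_cost(devices=128, hourly_rate_usd=8, days=2.5, runs=3)

print(f"One run:    ${single_run:,.0f}")   # One run:    $61,440
print(f"Three runs: ${three_runs:,.0f}")   # Three runs: $184,320
```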
It is nearly impossible for most startups to absorb these costs. Spending $50,000+ to train a model that doesn’t work as expected could sink a smaller company. There are, however, ways to keep training costs low. As we have already discussed, fine-tuning pre-trained models can lower costs dramatically.
Using vanilla pre-trained models with no extra fine-tuning can eliminate all training costs. Additionally, because model size and performance do not necessarily scale proportionally, you can often use smaller (and therefore cheaper) model architectures without sacrificing much accuracy. If your goal is to train a model that is good enough for your product’s needs, you can keep model training affordable.
A new generation of startups is emerging that shows a lot of promise in artificial intelligence, machine learning, and deep learning. One such company is ONPASSIVE, an artificial intelligence-driven startup that offers AI-enabled plug-and-play smart business solutions for individuals, small businesses, and enterprises.
ONPASSIVE is not a spinoff from a Fortune 100 company. It is a startup with minimal funding that is using deep learning to build new products. Startups like this, which we refer to as ML natives, aren’t always conducting groundbreaking research. More often than not, they’re simply fine-tuning well-known models and deploying them to production.
The ONPASSIVE ecosystem builds products specifically to automate the infrastructure of the ML projects it is building, and since open-sourcing it, we have been surprised by how many other startups, often founded by a few software engineers with little to no research experience, are building entire products out of trained models.
Tags: Technology Artificial Intelligence