Generally, emerging technologies that are valuable enough to become popular tend to decentralize at the earliest opportunity: from the print bureau to the home printer, the processing lab to the smartphone camera, the mainframe to the personal computer. The phase prior to this consumerization is marked by business services gatekeeping the new technology and meting it out to consumer demand in small and increasingly profitable measures as hardware costs fall. Eventually costs fall far enough to diffuse the technology and kill the ‘gatekeeper’ business model.
The explosion of data storage and processing needs over the last twenty years has not only kept this from happening in the business information services sector, but has, according to some sources, practically eradicated the in-house data center in favor of the cloud.
5 Reasons to Develop AI Systems In-House
1: The Best Core Technologies Are Open-Source Anyway
The academic origins of open-source GPU-accelerated machine learning frameworks and libraries over the last ten years have made it all but impossible for well-funded tech giants to cloister promising new AI technologies into patent-locked, proprietary systems.
This is partly because nearly all the seminal work has come from international collaborations involving some mix of academic research bodies and government or commercial institutions, and partly because of the permissive licensing that facilitated this level of global cooperation.
2: Protecting Corporate IP
Most in-house AI projects have a more fragile angle on success than those of the FAANG companies, such as a patentable use-case concept or the leveraging of internal consumer data. In these instances, the configuration and development of the AI stack is a mere deployment consideration rather than a value proposition in itself.
In order to avoid encroachment, it may be necessary to tokenize transactions that take place through cloud infrastructure, but keep local control of the central transaction engine.
Where client-side latency is a concern, one can also deploy opaque but functional algorithms derived from machine learning methods, rather than trusting the entirety of the system to the cloud, and encrypt or tokenize data returns for local analysis.
Such hybrid approaches have become increasingly common in the face of growing breach reports [8] and hacking scandals over the last ten years.
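As a rough illustration of the tokenization idea described above, the following Python sketch replaces sensitive fields with random tokens before a record leaves the premises, while the token-to-value mapping stays in a locally controlled store. Every name here (LocalTokenVault, prepare_for_cloud, restore_locally, the customer_id field) is hypothetical and not drawn from any particular product. A production system would also need persistent, access-controlled storage for the vault, which this sketch leaves out.

```python
import secrets


class LocalTokenVault:
    """Hypothetical in-house vault mapping random tokens to sensitive values.

    Assumption: an in-memory dict stands in for a secured local datastore.
    """

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the token for a repeated value so cloud-side grouping still
        # works. Note that this reveals which records share a value.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only code with access to the local vault can reverse a token.
        return self._token_to_value[token]


def prepare_for_cloud(record: dict, sensitive_fields: set, vault: LocalTokenVault) -> dict:
    """Return a copy of the record with sensitive fields replaced by tokens."""
    return {
        key: vault.tokenize(value) if key in sensitive_fields else value
        for key, value in record.items()
    }


def restore_locally(record: dict, sensitive_fields: set, vault: LocalTokenVault) -> dict:
    """Return a copy of a cloud response with tokens swapped back to values."""
    return {
        key: vault.detokenize(value) if key in sensitive_fields else value
        for key, value in record.items()
    }


# Usage: the cloud service only ever sees the tokenized record.
vault = LocalTokenVault()
transaction = {"customer_id": "C-1001", "amount": 42.5}
outbound = prepare_for_cloud(transaction, {"customer_id"}, vault)
restored = restore_locally(outbound, {"customer_id"}, vault)
```

The design choice in this sketch is that the cloud side receives data that is useful for processing but meaningless without the local vault, so the central transaction engine stays under in-house control.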
3: Keeping Control of Data Governance and Compliance
The specificity of the input data for machine learning models is so lost in the training process that concerns around governance and management of the source training data might seem irrelevant, and shortcuts tempting.
However, controversial algorithm output can result in a clear inference of bias, and in embarrassingly public audits of the unprocessed training source data and the methodologies used.
In-house systems can more easily contain such anomalies once they are identified. An in-house approach ensures that any such roadblocks in machine learning development neither overstep the terms and conditions of the cloud AI providers nor risk infringing the lattice of varying location-specific privacy and governance legislation that must be considered when deploying cloud-based AI processing systems.
4: AIaaS Can Be Used for Rapid Prototyping
The tension between in-house enterprise AI and cloud-based or outsourced AI development is not a zero-sum game. The diffusion of open-source libraries and frameworks into the most popular high-volume cloud AI solutions enables rapid prototyping and experimentation, using core technologies that can be moved in-house after the proof-of-concept is established, but which are rather more difficult for a local team to investigate creatively on an ad-hoc basis.
Rob Thomas, General Manager of IBM Data and Watson AI, has emphasized the importance of using at-scale turnkey solutions to explore various conceptual possibilities for local or hybrid AI implementations, asserting that even a 50% failure rate will leave an in-house approach with multiple viable paths forward [13].
5: High-Volume Providers Are Not Outfitted for Marginal Use Cases
If an in-house project does not center around the highest-volume use cases of external providers, such as computer vision or natural language processing, deployment and tooling are likely to be more complicated and time-consuming. The project is also likely to lack quick-start features such as applicable pre-trained models, suitably customizable analytics interfaces, or apposite data pre-processing pipelines.
Not all marginal use cases of this nature are SMB-sized. They also occur in industries and sectors that may be essential but that operate at so limited a scale, or under such high levels of oversight (as in the nuclear and financial industries), that no ‘templated’ AI outsourcing solution is ever likely to offer adequate regulatory compliance frameworks across territories, or enough economy of scale to justify investment on the part of off-the-shelf cloud AI providers.
Commodity cloud APIs can also prove more expensive and less responsive in cases where the value of a data transaction lies in its scarcity and exclusivity rather than in its capacity to scale at volume or to serve a large captive user base at very low latency.