Let us begin with a motivational quote:
Between now and 2030, it [Deep Learning] will create an estimated $13 trillion of GDP growth. — Andrew Ng
Every tech titan, be it Alphabet, Facebook, or Microsoft, sees Data Science as the key to ruling the market.
Jokes like “You are either working for Google or competing against it” now feel like nightmares.
With the tech titans open-sourcing their frequently updated research, almost every startup can build upon it to create its own Deep Learning solution.
These startups, thanks to their low burn rate, are able to offer Deep Learning solutions at much lower prices.
Note: Henceforth, I will be using Deep Learning and Data Science interchangeably.
Here are my two cents on how a Service-Based Organization (SBO) can compete and excel in this environment.
MVP over POC
Most service-based companies suffer from a phobia of the word “Product”, so they develop POCs instead.
Let’s first understand what is what:
A Product is an article or substance that is manufactured or refined for sale.
A Proof of Concept (POC) is a miniature representation of the end product, with a few working features, that aims to verify that some concept has practical potential.
A Minimum Viable Product (MVP) is a development technique in which a new product or website is built with just enough features to satisfy early adopters.
Not being a product-based company is no excuse to create half-baked solutions in the name of POCs.
Remember, the bottom line is always to create what people want. While a POC focuses on a few working features, an MVP stresses that a few working features are useless unless they satisfy the customer.
Often, predominantly in Deep Learning, a cloud of uncertainty hangs over the solution’s end result. This makes the client nervous about investing in the proposed solution.
You will repeatedly encounter clients who had a few bad experiences before they approached your organization. Showing them something they have a hard time wrapping their heads around, especially when they are already frustrated, would frankly be stupid.
A controlled demo or a command-line demo doesn’t make the client any more comfortable with the work.
The client needs an intuitive, interactive interface, something they can play around with, to satisfy their curiosity and thereby overcome their reluctance.
Client requirements are mostly specific to a niche domain, so a huge, high-quality dataset is rarely available.
Service-based companies usually work on multiple projects, even under the same client, so their resources are spread across different tasks and areas of expertise.
Depending upon the client for data, especially for POCs/POVs/MVPs, is a lost cause.
To circumvent this, companies initially show a demo on a rather large but irrelevant (to the client’s actual ask) dataset with magnificent results. This sets a disastrous precedent, leading to over-commitment and under-delivery.
Data-efficient systems that can produce reasonably satisfactory results on small datasets make far more sense, as a small, high-quality dataset is relatively easy to find or even build.
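One cheap way to stretch a small dataset, sketched here with hypothetical helper names and a made-up one-example dataset, is simple augmentation: randomly dropping words from labelled sentences to generate extra training variants without new annotation effort.

```python
import random

random.seed(42)

def augment(sentence: str, n_variants: int = 3) -> list:
    """Generate noisy variants of a sentence by random word dropout."""
    words = sentence.split()
    variants = []
    for _ in range(n_variants):
        # drop each word with 20% probability, but keep at least one word
        kept = [w for w in words if random.random() > 0.2] or words[:1]
        variants.append(" ".join(kept))
    return variants

# hypothetical tiny labelled dataset: (utterance, intent)
tiny_dataset = [("book a flight to paris", "travel")]
augmented = [(v, label) for text, label in tiny_dataset for v in augment(text)]
print(len(augmented))  # 3 variants per original example
```

This is only one of many data-efficiency tricks; the point is that multiplying a small, high-quality dataset is often cheaper than chasing a large one.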
Exact Client Requirements
Decentralized distribution of resources over multiple clients allows service-based companies to penetrate deep into the requirements of individual clients.
This allows service-based companies to tailor (modify, not build) the solution to the client’s exact requirements.
Note: This does not decrease the scope of the project; it simply expands the scope to include the client’s additional requirements.
This allows service-based companies to move away from academia’s SOTA (State-Of-The-Art) ideology if and when required, and simply customize (again, not build) a system that works for that one specific client or type of client.
Service-based companies usually hire domain experts who work on different projects from the same domain for different clients.
Their niche but solid expertise assures the delivery of quality with minimum risk. This assurance is what big clients value the most, especially banks and trading corporations.
Out of the Box Solutions
In Data Science, it has become standard practice to develop solutions that require manual effort on the client’s side to train the systems before using them.
This is an easy but ineffective way to circumvent developing data-efficient solutions because:
Data Science projects usually originate from the needs of a non-technical team on the client’s side (a financial or business team).
Training such solutions is usually too cumbersome for the personnel of these non-technical teams.
So the client usually allocates a set of ‘technical personnel’ to train these solutions.
Hence, even before the client has seen a glimpse of the system, he has to wait for:
- education of its technical personnel about the use-case
- education of its technical personnel about the technology
- its technical personnel to train the system
On the other hand, Out of the Box Solutions are systems that can be used directly, without any pre-training.
Therefore, without much ado, the client has something substantial.
This provides the client with an early assurance of, and confidence in, the capabilities of the service-based company.
Though solutions should be tailored to the client’s requirement, this does not translate to developing use-and-throw solutions.
Avoid developing solutions with a scope too narrow to accommodate any future requirements.
Developing an immutable black-box solution that can barely be modified usually leads to conflicts with clients and messy patch-up work.
Enter the Micro-Service Architecture.
Micro-Service is an architectural style that structures a project as a collection of services which are:
- highly maintainable and testable
- loosely coupled
- independently deployable
This architecture ensures that the client can alter the working of the solution, by applying a different permutation, omission or addition of services, with minimal effort and to its satisfaction.
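As an in-process sketch only (real micro-services would be independently deployed, e.g. behind their own endpoints), the composition idea can be illustrated with hypothetical text-processing services that all share one narrow interface, so a client’s pipeline can be reordered, trimmed, or extended with minimal effort:

```python
from typing import Callable, List

# every service exposes the same narrow interface: str -> str
Service = Callable[[str], str]

def lowercase(text: str) -> str:          # hypothetical normalization service
    return text.lower()

def strip_punctuation(text: str) -> str:  # hypothetical cleaning service
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())

def tokenize(text: str) -> str:           # hypothetical whitespace normalizer
    return " ".join(text.split())

def run_pipeline(services: List[Service], text: str) -> str:
    """Apply services in order; any permutation/omission/addition works."""
    for service in services:
        text = service(text)
    return text

# Client A wants all three services; Client B omits punctuation stripping.
client_a = run_pipeline([lowercase, strip_punctuation, tokenize], "Hello,   World!")
client_b = run_pipeline([lowercase, tokenize], "Hello,   World!")
print(client_a)  # hello world
print(client_b)  # hello, world!
```

Because each service is loosely coupled to the others, swapping one out for a client-specific variant does not touch the rest of the pipeline.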
Service-based companies are usually scared to allocate ‘sufficient’ separate funds for non-client-specific R&D.
Usually while developing a client-specific Deep Learning solution, the scope of the R&D is limited to the ‘current’ requirements of the client and not the domain of that client.
The Deep Learning projects from the same domain (NLP, Computer Vision etc) can have substantial overlap of pipelines, learners, pre-processing steps etc.
In the absence of a centralized team, there is going to be a lot of redundant effort.
The presence of a centralized team is the key to ensure that client-specific teams have an existing solution to improve upon, rather than building a new solution every time.
When facing a unique, non-traditional client requirement or maintaining an effective R&D practice, experience plays a crucial role.
Let’s say, you want to design a slot tagger that works well on 20 examples.
One of the architectures that would work:
Attention Augmented Bi-Directional Gated Convolution Network using Scattering Wavelets and Fourier Transformation
One won’t find this architecture in any tutorial or blog.
It is not an obvious choice until you have researched Information Theory, Signal Processing and Deep Learning for quite some time, an amount of time that is generally infeasible.
Professors have decades of experience, by virtue of which, they gain tremendous knowledge around a specific field. Their capability to pinpoint and resolve/circumvent issues is unparalleled.
Everyone is cynical about pushing blackbox solutions to production.
No one has been able to predict the behaviour of a neural network, especially a deep one, under diverse scenarios.
Extensive benchmarking ensures that everyone, from the developer to the client, is aware of the capabilities, and hence the limits of those capabilities, of a given solution under different scenarios.
Deep Learning, in its simplest form, is a few random numbers converging to a configuration in which they start to capture the relations between the input and the output.
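As a toy illustration of that convergence (a single weight, squared error, and plain gradient descent, all hypothetical choices rather than any production setup), a random number can be nudged onto the relation y = 2x:

```python
import random

random.seed(0)
w = random.uniform(-1.0, 1.0)               # "a few random numbers"
data = [(x, 2.0 * x) for x in range(1, 6)]  # the input-output relation y = 2x
lr = 0.01                                   # learning rate

for _ in range(200):
    for x, y in data:
        grad = 2 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
        w -= lr * grad              # gradient-descent update

print(round(w, 3))  # converges close to 2.0
```

Until the training run finishes, there is no guarantee the initial random configuration will settle where you want, which is exactly why timelines are hard to promise.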
Unlike with traditional software solutions, one can at most have a hunch about what might work and what might not.
Hence, creating robust solutions can take more time and effort than was expected at the beginning.
A hard definite timeline should never be declared or expected.
Stakeholders should maintain a buffer on their fronts to incorporate the delays.
Special thanks to Deepak Saini & Mayank Saini.
Ace Deep Learning in a Service-Based Organization was originally published in Towards Data Science on Medium.