YDT Blog

In the YDT blog you'll find the latest news about the community, tutorials, helpful resources and much more! React to the news with the emotion stickers and have fun!

Ace Deep Learning in a Service-Based Organization

From: https://newsroom.cisco.com/feature-content?type=webcontent&articleId=1895218

Let us begin with a motivational quote:

Between now and 2030, it [Deep Learning] will create an estimated $13 trillion of GDP growth. — Andrew Ng

Present Scenario

Every tech titan, be it Alphabet, Facebook, or Microsoft, sees Data Science as the key to ruling the market.

Jokes like “You are either working for Google or competing against it” now feel like nightmares.

With the tech titans open-sourcing their frequently updated research, almost every other startup is able to build upon it to create its own Deep Learning solution.

These startups, due to their low burn rate, are able to offer Deep Learning solutions at much lower prices.

Note: Henceforth, I will be using Deep Learning and Data Science interchangeably.

Here are my two cents on how a Service-Based Organization (SBO) can compete and ace in this environment.

MVP over POC

Most service-based companies suffer from a phobia of the word “Product”, so they develop POCs instead.

Let’s first understand what is what:

A Product is an article or substance that is manufactured or refined for sale.

A Proof of Concept (POC) is a miniature representation of the end product, with a few working features, that aims to verify that some concept has practical potential.

A Minimum Viable Product (MVP) is a development technique in which a new product or website is built with just enough features to satisfy early adopters.

Not being a product-based company is no excuse to create half-baked solutions in the name of POCs.

Remember, the bottom line is always to create what people want. While POCs focus on a few working features, an MVP stresses that a few working features are useless unless they satisfy the customer.

Often, and predominantly in Deep Learning, a cloud of uncertainty hangs over the end result of a solution. This makes the client nervous about investing in the proposed solution.

Repeatedly, you will encounter clients who had already had a few bad experiences before they approached your organization. Showcasing something they have a hard time wrapping their heads around, especially when they are already frustrated, would frankly be stupid.

A controlled demo or a command-line demo doesn’t make the client any more comfortable with the work.

The client needs an intuitive and interactive interface, something they can play around with, to satisfy their curiosity and hence overcome their reluctance.

Data-Efficient Systems

Client requirements are mostly too specific to a niche domain. Hence a huge, high-quality dataset is rarely available.

Service-based companies are usually working on multiple projects, even under the same client. Their resources are distributed across different tasks and even different areas of expertise.


Depending upon the client for data, especially for POCs/POVs/MVPs, is a lost cause.

To circumvent this, companies initially show the demo on a rather large but irrelevant (to the client’s actual ask) dataset with magnificent results. This sets a disastrous precedent, leading to over-commitment and under-delivery.

Data-efficient systems which can produce somewhat satisfactory results on small datasets make far more sense, as a ‘small’ high-quality dataset is rather easy to find or even build.
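One common trick for squeezing more out of a small dataset is augmentation: generating slightly perturbed copies of each example. The sketch below is a minimal, hypothetical illustration (the `augment` helper and the tiny dataset are made up for this example), not a production technique:

```python
import random

def augment(samples, n_copies=4, noise=0.05, seed=0):
    """Expand a small labeled dataset by adding jittered copies
    of each numeric feature vector (a simple data-efficiency trick)."""
    rng = random.Random(seed)
    out = list(samples)
    for features, label in samples:
        for _ in range(n_copies):
            jittered = [x + rng.uniform(-noise, noise) for x in features]
            out.append((jittered, label))
    return out

# A 'small' but high-quality dataset: three labeled feature vectors.
tiny = [([0.1, 0.9], "pos"), ([0.8, 0.2], "neg"), ([0.5, 0.5], "neu")]
expanded = augment(tiny)
print(len(expanded))  # 3 originals + 3 * 4 jittered copies = 15
```

Even this naive jitter can help a small model generalize; real pipelines would use domain-appropriate augmentations instead.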

Exact Client Requirements

Decentralized distribution of resources over multiple clients allows service-based companies to penetrate deep into the requirements of individual clients.

This allows service-based companies to tailor (modify, not build) the solution to the client’s exact requirements.

Note: This does not decrease the scope of the project; it simply expands the scope to include the client’s additional requirements.

This allows service-based companies to move away from academia’s SOTA (State-Of-The-Art) ideology if and when required, and simply customize (again, not build) a system that works for that one specific client or type of client.

Domain Expertise

Service-based companies usually hire domain experts who work on different projects from the same domain for different clients.

Their niche but solid expertise results in the assurance of deliverability of quality with minimum risk. This assurance is what the big clients value the most, especially the banks and trading corporations.

Out of the Box Solutions

In Data Science, it has become standard practice to develop solutions that require manual effort on the client’s side to train systems before using them.

This is an easy but ineffective way to circumvent developing data-efficient solutions because:

Data Science projects are usually the need of a non-technical team on the client’s side (a financial or business team).

Training such solutions is usually too cumbersome for the personnel of these non-technical teams.

So the client usually allocates a set of ‘technical personnel’ to train these solutions.

Hence, even before the client has seen a glimpse of the system, he has to wait for:

  • education of its technical personnel about the use-case
  • education of its technical personnel about the technology
  • its technical personnel to train the system

On the other hand, Out of the Box Solutions are systems that can be used directly without any pre-training.
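The contrast can be made concrete with a toy sketch: a model that ships with its "pretrained" parameters bundled in, so the client calls it immediately with no training step. Everything here (the class name, the tiny lexicon) is hypothetical, chosen only to illustrate the zero-setup experience:

```python
# Bundled "pretrained" parameters: shipped with the package,
# so no client-side training is ever required.
PRETRAINED_LEXICON = {"great": 1.0, "good": 0.5, "bad": -0.5, "awful": -1.0}

class OutOfTheBoxSentiment:
    """A toy out-of-the-box sentiment scorer: usable the moment it is installed."""

    def __init__(self, lexicon=None):
        self.lexicon = dict(PRETRAINED_LEXICON if lexicon is None else lexicon)

    def predict(self, text):
        score = sum(self.lexicon.get(w, 0.0) for w in text.lower().split())
        return "positive" if score > 0 else "negative" if score < 0 else "neutral"

model = OutOfTheBoxSentiment()          # no training step, no data collection
print(model.predict("great service"))   # -> positive
```

The client sees something substantial on day one; tailoring (swapping in a domain lexicon) can follow later.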

Therefore, without much ado, the client has something substantial.

This provides the client with an early assurance of, and confidence in, the capabilities of the service-based company.

Flexible Architecture

Though solutions should be tailored to the client’s requirement, this does not translate to developing use-and-throw solutions.

Avoid developing solutions with a scope too narrow to accommodate any future requirements.

Developing an immutable black-box solution which can be barely modified, usually leads to conflicts with clients and messy patch-up work.

Enter the Micro-Service Architecture.

Micro-Service is an architectural style that defines a project as a collection of services which are:

  • highly maintainable and testable
  • loosely coupled
  • independently deployable

This architecture ensures that the client can alter the working of the solution, by applying a different permutation, omission or addition of services, with minimal effort and to its satisfaction.
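The "permutation, omission or addition" idea can be sketched in a few lines: each service is an independent, loosely coupled callable, and a solution is just an ordered composition of them. The service names below are invented for illustration; real micro-services would of course be independently deployed processes, not in-process functions:

```python
# Three loosely coupled "services": each does one job and knows
# nothing about the others.
def lowercase(doc):
    return doc.lower()

def strip_punct(doc):
    return "".join(c for c in doc if c.isalnum() or c.isspace())

def tokenize(doc):
    return doc.split()

def pipeline(services):
    """Compose services in order; altering the solution is just
    passing a different list."""
    def run(doc):
        for service in services:
            doc = service(doc)
        return doc
    return run

default = pipeline([lowercase, strip_punct, tokenize])
custom = pipeline([lowercase, tokenize])  # one service omitted, zero rework

print(default("Hello, World!"))  # ['hello', 'world']
print(custom("Hello, World!"))   # ['hello,', 'world!']
```

Because the services are independent, the client-facing team can reorder, drop, or add a step without touching the others.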

Centralised R&D

Service-based companies are usually scared to allocate ‘sufficient’ separate funds for non-client-specific R&D.

Usually while developing a client-specific Deep Learning solution, the scope of the R&D is limited to the ‘current’ requirements of the client and not the domain of that client.

The Deep Learning projects from the same domain (NLP, Computer Vision etc) can have substantial overlap of pipelines, learners, pre-processing steps etc.

In the absence of a centralized team, there is going to be a lot of redundant effort.

The presence of a centralized team is the key to ensure that client-specific teams have an existing solution to improve upon, rather than building a new solution every time.
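One lightweight way a centralized team can expose its work is a shared registry of reusable pipeline components, so client-specific teams pull an existing building block instead of rewriting it. This is a hypothetical sketch (the registry, decorator, and component are all invented here):

```python
# Central registry maintained by the R&D team: client teams look up
# existing components instead of rebuilding them per project.
REGISTRY = {}

def register(name):
    """Decorator that publishes a component under a shared name."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

@register("normalize_whitespace")
def normalize_whitespace(text):
    # A typical shared pre-processing step, written once, improved centrally.
    return " ".join(text.split())

# A client-specific team reuses the shared step rather than rewriting it:
step = REGISTRY["normalize_whitespace"]
print(step("deep   learning \n pipeline"))  # 'deep learning pipeline'
```

Improvements the central team makes to a registered component then reach every client project that uses it.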

Educational Tie-Ups

When facing a unique, non-traditional client requirement or maintaining effective R&D, experience plays a crucial role.

Let’s say, you want to design a slot tagger that works well on 20 examples.

One of the architectures that would work:

Attention Augmented Bi-Directional Gated Convolution Network using Scattering Wavelets and Fourier Transformation

One won’t find this architecture in any tutorial or blog.

This architecture is not an obvious choice until you have researched Information Theory, Signal Processing and Deep Learning for quite some time, an amount of time that is generally infeasible.

Professors have decades of experience, by virtue of which, they gain tremendous knowledge around a specific field. Their capability to pinpoint and resolve/circumvent issues is unparalleled.


Extensive Benchmarking

Everyone is cynical about pushing black-box solutions to production.

No one has been able to predict the behaviour of a neural network, especially a deep one, under diverse scenarios.

Extensive benchmarking ensures that everyone, from the developer to the client, is aware of the capabilities, and hence the limits of those capabilities, of a given solution under different scenarios.
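A scenario benchmark can be as simple as evaluating one fixed model on clean versus perturbed inputs and reporting both numbers. The model, data, and noise scheme below are all toy stand-ins invented for illustration:

```python
import random

def model(x):
    """Toy stand-in for a trained classifier: thresholds a score."""
    return "high" if x >= 0.5 else "low"

def accuracy(pairs):
    return sum(model(x) == y for x, y in pairs) / len(pairs)

rng = random.Random(0)
clean = [(0.9, "high"), (0.8, "high"), (0.1, "low"), (0.2, "low")]
# "Noisy" scenario: the same examples with input perturbations applied.
noisy = [(x + rng.uniform(-0.45, 0.45), y) for x, y in clean]

report = {"clean": accuracy(clean), "noisy": accuracy(noisy)}
print(report)  # accuracy per scenario; noise may degrade it
```

Shared with the client, a per-scenario report like this makes the solution's limits explicit rather than discovered in production.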


Flexible Timelines

Deep Learning, in its simplest form, is a few random numbers converging to a certain configuration where they start to learn the relations between the input and the output.
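That "random numbers converging" picture can be shown in miniature: a single randomly initialized weight, pushed by gradient descent on squared error until it captures the relation y = 3x. The data and learning rate are arbitrary choices for this sketch:

```python
import random

rng = random.Random(42)
w = rng.uniform(-1.0, 1.0)  # "a few random numbers" -- here just one weight
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]  # the relation to learn

lr = 0.05
for _ in range(200):            # gradient descent on squared error
    for x, y in data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad

print(round(w, 3))  # -> 3.0: the random weight has converged to the relation
```

With one weight the convergence is guaranteed; with millions of weights, nobody can say in advance how long it will take, or whether it will happen at all, which is exactly why timelines are uncertain.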

Unlike with a traditional software solution, one can at most have a hunch about what might work and what might not.

Hence, creating robust solutions can take more time and effort than expected at the beginning.

A hard definite timeline should never be declared or expected.

Stakeholders should maintain a buffer on their fronts to incorporate the delays.

Special thanks to Deepak Saini & Mayank Saini.


Ace Deep Learning in a Service-Based Organization was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

Source: Towards Data Science

[D] Batch Normalization is a Cause of Adversarial Vulnerability

Abstract – Batch normalization (batch norm) is often used in an attempt to stabilize and accelerate training in deep neural networks. In many cases it indeed decreases the number of parameter updates required to achieve low training error. However, it also reduces robustness to small adversarial input perturbations and noise by double-digit percentages, as we show on five standard data-sets. Furthermore, substituting weight decay for batch norm is sufficient to nullify the relationship between adversarial vulnerability and the input dimension. Our work is consistent with a mean-field analysis that found that batch norm causes exploding gradients.

Page – https://arxiv.org/abs/1905.02161

PDF – https://arxiv.org/pdf/1905.02161.pdf

Has anyone read the paper and experienced robustness issues with deployment of Batchnorm models in the real world?

submitted by /u/aseembits93
Source: Reddit Machine Learning

5 ways Data Analytics is a Game Changer for the Insurance Industry

We are in the age of ‘Big Data’, and while it is becoming big business in the emerging technological world, let us first understand and get a deeper insight into what Data Analysis basically is.

Data Analysis is a process through which we intensively inspect, transform, and model unstructured data, reducing it to a structured form. This is done with the intent of discovering useful information, drawing conclusions, and supporting decision making. Fueled by the ballooning number of devices, IDC’s Data Age study predicts that by 2025 the total amount of digital data created worldwide will rise to 163 zettabytes.

While touching various industry verticals with its advantages, Data Analytics is playing its part in the insurance industry too. Insurance data analysis is the way to effectively gauge your financial situation to get an idea of how much risk, if any, you are able to undertake as a business and how much of it should be transferred to an insurance company. Hence, a complete insurance data analysis will ensure that you have all the greater risks covered to the best of your ability.

Now that we have an understanding of Data Analytics and its role in the insurance industry, following are the top 5 ways in …

Read More on Datafloq

Source: Datafloq

[D] How much do CS/ML assistant professors get paid?

While reading the other thread (https://www.reddit.com/r/MachineLearning/comments/d1ooem/d_when_the_ai_professor_leaves_students_suffer/), someone linked to a H1B database of professor salaries which showed assistant professors are getting paid ~120K max at a top school like CMU (unless in the business school).

Is this really true? Is there variance among schools? I am very surprised since these candidates could easily make 300K-500K as a research scientist at the big tech companies. Granted, there is still old school prestige attached to being a professor, but it seems like they are leaving a lot of money on the table.

submitted by /u/20150831
Source: Reddit Machine Learning
