Why every Product Manager should know Machine Learning
Machine Learning (ML) is becoming ubiquitous, and more and more products are attempting to use some flavor of it to better address customer problems and delight customers. ML is no longer a fad, and it is not restricted to the familiar use cases of image recognition, page ranking, spam detection, autonomous cars, etc. Appropriate use of ML algorithms is becoming essential to differentiate a product and deliver a better value proposition to its customers.
Machine learning is evolving rapidly, and it has the power to break new ground, creating opportunities by solving problems that have eluded us for a long time. Even a product that is not entirely ML dependent and already addresses a certain problem can address that problem better by harnessing ML. ML is fast becoming the de facto technology for every product, and it offers the Product Manager only two choices: (1) embrace ML effectively or (2) fade into oblivion. Realistically, there is only one option to pick.
Embracing ML raises a conundrum of questions: Can ML offer an effective solution? What are the alternatives, and how is ML better than them? How do we validate the efficacy of ML, and what is the real value delivered to customers? What role do PMs play in answering these questions? Please note that this blog is about products that are not entirely ML dependent, unlike autonomous cars, speech recognition, etc.
Great PMs should be tremendously good at discovering, identifying and then defining the problem: what is the problem that the product has to address? Great PMs spend time identifying a problem that is worth addressing. Focus on technology and solution is always secondary. Technology always evolves, and so do solutions. Identifying the right problem provides the right start for identifying the right technology and delivering the right solution. PMs should focus on identifying the problem, and it is the responsibility of engineering to identify the right flavor of ML based on the problem statement provided by the PM.
Then why am I insisting that PMs be aware of Machine Learning? I see three reasons for it.
Speak the language of engineering
Even though PMs should focus more on the problem and less on the solution, and need not play a role in drafting HOW to address the problem using ML, they should definitely have the technical acumen to understand the HOW. Understanding the HOW positions PMs to evaluate without bias how exactly ML is better at addressing the customer problem and whether it really adds significant value. Incorporating ML does not automatically bring results. There is no magic behind ML. It has to be harnessed in the right way, with the right data and models, to contribute the right value. The destination is not the only thing that matters; the journey is equally important, and PMs should take notice of the engineering team's efforts and not just the outcome. I always believe in rewarding efforts and not just outcomes. During the journey, PMs should speak the language of engineering to comprehend those efforts and to assist as well. Let me imagine a fictional conversation between my engineer and me after I asked for his help predicting product revenue based on previously collected data.
Engineer: Hey Murali, I have tried multiple models for this prediction problem. However, the bias is too high with all of them, so we have an underfitting problem. I guess we need more features.
Me: What are features?
Engineer: Any additional information that can augment the existing customer data.
Me: Oh, I have the customers' age and country.
Engineer: Is there a strong correlation between country and the revenue generated?
Me: Hmm … Correlation? I doubt it.
Engineer: Then that data is not useful. We need something else. Unless you provide additional data, I cannot build a good prediction model.
Me: (Completely lost in the jargon and not sure what data to collect. Since I do not know the first thing about ML, I start doubting the capabilities of my engineer. I gave him data on thousands of customers, yet he is asking for more. What the heck?)
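To make the engineer's question about correlation concrete, here is a minimal Python sketch of how the relationship between a numeric feature (customer age) and revenue could be checked. The numbers are made up purely for illustration, and `pearson_correlation` is a hypothetical helper name, not something from the conversation. A categorical feature such as country would first have to be encoded as numbers, which this sketch does not cover.

```python
import math


def pearson_correlation(xs, ys):
    """Return Pearson's r between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative data: customer age and revenue per customer
ages = [25, 32, 41, 48, 55, 63]
revenue = [120, 150, 210, 230, 300, 310]

r = pearson_correlation(ages, revenue)
print(f"correlation between age and revenue: {r:.2f}")
```

A value close to +1 or -1 suggests a strong linear relationship, while a value near 0 suggests the feature carries little linear signal on its own. That is roughly what the engineer was asking about.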
PMs interface with many groups (engineering, C-level executives, sales, marketing, account teams, etc.), and speaking their respective languages earns mutual respect and trust. When I insist that PMs should learn ML, I am not focusing on the ability to build a new model but on the ability to understand existing models, at least to the extent of mapping the problem at hand to the right flavor: prediction, classification, ranking, clustering, anomaly detection, etc. PMs should understand the overall landscape of ML to differentiate reality from hype and to be clear about what is possible to solve using ML.
What lies beneath the surface?
Customers (mostly B2B) are not merely interested in what happens above the surface; they are keen to understand what happens beneath the surface as well. It is essential for Product Managers to explain how the product works, in addition to articulating what the product can do for customers. Without understanding the specific flavor of ML used in the product and how it improves the overall efficacy of the solution, it would be tough for a Product Manager to articulate the details succinctly.
There is money at stake, and B2B products are evaluated rationally rather than emotionally, so the person implementing and using the product will be curious to know what data is used to validate the models and how accurate the outcome is. Customers will be interested in knowing the precision, recall, and F1 score. In simple terms, customers might want to know the rate of false positives and false negatives to build their confidence in the outcome delivered by the product.
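For readers who have not met these scores before, the following Python sketch shows how precision, recall, and F1 follow from counts of true positives, false positives, and false negatives. The function name `classification_metrics` and the counts are assumptions made up for illustration.

```python
def classification_metrics(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from raw prediction counts."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives

    # Precision: of everything flagged as positive, how much was right?
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    # Recall: of everything that really was positive, how much was found?
    recall = true_positives / actual_positive if actual_positive else 0.0

    # F1: harmonic mean of precision and recall
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts: 80 correct alerts, 20 false alarms, 40 missed cases
precision, recall, f1 = classification_metrics(80, 20, 40)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Many false positives pull precision down, and many false negatives pull recall down. Which of the two matters more depends on what a wrong answer costs the customer.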
ML will soon be as fundamental as Excel to PMs
After scratching the surface of ML through online courses on Coursera and hands-on Python courses, I came to believe that ML tooling can soon become as standard as Excel for two things: data visualization and data analysis.
Data Visualization
Presenting data is a fundamental skill that every PM should possess. Companies are sitting on a goldmine of product-related data gathered internally and externally. As data collection increases, there should be efficient, scalable ways to process and visualize that data to get newer insights. The technologies used for presenting data should evolve, and PMs can no longer restrict their expertise to a few basic graphs in Excel.
Matplotlib, Seaborn and various other libraries have tons of pre-built plots to visualize data in a meaningful way. Using those libraries is not rocket science. Anyone with basic knowledge of programming should be able to use them. Considering that many PMs were once developers or designers, using those visualization libraries should be a piece of cake. What matters is that those libraries provide a tremendous range of options to visualize data. PMs do have huge chunks of marketing and sales data on their hands, and it might not always be feasible to get assistance from engineering to choose the right plots for visualizing it. Visualizing data can provide more insights than plain numbers on a sheet of paper. There are options for bar plots, scatter plots, correlation heat maps, country/world maps, etc.
PS: Some of those plots are already available in Excel, but I feel Python provides more flexibility. Using Python reminds me of those lost days of programming. I guess once a developer, always a developer :-).
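As a rough illustration of the bar and scatter plots mentioned above, here is a small Matplotlib sketch. It assumes Matplotlib is installed, and all the data, labels, and the output file name `pm_dashboard.png` are made up for the example.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Made-up illustrative data
regions = ["North America", "Europe", "Asia", "Other"]
revenue_by_region = [420, 310, 260, 90]      # thousand $
deal_size = [5, 12, 20, 28, 35, 50]          # thousand $
sales_cycle_days = [14, 30, 41, 55, 62, 90]

# Two charts side by side in one figure
fig, (ax_bar, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Bar plot: compare one number across categories
ax_bar.bar(regions, revenue_by_region)
ax_bar.set_title("Revenue by region")
ax_bar.set_ylabel("Revenue (thousand $)")

# Scatter plot: look for a relationship between two numbers
ax_scatter.scatter(deal_size, sales_cycle_days)
ax_scatter.set_title("Deal size vs. sales cycle")
ax_scatter.set_xlabel("Deal size (thousand $)")
ax_scatter.set_ylabel("Sales cycle (days)")

fig.tight_layout()
fig.savefig("pm_dashboard.png")  # writes the chart to the current directory
```

In a notebook, the `matplotlib.use("Agg")` line can be dropped so that the chart is shown inline instead of being written to a file.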
Data Analysis
It is not just visualization; analysis of data is also important. At a very fundamental level, every Product Manager should be able to forecast revenues, assess the probability of a deal succeeding, or cluster customers into segments based on common demographic elements. Again, I am speaking mostly from the context of the B2B segment. For data analysis, PMs can understand the nature of the problem (prediction, classification or clustering) and pick one of the already available algorithms to analyze the data. The ability to interpret metrics such as mean squared error, F1 score, etc. on both the training set and the validation set can help a Product Manager decide whether the model suffers from high bias or high variance and identify remedies. The Internet has tons of well-documented remedies for the various combinations of bias and variance problems. ML is definitely a valuable tool for analyzing data and constructing meaningful interpretations of marketing data to define pricing, design marketing campaigns, define strategies to acquire and retain customers, etc.
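To show what interpreting errors on the training and validation sets can look like, here is a small Python sketch. It computes mean squared error on both sets and applies a rough rule of thumb to guess between high bias and high variance. The helper names, the `acceptable_error` threshold, the `gap_ratio` of 2.0, and all the numbers are assumptions chosen for illustration, not standard values.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def diagnose(train_error, validation_error, acceptable_error, gap_ratio=2.0):
    """Rough heuristic only; the thresholds are illustrative assumptions."""
    # The model cannot even fit the data it was trained on
    if train_error > acceptable_error:
        return "high bias (underfitting): consider more features or a more flexible model"
    # The model fits the training data but does much worse on unseen data
    if validation_error > acceptable_error and validation_error > gap_ratio * train_error:
        return "high variance (overfitting): consider more data or regularization"
    return "errors look acceptable on both sets"


# Made-up revenue figures and model predictions
train_actual = [100, 150, 200, 250]
train_predicted = [102, 149, 198, 251]
validation_actual = [120, 180, 240]
validation_predicted = [140, 160, 270]

train_mse = mean_squared_error(train_actual, train_predicted)
validation_mse = mean_squared_error(validation_actual, validation_predicted)
verdict = diagnose(train_mse, validation_mse, acceptable_error=25.0)
print(train_mse, validation_mse, verdict)
```

In this made-up case the model fits the training data closely but misses badly on the validation data, which is the typical signature of high variance.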
How I got started
Ever since I started learning ML, I have loved every bit of it. That is probably because of my passion for mathematics and my background in statistics. If you detest numbers, then I bet ML will be boring for you.
My journey started with a Coursera course by Andrew Ng. It gave me a basic understanding of Machine Learning fundamentals:
- What are supervised and unsupervised learning?
- What are various models under each of those categories? How to identify the right model depending on the use-case?
- How to measure and interpret error for various algorithms?
- When to use regularization?
- What are bias and variance? What are the techniques for managing the bias/variance tradeoff?
- When to add more features to existing data and when to collect more data?
Awareness of the above is sufficient to speak the language of ML with engineering, to understand, appreciate and complement their efforts, and to comprehend what is truly possible with ML.
Andrew Ng's course also requires some programming. It is not too difficult, as there is a lot of hand-holding. If you have been a programmer, try to avoid the habitual use of loops and instead use vectors and matrices to solve problems. Vectorized methods are faster, and pretty much an entire algorithm can be programmed in almost a single line. I solved almost 90% of the problems using vectors. Before you get started, gain a basic understanding of vector and matrix operations. It took me some time to realize that [A] * [B] is not equivalent to [B] * [A], and that multiplying two matrices is possible only if the number of columns of the first equals the number of rows of the second. Ideally, brush up on your fundamentals in mathematics; if you are interested in how the algorithms for the various models are derived, you should familiarize yourself with linear algebra and calculus.
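The two matrix facts above can be checked with a few lines of Python. This is a plain-Python sketch written with explicit loops only to make the rule visible; `matmul` is a hypothetical helper, and in practice a library such as NumPy does the same thing in a single vectorized call (`A @ B`).

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # Shape rule: columns of the first must equal rows of the second
    if cols_a != rows_b:
        raise ValueError(f"cannot multiply {rows_a}x{cols_a} by {rows_b}x{cols_b}")
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

ab = matmul(A, B)
ba = matmul(B, A)
print(ab)  # [[2, 1], [4, 3]]
print(ba)  # [[3, 4], [1, 2]]

# A 1x3 matrix cannot be multiplied by a 2x2 matrix
try:
    matmul([[1, 2, 3]], A)
    shape_error = None
except ValueError as err:
    shape_error = str(err)
print(shape_error)
```

Here multiplying by B on the right swaps the columns of A, while multiplying by B on the left swaps its rows, which is why the two products differ.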
After Andrew Ng's course, I explored a few courses on Udemy for hands-on experience with ML algorithms and picked the following one: Python for Data Science and Machine Learning Bootcamp (it is cheaper :-)). I have completed 50% of the course, and its focus is exclusively on using existing Python libraries for solving ML problems. However, the course does not provide any theoretical foundation for the various ML algorithms. I am loving every bit of my journey so far, and I hope to start solving some basic problems on Kaggle soon. For PMs, however, I would suggest starting and ending with Andrew Ng's course, unless you intend to dive deeper.
MIT also offers a course on Machine Learning. If you are looking for a dataset, Kaggle offers many datasets to play around with ML algorithms. Another good resource is the MNIST dataset of handwritten digits. In addition, IBM Watson offers certain tools for working on data science problems. However, I am yet to explore them.
Please share your thoughts and opinions on learning ML. I wish you good luck in your journey of learning ML.