GS 4 Day 3 (9-April-2021)
4.1 ML foundations by Ravi kumar 9 AM
Ravi talks about a generic framework for online learning. A simple problem to model in this framework can be taken as a ski-rental problem where renting cost is 1$, but ski cost is 25$. The algorithm tries to give theoretical bounds on the maximum cost involved before making a correct decision. Ravi also emphasizes that online learning is pessimistic opposite to machine learning which tries to learn average behavior.
The key idea is to incorporate ‘hints’ at each time step before making a decision. The research is trying to achieve close to offline level performance when hints are good and ensure to perform as good as using no hints in the worst case when all the hints are bad.
Ravi starts with a basic formulation of the problem and builds upon a story on how research has evolved at each stage to solve previous solutions’ issues. Worst case cost bounds are derived for each case by various researchers. In the end, the last case is covered where multiple hints are incorporated at a single time step.
A minimal set of experiments were performed on the ResNet classifier, where a hint is previous gradient. I feel that we may try to model our online prediction models this way.
4.1.1 Additional resources
4.2 ML foundations: Stochastic Gradient Descent (Theory v/s practice) by Prateek Jain, 10:30 AM
First, Prateek shows how different the SGD is in practice than in theory in terms of convexity, non-i.i.d selection of samples, batch size, step size, etc. Then the discussion moves on to how fast gradient descent can find a p-order stationary point. It turns out that for \(p\ge4\), finding stationary points is an NP-hard problem. Somehow, gradient descent can still find local optima, but the rates are unknown.
Gradient descent can find the first-order stationary points. Computing hessian is expensive, and so is finding second-order stationary points. But, there is a simple algorithm to find second-order stationary points with gradients alone. It is noisy GD where you add an isotropic noise to the solution and re-do GD. Practically, it is believed that SGD is equivalent to this, and it finds second-order stationary points with higher probability. It is an open problem to show this concretely.
Prateek concludes the talk by mentioning that if one analyses SGD, it might be possible to overcome the current problems and bring the convergence rate further down.
4.3 Panel discussion: AI in India, 1 PM YouTube link
The discussion was led by Partha Talukdar from Google. The first question was about the journey of Dr. Geetha Manjunath from Niramai who has developed a successful AI solution for early detection of breast cancer.
Geetha explains that X-ray-based cancer detection is not affordable to everyone, and the machines are not affordable by most hospitals. They were able to successfully leverage thermography to solve this problem and make the test affordable as low as 2$ in rural India. She also mentions that they have regulatory clearances by Indian and European standards.
Dr. Balaraman Ravindran admires the end-to-end work done by Niramai, reminding that in India, most verticals do not have the base technology ready to put an AI on top of that. He briefly talks about the need for fairness in AI as per the Indian context.
Pankaj Gupta, Sr. Director of Engineering at Google Pay, throws some insights on the need of connecting the majority of India digitally using AI.
4.3.1 Q&A
Q: What might be good sources to look at updates/problems tackled with AI in India by various industries?
A: Partha says that a symposium like the current can be insightful for this. Geetha suggests looking at FDA-approved startups, and a website called IndiaAI. Ravi suggests an excellent website called AI4Bharat that is maintained specifically for this purpose. Pankaj strongly recommends Twitter, and according to him, many latest cutting-edge things are first shared on Twitter nowadays.
Q: What AI work is happening in India to mitigate Air pollution? What are the potential directions with AI in this field?
A: Ravi describes a bit on mobile sensors used in Tamilnadu to monitor air pollution. According to him, due to heterogeneity in AQI in Delhi, even regression modeling or time-series predictions also help understand the pollution. Pankaj suggests the cerca center in IIT Delhi and the work of Prof. Rijurekha.
Q: How to take AI solution to the ground level where one has to deal with bureaucracy?
A: Geetha suggests approaching negative feedback with a positive mindset and improve what one lacks in the implementation.
Q: How PhD students can work with industries to solve problems with AI?
A: Ravi mentions that there are limited people who can work with Google, Microsoft, so try to see if you can be competitive by creating a startup. Geetha resonates a bit by suggesting working with AI startups to solve challenging problems.
4.3.2 Final thoughts
Geetha encourages all to work with a problem that one can relate to and go end-to-end without stopping at the algorithmic level only. Ravi suggests being aware of the area’s ecosystem where one might be trying to solve a problem. According to him, sometimes AI is consuming a tiny part of the entire problem, so be aware of it and understand it to be successful. Pankaj suggests not to get bogged down by the difficulties and be around the right people.