GS 2 Day 1 (7-April-2021)
2.1 Opening Keynote by Jay Yagnik, 9:00 AM (YouTube)
Jay divided his presentation into three topics.
2.1.1 Patterns of progress in AI
End-to-end ML techniques now yield the best-performing models in almost all domains (e.g., health care, speech recognition, vision-based robotics, self-driving vehicles, games, agriculture, weather forecasting).
- One possible explanation is that end-to-end learning exploits mathematical shortcuts underlying the phenomena.
Reusable ML building blocks (e.g., TensorFlow, PyTorch) let researchers build on existing work and connect quickly with other pipelines.
Recurrence is closely related to the "deep" part of DL, yet it does not work well in practice. This is something of a paradox.
Very little has changed in the fundamental blocks of DL. Researchers should examine this, as there is much room for improvement at the fundamental level.
Another example of reusability is transfer learning with BERT models.
Model capacity and compute have grown much faster than Moore’s law.
- TPUs achieve their speed in part through lower-precision computation.
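To make the lower-precision point concrete, here is a minimal sketch that rounds a 32-bit float to bfloat16, a 16-bit format supported on TPUs. The talk summary only says "lower precision", so the choice of bfloat16 and the helper name `to_bfloat16` are assumptions made for illustration.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value, returned as a Python float.

    bfloat16 keeps the sign bit, the 8 exponent bits and the top 7 mantissa
    bits of a float32, so it is obtained by dropping the low 16 bits.
    (Hypothetical helper for illustration; not a TPU API.)
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 bits about to be discarded.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(to_bfloat16(3.14159265))  # 3.140625
```

Only about three significant decimal digits survive, which is often enough for neural-network arithmetic while halving memory traffic relative to float32.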
Potential technology inflection point: Quantum Computing
Quantum processors promise to execute certain exponentially complex tasks dramatically faster than classical hardware (the talk described this as linear time).
Noise remains a challenge, but it may be tackled in the future by error-correction schemes that use a large number of qubits.
2.1.2 Future of artificial general intelligence (AGI)
- Incorporating world knowledge into models that are grounded in reality should work well.
2.1.3 Societal impact with AI
One example is a model that reviews a patient's history and helps doctors focus on the important past events.
Jay also shared the areas Google India is working on, both at the Google level and at the societal level.
2.1.4 Q&A
Q: How would one balance between breadth and depth as a PhD student?
A: Jay said he started out in machine learning and vision 15 years ago. Since then, he has forced himself to learn a new field, at the level of a practitioner of that field, every two to three years. He felt that, across all these domains, the number of core problems being solved is really small; they look very different only because of different names and slightly different formulations. He therefore advised first getting grounded in one field at the nuts-and-bolts level, which then forces one to make connections among various areas and is also a good recipe for building a good career.
Q: What are the expectations from a PhD student who wants to join the research industry? Where should one focus in the last years of PhD? How different is research in the industry from academia?
A: Answering the last question first, industry research rests on four pillars (patterns observed in successful research): i) fundamental research; ii) building infrastructure; iii) taking the product to a larger user base; iv) new product innovation. On the other questions, one should be able to summarize what one knows across different fields. Jay recalled his adviser’s advice that one should be able to summarize a paper in 3-5 sentences. He added a corollary: one should not write a paper that one would not be able to summarize in 3-5 sentences.
Q: How to explain end-to-end models? Does Google use something specifically for the explainability of models?
A: Most models are black boxes nowadays, and we also have black-box methods to probe them (pointing to gradient propagation and uncertainty propagation methods). What happens in between (the exact mathematics) is of less concern as long as one can explain the output in terms of particular inputs (e.g., which part of a medical image drove the decision made by an AI model?).
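The kind of input-level probing described in this answer can be sketched as follows. This is a minimal illustration, not a method attributed to Jay or Google: it uses central finite differences as a stand-in for gradient propagation, and `toy_model`, its weights, and `saliency` are hypothetical names invented for the example.

```python
import math
from typing import Callable, List


def toy_model(x: List[float]) -> float:
    """Stand-in black-box model: a logistic unit with fixed, made-up weights."""
    weights = [0.1, 2.0, -0.3]
    z = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(model: Callable[[List[float]], float],
             x: List[float], eps: float = 1e-4) -> List[float]:
    """Estimate |d model / d x_i| for each input by central differences."""
    scores = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        scores.append(abs(model(up) - model(down)) / (2.0 * eps))
    return scores


scores = saliency(toy_model, [1.0, 1.0, 1.0])
# Index of the input whose perturbation changes the output the most.
most_influential = max(range(len(scores)), key=scores.__getitem__)
print(most_influential, scores)
```

The probe only calls the model and never inspects its internals, which matches the point that the mathematics in between matters less than being able to relate outputs to inputs. For an image model, the same idea applied per pixel gives a saliency map.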
Q: As the model parameters are growing into millions, we need to scale hardware (GPUs and TPUs) at the same pace. This might be a bottleneck after some time. What are your views on this?
A: Jay gave two answers to this question. First, a small GPU cluster at the university level is not that expensive, and it is enough to do meaningful experiments that can scale in practice. Second, he would not be surprised if, a few years from now, we discover that we were wasting a lot of compute unnecessarily and that clever mathematical tricks (nearest neighbor search, branch and bound) can do the task efficiently.
Q: Most ML models learn with back-propagation, but the human brain does not seem to learn that way. How can one closely mimic human brain learning for AGI (artificial general intelligence)?
A: Jay does not think it is a requirement for AGI (citing the "airplanes don’t flap their wings" philosophy). He feels that taking inspiration from nature is good, but one has to stay within the limits of what machines can do.
2.2 General ML session by Katherine Heller 10 AM
Katherine spoke in detail about her research projects.
2.2.1 Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier
In this work, multitask Gaussian process models were used to convert irregularly spaced data into equally spaced time-series data. An RNN was then used to predict the probability of sepsis on real-world data. The authors use the Lanczos method to cope with the Gaussian process’s time complexity and to draw approximate samples. They report a significant improvement over previous work.
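The resampling step can be illustrated with a much simpler stand-in: a single-task Gaussian process with an RBF kernel, solved exactly, whose posterior mean is evaluated on a regular grid. This is a sketch under stated assumptions, not the authors' model: it omits the multitask structure, the Lanczos approximation, sample drawing, and the RNN, and the function names, kernel choice, and hyperparameters are all hypothetical.

```python
import numpy as np


def rbf_kernel(a: np.ndarray, b: np.ndarray, length_scale: float = 1.0) -> np.ndarray:
    """Squared-exponential kernel between two 1-D arrays of time points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_resample(t_obs: np.ndarray, y_obs: np.ndarray, t_grid: np.ndarray,
                noise: float = 1e-2, length_scale: float = 1.0) -> np.ndarray:
    """GP posterior mean at t_grid given irregular observations (t_obs, y_obs)."""
    mean = y_obs.mean()  # constant prior mean, estimated from the data
    K = rbf_kernel(t_obs, t_obs, length_scale) + noise * np.eye(len(t_obs))
    K_star = rbf_kernel(t_grid, t_obs, length_scale)
    alpha = np.linalg.solve(K, y_obs - mean)  # exact solve, O(n^3)
    return mean + K_star @ alpha


# Irregularly timed measurements (synthetic values, for illustration only).
t_obs = np.array([0.0, 0.4, 1.1, 2.7, 3.0, 4.6])
y_obs = np.sin(t_obs)
t_grid = np.arange(0.0, 5.0, 1.0)  # regular hourly grid
y_grid = gp_resample(t_obs, y_obs, t_grid)
print(np.round(y_grid, 3))
```

The equally spaced `y_grid` is the kind of input a standard RNN expects. The exact solve is cubic in the number of observations, which is the cost that motivates approximations such as the Lanczos method mentioned above.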
2.2.2 Graph-Coupled HMMs for Modeling the Spread of Infection
The authors efficiently model the spread of infection with graph-coupled HMMs (GCHMMs), leveraging sparsity in social networks. They use mobile phone data collected from 84 people over an extended period to model the spread of infection at the individual level.
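The coupling idea can be sketched with a one-step transition rule in which each person's hidden state (susceptible or infected) depends on their own previous state and on their infected contacts. The susceptible-infected-susceptible form below, the parameter names `alpha` (infection from outside the network), `beta` (infection per infected contact), and `gamma` (recovery), and their values are assumptions for illustration; the talk summary does not give the authors' exact parameterization.

```python
from typing import Dict, List, Set


def infection_probabilities(infected: Set[str],
                            contacts: Dict[str, List[str]],
                            alpha: float = 0.01,
                            beta: float = 0.1,
                            gamma: float = 0.3) -> Dict[str, float]:
    """Probability that each person is infected at the next time step."""
    probs = {}
    for person, neighbours in contacts.items():
        if person in infected:
            # An infected person stays infected unless they recover.
            probs[person] = 1.0 - gamma
        else:
            # A susceptible person escapes infection only if the outside
            # source and every infected contact all fail to infect them.
            n_infected = sum(1 for n in neighbours if n in infected)
            probs[person] = 1.0 - (1.0 - alpha) * (1.0 - beta) ** n_infected
    return probs


# Sparse contact graph: only people who actually met are linked.
contacts = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
probs = infection_probabilities({"a"}, contacts)
print(probs)
```

Because each update only visits a person's actual contacts, the cost scales with the number of edges rather than with all pairs of people, which is where sparsity in the social network helps.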
2.2.3 Hierarchical Graph-Coupled HMMs for Heterogeneous Personalized Health Data
In this work, HGCHMMs were used to detect infections in a small mobile community. The model predicted the probability of infections for each person each day.
2.2.4 Useful resources
2.3 Panel Discussion on Academia and Industry Research Careers, 1:30 PM (YouTube)
The discussion was moderated by Manish Gupta. The panel members were diverse, from well-established researchers to a final-year PhD student. Manish started the discussion with a question, “How has a PhD helped you in your career?”.
Pankaj Jalote brought up three important points that he has applied in everything he has done: i) look up existing work or literature; ii) talk to people who know more than you; iii) think academically even in industrial settings. He added that, in his time, he had few choices other than industry.
Arati Deo shared her journey, including her encounter with neural networks during their first wave. According to her, a PhD trains one to thrive in ambiguity. She prefers industry over academia because she wanted to see the value created by her work. She also said that a PhD helps in industry because one goes through the same problem-solving journey there as in a PhD. Manish added that he also values the confidence one gains by going through this journey.
Abhijnan Chakraborty emphasized the freedom one gets in academia, using the example of neural networks, which lived on in academia without a real push from industry. He also mentioned that one can be in academia and stay engaged with industry, so it is not a choice of one versus the other.
Preksha Nema shared her journey and how her way of approaching a problem changed after doing a PhD.
Manish mentioned an article in The Hindu by Pankaj on “What can India do to promote more quality Ph.D programs?” and asked Pankaj to share more of his views on the topic. Pankaj offered an insight from his experience: a fresh PhD graduate from India is not considered as good as candidates from a foreign university. Thus, Indian PhD students need to get global exposure.
2.3.1 Q&A
Q: How should one decide between academic and industry research careers? How hard is it to move from one to the other?
A: Arati answered the first part by saying that if one is interested in seeing immediate value creation, industry is a good option. But if one is willing to wait and wants to choose from a broader list of topics without worrying about immediate outcomes, academia is better. On the second question, Pankaj said that to switch from industry to academia, one should keep publishing while in industry. To move the other way, one should be engaged in applied research.
Q: What do you look for in a candidate in the selection process?
A: Pankaj said that, for the assistant professor role, they prefer candidates with high potential (how much one knows outside one's own area), even if they have few publications. Arati emphasized problem-solving skills and thinking beyond narrow areas while having enough depth in the area of expertise. Abhijnan gave a candidate’s viewpoint, saying that he would look at whether he is comfortable with the other team members. Preksha prefers a team whose interests overlap with her work.
2.3.2 Closing Question
Manish brought the session to a close by seeking general advice for students from the panelists.
Preksha recommended doing an internship in industry. Abhijnan suggested that if one is still undecided between academia and industry after the internship, one may spend two years in either after the PhD. Arati advised being passionate about one's work wherever one is. Pankaj believes that exposure to both academia and industry helps in either career. Since a PhD already provides academic exposure, one should definitely do an industry internship during it.