In early April we had a visit from two data scientists at Engage3, who came to speak in our Alumni Seminar Series. Toward the end we had some particularly interesting discussion about the value of a physicist as a data scientist. I include that part of the conversation here, addressing the question: you need a lot of machine learning, probability and statistics, subjects never taught in physics departments, so why do you hire physicists?
Anup Doshi, Director of Data Science at Engage3
The colloquium title was "The Physics of Shopping and Algorithmic Trading in Consumer Marketplaces." The visit by Ouimet led to our meeting with the Engage3 data scientists. They were James Holliday (PhD in physics from UC Davis in 2007) and Anup Doshi (PhD in EE).
We pick up the discussion toward the very end of Holliday's presentation.
Holliday: We’re trying to hire people. I tend to look for physicists or people
who have gone through a physics education. And the reason I do that is I
believe that physicists have a way of solving problems and approaching problems
that’s unique. I love the way that we’re taught to take a
problem, a complex problem that we’ve never seen before, and break it down into
fundamental blocks: things that we have seen before or things that we
understand very well. And it can be a really complicated thing and maybe we
have to make some approximations, but the ability to look at something that we
have not seen before and come up with a way to solve it – it’s just wonderful
and I think it’s unique to physics.
LK: I’m curious for an engineer’s perspective. We can tell ourselves stuff like this all the time but I’m a physicist. You [Doshi] have physicists working for you, and
you’re in the market for hiring talent. So I’m also really interested in your
perspective on what’s valuable about a physicist.
Doshi: Sure. Just to
preface that question: my background – I did a PhD in Electrical Engineering
and since then I’ve been working in this field of data science for a
number of years now. I’ve had bosses that are physicists, colleagues that are
physicists, and folks that are working for me as physicists, and I always enjoyed working with all of them. I think the key qualities that physicists
bring to problem solving are the ability to approach a problem from first
principles, mathematically model a problem from first principles, and then
follow in some sense a scientific method to get all the way through the
problem. That's formulating the problem, doing background research, modeling,
generating a hypothesis, doing experiments, doing tests, skills from high
energy physics like Monte Carlo simulations, for example, solving great, tough
optimizations, going to the whiteboard and actually writing out the
optimization problem; working out better ways to solve this. And then
beyond that just getting the results and interpreting the results and then
communicating those back. Those skills are unique to I think the
mathematically-oriented, scientific person, like physicists. You don’t get that
necessarily in any other discipline, that I've seen.
Holliday: Exactly. I like to look for the
physicists when I'm hiring data scientists. One thing that I
do when I’m interviewing people is I’ll throw a problem – I’ll throw it very
quickly – I’ll throw a very difficult problem that I don’t expect people to
necessarily be able to solve; I don’t give them all the information they need
to solve that because I want to see if they can ask the right questions to
understand the problem to make progress on it. And I want to see how they think
about it as they’re pushing forward; to see if they can
work in those situations. The ones we wind up hiring tend to be ones
from the mathematical, the scientific-oriented fields, that can think through
the problem. So I would encourage everyone as you’re pursuing science or
whatever, make a habit out of asking for clarifying information if you don’t understand something. That is the real
world: sometimes you’re not given all the information you need; you need to get
that information to make progress.
Questioner: Why don’t you guys
hire from mathematicians rather than physicists?
Holliday: We have talked to a few
statisticians, and we hired somebody recently with a statistics background; a
statistician.
Questioner: I think your
thinking is biased because for data analysis you need a lot of statistics and
machine learning and probability, which are the courses that are never taught
in physics departments. You have to spend a lot of time investing in some
people to teach them those courses.
Somebody else: I think if
you’re a physicist you’re assumed to know probability and statistics; that is
the basic –
Original questioner: A little,
but not as much as mathematicians need statistics. I am working on complexity, but in all interviews I say that I have a
statistics background with probability and machine learning.
Holliday: Yeah, that’s very fair, and I appreciate
the question. I suppose it is coming off like I’m saying I’m putting up a filter: only physicists apply. One nice thing, what I get when I assume that a
physicist or data scientist is coming in, like we said, there’s the assumption
that they have some of that mathematical foundation. If someone were to come
with just a mathematical degree, I would be happy to interview. I would
obviously be impressed with the math; there’s probably a lot that could be said
about the problem solving, and we’d just have to see.
Doshi: So if I could follow up
on that: we see a lot of candidates come across our desk who have X, Y, Z
background, and then they’ve got a Master’s in Data Science. And there’s lots
of programs now. Data science itself is a big, growing field, and a lot of
universities are offering the “Master’s in Data Science.” And they’ll teach you
skills like basic statistics, basic machine learning, computational skills –
learn python, whatever you need to learn – they’ll teach you that for a year or
two, then pump you out with a degree in Data Science. You see a lot of those
candidates coming across our desk. They’ll come across, and we’ll pose them one
of the simplest problems, a Bayesian problem,
and they won’t know how to approach it properly because it doesn’t fit into the
things that they’ve learned.
Math questioner: I didn’t mean those.
Because those are programs that you pay for them; you don’t get admitted to
university for data science; it’s like an MBA.
Doshi: Maybe, but there’s also
courses – you can go out and learn by yourself whatever machine learning you
want to learn, and so the key missing qualities there are that inquisitiveness
and the ability to approach a problem from a first principles kind of concept.
Even if you don’t know how to solve the problem the way it’s supposed to be
solved, can you think about a solid approach, and can you formulate it in a way
that, given your background, will get you to a reasonable answer – a
reasonable hypothesis even – a reasonable answer relatively quickly? And then
can you follow through that logic? That kind of inquisitiveness and the
ability to approach a problem correctly is much more valuable than actually
having those skills because then if we grab the
people that have that ability, then we can go out and say hey, here, read this
book and come back.
Other questioner: What are some of the
things that you thought you needed to learn as you entered industry, that have
helped you succeed in industry?