Thank you again to our panelists for their insights and for opening up the conversation on building effective data science teams.
Our panelists for this webinar were:
Elaine: I think one of the answers, which I wish were not the answer, is that a lot depends on finding the right home in the organization. I don’t think there’s one clear answer to what that is. It depends a lot on your company and stakeholders.
In terms of scaling up, even if you have a lot of credibility, have produced a lot of great work and people are excited about the data science team - if you’re not in a place in the organization that fits in terms of the business and how the company is organized, it’s hard to grow the team.
There can be a lot of uncertainty about what it means if we have more data scientists. An executive who’s not a data science person may not quite understand what we get from that. This is not an easy problem to solve because it can be a lot easier to add people to more classic business things that they understand. For leaders, this is a really important thing to think about. Where in your organization can you find the best long term fit? Where do your highest level leaders understand the value of what you do?
Nasir: At one of my previous employers, I was hired as the first data scientist - actually, the first person to explore whether AI/machine learning would be a value-add. They had a huge amount of data available within the organization. I took the challenge of generating confidence among the stakeholders with limited resources.
I defined some low-hanging-fruit problems and solved them by providing access to self-service tools. In that case, Shiny applications were tremendously helpful to me. I took the sample data and generated outcomes the way they wanted, interacted with them, and put them into the driver’s seat. They were so happy. From this, I was able to get buy-in from most of the stakeholders so that I could grow the team. I then built their engineering team, data science team, and system administration team. It was all about generating confidence among the stakeholders and creating value for the business.
Greg: I think that makes complete sense. To add a point to that, it seems like you have to shift from being a data scientist and put your advertising hat on. You need to advertise what you’re doing and show the value. Then, shift to be an economist and say, “here is the return on investment of adding more data scientists and this is what you can get.” You need to have this broader perspective, rather than just wanting to build models. You need to be an advertiser and I think your example, Nasir, was great. You did that.
Greg: There are a couple of things. One is what their academic background is, depending on what I’m after. I don’t want to just hire people like me, as economists. I need a broader skill set.
The other is their skills and the things that they’ve worked on. You can get a sense of how technical they are in certain areas just based on that. When it comes to a phone interview or talking to them, I shift. I talk about some technical things, but I shift into behavioral-based interviewing. I want to know how they performed and what they did in specific situations, the idea being that if they performed a certain way in a situation at one company, that could carry over. So, I have the dual frame of technical and behavioral-based interviewing.
Kobi: I like what Greg just said. These days, I try to divine what they actually did and what was created, beyond the jargon on the resume. When I’m looking at resumes, I look for that same narrative and description that illustrates an understanding of what was done. I don’t fault candidates for this, but part of this is the nature of the way people are finding jobs and hiring managers are finding resumes.
There’s a tendency to throw a lot of jargon at the NLP behind resume screening. You’ll see resumes with a list of acronyms for the software they use and a list of types. I’ve seen people break out clustering analysis and then list types of clustering analysis, then machine learning, and then list types of machine learning. I’ve found that, more often than not, the candidates behind those resumes didn’t have the level of sophisticated understanding that I’m looking for, which is problem solving. Technology will change and techniques will change. I look for people who can think and are willing to. In this field these days, it’s difficult to parse that out, but I think what Greg said about having a scenario and that sort of discussion is helpful.
Jacqueline: I would like to talk about this because I used to take a very neutral stand. Years later, I think the right approach is to be distributed because of exactly the thing we’ve been talking about for the last hour, that communication is key. When you have a distributed data science team, communication with the stakeholder is most important, not the communication between data scientists. Maybe some of the teams are using R and some are using Python, whatever. That’s still better than all using the same language but not talking to the people who use your work.
The other point is that the question may not be “what is the right approach?” The question here is “how do I convince people above me to switch to a better approach?” I don’t have a good answer for that except to say this is the job of a director of data science or someone in leadership. They should be thinking about that. They have the authority and the title by their name to make those changes. If you do not have that, depending on your organization, it can be very difficult to get people at levels above you to listen to you as a senior data scientist. In this case, the best thing you can do if you want to try to make those changes from a lower level is to get the person one level up to buy in and have them move it up and up, rather than going straight to the CEO to say, “Listen to my org distribution.” It’s about thinking it through: how do you work through an organization of people?
Elaine: To hit more on the theme of communication, I would say ask for opportunities to communicate the work that your team is doing to more and more people at higher and higher levels and get really good at that.
Jacqueline: A lot of leadership is organization and communication, as said, so I find the thing that has helped me in my career is running a project on my own. Find places where you can do the whole thing instead of asking your manager, “Well, now what do I do?” Find places where you can be the person calling the shots, and eventually, you find yourself calling shots with other people and then you get titles and stuff like that.
Nasir: I would recommend developing your business acumen. I’m sure you are a data scientist, so you’re technically very sound. You know your technical details. You need storytelling experience and the ability to speak the language of the business rather than technical language. So when you are explaining your heuristic curves, instead of presenting the curves themselves, how can you explain them in the language of the business?
Also as Jacqueline suggested, be the leader of a project. Take your project, take the ownership of that project, and see whether you can execute end to end.
Greg: As Nasir said, putting on your business hat rather than your data science hat and expanding your horizons. Also, talk with your manager and have an individual development plan. A lot of companies have training for new managers or people who want to become managers. If your manager doesn’t even know that you want to grow and become a manager, they may not push you in that direction. So definitely communicate that “hey, I want to do this. How can we make this happen?”
Kobi: Apply for leadership jobs.
Thank you once more to our panelists for opening up this important discussion on how we can build effective data science teams. Our panel webinar focused on three main themes that we think contribute to effective data science teams:
Our last blog post, published on June 3rd, shared the panel Q&A that addressed the three themes above.
While we, unfortunately, were unable to address every attendee question during the webinar, we would really love to keep this conversation going. There were so many great follow-up questions that touched on team design, project scoping, tool selection, and selling data science internally that we will dive deeper into through:
If you have other ideas or questions you’d like to share, you can use the RStudio Community link for each individual webinar question to share your thoughts on a specific topic as well.