Open Question

How Can Universities Prepare Tomorrow’s Leaders to Use Data for Social Good?

A rapidly growing Stanford major aims to equip students with a combined fluency in data science and social science

Data science is evolving faster than we can even imagine. 

At the intersection of math, statistics, and computer science, data scientists use large data sets to uncover insights, patterns, and trends to make decisions and solve real-world problems. New tools, techniques, and applications are emerging all the time, transforming the way we live and work. As the world sees new breakthroughs in artificial intelligence and machine learning, and as industries respond to them, universities are also feeling a growing demand for big data professionals.

Data science can be a powerful tool for tackling societal problems by providing focused and relevant insights that can be harnessed to drive the creation of more effective, efficient, and equitable interventions, as we see in cases like using data to combat world hunger or working to reduce bias in AI-supported decision-making.

With the advent of large language models, it’s now easier for beginner coders to write and run code, much as a calculator makes both basic and complicated arithmetic easier to perform. As the data science world evolves, the people training new scientists are homing in on what it takes to build the critical thinking skills of the person behind the calculator. Unpacking the scope of societal problems, figuring out where data can add value, and translating solutions are becoming vital skills for data scientists today, according to Mallory Nobles.

“Knowing the basic mechanics of how to code probably will become a less valuable skill going forward. There’s more emphasis on being able to identify what the problem is that you’re trying to solve and how you will use data science as a tool to solve that problem,” Nobles said.

Nobles is the Associate Director of the Data Science & Social Systems (DSSS) program—a new major launched in 2022 by Stanford's School of Humanities and Sciences and incubated in Stanford Impact Labs (SIL). Since its inception, DSSS has brought more than 200 undergraduate students into the fold, training students on how to incorporate data science and social science approaches in their thinking. This dual approach creates avenues toward contextualizing data, understanding the underlying motivations and biases that shape it, and making data science relevant to real-world problems.

Through DSSS coursework and projects, undergraduate students build a triple fluency: expertise in statistical and computational methods, domain knowledge across the core social sciences, and a deep and interdisciplinary understanding of an important social problem. For Nobles, social science frameworks are essential for data science because they provide a lens through which to understand and interpret data. 

“If you’re trying to solve a really hard social problem, like climate change mitigation, equitable health care, or political polarization, you’re not going to be able to do it alone. It’s going to require a lot of different folks coming together and understanding how data science can fit into that larger context,” Nobles said.


A Growing Demand

Outside of Stanford, researchers trained as both data scientists and social scientists are filling important niches in the policy world, underscoring the need for more programs like DSSS. In Nobles’ Data Science @ Work course, Stanford’s DSSS students hear from data scientists working in the industry, government, and nonprofit sectors. During the class, former White House Chief Data Scientist DJ Patil advised students to explore the intersections between data science and social change.

When a wave of COVID-19 infections hit California, Patil began working with the state to turn data scientists into new kinds of first responders. By collecting hospital and community data, developing surveys, and modeling the potential impact mass infections could have, his team helped inform real-time responses and scenario planning for policymakers.

“One of the most exciting things we’re starting to see in society is how interdisciplinary everything is. Medicine needs data, astrophysics, and the collection of data from new telescopes, the way we interact and understand diseases—all of these things,” said Patil.

Using data-driven approaches in tandem with public health knowledge, his team worked to scale up the model to other states. Patil's career in data science has taught him that the challenges facing the country, whether in health care, environmental sustainability, or national safety, pose persistent threats to society and are impossible to solve from a single perspective.

This interdisciplinary tradition dates back to the beginning of the field. One of history’s earliest data scientists, John Snow, came to be regarded as the father of modern epidemiology after mapping London’s 1854 cholera outbreak and tracing it to a public water pump in the area. Patil calls challenges like these massive multidisciplinary problems, ones that need data scientists from multiple disciplines and backgrounds to fully understand them, let alone solve them.

“How do you build things that are going to power the world forward and in an interdisciplinary way and responsible way? How are you going to share what you’re building for everyone? That’s data science,” said Patil.

Data has great power to inform policy, but it can't speak for itself, especially when it comes to social challenges. Analyzing data well requires enough context to understand why a problem exists in the first place. By taking classes in concentrations ranging from education to global poverty, language, or the environment, DSSS students develop a focus on major societal challenges and the social systems that we need to understand to make progress. By developing greater fluency in the social sciences, students are learning how to apply data science skills to complex social problems and bring interdisciplinary perspectives to their own work.

“When you ask and answer causal questions about a social problem, you’re deepening your understanding of the underlying causes, which can give you clues about how you might go about solving it,” notes SIL Faculty Director Jeremy Weinstein.


A Growing Responsibility

From the rise of advocates for algorithmic justice to the introduction of an AI Bill of Rights, the field and the wider public have begun to recognize the need for responsible data science practices. The world needs solutions-oriented data science to overcome pressing challenges. But what's become clear is that ethical frameworks for how to navigate the data science world are now more important than ever. Stanford's next generation of data scientists is arming itself with this knowledge by delving into these complex challenges and unearthing new insights in their studies.

“There’s a lot of work being done right now on the question: How can we make algorithms more transparent and auditable and ensure that they are applied appropriately? That is a massive need and this program aims to fill that need by training folks in both data science approaches, and how to think about problems more deeply, consider the context for them, and develop skills around what it means to be responsible with data science or machine learning,” Nobles said.

As a machine learning scientist, Ehi Nosakhare leads a team at Microsoft using AI solutions to solve business challenges across the company. To Stanford’s DSSS students, she said it was an exciting time to be a data scientist, especially when the field feels like it is constantly changing. To make a positive impact on social issues, she advised students to keep up with the way the field is evolving and critically examine their own perspectives as data scientists.

“You should think very critically about what you're doing … It's a privilege to be able to use data to create technologies and systems. Do it in a way that is inclusive, and do it in a way that you are taking in the perspectives of others around you and from different fields,” Nosakhare said.

Embarking on Stanford’s Data Science major has helped undergraduate student Esha Thapa find her middle ground between technology, the social sciences, and the humanities. After completing the major’s gateway course, Solving Social Problems with Data, she developed the skills needed to frame problems, choose appropriate research designs, and interpret data while understanding its social and political environment. As she pursues a future career as a data scientist, her lessons from the major are guiding her forward.

"Human rights and ethical behavior are embodied in my career as a main priority rather than an afterthought," Thapa said. "In a world ripe with technological growth, it is more important than ever that we confront the consequences of it with both criticism and empathy."