What big data reveals about modern-day housing segregation

In the Neighborhood Branding Project, Ariela Schachter combs through Craigslist ads to uncover how the online rental market reflects and intensifies inequality along racial, ethnic, and socioeconomic lines.

When you’re looking for an apartment or house to rent, how do you find out what options are available? If you’re like a growing number of Americans, you turn to the internet, browsing sites like Craigslist, Zillow, or Facebook Marketplace. On the surface, these platforms may seem to expand equitable access to information, reducing the cost of your housing search while broadening the menu of neighborhoods and homes that you can explore. But when you take a closer look at ads for apartments in St. Louis, you will notice that the listings differ sharply from one neighborhood to the next.

Ariela Schachter uses big data to understand how language in rental ad listings may perpetuate housing inequality.

An ad for an apartment in a predominantly white neighborhood will often emphasize the “wonderful” amenities and “hustle and bustle” of the area. An ad for an apartment in a predominantly Black neighborhood will more likely emphasize requirements that prospective tenants must fulfill, like “proof of income” and “no evictions” – even if both neighborhoods have relatively high poverty levels. According to Ariela Schachter, an assistant professor of sociology, this information disparity is no coincidence. Rather, it demonstrates the persistent influence of structural racism and inequality in the modern rental market.

“If people are being exposed to fundamentally different kinds and amounts of information as they’re looking for housing, we hypothesize that that could then shape where people end up living,” she said. “If ads in predominantly non-white neighborhoods are not emphasizing the attractive qualities of housing, even though that housing stock might have lots of attractive qualities, that could be something that’s preventing desegregation.”

Schachter is one of the lead researchers of the Neighborhood Branding Project, an interdisciplinary collaboration using big data to examine the online rental market in cities across the United States. Harnessing tools developed by computer scientists, statisticians, linguists, and social scientists, she and her colleagues are probing the ways that seemingly neutral open-access platforms like Craigslist perpetuate housing segregation.

In one recent study, the team analyzed 1.7 million Craigslist ads across the 50 largest metropolitan areas in the country, uncovering widespread information discrepancies at the intersection of race and poverty. Compared to listings in nonpoor neighborhoods, those in poor neighborhoods tend to contain fewer words and thus provide less information. At the same time, listings in predominantly Black or Latino neighborhoods tend to provide less information than those in predominantly white neighborhoods of the same income level.

Online rental listings provide large volumes of data for researchers to analyze.

The type of information provided depends on the neighborhood demographics, too: As in the example from St. Louis, ads in predominantly Black or Latino neighborhoods often focus on renter requirements and qualifications at the expense of describing the positive characteristics of the housing unit itself. And in neighborhoods undergoing gentrification, listings for higher-rent units tend to include more information about neighborhood amenities.

Schachter noted that her team does not assume malicious intent on the part of individual property owners. “Most landlords are posting in a way that they think will help them find someone quickly,” she said. “I think this is much more about these broader market forces. In general, unregulated private markets are going to reflect the segregation and inequality they are operating within.”

The tools of big data

In decades past, renters found housing mainly through classified ads in newspapers – a format ill-suited to large sociological investigations. But now, the proliferation of online listings has enabled researchers like Schachter to access and analyze massive volumes of data with the help of computers. Meanwhile, the rental market has grown to include nearly half of central city residents and 36% of all Americans, making the Neighborhood Branding Project even more timely.

Structural topic modeling, a big data technique for analyzing text, allows Schachter to identify common themes across thousands or millions of rental ads. “The computer doesn’t actually understand the content of the words,” she said. “It’s using an algorithm to see what words tend to co-occur with one another and what kind of patterns it can identify. And then we as the researchers are the ones who actually find the meaning there.”
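To make the idea of words that "tend to co-occur" concrete, here is a minimal Python sketch. It is not a structural topic model and is not the project's code; it only counts how many ads contain each pair of words, which is the kind of raw signal a topic model builds on. The sample ads and the function names are made up for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the ad and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cooccurrence_counts(ads):
    """Count how many ads contain each unordered pair of distinct words."""
    counts = Counter()
    for ad in ads:
        # Use the set of words so a pair is counted at most once per ad,
        # and sort so each pair always appears in the same order.
        words = sorted(set(tokenize(ad)))
        counts.update(combinations(words, 2))
    return counts


# Made-up ads, for illustration only.
ads = [
    "Wonderful rooftop deck near campus",
    "Proof of income required, no evictions",
    "No evictions, proof of income and deposit required",
]

pairs = cooccurrence_counts(ads)
for pair, n in pairs.most_common(5):
    print(pair, n)
```

The computer only sees that some word pairs keep appearing together. Deciding what a cluster of such words means is still the researchers' job.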


Schachter also examines one word or phrase at a time. For example, she has found that in many cities, the mention of “evictions” is correlated with a high percentage of Black residents, the mention of “campus” is correlated with a high poverty rate, and the mention of “rooftop” is correlated with a high percentage of college-educated residents. But as she is quick to point out, “all these techniques are only as good as the data that you’re using to begin with.” The team constantly works to refine its methods for filtering out spam posts and duplicate ads that could skew the results.
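The single-word analysis and the duplicate filtering can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the team's pipeline: the listings and the percentages are invented, the duplicate check only catches ads whose text is identical after normalization, and a plain Pearson correlation stands in for whatever statistical model the researchers actually use.

```python
import math
import re


def normalize(text):
    """Reduce an ad to lowercase words so trivial variants compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def dedupe(listings):
    """Keep the first copy of each ad, comparing normalized text."""
    seen = set()
    unique = []
    for text, value in listings:
        key = normalize(text)
        if key not in seen:
            seen.add(key)
            unique.append((text, value))
    return unique


def mentions(word, text):
    """Return 1.0 if the ad contains the word as a whole token, else 0.0."""
    return 1.0 if word in normalize(text).split() else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Invented data: (ad text, share of college-educated residents nearby).
listings = [
    ("Rooftop deck and gym", 0.62),
    ("rooftop deck and GYM!", 0.62),  # duplicate of the first ad
    ("Cozy unit, no evictions", 0.18),
    ("Rooftop views downtown", 0.55),
    ("Proof of income required", 0.22),
]

unique = dedupe(listings)
flags = [mentions("rooftop", text) for text, _ in unique]
shares = [share for _, share in unique]
r = pearson(flags, shares)
print(f"{len(unique)} unique ads, correlation = {r:.2f}")
```

Leaving the duplicate in would count the same ad twice and shift the result, which is why filtering matters before any correlation is computed.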

While Schachter’s research has revealed broad patterns in rental markets across the U.S., it has also highlighted the importance of studying each city’s unique circumstances. Lydia Ho, AB ’21, is working with Schachter and John Kuk of the University of Oklahoma (until recently a postdoc at WashU) to explore which St. Louis neighborhoods are mentioned most often in Craigslist ads – and whether the claimed neighborhoods match the actual addresses of the listings. Ho began this work as a senior and continues to be involved with the project.

“Living in St. Louis, we’ve heard about the Delmar Divide, and there’s a lot of segregation within the city itself,” said Ho. “Do these neighborhood claims map to racial differences, or education, or poverty level? Are there certain trends?”

Ho has leveraged her programming skills to analyze about 30,000 Craigslist listings, using geographic information system (GIS) tools to figure out, for instance, whether a house listed in the Central West End actually sits outside the neighborhood boundaries. The team found that ads for rental housing in North City neighborhoods are less likely to mention the area by name, suggesting that some neighborhoods come with a stigma that landlords are trying to avoid. The results, while still preliminary, suggest that neighborhood claims do vary with demographics.
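The boundary check at the heart of this analysis is a point-in-polygon test. The sketch below shows one common way to do it, the ray-casting method, in plain Python. It is an assumption about the general technique rather than a description of Ho's actual GIS workflow. The neighborhood name and rectangle are hypothetical placeholders, not real St. Louis boundaries.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: count how many polygon edges a ray going east crosses.

    `polygon` is a list of (lon, lat) vertices in order. An odd number of
    crossings means the point is inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude where the edge meets the point's latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def claim_matches(claimed, lon, lat, boundaries):
    """True if the listing's coordinates fall inside the claimed area."""
    return point_in_polygon(lon, lat, boundaries[claimed])


# Hypothetical rectangle standing in for a real neighborhood boundary.
boundaries = {
    "example_neighborhood": [
        (-90.27, 38.63),
        (-90.24, 38.63),
        (-90.24, 38.65),
        (-90.27, 38.65),
    ],
}

print(claim_matches("example_neighborhood", -90.25, 38.64, boundaries))
print(claim_matches("example_neighborhood", -90.20, 38.64, boundaries))
```

In practice, a GIS library would load official boundary files and handle edge cases such as points that lie exactly on a border.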

From Craigslist to conversations

In the next phase of the project, Schachter and Max Besbris, an assistant professor of sociology at the University of Wisconsin–Madison, plan to talk with renters directly. With funding from the National Science Foundation, their team is conducting surveys in the San Francisco Bay Area, Los Angeles, Houston, Chicago, and New York City. The respondents will be randomized into groups that will see ads with different types and amounts of information, allowing the researchers to disentangle the content of the ads from the characteristics of the neighborhoods.
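Random assignment is what lets the researchers separate the effect of an ad's content from the effect of the neighborhood. A minimal sketch of balanced random assignment is shown below. The condition names, the respondent IDs, and the fixed seed are all hypothetical; the article does not describe the actual survey design at this level of detail.

```python
import random
from collections import Counter


def assign_conditions(respondent_ids, conditions, seed=0):
    """Shuffle respondents, then deal them into conditions round-robin.

    Group sizes differ by at most one, and the seed makes the
    assignment reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    return {
        rid: conditions[i % len(conditions)]
        for i, rid in enumerate(shuffled)
    }


# Hypothetical ad versions varying the type and amount of information.
conditions = [
    "amenities_long",
    "amenities_short",
    "requirements_long",
    "requirements_short",
]
respondents = [f"R{n:03d}" for n in range(1, 9)]

assignment = assign_conditions(respondents, conditions)
group_sizes = Counter(assignment.values())
print(group_sizes)
```

Because chance alone decides who sees which version, systematic differences in responses between groups can be attributed to the ads rather than to the people reading them.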

“Understanding how different people view the trade-offs between the housing amenities and underlying neighborhood demographics will speak to those questions about how gentrification is happening and why,” Schachter said. The experiment, she hopes, will cast light on individual decision-making in a way that the Craigslist data alone cannot.

“Something I love about sociology is having these findings that are coming from this big data project to be constantly in conversation with research that’s using ethnography, participant observation, interviews, other methods that are going to capture nuances that our research approach is not,” she said. “When we can look at all those findings together, that’s where we have the richest understanding of our social world.”

In recognition of the value of such interdisciplinary work, WashU launched its Center for the Study of Race, Ethnicity & Equity in August 2020. This spring, the center is supporting the Neighborhood Branding Project, along with the work of four other WashU scholars, through its faculty fellowship program. As a fellow, Schachter has the opportunity to focus exclusively on the project for a full semester, as well as participate in and lead collaborative workshops and seminars.

For those searching for a place to rent, Schachter recommends looking beyond the few neighborhoods that you and your friends are familiar with. While no individual’s actions will solve a structural problem, she said, being aware of how race is entrenched in our cities is an important first step. On the policy side, she suggests that city governments could maintain centralized housing websites or require landlords to provide more information when listing online. Building online platforms with equity in mind could help mitigate the problems that her team has identified. Still, housing segregation is a complex issue with no panacea.

“I do think there is still this kind of idealistic view that if we just get rid of discriminatory real estate agents and landlords, we can get rid of segregation,” Schachter said. “I think it’s much more complicated than that, and our research is just one way of showing that segregation continues to be reflected in how we think about and talk about places and housing.”


Join the conversation!

Learn more about the Neighborhood Branding Project at a virtual Q&A with Ariela Schachter on Nov. 10, 2022, at 4 p.m. CST.