*by Dan O’Brien*

This week we at the Boston Area Research Initiative have released our annual update of data sets that capture growth and investment in Boston through the Boston Data Portal. These include building permits and property assessments. As those who are familiar with our work know, we recognize that most people do not have the ability or spare time to crunch hundreds of thousands of administrative records to generate meaningful insights about neighborhoods. Nor, in fact, would that be advisable: good science relies on agreed-upon methodologies that can be compared across time and place. If everyone took their own approach to analyzing these sorts of data, the result would be a variety of disparate lessons that may or may not be consistent with each other or, in simpler terms, a lot of confusion. Thus, we have developed standardized measures of growth and investment from each of these data sets, which we release through the Boston Data Library in the form of documented .csv files and through BostonMap as interactive maps. These are now available for download and exploration.

When we completed this release last year, I wrote a blog post focusing on the two main measures that are based on building permit records–local investment (i.e., alterations to existing structures) and new construction–demonstrating how the two tell different stories about the city’s landscape. Whereas the former tells us where a higher proportion of property owners are investing in existing buildings through renovations both big and small, the latter identifies how many new buildings are being erected in each neighborhood (note that these measures are each based on parcels as the fundamental unit for measuring levels of building in a neighborhood).

Having introduced these measures last year, this time I wanted to leverage the longitudinal nature of the data to illustrate trends across the city since 2010, which is when the data begin. We will conduct this exploration for census tracts, limiting to those with more than 75 parcels (the filter applied in the code below), in order to avoid outliers. (For new constructions, we limit to 2015-present as the permit types changed at that time to enable more reliable identification of new constructions.)

```
ecometrics_2019<-read.csv('Permits.Ecometrics.CT.Longitudinal.v2019.csv')
analyze_ecometrics<-ecometrics_2019[ecometrics_2019$numLandParcels>75,]
```

First, let’s look at how consistent these measures are from year to year. That is to say, are places that see investment of one form or the other in one year likely to see a similar level of investment in the next year? An initial comparison of 2018 and 2017 suggests that local investment is quite stable between the two years.

```
cor(analyze_ecometrics$LOCALINV_PP.2018,analyze_ecometrics$LOCALINV_PP.2017)
```

## [1] 0.896445

This correlation is very strong. What’s even more remarkable, though, is what happens if we compare 2018 to 2010:

```
cor(analyze_ecometrics$LOCALINV_PP.2018,analyze_ecometrics$LOCALINV_PP.2010)
```

## [1] 0.9099741
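
The two pairwise comparisons above can be extended to every pair of years at once. The following is a minimal sketch rather than part of the BARI release: `year_cor_matrix` is a hypothetical helper, the toy values are made up, and it assumes the yearly columns share a common prefix such as `LOCALINV_PP.`.

```r
# Sketch: correlate every pair of yearly columns that share a prefix.
# `year_cor_matrix` is a hypothetical helper, not part of the BARI release.
year_cor_matrix <- function(df, prefix, years) {
  cols <- paste0(prefix, years)
  stopifnot(all(cols %in% names(df)))
  # pairwise.complete.obs keeps tracts that are missing in only some years
  m <- cor(df[, cols], use = "pairwise.complete.obs")
  dimnames(m) <- list(as.character(years), as.character(years))
  m
}

# Toy data (made up) standing in for analyze_ecometrics
toy <- data.frame(
  LOCALINV_PP.2010 = c(0.10, 0.20, 0.30, 0.40),
  LOCALINV_PP.2011 = c(0.12, 0.18, 0.33, 0.41),
  LOCALINV_PP.2012 = c(0.40, 0.30, 0.20, 0.10)
)
toy_cor <- year_cor_matrix(toy, "LOCALINV_PP.", 2010:2012)
round(toy_cor, 2)
```

With the data from this post, the equivalent call would be `year_cor_matrix(analyze_ecometrics, "LOCALINV_PP.", 2010:2018)`, assuming a `LOCALINV_PP` column exists for each of those years.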

Almost the *exact same* places seeing investment in 2010 were those seeing investment in 2018. This is clearly visible in the following maps:

```
require(sf)
```

## Loading required package: sf

## Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3

```
require(tmap)
```

## Loading required package: tmap

```
tracts_geo<-st_read("C:/Users/bariuser4/Documents/Dan/INC0511841/Documents/Research/Boston-Radcliffe/Geographical Infrastructure/Geographical Infrastructure v. 2014_ Final Folder/Tracts/Tracts_Boston_2010_BARI.shp")
tracts_geo<-merge(tracts_geo,analyze_ecometrics,by='CT_ID_10',all.x=TRUE)
```

```
tm_shape(tracts_geo) + tm_polygons(c("LOCALINV_PP.2010","LOCALINV_PP.2018"), breaks=c(0,.1,.2,.3,.4,.5)) + tm_facets(nrow=1,ncol=2)
```

The two maps do seem to be highly similar, with the highest levels of local investment occurring in Downtown and Back Bay, with lesser development occurring in eastern Charlestown, southern Allston, and along the main thoroughfares in Dorchester and Roxbury.

As we turn to new constructions, it is worth recalling from last year’s blog post that new constructions and local investments were not particularly correlated in 2017. This is because they are two different types of activity, but also because new constructions are much rarer. To wit:

```
median(analyze_ecometrics$LOCALINV_count.2018)
```

## [1] 59

```
median(analyze_ecometrics$NEWCON_count.2018)
```

## [1] 6

That is to say, the median census tract has roughly 10 times as many parcels undergoing basic additions and revisions as new constructions. As expected, they again did not correlate very highly in 2018.

```
cor(analyze_ecometrics$LOCALINV_PP.2018,analyze_ecometrics$NEWCON_count.2018)
```

## [1] 0.147127

So, given that new constructions are more episodic than local investments, are they less consistent across time?

```
cor(analyze_ecometrics$NEWCON_count.2017,analyze_ecometrics$NEWCON_count.2018)
```

## [1] 0.6876702

```
cor(analyze_ecometrics$NEWCON_count.2014,analyze_ecometrics$NEWCON_count.2018)
```

## [1] 0.5068852

It turns out that the answer is yes. There is still a strong correlation between where new constructions occurred in 2017 and in 2018, but this relationship appears to diminish as the years stretch out, with new constructions in 2018 correlating more modestly with those in 2014. That said, this could very well be an artifact of a variable whose values are more unpredictable, not necessarily a sign that there has been a substantial shift in where new constructions are occurring across the city. Let’s take a look at the maps.

```
tm_shape(tracts_geo) + tm_polygons(c("NEWCON_count.2015","NEWCON_count.2018"), breaks = c(0,10,20,40,60,80)) + tm_facets(nrow=1,ncol=2)
```

Here we see that a lot of the same places were experiencing new constructions in both 2015 and 2018, particularly the South Boston waterfront, western Charlestown, and northern Brighton, as well as a few other spots.

From these initial examinations, it would seem that levels of investment have been rather stable across neighborhoods over the course of the dataset. We can also conduct a more formal examination of whether certain places are seeing any meaningful changes in these regards. For this, we use a technique that was demonstrated in a recent post by Alex Ciomek following our release of updated measures of crime and disorder based on 911 reports earlier this year. She used multilevel models with annual measures nested within census tracts in order to fit separate growth lines for each tract. This is only possible for local investment because of the smaller number of comparable years for new constructions.

If you’re unfamiliar with this technique, think of it this way. First, this is a trend line for all local investment in the city.

```
require(ggplot2)
```

## Loading required package: ggplot2

```
# Citywide share of parcels with local investment, one value per year
years<-2010:2018
test_trend<-data.frame(years=years,
loc_inv=sapply(years,function(y) sum(analyze_ecometrics[[paste0('LOCALINV_count.',y)]])/sum(analyze_ecometrics$numLandParcels)))
```

```
ggplot(data=test_trend, aes(x=years,y=loc_inv)) + geom_point() + geom_smooth(method='lm') + xlab("Year") + ylab("% Parcels with Local Investment")
```

This shows a moderate upward trend in local investment since 2010. The apparent drop in 2018 could be attributable to the fact that we limit our measures to approved building permits, and some applications submitted in 2018 may not have been approved yet.

Now, let’s do the same but for all 161 census tracts included in this analysis.

```
require(reshape2)
```

## Loading required package: reshape2

```
melted<-melt(analyze_ecometrics[grepl('LOCALINV_PP|CT_ID_10',names(analyze_ecometrics))],id.vars=c("CT_ID_10"))
melted$year<-as.numeric(substr(melted$variable,nchar(as.character(melted$variable))-3,nchar(as.character(melted$variable))))
```
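
The year-extraction step above relies on the four-digit year sitting at the end of each column name. As a hedged alternative, the same result can be obtained with a regular expression, which some readers may find easier to follow; the example names below are illustrative only.

```r
# Sketch: pull the trailing four-digit year out of names like "LOCALINV_PP.2010".
# The example names are illustrative; the pattern assumes the year is the final
# four characters and is preceded by a period.
example_vars <- c("LOCALINV_PP.2010", "LOCALINV_PP.2018")
example_years <- as.numeric(sub("^.*\\.([0-9]{4})$", "\\1", example_vars))
example_years
```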

```
ggplot(data=melted, aes(x=year,y=value, color=as.factor(CT_ID_10))) + geom_point() + geom_smooth(method='lm', se=FALSE) + xlab("Year") + ylab("% Parcels with Local Investment") + theme(legend.position = 'none')
```

Putting aside the fact that this graph is a bit messy, the point is that each census tract has its own trendline, some going up, some going down. Because there are high correlations across time, most of the lines have similar slopes; if the correlations were lower, we would see more variation.
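
To make the idea of one trendline per tract concrete, here is a minimal sketch that fits a separate ordinary least-squares line to each tract and returns the slopes. `tract_slopes` is a hypothetical helper and the toy data are made up; the column names mirror those in `melted`. These separate regressions correspond to the lines drawn in the graph and will generally differ somewhat from multilevel estimates.

```r
# Sketch: one ordinary least-squares slope per tract from a long-format table.
# `tract_slopes` is a hypothetical helper; column names follow `melted` above.
tract_slopes <- function(long_df) {
  pieces <- split(long_df, long_df$CT_ID_10)
  vapply(
    pieces,
    function(d) unname(coef(lm(value ~ year, data = d))["year"]),
    numeric(1)
  )
}

# Toy data (made up): tract "A" rises 0.01 per year, tract "B" falls 0.02 per year
toy_long <- data.frame(
  CT_ID_10 = rep(c("A", "B"), each = 4),
  year     = rep(2010:2013, times = 2),
  value    = c(0.10, 0.11, 0.12, 0.13, 0.30, 0.28, 0.26, 0.24)
)
slopes <- tract_slopes(toy_long)
slopes
```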

Using multilevel linear modeling we can conveniently extract these slopes (note: the algebraic basis of multilevel models is such that the results are not going to be arithmetically identical to what is represented in the graph, but that is not critical for the task at hand; also, given the potential for undercounting, we will drop 2018 for the analysis). We will then create a map of the slope for each census tract.

```
require(lme4)
```

## Loading required package: lme4

## Loading required package: Matrix

```
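melted$year_trend<-melted$year-2010  # years since 2010, so the intercept is the 2010 level
# Random intercept and random slope for each tract; 2018 is dropped (possible undercount)
trends_lmer<-lmer(value~year_trend + (year_trend|CT_ID_10),data=melted[melted$year!=2018,])
summary(trends_lmer)
# coef() returns one row per tract (fixed + random effects), with tract IDs as row names
trends_coef<-coef(trends_lmer)$CT_ID_10
names(trends_coef)[1]<-"Intercept"
# Align IDs by row name rather than by position, in case the row orders differ
tract_ids<-analyze_ecometrics$CT_ID_10[match(rownames(trends_coef),as.character(analyze_ecometrics$CT_ID_10))]
trends_coef<-cbind(CT_ID_10=tract_ids,trends_coef)
tracts_geo<-merge(tracts_geo,trends_coef,by='CT_ID_10',all.x=TRUE)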
```
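
For readers who prefer notation, the `lmer` formula above corresponds to a standard random-intercept, random-slope model. The symbols below are my own shorthand rather than anything defined in the data release: $y_{ti}$ is the local investment measure for tract $i$ in year $t$, and $\text{trend}_{t}$ is the number of years since 2010.

$$
y_{ti} = (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\,\text{trend}_{t} + \varepsilon_{ti},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim N(\mathbf{0}, \Sigma),
\qquad
\varepsilon_{ti} \sim N(0, \sigma^2)
$$

Here $\beta_0$ and $\beta_1$ are the citywide intercept and slope, while $u_{0i}$ and $u_{1i}$ are each tract's deviations from them. The values returned by `coef()` are the tract-specific sums $\beta_0 + u_{0i}$ and $\beta_1 + u_{1i}$, so the `year_trend` column is each tract's estimated annual change.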

```
tm_shape(tracts_geo) + tm_polygons(c("year_trend"))
```

## Variable "year_trend" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

As our previous analysis would have suggested, there is not a lot of variation in the slopes. In fact, this map is rather underwhelming (or, dare I say, boring?). This is simply another bit of evidence that, yes, a small number of places have seen their tendency towards or away from investment shift, like east Charlestown and South Bay, but things have largely remained stable over the course of the decade.

Much like our post about the 911 ecometrics from a few months ago, we find that neighborhood characteristics have not shifted much over the course of the decade. Just as the places with high or low crime have generally stayed the same, the places where people are investing in construction projects are also the same places as in 2010.

That said, there is a rich amount of information here capturing how Bostonians have (and have not) invested in their properties and neighborhoods. We encourage others to leverage it in their own work.