Nikita Slavin
Cartographer
How how-old-is-this.house was made
IDEA
Big Time BCN, a building-age map of Barcelona
From more or less nowhere, the idea of making a wonderful foldable map of Saint Petersburg appeared. It was supposed to show the age of buildings and their architectural style, and to highlight remarkable representatives of each style.

I have some hazy memories of similar online projects. Quick googling gives a bunch of results: Portland, Reykjavík, New York (Brooklyn and Manhattan), Barcelona, Ljubljana, Lviv, and even country-scale projects such as the Netherlands.
There are also projects about Saint Petersburg: «Retrospective of the development of St. Petersburg», covering the Petrogradskiy district, and the "Delovoy Peterburg" newspaper project «How St. Petersburg was built up: the history of city construction in 68 seconds».

The goal is a paper map; the idea of a web application would only arise later, during work on the project. Time to begin raw data mining.
RAW DATA
Pavel Souvorov and the "Delovoy Peterburg" project mention the "Technical and economic passports of apartment buildings" as a source. It is a table of about 22 thousand buildings with a lot of attributes. The address and year-built fields are of interest to us, but unfortunately there are no geographical coordinates.

The «Passports» can be found on the Saint Petersburg Open Data portal. The portal also has a section called "Map", and if you are a good investigator or have well-informed friends, in the "Other" subsection you can find a layer called "Object-address system of Saint-Petersburg". It contains well-detailed polygons for all objects, with attributes including name, address, and sometimes year. There are about 125 thousand addresses and 41 thousand dates. It is perfect, but first the layer has to be cleaned of embankments, sidings, and other things that have never been buildings.

Not every building in St. Petersburg has an address, but almost every building has been digitized by OpenStreetMap contributors; this is the missing part of the future dataset. Downloading all objects with the keys building=* and addr:city='Saint-Petersburg', we also grab the name, address and year-built tags (which are sometimes filled in). As a result we get 143 thousand buildings, 2 thousand of them with dates.
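For illustration, here is a minimal sketch of pulling the buildings from the Overpass API with Python. The area selector, timeout and output handling are assumptions; the actual download can just as well be done with a regional extract or the QGIS QuickOSM plugin.

```python
import requests

# Hypothetical Overpass request for all buildings of St. Petersburg.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"
QUERY = """
[out:json][timeout:600];
area["name:en"="Saint Petersburg"]["admin_level"="4"]->.spb;
(
  way["building"](area.spb);
  relation["building"](area.spb);
);
out tags center;
"""

response = requests.post(OVERPASS_URL, data={"data": QUERY})
response.raise_for_status()

buildings = []
for element in response.json()["elements"]:
    tags = element.get("tags", {})
    buildings.append({
        "osm_id": element["id"],
        "name": tags.get("name"),
        "street": tags.get("addr:street"),
        "housenumber": tags.get("addr:housenumber"),
        "year_built": tags.get("start_date"),  # filled in only occasionally
    })

print(f"Downloaded {len(buildings)} buildings")
```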

To estimate the layer coverage, let's make a quick preview: add the addresses from the "Passports" as points, and highlight the polygons that have years in the "Address system" and OSM. The preliminary evaluation shows a lack of data, so the mining continues.


No need for a long search: there is an amazing website about the architecture of Saint Petersburg houses, CityWalls.ru. Carefully collecting data from the portal, the outcome is a table with address, years, architect, architectural style, a link to the CityWalls page and a photo. In total 27 thousand records, though some of them are lost buildings.
Quickly turning the addresses into points and adding them to the previous layers, it seems the coverage is really good. Weeee!

Now all the data needs to be processed and cleaned of unnecessary stuff to get a good-looking dataset.
GEOPROCESSING
For GIS tasks I've used QGIS and ArcGIS Pro; for Python scripting, PyCharm.

The first-order task is to prepare the layer containing all the polygons that will later be filled with data. The base is the "Object-address system of Saint-Petersburg" layer: its polygon geometry is made with a high level of detail and looks very nice. Besides buildings, we also take all bridges into the final layer, and get rid of unwanted polygons of embankments, access roads, storage areas, etc. as well as we can.

From the OpenStreetMap layer we'll take all polygons that do not intersect the "Address system" polygons. The result is a layer with 142 thousand records, with the useful attributes already included (a GeoPandas sketch of this step follows the figure below).
Base buildings layer: "Object-address system of Saint-Petersburg" + OpenStreetMap
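A minimal sketch of that selection and merge, assuming both layers are available as files and GeoPandas 0.10+ is installed; the file names are placeholders.

```python
import pandas as pd
import geopandas as gpd

# Load the two sources; file names here are placeholders.
oas = gpd.read_file("object_address_system.gpkg")
osm = gpd.read_file("osm_buildings.gpkg").to_crs(oas.crs)

# Tag every OSM polygon that intersects an "Address system" polygon,
# then keep only the ones with no match.
joined = gpd.sjoin(osm, oas[["geometry"]], how="left", predicate="intersects")
osm_only = joined[joined["index_right"].isna()].drop(columns="index_right")

# Stack the two sources into a single base buildings layer.
base = gpd.GeoDataFrame(pd.concat([oas, osm_only], ignore_index=True), crs=oas.crs)
base.to_file("base_buildings.gpkg", driver="GPKG")
```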
The "Technical and economic passports of apartment buildings" is a table, one of columns are addresses. To use this data, the table should be geocoded - a point with coordinates should be assigned to each record. I've made a simple Python script, with online geocoding service adding longitude and latitude columns to the table.

CityWalls.ru webpages have a simple structure, which means the data can be collected easily.
Let's make a simple parser in Python: it collects the URL of the house webpage, the CityWalls ID, a text string with the year(s) built, the architectural style, the architect's name, the building's name, the address and a link to the photo. The coordinates from the site cannot be used, because users often put the pin near the house rather than on it, so I geocode the layer myself with the script we already have. The outcome of this stage is a set of geographical points with attributes for 27 thousand buildings.
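A hedged sketch of scraping a single house page with requests and BeautifulSoup; the CSS selectors and the attribute layout are assumptions for illustration, since the real markup has to be inspected first.

```python
import requests
from bs4 import BeautifulSoup

def parse_house(url: str) -> dict:
    """Collect the attribute table and the first photo from one CityWalls page."""
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    record = {"url": url}
    # Attributes appear as label/value pairs; the selector below is an assumption.
    for row in soup.select("table tr"):
        cells = [c.get_text(strip=True) for c in row.find_all("td")]
        if len(cells) == 2:
            label, value = cells  # e.g. year built, architect, style, address
            record[label] = value
    photo = soup.find("img")
    record["photo"] = photo.get("src") if photo else None
    return record
```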

Assigning the points from the "Passports" and CityWalls to the building polygons follows a simple rule: "the point should be on the polygon" (see the sketch after the list below). Now every polygon has a table with a bunch of information from different sources:
  • "Object-address system of Saint-Petersburg"
  • OpenStreetMap
  • "Technical and economic passports of apartment buildings"
  • CityWalls
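In GeoPandas terms this rule is a spatial join; a minimal sketch with placeholder file names (if several points land in one polygon, the resulting duplicate rows still have to be aggregated):

```python
import geopandas as gpd

buildings = gpd.read_file("base_buildings.gpkg")
citywalls = gpd.read_file("citywalls_points.gpkg").to_crs(buildings.crs)
passports = gpd.read_file("passports_points.gpkg").to_crs(buildings.crs)

# "The point should be on the polygon": attach the attributes of every point
# that falls inside a building polygon to that polygon.
buildings = gpd.sjoin(buildings, citywalls, how="left", predicate="contains")
buildings = buildings.drop(columns="index_right")
buildings = gpd.sjoin(buildings, passports, how="left", predicate="contains")
```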

Such a mess! The goal of the current stage is to get good-quality attributes for each building:
  • name
  • year built
  • architect
  • style
  • address
  • link to CityWalls page
  • link to photo
  • id

We determine the priority of sources for each attribute; for example, the address string quality is best in the "Object-address system", good in the "Passports", and roughly equal in the CityWalls and OSM entries.
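In practice this boils down to taking the first non-empty value in priority order. A toy sketch with assumed column names:

```python
import pandas as pd

# Assumed column names for the address from each source,
# ordered from the most to the least trusted.
ADDRESS_PRIORITY = ["addr_oas", "addr_passports", "addr_citywalls", "addr_osm"]

buildings = pd.DataFrame({
    "addr_oas":       ["Nevsky pr., 1", None],
    "addr_passports": [None,            "Sadovaya ul., 2"],
    "addr_citywalls": [None,            "Sadovaya, 2"],
    "addr_osm":       ["Nevsky 1",      None],
})

# For every row, take the first non-empty value moving left to right.
buildings["address"] = buildings[ADDRESS_PRIORITY].bfill(axis=1).iloc[:, 0]
print(buildings["address"].tolist())  # ['Nevsky pr., 1', 'Sadovaya ul., 2']
```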

Date mining is a special type of fun: for data analysis we need an integer, but in all sources the date is a string in very diverse formats. It could be the easy "1703" or "1703г", a year list "1703,2020", a building period "1822-1917", an epoch "before 1822", or the outstanding "1 9471 94 8", plus their combinations and exceptions.

After playing with the data a bit and making wrong decisions a couple of times, I found a rule that gives an acceptable result: take the first four-digit number in the string, unless the string is a period, in which case take the second four-digit number. After a bit of pain with regular expression magic, one line of code turns all the text fields into numbers.
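The original one-liner is not shown here; spelled out as a function, the rule looks roughly like this:

```python
import re

def extract_year(raw: str):
    """First four-digit number, or the second one if the string is a period."""
    years = re.findall(r"\d{4}", raw)
    if not years:
        return None
    # A period like "1822-1917": take the end of the period.
    if len(years) > 1 and re.search(r"\d{4}\s*[-–]\s*\d{4}", raw):
        return int(years[1])
    return int(years[0])

assert extract_year("1703г") == 1703
assert extract_year("1822-1917") == 1917
assert extract_year("before 1822") == 1822
```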
Green polygons have dates, red ones do not
If you look at the percentage of polygons with a non-empty date attribute (out of 142 thousand polygons, only about 55 thousand have it), the coverage seems weak. But if you evaluate the spatial distribution and the quality (houses versus transformer booths, for example), the coverage is good. Visually, the layer seems to solve its tasks: the spatial patterns are readable, and the holes are small or evenly distributed. Exploring the layer is engaging, but the inner perfectionist keeps asking for full coverage; let's shut him up for a while.

It makes sense to evaluate the quality of the result. Looking at a graph is one of the basic evaluation methods that is always available.
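The graph itself is just a histogram of the extracted years; a minimal sketch (the input file name and bin width are assumptions):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder input: the merged layer exported with its extracted "year" column.
years = pd.read_csv("buildings_years.csv")["year"].dropna()

plt.hist(years, bins=range(1700, 2021, 5))
plt.xlabel("Year built")
plt.ylabel("Number of buildings")
plt.show()
```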
Two peaks are immediately noticeable: 1917 and 2008. The dips after the October Revolution, WWII and the collapse of the Soviet Union are logical and readable. What about the two outliers? The hypothesis is that many pre-revolutionary buildings are marked as 1917, since all the "before/presumably" variants were caught by the data processing.

The year 2008 was, indeed, a period of active construction in St. Petersburg. The official Saint Petersburg government statistics for 2008 are lower than mine, but that alone means nothing. Unfortunately, I did not record in the final table which source each year was taken from, so I will postpone this investigation for later.

This is not a big problem, but it is important to warn the user about the anomalies, or to make a visualisation whose quality is not affected by the outliers.

Hooray! The data has been processed, the quality of it has been evaluated, and now we can visualize it.

By the way, the dataset is freely available under the CC BY-SA license on the "Dataset" page.
COLOR RAMP
Colleagues' experience shows that a spectacular visualization can be achieved by combining a colour ramp running from warm red to cold blue tints with a dark background. Let's give it a try: I've taken the dark Esri basemap and used the QGIS "Spectral" colour ramp to represent building age:
Looks interesting, but I want it to be awesome and somehow connected to Saint-Petersburg.

The local task is to find colours for the basemap and for the building-age colour ramp. One of my favourite St. Petersburg photographers is Andrei Mikhailov; the colours in his photos are close to my "sense of the city". From Andrei's Instagram I choose some pictures to build the colour palette from.
So, the red from the facade of the Panteleimon Church, the yellow from the Admiralty building and the Peter and Paul Cathedral, the green from the Kunstkamera tower and the blue from the dome of the Trinity Cathedral are collected into the palette.
The behinds of the griffons at the Bank Bridge and the roofs of Vasilievsky Island give the range of blue and dark gray shades we need. This is Petersburg, after all.

Having fun with Mapbox Studio, it is easy to make the basemap:
studio.mapbox.com
Putting everything together, I needed to play a little with how the timeline is distributed across the colour ramp. In the end, the best option is the one where each colour segment holds an equal number of objects.
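That is essentially a quantile (equal-count) classification, which QGIS offers out of the box as the "Equal Count (Quantile)" mode; the class breaks can also be sketched by hand, assuming the years sit in a CSV column and ten segments are wanted:

```python
import numpy as np
import pandas as pd

# Placeholder input: the merged layer exported with its "year" column.
years = pd.read_csv("buildings_years.csv")["year"].dropna()

# Ten colour segments, each holding roughly the same number of buildings.
breaks = np.quantile(years, np.linspace(0, 1, 11))
print(breaks.astype(int))
```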
The result looks cool, and I like it: the "fireflies" effect works, the map is "glowing". The transition from the old "lamp" street lighting to the new "LED" one is shown in the colours, which is quite symbolic. At the same time, the epochs are clearly readable, but not so much that the peaks of 1917 and 2008 become overwhelmingly noticeable. Now we need to find a way to show this to the public.
HOW-OLD-IS?
A prototype is needed. I hand the dataset to friends of mine who are developing the "Cartofell" datahub. On their map, the year built is shown by colour, and attributes and photos appear on click.

The positive feedback is that the process of map exploration is entertaining. This leads me to the idea that an interactive web map can be a good way to show the collected data, and also to improve the dataset, if a crowdsourcing mechanism is implemented.

Together with interested classmates, we look at the experience of related projects and their technology stacks. The requirements for the web application are:
  • fast rendering of the dataset's more than 120 000 polygons with the developed colour ramp on our basemap
  • a filter by time
  • showing object attributes and a photo
  • crowdsourced corrections
I get friendly advice to take a look at the Carto platform. It meets all the requirements except crowdsourcing, and you can do quite a lot on a free account. Some tricky processing of the data is needed for Carto compatibility: for example, for the year to work correctly with the histogram slider it has to be formatted as DDMMYYHHMMSSZ, and the maximum length of a text field is 255 symbols.

Some magic in the pop-up's HTML code makes it look good and show the web links. It's a pity that not everything is realised as I wanted, but further customization is only accessible via the time-consuming and expensive API. The result is a pretty nice, fast and responsive web map.
carto.com
The crowdsourcing problem is solved with an elegant crutch. A Google Form can be pre-filled using parameters in the link, and the responses are added to a table. This suits us perfectly: edits can be reviewed, and the layer can be automatically updated afterwards. Generating a link for each object (some of the links are very long, but a crutch is a crutch) and attaching the link to the building card, the corrections collection mechanism is ready.
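A sketch of generating such pre-filled links; the form URL and the entry field IDs are placeholders that in reality come from the form's "Get pre-filled link" feature:

```python
from urllib.parse import urlencode

# Placeholder; a real pre-filled form URL looks like .../d/e/<form id>/viewform
FORM_URL = "https://docs.google.com/forms/d/e/FORM_ID/viewform"

def correction_link(building_id, address, year):
    """Pre-fill the correction form for one building."""
    params = {
        "usp": "pp_url",
        "entry.1111111": building_id,  # placeholder entry IDs
        "entry.2222222": address,
        "entry.3333333": year if year is not None else "",
    }
    return f"{FORM_URL}?{urlencode(params)}"

print(correction_link("42", "Nevsky prospekt, 1", 1753))
```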

The challenge of finding a domain name for the project is tough and fun. The name needs to convey the essence of the project, be a little original, and avoid the word "everyone" (because it isn't). I find out that there is a ".house" domain zone, and that is how the variant "how-old-is-this.house" is born. If the project ever becomes global, the name will already fit.


As mentioned, web development is not my trade, so after a short overview of cloud-based website builders I've chosen Tilda. It is really good at adapting to mobile screens, which is important to me.

The fast route: make the website, integrate the carto.com map, make it bilingual, connect comments and other services, adapt it for mobile screens. Generally, it is fun. The achievement: how-old-is-this.house
"WAS IT PAPER MAP?"
In cartography I love making things you can touch: printed or wooden. You can put them on a shelf or a wall; isn't it cool? On the other hand, I'm relatively cold towards mobile and web stories (check both in my portfolio on Behance).

The original idea was to make a paper map of the Saint Petersburg city centre, and I can already see it in my head. But during the project it became clear that a paper map needs more time and data. So I keep working on it; you can subscribe to a notification for when it is ready on the "Printed poster and map" page.

The style developed for the web is crying "put me on a poster and then on the wall". Inspired by the Charles Joseph Minard quote "My maps do not just show, they also count, they calculate for the eye; that is the crucial point...", I start making the poster layout. The goal is to show the dataset in a beautiful and perfect way: it is important to be able to see each building on the map and to show the spatial distribution of building ages. A coloured histogram of the building ages seems like an informative and elegant addition to the map.

The print-on-demand platforms chosen are Printdirect for Russia and Printful for abroad. The intersection of the possible map scale and the available poster sizes defines the poster size: 91 × 61 cm.


Making the poster is an enjoyable process, especially since a lot of the work was done before: I'm just exporting a high-resolution raster from QGIS, making the histogram in Adobe Illustrator and adding the proper copyright notices. It is important to show the outliers on the histogram and give some comments about them. Looks good!

Last corrections after test prints on both print-on-demand platforms, and it's ready!

You can buy the poster on the "Printed poster and map" page.
A smiling nameless girl from Printful holds a virtual poster
THANKS
It was fun to do this project, and I want to say thank you to:
  • friends who put up with me and tested different parts and stages of the project
  • Sasha Semenov for technical consultations and advice
  • our very best Cartography M.Sc. programme, which gave me the opportunity, the circle of colleagues and the time to make such stories.