I have been playing with the project data to study the impact of COVID-19 social distancing and lockdown on the university, especially the use of campus facilities. Meanwhile, there are some time series analyses and behavioural modelling that I’d like to complete sooner rather than later. Everything has taken much longer than I planned. Here are some previews, followed by moaning, i.e., the painful processes involved in generating them.
The plot above shows some regular patterns of crowd density and how the numbers dropped due to the COVID-19 lockdown. Students started reducing their time on campus in the week prior to the official campus closure.
The autocorrelation graph suggests a daily pattern (the data were resampled at 5-minute intervals, so 288 samples make a day, hence the location of the first peak).
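To make the daily-lag point concrete, here is a minimal sketch, using synthetic data rather than the project's, of checking the autocorrelation at a lag of 288 samples (one day at 5-minute resolution):

```python
import numpy as np
import pandas as pd

# Synthetic occupancy series at 5-minute intervals: one day = 288 samples.
idx = pd.date_range("2020-02-03", periods=288 * 7, freq="5min")
daily_cycle = np.sin(2 * np.pi * np.arange(len(idx)) / 288)
occupancy = pd.Series(100 + 40 * daily_cycle, index=idx)

# For a series with a strong daily pattern, the autocorrelation at the
# daily lag is close to 1 -- that is the first peak in the plot.
acf_day = occupancy.autocorr(lag=288)
print(round(acf_day, 2))  # 1.0 for this perfectly periodic series
```

On real occupancy data the peak would be lower than 1 but still clearly visible against neighbouring lags.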
Seasonal decomposition based on the hypothesis of a weekly pattern. There is also a strong hourly pattern, which I’ll explain in the paper (not written yet!).
The plots above show the area crowd density dynamics of one floor of an academic building. The one on the left shows how an academic workspace, a few classrooms and some study areas were used during a normal week, when few people in the UK felt COVID-19 was relevant to them. The middle one shows the week when there were increasing reports of COVID-19 cases in the UK and the government was changing its tone and advising social distancing; staff and students reduced their hours on campus. The one on the right shows a week during the university closure (the building remained accessible for exceptional purposes).
Bullet points of work carried out (moaning):
- The main tables on the live DB have 500+ million records (taking about 300 GB of space). It took a week to replicate them on a second DB so I could decouple the messy analysis queries from the main one.
- A few Python scripts correlated loose data in the DB, which got me a 150+ GB CSV file for time series analysis. From there, the lovely pandas happily chews the CSV like it’s bamboo shoots.
- The crowd density floor map was built for live data (10-minute coverage). To reprogramme it for historical data and generate the animation above, a few things had to be done:
- A Python script ploughed through the main DB table (yes, the one with 500+ million records) and derived area density at 10-minute intervals. The script also did a few other things at the same time, so the whole run took a few hours.
- Future work will get all these tidied up and automated.
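For the curious, here is roughly how pandas can chew a CSV of that size without loading it all at once: stream it in chunks and aggregate as you go. The column names below ("timestamp", "area", "count") are assumptions for illustration, not the real schema.

```python
import pandas as pd

def hourly_totals(csv_path):
    """Stream a large CSV and return hourly totals per area."""
    parts = []
    # chunksize keeps memory bounded regardless of file size.
    for chunk in pd.read_csv(csv_path, parse_dates=["timestamp"],
                             chunksize=1_000_000):
        g = (chunk.set_index("timestamp")
                  .groupby("area")["count"]
                  .resample("1h").sum())
        parts.append(g)
    # Duplicate (area, hour) keys from chunk boundaries are summed together.
    return pd.concat(parts).groupby(level=[0, 1]).sum()
```

The same pattern (partial aggregates per chunk, then a final combine) works for most reductions that distribute over concatenation.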
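The density-derivation step above can be sketched like this. Column names are again hypothetical, and on the real 500-million-row table the aggregation would run against the DB in batches rather than on one in-memory frame:

```python
import pandas as pd

def area_density(events: pd.DataFrame) -> pd.DataFrame:
    """Bucket device sightings into 10-minute intervals per area."""
    events = events.copy()
    events["bucket"] = events["timestamp"].dt.floor("10min")
    # A device may report several times within a bucket; count it once.
    return (events.groupby(["bucket", "area"])["device_id"]
                  .nunique()
                  .rename("density")
                  .reset_index())
```

Rendering each 10-minute bucket as one frame of the floor map then gives the historical animation.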
[to be continued]