The next US president may well be able to confirm the number of people attending the inauguration, and this is down to deep learning and convolutional neural networks.
Yuhong Li, Xiaofan Zhang, and Deming Chen, of Beijing University of Posts and Telecommunications and the University of Illinois at Urbana-Champaign, recently published CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes, and their work may revolutionise how accurate and fast crowd counting will be in the future.
The congested scene analysis works on 2D images: it breaks an image down and estimates where the crowd is densest. This could help organisers understand how to keep people flowing and, in static situations, how to keep density low enough to ensure attendee comfort.
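To make the idea of a density map concrete, here is a minimal sketch in Python. It assumes each person is marked by a single (row, column) head position and is spread into a fixed-width Gaussian blob; the function names, kernel size, and sigma are illustrative choices, not details taken from the CSRNet paper. Because every blob sums to one, summing the whole map gives the head count, and the brightest region shows where the crowd is densest.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalised to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def density_map(shape, points, size=15, sigma=4.0):
    """Place one unit-mass Gaussian blob per annotated head position.

    The returned map is padded by size // 2 pixels on every side so that
    blobs near the image border keep their full mass (hypothetical choice
    made here to keep the count exact).
    """
    h, w = shape
    pad = size // 2
    padded = np.zeros((h + 2 * pad, w + 2 * pad), dtype=np.float64)
    kernel = gaussian_kernel(size, sigma)
    for row, col in points:
        # In padded coordinates the blob centred on (row, col) spans
        # rows row..row+size-1 and columns col..col+size-1.
        padded[row:row + size, col:col + size] += kernel
    return padded


# Made-up annotations: three people close together, one far away.
heads = [(10, 12), (11, 14), (12, 13), (50, 60)]
dmap = density_map((64, 80), heads)

estimated_count = dmap.sum()  # integrates to the number of heads
peak_row, peak_col = np.unravel_index(dmap.argmax(), dmap.shape)
```

In a learned system the network predicts this map directly from the image rather than building it from annotations; the annotations only supply the training target.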
The team's work focused on density map creation, and by using dilated CNNs for crowd counting for the first time, the system outperforms other state-of-the-art crowd counting solutions. The team demonstrated the approach on five public datasets, and the overall model is smaller, more accurate, and easier to train and deploy.
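For readers unfamiliar with dilated convolution, the following is a small NumPy sketch of the operation itself, not of the CSRNet architecture. The function name and the toy 7x7 input are assumptions made for illustration. The point it shows is that a 3x3 kernel with dilation 2 covers a 5x5 area of the image while still using only nine weights.

```python
import numpy as np


def dilated_conv2d(image, kernel, dilation=1):
    """Valid-mode 2D cross-correlation with a dilated kernel."""
    kh, kw = kernel.shape
    # Effective footprint of the kernel once gaps are inserted between taps.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    out_h = image.shape[0] - eff_h + 1
    out_w = image.shape[1] - eff_w + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap reads the image at an offset scaled by the dilation rate.
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * image[r:r + out_h, c:c + out_w]
    return out


image = np.arange(49, dtype=np.float64).reshape(7, 7)
kernel = np.ones((3, 3))

dense = dilated_conv2d(image, kernel, dilation=1)    # 3x3 footprint, 5x5 output
dilated = dilated_conv2d(image, kernel, dilation=2)  # 5x5 footprint, 3x3 output
```

Deep learning frameworks expose the same idea as a dilation setting on their convolution layers, so the wider field of view comes without extra parameters or pooling.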
The ShanghaiTech crowd dataset contains 1,198 images with a total of 330,165 people, split into two parts: one of highly congested scenes and one of sparse crowd scenes. On this dataset, the method improved accuracy, dropping the error rate by 7%.
On the other datasets, CSRNet continued to outperform in most of the scenes, bettered in only one out of five sets by the CP-CNN method.
The method doesn’t need to be reserved for counting people. It can also be used for counting vehicles, and on the TRANCOS dataset it outperformed all other methods.
In summary, next time you’re in a photo, there may be a new method of counting how many other people are in the picture with you, with better accuracy than previously attained. Time to invest in some camouflage.