Note that we do not store any data; we train an AI algorithm that learns from examples, which is different from comparing new data against stored data.
As with many technologies, one always looks for a compromise. It won't be possible, even in the near future, to have a 3D sensor with megapixel resolution that works at 1000 fps. But many applications don't need that. If you have a static scenario, for instance a fixed location in a public space where you want to obtain images or count how many people are there without recording their faces or any other information that could identify them, this approach can be an option. If you need some form of 3D sensing that works at, for instance, 1000 fps, even if it is not very accurate, then this could be useful. If you want very high resolution, then it is not a good fit.