It has *much* more to do with the memory card's erase block size.
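NTFS wants to use a small allocation unit size: 4 KB by default on any modern Windows volume, and as little as 512 bytes or 1 KB on old or very small ones. (Don't believe me? Open an elevated command prompt, run fsutil fsinfo ntfsinfo C: and see what "Bytes Per Cluster" says.)
These small sizes were selected because they line up with the 512-byte sector size of original Winchester-style hard disk drives. A cluster is either one sector or a small whole number of sectors, which makes those sizes the most efficient to transfer to or from the disk controller.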
Modern drives tend to favor 4 KB sectors, but most still emulate 512-byte ones.
FAT allows cluster (allocation unit) sizes quite a bit larger than this. Usually they land between 4 KB and 16 KB, but 32 KB and 64 KB clusters are supported.
For early flash memory cards, 32 KB and 64 KB cluster sizes were 1:1 with the erase block sizes of the flash array, meaning that having the filesystem use that size gave the best possible efficiency with the device controller.
SDHC and SDXC devices, though, have erase block sizes that (cough) 'greatly exceed' (cough) what FAT32 can support.
exFAT, however, happily lets you use cluster sizes in the MULTIPLE MEGABYTES range (up to 32 MB), allowing the flash makers to still have 1:1 cluster->erase unit parity and maximized device I/O efficiency.
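To make the cluster-size ceiling concrete, here is a rough Python sketch. The 4 MiB erase block is a made-up example figure, not a number from any particular card, and the function name is just for illustration. The FAT32 limit is the 64 KB upper bound mentioned above; the exFAT limit is the 32 MB maximum from the exFAT specification.

```python
KIB = 1024
MIB = 1024 * KIB

# Largest cluster size each filesystem can be formatted with.
MAX_CLUSTER_BYTES = {
    "FAT32": 64 * KIB,  # upper bound cited above; 32 KiB is the more portable limit
    "exFAT": 32 * MIB,  # maximum allowed by the exFAT specification
}


def can_match_erase_block(fs_name, erase_block_bytes):
    """Return True if fs_name can use a cluster exactly as large as the erase block.

    Cluster sizes must be powers of two, so an erase block that is not a
    power of two can never be matched 1:1.
    """
    is_power_of_two = erase_block_bytes > 0 and (erase_block_bytes & (erase_block_bytes - 1)) == 0
    return is_power_of_two and erase_block_bytes <= MAX_CLUSTER_BYTES[fs_name]


if __name__ == "__main__":
    hypothetical_erase_block = 4 * MIB  # assumption, for illustration only
    for fs in MAX_CLUSTER_BYTES:
        print(fs, can_match_erase_block(fs, hypothetical_erase_block))
```

With that assumed 4 MiB erase block, FAT32 comes back False and exFAT comes back True, which is the whole point of the switch.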
Your camera formats that card as exFAT because that's what the SD Association demands for SDXC cards.
The SD Association demands it so that they can reliably claim the write speeds printed on the top of the card.
NTFS will annihilate flash cards with write amplification, and will have piss-poor I/O performance writing to them.
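If you want a back-of-the-envelope number for that write amplification, here is a small Python sketch. It assumes a deliberately dumb controller that has to erase and rewrite a whole erase block whenever any cluster inside it changes, and it uses a hypothetical 4 MiB erase block. Real cards buffer and remap writes, so treat this as a worst-case ceiling, not a measurement.

```python
KIB = 1024
MIB = 1024 * KIB


def worst_case_write_amplification(cluster_bytes, erase_block_bytes):
    """Physical bytes rewritten per logical byte written, for one aligned cluster.

    Assumes (pessimistically) that every erase block the cluster touches
    must be erased and rewritten in full.
    """
    if cluster_bytes <= 0 or erase_block_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: number of erase blocks one aligned cluster touches.
    blocks_touched = -(-cluster_bytes // erase_block_bytes)
    return blocks_touched * erase_block_bytes / cluster_bytes


if __name__ == "__main__":
    erase_block = 4 * MIB  # assumption, for illustration only
    for label, cluster in [("4 KiB", 4 * KIB), ("32 KiB", 32 * KIB), ("4 MiB", 4 * MIB)]:
        factor = worst_case_write_amplification(cluster, erase_block)
        print(f"{label} clusters: up to {factor:.0f}x")
```

Under those assumptions a 4 KiB cluster can cost up to 1024x, a 32 KiB cluster up to 128x, and a cluster matched to the erase block exactly 1x.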