Users may want to reduce their memory consumption by using fp16.
However, in my tests, such attempts resulted in lower-quality renders.
Some data type conversions did not have any impact, so I removed them completely.
* Provide --data_on_cpu option to save VRAM for training
When there are many training images, as in a large scene, most of the VRAM is used to store training data. Using --data_on_cpu reduces VRAM usage and makes it possible to train on GPUs with less VRAM.
* Fix data_on_cpu effect on default mask
* Rename --data_on_cpu to --data_device
* Update README
* Format warning messages
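
As a rough illustration of why the --data_device option helps, here is a minimal sketch of how such a flag could be parsed and how much VRAM the training images would otherwise occupy. Only the flag name comes from this change; the default value, the accepted choices, the helper names `build_parser` and `estimate_image_bytes`, and the image sizes are assumptions made for the example, not the actual implementation.

```python
import argparse


def build_parser():
    """Hypothetical parser; only the flag name is taken from this change."""
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    # Assumption: the option takes a device name and defaults to "cuda".
    parser.add_argument(
        "--data_device",
        type=str,
        default="cuda",
        choices=["cuda", "cpu"],
        help="Device on which the training images are stored.",
    )
    return parser


def estimate_image_bytes(num_images, height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold all training images as dense float32 arrays."""
    return num_images * height * width * channels * bytes_per_value


# Example: 1000 Full HD RGB images stored as float32 (illustrative numbers).
args = build_parser().parse_args(["--data_device", "cpu"])
dataset_bytes = estimate_image_bytes(1000, 1080, 1920)

# With the images kept on the CPU, they no longer occupy VRAM permanently;
# only the image used in the current iteration has to be copied to the GPU.
resident_vram_bytes = 0 if args.data_device == "cpu" else dataset_bytes

print(f"dataset size: {dataset_bytes / 2**30:.2f} GiB")
print(f"VRAM held by stored images: {resident_vram_bytes / 2**30:.2f} GiB")
```

The trade-off in a setup like this is an extra host-to-device copy per iteration in exchange for the lower resident VRAM.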