Theory & Practice of Data Cleaning
This is a topic I find very interesting; however, I find the industry's best practices counterintuitive and at odds with this very idea. In the near future I plan to attend several online classes on or around the subject, though I believe very little of it will be put to use professionally. Instead, this is more for my own systems and business model: since I'm limited by my hardware and the services I've created, data backups need to be precise, without question. An example of this very issue: on a project some time ago, I built an application where clients could upload an image to update their profile. Simple enough; however, the system had no way of knowing whether it was the same image, nor did it do any kind of file system cleanup for images no longer attached to a profile. This part was left up to another developer, for whom the server's hard-drive space wasn't a concern. The previous images in the system