While I am far from an organic food purist, I generally prefer real food to synthetic food. The same goes for data. While real data is messy and full of “flaws,” it is theoretically possible to have some kind of audit trail for it.
Often it is only noise in…But sometimes it is something close to a signal.
Synthetic data is so much cleaner than real data. No weird outliers. Just pure data.
If you thought making decisions in a world with real data was difficult, just wait until you experience the world with synthetic data.
No matter how good most of the assumptions used to generate it are, a synthetic data set is only as good as its worst underlying assumption.
And now those baked-in assumptions are feeding into another program that is itself based on assumptions about what to do with the data.
Oh and wait,