I know little about data duplication. Here’s a scenario:
There is a WordPress blog which is to be indexed by search engines.
There is an independent website which pulls the RSS feed of the above-mentioned blog and displays all the latest posts from that blog. (Not just titles or short descriptions; entire posts with images.)
Is this data duplication and/or a bad practice and/or illegal?
Search engines should see it as duplicate content and penalize the second site. Sadly, these scrapers often rank higher than the sites they copy, so it is the original site that ends up penalized.