An intriguing article in Slate posits that the new search technologies unleashed by Google Book Search may expose past plagiarisms that were never detected. It gives an example: an 1899 essay by England Howlett that copied an 1892 book by a well-known author.
Most interesting to me was this passage, which shows how unlikely accidental plagiarism is, even for a simple sentence:
But wait, you might ask, don’t people accidentally repeat each other’s sentences all the time? It seems to me that this should not be unusual. Yet try plugging that last sentence word by word into Google Book Search, and watch what happens.
It: Rejected—too many hits to count
It seems: 11,160,000 matches
It seems to: 3,050,000
It seems to me: 1,580,000
It seems to me that: 844,000
It seems to me that this: 29,700
It seems to me that this should: 237
It seems to me that this should not: 20
It seems to me that this should not be: 9
It seems to me that this should not be unusual: 0
“It seems to me that this should not be unusual” is itself … unusual.
Google Book Search contains hundreds of millions of printed pages, and yet after just a few words, the likelihood of the sentence’s replication scales down dramatically. And even before our sentence implodes into utter improbability, there’s another telling phenomenon at work. The nine books that contain the penultimate “It seems to me that this should not be” are from a grab bag of subjects: a 2001 study of Freud, an 1874 collection of Methodist camp sermons, minutes from a 1973 hearing of the Senate subcommittee on transportation. So, if replicating the same sentence alone is suspicious behavior, then to also replicate it on the same subject warrants dialing 911.
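For anyone who wants to try the word-by-word experiment on their own collection of texts, here is a minimal Python sketch. It does not query Google Book Search. It assumes you already have the documents as plain strings, and the function names (`tokenize`, `contains_phrase`, `prefix_counts`) are my own, chosen for illustration. It counts how many documents contain each successively longer prefix of a sentence as an exact phrase.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_phrase(doc_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in doc_tokens."""
    n = len(phrase_tokens)
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def prefix_counts(sentence, documents):
    """For each prefix of the sentence (1 word, 2 words, ...), count the
    documents that contain that prefix as an exact phrase."""
    words = tokenize(sentence)
    docs = [tokenize(d) for d in documents]  # hypothetical local corpus
    results = []
    for k in range(1, len(words) + 1):
        prefix = words[:k]
        count = sum(contains_phrase(d, prefix) for d in docs)
        results.append((" ".join(prefix), count))
    return results


if __name__ == "__main__":
    # Toy corpus, made up for illustration only.
    corpus = [
        "It seems to me that this is fine.",
        "It seems odd.",
        "Nothing here.",
    ]
    for prefix, count in prefix_counts("It seems to me that this should", corpus):
        print(f"{prefix}: {count}")
```

On a real corpus the scan would be slow, since every document is rechecked for every prefix, but it is enough to watch the counts fall off as the phrase grows.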