I’ve started slogging through all the graffiti transcriptions for my next analysis (see previous post for details), and as a teaser I’m posting the data about quotes found in the Brown University corpus.
Perhaps “plagiarism” is unnecessarily accusatory. Many of the pieces pull from popular culture, a frame of reference ideally shared by writer and reader, so the reader would likely recognize the source without written attribution rather than assume the writer came up with the quote. That said, I’ve been fooled before by quotes that seemed profound until I Googled them, and there was one example from Brown where a reader expressed interest in marrying the writer of a piece of graffiti, perhaps not realizing the author was W.H. Auden.
A cursory glance at the other data sets shows that music is the most common source of quotes. I’d guess that at Brown, music is referred to less often than average. With only one data point, it’s hard to determine what does (or doesn’t) make the genre results interesting, but I was surprised to see the strong showing by indie rock, as well as the fact that the three varieties of rock music together make up almost half of the data. Meanwhile, rap and R&B make up only 22% of the data, much lower than I expected.
Another preview: the preliminary average “interestingness” score (out of 3, before I implement category-based weighting) for the Brown graffiti is 1.56. The “Quotes” category described above makes up 14% of the pieces of graffiti that fall in a specific category (i.e., excluding the generic “misc” and “reply” categories). The most common category at Brown? Sex, at 20%.
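For anyone curious how figures like these can be computed, here is a minimal Python sketch. It is not the actual analysis code: the function names, the category labels other than “misc” and “reply”, and the toy data are all made up for illustration. It also assumes the average score is taken over every piece, including the generic ones, which may not match the real calculation.

```python
from collections import Counter

# Generic categories left out of the share calculation (named in the post).
GENERIC = {"misc", "reply"}


def category_shares(categories):
    """Return each specific category's share of all non-generic pieces."""
    specific = [c for c in categories if c not in GENERIC]
    if not specific:
        return {}
    counts = Counter(specific)
    return {cat: n / len(specific) for cat, n in counts.items()}


def mean_score(scores):
    """Unweighted mean score (no category-based weighting applied)."""
    return sum(scores) / len(scores)


# Toy data, invented for this example: (category, score out of 3).
pieces = [
    ("sex", 2),
    ("quotes", 1),
    ("misc", 1),
    ("reply", 3),
    ("sex", 2),
    ("politics", 3),
]

shares = category_shares([category for category, _ in pieces])
average = mean_score([score for _, score in pieces])

for category, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{category}: {share:.0%}")
print(f"average score: {average:.2f}")
```

Dropping “misc” and “reply” before dividing is what makes the percentages describe only the pieces that landed in a specific category.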