<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Crescat Graffiti, Vita Excolatur &#187; metadata</title>
	<atom:link href="http://www.crescatgraffiti.com/tag/metadata/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.crescatgraffiti.com</link>
	<description>Confessions of the University of Chicago</description>
	<lastBuildDate>Sun, 01 Aug 2010 21:28:23 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Crescat Graffiti: The Data Set</title>
		<link>http://www.crescatgraffiti.com/2009/12/31/crescat-graffiti-the-data-set/</link>
		<comments>http://www.crescatgraffiti.com/2009/12/31/crescat-graffiti-the-data-set/#comments</comments>
		<pubDate>Thu, 31 Dec 2009 16:21:41 +0000</pubDate>
		<dc:creator>Quinn</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[RegRemix]]></category>
		<category><![CDATA[data set]]></category>
		<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.crescatgraffiti.com/?p=539</guid>
		<description><![CDATA[Until recently, I&#8217;ve thought of Crescat Graffiti as an art/anthropology project, and it never occurred to me to treat it as a data set. But now that&#8217;s what I&#8217;m doing as part of putting together a guest post for a science magazine. I love a good data set, but in the process of making it [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://farm4.static.flickr.com/3397/3588989643_d6dd5106df.jpg" rel="lightbox"><img src="http://farm4.static.flickr.com/3397/3588989643_d6dd5106df_m.jpg" class="alignright" /></a>Until recently, I&#8217;ve thought of <em>Crescat Graffiti</em> as an art/anthropology project, and it never occurred to me to treat it as a data set. But now that&#8217;s what I&#8217;m doing as part of putting together a guest post for a science magazine. I love a good data set, but in the process of making it, I&#8217;m finding myself racked with indecision about how much metadata to capture. Sure, I could just type out all the words and make a pretty word cloud or a word-count list, but I keep wondering what other avenues might be fruitful.</p>
<ul>
<li>Differentiating walls from whiteboards from study carrels</li>
<li>Date
<ul>
<li>Quarter</li>
<li>Year</li>
<li>Month? Day?</li>
</ul>
</li>
<li>Granularity
<ul>
<li>Word-level</li>
<li>Sentence-level?</li>
<li>Graffiti-post-level? This would allow things like computing the average number of words in a piece of graffiti.</li>
</ul>
</li>
<li>Writing implement?</li>
</ul>
<p>I could do any or all of these, but I wonder how many of them will yield any genuinely interesting results.</p>
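<p>As a rough sketch of what graffiti-post-level records with the candidate metadata listed above could look like, and of the average-word-count idea: the field names, the sample entries, and the use of Python here are all illustrative assumptions, not the final format of the data set.</p>
<pre>
```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Graffito:
    """One transcribed piece of graffiti plus candidate metadata.

    All field names are hypothetical, chosen only for this sketch.
    """
    text: str
    surface: str      # e.g. "wall", "whiteboard", "carrel"
    year: int
    quarter: str      # e.g. "autumn", "winter", "spring", "summer"
    implement: str    # e.g. "pen", "pencil", "marker"


def word_count(piece):
    # Naive whitespace tokenization; punctuation stays attached to words.
    return len(piece.text.split())


def mean_words_by(pieces, field):
    """Average number of words per piece, grouped by one metadata field."""
    groups = defaultdict(list)
    for piece in pieces:
        groups[getattr(piece, field)].append(word_count(piece))
    return {key: mean(counts) for key, counts in groups.items()}


# Made-up sample records, not real entries from the collection.
sample = [
    Graffito("example text one", "wall", 2009, "autumn", "pen"),
    Graffito("another example", "wall", 2009, "winter", "pencil"),
    Graffito("a whiteboard example with six words", "whiteboard", 2009, "winter", "marker"),
]

by_surface = mean_words_by(sample, "surface")
```
</pre>
<p>Grouping by a different field, such as <code>quarter</code> or <code>implement</code>, only changes the second argument, which makes it cheap to check whether a given piece of metadata yields anything interesting before committing to capturing it for every entry.</p>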
<p>Then there&#8217;s the question of format. I was initially thinking of a public Google Docs spreadsheet, but I prefer doing my data crunching with XSLT, which suggests XML. (And I find it easier to go from XML to a spreadsheet than the other way around.) That said, if I&#8217;m going the XML route, I&#8217;m writing my own schema. No doubt there&#8217;s a way to encode everything I want using TEI, but I&#8217;d prefer to have at least an ounce of sanity left by the time I&#8217;m done with this.</p>
<p>The data set will be free for anyone to use, so if you&#8217;ve got preferences (other than using TEI) or suggestions for what metadata I should include, do leave a comment.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.crescatgraffiti.com/2009/12/31/crescat-graffiti-the-data-set/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
