Results:
An initial analysis using Voyant Tools produced a comprehensive word frequency list, which is represented in the figure on the left. After adjusting the settings to exclude "stop words" (common words like "I" and "and"), Voyant calculated that the word "going" occurred most often in our data set. It was used 641 times, roughly 140 more times than "down," the second most used word (497). Within the data set, "going" is clearly an outlier. The list is otherwise fairly homogeneous, with the gap between consecutive words being no more than 100 as the list descends. After "going" and "down," the words "Lord" (484), "baby" (410), and "when" (404) were the next most frequently used. Following those three, the words "man" (399), "ain't" (334), and "mama" (279) rounded out the most notable.
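To make the stop-word step concrete, here is a minimal sketch of a word frequency count that skips stop words. It is not Voyant's actual implementation: the stop-word list, the tokenizing rule, the function name word_frequencies, and the sample line are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the list used by a real tool is much longer.
STOP_WORDS = {"i", "am", "and", "the", "a", "to", "my", "on", "in"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase word tokens in text, skipping stop words."""
    # Keep apostrophes so contractions such as "ain't" stay one token.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Made-up sample line, not taken from the blues data set.
sample = "I am going down the road and I am going to the river"
freqs = word_frequencies(sample)
most_used = freqs.most_common(1)[0]  # the single most frequent remaining word
```

Without the stop-word filter, words like "I" and "the" would crowd the top of the list, which is why the adjustment matters before comparing counts.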
In the next step of the analysis, AntConc scanned the data set and ranked clusters for each word. Clusters are word groupings: words that frequently appear together and so provide a semantic context for a key word. Helpfully, AntConc ranks the clusters based not just on how often a grouping occurs but also on its range (the number of documents the grouping is found in), which makes for a more accurate ranking. When run on the blues data set, the tool produced some interesting results. Below are some of the more interesting clusters.
Going: going to run, going back to, going to get, going to be, going to the, going to jump, going to tell, going to have, going to leave, going to need.
Down: down in the, down on the, down on my, down in old, down the road, down a pallet.
When: when your mother (outlier), when I was, when things go, when I get, when you're, when the sun.
*Note: each cluster is listed in order from highest ranked to lowest.
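The idea of ranking clusters by both frequency and range can be sketched as follows. This is only an illustration, not AntConc's own code: the function name rank_clusters, the three-word cluster size, and the sample documents are hypothetical, and sorting by range first and frequency second is an assumption about how the two measures might be combined.

```python
import re
from collections import Counter

def rank_clusters(documents, keyword, size=3):
    """Return (cluster, frequency, range) tuples for word groups starting with keyword."""
    freq = Counter()       # total occurrences across all documents
    doc_range = Counter()  # number of documents that contain the cluster
    for doc in documents:
        tokens = re.findall(r"[a-z']+", doc.lower())
        seen = set()
        for i in range(len(tokens) - size + 1):
            if tokens[i] == keyword:
                cluster = " ".join(tokens[i:i + size])
                freq[cluster] += 1
                seen.add(cluster)
        # Each document adds at most one to a cluster's range.
        doc_range.update(seen)
    # Assumed ordering: range first, then frequency, then alphabetical for ties.
    return sorted(
        ((c, freq[c], doc_range[c]) for c in freq),
        key=lambda row: (-row[2], -row[1], row[0]),
    )

# Made-up documents, not lyrics from the blues data set.
docs = [
    "going to run and going to run",
    "going to get my baby",
    "she is going to get there",
]
ranked = rank_clusters(docs, "going")
```

In this toy example both clusters occur twice, but "going to get" appears in two documents while "going to run" appears in only one, so taking range into account places "going to get" first.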