Word Frequency Count from Network Review

Title: Word Frequency Count from Network Review
Authors: Evans, T
Item Type: Dataset
Abstract: These files contain the data used in one of the plots shown in Figure 10 of my basic overview of Complex Networks (a review for Contemporary Physics, see below). Please cite the source if you use this data. You could also regenerate the data yourself from the LaTeX file available via arXiv. The data was produced by using various UNIX tools to strip the LaTeX commands, giving a list of words (one per line), and then counting the number of times each line was repeated. There was no stopping or stemming, e.g. "The" and "the" are counted separately, as are "vertex" and "vertices".
Files:-
netrevcountrawdata.xls = rank and count for each word, along with plots
netrevcountTabSeparated.txt = rank and count for each word in simple text format
netrevindex.txt = raw data, unsorted (note there are some silly 'words' like "x")
Original Text:- T.S. Evans, "Complex Networks", Contemporary Physics 45 (2004) 455-475, DOI: 10.1080/00107510412331283531, arXiv:cond-mat/0405123, http://arxiv.org/abs/cond-mat/0405123
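The procedure described in the abstract (strip the LaTeX commands, list the words, count repeats, rank by count) can be sketched roughly as follows. This is only an illustration in Python, not the UNIX tool chain actually used to produce the files: the function names `strip_latex`, `word_counts` and `ranked` are hypothetical, and the regular expressions are an assumed, simplified way of removing LaTeX markup. As in the dataset, no stopping or stemming is applied, so "The" and "the" stay separate, and single letters from mathematics such as "x" survive as words.

```python
import re
from collections import Counter


def strip_latex(text):
    """Remove LaTeX comments and commands (simplified, assumed rules)."""
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", text)
    # Drop command names such as \section or \cite; brace contents are kept.
    text = re.sub(r"\\[A-Za-z]+\*?", " ", text)
    # Drop remaining escaped single characters such as \& or \\.
    text = re.sub(r"\\.", " ", text)
    return text


def word_counts(text):
    """Count each word exactly as written: no case folding, no stemming."""
    words = re.findall(r"[A-Za-z]+", strip_latex(text))
    return Counter(words)


def ranked(counts):
    """Return (rank, word, count) tuples, most frequent word first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    sample = r"\section{Networks} The vertex and the vertices. The network % note"
    for rank, word, count in ranked(word_counts(sample)):
        # Tab-separated output, one word per line.
        print(f"{rank}\t{word}\t{count}")
```

Ties in count are broken alphabetically here purely to make the output reproducible; the ordering of ties in the original files is not stated.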
Issue Date: 12-Dec-2012
URI: http://hdl.handle.net/10044/1/30136
DOI: https://dx.doi.org/10.6084/m9.figshare.104409.v1
Keywords: Complex Networks
Frequency Count
Zipf's Law
Probability
Condensed Matter Physics
Science Policy
Appears in Collections: Physics
Theoretical Physics
Faculty of Natural Sciences



