Re: Hash Function
From: Jordi Gutiérrez Hermoso
Subject: Re: Hash Function
Date: Wed, 5 Dec 2012 10:35:02 -0500
On 5 December 2012 10:07, Jonathan Karsch <address@hidden> wrote:
> I am trying to figure out how many distinct words are in a text
> document, and how many instances there are of each.
Octave has neither hashes nor sufficiently flexible associative
arrays, so I recommend using a language other than Octave for this task.
For example, here is how you can do it in Python:
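#!/usr/bin/env python
import sys
from collections import defaultdict

# Map each word to its count; words not seen yet start at 0.
wordcount = defaultdict(int)

# Read the file named on the command line, one line at a time.
with open(sys.argv[1]) as f:
    for line in f:
        # split() with no arguments splits on any run of whitespace.
        for word in line.split():
            wordcount[word] += 1

# items() and single-argument print(...) work in both Python 2 and 3.
for word, count in wordcount.items():
    print("%s: %d" % (word, count))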
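As a variation on the Python script above, the standard library's collections.Counter can do the tallying in one call. This is only a minimal sketch, assuming Python 3; the count_words function and the sample sentence are made up for illustration and are not part of the script above.

```python
from collections import Counter


def count_words(text):
    """Return a Counter mapping each whitespace-separated word to its count."""
    # Counter tallies every item of the list that split() produces.
    return Counter(text.split())


if __name__ == "__main__":
    # Hypothetical sample text, used only to exercise the function.
    sample = "the cat and the hat and the bat"
    counts = count_words(sample)
    # len() of the Counter is the number of distinct words.
    print(len(counts), "distinct words")
    # most_common() yields (word, count) pairs, most frequent first.
    for word, count in counts.most_common():
        print("%s: %d" % (word, count))
```

Like the script above, this treats "Word" and "word," as different words, since it only splits on whitespace.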
Here is how to do it in Perl:
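#!/usr/bin/env perl
use strict;
use warnings;  # replaces -w, which some systems do not pass through env

my %wordcount;

# Read the files named on the command line (or standard input).
while (<>) {
    # split with no arguments splits $_ on whitespace.
    my @words = split;
    foreach my $word (@words) {
        $wordcount{$word}++;
    }
}

# Print each word with its count, in no particular order.
while (my ($word, $count) = each %wordcount) {
    print "$word: $count\n";
}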
Both languages are already installed on your Macintosh. Put either
of those programs into a file named count_words, make it executable
with "chmod +x count_words", and run "./count_words somefile".
HTH,
- Jordi G. H.