Tool for identifying languages
Jeff Kinz
jkinz at kinz.org
Tue Jan 17 12:57:54 EST 2006
On Tue, Jan 17, 2006 at 10:23:17AM -0500, Christopher Schmidt wrote:
> On Tue, Jan 17, 2006 at 09:34:06AM -0500, Jeff Kinz wrote:
> > Does anyone know of a tool that can determine which language a
> > chunk of text is written in? (Assume a few hundred words)
>
> http://languid.cantbedone.org/
> http://languid.cantbedone.org/Language-Guess.tgz
Wow. Unbelievable. Thank you Chris.
>
> --
> Christopher Schmidt
> Web Developer
Why I'm "wowed":
This tool appears to use some form of statistical analysis based on
how often certain three-character strings (trigrams) appear. Whitespace
counts as one of the characters. Very nice, and thanks again to Chris.
Here are a few random lines from the English "strings" file:
t t 45
be 46
ld 47
e a 48
rs 49
wa 50
ut 51
ve 52
ll 53
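
For anyone curious how trigram-based identification can work in general,
here is a minimal Python sketch. It is not the Language-Guess code, and I
have not confirmed how that tool scores text. It assumes the numbers in
the strings file are ranks (most frequent trigram first) and compares a
sample against each language profile by summing rank differences. The
names trigram_ranks, out_of_place, and guess_language, and the top_n and
penalty values, are all made up for illustration.

```python
from collections import Counter


def trigram_ranks(text, top_n=300):
    """Map the top_n most frequent trigrams in text to their rank (0 = most frequent)."""
    # Collapse runs of whitespace to a single space, so a space counts as a character.
    normalized = " ".join(text.lower().split())
    counts = Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ordered[:top_n])}


def out_of_place(sample, profile, penalty=300):
    """Sum of rank differences; a trigram missing from the profile costs `penalty`."""
    return sum(
        abs(rank - profile[gram]) if gram in profile else penalty
        for gram, rank in sample.items()
    )


def guess_language(text, profiles):
    """Return the key of the profile with the smallest distance to text."""
    sample = trigram_ranks(text)
    return min(profiles, key=lambda lang: out_of_place(sample, profiles[lang]))
```

To use it, build one profile per language from known text, for example
profiles = {"english": trigram_ranks(english_text)}, and then call
guess_language(sample, profiles). A real profile would need far more
training text than a sentence or two per language.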
--
Jeff Kinz, Emergent Research, Hudson, MA.
speech recognition software may have been used to create this e-mail
"The greatest dangers to liberty lurk in insidious encroachment by men
of zeal, well-meaning but without understanding." - Brandeis
To think contrary to one's era is heroism. But to speak against it is
madness. -- Eugene Ionesco