Category Archives: Featured tools
spaCy
Extremely promising new Python NLP tool: spaCy (commercial open-source software):
Unfortunately, it can only handle English input at the moment, and installation on Windows seems to be tricky. The project is under intense development, so the following links are worth checking regularly:
License: originally AGPLv3 (free for open-source projects); changed to the MIT License on 27 Sep 2015
Source: http://spacy.io/index.html#detailed-speed-comparison [accessed: 24/07/2015]
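As a rough illustration of what working with spaCy looks like, the sketch below tokenizes an English sentence. It assumes a recent spaCy release (`pip install spacy`), whose API may differ from the 2015 version discussed above. `spacy.blank("en")` builds a pipeline containing only the rule-based English tokenizer, so no statistical model has to be downloaded. The helper name `tokenize` is made up for this example.

```python
def tokenize(text):
    """Split English text into tokens with spaCy's rule-based tokenizer."""
    # Imported inside the function so the module loads even without spaCy.
    import spacy

    # Assumption: a blank English pipeline (tokenizer only, no tagger/parser).
    nlp = spacy.blank("en")
    return [token.text for token in nlp(text)]


if __name__ == "__main__":
    print(tokenize("spaCy is fast."))
```

Part-of-speech tags, dependency parses and named entities require a trained pipeline instead of the blank one used here.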
New release: textblob-de 0.4.2
New release of German language extension textblob-de for the popular textblob package:
See overview of working features compared to main package.
Overview for development branch: Click here.
CQPwebInABox
Excellent news! A couple of days ago, Andrew Hardie released a virtual machine with a preconfigured version of CQPweb installed:
> From: a.hardie(*at*)lancaster.ac.uk
> To: cwb(*at*)sslmit.unibo.it
> Date: Thu, 2 Apr 2015 05:20:33 +0000
> Subject: [CWB] Announcing CQPwebInABox
>
> Hi everybody,
>
> This is just a quick note to announce the availability of CQPwebInABox
> – a virtual machine image containing a pre-installed copy of CQPweb.
>
> This is designed to get beginners past the hump of having to install
> all the different components.
>
> The image (1.6GB) can be downloaded here:
> https://sourceforge.net/projects/cwb/files/CQPwebInABox/
>
>
> To run it, you will need to install VirtualBox (although I believe
> other virtualisation tools can also use the same file format, I haven’t
> yet tested this).
>
> You can get VirtualBox here:
> https://www.virtualbox.org/wiki/Downloads
> Then “import appliance” from the .ova download.
>
> The virtual machine runs Linux – however, I have set it up in such a
> way as to make the interface as similar to Windows as possible. So
> don’t fear the Linux!
>
> I will create some video tutorials & put them on YouTube as soon as I can.
>
> Feedback welcome.
>
> best
>
> Andrew.
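The "import appliance" step from the announcement can also be scripted instead of clicked through. The sketch below is one way to do that, assuming the `VBoxManage` command-line tool that ships with VirtualBox is on the PATH. The function names and the file name `CQPwebInABox.ova` are placeholders for illustration, not taken from the announcement.

```python
import shutil
import subprocess


def build_import_command(ova_path, vm_name=None, dry_run=False):
    """Build the VBoxManage argument list for importing an .ova appliance."""
    command = ["VBoxManage", "import", ova_path]
    if dry_run:
        # Only show what would be imported; nothing is created.
        command.append("--dry-run")
    if vm_name is not None:
        # Rename the first (index 0) virtual system in the appliance.
        command += ["--vsys", "0", "--vmname", vm_name]
    return command


def import_appliance(ova_path, vm_name=None, dry_run=False):
    """Run the import; raises if VBoxManage is missing or the import fails."""
    if shutil.which("VBoxManage") is None:
        raise FileNotFoundError("VBoxManage not found; is VirtualBox installed?")
    subprocess.run(build_import_command(ova_path, vm_name, dry_run), check=True)


if __name__ == "__main__":
    # Hypothetical file name; use the path of the downloaded image.
    print(build_import_command("CQPwebInABox.ova", vm_name="CQPweb"))
```

Calling `import_appliance(...)` with `dry_run=True` first is a cheap way to check the appliance settings before the 1.6GB image is actually unpacked.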
textblob-de
New release of German language extension textblob-de for the popular textblob package:
See overview of working features compared to main package.
Overview for development branch: Click here.
VIEW (Visual Input Enhancement of the Web):
Instant colourization of grammar structures, instant learning activities from any page on the web. Supported languages: English, German, Spanish.

VIEW Toolbar – Firefox Add-on (screenshot taken on Linux Mint 64-bit; the extension is available on all major operating systems: Windows, Mac OS X, Linux)

Colourized prepositions in a Guardian Article published two hours ago … (2 clicks and 10 seconds away)!
New Release: NoSketchEngine
The SketchEngine development team has just released a new open-source version of their tools (bonito, manatee, finlib – Download-Links), including the following highlights:
- extended support for parallel corpora
- support for virtual corpora
- asynchronous query processing showing partial results as they are computed
- corpus info page providing an overall overview of the corpus stats
- lots of smaller enhancements in the functionality and usability of the user interface
- lots of speed enhancements, both for run time (query evaluation) and compile time (corpus indexing)
- lots of bugfixes
Source: http://nlp.fi.muni.cz/trac/noske/wiki/Downloads [accessed: 13/06/2014]
ForBetterEnglish
Classroom-friendly collocations dictionary:
[Last update: 03/06/2015]
http://forbetterenglish.com/ (superseded by SkELL – Sketch Engine for Language Learning)
References:
- Kilgarriff, A. (2014, March). “Corpora in the classroom without scaring the students.” British Council – EnglishAgenda Seminar. Retrieved from http://www.youtube.com/watch?v=2APIUxE_i6M [Adam’s talk starts at 1:09:35]
- Kilgarriff, A., Husák, M., McAdam, K., Rundell, M., & Rychlý, P. (2008). “GDEX: Automatically Finding Good Dictionary Examples in a Corpus.” In E. Bernal & J. DeCesaris (Eds.), Proceedings of the 13th EURALEX International Congress (pp. 425–432). Barcelona, Spain: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra. Retrieved from EURALEX 2008
TurnKey virtual appliances
“Turnkey Linux is a virtual appliance library that integrates and polishes the very best open source software into ready to use solutions.”
Excellent base system for CQPweb, ParaVoz, AntWebCorpusFramework, NoSketchEngine, etc.
Source: http://www.turnkeylinux.org/ [accessed: 03/03/2014]
- LAMP Stack Virtual Appliance (~220MB, Linux base system [Debian 7], administration through a convenient web GUI, accessible from any (local) machine within minutes)