Source: http://www.perezparedes.es/big-data-and-corpus-linguistics [accessed: 14/05/2015]
Category Archives: SketchEngine
New Release: NoSketchEngine
The SketchEngine development team has just released a new open-source version of their tools (bonito, manatee, finlib – Download-Links), including the following highlights:
- extended support for parallel corpora
- support for virtual corpora
- asynchronous query processing showing partial results as they are computed
- corpus info page providing an overview of the corpus statistics
- lots of smaller enhancements in the functionality and usability of the user interface
- lots of speed enhancements, both for run time (query evaluation) and compile time (corpus indexing)
- lots of bugfixes
Source: http://nlp.fi.muni.cz/trac/noske/wiki/Downloads [accessed: 13/06/2014]
ForBetterEnglish
Classroom-friendly collocations dictionary:
[Last update: 03/06/2015]
http://forbetterenglish.com/ (superseded by SkELL – Sketch Engine for Language Learning)
References:
- Kilgarriff, A. (2014, March). “Corpora in the classroom without scaring the students.” British Council – EnglishAgenda Seminar. Retrieved from http://www.youtube.com/watch?v=2APIUxE_i6M [Adam’s talk starts at 1:09:35]
- Kilgarriff, A., Husák, M., McAdam, K., Rundell, M., & Rychlý, P. (2008). “GDEX: Automatically Finding Good Dictionary Examples in a Corpus.” In E. Bernal & J. DeCesaris (Eds.), Proceedings of the 13th EURALEX International Congress (pp. 425–432). Barcelona, Spain: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra. Retrieved from EURALEX 2008
TurnKey virtual appliances
“Turnkey Linux is a virtual appliance library that integrates and polishes the very best open source software into ready to use solutions.”
Excellent base system for CQPweb, ParaVoz, AntWebCorpusFramework, NoSketchEngine, etc.
Source: http://www.turnkeylinux.org/ [accessed: 03/03/2014]
- LAMP Stack Virtual Appliance (~220 MB; Linux base system [Debian 7]; administration through a convenient web GUI; accessible from any (local) machine within minutes)
TheSketchEngine
Home: Open-source web interface to the corpus management system manatee (bonito fork with support for parallel corpora)
Version: 0.7.x (10 Apr 2016)
Work: web-based (also on localhost), open source
License: GNU GPLv2+
Other: free
IMPORTANT NOTICE: the official NoSketchEngine versions of manatee (2.59.X / 2.107.1) are NOT SUPPORTED, but a compatible fork is provided on GitHub (see link below).
Programming Language(s): Python
Key features: SERVER INSTALLATION, SUPPORT FOR PARALLEL CORPORA
Website: KonText
Website: KonText Repository (bonito fork maintained by Czech National Corpus)
Website: NEW (May 2015): KonText-compatible fork of manatee
Home: Open-source corpus management system
Version: 2.33.1-open-2.130.6-open-3.80.5 (finlib/manatee/bonito), 12 Nov 2015
Work: web-based (also on localhost), open source
License: GNU GPLv2+
Other: free
Programming Language(s): Python, C++, Perl
Key features: SERVER INSTALLATION, MANAGE YOUR OWN CORPORA
Website: NoSketchEngine (bonito, manatee, finlib, open-susanne-corpus)
Website: KonText Repository (alternative front end maintained by Czech National Corpus)
Website: SketchEngine (commercial version)
Home: Advanced Corpus Management System
Version: stable 2.33.2-SkE-2.133.6-3.81.2, beta 2.33.2-SkE-2.133.6-3.81.6 (finlib/manatee/bonito); last version check 19 Jan 2016, under constant development
Work: web-based, commercial
License: commercial
Other: €58/year (academic single-user license; own-corpus quota: 1 million words); 30-day free trial
Programming Language(s): various
Key features: WORD SKETCHES, SOPHISTICATED COLLOCATION MEASURES, THESAURUS, PRELOADED BILLION WORD CORPORA FOR MANY LANGUAGES, EASIEST WAY TO CREATE YOUR OWN SYNTACTICALLY ANNOTATED CORPORA
Website: SketchEngine (stable)
Website: SketchEngine (beta)
Website: NoSketchEngine (open source version – reduced functionality)