New release of the German language extension textblob-de for the popular textblob package:
See overview of working features compared to main package.
Overview for development branch: Click here.
VIEW (Visual Input Enhancement of the Web):
Instant colourization of grammar structures, instant learning activities from any page on the web. Supported languages: English, German, Spanish.
The SketchEngine development team has just released a new open-source version of their tools (bonito, manatee, finlib – Download-Links), including the following highlights:
Source: http://nlp.fi.muni.cz/trac/noske/wiki/Downloads [accessed: 13/06/2014]
If you try to launch AntConc on a Debian-based 64-bit system, you get one of the following error messages, depending on the version (tested with versions 3.2.4u and 3.4.1u):
./antconc3.2.4u: No such file or directory
./AntConc: No such file or directory
The following steps were necessary for me to be able to start AntConc on a TurnKey Linux Server (Debian 7, 64-bit) using ssh with X11-forwarding enabled (e.g. PuTTY plus Xming on Windows 8.1).
Important note: Please respect Laurence Anthony’s licensing terms and ask for permission before using AntConc in a server/group environment (see README section ‘LEGAL MATTER’ (p. 11) for details).
1) Activate i386 architecture on 64-bit systems and refresh the package lists:
apt-get install libc6-i386
dpkg --add-architecture i386
apt-get update
2) Install missing 32-bit libraries:
apt-get install libx11-6:i386 libxss1:i386 libxft2:i386
For Ubuntu-based systems see this post, for other Linux distributions see ongoing discussion on: https://groups.google.com/forum/#!forum/antconc
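The "No such file or directory" message is misleading: the file is there, but it is typically a 32-bit executable whose loader is missing on a plain 64-bit system. The following is a minimal sketch for checking this before installing anything. The function name elf_class is my own invention, and the check only reads the ELF header, so it says nothing about which libraries are still missing.

```shell
# Print the ELF class of a file: 32, 64, or "unknown".
# elf_class is a hypothetical helper, not part of AntConc or Debian.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -t x1 -N 4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "unknown"
        return 1
    fi
    # The fifth byte (offset 4) is 1 for 32-bit and 2 for 64-bit files.
    class=$(od -An -t u1 -j 4 -N 1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Usage (compare with the output of `uname -m` on your system):
#   elf_class ./AntConc
```

If the function reports 32 on an x86_64 system, the i386 steps above are probably what you need.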
Class-room friendly collocations dictionary:
[Last update: 03/06/2015]
http://forbetterenglish.com/ (superseded by SkELL – Sketch Engine for Language Learning)
References:
“Turnkey Linux is a virtual appliance library that integrates and polishes the very best open source software into ready to use solutions.”
Excellent base system for CQPweb, ParaVoz, AntWebCorpusFramework, NoSketchEngine, etc.
Source: http://www.turnkeylinux.org/ [accessed: 03/03/2014]
Step-by-step guide to installing Laurence Anthony’s AntWebCorpusFramework [1] on Xubuntu 12.04 LTS:
Requirements: webserver (e.g. apache2), php5, perl, parallel corpus (two sentence-aligned parallel text files in strict utf-8 format)
[Click on 'Overview' and 'Select Category' ⇨ 'Corpus Development Tools' ⇨ 'Sentence Alignment' for a list of tools you could use to create your own parallel corpus.]
1) Install apache2 web-server environment with php5 and enable sqlite support for perl and php5
2) Set up AntWCF directory and create database(s)
3) Adapt antpwc_concordance.php to new corpus
4) Adapt index.php to new corpus
5) Create apache2 configuration file
6) Open http://localhost/AntWCF
1) Install apache2 web-server environment with php5 and enable sqlite support for perl and php5
Open a terminal window and type:
sudo apt-get install apache2 php5 php5-sqlite libdbd-sqlite3-perl
Troubleshooting:
If you would like to get rid of apache2’s warning “Could not reliably determine the server’s fully qualified domain name”, click here for a quick fix (the server works just fine even if you don’t specify ServerName).
2) Set up AntWCF directory and create database(s)
Change into the new AntWCF directory:
cd path/to/AntWCF
Troubleshooting:
If you get a permission error when trying to drag&drop files into the new AntWCF directory, you probably created the directory outside your home area, using the sudo command (e.g. in /opt). To be able to drag&drop files into the directory, you can take ownership by typing:
sudo chown -R yourusername:yourusername /path/to/AntWCF
After successfully extracting the files into your AntWCF directory, copy your parallel corpus files to AntWCF/data/corpus.
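Before creating a database, it can save time to check that the two files really are aligned. The sketch below assumes one sentence per line, with line n of the source file corresponding to line n of the target file. That is my reading of "sentence aligned" and not something the AntWCF documentation is quoted on here, and check_alignment is a hypothetical helper name.

```shell
# Compare the line counts of two sentence-aligned text files.
# Assumption: one sentence per line, same line number = same sentence.
check_alignment() {
    # $(( ... )) strips the padding some wc implementations add.
    l1_lines=$(( $(wc -l < "$1") ))
    l2_lines=$(( $(wc -l < "$2") ))
    if [ "$l1_lines" -eq "$l2_lines" ]; then
        echo "OK: both files have $l1_lines lines"
    else
        echo "MISMATCH: $1 has $l1_lines lines, $2 has $l2_lines lines"
        return 1
    fi
}

# Usage:
#   check_alignment data/corpus/L1_FILE.txt data/corpus/L2_FILE.txt
```

Equal line counts do not prove that the alignment is correct, but a mismatch is a reliable sign that something went wrong.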
To create a database for a parallel corpus type:
perl ./bin/db_creator_L8-CEN-R8da.pl data/corpus/L1_FILE.txt data/corpus/L2_FILE.txt data/AntWCF_db/OUTPUT_FILE.db
Note: [1 corpus = 2 txt files (your files) + 1 db file (created by script)]
L1_FILE.txt is the source text file of your parallel corpus that you have just copied into the AntWCF/data/corpus directory, L2_FILE.txt is the target text file of your parallel corpus and OUTPUT_FILE.db is the name of the sqlite database to be created by the script (extension: .db).
Take a note of the name of the new database file and the number of tokens for L1/L2 displayed by the script (or make sure that you do not close the terminal window, as you will need these pieces of information later on).
Delete the temporary files created by the script:
rm temp_*
Repeat this procedure for any other parallel corpora you wish to include in AntWCF.
Troubleshooting:
If the script throws an encoding error, make sure that your text files only contain legal utf8 characters and eliminate all non-utf8 characters before running the database script again. Important: If you choose the same output file name, you have to delete the old file before re-running the script.
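One way to find and remove illegal bytes is iconv, which is available on most Linux systems. The following is a sketch and not part of AntWCF: the function names are hypothetical, and the cleaned file should be inspected afterwards, because dropping bytes can silently damage words that were stored in another encoding (e.g. Latin-1 umlauts).

```shell
# Succeeds only if the file is valid UTF-8 from start to end.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Write a copy of file $1 to $2 with all invalid byte sequences dropped.
strip_invalid_utf8() {
    [ -r "$1" ] || return 2
    # -c discards input that cannot be converted. Some iconv versions
    # still exit non-zero in that case, so the result is checked below.
    iconv -c -f UTF-8 -t UTF-8 "$1" > "$2" 2> /dev/null || true
    is_valid_utf8 "$2"
}

# Usage:
#   is_valid_utf8 data/corpus/L1_FILE.txt || echo "L1_FILE.txt needs cleaning"
#   strip_invalid_utf8 data/corpus/L1_FILE.txt data/corpus/L1_FILE.clean.txt
```

If a file turns out to be in a different encoding altogether, converting it properly (iconv -f with the real source encoding) is the better fix than stripping.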
3) Adapt antpwc_concordance.php to new corpus
Open the file www/antpwc_concordance_20120925_2342.php in a text editor of your choice (e.g.):
leafpad www/antpwc_concordance_20120925_2342.php
[Select 'Options' ⇨ 'Line Numbers' for easier navigation]
Adapt the following lines for your own parallel corpus:
Note: AntWCF is preconfigured to be able to switch between two different parallel corpora. If you created just one database file, comment out lines 32-34 and lines 40-42 (using ‘//’ at the beginning of each line) or fill in the same data twice. If, however, you created more than two databases, you could include those corpora by copying lines 32-34, inserting them once before line 35 and a second time before line 43. Subsequently, you could adapt the database information for an additional corpus (and so on …). For the number of tokens per language, refer back to the information generated by the db_creator.pl script above.
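If you prefer the command line to a text editor, commenting out a range of lines can be done with sed. This is only a sketch: comment_out_lines is a hypothetical helper, and the line numbers are the ones given in the note, which apply to the file version named in this guide and may differ in other releases, so check them in your copy first.

```shell
# Prefix lines $2..$3 of file $1 with '//' (a PHP line comment).
comment_out_lines() {
    file=$1; first=$2; last=$3
    # Keep one pristine copy of the file the first time it is edited.
    [ -e "$file.orig" ] || cp "$file" "$file.orig"
    sed -i.tmp "${first},${last}s|^|//|" "$file" && rm -f "$file.tmp"
}

# Usage (prefixing lines does not shift line numbers, so both ranges
# from the note can be processed one after the other):
#   comment_out_lines www/antpwc_concordance_20120925_2342.php 32 34
#   comment_out_lines www/antpwc_concordance_20120925_2342.php 40 42
```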
4) Adapt index.php to new corpus
Open the file www/index.php in a text editor of your choice.
Adapt the following lines for your own parallel corpus:
Set default database:
Adapt database information:
Note: AntWCF is preconfigured to be able to switch between two different parallel corpora. If you created just one database file, comment out line 110 or fill in the same data twice. If you created more than two database files, copy line 110, insert it before line 111 and adapt the database file and corpus names:
Adapt the language pair for your own parallel corpus (line numbers will have changed slightly if you inserted additional databases above).
5) Create apache2 configuration file
Create a new file called AntWCF.conf in the directory /etc/apache2/conf.d/ and open it in a text editor of your choice (e.g.):
sudo leafpad /etc/apache2/conf.d/AntWCF.conf
Copy-paste the following lines into the new file, adapt the absolute path to your AntWCF/www directory (twice, for Alias and for Directory) and save the file (you will need superuser privileges (sudo) to be able to save the file in this location):
# AntWCF default Apache configuration
Alias /AntWCF /path/to/AntWCF/www
<Directory /path/to/AntWCF/www>
    Options FollowSymLinks
    DirectoryIndex index.php
</Directory>
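Since the same path has to be entered twice, it is easy to adapt one occurrence and forget the other. As an alternative to editing by hand, the file can be generated from a single argument. This is a sketch under the assumption that the configuration shown in this guide is all you need, and make_antwcf_conf is a hypothetical helper name.

```shell
# Print the AntWCF Apache configuration for a given www directory.
make_antwcf_conf() {
    www_dir=$1
    cat <<EOF
# AntWCF default Apache configuration
Alias /AntWCF $www_dir
<Directory $www_dir>
    Options FollowSymLinks
    DirectoryIndex index.php
</Directory>
EOF
}

# Usage (writing to /etc/apache2/conf.d/ needs superuser privileges):
#   make_antwcf_conf /path/to/AntWCF/www | sudo tee /etc/apache2/conf.d/AntWCF.conf
```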
After saving the file successfully, restart apache2 by typing:
sudo service apache2 restart
6) Open http://localhost/AntWCF in your browser and enjoy!