Install, build and run

entity-fishing requires JDK 1.8 and Maven 3. It supports 64-bit Linux and Mac OS environments. The LMDB binary data for these two platforms is made available below.
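The prerequisites above can be checked from a terminal before going further. The following is a minimal sketch, not part of entity-fishing: the helper names `java_version` and `is_jdk8` are hypothetical, and it assumes that `java -version` prints a line containing `version "1.8.0_..."` for JDK 1.8, as the common JDK distributions do.

```shell
#!/bin/sh
# Sketch of a prerequisite check; helper names are hypothetical.

java_version() {
    # `java -version` writes to stderr; the relevant line looks like:
    #   openjdk version "1.8.0_292"
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

is_jdk8() {
    # JDK 8 reports itself as 1.8.x; later releases report 9, 11, 17, ...
    case "$1" in
        1.8*) return 0 ;;
        *)    return 1 ;;
    esac
}

if command -v java >/dev/null 2>&1; then
    v=$(java_version)
    if is_jdk8 "$v"; then
        echo "JDK $v: OK"
    else
        echo "JDK $v found, but 1.8 is expected"
    fi
else
    echo "java not found on PATH"
fi

if command -v mvn >/dev/null 2>&1; then
    # The first line of `mvn -v` names the Maven version.
    mvn -v | head -n 1
else
    echo "mvn not found on PATH"
fi
```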

Running the service requires at least 2 GB of RAM. Additional RAM, if available, is used to speed up access to the compiled Wikidata and Wikipedia data (including the Wikidata statements associated with entities). Once decompressed, the index data takes up 34 GB of disk space, so make sure enough free space is available. An SSD is recommended for best performance.
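The disk space requirement can be verified before downloading anything. This is a rough sketch under stated assumptions: the helper names are hypothetical, the 34 GB figure is taken from the paragraph above and interpreted as binary gigabytes, and the check is run from the directory whose filesystem will hold the data.

```shell
#!/bin/sh
# Sketch: check that the current filesystem has room for the index data.

required_kb() {
    # Convert gigabytes (taken here as GiB, an assumption) to the
    # 1K-blocks unit reported by `df -k`.
    echo $(( $1 * 1024 * 1024 ))
}

free_kb() {
    # With the POSIX output format (-P), the 4th column of the 2nd line
    # is the number of available 1K-blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

has_enough_space() {
    # $1 = directory, $2 = required size in GB
    [ "$(free_kb "$1")" -ge "$(required_kb "$2")" ]
}

if has_enough_space . 34; then
    echo "enough free space for the index data"
else
    echo "less than 34 GB free in $(pwd)"
fi
```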

First install GROBID and grobid-ner; see the respective installation instructions of GROBID and grobid-ner.

The path to grobid-home must be indicated in the file src/main/resource/nerd.properties, for instance:

com.scienceminer.nerd.grobid_home=../grobid/grobid-home/
com.scienceminer.nerd.grobid_properties=../grobid/grobid-home/config/grobid.properties
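A wrong grobid-home path is easy to catch early by reading the property back and testing that the directory exists. The sketch below is illustrative only: `get_prop` is a hypothetical helper, the properties file path is the one quoted above, and relative paths are assumed to resolve against the directory the script is run from.

```shell
#!/bin/sh
# Sketch: read grobid_home from the properties file and check that it exists.

get_prop() {
    # $1 = properties file, $2 = key.
    # Prints the value of the last `key=value` line whose key matches exactly.
    awk -F= -v k="$2" '$1 == k { sub(/^[^=]*=/, ""); v = $0 }
                       END { if (v != "") print v }' "$1"
}

PROPS=src/main/resource/nerd.properties

if [ -f "$PROPS" ]; then
    home=$(get_prop "$PROPS" com.scienceminer.nerd.grobid_home)
    if [ -d "$home" ]; then
        echo "grobid-home found: $home"
    else
        echo "grobid-home not found: $home"
    fi
else
    echo "properties file not found: $PROPS"
fi
```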

Install entity-fishing:

$ git clone https://github.com/kermitt2/nerd

Then install the compiled indexed data:

  1. Download the zipped data files corresponding to your environment (warning: total around 10 GB) at the following address:

  2. Unzip the 4 (or 5) archive files under data/wikipedia/.

    This will install four sub-directories data/wikipedia/db-kb/, data/wikipedia/db-en/, data/wikipedia/db-de/ and data/wikipedia/db-fr/. The uncompressed data is about 34 GB.

  3. Build the project from the root of the NERD project repository:

    $ mvn clean install
    

    Some tests will be executed. If all tests pass, you should now be ready to run the service.

  4. Run the service with Jetty:

    $ mvn -Dmaven.test.skip=true jetty:run-war
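If the build or the service fails on missing data, it can help to confirm that step 2 produced the four sub-directories listed there. A minimal sketch, assuming it is run from the root of the project repository; `missing_dbs` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report which of the expected index directories are absent.

missing_dbs() {
    # $1 = base directory; prints each expected sub-directory that is missing
    for name in db-kb db-en db-de db-fr; do
        [ -d "$1/$name" ] || echo "$name"
    done
}

missing=$(missing_dbs data/wikipedia | tr '\n' ' ')

if [ -z "$missing" ]; then
    echo "all four index directories are present"
else
    echo "missing under data/wikipedia/: $missing"
fi
```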
    

Once the service is running, the test console is available on port 8090. Open http://localhost:8090 in your browser (preferably Firefox or Chrome; Internet Explorer has not been tested).
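The same address can be probed from the command line, which is convenient on a machine without a browser. This is only a sketch: `console_url` and `is_up` are hypothetical helpers, curl is assumed to be installed, and a successful HTTP response on the root URL is taken as a sign that the service has started.

```shell
#!/bin/sh
# Sketch: check whether the console answers on the documented port.

console_url() {
    # $1 = host, $2 = port
    printf 'http://%s:%s\n' "$1" "$2"
}

is_up() {
    # -s: silent, -f: treat HTTP errors as failures,
    # --max-time 5: give up after five seconds so the check never hangs.
    curl -sf -o /dev/null --max-time 5 "$1"
}

url=$(console_url localhost 8090)

if command -v curl >/dev/null 2>&1 && is_up "$url"; then
    echo "console reachable at $url"
else
    echo "no response from $url (service not started yet, or curl missing)"
fi
```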

For more information, see the next section on the entity-fishing Console.