How to Use Stanford POS Tagger in Python

NLTK is a Python platform for natural language processing. It provides many text processing libraries, mostly for English. I previously trained an Indonesian tagging model with the Stanford POS Tagger, and this tutorial uses that model.

To install NLTK, run the following command in your command line. I assume that you are using Windows and that you have followed my first tutorial (in Indonesian) on keeping two versions of Python on one laptop:

python3 -m pip install -U nltk
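
To confirm that the installation worked, a quick check along these lines may help. This is only a sketch: the helper name nltk_is_installed is my own and not part of NLTK, and it only tells you whether the interpreter you are running can find the package.

```python
import importlib.util


def nltk_is_installed():
    """Return True if this interpreter can locate the nltk package."""
    return importlib.util.find_spec("nltk") is not None


if nltk_is_installed():
    import nltk
    # nltk exposes its version string as nltk.__version__
    print("NLTK version:", nltk.__version__)
else:
    print("NLTK not found; rerun the pip command with this interpreter.")
```

If you keep two Python versions side by side, run the check with the same interpreter you used for pip.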

In this example, I use a previously trained tagger model which I named myTagger.model. It is customized for Indonesian. Place the model under the nltk folder so that its path is nltk\myTagger.model. Then download stanford-postagger.jar from http://nlp.stanford.edu/software/tagger.shtml.
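
Before creating the tagger, it can save some confusion to check that both files are where you expect and that Java is reachable, since the Stanford tagger is a Java program. The following is a minimal sketch, not part of NLTK: the function name missing_prerequisites is hypothetical, the two paths are just the ones used in this post, and looking for java on PATH is an assumption about your setup (Java can also be located in other ways).

```python
import os
import shutil


def missing_prerequisites(model_path, jar_path):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Assumption: the java executable is expected to be on PATH
    if shutil.which("java") is None:
        problems.append("java executable not found on PATH")
    if not os.path.isfile(model_path):
        problems.append("model file not found: " + model_path)
    if not os.path.isfile(jar_path):
        problems.append("jar file not found: " + jar_path)
    return problems


# Example paths from this post; replace them with your own locations
for problem in missing_prerequisites("nltk\\myTagger.model", "E:\\stanford-postagger.jar"):
    print(problem)
```

If nothing is printed, the three checks passed.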

To run this tagger, enter the following code in the Python interpreter (started from the command prompt):

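# StanfordPOSTagger wraps the Java tagger, so Java must be installed
from nltk.tag import StanfordPOSTagger
# Both arguments are strings: the model file, then the tagger jar (adjust the paths to your setup)
myTagger = StanfordPOSTagger("myTagger.model", "E:\\stanford-postagger.jar")
# tag() expects a list of tokens, so split the sentence on whitespace
myTagger.tag('Pada suatu hari, dia pergi ke kota Jakarta.'.split())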

This will be the output:

[('', 'Pada/IN'), ('', 'suatu/CD'), ('', 'hari,/Z'), ('', 'dia/PRP'), ('', 'pergi/VB'), ('', 'ke/IN'), ('', 'kota/NN'), ('', 'Jakarta/NNP'), ('', '?/Z')]
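
In this output the first slot of each pair is empty and the word and tag are fused in the second slot. If you would rather work with ordinary (word, tag) tuples, a small post-processing step could look like the sketch below. The helper split_tagged_pairs is my own illustration, not an NLTK function, and it assumes "/" is the separator, as in the output above.

```python
def split_tagged_pairs(raw_pairs, sep="/"):
    """Turn pairs like ('', 'Pada/IN') into ('Pada', 'IN'); leave other pairs unchanged."""
    fixed = []
    for word, tag in raw_pairs:
        if word == "" and sep in tag:
            # Split on the last separator so tokens containing "/" keep their text
            word, _, tag = tag.rpartition(sep)
        fixed.append((word, tag))
    return fixed


raw = [('', 'Pada/IN'), ('', 'suatu/CD'), ('', 'hari,/Z'), ('', 'Jakarta/NNP')]
print(split_tagged_pairs(raw))
# [('Pada', 'IN'), ('suatu', 'CD'), ('hari,', 'Z'), ('Jakarta', 'NNP')]
```

Pairs that already have a non-empty word are passed through untouched, so the helper is safe to apply even if your setup returns well-formed tuples.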

An explanation of the tags can be found on the website of the POS Tagger of the Information Retrieval Lab, Faculty of Computer Science, Universitas Indonesia.

Python version: 3.5
Windows: 8.0
NLTK: 3.2
Stanford tools: 3.5.1

Comments

Rizkiana Amalia, June 26, 2016 at 10:14 PM
Hello,
Can I ask how you built the previously trained tagger (myTagger.model)? Can we get this trained model too?
Unknown, February 26, 2018 at 5:53 AM
Hi, I want to know how you built your own trained tagger (myTagger.model). Could you make a tutorial for that too?
Unknown, April 17, 2018 at 6:13 PM
Hi, I want to build my own trained tagger (for the Arabic language). Can you help me, please?

Shining Meadow
