
BCCWJ Word List

BCCWJ Word List (Complete)

The BCCWJ word list (frequency list) is publicly available and may be used free of charge for research or educational purposes. A manual is also provided to make the lists easier to use.

Notice

The word counts for Long Unit Words in the POS and word classification lists had been treated as identical to those for Short Unit Words. These counts have been corrected. (2014/01/07)

Creation and Application of Word and Kanji Lists for Use in Language Policy (Report)

Author: Special "Japanese corpus" language policy research group.

2011

Special "Japanese corpus" language policy research group.

 


Based on the BCCWJ and the "Textbook Corpus", this report covers practical research on creating and using word and kanji lists for Japanese language policy and education. The downloads further below also contain examples of research on the "BCCWJ principal word list", "Textbook Corpus word list", "School and Societal contrastive word list", "Educational subject-specific word list", and "NDC genre-specific kanji frequency list".

BCCWJ Principal Corpus Word List

Author: Special "Japanese corpus" language policy research group.

2011

Special "Japanese corpus" language policy research group.

 


A list for comparing the frequency and lexical level of words across the BCCWJ's fixed-length samples of "Library books", "Published books", "Magazines", and "Newspapers", and its variable-length samples from "Yahoo! Answers" and "Yahoo! Blogs."

Textbook Corpus Word List

Author: Special "Japanese corpus" language policy research group.

2011

Special "Japanese corpus" language policy research group.

 


A complete lexical listing of the "Textbook corpus", which is made up of textbooks in all subjects and grade levels from primary, middle, and high school in the year 2005. Because the grade level and subject of each textbook are known, it is also possible to look up how frequently words are used in textbooks for each subject and grade level. The list also allows for comparison with the BCCWJ's collection of fixed-length samples from library books.

School and Societal Language Comparison Word List (Integrated edition)

Author: Special "Japanese corpus" language policy research group.

2011

Special "Japanese corpus" language policy research group.

 


This list allows words from the middle and high school textbooks in the "Textbook corpus" discussed above to be compared with a subset of words from the principal BCCWJ that are thought to be used very frequently. Words commonly associated with schools can thus be compared with words used often in society. The word classification numbers from NINJAL's "Word Classification List, revised edition" are also included. In the "Integrated edition", words with several classification numbers (e.g. polysemes) are treated as single lexical items. The PDF edition available below is designed to make this comparison easy to view.

School and Societal Language Comparison Word List (Divided edition)

Author: Special "Japanese corpus" language policy research group.

2011

Special "Japanese corpus" language policy research group.

 


The "Divided edition" contains the same lexical information as the "Integrated edition" above, but words with multiple classification numbers are treated as separate lexical items.
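The difference between the two editions can be illustrated with a minimal Python sketch. It assumes each entry pairs a word with a classification number. The words, the numbers, and the field names (`word`, `class_no`, `class_nos`) are hypothetical placeholders and do not reflect the actual file layout of the published lists.

```python
from collections import defaultdict

# Hypothetical (word, classification number) pairs; not taken from the real lists.
pairs = [
    ("word_a", "1.1000"),
    ("word_a", "2.3000"),  # word_a has two classification numbers
    ("word_b", "1.2000"),
]

def divided(pairs):
    """Divided-style view: one entry per (word, classification number) pair."""
    return [{"word": w, "class_no": c} for w, c in pairs]

def integrated(pairs):
    """Integrated-style view: one entry per word, with its numbers grouped."""
    grouped = defaultdict(list)
    for w, c in pairs:
        grouped[w].append(c)
    return [{"word": w, "class_nos": cs} for w, cs in grouped.items()]

for entry in divided(pairs):
    print(entry)
for entry in integrated(pairs):
    print(entry)
```

In this sketch, a word with two classification numbers yields two entries in the divided view and a single entry in the integrated view.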

Educational Subject-specific Word List

Author: Special "Japanese corpus" language policy research group.

2011

Special "Japanese corpus" language policy research group.

 


Based on the "Textbook corpus" above and fixed-length samples from library books, a list of words specific to different school subjects was compiled. The list summarizes the words particular to each subject in the middle and high school curricula.

NDC Genre-specific Kanji Frequency List

Author: Special "Japanese corpus" language policy research group.

2011

Special "Japanese corpus" language policy research group.

 


This list contains the frequencies of individual kanji characters in each of the 10 genres of the Nippon Decimal Classification (NDC). It gives a general overview of the kanji found in each genre.