Compressing dictionaries

Discussion in 'Game Development General Discussion' started by Piglet, Feb 14, 2009.

  1. Piglet

    Piglet Spirited Member

    Joined:
    May 28, 2008
    Messages:
    175
    Likes Received:
    0
    I have spent some time wondering about how to compress a dictionary for scrabble or such. The code doesn't need a definition of the word, it just needs to know if the word exists. I've tried several methods but the 'bit-pair' route seems the best.
    Basically, start with 26 pointers to tables. Each table entry is 2 bits thus:

    00 = does not exist
    01 = exists but cannot end a word
    10 = exists and may end a word
    11 = exists and may or may not end a word.

    The table for the third letter is made from the non 00 entries of the last table and is situated after the second letter table. You can easily calculate the size.
    I noted that after the third letter, the tables get very small very fast so often it's just 1 or 2 entries.
    I have never seen this method used, but then I lead a sheltered life. Anyone know this route already or a better one?
     
  2. retro

    retro Resigned from mod duty 15 March 2018

    Joined:
    Mar 13, 2004
    Messages:
    10,354
    Likes Received:
    822
    Hmm, interesting.

    When I wrote a basic Hangman game for a college project, I used SOWPODS as a word list. That's the official Scrabble word list used here in the UK and many other places, except the USA and Canada who use TWL.

    Of course, the problem is that SOWPODS contains 267,751 words, so it is quite a large list! For me, it wasn't a huge problem - I was basically loading the list and selecting one word at random at the beginning of the game, not checking the dictionary constantly.

    It is apparently broken up thusly:

    2 letter words: 124
    3 letter words: 1,292
    4 letter words: 5,454
    5 letter words: 12,478
    6 letter words: 22,157
    7 letter words: 32,909
    8 letter words: 40,161
    9 letter words: 40,727
    10 letter words: 35,529
    11 letter words: 27,893
    12 letter words: 20,297
    13 letter words: 13,857
    14 letter words: 9,116
    15 letter words: 5,757


    If you need SOWPODS, try the following:

    http://members.ozemail.com.au/~rjackman/wordlist.html

    http://www.absp.org.uk/words/words.html

    http://home.teleport.com/~stevena/scrabble/wordlists.html

    Sorry, that doesn't really answer your question! All I can say is that I've only ever thought about it as "check word exists in DB" rather than looking at each letter and whether it ends a word.

    Why not check out an open source Scrabble for pointers? Here are examples:

    http://pyscrabble.sourceforge.net/

    http://www.gtoal.com/wordgames/scrabble.html
     
sonicdude10
Draft saved Draft deleted
Insert every image as a...
  1.  0%

Share This Page