Open Source DCoT Application - Word Counter
I am in the process of performing some analysis on the posts on Daily Cup of Tech. One of the things that I want to do is a word count and frequency analysis on the entire blog. Now, I could go with good ol’ pen and paper and start counting every single word on the blog. But that would take me quite a bit of time, not to mention that I would not learn anything in the process.
So, I decided to export the contents of the MySQL database that runs behind the scenes at DCoT to a text file and then download a word and frequency counter. Do you think I could find a word counter that would count all of the words in the file and then count how many times each word appears? No luck.
But my bad fortune is your lucky day. I decided that since I couldn’t find anything like this, I’d make it myself. So, today I present you with the Daily Cup of Tech Word Counter!
The application is a self-contained program that is fully portable to USB devices. You can download the program and the source code if you are interested. The program is written in AutoIt.
Here is a screenshot of my new baby:

Most of the program is self-explanatory. You can sort the output alphabetically or by how frequently each word appears. You can also sort in ascending or descending order. You can count the words that you type or paste into the edit box, or use a text file.
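For the curious, here is a rough idea of what counting and sorting looks like in code. This is a minimal Python sketch, not the AutoIt source of the Word Counter. The function names are made up for illustration, and splitting on whitespace and lowercasing every word are my assumptions about how words are told apart.

```python
from collections import Counter


def count_words(text):
    """Count occurrences of each whitespace-separated word.

    Assumption: words are lowercased so "The" and "the" count as one word.
    """
    return Counter(text.lower().split())


def sorted_counts(counts, by="frequency", descending=True):
    """Return (word, count) pairs sorted alphabetically or by frequency."""
    if by == "alphabetical":
        return sorted(counts.items(), key=lambda item: item[0], reverse=descending)
    # Sort by count first; words with equal counts are ordered by the word itself.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]), reverse=descending)


counts = count_words("the cat and the hat and the bat")
for word, count in sorted_counts(counts, by="frequency", descending=True):
    print(word, count)
```

Switching `by` to "alphabetical" or `descending` to `False` gives the other sort combinations described above.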
The delete options may be the only confusing portion. When you are counting words, you need to clean up the rough text a bit: delete some punctuation, get rid of non-printable characters, or scrub out characters that are not standard English. Each delete option removes one of these kinds of characters. Control characters are things like carriage returns and line feeds. Punctuation is the standard punctuation that you will find in most documents. Extended characters are characters that you do not see regularly and are often used in non-English languages.
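To make the three categories a bit more concrete, here is a small Python sketch of this kind of clean-up. It is not how the Word Counter itself is written. The exact boundaries are my assumptions: control characters are ASCII codes below 32 plus 127, punctuation is whatever Python lists in `string.punctuation`, and extended characters are anything above code 127. The function name is hypothetical.

```python
import string


def delete_characters(text, control=False, punctuation=False, extended=False):
    """Remove the selected kinds of characters from text.

    Assumed categories (not taken from the Word Counter source):
      control     -> ASCII codes 0-31 and 127 (includes CR and LF)
      punctuation -> characters in string.punctuation
      extended    -> any character with a code above 127
    """
    kept = []
    for ch in text:
        code = ord(ch)
        if control and (code < 32 or code == 127):
            continue
        if punctuation and ch in string.punctuation:
            continue
        if extended and code > 127:
            continue
        kept.append(ch)
    return "".join(kept)


print(delete_characters("Hello, world!", punctuation=True))  # Hello world
print(delete_characters("caf\u00e9", extended=True))          # caf
```

Each flag can be switched on independently, much like ticking a separate checkbox for each delete option.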
The Use Spaces option will replace all deleted characters with spaces rather than deleting them. This can change your results, so feel free to experiment.
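Here is a quick illustration of why the choice matters, again as a Python sketch rather than the program's actual code. The function name and the `use_spaces` parameter are hypothetical, and only punctuation is handled to keep it short.

```python
import string


def strip_punctuation(text, use_spaces=False):
    """Delete punctuation, or replace it with spaces when use_spaces is True."""
    replacement = " " if use_spaces else ""
    return "".join(replacement if ch in string.punctuation else ch for ch in text)


sample = "well-known fact"
print(strip_punctuation(sample).split())                   # ['wellknown', 'fact']
print(strip_punctuation(sample, use_spaces=True).split())  # ['well', 'known', 'fact']
```

Deleting the hyphen fuses two words into one, while replacing it with a space splits them apart, so the same text yields two words in one case and three in the other.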
When you are done counting your words, a complete list of all the words and how often they appeared will be presented in the edit box.
Feel free to play around with this and let me know if you find it to be useful.

Whether we realize it or not, we are a civilization that runs on data. It controls every decision we make or don’t make. We spend countless hours reading it, analyzing it, massaging it, duplicating it, verifying it, summarizing it, and protecting it. It gives us power.